facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. i-ii © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
guest editorial
advanced low-dimensional nanoelectronic devices: physics and modeling
nanoelectronic devices of various kinds are essential for vlsi circuits. the struggle to follow moore’s law is becoming increasingly difficult and complex, requiring many novel approaches in order to keep decreasing the dimensions of devices that are already firmly established in the nano-world. as an example, the most advanced state-of-the-art vlsi circuits (microprocessors) can currently contain more than 50 billion transistors per chip. as far as the actual physical dimensions are concerned, in 2021 ibm announced its 2 nm chip. the efforts behind such achievements are enormous. this special issue on advanced planar nanoelectronics investigates some points of interest related to the physics of such devices, as well as their simulation, thus contributing to the existing trends in this rapidly evolving and constantly expanding field.
on may 19–20, 2021, the ieee kgec student branch chapter, in association with the department of ece, kgec, and technically co-sponsored by the ieee eds kolkata chapter, organized the international conference “devices for integrated circuit (devic)”, held in virtual mode as a precaution against the covid-19 pandemic. devic 2021 ended up being a major international conference in the area of electronic devices for application in integrated circuits, with more than 300 submitted papers. it brought together leading scientists, researchers and industry professionals who shared their information and experiences and discussed the practical challenges encountered and the solutions adopted in relation to the latest developments in the area of electronic devices, circuits and vlsi.
the conference was dedicated to the design, modeling and simulation of nanoelectronic devices, components, circuits and systems. the acceptance rate for the conference was about 50%, reflecting the stringent quality criteria applied to all contributions. the full proceedings of the conference were published by the ieee (isbn: 978-1-7281-9955-9) and can be found at ieee xplore. selected papers from devic 2021 served as a loose inspiration for extended, modified and amended manuscripts with qualitatively new results, written for this special section of facta universitatis series: electronics and energetics. the articles published here were thus written specifically for this special section and are only loosely based on the corresponding devic 2021 presentations. each newly produced manuscript was subject to a rigorous peer-review procedure in which two or three reviewers from different countries were engaged. five papers altogether were selected for this special issue.
received january 31, 2022
the chosen articles are the following:
1. dhananjaya tripathy, debiprasad priyabrata acharya, prakash kumar rout, sudhansu mohan biswal, "influence of oxide thickness variation on analog and rf performances of soi finfet"
2. remya jayachandran, k. j. dhanaraj, p. c. subramaniam, "planar cmos and multigate transistors based wide-band ota buffer amplifiers for heavy resistance load"
3. surajit bosu, baibaswata bhattacharjee, "all-optical frequency encoded dibit-based parity generator using reflective semiconductor optical amplifier with simulative verification"
4. bibek chettri, abinash thapa, sanat kumar das, pronita chettri, bikash sharma, "first principle insight into co-doped mos2 for sensing nh3 and ch4"
5. 
pranati ghoshal, chanchal dey, sunit kumar sen, "realization of a modified 8-bit semiflash analog to digital converter based on bit segmentation scheme"
the guest editors hope that the high quality of the papers included in this issue will encourage young authors to present their own achievements. the greatest pleasure for the editors would be to see new publications inspired by this special section. the guest editors would like to express their gratitude to all of the authors who made this special issue possible through their excellent contributions. the gratitude also extends to the organizers of devic 2021, who assembled such a select group of world-class researchers, to the fuee editor-in-chief, prof. danijel danković, as well as to the late member of the serbian academy of sciences and arts, prof. ninoslav stojadinović, who, before his untimely death, initiated and outlined the work on this special section in cooperation with the general chair of devic 2021, prof. dr. angsuman sarkar.
guest editors:
prof. dr. angsuman sarkar, professor, kalyani government engineering college, university of kalyani, kalyani, west bengal, india
prof. dr. arpan deyasi, assistant professor, department of electronics and communication engineering, rcc institute of information technology, kolkata, india
prof. dr. jyotsna kumar mandal, professor, faculty of engineering, technology and management, kalyani university, kalyani, nadia, west bengal, india
prof. dr. chandan kumar sarkar, professor, department of electronics and telecommunication engineering, jadavpur university, jadavpur, kolkata, west bengal, india
prof. dr. zoran jakšić, full research professor, institute of chemistry, technology and metallurgy, national institute of the republic of serbia – university of belgrade, serbia; associate editor, facta universitatis series: electronics and energetics

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 
325-338 doi: 10.2298/fuee1603325c
design and technologies for implementing a smart educational building: case study
ionut cardei, borko furht, luis bradley
department of computer & electrical engineering and computer science, florida atlantic university, boca raton, florida, usa
abstract. in this paper we describe the design of an educational smart building and the innovative technologies that were implemented. in january of 2011, florida atlantic university opened its new leed platinum-certified “engineering east” building. the building was designed both as a model of how new technologies can drastically decrease the energy requirements of a large university building and as a “living laboratory” in which students and faculty can see how these systems work and interrelate. engineering faculty provided input to the builder on creating state-of-the-art engineering laboratories.
key words: smart building, living laboratory, sensors, leed-certified, power analysis
1. introduction
the development of smart buildings has gained importance as innovative techniques are created to optimize the operation of a building in order to reduce expenses and save energy. recent research deals with various approaches and related technologies in designing smart buildings [1-6]. to support research, our educational smart building is outfitted with hundreds of different sensors that record everything from the temperature of the cold water entering and leaving the building, to the amount of electricity generated by the solar panels on the roof and walkways, to the level of co2 in a lab at a given time. the paper provides an overview of the various innovative systems implemented in our new “smart, green building,” shown in figure 1, and highlights the various sensors and data available for analyzing these systems, both in real time and by storing the data and processing it through data mining routines. 
several research projects as part of the “living laboratory” are described.
received june 25, 2015
corresponding author: borko furht, department of computer & electrical engineering and computer science, florida atlantic university, boca raton, florida, usa (e-mail: bfurht@fau.edu)
1.1. what is leed
the leed (leadership in energy and environmental design) green building certification program is a voluntary, consensus-based rating system for buildings designed, constructed and operated for improved environmental and human health performance. leed addresses all building types and emphasizes state-of-the-art strategies in five areas:
1. sustainable site development
2. water savings
3. energy efficiency
4. materials and resources selection, and
5. indoor environmental quality.
points are attempted in each of these five areas in order to achieve a silver, gold, or platinum level of certification, based on the type of project. when the building was completed in 2011, a minimum of 52 points was required to achieve platinum level certification. engineering east achieved 55 points. the leed scorecard for the building is given at: http://www.eng.fau.edu/pdf/green_bldg_scorecard071411.pdf.
fig. 1 leed platinum-certified engineering building at fau
2. main building subsystems and their design
innovative technologies were implemented in the following subsystems:
• hvac system (heating, ventilation & air conditioning),
• cloud computing system and its network control,
• power generation system and its control.
the mechanical equipment and sensors, installed on the 1st floor, are shown in figure 2.
fig. 
2 mechanical equipment, pumps, piping, and sensors
the building is equipped with hundreds of sensors that measure and collect various parameters and display them in real time on a dashboard, which is accessible through a web-based application called devicewise, created by ils technology. the devicewise system periodically captures sensor data from the building’s electrical, computing, and air conditioning systems and stores the information in a database. the system then provides a web-based application that displays the summarized information in an energy dashboard accessible from any internet browser. devicewise also provides an api that allows other programs to access the data stored in the database for extraction and reporting outside of the energy dashboard. figure 3 shows the dashboard indicator with a menu of views. the selected view is the 1st floor temperature.
fig. 3 dashboard indicators: on the left is the menu of various views including hvac network, power, and water systems.
2.1. hvac system
unlike traditional cooling systems that use air conditioners or heat pumps to remove hot air from a building and replace it with cooled air, the system in this building does the opposite. since on most days in florida there is a need to cool buildings rather than heat them, a campus-wide chilled water system delivers cold water throughout the campus. our building uses an innovative technique to temper the chilled water to reduce humidity, and other systems to heat the building if needed.
there are three chilled water tertiary pumps in the building to circulate the chilled water. two pumps operate in parallel continuously at 50% capacity. the third pump is normally turned off and acts as a back-up to the first two pumps. on a periodic basis, the active pumps are cycled. 
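the rotation described above can be sketched in a few lines of python. this is a minimal illustration and not the building's actual control logic: the pump names and the round-robin policy are assumptions, since the text only states that two of the three pumps are active at a time and that the active pumps are cycled periodically.

```python
# hypothetical pump identifiers; the paper does not name the pumps
PUMPS = ("P1", "P2", "P3")


def pump_roles(period_index, pumps=PUMPS):
    """Return (active_pumps, standby_pump) for a given rotation period.

    Assumed policy: the standby role moves round-robin through the pumps,
    so every pump is idle once every len(pumps) periods.
    """
    standby = pumps[period_index % len(pumps)]
    active = tuple(p for p in pumps if p != standby)
    return active, standby


if __name__ == "__main__":
    for period in range(4):
        active, standby = pump_roles(period)
        print(f"period {period}: active={active} (50% each), standby={standby}")
```

with this policy each pump accumulates roughly the same running hours, which is one common reason for cycling redundant pumps.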
as the chilled water arrives in the building some of it is first put through a heat exchanger that increases the temperature of the water by around 10 degrees fahrenheit. the rest of the water is then piped into the chiller beams in the building and sent to the roof to run through the coils of the air handler units. many sensors are installed throughout the system to ensure that all components are working properly. these sensors measure the supply and return temperatures at different locations in the building, the output flow and status of the three pumps, the differential, and the status of the chilled water control valve, as shown in figure 4.
fig. 4 chilled water system and related sensors
the heating hot water system, used to heat the building in winter and dehumidify it in summer, comprises three different water circuits: a well water system, a source water system and a hot water system. the well water system pulls water from a well which maintains a constant water temperature of around 78 degrees fahrenheit. this water is then run through a heat exchanger between the well water system and the source water system. depending upon the temperature of the source water, the heat exchanger either transfers heat energy from the source system to the well system or from the well system to the source system. the source water system runs the water through pipes linked into the computer server room cooling system and absorbs the heat energy from the it equipment. the water heated by the servers is passed through another heat exchanger to the hot water system on the other side. the hot water system then absorbs the heat energy from the source water system. 
if additional hot water is required for dehumidification or heating above what is generated by the computer room servers, the source water may also run through one to three heat pumps that extract additional heat from the source water and transfer that heat energy to the hot water system. the resulting hot water is then sent through the piping system to each floor. there is one additional heat pump used as a back-up, and the order in which the heat pumps are used (first, second, third) is rotated on a weekly basis. sensors record the temperature of the well water, the temperature before and after the first heat exchanger, the temperature after absorbing the energy from the servers, and the temperature at other spots throughout the building, as shown in figure 5. sensors also record the status and operating statistics for the various pumps and heat pumps used in the system.
fig. 5 heating hot water system and related sensors
the high temperature chilled water component for air conditioning uses a heat exchanger to transfer heat energy from a separate water flow to the chilled water system. this results in the separate system maintaining its water temperature at approximately 10 °f above the chilled water temperature (approximately 55 °f). the increase in temperature is necessary to reduce the possibility of condensation when the water is run through pipes directly above certain locations. this water is then circulated in separate pipes throughout the building. the system relies on two pumps, with one pump running at a time and the other acting as a back-up. the pumps' roles are swapped each week. the main sensors used by the high temperature chilled water system include the status and output of the two pumps, and the temperature of the water on entering, after the initial heat exchange, and before the final heat exchange.
2.2. engineering server room and cloud computing system
the server room is set up using a “room within a room” (or “hot aisle”) configuration. 
the room itself is a 600 square foot open space cooled by fans blowing over the tempered cold water system. the inner room comprises 14 server racks, 4 computer room cooling units and an uninterruptible power conditioning system. the racks and equipment are installed so that they form an enclosed “hot aisle” in about 300 square feet of space.
fig. 6 (a) the server room consists of a private cloud computing system. (b) computer laboratory connected to the cloud computer using thin clients.
the four computer room cooling units supply additional cooling using a gas to liquid refrigerant configuration. the units remove heat from the “hot aisle” and pass that heat energy through a heat exchanger to the building heating system. the resulting cooler air is blown into the outer room, further cooling that space. server and equipment fans then pull in this cooled air and blow warmer air back into the “hot aisle”.
the server room is equipped with different types of sensors. one set of sensors captures power data from the four computer room cooling units, including the total amperage for the supply fans and compressors. several lan-connected temperature and humidity sensors are deployed throughout both the inside room and the outside room and provide real-time access to that information. another set of sensors monitors the lan traffic flow to and from the servers and the power used (in kilowatts) by the servers.
the computer system is architected as a private cloud computing system. two computer laboratories and all computers in the building use cloud computing technology to run software and access data stored in the cloud computing system (fig. 6). the cloud computer system consists of 14 blade computers, and the network traffic is measured and controlled in real time, as illustrated in figure 7.
fig. 7 network traffic measurements: cumulative inbound transfer
2.3. 
building power generation and control
the building generates approximately 4% of its power from three arrays of solar photovoltaic panels. one set of 96 panels is installed on the south-east facing roof of the davinci conference center. two other arrays are installed over the north-south (32 panels) and east-west (48 panels) walkways around the building (fig. 8). the panels are all rated at 282 watts each, which results in a maximum output from all 176 panels in direct sunlight of around 50 kw. the power generated by the solar arrays is routed through a central power conversion unit located in the server room, where the direct current from the arrays is converted to alternating current and added to the building’s power grid. the converter provides real-time reporting of the watts passed through to the grid.
fig. 8 solar photovoltaic cells for power generation
the overall utilization of electricity within the building is monitored by a set of sensors. the utilization is divided into the following categories so that the changes to each system can be analyzed:
• mechanical equipment – power used to run the air handler units, pumps, heat pumps, dampers, fans and any other equipment involved in the air conditioning of the building.
• lights – power used to light the building.
• receptacle – power consumed by any devices plugged into wall receptacles in the building.
• kitchen – power consumed by the equipment in the kitchen.
• power consumed by the uninterruptible power supply systems to charge back-up batteries.
• emergency power, and
• solar power generated by photovoltaic panels.
figure 9 illustrates the cumulative power usage by various subsystems, while figure 10 shows the real-time network measurements for separate chassis.
fig. 9 cumulative power usage by various subsystems
the lights in the building are controlled through a system of linked sensors and switches provided by encelium technologies. 
the lighting system includes both occupied and unoccupied modes for the main hallway lighting, along with overrides based on building occupancy. it relies on sensors and switches installed throughout the building that determine room occupancy by motion detection and the required illumination based on ambient light. the switches allow room occupants to temporarily override normal room lighting.
fig. 10 real-time network measurements for separate chassis.
the system uses a light management application from encelium named polaris 3d to optimize the power required for room illumination. the system receives inputs from the various sensors and switches throughout the building, combines that information with configurable lighting parameters, then determines the best lighting for each room and each area. it also records lighting and occupancy information for further analysis and research.
3. living laboratory: research projects
3.1. alerting and monitoring system
as part of the nsf center project, we worked with aware technologies and their process data monitor (pdm) system [7], which is an alerting system that uses data mining techniques to categorize sensor data into similar clusters of information. once these clusters are identified, users can determine whether the clusters represent normal or abnormal running conditions for the sensor data used in creating the cluster. the system then keeps track of how many times, and when, each cluster is computed, providing an analytical tool to assist in optimization. it also detects when a computed cluster falls outside of the normal operating parameters and reports those anomalies through emails. pdm uses a tool named xlreporter to extract information from building sensors and reformat the information into xml files that are then processed by the system. 
we also developed a data warehouse that stores information from several different sensor systems including devicewise, standalone wireless and wired sensors, pdm calculated clusters, and weather stations. the collected weather data comes from a link to the weatherbug api [8], which pulls meteorological information every 15 minutes including temperature, humidity, wind speed and direction, air pressure, rain amount and light levels. the system provides a set of utility programs that extract the data from the sensors, store it in the fau “green” database, then summarize and export the information in a variety of different formats for use by other tools such as weka and excel.
these systems have been used in several preliminary studies to help validate the data being collected and determine possible future research opportunities. these studies included:
• determining correlation between photovoltaic energy generation and weather conditions,
• calculating energy flow between the different components of the air conditioning systems,
• categorizing cluster data from the pdm system, and
• determining room occupancy based on room co2 levels.
3.2. building power analysis
in this section we present a summary of the results we obtained from analyzing the performance of various building systems.
photovoltaic power system
we analyzed the efficiency of the solar panels by tracking the power generated over a period of one year. figure 11 shows the solar energy generated per day (in kwh, right y axis) in comparison with the total energy consumed by the building systems per day (in kwh, left y axis), excluding the energy used by it equipment in the data center.
fig. 11 photovoltaic energy generated and the total energy used by building systems, including mechanical, receptacles, kitchen, lighting, and standby power. 
we noticed a high variation in the solar power generated; this is due mainly to the day-to-day variation in cloud coverage. the period between march and may had very clear skies, with almost no rain and hardly any clouds. conversely, at the beginning of 2013 there was a period of high cloud coverage which, combined with the shorter daytime, reduced solar energy generation. the total energy consumed by building systems, shown in orange, depends on building occupancy, with lower consumption during school breaks in march, june-august, thanksgiving (end of november), and the winter break. a summary of the energy statistics is listed in table 1.
table 1 summary statistics for the solar energy produced and the total energy consumed during a 1-year period.
           photo energy per day (kwh)   total energy per day (kwh)   photo / total %
maximum    227.05                       3720.83                      7.60%
minimum    16.53                        2575.42                      0.60%
average    134.55                       3030.52                      4.45%
stdev      45.84                        223.83                       1.54%
the solar energy produced is on average 4.45% of the total energy used by building systems and has a high standard deviation – 1/3 of the mean. 
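as a quick consistency check on table 1, the ratios quoted in the text can be recomputed from the tabulated averages. the short python sketch below uses only values printed in table 1; the variable names are illustrative. note that the ratio of the two averages (about 4.44%) need not equal the reported 4.45%, which is presumably the average of the daily ratios.

```python
# values copied from table 1 (kWh per day)
photo_avg = 134.55
photo_std = 45.84
total_avg = 3030.52

# coefficient of variation of the daily solar energy ("1/3 of the mean")
cv = photo_std / photo_avg

# share of the building-system energy covered by solar, from the averages
share = photo_avg / total_avg

print(f"coefficient of variation: {cv:.3f}")
print(f"solar share (ratio of averages): {share:.2%}")
```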
we built a predictive model with the weka tool for the solar power, with the time of day, outside temperature, light-level %, humidity %, and rain as attributes. a reptree decision tree algorithm achieves the lowest relative absolute error of 19.26% among all alternatives, and a mean absolute error of 1.38 (kw), with a correlation coefficient of 93.9%. the solar power prediction error is caused by measuring the light level from a meteorological station located 4 km from the building; passing clouds cause a variable delay in the measured light levels. a k-means clustering algorithm applied to the same photovoltaic data model yields the clusters seen in figure 12. the relation between the light value and the solar power generated is disturbed by the aforementioned measurement delay and by the orientations of the solar panels, which don't match the normal orientation of the light sensor.
fig. 12 dependence of the photovoltaic power (vertical axis) on the light level (horizontal axis). instances are color-coded according to classes determined by the k-means clustering algorithm.
prediction models for receptacle power
receptacle power in the building is used by anything plugged into a power outlet. this depends in part on the building occupancy, as office equipment (e.g. laptops) is a major consumer. figure 13 shows the dependence of the receptacle power (vertical axis, in the 16.6-26.4 kw interval) on the time of day (0-2400), as measured in september 2013. the power rises sharply after 8am, peaks at 1pm, then drops gradually after 4pm until 11pm. the points in brown represent measurements taken during the weekend, with a lower power value. the weekend peak is 19 kw, the weekday peak (on wednesday at 1pm) is 26.4 kw, while at night the power drops to 16.9 kw. the chart also partitions the measurements into 7 clusters determined by the k-means algorithm. 
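for readers unfamiliar with the clustering step, the sketch below shows plain k-means (lloyd's algorithm) on (time of day, power) pairs. it is not the weka implementation used in the paper: the sample points are invented for illustration, k is set to 2 instead of 7 to keep the example small, the deterministic initialization is an assumption, and in practice the two features would normally be rescaled before computing distances.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; initial centroids are the first k points."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # assignment step: nearest centroid by Euclidean distance
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# invented (hour of day, receptacle power in kW) samples, not measured data
samples = [(2.0, 17.0), (13.0, 26.0), (3.0, 16.9), (12.0, 25.5),
           (4.0, 17.2), (14.0, 26.4)]

centroids, labels = kmeans(samples, k=2)
print(labels)
print(centroids)
```

on these toy samples the night-time and midday points fall into separate clusters, which mirrors the low-power and peak-power groups visible in the chart.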
an m5 decision tree classifier computed with weka has a relative absolute error of 36.15%, a mean absolute error of 0.6837, and a correlation coefficient of 92.31%. for training and evaluation in all experiments we used 10-fold cross validation and we searched for the lowest error.
fig. 13 weka screenshot showing the dependence of the receptacle power on the time of day (0 – 2400) and 7 clusters computed using the k-means algorithm.
data center power
the data center consists of two racks located in a “hot aisle” enclosure, with separate uninterruptible power supply units used by computing blades, discrete pcs, network attached storage, and networking equipment. the total power used by the data center includes that used by the four crac (computer room air conditioning) units that cool the air inside the hot aisle. the total data center power has grown in small increments due to additions and upgrades to equipment, from 277 kw in 08/2012 to 368 kw in 10/2013. the total power has a standard deviation during a day and during a week of about 1.1% of the average for the corresponding period. the hot water circuit that extracts heat from the hot aisle using heat pumps achieves a high temperature of 50°c and a maximum differential of 20°c, proving effective in reusing waste heat for other building systems.
4. conclusions
strong instrumentation in the new leed platinum engineering building opens up a multidimensional view of the inner workings of its hvac and power systems. a variety of sensors allows a detailed analysis of the building system performance and the data center power utilization. however, it is equally important to consider external variables, such as weather, building occupancy, and school schedule, to get a more accurate picture.
our analysis found that the solar power generated covers a maximum of about 7.6%, and 4.45% on average, of the total building-related energy consumption and that it varies highly with cloud coverage. another interesting observation is that the total power used by the building (122 kw on average for 09/2013) is 2.79 times smaller than the power consumed by the data center (average of 341 kw). still, this figure does not include most of the power needed for cooling the building, as it is spent by the campus chilled water plant. in the future we will conduct more analysis aiming to estimate power savings from cooling in the data center by raising the hot aisle temperature.
acknowledgement: the paper is a part of the research done within the project funded by the nsf industry/university cooperative research center for advanced knowledge enablement, 2009-2020.
references
[1] d. sciuto and a.a. nacci, “on how to design smart energy-efficient buildings,” in proceedings of the 12th ieee international conference on embedded and ubiquitous computing, milano, italy, 2014, pp. 205-208.
[2] “special section on intelligent buildings and home energy management in a smart grid environment,” ieee transactions on smart grid, vol. 3, no. 4, december 2012.
[3] o. evangelatos, k. samarasinghe, and j. rolim, “evaluating design approaches for smart building systems,” in proceedings of the 9th international conference on mobile adhoc and sensor systems, las vegas, nevada, 2012, pp. 1-7.
[4] y. sun, t-y. wu, g. zhao, and m. guizani, “efficient rule engine for smart building systems,” ieee transactions on computers, vol. 64, no. 6, pp. 1658-1669, june 2015.
[5] r. fantacci, t. pecorella, r. viti, c. carlini, and p. obino, “enabling technologies for smart building, what’s missing?”, in proceedings of the aeit conference, mondello, italy, 2013, pp. 1-5.
[6] s. tadokoro et al., “smart building technology,” ieee robotics & automation magazine, vol. 21, no. 2, 2014, pp. 18-20.
[7] aware technologies, process data monitor (pdm), http://awaretechnology.com/.
[8] weatherbug api, http://weather.weatherbug.com/desktop-weather/api-documents.html.

facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 13-23 doi: 10.2298/fuee1401013i
review of advanced igbt compact models dedicated to circuit simulation
petar igić 1, nebojša janković 2
1 electronic system design centre, college of engineering, swansea university, singleton park, swansea sa2 8pp, united kingdom
2 department of microelectronics, faculty of electronics engineering, university of niš, serbia
abstract. the paper aims to review the research area of igbt compact modelling and to introduce different device models. the models are separated into two groups, one that solves the ambipolar diffusion equation (ade) and one that does not. both types of compact models have been successfully used in the past for power electronic circuit design.
key words: igbt, compact, model, power, inverter, circuit
1. introduction
insulated gate bipolar transistors (igbts) are the devices of choice in modern power converter systems targeting medium to high voltage and current applications, such as hybrid or electric vehicles [1], [2]. during the early design stages of power circuitry, one could consider an igbt to be a binary on-off switch, thus achieving very fast simulation of the converter operation. however, this modelling approach cannot be used to analyze some key aspects of the device and converter performance, such as heat dissipation, a very important design parameter especially during operation at high switching frequencies [3]-[5]. obviously, this will not lead to robust equipment design, as it does not provide any information regarding switch failure mechanisms. to overcome all the above issues, one could develop and use igbt models based on the full internal physics of the device. 
these would typically be 2d or 3d finite-element (fe) models developed and run in one of the commercially available simulation tools. this modelling approach provides designers with detailed knowledge of the igbt devices, but it requires very long simulation times, and it is numerically prohibitive if one would like to study complex circuits containing multiple power devices and many switching events [6].

the compact modelling approach is placed between these two extremes [7]-[27]. compact models are lower-complexity, yet fully physically based and very accurate, models of the power devices dedicated to circuit simulation. this physical modelling approach could be based on certain mathematical simplifications of the fundamental semiconductor charge transport equations, for example [9], [20], [21], [24]. in order to develop an igbt model that correctly describes its static and dynamic behaviour, the main challenge is to incorporate conductivity modulation and non-quasistatic charge storage effects into the device model [22], [23].

received december 18, 2013
corresponding author: petar igic, electronic system design centre, college of engineering, swansea university, singleton park, swansea sa2 8pp, united kingdom (e-mail: p.igic@swansea.ac.uk)

the absence of an industry-accepted igbt model and the pronounced industry need for a more accurate igbt compact model have triggered very intensive research in this area for more than a decade. the distinct challenge in developing an igbt compact model for circuit simulation lies in the fact that the model needs to satisfy some conflicting requirements. it needs to provide high quantitative accuracy, short cpu time, and physical, yet easy to determine, model parameters. as a result, different igbt compact models have been developed and presented in the literature, some of them suitable for long-time inverter simulations (minutes) [15].
the aim of this paper is to review the research area of igbt compact modelling and to introduce different models, such as igbt models based on solutions of the ambipolar diffusion equation (ade) [18]-[24] and the ones which do not solve the ade, typically physics-based sub-circuit models [14], [16], [17], [25]-[27].

2. ade solution based models

to describe the igbt's static and dynamic behaviour, the incorporation of conductivity modulation and non-quasistatic charge storage effects into the device model is vital [7], [22]. when the excess carrier density exceeds the igbt's n-base doping level by several orders of magnitude within the carrier storage region, the assumption that the excess electron concentration, $n$, and excess hole concentration, $p$, are equal is valid [21]. then, the carrier transport is determined by the ambipolar diffusion equation (ade):

$$D \frac{\partial^2 p(x,t)}{\partial x^2} = \frac{p(x,t)}{\tau} + \frac{\partial p(x,t)}{\partial t} \qquad (1)$$

where $D$ represents the ambipolar diffusion constant and $\tau$ stands for the ambipolar carrier lifetime. the boundary conditions for the above equation are determined by the currents at the left ($x_L$) and right ($x_R$) ends of the carrier storage region. at the left end of the carrier storage region, the electron and hole currents are given by:

$$I_{electron}(x_L) = I_{nL} = A n q \mu_n E(x_L) + A q D_n \left.\frac{\partial n}{\partial x}\right|_{x_L} \qquad (2)$$

$$I_{hole}(x_L) = I_{pL} = A p q \mu_p E(x_L) - A q D_p \left.\frac{\partial p}{\partial x}\right|_{x_L}. \qquad (3)$$

in the above equations, $q$ stands for the elementary (electron) charge, $A$ is the cross-sectional area of the carrier storage region, $D_n$ and $D_p$ stand for the electron and hole diffusion constants respectively, $\mu_n$ represents the electron mobility, $\mu_p$ is the mobility of the holes, and $E$ stands for the electric field.
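as an illustration of equation (1), the following is a minimal python sketch (using numpy) of its steady-state form, obtained by setting the time derivative to zero. it solves $D\,p'' = p/\tau$ by central finite differences and compares the result with the closed-form hyperbolic solution, which is a sum of two exponentials with decay length $\sqrt{D\tau}$. the function names, the numerical values of $D$, $\tau$ and the region width, and the use of fixed end concentrations instead of the current-based boundary conditions are all assumptions made for illustration; they are not taken from the reviewed models.

```python
import numpy as np


def steady_state_ade(p_left, p_right, width, D, tau, n=201):
    """Finite-difference solution of D * p'' = p / tau on [0, width].

    p_left and p_right are fixed carrier concentrations at the two ends
    (an assumed simplification of the current-based boundary conditions).
    """
    x = np.linspace(0.0, width, n)
    h = x[1] - x[0]
    L2 = D * tau  # squared ambipolar diffusion length
    # Interior rows: p[i-1] - (2 + h^2 / L^2) * p[i] + p[i+1] = 0
    M = (np.diag(-(2.0 + h * h / L2) * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    rhs = np.zeros(n)
    # Boundary rows pin the end values.
    M[0, :] = 0.0
    M[0, 0] = 1.0
    rhs[0] = p_left
    M[-1, :] = 0.0
    M[-1, -1] = 1.0
    rhs[-1] = p_right
    return x, np.linalg.solve(M, rhs)


def analytic_profile(x, p_left, p_right, width, D, tau):
    """Closed-form steady-state profile for the same fixed end values."""
    L = np.sqrt(D * tau)
    return (p_left * np.sinh((width - x) / L)
            + p_right * np.sinh(x / L)) / np.sinh(width / L)


# Illustrative (assumed) values in cm, s and cm^-3.
D, tau, width = 15.0, 1e-6, 1e-2
x, p_num = steady_state_ade(1e17, 1e16, width, D, tau)
p_ref = analytic_profile(x, 1e17, 1e16, width, D, tau)
max_rel_err = np.max(np.abs(p_num - p_ref) / p_ref)
print(f"max relative error: {max_rel_err:.2e}")
```

the dense matrix is used only to keep the sketch short; a tridiagonal solver would be the natural choice for a finer grid.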
dividing equation (2) by $D_n$ and equation (3) by $D_p$ and then subtracting (3) from (2) (under the conditions $n \approx p$ and $\mu_n/D_n = \mu_p/D_p$) gives a derivative boundary condition on $p$ for the ade at the left end of the carrier storage region [18]-[22]:

$$\left.\frac{\partial p}{\partial x}\right|_{x_L} = \frac{1}{2qA}\left(\frac{I_{nL}}{D_n} - \frac{I_{pL}}{D_p}\right). \qquad (4)$$

a similar expression is obtained for the right end of the carrier storage region:

$$\left.\frac{\partial p}{\partial x}\right|_{x_R} = \frac{1}{2qA}\left(\frac{I_{nR}}{D_n} - \frac{I_{pR}}{D_p}\right), \qquad (5)$$

where $I_{nR}$ and $I_{pR}$ represent the electron and hole currents respectively at the cathode end (see fig. 1).

fig. 1 schematic representation of the npt igbt structure (a) and bipolar part of the structure (b)

2.1. exponential solution based models

an exponential approximation based solution for this equation has been developed. to model the plasma carrier distribution, a set of exponential shape functions is used [21]. these shape functions are found to model the shape of the plasma correctly, without oscillations in the internal distribution. the slopes of the carrier distribution at the boundaries are also physically correct. in steady-state forward bias operation the plasma carrier concentration has a distribution of catenary form, requiring just two exponential basis functions, giving [21]:

$$p = A e^{x/L} + B e^{-x/L} \qquad (6)$$

where $L$ is the diffusion length. in transient operation, more complex profiles can be approximated using a number of exponential basis functions with a range of decay length parameters, shorter than the steady-state ones. the models reported in [21], [22] actually use up to seven exponential basis functions to model the plasma distribution during transient operation. to implement the model and make it functional, one needs to determine the forward junction voltage between the p+ emitter and the n-base (see fig.
1), the ohmic voltage drop across the plasma region, the depletion voltage, depletion capacitance, and depletion current at the anode end. the depletion current is a small extra current component that exists under high-speed transient conditions, as described in [21], [22]. this model has been used successfully to predict switching characteristics of different commercially available igbt devices; one example is given in fig. 2.

fig. 2 igbt turn-off characteristic: experimental results vs. compact model (anode voltage $V_A$ [V] and anode current $I_A$ [A] plotted against turn-off time)

the downside of this modelling approach is the model complexity [20], the large number of model parameters [19], [22], the difficult model implementation in circuit simulators such as pspice, and the long simulation time. many modern igbt devices include a localised lifetime control (llc) region in order to reduce the current tail during device turn-off and increase the operating frequency [11]. the model discussed above does not have the ability to directly include an llc region; in order to include this feature, it needs some alterations. if, for example, an llc region is inserted between the p-emitter and the arbitrary dashed line shown in fig. 1b, the plasma carrier distribution will need to be modelled separately across the two regions (each having a different lifetime) to the left and right of the arbitrary dashed line. this would introduce another set of equations needed to describe the boundary between the llc region and the rest of the n-base region, thus making the model even more complex.

2.2. fourier series solution based models

a fourier series solution based model has been developed and described in [24]. it is based on research results showing that the diffusion equation can be solved by means of an electrical analogy [23].
the plasma carrier concentration is distributed as a sum of fourier series components in space:

$$p(x,t) = p_0(t) + \sum_{k=1}^{\infty} p_k(t) \cos\left(\frac{k\pi (x - x_1)}{x_2 - x_1}\right) \qquad (7)$$

where $k$ represents the harmonic number and $x_1$, $x_2$ are the boundaries of the carrier storage region. the set of equations resulting from (7) can be represented in the form of two rc lines corresponding to the even and odd values of $k$. the rc lines are driven by currents defined by the boundary conditions, as described in [24].

fig. 3 analogue solution to the ade [24]

fig. 3 shows the analogue solution to the ade with fixed or mobile boundaries. in fig. 3, $Q_s$ represents the total stored carrier charge, $p_0, \dots, p_k$ stand for the fourier series coefficients, $W$ is the width of the n-base region, $x_L$ and $x_R$ are the positions of the left and right boundaries of the plasma region (see fig. 1b), $p_{xL}$ and $p_{xR}$ correspond to the excess carrier concentration at $x_L$ and $x_R$ respectively, and the currents $I_{pL,R}$ and $I_{nL,R}$ are as depicted in fig. 1a. the model could be implemented in any general-purpose simulation software having non-linear elements and variable parameters [24].

3. physics-based sub-circuit compact models

a common feature of all sub-circuit compact models is that they do not try to solve the ambipolar diffusion equation in order to reproduce measurement data or predict device characteristics. recently, the hisim-igbt compact model has been developed and presented [25]-[27]. the model is based on the consistency of the potential distribution within the igbt device, considering in great detail the mosfet surface potentials and the bjt junction potentials, as described in [25], [27]. the model was originally developed for trench igbt devices. the hisim-igbt equivalent circuit is shown in fig. 4. the igbt's mosfet part is described with a conventional model, and the main model development effort has been put into extending the bjt shown in fig. 4, since the igbt output current is governed by bipolar transistor theory.
in this model, the igbt characteristics are determined by three parameters: the trench-bottom mosfet gate charge, $Q_{tb}$, the base resistance, $R_b$, and the nqs igbt base charge model, $Q_b$.

fig. 4 hisim-igbt equivalent circuit [27]

another popular igbt model has been presented by jankovic et al. in [14], [16], [17]. in [14], a physics-based igbt sub-circuit model which successfully includes the effects of localised lifetime control (llc) on device electrical performance is described. in particular, the model describes non-punch-through igbts with different locations of the llc region. in what follows, the model implementation in spice is described, with attention to the modifications performed to include the lifetime control effects. the equivalent sub-circuit of the igbt model implemented in spice is shown in fig. 5.

fig. 5 llc igbt equivalent circuit

it includes an n-channel mosfet, a wide-base pnp bipolar transistor (bjt), the voltage-dependent base resistor $R_{bb}$, the p-n junction capacitances $C_{bc}$ and $C_{be}$, the gate-source overlap capacitance $C_{gs}$, and the drain-gate overlap capacitance $C_{gd}$ (the gate-overlap capacitor $C_{ox}$ in series with the gate-induced depletion capacitance $C_d$). the n-channel mosfet part of the igbt is modelled using a spice level 5 model. the pnp bjt has low efficiency and its operation is fully affected by the llc technique. fig. 6a shows the schematic of the pnp bjt circuit model, consisting of two voltage-controlled current sources ($I_e$ and $I_c$) and the junction capacitances $C_{be}$ and $C_{cb}$. the current sources mirror the input/output currents of the separately developed sub-circuit shown in fig. 6b. the carrier transport through the emitter and the base quasi-neutral regions (qnrs) is described with two equivalent lossy transmission lines (tls) consisting of identical rc-cells shown in fig. 6c.
the input voltage generators $f(U_{be})$ and $f(U_{cb})$ perform the voltage transformations $\{\exp(U_{be}/V_t) - 1\}$ and $\{\exp(U_{cb}/V_t) - 1\}$, respectively. the rc-cell elements are a non-linear conductance, capacitance, resistance and load impedance, denoted as $G_k$, $C_k$, $R_k$ and $Z_L$ in fig. 6. their values are calculated by the following formulas [21]:

[equation (8): expressions for $G_k$, $C_k$, $R_k$ and $Z_L$ in terms of the parameters $c_1$-$c_9$ and the node voltages $U_{2k-1}$, $U_{2k}$ and $U_{2k+1}$] (8)

where $U_{2k-1}$, $U_{2k+1}$ and $U_{2k}$ are the input, the output and the middle node voltage, respectively, in the $k$-th rc-cell.

fig. 6 llc igbt model details, panels (a), (b) and (c)

the parameters $c_1$-$c_9$ are related to the physical and technology parameters of the emitter or the base qnrs as:

[equation (9): definitions of $c_1$-$c_9$ in terms of $q$, $N_d$, $n_{ie}$, $\mu_0$, $\tau_0$, $v_{sat}$, $V_t$, $C_n$, $C_p$ and $\Delta w$] (9)

where $N_d$ is the doping concentration, $\mu_0$ is the doping-dependent low-field mobility, $v_{sat}$ represents the drift saturation velocity, $\tau_0$ stands for the low-injection-level doping-dependent minority carrier lifetime, $n_{ie}$ is the effective intrinsic carrier concentration incorporating band-gap-narrowing effects, and $C_n$, $C_p$ are the auger recombination constants. the parameter $\Delta w$ is the physical width of a single rc-cell, which is obtained by dividing the zero-bias qnr width $W$ by the chosen number $N$ of rc-cells ($\Delta w = W/N$). in [14], the emitter qnr is represented with three rc-cells. since the llc substantially decreases $\tau_0$ of a particular device area, it follows from eqs. (8) and (9) that the rc-cells of the controlled recombination region must have a different $G_k$ element. this is illustrated in fig.
6c, where the rc-cell of the controlled region is shown separately with a different conductance $G(\tau)$. note that the bjt with the first (from left to right) rc-cell shaded, shown in fig. 6b, corresponds to the location of the llc region within the igbt. the model, as described above, has been used successfully for the prediction of inverter circuit power losses [16] and also to investigate igbt tail current characteristics at different temperatures, as shown in fig. 7.

fig. 7 simulated and measured anode tail current and anode voltage of a pt igbt during device turn-off at 25 °C, 75 °C and 125 °C

4. electro-thermal (et) modelling strategy

a thermal compact model of an igbt is as important as its electrical counterpart for accurately predicting circuit performance [3]-[5]. the work presented in [3] describes an et modelling strategy that has been widely accepted by the compact modelling research community and successfully applied since. it can be described as follows. by adding an extra node, the thermal node, to the electrical compact model of the igbt device, an electro-thermal (et) model can be formulated. this thermal node carries information regarding the junction temperature of the device, $T_j$, and it represents a connection between the active devices and the rest of the circuit thermal network [3]. this is schematically represented in fig. 8. a structure diagram of the et compact device model, which shows the interaction between the thermal and electrical networks through the electrical and thermal nodes, is shown in fig. 9. as can be seen from fig. 9, the instantaneous value of the device temperature estimated by the thermal network is used for the calculation of the temperature-dependent model parameters and temperature-dependent silicon properties.
then, these temperature-dependent values are used by the et compact device models to calculate the instantaneous electrical characteristics as well as the instantaneous dissipated power. finally, the dissipated power is used as an input parameter by the thermal network, and the device electrical characteristics are transferred to the electrical network.

fig. 8 igbt et compact model: electrical contacts (g, a, k) as well as thermal node ($T_j$) are shown

fig. 9 structure diagram of the et compact model: interaction with the electrical and thermal network is shown

the thermal parts of the compact models are represented using a thermal rc network based on an electrical analogy [4]: thermal resistance is represented by an electrical resistance, thermal capacitance by an electrical capacitance, and dissipated power by a current source [29]. either foster or cauer rc networks can be used for this purpose [28], [29]. since the foster network is not directly suitable for heat-flow path identification (because of the node-to-node heat capacitances), the cauer rc network is the preferred choice for thermal device characterisation. the cauer network includes only node-to-ground capacitances and it represents a discretised image of the real heat-flow structure. the network elements can be determined by using a deconvolution method for extraction of the rc thermal network parameters from the thermal transient response of the device to a step-function excitation [28]. namely, after applying an abrupt dissipation step to the chip, the time function of the rise of the chip temperature has to be determined. either an experimental method or a 3d finite element model could be employed to obtain these thermal transient responses [29].

5. conclusions

the research area of igbt compact modelling has been reviewed and different device models have been introduced. the models can be separated into two groups: those that solve the ambipolar diffusion equation (ade) and those that do not.
the models based on the ade solution are, one could claim, more physically based, but they are more complex to include in a standard circuit simulator, need longer cpu time, and might have convergence problems when simulating circuits with a larger number of igbts. both types of compact models have been successfully used in the past for power electronic circuit design.

references

[1] r.s. chokhawala, j. catt and b.r. pelly, "gate drive considerations for igbt modules", ieee transactions on industry applications, vol. 31, pp. 603-611, 1995.
[2] p. palmer and a.n. githiari, "the series connection of igbt's with active voltage sharing", ieee transactions on power electronics, vol. 12, pp. 637-644, 1997.
[3] a.r. hefner and d.l. blackburn, "thermal component models for electrothermal network simulation", ieee transactions on components, packaging and manufacturing technology, vol. 17-a, pp. 413-424, 1994.
[4] v. szekely, a. poppe, a. pahi, a. csendes, g. hajas and m. rencz, "electro-thermal and logi-thermal simulations of vlsi designs", ieee transactions on vlsi systems, vol. 5, pp. 258-269, 1997.
[5] h. vinke and c.j. clemens, "compact models for accurate thermal characterisation of electronic parts", ieee transactions on components, packaging and manufacturing technology, vol. 20-a, pp. 411-419, 1997.
[6] s. wunsche, c. class, p. swartz and f. winkler, "electro-thermal circuit simulation using simulator coupling", ieee transactions on vlsi systems, vol. 5, pp. 277-282, 1997.
[7] a.r. hefner, "a dynamic electro-thermal model for the igbt", ieee transactions on industry applications, vol. 30, pp. 394-405, 1994.
[8] p. turkes and j. sigg, "electro-thermal simulation of power electronic systems", microelectronic journal, vol. 29, pp. 785-790, 1998.
[9] r. kraus and h.j. mattausch, "status and trends of power semiconductor device models for circuit simulation", ieee transactions on power electronics, vol. 13, pp. 452-465, 1998.
[10] a. ramamurthy, s. sawant and b.j.
baliga, "modeling the [dv/dt] of the igbt during inductive turn off", ieee transactions on power electronics, vol. 14, pp. 601-606, 1999.
[11] e. napoli, a.g.m. strollo, p. spirito, "numerical analysis of local lifetime control for high-speed low-loss p-i-n diode design", ieee transactions on power electronics, vol. 14, pp. 615-621, 1999.
[12] a. ammous, s. ghedira, b. allard, h. morel, d. renault, "choosing a thermal model for electrothermal simulation of power semiconductor devices", ieee transactions on power electronics, vol. 14, pp. 300-307, 1999.
[13] c.m. tan and k.-j. tseng, "using power diode models for circuit simulations - a comprehensive review", ieee transactions on industrial electronics, vol. 46, pp. 637-645, 1999.
[14] n. jankovic, p. igic and n. sakurai, "compact model of the igbt with localized lifetime control dedicated to power circuit simulations", solid state electronics, vol. 54, pp. 268-274, 2010.
[15] p. igic and z. zhou, "high-speed electro-thermal modelling of a three-phase igbt inverter power module", international journal of electronics, vol. 97, pp. 195-205, 2010.
[16] n. jankovic, z. zhou, s. batcup and p. igic, "an advanced physics-based sub-circuit model of pt igbt", international journal of electronics, vol. 96, pp. 767-779, 2009.
[17] n. jankovic, t. pesic and p. igic, "all injection level power pin diode model including temperature dependence", solid-state electronics, vol. 51, pp. 719-725, 2007.
[18] a.j. forsyth, s.y. yang, p.a. mawby, p. igic, "measurement and modelling of power electronic devices at cryogenic temperatures", ieee proc. on circuits, devices and systems, vol. 153, pp. 407-415, 2006.
[19] p. igic, p.a. mawby and m.s. towers, "physically based 2d compact model for power bipolar devices", international journal of numerical modelling – electronic networks, devices and fields, vol. 17, pp. 397-405, 2004.
[20] p. igic, p.a. mawby and m.s.
towers, "a 2d physically based compact model for advanced power bipolar devices", elsevier's microelectronics journal, vol. 35, pp. 591-594, 2004.
[21] p. igic, p.a. mawby, m.s. towers and s. batcup, "a new physically based pin diode compact model for circuit modelling applications", iee proc. on circ., devices and sys., vol. 149, pp. 257-263, 2002.
[22] p. igic, p.a. mawby, m.s. towers, w. jamal and s. batcup, "investigation of the power dissipation during igbt turn-off using a new physics-based igbt compact model", microelectronics and reliability, vol. 42, pp. 1045-1052, 2002.
[23] p. gillet, m. kallala, j-l. massol and p. leturcq, "analogue solution of the ambipolar diffusion equation", c.r. acad. sc. paris, t. 321, serie ii-b, pp. 53-59, 1995.
[24] p. leturcq, j-l. debrie and m.o. berraies, "a distributed model of igbts for circuit simulation", in the proc. of epe'97, pp. 1.494-1.501, 1997.
[25] m. miyake, a. ohashi, m. yokomichi, h. masuoka, t. kajiwara, n. sadachika, u. feldmann, h.j. mattausch, m. miura-mattausch, t. kojima, t. shoji and y. nishibe, "a consistently potential distribution oriented compact igbt model", in the proc. of power electronics specialists conference, pp. 998-1003, 2008.
[26] d. navarro, t. sano and y. furui, "a sequential model parameter extraction technique for physics-based igbt compact model", ieee trans. on electron devices, vol. 60, pp. 580-586, 2013.
[27] m. miyake, a. ohashi, m. yokomichi, h. masuoka, t. kajiwara, n. sadachika, u. feldmann, h.j. mattausch, d. navarro, u. feldmann, t. kojima, t. ogawa and t. ueta, "hisim-igbt: a compact si-igbt model for power electronic circuit design", ieee trans. on electron devices, vol. 60, pp. 571-579, 2013.
[28] v. szekely, "identification of rc network by deconvolution: chances and limits", ieee trans. on fundamental theory and applications, vol. 45, pp. 244-258, 1998.
[29] p. igic, p.a. mawby, m.s. towers and s.
batcup, "thermal model of power semiconductor devices for electro-thermal circuit simulations", in proc. 23rd ieee international conference on microelectronics (miel 2002), nis, yugoslavia, vol. 1, pp. 171-174, 2002.
facta universitatis
series: electronics and energetics vol. 27, no 3, september 2014, pp. 425-433
doi: 10.2298/fuee1403425b

relevance of the types and the statistical properties of features in the recognition of basic emotions in speech

milana bojanić, vlado delić, milan sečujski
faculty of technical sciences, university of novi sad, serbia

abstract. due to the advance of speech technologies and their increasing usage in various applications, automatic recognition of emotions in speech represents one of the emerging fields in human-computer interaction. this paper deals with several topics related to automatic emotional speech recognition, most notably with the improvement of recognition accuracy by lowering the dimensionality of the feature space and with the evaluation of the relevance of particular feature types. the research is focused on the classification of emotional speech into five basic emotional classes (anger, joy, fear, sadness and neutral speech) using a recorded corpus of emotional speech in serbian.

key words: emotional speech recognition, acoustic features, basic emotions

received february 10, 2014; received in revised form march 13, 2014
corresponding author: milana bojanić, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21000 novi sad, serbia (milana.bojanic@uns.ac.rs)

1. introduction

basic emotion is a term used in categorical emotion models, among which ekman's concept of six basic emotions is the most prominent one. his theory of basic emotions, which are "psychological universals and constitute a set of basic, evolved functions that are shared by all humans", is supported by experimental findings of cross-culturally recognized emotions from vocal signals and facial expressions [1]. since its beginnings, emotional speech recognition (esr) research has used corpora of acted emotional speech, since those corpora were easy to collect. such corpora usually contained several basic emotions reproduced by actors [2]. there are apparently reasonable objections to acted speech corpora, saying that acting emotions is not the same as producing 'spontaneous' emotions and pointing out that within human-machine interaction emotion-related states are much more common than prototypical full-blown emotions (such as those represented in acted speech corpora) [3]. still, recent research has shown that the relationships between acted emotions and their acoustic correlates and between real-life emotions and their acoustic correlates do not necessarily contradict each other [4].

a more flexible solution to the problem of the representation of emotional states is to represent them as points in a continuous 2d space whose co-ordinates are the activation and evaluation involved in the emotional state [5]. such dimensional models also allow for the mapping of basic emotions into the continuous 2d emotional space [5, 6], thus enabling a broad field of application of the recognition of basic emotions in speech.
the paper summarizes our approach to the recognition of basic emotions in speech, focusing particularly on the improvement of recognition accuracy by lowering the dimensionality of the feature space. additionally, a feature selection procedure has been performed in order to rank the feature types and the statistical functionals used. the presented research has been conducted on a corpus of acted emotional speech in serbian.

the paper is organized as follows. aspects of the proposed approach that are relevant to the recognition of basic emotions, including acoustic modeling, classification scheme and speech corpus, are presented in section 2. in section 3, the theoretical background of feature dimensionality reduction techniques is given and their possible benefits are pointed out. experimental results are shown and discussed in section 4. finally, the conclusions are given in section 5.

2. the proposed approach

2.1 the proposed approach to acoustic modeling

the proposed approach to acoustic modeling is based on the statistical analysis of acoustic feature contours [7, 8] and it is performed in three stages, as shown in fig. 1. the first stage includes the extraction of acoustic features on a frame basis. these features belong to two acoustic feature sets, namely the prosodic and the spectral feature set. in the prosodic feature set, pitch and energy are extracted. as for the spectral feature set, only the first 12 mfccs are taken into account in our analysis, since they correspond to slow changes in the spectrum, i.e., the spectrum envelope. the feature contours which correspond to the pitch contour, energy contour and mfcc contour are, respectively, sequences of short-term pitch, energy and mfcc values extracted on a frame basis.

fig.
1 feature extraction process in three stages relevance of the types and the statistical properties of features in recognition of basic emotions in speech 427 the extracted features are forwarded to the second stage, in which the first derivative of the acoustic features is calculated in order to model the dynamics of speech. the first derivative carries the information about the dynamics of emotional speech, which is useful in emotional speech classification [4]. the third stage of the feature extraction process involves a statistical analysis of the feature contours. the final feature set is obtained from the feature contours by applying so-called static modeling through functionals [9]. in the literature, larger numbers of statistical features are analyzed [10, 11]. our selection of statistical functionals was guided by the principle that chosen statistical features should describe the variations and follow the trend of changes of acoustic features correlated with different types of emotional speech. at the same time, since it was impossible to predict which statistical characteristics would be the most effective, the proposed set of statistical features included 12 features, bearing in mind that if particular information in the feature vector showed to be redundant and aggravating for classification, an efficient subset of features would be extracted using a dimensionality reduction technique. the proposed set of 12 statistical functionals has been chosen from three groups of functionals which are the most frequently used [9]. these groups and their corresponding functionals are [7]: 1. the first four moments (mean, standard deviation, skewness and kurtosis), 2. extrema and their positions (minimum, maximum, range, relative position of minimum and relative position of maximum), 3. 
regression coefficients (the slope and the offset of the linear regression of the contour) and regression error (the mean squared error between the regression curve and the original contour). by applying the proposed procedure, three sets of features have been extracted [7]. the first feature set includes only prosodic features (pitch and energy) and it will be referred to as prosodic feature set (p-fs). the second feature set includes only spectral features (12 mfcc); this set will be referred to as spectral feature set (s-fs). finally, the third feature set includes both prosodic and spectral features, and additionally the voicing probability and the zero crossing rate. for the mentioned 16 features, the first derivative is calculated, and then 12 functionals are applied on all of them, resulting in 384 features extracted for each utterance. the third feature will be referred to as prosodic-spectral feature set (ps-fs). 2.2 classification scheme for the purpose of emotional speech classification, we have considered the linear discriminant classifier (ldc) and the k-nearest neighbours classifier (knn), as they belong to well known and simple classifiers, which have been used by other researchers for this purpose and which have proved to be successful for both acted and spontaneous emotional speech [9]. as for ldc, two classification schemes have been considered. the first one is the linear bayes classifier with the underlying assumption that classes have gaussian densities and equal covariance matrices. the second one is the derivation of linear discriminant functions via the perceptron rule [12]. in the latter case, no assumptions have been made about the underlying class densities. 428 m. bojanić, v. delić, m. 
seĉujski 2.3 emotional speech corpus the research was conducted on the corpus of emotional and attitude expressive speech (gees, according to the serbian acronym), which is the first speech corpus recorded in serbian for the purpose of research on acoustic manifestations of emotions in human speech in the context of speech technology [13]. it contains recordings of acted speech-based emotional expressions corresponding to five basic emotional states: anger, joy, fear, sadness, and neutral, reproduced by six actors (3 female, 3 male). the underlying textual material is emotionally neutral with respect to lexical content and for the purpose of this study a section of the corpus including 30 short and 30 long sentences was used. the reported human recognition accuracy for this corpus is 94.7%. to avoid an imbalance between male and female speakers, an equal portion of the material from each emotional class belonging to each speaker was chosen and a total of 1740 sentences (75 minutes of speech) have been processed. both training and test sets included utterances from all speakers. therefore, these experiments belong to the case of speaker dependent emotion recognition. 3. dimensionality reduction dimensionality reduction can be performed through feature extraction or feature selection. while feature extraction employs a mapping (usually linear) of a given feature space onto a lower dimensional space, creating a feature subset which is a combination of existing features, feature selection involves a selection of a subset from the existing features without any transformation. 3.1 linear discriminant analysis linear discriminant analysis (lda) is a linear feature extraction technique whose goal is the enhancement of the class-discriminatory information in a lower dimensional feature space. fisher‟s lda for a two-class problem is based on a search for a projection that maximizes the ratio of between-class to within-class scatter. 
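For the two-class case, the direction that maximizes this ratio has a well-known closed form: it is proportional to the inverse within-class scatter matrix applied to the difference of the class means. The following is a minimal sketch in plain Python for two-dimensional toy data. The function names and the data points are hypothetical and are not taken from the experiments reported in this paper.

```python
def mean_vector(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(2)]


def scatter_matrix(points, mean):
    """Sum of outer products (x - mean)(x - mean)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def fisher_direction(class_a, class_b):
    """Two-class Fisher direction w = S_W^{-1} (m_a - m_b) for 2-D data."""
    m_a, m_b = mean_vector(class_a), mean_vector(class_b)
    s_a, s_b = scatter_matrix(class_a, m_a), scatter_matrix(class_b, m_b)
    s_w = [[s_a[r][c] + s_b[r][c] for c in range(2)] for r in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    # Multiply the explicit 2x2 inverse of S_W by the mean difference.
    return [
        (s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
        (-s_w[1][0] * diff[0] + s_w[0][0] * diff[1]) / det,
    ]


def project(points, w):
    """Scalar projections of the points onto direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical toy data: two well separated classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]
w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)
```

On this toy data the two sets of scalar projections do not overlap, which is the behaviour the criterion is designed to produce.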
The solution is a specific choice of direction for the projection of the data, such that the examples from the same class are projected very close to each other and, at the same time, the projected class means are as far from each other as possible [14].

Fisher's LDA generalizes easily to a C-class problem (in our case C = 5, since we deal with 5 emotional classes). Since the projection is no longer a scalar (it has C−1 dimensions), the determinants of the scatter matrices are used to obtain an objective function. The between-class scatter matrix represents the scatter of the class mean vectors around the mixture mean, and is defined as:

$$S_B = \sum_{i=1}^{C} n_i (\mu_i - \mu)(\mu_i - \mu)^T, \tag{1}$$

where $\mu_i = \frac{1}{n_i} \sum_{x \in c_i} x$ is the mean vector of class $c_i$ in the original feature space X, and $\mu = \frac{1}{n} \sum_{x} x$ is the mean vector of the mixture distribution. The within-class scatter matrix shows the scatter of the samples around their respective class mean vectors, and is expressed by:

$$S_W = \sum_{i=1}^{C} S_i, \tag{2}$$

where

$$S_i = \sum_{x \in c_i} (x - \mu_i)(x - \mu_i)^T. \tag{3}$$

It can be shown that the optimal projection matrix $W^* = [w_1^* | w_2^* | \dots | w_{C-1}^*]$ is the one whose columns are the eigenvectors corresponding to the largest eigenvalues of the following generalized eigenvalue problem [14]:

$$(S_B - \lambda_i S_W) w_i^* = 0. \tag{4}$$

The projections with maximum class separability information are the eigenvectors corresponding to the largest eigenvalues $\lambda_i$ of the matrix $S_W^{-1} S_B$.

3.2 Feature selection

The drawback of feature extraction methods is that they are not very appropriate for feature mining, as the original features are not retained after the transformation [9]. In order to gain an insight into the significance of particular features, feature selection was used. We adopted sequential forward feature selection (SFFS) as the search strategy and wrapper based evaluation as the objective function.
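A greedy forward search driven by a wrapper objective can be sketched as follows. This is a minimal illustration that only adds features and has no backward or floating step. The feature names, the relevance values and the `toy_score` objective are hypothetical stand-ins: in the experiments reported here the objective is the cross-validated recognition rate of a classifier, which is not reproduced in this sketch.

```python
def sequential_forward_selection(features, score, n_select):
    """Greedily add the feature that maximizes score(selected + [feature])."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy objective: additive relevance with a redundancy penalty.
RELEVANCE = {
    "mfcc1_mean": 0.9,
    "mfcc1_std": 0.85,
    "energy_mean": 0.6,
    "zcr_max": 0.3,
    "pitch_slope": 0.2,
}
REDUNDANT_PAIRS = {frozenset(("mfcc1_mean", "mfcc1_std")): 0.7}


def toy_score(subset):
    """Stand-in for a wrapper objective such as a cross-validated recognition rate."""
    total = sum(RELEVANCE[f] for f in subset)
    for pair, penalty in REDUNDANT_PAIRS.items():
        if pair <= set(subset):
            total -= penalty  # two strongly correlated features add little
    return total


chosen = sequential_forward_selection(list(RELEVANCE), toy_score, 3)
```

In this toy setting the second MFCC feature is skipped despite its high individual relevance, because it adds little once the first one has been selected. This illustrates why each candidate is evaluated together with the already selected features rather than ranked on its own.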
SFFS starts the selection with an empty set and sequentially adds the feature that yields the highest value of the objective function when combined with the already selected features [15]. In the case of wrappers, the objective function is a classifier which evaluates feature subsets by their recognition rate on test data, employing cross-validation. In our case, the linear Bayes classifier was selected as the wrapper, as it had shown the best performance in previous recognition tests [7]. Ideally, feature selection methods should not only reveal the single most relevant attribute (or groups thereof), but should also decorrelate the feature space [9]. Feature selection results in a reduced, interpretable set of significant features; their counts and weights in the selection set allow us to draw conclusions on the relevance of the feature types they belong to [16]. The feature set used in our feature selection experiments was PS-FS. Since it combines prosodic and spectral features, the experiments were expected to reveal the relevance of particular feature types within PS-FS.

4. Experimental results

The focus of the research was on a possible improvement of recognition accuracy in a reduced feature space in the task of basic emotion classification. Therefore, the performance of each classifier was tested in two ways: (1) using the 3 extracted feature sets (P-FS, S-FS, PS-FS), and (2) using the 3 feature sets obtained after LDA feature reduction had been applied to the 3 initial feature sets. The experiments were carried out using 3 classification techniques (the kNN classifier, the linear Bayes classifier and the perceptron rule).

Table 1 shows the class and average recognition rates of the kNN classifier (k = 9) for the 3 feature sets, before LDA (originally extracted feature sets) and after LDA (original feature space reduced to 4 projection vectors).
It can be observed that the rather poor performance of kNN on all three original sets is significantly improved in the reduced feature space. The highest improvement has been achieved for the prosodic-spectral feature set (an increase in average recognition rate from 39.9% to 91.3%). This could be explained by the fact that the performance of the kNN classifier is affected by high dimensionality, which is particularly pronounced in the case of PS-FS.

Table 1 Recognition accuracy of the kNN classifier using 3 feature sets (before and after feature reduction using LDA)

Class recognition rate [%]
Feature set               Anger   Fear   Joy    Neutral   Sadness   Average
P-FS                      44.3    23.9   39.1   25        44.5      35.4
P-FS reduced with LDA     53.7    51.2   52.3   53.2      61.2      54.3
S-FS                      73.9    56.9   35.1   58.1      37.1      52.2
S-FS reduced with LDA     81.3    92.8   81.6   95.7      93.9      89.1
PS-FS                     57.8    37.9   32.2   23.6      33.3      39.9
PS-FS reduced with LDA    86.8    93.7   83.6   95.9      96.3      91.3

Table 2 shows the class and average recognition rates of the linear Bayes classifier for the three feature sets, before LDA (initially extracted feature sets) and after LDA (original feature space reduced to 4 projection vectors). An improvement of recognition accuracy is obtained only for the prosodic feature set (P-FS). This improvement amounts to about 5.7 percentage points (from 49.9% to 55.6%), which is a rather moderate increase compared to the results in Table 1, where the improvement for P-FS is about 19 percentage points. For S-FS and PS-FS there were no improvements, which is probably due to good linear separability in the original feature space (resulting in high recognition rates using the non-reduced S-FS and PS-FS).
Table 2 Recognition accuracy obtained with 3 feature sets (before and after feature reduction using LDA) and with the linear Bayes classifier

Class recognition rate [%]
Feature set               Anger   Fear   Joy    Neutral   Sadness   Average
P-FS                      51.4    43.7   46.8   45.4      62.4      49.9
P-FS reduced with LDA     51.4    53.4   46.8   56.6      69.8      55.6
S-FS                      85.1    91.7   81     95.9      93.9      89.5
S-FS reduced with LDA     85.1    91.4   80.7   96.5      94.3      89.6
PS-FS                     88.8    92.5   84.2   97.1      94.8      91.5
PS-FS reduced with LDA    88.2    92.5   85.3   95.9      95.7      91.5

The class and average recognition rates of the perceptron rule in the two test conditions (3 feature sets before LDA and 3 feature sets after LDA) are given in Table 3. Slight improvements of recognition accuracy are noticeable for all three reduced feature sets. The improvement is the lowest in the case of P-FS.

When the three classifiers are compared, it can be noted that a substantial improvement of recognition accuracy has been achieved for the simplest classifier, namely kNN. Using PS-FS reduced with LDA, kNN achieves an accuracy (91.3%) almost equal to the best result in our experiments (91.5%). The same holds for the perceptron (90.5%), although the relative improvement of its average performance is much smaller.

Table 3 Recognition accuracy using 3 feature sets (before and after feature reduction using LDA) and with the perceptron rule as the classifier

Class recognition rate [%]
Feature set               Anger   Fear   Joy    Neutral   Sadness   Average
P-FS                      34.8    29.3   36.2   21.3      56.9      35.7
P-FS reduced with LDA     16.9    33.1   42.2   33        62.6      37.6
S-FS                      79.9    81.9   72.1   89.7      87.9      82.3
S-FS reduced with LDA     78.2    90.5   80.5   91.1      93.4      86.7
PS-FS                     83.9    88.2   77.1   91.4      93.7      86.9
PS-FS reduced with LDA    86.8    94.2   82.8   93.9      94.5      90.5

When LDA is employed, the original feature space is transformed into a new one, which makes it impossible to interpret the relevance of particular feature types.
To gain an insight into the most relevant features in the original (untransformed) feature space, SFFS (sequential forward feature selection) has been applied. The wrapper for SFFS is the linear Bayes classifier, since it had the best recognition results. The number of selected features has been preset to 35. For the interpretation of results, three indicators have been used. The first indicator of the relevance of a feature type is the number (#) of its features selected by SFFS. The other two indicators are the so-called 'share' and 'portion', as described in [16]. With 'share', the count of the selected features of a given type is normalized by the total number of features in the reduced set (#/35 in our experiment). With 'portion', the same count is normalized by the cardinality of the feature type in the original feature set (#/#total). For each feature type, 'share' thus gives its percentage among the features selected to model our 5-class problem, while 'portion' gives the percentage of all features of that type which contribute to the modeling of the problem.

The results of the selection of 35 features from PS-FS and the effectiveness of each feature type are displayed through the 3 indicators in Table 4. The observed feature types from PS-FS are: zero crossing rate (ZCR), energy, pitch (plus voicing probability) and MFCC. The rows '#total' and '#' show the total number and the number of selected features for each feature type, respectively. From Table 4 it can be observed that most of the selected features ('share' = 77.1%) belong to the MFCC type. The second most important feature type is energy ('share' = 11.4%). The third and the fourth feature types are ZCR and pitch, respectively. As regards the indicator 'portion', the feature types can be ranked as follows: energy is selected with the highest percentage of its total feature count (16.7%), followed by ZCR (12.5%).
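The two normalized indicators can be recomputed directly from the counts reported in Table 4. The short sketch below does this; the variable and function names are chosen here for illustration only.

```python
N_SELECTED = 35  # number of features preset for the selection

# Counts per feature type as reported in Table 4: (#total, # selected).
TABLE_4_COUNTS = {
    "ZCR": (24, 3),
    "energy": (24, 4),
    "pitch": (48, 1),
    "MFCC": (288, 27),
}


def share_and_portion(total, selected, n_selected=N_SELECTED):
    """Return ('share', 'portion') in percent, rounded to one decimal place."""
    share = 100.0 * selected / n_selected  # normalized by the reduced set size
    portion = 100.0 * selected / total     # normalized by the feature type size
    return round(share, 1), round(portion, 1)


indicators = {
    name: share_and_portion(total, selected)
    for name, (total, selected) in TABLE_4_COUNTS.items()
}
```

The two indicators answer different questions. MFCC dominates the selected set simply because it is by far the largest feature type, whereas energy has the largest fraction of its own features selected.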
Although the MFCC feature type is the most frequent one in the selected feature set, only 9.4% of the total number of MFCC features are selected. The pitch feature type is selected at the lowest rate (2.1%).

Table 4 Summary of feature selection results (35 features selected using SFFS), displayed with respect to feature types

               ZCR     Energy   Pitch   MFCC
#total         24      24       48      288
SFFS #         3       4        1       27
Share [%]      8.6     11.4     2.9     77.1
Portion [%]    12.5    16.7     2.1     9.4

Table 5 summarizes the results of the feature selection with respect to the groups of statistical functionals used: moments, extrema and regression coefficients. The features derived via moments are the most frequent among the selected features ('share' = 57.1%), followed by the features derived via extrema (22.9%) and the features derived via linear regression (20%). With respect to the 'portion' of the total number of features in each group of functionals, moments are ranked the highest, followed by regression functionals and extrema, in that order.

Table 5 Summary of feature selection results, distributed along groups of used statistical functionals

               Moments   Extrema   Regression
#total         128       160       96
SFFS #         20        8         7
Share [%]      57.1      22.9      20
Portion [%]    15.6      5         7.3

5. Conclusion

The paper gives an outline of a system for the recognition of basic emotions in speech, with particular emphasis on the extracted acoustic feature sets, the classification schemes and the emotional speech corpus. The paper discusses the improvement of recognition accuracy in a lower dimensional feature space obtained by applying linear discriminant analysis. The most substantial improvement of recognition accuracy has been achieved for the simplest classifier in our experiments, namely the kNN classifier. The combination of kNN with the reduced prosodic-spectral feature set comes very close to the best result obtained in the experiments (an accuracy of 91.5%).
A feature selection algorithm has been employed in order to evaluate the relevance of the feature types and their statistical properties in the given task of the recognition of 5 basic emotions. In descending order of relevance, the feature types are: MFCC, energy, zero crossing rate and pitch. With respect to the ratio of selected features to the total number of features in each feature type, features related to energy are selected most frequently. The results of the feature selection with respect to the groups of statistical functionals imply that moments are the most relevant statistical features, although the extrema, regression coefficients and regression error also play notable roles.

By combining the chosen prosodic and spectral features, represented by appropriate statistical features, recognition results comparable with those of more complex systems can be achieved even with a very simple classification scheme (such as kNN).

Acknowledgement: The research presented in this paper has been carried out within the project "The Development of Dialogue Systems for Serbian and Other South Slavic Languages" (TR32035), supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

References

[1] D.A. Sauter, F. Eisner, P. Ekman, S. Scott, "Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations", Proceedings of the National Academy of Sciences of the USA, vol. 107(6), pp. 2408-2412, 2010.
[2] D. Ververidis, C. Kotropoulos, "Emotional speech recognition: resources, features and methods", Speech Communication, vol. 48, pp. 1162-1181, 2006.
[3] S.L. Lutfi, F. Fernandez-Martinez, J.M. Lucas-Cuesta, L. Lopez-Lebon, J.M. Montero, "A satisfaction-based model for affect recognition from conversational features in spoken dialog systems", Speech Communication, vol. 55, pp. 825-840, 2013.
[4] M.E. Ayadi, M.S. Kamel, F. Karray, "Survey on speech emotion recognition: features, classification schemes and databases", Pattern Recognition, vol. 44, pp. 572-587, 2011.
[5] N. Fragopanagos, J.G. Taylor, "Emotion recognition in human-computer interaction", Neural Networks, vol. 18, pp. 389-405, 2005.
[6] B. Schuller, B. Vlasenko, F. Eyben, G. Rigoll, A. Wendemuth, "Acoustic emotion recognition: a benchmark comparison of performances", IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009, Italy, 2009, pp. 552-557.
[7] V. Delić, M. Bojanić, M. Gnjatović, M. Sečujski, S.T. Jovičić, "Discrimination capability of prosodic and spectral features for emotional speech recognition", Electronics and Electrical Engineering, Kaunas Technologija, vol. 18, no. 9, pp. 51-54, 2012.
[8] M. Bojanić, Extraction and Selection of Feature Set for Automatic Emotional Speech Recognition. Ph.D. dissertation, Dept. Elect. Eng., Faculty of Technical Sciences, University of Novi Sad, 2013.
[9] B. Schuller, A. Batliner, S. Steidl, D. Seppi, "Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge", Speech Communication, vol. 53, pp. 1062-1087, 2011.
[10] C.M. Lee, S.S. Narayanan, "Toward detecting emotions in spoken dialogs", IEEE Transactions on Speech and Audio Processing, vol. 13, no. 2, pp. 293-303, 2005.
[11] H. Altun, G. Polat, "New frameworks to boost feature selection algorithms in emotion detection for improved human computer interaction", LNCS, vol. 4729, Berlin-Heidelberg: Springer, pp. 533-541, 2007.
[12] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, 2nd edition. Wiley, New York, 2000.
[13] S.T. Jovičić, Z. Kašić, M. Djordjević, M. Rajković, "Serbian emotional speech database: design, processing and evaluation", Proceedings of the International Conference on Speech and Computer (SPECOM 2004), St. Petersburg, 2004, pp. 77-81.
[14] K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic Press, 1990.
[15] P. Pudil, J. Novovicova, J. Kittler, "Floating search methods in feature selection", Pattern Recognition Lett., vol. 15, pp. 1119-1125, 1994.
[16] A. Batliner et al., "Whodunnit – searching for the most important feature types signalling emotion-related user states in speech", Computer Speech and Language, vol. 25, pp. 4-28, 2011.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 29, No 2, June 2016, pp. 261-268
DOI: 10.2298/FUEE1602261S

LCR of SC Receiver Output Signal over α-κ-µ Multipath Fading Channels

Suad Suljović 1, Dejan Milić 1, Stefan R. Panić 2
1 Faculty of Electronic Engineering, University of Niš, Serbia
2 Faculty of Natural Science and Mathematics, University of Priština, Kosovska Mitrovica, Serbia

Abstract. A wireless mobile communication system with a selection combining (SC) diversity receiver is investigated in this paper. The received signal envelope experiences α-κ-µ short term fading, resulting in system performance degradation. The level crossing rate (LCR) and the average fade duration (AFD) of the SC receiver output signal envelope are obtained as rapidly converging infinite series expressions. Numerically evaluated results are presented graphically in order to discuss the effects of the transmission parameters (multipath fading severity, dominant component power and the nonlinearity propagation parameter) on the observed LCR performance of dual SC.

Key words: wireless transmission, α-κ-µ fading, selection combining (SC), level crossing rate (LCR), average fade duration (AFD)

1. Introduction

Short term fading heavily influences and often degrades the transmission quality of a wireless communication system and limits the channel capacity. There are a few statistical models that can be used to describe signal envelope variation in a multipath fading channel, depending on the communication scenario and the propagation environment. The α-κ-µ distribution was recently reported in the technical literature to describe small scale signal envelope variation in fading channels [1].
The α-κ-µ fading model can describe small scale signal envelope variations in nonlinear line of sight multipath fading environments with two or more clusters, and is presented as a function of three parameters: 1) parameter κ, often called the Rician factor, denoting the ratio of the power of the dominant components to the power of the scattered components; 2) parameter µ, related to the number of clusters in the propagation environment; and 3) parameter α, related to the non-linearity of the propagation environment. The α-κ-µ fading model describes propagation environments with more severe fading when the values of the Rician κ factor are lower. The α-κ-µ multipath fading is also more severe for lower values of parameter µ, and when parameter µ tends to infinity, the α-κ-µ fading channel approaches a channel without fading effects.

Received April 20, 2015; received in revised form August 13, 2015
Corresponding author: Suad Suljović, Faculty of Electronic Engineering, University of Niš, Aleksandra Medvedeva 14, 18000 Niš, Serbia (e-mail: suadsara@gmail.com)

The α-κ-µ distribution is a general distribution, and the α-µ, Weibull, Nakagami-m, Rician and Rayleigh distributions can be derived from it as special cases. By setting κ=0, the α-κ-µ distribution reduces to the α-µ distribution, and for κ=0 and µ=1 the Weibull distribution is obtained. By setting µ=1 and α=2, the α-κ-µ distribution reduces to the Rician distribution, while for α=2 and κ=0 it reduces to the Nakagami-m distribution. Finally, by setting α=2, κ=0 and µ=1, the Rayleigh distribution is derived from the α-κ-µ distribution.

There are several space combining techniques (spatial diversity combining) which can be used to mitigate the influence of multipath fading on receiver performance, depending on implementation complexity and quality of service [2,3,4].
Maximal ratio combining (MRC) provides the best diversity gain, while SC has the lowest implementation complexity. In SC diversity, the receiver selects the input branch with the highest signal-to-noise ratio, or the highest envelope level, at the observed time instant. The established second order performance measures of wireless mobile communication systems are the average level crossing rate (LCR) and the average fade duration (AFD) [5]. The LCR can be calculated as the average value of the first time derivative of the random process at the crossing level, while the AFD is defined as the average time over which the signal envelope remains below a specified level after crossing that level in a downward direction. The system performs better when the values of the average level crossing rate are lower.

A considerable number of research papers consider the LCR and AFD of wireless systems operating over multipath fading channels. In [6], a macro diversity SC receiver with two micro diversity MRC receivers operating over a gamma-shadowed Nakagami-m multipath fading channel is considered, and closed form expressions for the LCR and AFD are evaluated for the proposed system. The LCR and AFD of a wireless system in the presence of long term gamma fading and Rician short term fading are determined in [7]. In [8], expressions for the LCR and AFD of the SC receiver output signal are presented for the cases of Rician, Rayleigh and Nakagami-m multipath fading. In [9], an approach for determining second order statistics over α-κ-µ fading channels was proposed. In this paper, we consider a wireless communication system with an SC diversity receiver operating over an α-κ-µ multipath fading channel. Closed form expressions for the LCR and AFD of the combiner output signal have been efficiently evaluated.

2. System model

The α-κ-µ random process can be obtained by the transformation:

$$x = y^{2/\alpha}, \tag{1}$$

where y denotes the κ-µ random process, x is the resulting α-κ-µ random process, and α is a positive parameter.
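As an illustration of this construction, the sketch below draws samples of a κ-µ envelope and maps them to an α-κ-µ envelope by raising them to the power 2/α. It relies on an assumption that is not spelled out in this paper: the usual physical cluster model of the κ-µ distribution for an integer number of clusters µ, in which each cluster consists of an in-phase and a quadrature Gaussian component plus a dominant component. The function names are hypothetical.

```python
import math
import random


def kappa_mu_envelope(kappa, mu, omega, rng):
    """One kappa-mu envelope sample for integer mu, with mean power omega.

    Assumed cluster model: each of the mu clusters has in-phase and
    quadrature Gaussian components of variance sigma^2, and the dominant
    power is split equally among the in-phase components.
    """
    sigma = math.sqrt(omega / (2.0 * mu * (1.0 + kappa)))
    dominant = math.sqrt(kappa * omega / ((1.0 + kappa) * mu))
    power = 0.0
    for _ in range(mu):
        in_phase = rng.gauss(0.0, sigma) + dominant
        quadrature = rng.gauss(0.0, sigma)
        power += in_phase ** 2 + quadrature ** 2
    return math.sqrt(power)


def alpha_kappa_mu_envelope(kappa_mu_sample, alpha):
    """Map a kappa-mu envelope sample to an alpha-kappa-mu envelope sample."""
    return kappa_mu_sample ** (2.0 / alpha)


def mean_power(samples):
    """Sample mean of the squared envelope."""
    return sum(s * s for s in samples) / len(samples)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
kappa_mu_samples = [kappa_mu_envelope(1.0, 2, 1.0, rng) for _ in range(20000)]
alpha_samples = [alpha_kappa_mu_envelope(y, 3.0) for y in kappa_mu_samples]
```

With α = 2 the mapping leaves the κ-µ envelope unchanged, which mirrors the special cases listed in the introduction.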
the κ-µ random variable follows probability density function (pdf): 2 1 ( 1)2 11 1 2 2 2 ( 1) ( 1) ( ) 2 , k y y k k k ky p y y e i k e                       0y  (2) lcr of sc receiver output signal over α-κ-µ multipath fading channels 263 where κ is rician factor, µ is fading parameter, ω is average power of y, and in(x) represents modified bessell function of n-th order. previous expression can be further written in the following form: 1 2 1 1 1 21 1 1 1 1 2 1 1 ( 1)2 1 1 2 1 0 1 12 2 2 3 1 ( 1)2 2 0 1 1 ( 1) 2 2 ( 1) ( ) ( ) !2 2 ( 1) ( ) ! i k y y i ik i ki i i y ik i k ky k p y y e i i k e k k y e e i i                                                       (3) probability density function (pdf) of α-κ-µ random variable now can be obtained after using relations: 2( ) | | x y dy p x p x dx         (4) and: 1 2 2 dy x dx     (5) after substituting (5) and (3) in (4), the expression for pdf for an α-κ-µ random variable becomes, as in [9]: 1 21 1 1 2 1 1 1 1 1 1 1 1 2 3 1 2 2 2 ( 1) 1 2 0 1 1 2 3 4 ( 1)2 4 0 1 1 2 ( 1) ( ) 2 ( ) ! ( 1) ( ) ! i i i i k x x ik i i ki i i x ik i k k x p x x e e i i k k x e e i i                                                               (6) now, cumulative distribution function (cdf) of α-κ-µ random variable can be determined as: 11 1 1 1 1 1 1 1 1 1 1 1 2 3 4 ( 1)2 4 00 01 1 2 2 4 4 1 6 0 4 1 1 2 ( 1) ( ) ( ) ( ) ! 2 3( 1) ( 1) , 4 ( ) ! 
4 ( 1 i kx xi i i t x x ik i i i i i i k i i j k k f x p t dt t e dt e i i ik k k x e i i k k                                                                           1 1 11 2 3 4 ( 1)4 0 0 1 1 1 1 ( ) ) 2 3 5 ( ) !(2 3 ) 4 i j ki j x i j ki j j x e i e i i i                                         (7) where γ(a, x) is incomplete gamma function, and (a)n is pocchammer symbol [10]. the joint probability density function (jpdf) of κ-µ random variable and its first time derivative is: 264 s. suljović, d. milić, s. panić 1 2 21 1 1 2 1 1 21 1 1 2 1 2 2 1 1 2 3 1 ( 1)2 2 2 0 1 1 4 2 1 2 2 1 2 3 1 ( 1) ( 1) 2 2 2 2 2 2 1 2 2 1 1 2 ( 1) 1 ( ) ( ) ( ) ( ) ! 2 2 ( 1) 2 ( ) ! yy i yki i i y ik i i i i k y ki y fm i i k m k k y yy p y p y e e p e i i k k y e e f e i i                                                          (8) where β stands for the time derivate process variance and fm stands for the doppler frequency. the time derivate of α-κ-µ random process can be determined by using: 2 ,x y 2 ,y x   2 12 ,x y y    1 2 2 y x x     (9) now, the jpdf of α-κ-µ random process and its first time-derivative is: 1 2 2( ) , 2 xx yy p xx j p x x x            (10) where jacobian of transformation can be determined according to: 1 2 2 2 1 2 0 2 4 0 2 y y x x x j x y y x x x                     (11) after substituting (8) and (11) in (10), the expression for jpdf of α-κ-µ random variable and its first time derivative is: 21 1 1 2 2 1 22 1 1 1 2 2 4 2 1 2 2 1 2 3 3 8 ( 1)( 1)2 2 2 4 8 2 2 1 0 2 1 1 ( ) , 2 ( 1) 2 2 ( ) ! 
$\cdots$ (12)

the lcr of the α-κ-µ random process at a given level is obtained by averaging the first time derivative of the process over its positive values, namely:

$$N_x(x) = \int_0^{\infty} \dot{x}\, p_{x\dot{x}}(x\dot{x})\, d\dot{x} = \cdots \qquad (13)$$

the expression for the lcr of the α-κ-µ random process can be used to determine the afd of wireless communication systems operating over α-κ-µ multipath fading channels. namely, the afd is equal to the ratio of the cumulative distribution function (cdf) of the signal envelope to its lcr.

3. performance analysis

we further consider a wireless communication system with an sc receiver operating over independent, identically distributed α-κ-µ multipath fading channels. the signal envelopes at the inputs of the sc receiver are denoted by x1 and x2, while the signal envelope at the output of the sc receiver is denoted by x. the sc receiver selects the branch with the higher signal level, therefore the pdf of the sc receiver output signal envelope is:

$$p_x(x) = p_{x_1}(x) F_{x_2}(x) + p_{x_2}(x) F_{x_1}(x) = 2\, p_{x_1}(x) F_{x_1}(x) = \cdots \qquad (14)$$

now, the cdf of the sc receiver output signal envelope is obtained as:

$$F_x(x) = F_{x_1}(x) F_{x_2}(x) = \left(F_{x_1}(x)\right)^2 = \cdots \qquad (15)$$

further, the jpdf of the sc receiver output signal and its first time derivative can be obtained as:

$$p_{x\dot{x}}(x\dot{x}) = p_{x_1\dot{x}_1}(x\dot{x}) F_{x_2}(x) + p_{x_2\dot{x}_2}(x\dot{x}) F_{x_1}(x) = 2\, p_{x_1\dot{x}_1}(x\dot{x}) F_{x_1}(x) \qquad (16)$$

after substituting (7) and (13) in (17), the expression for the lcr can be expressed as:

$$N_x(x) = \int_0^{\infty} \dot{x}\, p_{x\dot{x}}(x\dot{x})\, d\dot{x} = 2 F_{x_1}(x) \int_0^{\infty} \dot{x}\, p_{x_1\dot{x}_1}(x\dot{x})\, d\dot{x} = 2 F_{x_1}(x) N_{x_1}(x) = \cdots \qquad (17)$$

where $N_{x_1}(x)$ is given by (13).

the afd of the sc receiver can now be determined as [2, eq.4.14]:

$$T_x(x) = \frac{F_x(x)}{N_x(x)} = \frac{\left(F_{x_1}(x)\right)^2}{2 F_{x_1}(x) N_{x_1}(x)} = \frac{F_{x_1}(x)}{2 N_{x_1}(x)} = \cdots \qquad (18)$$

fig. 1 lcr for different system parameters (normalized lcr, $N_x(x)/f_m$, versus x [db])

4. numerical results

in figure 1, normalized lcr values of the sc receiver output signal envelope are presented versus the envelope level for several values of the fading severity and nonlinearity parameters. first, we consider level crossing for a fixed level x set below the average signal level. in this scenario, the signal is expected to be above level x most of the time, so the lcr is relatively low. as the level x increases and comes closer to the average signal level, the lcr also increases. the lcr values decrease, and in general the system performs better, when the parameter µ increases, resulting in reduced fading severity. the lcr values also decrease as the nonlinearity parameter α increases. when the crossing level is above the average signal level, the lcr starts to decrease as the level x increases. again, this is an expected effect, as signal excursions above the average value quickly become less likely. the parameters µ and α generally have similar effects as in the previous scenario.

normalized afd values are presented for different system parameters in fig. 2. when the crossing threshold x is below the average signal level, the afd is low, and this is generally the regime in which the system normally operates. better performance is expected when the rician κ factor increases, resulting in a lower afd. the rician κ factor increases when the power of the dominant los (line-of-sight) component increases or the power of the scattering components decreases, thus making the fading less severe. performance improvement is expected in less severe environments.
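the relations (15), (17) and (18) between the branch statistics and the sc output statistics can be checked numerically. the following is a minimal sketch only: it assumes rayleigh fading on each branch (an illustrative special case with well-known closed forms, not the general α-κ-µ expressions of this paper), and the function names and the normalized level rho = x/rms are hypothetical choices made for the example.

```python
import math


def rayleigh_cdf(rho):
    """Branch CDF F_x1 at the normalized level rho = x / rms (Rayleigh assumption)."""
    return 1.0 - math.exp(-rho ** 2)


def rayleigh_lcr(rho, f_m):
    """Branch LCR N_x1 for Rayleigh fading; f_m is the maximum Doppler frequency."""
    return math.sqrt(2.0 * math.pi) * f_m * rho * math.exp(-rho ** 2)


def sc_dual_statistics(cdf_branch, lcr_branch):
    """Combine branch statistics for a dual SC receiver with i.i.d. branches."""
    cdf_sc = cdf_branch ** 2                  # F_x = (F_x1)^2, as in (15)
    lcr_sc = 2.0 * cdf_branch * lcr_branch    # N_x = 2 F_x1 N_x1, as in (17)
    afd_sc = cdf_sc / lcr_sc                  # T_x = F_x / N_x, as in (18)
    return cdf_sc, lcr_sc, afd_sc


if __name__ == "__main__":
    f_m = 1.0  # normalized Doppler frequency, so LCR is N/f_m and AFD is T*f_m
    for level_db in (-20, -10, 0, 10):
        rho = 10.0 ** (level_db / 20.0)
        cdf_sc, lcr_sc, afd_sc = sc_dual_statistics(
            rayleigh_cdf(rho), rayleigh_lcr(rho, f_m)
        )
        print(f"{level_db:4d} dB  N/f_m = {lcr_sc:.3e}  T*f_m = {afd_sc:.3e}")
```

for i.i.d. branches the sc output afd is exactly half of the single-branch afd at the same level, which follows directly from the last equality in (18).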
fig. 2 afd for different system parameters (normalized afd, $T_x(x) f_m$, versus x [db])

5. conclusion

in this paper, a wireless communication system with a dual selection combining (sc) diversity receiver operating over α-κ-µ multipath fading channels is considered. the main contribution is the generality of the analysis, since other fading models can be derived from the α-κ-µ distribution as special cases. closed-form expressions for the lcr and afd of the sc receiver output signal envelope are evaluated and discussed as functions of the system parameters. in order to point out the influence of propagation nonlinearity, fading severity and the rician κ factor on the observed performance, the results are presented graphically.

acknowledgement: the paper is supported in part by the project iii44006 funded by the ministry of education, science and technological development of the republic of serbia.

references
[1] g. fraidenraich and m. d. yacoub, "the α-κ-µ and α-η-µ fading distributions," in proc. ieee ninth international symposium on spread spectrum techniques and applications, aug. 2006, manaus, amazon, brazil, pp. 16-20.
[2] panic s, anastasov j, stefanovic m, spalevic p. fading and interference mitigation in wireless communications. crc press: usa, 2013.
[3] simon mk, alouini ms. digital communication over fading channels. john wiley & sons: new york, 2000.
[4] stüber gl. principles of mobile communications. kluwer academic publishers: massachusetts, 1996.
[5] lee wcy. mobile communications engineering. mcgraw-hill: new york, 2003.
[6] d. stefanovic, s. panić, p. spalević, "second order statistics of sc macrodiversity system operating over gamma shadowed nakagami-m fading channels", international journal of electronics and communications (aeu), vol. 65, issue 5, pp. 413-418, may 2011.
[7] d. milic, d. djosic, c. stefanovic, s. panic, m. stefanovic, "second order statistics of the sc receiver over rician fading channels in the presence of multiple nakagami-m interferers", international journal of numerical modelling: electronic networks, devices and fields, accepted for publication.
[8] m. bandjur, n. sekulovic, m. stefanovic, a. golubovic, p. spalevic, and d. milic, "second-order statistics of system with microdiversity and macrodiversity reception in gamma-shadowed rician fading channels", etri journal, vol. 35, no. 4, pp. 722-725, august 2013.
[9] a. k. papazafeiropoulos, s. a. kotsopoulos, "second-order statistics for the envelope of α-κ-µ fading channels," ieee communications letters, vol. 14, no. 4, pp. 291-293, april 2010.
[10] gradshteyn i, ryzhik i. tables of integrals, series, and products. academic press: new york, 1980.

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 601-611 doi: 10.2298/fuee1404601s

fuzzy model reference adaptive control of velocity servo system

momir r. stanković 1, milica b. naumović 2, stojadin m. manojlović 1, srđan t. mitrović 1
1 military academy, university of defence in belgrade, serbia
2 department of automatic control, faculty of electronic engineering, university of niš, serbia

abstract. the implementation of fuzzy model reference adaptive control of a velocity servo system is analysed in this paper. the design of model reference adaptive control (mrac) and the problem of choosing the adaptation gain are considered. tuning of the adaptation gain by a fuzzy logic subsystem and a simple synthesis procedure for fuzzy mrac are proposed. several simulation runs show the advantages of the fuzzy mrac approach. experimental validation on a laboratory speed servo is realized with an acquisition system. the results confirm the benefits of the proposed controller in comparison with the standard mrac.

key words: mrac, fuzzy mrac, adaptation gain.

1. introduction

the major conventional controller design concepts are model based.
however, process modelling is a complex procedure which, at best, provides only an approximate model of the real process, accompanied by some level of model uncertainty. controllers with constant parameters are in most cases unable to cope with parameter perturbations, unmodelled dynamics and external disturbances. in order to provide acceptable system behaviour in the presence of internal and external disturbances, appropriate adaptation of the controller parameters is necessary [1]. adaptive control contains a mechanism that adjusts the controller parameters in accordance with the working conditions and the current state of the system. recall that adaptive systems are divided into two classes: self-tuning systems and model reference adaptive systems based on a parameter adaptation technique [2], [3]. in self-tuning control systems, a recursive method for on-line process identification is used and the controller parameters are adjusted in real time based on the estimated values and a predefined algorithm [4].

received may 14, 2014; received in revised form october 13, 2014
corresponding author: momir r. stanković, military academy, university of defence in belgrade, generala pavla jurišića šturma 33, 11000 belgrade, serbia (e-mail: momir_stankovic@yahoo.com)

in model reference adaptive control (mrac) the desired system performance is specified by the reference model. tuning of the controller parameters is based on the error, defined as the difference between the responses of the reference model and the real process [2], [5]. the main weakness of the standard mrac with the mit rule is the absence of clearly defined rules for selecting the adaptation gain. in most cases it is chosen based on a large number of simulations and trial-and-error methods [2].
the use of a fractional order parameter adjustment rule instead of the gradient approach with the mit rule, together with a fractional order reference model, is proposed in [6]. different modifications of the mrac, with a variable structure design [7], based on performing repetitive tasks [8] and with a time-varying reference model [9], have successfully been applied to plants with unmodelled dynamics, external disturbances and unknown parameters, and to a system with bounded control effort. since ichikawa presented the novel design of model reference adaptive fuzzy control [10], many authors have made progress in the application of fuzzy theory in mrac [11], [12]. fuzzy set theory allows the use of experience in control system design. the great contribution of fuzzy logic is the possibility of modelling unstructured heuristic assertions, which are expressed linguistically [13]. the fuzzy adaptive concept is more intuitive for the designer and allows expert knowledge and experience to be used in designing control systems. as a result, the performance/complexity ratio is better for fuzzy adaptive controllers [14], [15]. fuzzy mrac is suitable for application in industrial control systems where the influence of internal and external disturbances is high. in [16] a simple design procedure for fuzzy mrac using the error signal as the fuzzy subsystem input was presented, and this controller showed better results than conventional ones.

in this paper a fuzzy mrac of a speed servo system is proposed. in speed servo systems the varying load torque has the main influence on system performance. based on the estimation of the load torque and its first derivative, the adaptation gain is adjusted by a fuzzy logic subsystem (flss). through matlab/simulink® simulation models, the proposed and the standard mrac of the speed servo system are compared for different load disturbance profiles.
the proposed controller is implemented on a laboratory dc velocity servo system, and experimental validation of the simulation results is obtained.

2. a revisit to the model reference adaptive control (mrac)

a block diagram of the model reference adaptive control is shown in fig. 1. the desired behaviour of the system is expressed by the reference model. the controller parameters are adjusted based on the error $e = y - y_m$, which is the difference between the plant output $y$ and the reference model output $y_m$. the main sources of the error $e$ are the difference between the reference and plant dynamics and the external disturbances, denoted by $w$ in fig. 1. the system has two feedback loops: an ordinary one composed of the plant and the controller, and a feedback loop for the adjustment of the controller parameters [2].

fig. 1 block diagram of model reference adaptive control (blocks: controller, plant, reference model, adjustment mechanism)

adjusting the controller parameters in the direction of the negative gradient of $e^2$ is realized by the well-known mit rule:

$$\frac{d\theta}{dt} = -\gamma\, e\, \frac{\partial e}{\partial \theta}, \qquad (1)$$

where $\theta$ is the controller parameter, the partial derivative $\partial e / \partial \theta$ is called the sensitivity derivative of the system and $\gamma$ is the adaptation gain [2].

consider a first order system described by the model [2]:

$$\frac{dy}{dt} = -a y + b u, \qquad (2)$$

where $u$ is the control variable. let the reference model be given as:

$$\frac{dy_m}{dt} = -a_m y_m + b_m u_c, \qquad (3)$$

where $u_c$ is the command signal. perfect model following can be achieved by the controller:

$$u = \theta_1 u_c - \theta_2 y, \qquad (4)$$

with parameters:

$$\theta_1 = \theta_1^0 = \frac{b_m}{b} \quad \text{and} \quad \theta_2 = \theta_2^0 = \frac{a_m - a}{b}. \qquad (5)$$

the sensitivity derivatives follow directly from the partial derivatives of $e$ with respect to the controller parameters $\theta_1$ and $\theta_2$ [2]:

$$\frac{\partial e}{\partial \theta_1} = \frac{b}{p + a + b\theta_2}\, u_c \qquad (6)$$

$$\frac{\partial e}{\partial \theta_2} = -\frac{b^2 \theta_1}{(p + a + b\theta_2)^2}\, u_c = -\frac{b}{p + a + b\theta_2}\, y \qquad (7)$$

where $p = d/dt$ is the differential operator. equations (6) and (7) cannot be used directly because $a$ and $b$ are parameters of the system, which are uncertain or unknown. in order to eliminate the parameters $a$ and $b$, the following approximation is required:

$$p + a + b\theta_2 \approx p + a_m \qquad (8)$$

based on (8) and the mit rule (1), the following equations for updating the controller parameters are obtained:

$$\frac{d\theta_1}{dt} = -\gamma \left( \frac{1}{p + a_m}\, u_c \right) e, \qquad (9)$$

$$\frac{d\theta_2}{dt} = \gamma \left( \frac{1}{p + a_m}\, y \right) e. \qquad (10)$$

it can be noted that the parameter $b$ is absorbed in the adaptation gain $\gamma$ [2]. from (9) and (10) it can be seen that mrac has only one design parameter, the adaptation gain $\gamma$, which has to be chosen a priori and whose selection influences system performance significantly [15]. by substitution of (2), (3) and (4) in (9) and (10), $y$ and $e$ are eliminated, and the following equations are obtained:

$$\frac{d\theta_1}{dt} = \cdots, \qquad (11)$$

$$\frac{d\theta_2}{dt} = \cdots, \qquad (12)$$

where $G_m(p)$ and $G_{ref}(p)$ are equivalent to the transfer functions of the plant and the reference model, respectively. the influence of the adaptation gain $\gamma$ on the convergence rates of $\theta_1$ and $\theta_2$ to $\theta_1^0$ and $\theta_2^0$ cannot be analytically derived from (11) and (12). if $\gamma$ is constant, the convergence rates depend only on the uncertainty of the plant transfer function. it is known that system performance differs for different values of $\gamma$ [2], and it is therefore assumed that varying $\gamma$ as a function of the external disturbances acting on the system can significantly increase the convergence rates.

3. fuzzy mrac of velocity servo system

3.1. concept of fuzzy mrac

it is known that external disturbances significantly influence the convergence rates of the controller parameters. the main external disturbance for a speed servo system is the varying load torque on the motor shaft, and it can be shown that the convergence rates depend on the disturbance and its dynamics. for example, a constant load torque can be effectively compensated with a small value of $\gamma$, while the compensation of a small load torque with high-frequency dynamics requires a significantly larger value. depending on the load torque and its dynamics, different rules can be formed based on experience, but an exact mathematical solution cannot be easily found. this is one of the prerequisites for fuzzy logic subsystem design: the system can be described through a set of rules based on experience, while its mathematical model is too complicated or does not exist at all [18]. this fact was the main motivation for including a special fuzzy logic subsystem (flss) in the control loop.

3.2. application of fuzzy mrac

the block diagram of a velocity servo system with fuzzy model reference adaptive control is shown in fig. 2.

fig. 2 block diagram of fuzzy model reference adaptive control (blocks: controller, dc servo motor with tacho, reference model, adjustment mechanism, observer of load torque, fuzzy logic subsystem)

the desired performance of the system is given by the first order reference model:

$$G_{ref}(s) = \frac{y_m(s)}{u_c(s)} = \frac{10}{0.01 s + 1}. \qquad (13)$$

the controller (4) for perfect model following is designed with the parameter adjustment rules (9) and (10) and fuzzy-logic-based tuning of the gain $\gamma$. the dc servo motor transfer function, with the armature voltage $u$ as input and the tacho voltage $u_{tg}$ as output, is given by:

$$G(s) = \frac{u_{tg}(s)}{u(s)} = \frac{K_{tg} K_m}{T_m s + 1}, \qquad (14)$$

where $K_m = k_{em} / (R_a f_e + k_{em} k_{me})$ and $T_m = J_e R_a / (R_a f_e + k_{em} k_{me})$ are the dc motor static gain and time constant, respectively [19]. the values of the tacho constant $K_{tg}$ and of the dc motor electrical and mechanical parameters were previously identified [20] and are shown in table 1.

table 1 parameters of dc motor and tacho
parameter                                      value
armature resistance $R_a$                      8.91 Ω
armature inductivity $L_a$                     4.5 mh
moment of inertia $J_e$                        2.93e-5 kg m^2
coefficient of viscous friction $f_e$          11.7e-5 kg m^2/rad/s
electromechanical constant $k_{em}$            0.103 nm/a
mechanical-electrical constant $k_{me}$        0.103 v/rad/s
tacho constant $K_{tg}$                        0.0191 v/rad/s

the observer for the estimation of the load torque and its first derivative is designed based on the dc motor moment equation:

$$\hat{M}_d(t) = k_{em}\, i_a(t) - J_e \frac{d\omega(t)}{dt} - f_e\, \omega(t), \qquad \dot{\hat{M}}_d(t) = \frac{d\hat{M}_d(t)}{dt}, \qquad (15)$$

where $i_a(t)$ and $\omega(t)$ are the measured armature current and shaft angular velocity, respectively. the estimated values $\hat{M}_d$ and $\dot{\hat{M}}_d$ are the inputs to the fuzzy subsystem for tuning $\gamma$, and the corresponding membership functions are shown in fig. 3. the linguistic variable m (load torque) is described by five membership functions: small (ms), intermediate positive (mip), large positive (mlp), intermediate negative (min) and large negative (mln). the linguistic variable cm (the first derivative of the load torque) is defined by the membership functions: small (cs), intermediate positive (cip), large positive (clp), intermediate negative (cin) and large negative (cln).

fig. 3 membership functions of: a) linguistic variable m ($M_d$ in nm) and b) linguistic variable cm ($dM_d/dt$ in nm/s)

the developed flss is of the takagi–sugeno type, with two inputs, $\hat{M}_d$ and $\dot{\hat{M}}_d$, and one output, the adaptation gain $\gamma$.
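the structure of such a two-input, one-output takagi–sugeno subsystem can be sketched in a few lines. this is an illustration under stated assumptions, not the authors' implementation: the membership functions are assumed to be triangular with saturating shoulders, the centres in `M_CENTERS` and `C_CENTERS` are hypothetical values read loosely from the axes of fig. 3, and the output is assumed to be the firing-strength-weighted average of constant consequents. the consequents are the entries of the rule base in table 2, with γmin = 0.005 and γmax = 0.25, and the minimum is used as the t-norm.

```python
GAMMA_MIN, GAMMA_MAX = 0.005, 0.25

# Hypothetical centres of the five membership functions (assumed, see lead-in).
M_CENTERS = [-0.25, -0.15, 0.0, 0.15, 0.25]  # MLN, MIN, MS, MIP, MLP [Nm]
C_CENTERS = [-0.4, -0.2, 0.0, 0.2, 0.4]      # CLN, CIN, CS, CIP, CLP [Nm/s]

# Rule consequents from Table 2: rows CLN..CLP, columns MLN..MLP.
RULES = [
    [0.6 * GAMMA_MAX, 0.8 * GAMMA_MAX, 0.9 * GAMMA_MAX, 0.8 * GAMMA_MAX, 0.4 * GAMMA_MAX],
    [10 * GAMMA_MIN, 0.6 * GAMMA_MAX, 0.4 * GAMMA_MAX, 0.6 * GAMMA_MAX, 2 * GAMMA_MIN],
    [2 * GAMMA_MIN, 10 * GAMMA_MIN, 0.4 * GAMMA_MAX, 10 * GAMMA_MIN, GAMMA_MIN],
    [10 * GAMMA_MIN, 0.8 * GAMMA_MAX, 0.4 * GAMMA_MAX, 0.6 * GAMMA_MAX, 2 * GAMMA_MIN],
    [0.6 * GAMMA_MAX, GAMMA_MAX, GAMMA_MAX, GAMMA_MAX, 0.4 * GAMMA_MAX],
]


def memberships(x, centers):
    """Degrees of membership for triangular sets that saturate at both ends."""
    mu = [0.0] * len(centers)
    if x <= centers[0]:
        mu[0] = 1.0
        return mu
    if x >= centers[-1]:
        mu[-1] = 1.0
        return mu
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            mu[i], mu[i + 1] = 1.0 - w, w
            break
    return mu


def adaptation_gain(m_d, dm_d):
    """Zero-order Takagi-Sugeno inference: min T-norm, weighted-average output."""
    mu_m = memberships(m_d, M_CENTERS)
    mu_c = memberships(dm_d, C_CENTERS)
    num = den = 0.0
    for j, row in enumerate(RULES):
        for i, gamma_ij in enumerate(row):
            w = min(mu_m[i], mu_c[j])  # firing strength of rule (i, j)
            num += w * gamma_ij
            den += w
    return num / den


if __name__ == "__main__":
    print(adaptation_gain(0.0, 0.0))    # small, slowly varying load
    print(adaptation_gain(0.1, -0.3))   # interpolates between several rules
```

because every consequent lies between γmin and γmax and the output is a convex combination of consequents, the gain returned by this sketch always stays inside that interval.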
for the t-norm and the s-norm, the minimum and the maximum method were selected, respectively [21]. the fuzzy rule base of the suggested flss for adaptation gain selection is shown in table 2. the rule base comprises 25 rules, which were defined experimentally, based on repeated simulations with different values of γ. the flss output γ is a nonnegative scalar and can assume values from γmin = 0.005 to γmax = 0.25.

table 2 fuzzy rule base of flss
        mln        min        ms         mip        mlp
cln     0.6γmax    0.8γmax    0.9γmax    0.8γmax    0.4γmax
cin     10γmin     0.6γmax    0.4γmax    0.6γmax    2γmin
cs      2γmin      10γmin     0.4γmax    10γmin     γmin
cip     10γmin     0.8γmax    0.4γmax    0.6γmax    2γmin
clp     0.6γmax    γmax       γmax       γmax       0.4γmax

3.3. simulation results

based on matlab/simulink® simulation models, the standard mrac and the proposed fuzzy mrac of the velocity servo system with the parameters given in table 1 are compared. the tracking of a square-wave reference with a magnitude of ±100 rad/s and a period of 10 s is analysed. in the first case the trapezoidal load shown in fig. 4a is applied to the motor shaft. the responses of the speed servo system with mrac and fuzzy mrac are presented in fig. 4b and fig. 4d. it can be seen that during the transient of the load disturbance, when it has a constant rate of change, mrac with a greater γ provides a smaller reference tracking error, but when the load disturbance becomes constant, the response is more oscillatory. the fuzzy mrac has better reference tracking performance during both periods of the load disturbance profile, owing to the fuzzy adjustment of γ, which includes information on the load disturbance derivative. the change of γ is shown in fig. 4c.

the tracking performance in the presence of the sinusoidal load disturbance with an angular frequency of 4 rad/s, presented in fig. 5a, is also analysed. the responses of mrac and of the proposed controller are shown in fig. 5b and fig. 5d, respectively.
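the behaviour of the standard mrac with a fixed adaptation gain can be reproduced with a short forward-euler simulation of (2)-(4) together with the update laws (9) and (10). the sketch below is only illustrative: the plant and reference model values (a = 1, b = 0.5, a_m = b_m = 2), the unit-amplitude square-wave command and the step size are textbook-style assumptions, not the identified servo parameters of table 1, and the function names `simulate_mrac` and `iae` are hypothetical.

```python
def simulate_mrac(gamma, a=1.0, b=0.5, a_m=2.0, b_m=2.0,
                  t_end=100.0, dt=0.005, period=20.0):
    """Forward-Euler simulation of first-order MRAC with the MIT rule."""
    y = y_m = 0.0            # plant and reference model outputs
    f_uc = f_y = 0.0         # u_c and y filtered by 1/(p + a_m)
    theta1 = theta2 = 0.0    # controller parameters
    abs_err = []
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        u_c = 1.0 if (t % period) < period / 2.0 else -1.0  # square-wave command
        e = y - y_m
        abs_err.append(abs(e))
        u = theta1 * u_c - theta2 * y            # controller (4)
        dy = -a * y + b * u                      # plant (2)
        dy_m = -a_m * y_m + b_m * u_c            # reference model (3)
        df_uc = -a_m * f_uc + u_c                # filter for (9)
        df_y = -a_m * f_y + y                    # filter for (10)
        dtheta1 = -gamma * f_uc * e              # update law (9)
        dtheta2 = gamma * f_y * e                # update law (10)
        y += dt * dy
        y_m += dt * dy_m
        f_uc += dt * df_uc
        f_y += dt * df_y
        theta1 += dt * dtheta1
        theta2 += dt * dtheta2
    return theta1, theta2, abs_err


def iae(abs_err, dt, t_start, t_stop):
    """Integral of absolute error over [t_start, t_stop) by a rectangle rule."""
    i0, i1 = int(round(t_start / dt)), int(round(t_stop / dt))
    return sum(abs_err[i0:i1]) * dt


if __name__ == "__main__":
    for gamma in (0.5, 2.0, 8.0):
        th1, th2, err = simulate_mrac(gamma)
        print(gamma, round(th1, 3), round(th2, 3),
              round(iae(err, 0.005, 0.0, 100.0), 3))
```

running the loop for several fixed gains gives a feel for the trade-off discussed in the text, and the `iae` helper computes the same kind of integral-absolute-error index that is used to compare the controllers.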
it can be noted that mrac with a greater γ almost completely eliminates the influence of the load disturbance in the steady state, but the transient is more oscillatory. the proposed controller provides acceptable overshoot and steady-state reference tracking performance. the adaptation gain for this case is shown in fig. 5c. the integral of the absolute error $e = y - y_m$ for all cases with trapezoidal and sinusoidal load disturbance is summarized in table 3, and the results confirm the advantages of the proposed fuzzy mrac controller.

fig. 4 simulation results for tracking reference model: a) load torque, b) mrac for different γ (γ = 0.02, 0.05, 0.18), c) change of γ of fuzzy mrac, d) fuzzy mrac

fig. 5 simulation results for tracking reference model: a) load torque, b) mrac for different γ (γ = 0.02, 0.05, 0.18), c) change of γ of fuzzy mrac, d) fuzzy mrac

table 3 integral absolute error
                      mrac                              fuzzy mrac
                      γ=0.02     γ=0.05     γ=0.18
trapezoidal load      43.4       37.5       149.2       15.1
sinusoidal load       182.6      70         27.5        24.5

4. experimental validation

the experimental validation of the simulation results is realized with a laboratory velocity servo system. the experimental setup is shown in fig. 6. a dc servo motor with outputs for the angular rate and armature current signals is used.
the motor is equipped with a magnetic brake for generating a variable load torque. the communication between the personal computer and the dc servo motor is provided by the acquisition card dt 9812. the control signals from the acquisition card are amplified by the power amplifier before being applied to the armature of the motor.

fig. 6 experimental setup

mrac and fuzzy mrac are designed in the matlab/simulink® environment. the simulink model of the proposed fuzzy mrac is shown in fig. 7.

fig. 7 simulink model of fuzzy mrac (blocks: analog inputs dt9812 from tacho and from dc motor, analog output dt9812 to power amplifier, reference model 10/(0.01s+1), load torque observer, gain 1/ktg, fuzzy logic, adjustment mechanism, controller)

the step reference with a magnitude of 100 rad/s is generated in software. the tacho and dc motor armature current signals from the acquisition card are introduced into the simulink environment by analog input blocks. the control signal from the controller is passed to the acquisition card by an analog output block. the varying load torque, generated by the magnetic brake, is estimated with the observer and is shown in fig. 8a. the angular rate of the motor shaft is acquired and graphically presented in fig. 8b and fig. 8d. from the figures it can be seen that the experimental results are very similar to the simulation results. the speed servo system performance is much better with the proposed fuzzy mrac than with the conventional mrac. the change of the adaptation gain γ is shown in fig. 8c.

fig. 8 experimental results for tracking reference model: a) load torque, b) mrac for different γ (γ = 0.03, 0.1), c) change of γ of fuzzy mrac, d) fuzzy mrac

5. conclusion

the synthesis procedure of fuzzy logic model reference adaptive control (mrac) is presented in this paper. fuzzy mrac is suitable for use in industrial control applications that are subject to varying disturbances. the implementation of the proposed control algorithm is analysed on a laboratory velocity servo system, where the varying load torque has the main influence on system performance. the influence of the varying load disturbance is compensated by changing the adaptation gain with a relatively simple t-s fuzzy logic subsystem. the simulation results show the advantages of the fuzzy mrac concept, and the experimental validation confirms the simulation results.

acknowledgement: the paper is a part of the research supported by the ministry of education, science and technology development within the project iii44004 (2011-2014).

references
[1] z. bubnicki, modern control theory, general characteristics of control system, springer, 2005. doi: 10.1007/3-540-28087-1
[2] k. astrom, b. wittenmark, adaptive control, second ed. netherlands, addison-wesley, 1995.
[3] e. nebosko, a. proskurnikov, v. yakubovich, "adaptive regulators for the control of an uncertain linear discrete time system with a reference model", doklady mathematics, vol. 82(1), pp. 667–670, 2010. doi: 10.1134/s1064562410040423
[4] t. ren, t. chen, c. chen, "motion control for a two-wheeled vehicle using a self-tuning pid controller", control engineering practice, vol. 16, pp. 365–375, 2008. doi: 10.1016/j.conengprac.2007.05.007
[5] s. abdeddaim, a. betka, s. drid, m. becherif, "implementation of mrac controller of a dfig based variable speed grid connected wind turbine", energy conversion and management, vol. 79, pp. 281-288, 2014. doi: 10.1016/j.enconman.2013.12.003
[6] b. vinagre, i. petráš, i. podlubny, y. chen, "using fractional order adjustment rules and fractional order reference models in model-reference adaptive control", nonlinear dynamics, vol. 29, pp. 269-279, 2002. doi: 10.1023/a:1016504620249
[7] c. chien, k. sun, a. wu, l. fu, "a robust mrac using variable structure design for multivariable plants", automatica, vol. 32, pp. 833-848, 1996. doi: 10.1016/0005-1098(96)00009-x
[8] a. tayebi, "model reference adaptive iterative learning control for linear systems", international journal of adaptive control and signal processing, vol. 20, pp. 475–489, 2006. doi: 10.1002/acs.913
[9] p. balaguer, "similar model reference adaptive control with bounded control effort", international journal of adaptive control and signal processing, vol. 25, pp. 577–592, 2011. doi: 10.1002/acs.1222
[10] k. ichikawa, "an approach to the synthesis of model reference adaptive control system", international journal of control, vol. 32, pp. 175-190, 1980. doi: 10.1080/00207178008922852
[11] n. golea, a. golea, k. benmahammed, "fuzzy model reference adaptive control", ieee transactions on fuzzy systems, vol. 10(4), pp. 436-444, 2002. doi: 10.1109/tfuzz.2002.800694
[12] h. abid, m. chtourou, a. toumi, "an indirect model reference robust fuzzy adaptive control for a class of siso nonlinear systems", international journal of control, automation and systems, vol. 7, pp. 982-991, 2009. doi: 10.1007/s12555-009-0615-8
[13] s. mitrović, ž. đurović, "fuzzy logic controller for bidirectional garaging of differential drive mobile robot", advanced robotics, vol. 24(8), pp. 1291-1311, 2010. doi: 10.1163/016918610x501444
[14] c. dragoş, s. preitl, r. precup, m. cretiu, "modern control solutions for mechatronic servosystems. comparative case studies", in proceedings of the 10th international symposium of hungarian researchers on computational intelligence and informatics cinti 2009, budapest, hungary, 2009, pp. 69-82.
[15] m. kadjoudj, n. golea, m. benbouzid, "fuzzy rule-based model reference adaptive control for pmsm drives", serbian journal of electrical engineering, vol. 4, pp. 13-21, 2007. doi: 10.2298/sjee0701013k
[16] z. li, "model reference adaptive controller design based on fuzzy inference system", journal of information & computational science, vol. 8, pp. 1683–1693, 2011.
[17] p. swarnkar, s. jain, r. nema, "effect of adaptation gain in model reference adaptive controlled second order system", engineering, technology and applied science research, vol. 1, pp. 70-75, 2011.
[18] n. sinha, m. gupta, l. zadeh, soft computing and intelligent systems: theory and applications, academic press, 2000.
[19] ž. đurović, b. kovačević, signals and systems, beograd, academic mind, 2006 (in serbian).
[20] m. stanković, m. naumović, s. manojlović, "a simple servo system as a laboratory equipment for demonstrating optimal control design", in proceedings of the 57th conference etran, zlatibor, serbia, 2013, pp. au4.2.1-6 (cd edition in serbian).
[21] k. tanaka, h. wang, fuzzy control systems design and analysis, john wiley & sons, 2001.

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 285-296 doi: 10.2298/fuee1602285p

dielectric properties of la/mn codoped barium titanate ceramics

vesna paunović 1, vojislav mitić 1,2, miloš marjanović 1, ljubiša kocić 1
1 university of niš, faculty of electronic engineering, niš, serbia
2 institute of technical sciences of sasa, belgrade, serbia

abstract. la/mn codoped batio3 ceramics with various la2o3 contents, ranging from 0.3 to 1.0 at% la, were investigated regarding their microstructure and dielectric properties. the content of mno2 was kept constant at 0.01 at% mn in all samples. la/mn codoped and undoped batio3 were obtained by a modified pechini method and sintered in air at 1300 °c for two hours.
a homogeneous and completely fine-grained microstructure with an average grain size from 0.5 to 1.5 µm was observed in samples doped with 0.3 at% la. in highly doped samples, apart from the fine-grained matrix, local areas with secondary abnormal grains were observed. the dielectric properties were investigated as a function of frequency and temperature. the dielectric permittivity of the doped batio3 was in the range of 3945 to 12846 and decreased with an increase of the additive content. the highest values of the dielectric constant at room temperature (εr = 12846) and at the curie temperature (εr = 17738) were measured for the 0.3 at% la doped samples. the dissipation factor ranged from 0.07 to 0.62. the curie constant (C), curie-weiss temperature (T0) and critical exponent (γ) were calculated using the curie-weiss law and the modified curie-weiss law. the highest value of the curie constant (C = 3.27·10^5 K) was measured in the 1.0 at% la doped samples. the obtained values of γ ranged from 1.04 to 1.5, which points to a sharp phase transformation from the ferroelectric to the paraelectric phase.

key words: barium titanate, ceramics, dielectric properties

1. introduction

barium titanate has attracted a considerable amount of attention over the years due to its excellent physical and electrical properties and numerous practical applications [1-3]. batio3-based ceramics are widely used for multilayer capacitors (mlccs), ptc thermistors, varistors, and dynamic random access memories (dram) in integrated circuits [4-6]. for mlcc applications, dielectric materials need to be electrically insulating and to exhibit high permittivity values and low dielectric losses at room temperature. as overload protection devices, they are required to be semiconducting at room temperature and to undergo a sharp rise in resistivity when heated above the ferro- to paraelectric phase transition temperature, Tc [7].

received may 8, 2015; received in revised form october 20, 2015
corresponding author: vesna paunović, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: vesna.paunovic@elfak.ni.ac.rs)

at room temperature, batio3 adopts a tetragonal perovskite structure and is a ferroelectric with high permittivity. it transforms to the cubic, paraelectric state at the curie temperature, Tc, of 132 °c. also, undoped batio3 is electrically insulating at room temperature. the dielectric properties of batio3 depend on the synthesis method, density, grain size, and sintering procedure. consequently, there is considerable interest in the preparation of powders of high homogeneity and of ceramics of high density and small grain size. homogeneous starting powders can be obtained by the conventional solid state reaction, the oxalate precipitation method and the modified pechini process [8, 9]. the pechini method of preparation has the advantage of raising the permittivity of modified batio3, compared with samples obtained by conventional solid state sintering.

the electrical and dielectric properties of batio3 ceramics can be modified by using various types of additives, as well as processing procedures [10-12]. generally, ions with a large radius and low valence, like la3+, ca2+, dy3+ and y3+, tend to enter the a sites (ba2+ sites), while ions with a small radius and higher valence, like nb5+ and ta5+, favour the b sites (ti4+) [13-17]. substitution of the barium or titanium ion with small concentrations of ions with a similar radius can lead to structural and microstructural changes and, furthermore, modify the dielectric and ferroelectric properties. some of the dopants shift the transition temperature of batio3 or induce broadening of the εr-T curve, and many of them cause diffuseness of the ferroelectric transition.
the phase transition from the ferroelectric to the paraelectric phase can show either a sharp dielectric maximum or a diffuse dielectric maximum, the latter being characteristic of relaxor ceramics. according to literature data, partial substitution of ba or ti ions with dopants such as la, zn and sb causes a diffuse phase transition, a high dielectric constant and low losses, whereas sn, ce, zr, bi and hf cause ferroelectric relaxor behavior [17]. the addition of cazro3 to batio3 ceramics enhances the capacitance of the capacitor and reduces the curie temperature [18]. the dielectric behavior of nb5+ modified batio3 ceramics is governed by the presence of non-ferroelectric regions, which decrease the value of the dielectric constant. the shift of the curie temperature towards lower temperatures is attributed to the replacement of ba2+ with bi3+ [19]. the addition of sb inhibits grain growth, promotes the formation of a uniform microstructure and also increases the dielectric constant. among the additives, lanthanum, la, is the most efficient in raising the dielectric permittivity of modified batio3 ceramics [20-22]. la as a donor dopant decreases the grain size and enhances the dielectric constant. in la doped ceramics the curie temperature was shifted towards lower temperatures and the dielectric constant values were much higher than in pure batio3. also, it was found that the dielectric losses decrease with the addition of la to batio3. at higher concentrations of la, the dielectric maximum was broadened. the relaxor-type frequency dependence of permittivity was also found in batio3. the substitution of la3+ on the ba2+ sites requires the formation of negatively charged defects. there are three possible compensation mechanisms: barium vacancies ($V_{Ba}''$), titanium vacancies ($V_{Ti}''''$) and electrons ($e'$) [23-25].
dielectric properties of la/mn codoped barium titanate ceramics
small additions of lanthanum (< 0.5 at%), which replaces the ba ions, lead to the formation of a bimodal microstructure and n-type semiconductivity, which has been widely believed to occur via an electronic compensation mechanism if the samples are heated in a reducing or argon atmosphere.
$\mathrm{La_2O_3 + 2TiO_2 \rightarrow 2La_{Ba}^{\bullet} + e'}$ (1)
in heavily doped samples (> 0.5 at%) sintered in air atmosphere, which are characterized by a small-grained microstructure, a high insulation resistance and life stability of the multilayer capacitors can be achieved. the principal doping mechanism is the ionic compensation mechanism (titanium vacancy compensation mechanism).
$\mathrm{La_2O_3 + 3TiO_2 \rightarrow 2La_{Ba}^{\bullet} + V_{Ti}'''' + 3Ti_{Ti} + 9O_O}$ (2)
at low partial pressures of oxygen the characteristic mechanism is electronic compensation, while at high pressures it is ionic compensation. mno2 is frequently added to batio3, together with other additives, in order to reduce the dissipation factor. manganese has a double role: as an acceptor dopant incorporated at ti4+ sites, it can be used to counteract the effect of the oxygen vacancy donors; as an additive segregating at grain boundaries, it can prevent exaggerated grain growth. manganese is a valence-unstable acceptor-type dopant, which may take different valence states, mn2+, mn3+ or even mn4+, during the post-sintering annealing process. mn2+ is stable in the cubic phase and is easily oxidized to the mn3+ state, which is more stable in the tetragonal phase. for codoped systems [26-28], the formation of donor-acceptor complexes such as $2[La_{Ba}^{\bullet}]$-$[Mn_{Ti}'']$ prevents a valence change from mn2+ to mn3+.
generally, in codoped batio3 ceramics, the controlled incorporation of a donor dopant, such as la, in combination with an acceptor (mn) leads to the formation of ceramics with a uniform microstructure and a high dielectric constant at room temperature as well as at the curie temperature. the codoped ceramics showed lower dielectric losses compared to the undoped ceramics. also, one of the reasons for using modified batio3 is that the additives shift the curie temperature into a temperature range that can be used effectively, significantly below 132°c. the purpose of this paper is to study the dielectric properties of la/mn codoped batio3 ceramics, obtained by the pechini method, as a function of dopant concentration. the curie-weiss and modified curie-weiss laws were used to clarify the influence of the dopant on the dielectric properties of batio3.
2. experiments and methods
the la/mn codoped batio3 ceramics were prepared from organometallic complexes based on the modified pechini procedure [9], starting from barium and titanium citrates. this method provides a low-temperature powder synthesis process (below 800°c), good stoichiometry and easy incorporation of dopants in the crystal lattice. the content of the additive oxide, la2o3, ranged from 0.3 to 1.0 at%. the content of mno2 was kept constant at 0.01 at% in all samples. for comparison purposes, samples free of la and mn were prepared in the same manner. the modified pechini process was carried out as a three-stage process for the preparation of a polymeric precursor resin. solutions of titanium citrate and barium citrate were mixed and heated at 90°c, and then la and mn were added. the temperature was raised to 120–140°c to promote polymerization and remove the solvents. the decomposition of most of the organic carbon residue was performed in an oven at 250°c for 1 h and then at 300°c for 4 h.
thermal treatment of the obtained precursor was performed at 500°c for 4 h, 700°c for 4 h and 800°c for 2 h. after drying at room temperature and passing through a sieve, the barium titanate powder was obtained. the powders were isostatically pressed at 98 mpa into disks 10 mm in diameter and 2 mm thick. the samples were sintered in air atmosphere at 1300°c for 2 h with a heating rate of 10°c/min. the bulk density was measured by the archimedes method. the specimens are denoted as, for example, 0.3 la/mn-batio3 for the specimen with 0.3 at% la and 0.01 at% mn, and so on. the microstructures of the sintered or chemically etched samples were observed by a scanning electron microscope jeol-jsm 5300 equipped with an eds (qx 2000s) system. capacitance and dissipation factor were measured using an agilent 4284a precision lcr meter in the frequency range from 20 hz to 1 mhz. the variation of the dielectric permittivity with temperature was measured in the temperature interval from 20 to 180°c. the dielectric parameters such as the curie-weiss temperature (t0), curie constant (c) and critical exponent γ were calculated according to the curie-weiss and modified curie-weiss laws.
3. microstructure characteristics
the relative density of the la/mn codoped samples varied from 90% to 95% of theoretical density (td), depending on the amount of additives, being lower for higher additive concentrations. the main characteristic of the low doped samples, i.e. the samples doped with 0.3 at% of la, is a completely fine-grained and homogeneous microstructure with a fairly narrow size distribution. the grain sizes ranged from 0.5 to 1.5 µm (fig. 1a) and there was no evidence of any secondary abnormal grain growth. with an increase of the additive content, the microstructure of the specimens doped with 0.5 at% of la showed quite significant grain growth with varied grain size. besides a small amount of 1 µm grains, most of the grains were approximately 3-8 µm (fig. 1b).
fig. 1 sem images of la/mn codoped batio3, a) 0.3 at% la and b) 0.5 at% la.
the microstructure evolution in the samples doped with 1.0 at% of la was quite different from that observed in the other samples. in the 1.0 at% la doped samples, apart from the fine-grained matrix with a grain size of 2-3 µm, some local areas with secondary abnormal grains (fig. 2a) were observed. the size of the secondary abnormal grains was in the range 10-15 µm. for undoped batio3 ceramics (fig. 2b), the microstructure was characteristically non-uniform, with a grain size distribution from 1 to 15 µm.
fig. 2 sem images of a) 1.0 at% la/mn codoped batio3 and b) undoped batio3 ceramics.
the difference in microstructural features is also associated with the inhomogeneous distribution of la, as can be seen in the eds spectra taken from different areas in the same sample (fig. 3). the existence of x-ray peaks for lanthanum (l-la peak) in the eds spectrum of the 1.0 at% doped sample indicates that la-rich regions coexist with the nominal perovskite phase. it is worth noting that concentrations less than 1.0 at% could not be detected by the eds attached to the sem unless an inhomogeneous distribution or segregation of the additive was present. the la-rich regions are associated with the small-grained microstructure, whereas the eds spectrum free of la content corresponds to the abnormal grains. also, the eds analysis did not reveal any content of mn, thus a homogeneous distribution of mn throughout the specimens can be assumed.
fig. 3 sem/eds images of 1.0 la/mn codoped batio3.
3. dielectric characteristics
all la/mn doped samples that were investigated are electrical insulators with an electrical resistivity ρ of 10^8 Ω cm at room temperature.
the high resistivity indicates that the ionic compensation mechanism (titanium vacancy compensation mechanism) is exclusively involved during the la incorporation into the batio3 matrix and, due to the immobility of cation vacancies at room temperature, the doped samples remain insulating. the observed microstructural characteristics, which depend on the type and concentration of additive, have a direct influence on the dielectric properties. the dielectric properties of batio3 ceramics (dielectric permittivity εr and dissipation factor tan δ) were measured as a function of frequency and temperature. the dielectric constant was determined in the frequency range from 20 hz to 1 mhz. after the initial high value at low frequency, the dielectric constant becomes nearly constant at frequencies greater than 10 khz. with an increase of additive content, the dielectric constant decreases. the highest value of the dielectric constant (εr = 12846) was measured for samples doped with 0.3 at% of la, characterized by a small-grained microstructure and high sintering density (fig. 4). the lowest value of the dielectric constant (εr = 5200) was measured for the 1.0 at% la doped samples. for the undoped batio3 ceramics, the dielectric constant was 2230 and was essentially independent of frequency.
fig. 4 dielectric constant of undoped and la/mn-batio3 ceramics as a function of frequency.
the dielectric loss (tan δ) values are in a wide range from 0.07 to 0.62 (fig. 5). the main characteristic of all doped specimens is that, after the initial high dielectric loss values, tan δ decreases and is nearly independent of frequency above 20 khz. the highest value of tan δ, and a considerable change of tan δ with frequency, from 0.61 to 0.2, were recorded in the 0.3la/mn doped batio3 ceramics.
fig. 5 the dielectric losses as a function of frequency for undoped and la/mn-batio3 ceramics.
the dielectric properties of batio3 ceramics can also be analyzed through the permittivity-temperature dependence (fig. 6). the variation of the dielectric constant as a function of temperature clearly displays the effects of additive content and microstructural composition on the dielectric properties. the highest values of the dielectric constant at room temperature (εr = 12846) and at the curie temperature (εr = 17738) were measured for the 0.3la/mn codoped batio3 samples, which are also characterized by a small-grained, uniform microstructure and high density.
fig. 6 dielectric constant of batio3 ceramics as a function of temperature.
with an increase of additive content the dielectric constant decreases. for the samples doped with 1.0 at% la, the dielectric constant at room temperature is 3945 and at the curie temperature is 8270. the variations in dielectric constant between low and heavily codoped la/mn ceramics, sintered at the same temperature, can be attributed on the one hand to the different density (density decreases with an increase of additive content) and on the other hand to the presence of a la-rich phase and the formation of secondary abnormal grains, which lead to a decrease in the dielectric permittivity. in general, a pronounced permittivity-temperature response and a sharp phase transition from the ferroelectric to the paraelectric phase at the curie temperature are observed for all doped batio3 samples and for undoped batio3. this can be seen from the ratio of the permittivity at the curie point (εrmax) to that at room temperature (εrmin), i.e. εrmax/εrmin, which is 1.38 for the 0.3 at% doped samples, 1.7 for the 0.5la/mn doped samples and 2.09 for the 1.0 at% doped batio3. the curie temperature (tc) of the codoped samples is shifted towards lower temperatures compared to undoped batio3 ceramics, for which the curie temperature is 134°c.
for doped samples, tc ranged from 110°c for 0.3la/mn batio3 to 122°c for 1.0la/mn batio3 ceramics (table 1). the shift of the curie temperature for the codoped ceramics was heavily dependent on the donor/acceptor ratio. in the 0.3la/mn-batio3 ceramics the donor/acceptor ratio is 30, and in 1.0la/mn it is 100. with increasing la concentration and the formation of donor-acceptor complexes $2[La_{Ba}^{\bullet}]$-$[Mn_{Ti}'']$, the possibility of oxidation of mn2+ to the mn3+ and mn4+ states was reduced, so the influence of mn on the shift in curie temperature in 1.0la/mn batio3 ceramics was smaller.
fig. 7 reciprocal value of εr as a function of temperature.
all specimens have a sharp phase transition and follow the curie-weiss law:
$\varepsilon_r = \dfrac{C}{T - T_0}$ (3)
where $C$ is the curie constant and $T_0$ the curie-weiss temperature, which is close to the curie temperature. the curie-weiss temperature (t0) was obtained by linear extrapolation of the inverse dielectric constant measured above tc down to zero (fig. 7). the curie-weiss temperature decreased with an increase of additive concentration. the curie constant (c) was obtained by fitting the plot of the inverse dielectric constant vs. temperature for data above tc, and corresponds to the reciprocal of the slope of this line. with an increase of dopant amount, the curie constant (c) increased. the highest value of c (c = 3.27·10^5 k) was measured for the 1.0 at% la doped samples. the value of the curie constant is related to the grain size and porosity of the samples. the curie constant for undoped batio3 ceramics is c = 2.12·10^5 k. the curie constant and the curie-weiss temperature values are given in table 1.
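the extraction of c and t0 from the curie-weiss law εr = c/(t − t0) can be sketched numerically. the following is only a minimal illustration, assuming numpy and synthetic data: the function name `curie_weiss_fit` and all sample values are hypothetical and are not taken from the measurements reported here.

```python
import numpy as np

def curie_weiss_fit(temps_c, eps_r, t_c):
    """Fit 1/eps_r = (T - T0)/C to the data above Tc and return (C, T0)."""
    temps_c = np.asarray(temps_c, dtype=float)
    eps_r = np.asarray(eps_r, dtype=float)
    above = temps_c > t_c                      # paraelectric region only
    slope, intercept = np.polyfit(temps_c[above], 1.0 / eps_r[above], 1)
    curie_constant = 1.0 / slope               # C is the reciprocal of the slope
    t0 = -intercept / slope                    # line extrapolated to 1/eps_r = 0
    return curie_constant, t0

# Synthetic, noise-free data generated from assumed (hypothetical) parameters.
C_TRUE, T0_TRUE = 2.0e5, 100.0
temps = np.linspace(125.0, 180.0, 12)          # temperatures in deg C
eps = C_TRUE / (temps - T0_TRUE)
c_fit, t0_fit = curie_weiss_fit(temps, eps, t_c=120.0)
```

since only temperature differences enter the law, c comes out in kelvin even when the temperature axis is in °c, while t0 is returned on the same scale as the input temperatures.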
in order to investigate the curie-weiss behavior, the modified curie-weiss law was used [29]:
$\dfrac{1}{\varepsilon_r} - \dfrac{1}{\varepsilon_{r\max}} = \dfrac{(T - T_{\max})^{\gamma}}{C'}$ (4)
where $\varepsilon_r$ is the dielectric constant, $\varepsilon_{r\max}$ the maximum value of the dielectric constant, $T_{\max}$ the temperature at which the dielectric constant has its maximum, $\gamma$ the critical exponent for diffuse phase transformation (dpt) and $C'$ the curie-weiss-like constant. the dielectric parameters for undoped and doped batio3 ceramics, together with the values calculated according to the modified curie-weiss law, are given in table 1.
table 1 dielectric parameters for undoped and la/mn codoped batio3 samples
sample | εr at 300 k | εr at tc | tan δ at 300 k | tc [°c] | t0 [°c] | c [10^5 k] | γ
pure batio3 | 2230 | 5488 | 0.067 | 134 | 101.1 | 2.12 | 1.402
0.3la/batio3 | 12846 | 17738 | 0.610 | 110 | 106.9 | 1.95 | 1.509
0.5la/batio3 | 6550 | 11196 | 0.248 | 118 | 94.8 | 2.67 | 1.044
1.0la/batio3 | 3945 | 8270 | 0.177 | 122 | 87.1 | 3.27 | 1.536
the critical exponent of the nonlinearity γ was calculated from the best fit of the curve ln(1/εr − 1/εm) vs. ln(t − tm), as shown in fig. 8. the critical exponent γ represents the slope of the curve. for a single batio3 crystal, γ is 1.08, and it gradually increases up to 2 for diffuse phase transformation in doped batio3.
fig. 8 the modified curie-weiss plot ln(1/εr − 1/εm) vs. ln(t − tm) for batio3 samples. the slope of the curve determines the critical exponent γ.
as can be seen in fig. 8, the critical exponent γ is in the range from 1.044 to 1.536, which is in agreement with the experimental data. these samples are characterized by a sharp phase transition from the ferroelectric to the paraelectric phase at the curie point. the highest value of the critical exponent (γ = 1.536) was calculated for the 1.0 at% la/mn doped samples.
4. conclusion
the dielectric properties of la/mn codoped ceramics depend heavily on the additive concentration and on the microstructure obtained during sintering.
all samples have a resistivity of 10^8 Ω cm and are electrical insulators at room temperature. the highest values of the dielectric constant at room temperature (εr = 12846) and at the curie temperature (εr = 17738) were measured for the 0.3 at% la/mn doped ceramics. this composition displayed a high density and a small-grained microstructure. with an increase of the additive content, the dielectric constant decreased; for the samples doped with 1.0 at% la, εr is 3945. the differences in dielectric constant values between low and heavily doped batio3 are due first to the different density (porosity) of the doped ceramics and secondly to the presence of non-ferroelectric la-rich regions and secondary abnormal grains. the dielectric loss values are in a wide range from 0.07 to 0.62. after initially greater dielectric loss values at low frequency, tan δ decreases and is nearly independent of frequency above 20 khz. all specimens followed the curie-weiss law with a sharp phase transition. the curie temperature of doped batio3 ceramics was shifted towards lower temperatures compared to undoped batio3. the curie temperature values ranged from 110°c for 0.3la/mn batio3 to 122°c for 1.0la/mn batio3 ceramics. the curie constant increases with an increase of additive content. the highest value of c (c = 3.27·10^5 k) was measured in samples doped with 1.0 at% of la. the critical exponent γ is in the range from 1.044 to 1.536, which indicates a sharp phase transformation from the ferroelectric to the paraelectric phase at the curie temperature.
acknowledgement: this research is a part of the project “directed synthesis, structure and properties of multifunctional materials” (172057). the authors gratefully acknowledge the financial support of serbian ministry of education, science and technological development for this work.
references
[1] h. kishi, n. kohzu, j. sugino, h. ohsato, y. iguchi, t.
okuda, "the effect of rare-earth (la, sm, dy, ho and er) and mg on the microstructure in batio3", j. e. ceram. soc. vol. 19, pp. 1043-1046, 1999. [2] lj. zivkovic, v. paunovic, n. stamenkov, m. miljkovic, "the effect of secondary abnormal grain growth on the dielectric properties of la/mn co-doped batio3 ceramics", science of sintering, vol.38, pp. 273-281, 2006. [3] m. vijatovic petrovic, j. bobic, t. ramoska, j. banys, b. stojanovic, "electrical properties of lanthanum doped barium titanate ceramics, materials characterization, vol. 62, pp.1000-1006, 2011. [4] d.h. kuo, c.h. wang, w.p. tsai, "donor and acceptor cosubstituted batio3 for nonreducible multilayer ceramic capacitors", ceramics international, vol. 32, pp.1-5, 2006. [5] j. qi, z. gui, y. wang, q. zhu, y. wu, l. li, "ptcr effect in batio3 ceramics modified by donor dopant", ceramic international, vol. 28, pp.141-143, 2002. [6] m. wegmann, r. bronnimann, f. clemens, t. graule, "barium titanate-based ptcr thermistor fbers: processing and properties", sens. actuators a: phys., vol. 135 (2), pp. 394–404, 2007. [7] e. brzozowski, m.s. castro, "conduction mechanism of barium titanate ceramics", ceramics international, vol. 26, pp. 265-269, 2000. [8] w. caia, c. fu, z. lin, x. deng, w. jiang, "influence of lanthanum on microstructure and dielectric properties of barium titanate ceramics by solid state reaction", advanced materials research, vol. 412, pp. 275-279, 2012 [9] m.p.pechini, method of preparing lead and alkaline earth titanates and coating method using the same to form a capacitor, us patent no. 3.330.697, 1967. [10] a. ianculescu, z.v. mocanu, l.p. curecheriu, l. mitoseriu, l. padurariu, r. trusca, "dielectric and tunability properties of la-doped batio3 ceramics", journal of alloys and compounds, vol. 509, issue 41, pp. 10040–10049, 2011. [11] a.k.yadav, c.gautam, " dielectric behavior of perovskite glass ceramics", j. mater sci: materials in electronics, vol. 25, pp. 5165-5187, 2014. 
[12] a.k.yadav, c.gautam, "a review on crystallisation behaviour of perovskite glass ceramics", advances in applied ceramics, vol. 113 (4), pp.193-207, 2014. [13] e.j. lee, j. jeong, y.h. han, "defects and degradation of batio3 codoped with dy and mn", jpn. j. appl. phys. vol. 45, pp. 822-825, 2006. [14] s. m. park, y. h. han, "dielectric relaxation of oxygen vacancies in dy-doped batio3", journal of the korean physical society, vol. 57, no. 3 pp. 458463, 2010. [15] k.j. park, c.h. kim,y.j. yoon, s.m. song, "doping behaviors of dysprosium, yttrium and holmium in batio3 ceramics", j.e.ceram.soc., vol. 29, pp. 1735-1741, 2009. [16] s.m. bobade, d.d. gulwade, a.r. kulkarni, p.gopalan, "dielectric properties of aand b-site doped batio3 (i): laand al-doped solid solution", j. appl. phys., 97:074105, 2005. [17] d. gulwade, p. gopalan, "dielectric properties of aand b-site doped batio3: effect of la and ga", physica b, 404, pp.1799–805, 2009. [18] p.r. krishnamoorthy, p. ramaswamy, b.h. narayana, "cazro3 additives to enhance capacitance properties in batio3 ceramic capacitors", j. mater. sci. mater. electron., vol. 3, pp.176–180, 1992. [19] y. yuan, m. du, s. zhang, z. pei, "effects of binbo4 on the microstructure and dielectric properties of batio3 –based ceramics", j. mater. sci. mater. electron., vol. 20, pp.157–162, 2009. 
[20] v. paunovic, l.j. zivkovic, v. mitic, "influence of rare-earth additives (la, sm and dy) on the microstructure and dielectric properties of doped batio3 ceramics", science of sintering, vol. 42, pp. 69–79, 2010. [21] w. li, z. xu, r. chu, p. fu, "structure and dielectric behavior of la-doped batio3 ceramics", adv. mater.res., vol. 105–106, pp. 252–254, 2010. [22] f.d. morrison, d.c. sinclair, a.r. west, "electrical and structural characteristics of lanthanum-doped barium titanate ceramics", j. appl. phys., vol 86, pp. 6355–6366, 1999. [23] r. zhang, j.f. li, d. viehland, "effect of aliovalent substituents on the ferroelectric properties of modified barium titanate ceramics: relaxor ferroelectric behavior", j.am.ceram.soc., vol.87, pp. 864-870, 2004. [24] f.d. morrison, a.m. coats, d.c.sinclair, a.r.west, "charge compensation mechanisms in la-doped batio3", j.europ.ceram.soc., vol. 6, no. 3, pp. 219-232, 2001. [25] f.d. morrison, d.c.sinclair, a.r.west, "doping mechanisms and electrical properties of la-doped batio3ceramics", int. j. inorg. mater., vol. 3, pp.1205–1210, 2001. [26] h. kishi, n.
kohzu, y. iguchi, j. sugino, m. kato, h. ohasato, t. okuda, "occupation sites and dielectric properties of rare-earth and mn substituted batio3", j.europ.ceram.soc., vol. 21, pp. 1643-1647, 2001. [27] h. miao, m. dong, g.tan, y.pu, "doping effects of dy and mg on batio3 ceramics prepared by hydrothermal method", journal of electroceramics, vol. 16, pp. 297–300, 2006. [28] k.albertsen, d.hennings, o.steigelmann, "donor-acceptor charge complex formation in barium titanate ceramics: role of firing atmosphere", journal of electroceramics, 2:3, pp. 193-198, 1998. [29] k. uchino, s. namura, "critical exponents of the dielectric constants in diffuse-phase transition crystals", ferroelectrics letters, vol.44, pp. 55–61, 1982.
facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 145-154 https://doi.org/10.2298/fuee2202145м © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
review paper
prior knowledge based neural modeling of microstrip coupled resonator filters
zlatica marinković1, miloš mitić1, branka milošević1, marin nedelchev2
1university of niš, faculty of electronic engineering, niš, serbia
2technical university of sofia, faculty of telecommunications, sofia, bulgaria
abstract. the design of microstrip coupled resonator filters includes determination of the coupling coefficients between the filter resonator units. in this paper a novel modeling procedure exploiting the prior knowledge neural approach is proposed as an efficient alternative to standard electromagnetic (em) simulations and to neural models based purely on artificial neural networks (anns). it has an accuracy similar to that of em simulations and requires less training data and less model development time than models based purely on anns.
key words: artificial neural networks, coupled filters, design, microstrip
received february 17, 2022
corresponding author: zlatica marinković, university of niš, faculty of electronic engineering, 18106 niš, aleksandra medvedeva 14, serbia, e-mail: zlatica.marinkovic@elfak.ni.ac.rs
1. introduction
microstrip coupled resonator filters act as bandpass filters and are widely exploited in modern microwave communication systems. planar filters are a good choice for realizing low passband loss and a high rejection ratio in the stopband. they are easily manufactured using printed circuit board (pcb) technology with high accuracy and at a relatively low price. planar filters’ responses do not vary when manufactured in series, and their adjustment and tuning are straightforward. the variety of classical and cross-coupled topologies of microstrip filters can realize the chebyshev equiripple and quasi-elliptic responses. the preferred resonators for practical realizations are half-wavelength resonators and their compact variants, hairpin and square open loop resonators [1]. the square open loop resonators offer a compact size with a good quality factor, inheriting the frequency properties of the half-wavelength resonator. as many microwave systems are relatively narrowband, square open loop resonators can realize narrow bandwidths with weak coupling coefficients at a reasonable distance between them. the cross-coupled filters with quasi-elliptic frequency response require clear identification of the sign of the coupling coefficient, which means clarifying whether the coupling is of the electrical, magnetic or mixed type, especially between non-adjacent resonators. compared to the half-wavelength resonators, the square open loop resonators solve this difficulty, with the added benefit of flexible coupling topologies. the filter synthesis process follows the classical approach through the calculation of the coupling matrix according to the chosen approximation.
in the microwave systems, the most popular and implemented approximation is the chebyshev one [2]-[3]. in [2] the design process of the polynomials and the transversal coupling matrix is given. many authors offer matrix rotations to transform the canonical or transversal matrix to the exact matrix corresponding to the chosen topology [1]-[2]. an optimization method for direct calculation of the interresonator coupling coefficients is proposed in [4]. nevertheless, whatever method for synthesis is chosen, the distance between the resonators should be calculated precisely. in [1] it is proposed to utilize a full-wave em simulator, which is a rigorous approach, but suffers from a high time consumption and high calculation power needed. to overcome time consuming em simulations or complex optimization methods, new approaches based on application of artificial neural networks (anns) have been proposed to model the filter coupling properties on the filter resonator physical dimensions and/or the properties of the chosen dielectric material [5]-[6]. moreover, the ann based approach has been applied to perform inverse modeling of the filter. namely, the anns are used to determine the distance between the filter resonators for the given coupling properties [7]-[8] or resonator dimensions and the given coupling coefficient [5]-[6]. however, the developed models of the filter coupling properties shown in [5] are valid for only one considered dielectric material (i.e., for one specified value of the relative dielectric constant). in other words, it means that for each dielectric material it is necessary to develop a new neural model. to build a model which would be valid for different values of the relative dielectric constant, it would be necessary to acquire a bigger amount of the em simulated data, which would be time consuming and thus making the whole modeling procedure inefficient. 
in this paper we propose a novel approach to microstrip coupled resonator filter modeling, which is based on the prior knowledge neural approach. namely, instead of exploiting the anns only, here the anns are combined with empirical formulae intended for the approximate determination of the filter coupling coefficient. this approach provides a single model for all considered values of the relative dielectric constant. moreover, the model can be built with less data than the separate purely ann based models. the rest of the paper is structured as follows. the considered microstrip coupled resonator structure, as well as the empirical expressions used for the approximate determination of the filter coupling coefficients, are described in section 2. section 3 contains a brief background of the prior knowledge neural approach. the novel prior knowledge neural model is proposed in section 4, whereas the obtained results and the discussion are given in section 5. section 6 contains conclusions.
2. microstrip coupled resonator filters
the square open loop resonator is a half-wavelength long microstrip line with open ends (see fig. 1a). the form of the resonator is symmetrical and the electromagnetic field distribution along it is predictable due to the symmetry. the open ends are supposed to be shortened because of the fringe capacitance [9]-[10].
fig. 1 (a) the topology of the microstrip square open loop resonator, (b) an example of coupled resonators
the different orientations of the resonators on the top plane of the substrate form various kinds of coupling topologies. the coupling mechanism is achieved by the fringe fields when the resonators are adjacent to each other. the electrical field is stronger than the magnetic field near the open end of the resonator, and the magnetic field is predominant at the center of the resonator.
the strength of the electrical field and of the magnetic field decays rapidly with the distance from the open end and from the center of the resonator, respectively. the coupling structures in fig. 1b perform mixed coupling. it is not possible to determine which field is dominant. the value of the coupling coefficient of the coupled resonators in fig. 1b is much lower, because the currents are out-of-phase. this topology is applicable in narrow bandwidth filters. the considered microstrip resonator is of a square shape with the length a and the line width w, fabricated on a substrate having the height h and the relative dielectric constant εr; s denotes the spacing between the coupled resonators. the coupling coefficient k (including mixed electric and magnetic coupling) is precisely calculated in the em simulators, but a rough value of the coupling coefficient can be calculated using the following expressions [11]:
$k = k_e' + k_m'$, (1)
$k_m' = 0.5\,k_m$, (2)
$k_e' = 0.6\,k_e$. (3)
the coefficient of magnetic coupling $k_m$ and the coefficient of electric coupling $k_e$ are calculated as:
$k_e = \dfrac{\pi}{16} F_e \exp(-A_e)\exp(-B_e)\exp(-D_e)$, (4)
$k_m = \dfrac{\pi}{16} F_m \exp(-A_m)\exp(-B_m)\exp(-D_m)$, (5)
where:
$A_e = 0.2259 - 0.01571\,\varepsilon_r + 0.1\sqrt{\varepsilon_r + 1}\,\dfrac{w}{h}$, (6)
$B_e = \left[1.0678 + 0.226\ln\!\left(\dfrac{\varepsilon_r + 1}{2}\right)\right]\left(\dfrac{s}{h}\right)^{p_e}$, (7)
$p_e = 1.0886 + 0.03146\left(\dfrac{w}{h}\right)^{4}$, (8)
$D_e = \left[0.1608 - 0.06945\sqrt{\dfrac{a}{h}}\right]\left(\dfrac{s}{h}\right)^{1.15}$, (9)
$F_e = -0.9605 + 1.4087\sqrt{\dfrac{a}{h}} - 0.2443\,\dfrac{a}{h}$, (10)
$A_m = -0.06864 + 0.14142\,\dfrac{w}{h} + 0.08655\left(\dfrac{w}{h}\right)^{3}$, (11)
$B_m = 1.2\left(\dfrac{s}{h}\right)^{p_m}$, (12)
$p_m = 0.8885 - 0.1751\sqrt{\dfrac{w}{h}}$, (13)
$D_m = \left[1.154 - 0.8242\sqrt{\dfrac{a}{h}} + 0.1417\,\dfrac{a}{h}\right]\dfrac{s}{h}$, (14)
$F_m = -0.5014 + 1.0051\sqrt{\dfrac{a}{h}} - 0.1557\,\dfrac{a}{h}$. (15)
3. prior knowledge neural modeling approach
owing to their excellent fitting capabilities, artificial neural networks have found many applications in the field of rf and microwaves [12]-[19].
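returning briefly to the closed-form expressions (1)-(15) of section 2, their evaluation can be sketched as follows. this is a hedged illustration only: the function name `coupling_approx` and the example dimensions are hypothetical, and the placement of the square-root terms follows the form in which such closed-form coupling expressions are usually written, so it should be checked against [11] before use.

```python
import math

def coupling_approx(a, s, w, h, eps_r):
    """Approximate mixed coupling coefficient k, eqs. (1)-(15).

    a, s, w and h must share one length unit; only their ratios are used.
    The square-root terms are an assumption to be verified against [11].
    """
    ah, sh, wh = a / h, s / h, w / h

    # Electric coupling, eqs. (4) and (6)-(10)
    a_e = 0.2259 - 0.01571 * eps_r + 0.1 * math.sqrt(eps_r + 1.0) * wh
    p_e = 1.0886 + 0.03146 * wh ** 4
    b_e = (1.0678 + 0.226 * math.log((eps_r + 1.0) / 2.0)) * sh ** p_e
    d_e = (0.1608 - 0.06945 * math.sqrt(ah)) * sh ** 1.15
    f_e = -0.9605 + 1.4087 * math.sqrt(ah) - 0.2443 * ah
    k_e = math.pi / 16.0 * f_e * math.exp(-a_e) * math.exp(-b_e) * math.exp(-d_e)

    # Magnetic coupling, eqs. (5) and (11)-(15)
    a_m = -0.06864 + 0.14142 * wh + 0.08655 * wh ** 3
    p_m = 0.8885 - 0.1751 * math.sqrt(wh)
    b_m = 1.2 * sh ** p_m
    d_m = (1.154 - 0.8242 * math.sqrt(ah) + 0.1417 * ah) * sh
    f_m = -0.5014 + 1.0051 * math.sqrt(ah) - 0.1557 * ah
    k_m = math.pi / 16.0 * f_m * math.exp(-a_m) * math.exp(-b_m) * math.exp(-d_m)

    # Mixed coupling, eqs. (1)-(3)
    return 0.5 * k_m + 0.6 * k_e

# Hypothetical dimensions in mm, chosen only to exercise the function.
k_near = coupling_approx(a=7.0, s=1.0, w=1.0, h=1.27, eps_r=10.8)
k_far = coupling_approx(a=7.0, s=2.0, w=1.0, h=1.27, eps_r=10.8)
```

for these illustrative values the estimate decreases as the spacing s grows, which matches the rapid decay of the fringe fields described in section 2.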
Most of the applications have been based on the black-box modeling approach, which means that one or more ANNs are used to extract the relationship between the sets of the input and the output parameters (see Fig. 2a). However, in order to make the modeling procedure more efficient, less time consuming and more accurate without increasing the number of training data, the prior knowledge input (PKI) neural approach can be applied (see Fig. 2b) [12]. Namely, in the PKI approach, besides the original n input parameters, the ANN has additional inputs. They represent the prior knowledge, meaning that they are correlated to some extent with the output parameters. In general, the number of prior knowledge input parameters (l) can be, but is not necessarily, equal to the number of the output parameters (m). The prior knowledge can be, for instance, the values of the outputs obtained by an approximate or simplified method.

Fig. 2 (a) Black-box neural modeling approach, (b) prior knowledge input neural modeling approach

The ANNs used in this work are multilayer perceptron networks, having one input layer, one output layer and one or two hidden layers [12]. The transfer function of the input layer neurons is a unitary (identity) transfer function. The hidden layer neurons have sigmoid transfer functions, whereas the output layer neurons have a linear transfer function. The Levenberg-Marquardt algorithm is used for the ANN training. The PKI approach requires that the values of the prior knowledge parameters are available for each data sample used for the ANN training, as well as later for testing and employing the developed model.

The average test error (ATE), the worst case error (WCE) and the Pearson product-moment correlation coefficient (r) have been used as the metrics for comparing the models [11].
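These three metrics, whose definitions are given in Eqs. (16)-(20), can be computed as in the following minimal Python sketch. The function name and the percent scaling of ATE and WCE (matching the units used in the result tables) are choices made here for illustration, and r is implemented as the standard Pearson coefficient.

```python
import math


def model_metrics(predicted, target):
    """Return (ATE [%], WCE [%], r) for paired model responses and targets."""
    n = len(target)
    span = max(target) - min(target)  # dynamic range of the target values

    # Eq. (16): error relative to the dynamic range of the targets
    deltas = [(k - kt) / span for k, kt in zip(predicted, target)]

    ate = 100.0 * sum(abs(d) for d in deltas) / n  # Eq. (17)
    wce = 100.0 * max(abs(d) for d in deltas)      # Eq. (18)

    # Eqs. (19)-(20): Pearson product-moment correlation coefficient
    mean_k = sum(predicted) / n
    mean_kt = sum(target) / n
    cov = sum((k - mean_k) * (kt - mean_kt) for k, kt in zip(predicted, target))
    var_k = sum((k - mean_k) ** 2 for k in predicted)
    var_kt = sum((kt - mean_kt) ** 2 for kt in target)
    r = cov / math.sqrt(var_k * var_kt)

    return ate, wce, r


# Toy data, not taken from the paper
ate, wce, r = model_metrics([0.1, 0.5, 0.9], [0.0, 0.5, 1.0])
```

Normalizing by the dynamic range rather than by each target value keeps samples with small coupling coefficients from dominating the error statistics.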
If the error of the ANN response for the i-th input combination (i-th sample), $k_i$, compared to the corresponding target value, $k_{ti}$, relative to the dynamic range of the target values in the test set ($k_t^{\max} - k_t^{\min}$), is calculated as

$$\delta_i = \frac{k_i - k_{ti}}{k_t^{\max} - k_t^{\min}}, \tag{16}$$

the ATE, WCE and r are defined as follows:

$$ATE = \frac{1}{N}\sum_{i=1}^{N} |\delta_i|, \tag{17}$$
$$WCE = \max_{i=1,\dots,N} |\delta_i|, \tag{18}$$
$$r = \frac{\sum_{i=1}^{N} (k_i - \bar{k})(k_{ti} - \bar{k}_t)}{\sqrt{\sum_{i=1}^{N} (k_i - \bar{k})^2 \, \sum_{i=1}^{N} (k_{ti} - \bar{k}_t)^2}}, \tag{19}$$

where $N$ is the number of samples in the set for which the statistics are calculated, and $\bar{k}$ and $\bar{k}_t$ are the mean values of the ANN response and the target values, respectively:

$$\bar{k} = \frac{1}{N}\sum_{i=1}^{N} k_i \quad \text{and} \quad \bar{k}_t = \frac{1}{N}\sum_{i=1}^{N} k_{ti}. \tag{20}$$

4. Proposed model

In the proposed model, an ANN (Fig. 3) is trained to predict the coupling coefficient for the given resonator dimensions $a$, $s$, $w$, the substrate height $h$ and the relative dielectric constant $\varepsilon_r$. Besides these original input parameters, the ANN has an additional input representing the prior knowledge, namely the approximate value of the coupling coefficient, here denoted $k_{approx}$, which is calculated by Eqs. (1)-(15) given in Section 2. The training and test sets consist of data samples, where each sample contains one combination of values of the original input parameters, $k_{approx}$ calculated for that combination, and the corresponding target value of the coupling coefficient $k$ obtained by precise simulations in a full-wave EM simulator.

Fig. 3 Proposed PKI neural model of the microstrip coupled resonator coupling coefficient

5. Results and discussion

The proposed approach has been applied to model the microstrip coupled resonator coupling coefficient by exploiting the same data used in [6] for developing the black-box neural models, which predict the coupling coefficient for the given resonator dimensions and substrate height, $k = f(a, w, s, h)$, for a constant value of $\varepsilon_r$. Table 1 lists the considered ranges of the input dimensions as well as the considered values of $\varepsilon_r$. The training set consisted of 2089 samples covering all four $\varepsilon_r$ values, whereas the validation (test) set consisted of 40 samples not used in the training set. Several ANNs with different numbers of hidden neurons were trained, and the best model was obtained with the ANN having two hidden layers, each containing 17 neurons. The ATE, WCE and r values for the training set and the test set are given in Table 2. The corresponding scatter plots showing the correlation of the predicted and target values for the training and test sets are given in Fig. 4.

Table 1 Considered ranges/values of the input parameters

Parameter          Range/values
$a$                5 – 20 mm
$w$                0.1 – 4 mm
$s$                0.1 – 3.5 mm
$h$                0.254 – 1.575 mm
$\varepsilon_r$    2.33, 4.4, 6.15, 10.2

Table 2 Test statistics for the training and the test sets

Set             ATE [%]    WCE [%]    r
Training set    0.5        2          0.99967
Test set        0.24       2.55       0.99981

Table 3 Comparison of the predicted and target values for ten chosen test samples

k target    k ANN model    AE          RE [%]
0.096523    0.097757       0.001234    1.28
0.082074    0.082802       0.000728    0.89
0.066264    0.065804       0.000460    0.69
0.068744    0.066463       0.002280    3.32
0.073213    0.074809       0.001596    2.18
0.066295    0.065904       0.000390    0.59
0.074939    0.075631       0.000692    0.92
0.070582    0.070506       0.000075    0.10
0.058675    0.059139       0.000464    0.79
0.047486    0.047668       0.000183    0.38

Fig. 4 Correlation of the ANN generated coupling coefficient and the reference target values: (a) training set, (b) test set

Small errors in predicting both the training and the test values, as well as a good correlation, show that the proposed model has not only learnt the training data well but also generalizes accurately to the test set not seen by the ANN during the training phase. As an additional illustration, Table 3 reports, for ten randomly selected test samples, the target and predicted values together with the corresponding absolute errors (AE, the absolute difference of the predicted and target values) and relative errors (RE, the AE divided by the target value and expressed in percent). The rest of the test samples showed similar errors. The relative errors are mostly below 2%, which can be considered a good prediction accuracy. This model includes the dependence of the coupling coefficient on the relative dielectric constant, which was not possible to achieve with a simple black-box model by using the available data, i.e. without increasing the training set.

To investigate how much the training set can be downsized while keeping the same level of accuracy of the proposed model, an additional analysis has been performed. With this aim, the training set has been reduced by removing certain data samples, taking care that all considered areas of the input space remain properly represented.

Fig. 5 Correlation of the ANN generated coupling coefficient and the reference target test values for the models trained with (a) the training set of 873 samples, (b) the training set of 692 samples

The proposed model has been developed for each reduced-size training set, ensuring the same level of training accuracy as in the initial case. The models have been further tested on the same test set (consisting of 40 samples) used for testing the model developed with the full training set. The process of downsizing the training set was stopped when the accuracy in predicting the test values started to get worse.
In total, the test has been performed with four data sets consisting of 1230, 1036, 873 and 692 data samples. The test statistics are shown in Table 4.

Table 4 Test statistics for the test set obtained by the models trained with the reduced-size training sets

Training set             ATE [%]    WCE [%]    r
Reduced, 1230 samples    0.93       5.48       0.998672
Reduced, 1036 samples    1.01       4.20       0.998495
Reduced, 873 samples     1.05       3.90       0.998728
Reduced, 692 samples     6.25       26.47      0.952975

It can be seen that the accuracy of the first three models is very similar. However, for the last data set, although the model was well trained, the correlation with the target test values has significantly decreased, which is confirmed by the higher errors. This can be clearly seen from Fig. 5, which shows the scatter plots of the predicted versus the target data for the last two data sets: the model trained with 692 samples exhibits much larger discrepancies between the predicted and the target values of the coupling coefficient than the model trained with 873 samples. It can be concluded that the number of training data can be more than halved compared to the considered initial training set. This further means that the proposed approach can be exploited to develop the model for determining the coupling coefficient with a much smaller number of training data than the pure black-box model.

6. Conclusion

In this paper a novel modeling procedure exploiting the prior knowledge neural approach is proposed for accurate determination of the coupling coefficient of a microstrip coupled resonator. Unlike the black-box neural approach, in which an ANN models the dependence of the coupling coefficient on the filter geometry and substrate properties, the proposed model has an additional ANN input: a value of the coupling coefficient obtained by mathematical expressions for its approximate calculation, which represents the prior knowledge for the ANN.
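To make the structure of such a model concrete, the following Python sketch shows the forward pass of a PKI multilayer perceptron with two sigmoid hidden layers of 17 neurons and a linear output, as reported above. It is only an illustration under stated assumptions: all function names are hypothetical, the weights are random placeholders rather than trained values, and the input scaling that a practical model would need is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and biases; placeholders for trained values."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def dense(x, layer, activation=None):
    """Weighted sums plus biases, followed by an optional transfer function."""
    weights, biases = layer
    out = [sum(wi * xi for wi, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def pki_forward(a, w, s, h, eps_r, k_approx, layers):
    """Five original inputs plus the prior knowledge input k_approx."""
    x = [a, w, s, h, eps_r, k_approx]
    hidden1 = dense(x, layers[0], sigmoid)
    hidden2 = dense(hidden1, layers[1], sigmoid)
    return dense(hidden2, layers[2])[0]  # single linear output neuron


rng = random.Random(0)  # fixed seed so the placeholder weights are repeatable
layers = [init_layer(6, 17, rng), init_layer(17, 17, rng), init_layer(17, 1, rng)]
# k_approx = 0.06 is an illustrative value of the approximate coupling coefficient
k_model = pki_forward(7.0, 1.0, 1.0, 1.27, 10.2, 0.06, layers)
```

The only difference from a black-box model with the same hidden layers is the sixth input, so the prior knowledge adds a single column of weights to the first layer.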
By introducing the prior knowledge, the number of samples needed in the training data is reduced, which means that less time is needed to acquire the training data by time-consuming EM simulations, making the whole process of model development faster and more efficient. Compared to the black-box model, the proposed model needs significantly less training data to reach the desired accuracy. Moreover, it gives good accuracy in cases where the black-box approach would need much more data. In the considered case, with the available training data, the model includes the dependence on the relative dielectric constant, which was not possible to achieve with a pure ANN model. The model provides values of the coupling coefficient which are very close to the target values obtained by the EM simulations. As the ANN can be described by a set of mathematical expressions based on basic mathematical operations and the exponential function, the ANN response can be calculated in a very short time. Consequently, the ANN accompanied by the expressions representing the prior knowledge can be used for instant prediction of the coupling coefficient. In other words, the proposed model can be successfully used as a fast and accurate replacement of the EM simulation for the coupling coefficient determination. From the point of view of the expressions used as the prior knowledge, which serve for approximate determination of the coupling coefficient, the ANN can be seen as an addition to these expressions that improves their accuracy.

Acknowledgement: The presented research has been supported by the Ministry for Education, Science and Technological Development of Serbia and by the Ministry of Education, Republic of Bulgaria and Faculty of Telecommunications under contract number ДН07/19/15.12.2016 "Methods of estimation and optimization of the electromagnetic radiation in urban areas".
References

[1] J. Hong and M. J. Lancaster, Microstrip Filters for RF/Microwave Applications, John Wiley & Sons, 2001.
[2] R. J. Cameron, C. M. Kudsia and R. M. Mansour, Microwave Filters for Communication Systems: Fundamentals, Design, and Applications, second edition, John Wiley & Sons, 2018.
[3] R. J. Cameron, "Advanced coupling matrix synthesis techniques for microwave filters", IEEE Trans. Microw. Theory Tech., vol. 51, no. 1, pp. 1–10, Jan. 2003.
[4] S. Amari, "Synthesis of cross-coupled resonator filters using an analytical gradient-based optimization technique", IEEE Trans. Microw. Theory Tech., vol. 48, no. 9, pp. 1559–1564, Sept. 2000.
[5] M. Mitić, M. Nedelchev, A. Kolev and Z. Marinković, "ANN based design of microstrip square open loop resonator filters", in Proceedings of the Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering, Pattaya, Thailand, 11–14 March 2020, pp. 158–161.
[6] M. Nedelchev, M. Mitić, A. Kolev and Z. Marinković, "Modeling and design of microstrip coupled resonator filters based on ANNs", in Proceedings of the 43rd International Conference on Telecommunications and Signal Processing, Milan, Italy, July 7-9, 2020, pp. 470–473.
[7] M. Nedelchev, Z. Marinković and A. Kolev, "ANN based design of planar filters using square open loop DGS resonators", in Proceedings of the 53rd International Scientific Conference on Information, Communication and Energy Systems and Technologies ICEST 2018, Sozopol, Bulgaria, June 28-30, 2018, pp. 59–92.
[8] M. Nedelchev, Z. Marinković and A. Kolev, "ANN modelling of planar filters using square open loop DGS resonators", in Proceedings of the 4th EAI International Conference on Future Access Enablers of Ubiquitous and Intelligent Infrastructures (FABULOUS 2019), Sofia, Bulgaria, March 28-29, 2019, pp. 363–371.
[9] J.-S. Hong and M. J. Lancaster, "Transmission line filters with advanced filtering characteristics", in Proceedings of the MTT-S International Microwave Symposium Digest, vol. I, Boston, MA, USA, June 2000, pp. 319–322.
[10] J.-S. Hong and M. J. Lancaster, "Theory and experiment of novel microstrip slow-wave open-loop resonator filters", IEEE Trans. Microw. Theory Tech., vol. 45, no. 12, pp. 2358–2365, Dec. 1997.
[11] J.-S. Hong and M. J. Lancaster, "Couplings of microstrip square open-loop resonators for cross-coupled planar microwave filters", IEEE Trans. Microw. Theory Tech., vol. 44, no. 11, pp. 2099–2109, Nov. 1996.
[12] Q. J. Zhang and K. C. Gupta, Neural Networks for RF and Microwave Design, Artech House, 2000.
[13] H. Kabir, L. Zhang, M. Yu, P. Aaen, J. Wood and Q. J. Zhang, "Smart modelling of microwave devices", IEEE Microw. Mag., vol. 11, no. 3, pp. 105–108, May 2010.
[14] Z. Marinković, G. Crupi, A. Caddemi, V. Marković and D. M.M.-P. Schreurs, "A review on the artificial neural network applications for small-signal modeling of microwave FETs", Int. J. Numer. Model El., e2668, May/June 2020.
[15] Z. Stanković, N. Dončov, "Prediction of the EM signal delay in the ionosphere using neural model", Facta Univ. Ser.: Elec. Energ., vol. 32, no. 2, pp. 287–302, 2019.
[16] T. Ćirić, Z. Marinković, R. Dhuri, O. Pronić-Rančić and V. Marković, "Hybrid neural lumped element approach in inverse modeling of RF MEMS switches", Facta Univ. Ser.: Elec. Energ., vol. 33, no. 1, pp. 27–36, March 2020.
[17] J. Jin, F. Feng, J. N. Zhang, S. X. Yan, W. C. Na and Q. J. Zhang, "A novel deep neural network topology for parametric modeling of passive microwave components", IEEE Access, vol. 8, pp. 82273–82285, May 2020.
[18] Q.-J. Zhang, E. Gad, B. Nouri, W. Na and M. Nakhla, "Simulation and automated modeling of microwave circuits: state-of-the-art and emerging trends", IEEE J. Microwaves, vol. 1, no. 1, pp. 494–507, Jan. 2021.
[19] J. N. Zhang, F. Feng, J. Jin, W. Zhang, Z. Zhao and Q.-J.
Zhang, "Adaptively weighted yield-driven EM optimization incorporating neuro-transfer function surrogate with applications to microwave filters", IEEE Trans. Microw. Theory Tech., vol. 69, no. 1, pp. 518–528, Jan. 2021.

Facta Universitatis, Series: Electronics and Energetics, Vol. 30, No 3, September 2017, pp. 285-293
DOI: 10.2298/FUEE1703285T

Modeling of magnetoelectric microwave devices

Alexander S. Tatarenko, Darya V. Snisarenko, Mirza I. Bichurin
Novgorod State University, Veliky Novgorod, Russia

Received November 30, 2016
Corresponding author: Mirza Bichurin, Novgorod State University, 173003 Veliky Novgorod, Russia (e-mail: mirza.bichurin@novsu.ru)

Abstract. The possibility of computer modeling of electrically controlled magnetoelectric (ME) microwave devices is considered. Computer modeling results are presented for different structures of ME microwave devices based on a layered ferrite-piezoelectric structure formed on a slot line, a microstrip line and a coplanar waveguide. The results are reported as frequency dependences of the insertion losses of the ME devices.

Key words: magnetoelectric microwave devices, ferrite-piezoelectric resonator, dual magnetic and electrical control

1. Introduction

With the increasing significance of microwave communication, radar and navigation systems in modern society, the requirements for their reliability, mobility and power consumption are growing. Telecommunication and mobile satellite radiotelephone systems, mobile navigation and radar stations, and global and local computer networks need electrically controllable and inexpensive devices. This requirement can be met by replacing complex circuits containing active components with tunable microwave devices based on thin-film materials with nonlinear physical properties, such as ferroelectrics and ferrites. One way to control the parameters of electronic components is based on the change in the dielectric constant of the components under the influence of an external electric field. The "electric" method of control is characterized by high speed and low energy consumption, since the tuning is carried out without leakage currents through the control circuit. In some ferroelectrics the controllability by an electric field is maintained over a wide frequency range, from the lowest to the highest frequencies. This feature is widely used in microwave devices for rapid regulation of the amplitude-frequency and phase-frequency characteristics. The disadvantages of ferroelectric control structures are a relatively narrow tuning range of the operating frequency and a high level of voltage applied to the electrodes. These drawbacks can be overcome by designing new modifications of the transmission lines, as well as by using layered structures containing not only ferroelectric but also ferromagnetic films. The use of ferrite-ferroelectric layered structures makes it possible to control the performance by electric and magnetic fields at the same time. Such devices can combine the advantages of the "electric" and "magnetic" control methods, i.e. high speed and a wide range of operating frequencies over which the device parameters can be tuned.

An analysis of the current state of the field of microwave devices controlled by electric and magnetic fields indicates the existence of open scientific and technical issues, including radiophysical and physical-technological aspects. These issues define a number of research tasks, such as theoretical studies of electrodynamic characteristics and improvement of the design of microwave transmission lines, experimental investigations of wave processes, and the design and development of controlled devices.
Magnetoelectric (ME) materials [1-6], which simultaneously exhibit ferroelectricity and ferromagnetism, have recently stimulated a sharply increasing number of research activities owing to their scientific interest and significant technological promise for novel multifunctional devices. The ME effect [7-9] in composite materials is known as a product tensor property, which results from the cross interaction between different orderings of the two phases in the composite. Neither the piezoelectric nor the magnetic phase exhibits the ME effect by itself, but composites of these two phases show a remarkable ME effect. Thus the ME effect is the product of the magnetostrictive effect (magnetic/mechanical effect) in the magnetic phase and the piezoelectric effect (mechanical/electrical effect) in the piezoelectric phase.

One of the promising directions in microwave technology is currently the development of ME microwave devices. The application of ME non-reciprocal devices eliminates the drawbacks of ferrite devices. Electric field control allows such devices to be implemented in integrated form, which reduces their cost, improves performance, reduces power consumption in the control circuit, and eliminates the interference arising from magnetic field control [10-11].

2. Modeling of ME microwave devices

Magnetoelectric interactions in ferrite-ferroelectric composites have facilitated a new class of microwave signal processing devices. Such devices are based on either hybrid spin-electromagnetic waves or mechanical-force-mediated magnetoelectric interactions. When a ferrite-piezoelectric bilayer is driven to ferromagnetic resonance (FMR) and an electric field E is applied across the piezoelectric (ferroelectric) layer, the ME effect results in a frequency or field shift of the FMR. Thus devices based on FMR can be tuned with both the electric field E and the magnetic field H.
Several dual-tunable ME devices, including resonators, filters, attenuators, circulators, isolators and phase shifters, have been demonstrated so far.

Simulation of ME microwave devices with modern computer programs, which calculate multimode S-parameters and the electromagnetic field in three-dimensional passive structures, greatly simplifies the selection of the optimal parameters of such devices: the parameters of the transmission line (dimensions, relative permittivity of the substrate, size of the conductors) and the resonator parameters (size, shape, material). As the industry turns to monolithic integrated/hybrid nonreciprocal microwave devices, planar geometries have to be used. This requires the development of planar elements compatible with strip line and microstrip systems. As high-frequency systems are manufactured using monolithic microwave integrated circuit (MMIC) designs, the size of the ME resonator must be compatible with the MMIC chip technology.

The proposed ME non-reciprocal devices differ from ferrite devices in that the ferrite magnetic resonator and the magnetic control system are replaced by a ferrite-piezoelectric resonator and a system of electrodes connected to the source of the control voltage. The ME resonator (Fig. 1) is a layered composite in the form of a disk or a plate. The ferrite phase can be one of various spinels (NiFe2O4, CoFe2O4, Ni0.8Zn0.2Fe2O4, Co0.6Zn0.4Mn2O4 and others) or yttrium iron garnet (YIG, as a thick film or a single crystal); the piezoelectric phase can be polycrystalline lead zirconate titanate (PZT) or a single-crystal material such as lead magnesium niobate-lead titanate (PMN-PT) or lead zinc niobate-lead titanate (PZN-PT).

Fig. 1 ME resonator: 1 is the piezoelectric component, 2 is the ferrite component, 3 is the metal electrodes

The basis for the design of ME microwave devices is a microwave transmission line on a dielectric substrate with an ME resonator placed in the transmission line. The operating principle of the ME non-reciprocal microwave devices is based on the microwave ME effect, i.e. the shift of the FMR line under the influence of an electric field. The ME layered composite operates as a resonator in this case. Electric field control allows the device characteristics to be tuned over a frequency range. The FMR line can also be controlled by a magnetic field. This dual tunability opens up new possibilities for the design of such devices.

FMR is a powerful tool for studies of the microwave ME interaction in ferrite-piezoelectric structures. The efficiency of the ME interaction in ferrite-piezoelectric bilayers is characterized by the coefficient of magnetoelectric interaction A = δH/δE, where δH is the variation of the internal magnetic field in the ferrite and δE is the variation of the electric field applied to the piezoelectric. The magnitude of A depends mainly on the magnetostriction constant of the ferrite and the piezoelectric coefficient of the piezoelectric. An electric field E applied to the composite produces a mechanical deformation in the piezoelectric, which in turn is coupled to the ferrite and results in a shift δf of the resonance line. Information on the nature of the high-frequency ME coupling is therefore obtained from data on the shift δf vs. E. The shift is proportional to the linear ME coupling coefficient.

The design of an ME microwave device assumes the presence of an ME resonator, which is placed on a microstrip line or circuit resonator, on a slot line, or in a waveguide, in the region of circular polarization of the microwave field.
Circular polarization of the microwave field allows the composite component to be used more effectively and increases the magnetic susceptibility. The working point is selected depending on the purpose of the device. For example, an attenuator or an isolator is tuned to the resonance absorption, whereas for a phase shifter a region near the resonance is selected that has the lowest absorption but the maximal control depth.

The computation, design and manufacturing technology of nonreciprocal microwave devices intended for the receiving-transmitting modules of antenna arrays are currently of great interest. The High Frequency Structure Simulator (HFSS) program of the Ansoft company, intended for the analysis of three-dimensional microwave structures, including antennas and non-reciprocal devices containing ferrites and ferroelectrics, is now widely used. Electromagnetic simulation in HFSS is based on the finite element method (FEM).

Microstrip lines [12], coplanar lines and slot lines are used in the microwave range, with microstrip lines being the most widely used [13-14]. However, the design of non-reciprocal ferrite devices requires a circularly polarized microwave field. In a microstrip line such a region is absent, and additional elements, for example stubs, are needed to create an area of circular polarization. From this point of view, the slot line and the coplanar line are of interest. The structure of the microwave field in the slot line and in the coplanar waveguide differs significantly from that in the microstrip line. A coplanar waveguide (CPW) is a transmission line which consists of a center strip, two slots and a semi-infinite ground plane on either side of it [15].
This type of waveguide offers several advantages over the conventional microstrip line: it facilitates easy shunt as well as series mounting of active and passive devices, it eliminates the need for wraparound and via holes, and it has a low radiation loss. Another important advantage of the CPW, which has recently emerged, is that CPW circuits lend themselves to fast and inexpensive on-wafer characterization at frequencies as high as 50 GHz. Lastly, since the microwave magnetic fields in the CPW are elliptically polarized, nonreciprocal components such as ferrite circulators and isolators can be efficiently integrated with the feed network. Fig. 2 (a, b, c) shows the computer models of ME devices on the different types of transmission line.

Fig. 2 a) Microstrip line

The transmission line structure in Fig. 2a consists of microstrip lines of nonresonant lengths with two stubs of lengths 1/8 and 3/8 of the wavelength on a dielectric substrate with a ground plane on the bottom side. The stubs are required to create an elliptically polarized microwave magnetic field.

Fig. 2 b) Slot line

Slot line transmission systems [16-17] have been shown to contain regions of elliptically polarized H field, which are required for producing nonreciprocal microwave devices. The development of such a device depended on finding an ME composite slot line configuration that would yield a good interaction between the ME resonator and the propagating mode of the slot line with a minimum of concurrent insertion loss. The microstrip-to-slot-line transition is used to convert input microwave signals from the TEM mode to the required slot line mode. The slot width on the transition is designed so as to match into the slot line etched on one of the ME resonator inserts in the slot of the device.
The pertinent characteristics of this type of transmission system, such as the field configurations, propagation constants and impedance as functions of the dielectric material characteristics, dielectric thickness and slot width, were derived. The slot line contains a microwave magnetic field configuration suitable for nonreciprocal ME devices: there are regions within the slot line with a circularly or elliptically polarized microwave magnetic field.

Fig. 2 c) Coplanar waveguide

The use of modern simulation software allows the fast design of various types of non-reciprocal microwave devices. We conducted simulations of various types of nonreciprocal magnetoelectric devices based on slot and coplanar lines using HFSS. A comparison with similar devices based on the microstrip line was made.

3. Results and discussion

The devices were simulated in the HFSS software environment. The S-parameters were optimized over the frequency range for each investigated device, and the amplitude characteristics were investigated. Computer simulation results for different designs of ME microwave devices realized on strip transmission lines are shown in the figures. Figure 3 shows the frequency dependence of the microstrip line attenuation in the forward and reverse directions.

Fig. 3 Microstrip transmission line: attenuation vs. frequency. The resonator is a YIG disk: thickness is 0.1 mm, on a GGG substrate with thickness 0.44 mm and diameter of 3 mm; the magnetizing field is 2700 Oe.

Fig. 4 Slot transmission line: attenuation vs. frequency. The resonator dimensions are 10 mm × 1 mm × 0.2 mm; the slot line width is 0.62 mm, with the gap widened to 1.2 mm; the relative permittivity of the substrate is 30 and the substrate thickness is 2 mm; the magnetizing field is 2514 Oe.
Figure 4 shows the frequency dependence of the slot transmission line attenuation in the forward and reverse directions. Figure 5 shows the frequency dependence of the coplanar transmission line attenuation in the forward and reverse directions.

Fig. 5 Coplanar transmission line: attenuation vs. frequency. The resonator dimensions are 0.6 × 4 × 0.1 mm³; the slot width is 0.4 mm; the center conductor width is 0.6 mm; ε of the substrate is 40; the substrate thickness is 1 mm; the magnetizing field is 3125 Oe.

Figure 6 shows the experimental frequency dependence of the coplanar transmission line attenuation in the forward and reverse directions. The experimental investigation of the ME microwave properties of the bilayer structures was based on measurements of the resonator frequency responses for different values of the external DC voltage and the bias magnetic field. Namely, the reflection spectra S11(f) = 10 log|Pref(f)/Pin(f)|, where Pin(f) is the incident power, Pref(f) is the reflected power and f is the excitation frequency, were measured. The frequency responses were measured with an Agilent network analyzer.

Fig. 6 Coplanar waveguide, for comparison: experimental frequency dependence of attenuation. The magnetizing field is 2780 Oe.

The computation, design and manufacturing technology of nonreciprocal microwave devices are currently of great interest. The main directions for further research are based on the use of modern computer design programs. The use of modern simulation software allows the fast design of various types of non-reciprocal microwave devices. Such simulation makes it possible to select the substrate parameters and the shape of the ME resonator. An ME resonator based on a layered structure of YIG and PZT was used.
to decrease the control voltage and increase the valve ratio, it is necessary to reduce the thickness of the piezoelectric, and hence the thickness of the ferrite. the use of computer simulation for me structures in non-reciprocal microwave devices opens promising opportunities for the creation of new devices.

3. conclusion

magnetoelectric layered structures are ideal for studies on wideband magnetoelectric interactions between the magnetic and electric subsystems that are mediated by mechanical forces. such structures show a variety of magnetoelectric phenomena, including microwave me effects. the phenomenon can be used to create electrically tunable microwave me resonators and devices based on them. the possibility of realizing me microwave devices on strip transmission lines, controlled by both electric and magnetic fields, is shown. the results of computer simulation of various me microwave device designs with resonators based on me layered structures placed into the transmission line are given. the simulated results are compared with the experimental results.

acknowledgement: the paper is a part of the research done within the project of russian science foundation 16-12-10158.

references
[1] r. heindl, h. srikanth, s. witanachchi, p. mukherjee, t. weller, a.s. tatarenko and g. srinivasan, "structure, magnetism, and tunable microwave properties of pulsed laser deposition grown barium ferrite/barium strontium titanate bilayer films", j. appl. phys., vol. 101, p. 09m503, 2007.
[2] g. srinivasan, a.s. tatarenko, y. k. fetisov, v. gheevarughese, and m.i. bichurin, "microwave magneto-electric interactions in multiferroics", in proc. of the mater. res. soc. symp., 2007, vol. 966, p. 0966t14-01.
[3] d. seguin, m. sunder, l. krishna, a. tatarenko, p.d. moran, "growth and characterization of epitaxial fe0.8ga0.2/0.69pmn-0.31pt heterostructures", journal of crystal growth, vol. 311, no. 12, pp. 3235-3238, 2009.
[4] g. srinivasan, i.v. zavislyak, a.s.
tatarenko, "millimeter-wave magnetoelectric effects in bilayers of barium hexaferrite and lead zirconate titanate", appl. phys. lett., vol. 89, p. 152508, 2006.
[5] c.-w. nan, m. i. bichurin, s.x. dong, d. viehland, and g. srinivasan, "multiferroic magnetoelectric composites: historical perspective, status, and future directions", j. appl. phys., vol. 103, p. 031101, 2008.
[6] a.s. tatarenko, a.b. ustinov, g. srinivasan, v.m. petrov, and m.i. bichurin, "microwave magnetoelectric effects in bilayers of piezoelectrics and ferrites with cubic magnetocrystalline anisotropy", j. appl. phys., vol. 108, p. 063923, 2010.
[7] m.i. bichurin, i.a. kornev, v.m. petrov, a.s. tatarenko, yu.v. kiliba, g. srinivasan, "theory of magnetoelectric effects at microwave frequencies in a piezoelectric/magnetostrictive multilayer composite", phys. rev. b, vol. 64, p. 094409, 2001.
[8] m.i. bichurin, i.a. kornev, v.m. petrov, yu.v. kiliba, a.s. tatarenko, n.a. konstantinov, g. srinivasan, "resonance magnetoelectric effect in multilayer composites", ferroelectrics, vol. 280, pp. 187-198, 2002.
[9] s. shastry, g. srinivasan, m.i. bichurin, v.m. petrov, a.s. tatarenko, "microwave magnetoelectric effects in single crystal bilayers of yttrium iron garnet and lead magnesium niobate - lead titanate", phys. rev. b, vol. 70, p. 064416, 2004.
[10] a.s. tatarenko and m.i. bichurin, "microwave magnetoelectric devices", advances in condensed matter physics, vol. 2012, p. 10.
[11] a.s. tatarenko, g. srinivasan, m.i. bichurin, "magnetoelectric microwave phase shifter", appl. phys. lett., vol. 88, p. 183507, 2006.
[12] m. perić, s. ilić, s. aleksić, n. raičević, m. bichurin, a. tatarenko, r.
petrov, "covered microstrip line with ground planes of finite width", facta universitatis series: electronics and energetics, vol. 27, no. 4, pp. 589-600, december 2014.
[13] m.i. bichurin, v.m. petrov, r.v. petrov, g.n. kapralov, f.i. bukashev, a.yu. smirnov, a.s. tatarenko, "magnetoelectric microwave devices", ferroelectrics, vol. 280, pp. 213-220, 2002.
[14] m.i. bichurin, a.s. tatarenko, d.v. lavrenteva, s.r. aleksić, "magnetoelectric microwave devices", in proc. of the 11th international conference on applied electromagnetics πec'2013, niš, serbia, september 01-04, 2013, pp. 77-78.
[15] c.p. wen, "coplanar waveguide: a surface strip transmission line suitable for nonreciprocal gyromagnetic device applications", ieee transactions on microwave theory and techniques, vol. mtt-17, no. 12, pp. 1087-1090, december 1969.
[16] s. b. cohn, "slot line - an alternative transmission medium for integrated circuits", in digest of the 1968 ieee g-mtt international microwave symposium, pp. 104-109.
[17] mariani, heinzman, agrios and cohn, "slot line characteristics", ieee transactions on microwave theory and techniques, vol. mtt-17, december 1969, pp. 1091-1096.

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. i-ii

guest editorial

nowadays, the internet has evolved into a platform that reshapes modern life and removes borders between the real, social and cyber worlds. internet of things (iot) is an emerging paradigm and a cutting edge technology that harnesses a network of embedded, interconnected objects (sensors, actuators, tags or mobile devices) in order to collect various types of information at any time and anywhere. these devices can be used for building different complex smart environments [1], such as smart homes [2][3], smart classrooms [4], smart offices [5], smart factories [6], smart cities [7], intelligent transportation systems [8], smart power grids [9] or smart e-government.
furthermore, networks of devices are based on advanced internet standards. iot implies seamless integration of numerous types of devices into the existing internet infrastructure. smart environments can be automated and customized according to users' needs and preferences. internet of things solutions often encompass integration with cloud-based systems and services [7]: infrastructure as a service (iaas), platform as a service (paas) and software as a service (saas).

the main subject of the special issue is the internet of things and its application in business, industry, research and the academic community. this special issue aims to provide state-of-the-art and innovative papers on the design, implementation, and usage of intelligent iot and related technologies, such as cloud computing, big data, pervasive computing, social computing, etc. the primary goal is to provide a variety of research and survey articles in the field of the internet of things and their application in different aspects of human activities. findings and discussion should foster the potentials and capabilities of research, the academic community, and industry as well.

the first invited paper in this issue, "design and technologies for implementing a smart educational building: case study", presents the design of an educational smart building at the florida atlantic university. the building was designed as a "living laboratory" so that students and faculty may actually see how iot for smart buildings works. furthermore, it represents a good example for designing and building smart buildings at other universities. the second invited paper, "swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network", deals with the aspects of energy efficient routing in wireless sensor networks and proposes a new algorithm for energy-balanced routing.
the following two papers, "an architectural design for cloud of things" and "from intelligent web of things to social web of things", discuss the architecture aspects of cloud and web for iot, and propose new models and applications. the next paper, "smart outlier detection of wireless sensor network", deals with the unreliability of data sets collected from wireless sensor networks, and proposes a technique to detect outliers among data collected by geographically distributed sensors. in the paper "a new telerehabilitation system based on internet of things" the authors propose a telerehabilitation system that uses a wearable muscle sensor and microsoft kinect to create interactive personalized physical therapy that can be carried out at home. then, the authors of the paper "a platform for a smart learning environment" propose a solution for the integration of e-learning and iot services within a smart learning environment. the paper "using internet of things in monitoring and management of dams in serbia" presents an example of iot application in dam safety management. the authors of the paper "a hadoop-enabled sensor-oriented information system for knowledge discovery about target-of-interest" present a generic sensor-oriented information system based on the hadoop ecosystem, used for real-time situational awareness about the specific behavior of targets-of-interest. the following two papers, "a smart home system based on sensor technology" and "designing an intelligent home media center", present applications of sensor technologies in smart homes. the last paper, "bridging the snmp gap: simple network monitoring the internet of things", deals with the problem of network management in iot and smart environments.

finally, we would like to take the opportunity to thank the authors and reviewers for their endeavor. without their great efforts, this special issue could not have been made.
we would also like to thank the editor-in-chief, professor ninoslav stojadinović, for the opportunity to edit this special issue and for all his support throughout the editing process.

acknowledgement: the editors are thankful to the ministry of education, science and technological development, republic of serbia, project number 174031.

references
[1] l. atzori, a. iera, and g. morabito, "the internet of things: a survey", computer networks, vol. 54, pp. 2787-2805, 2010.
[2] d. ding, r. a. cooper, p. f. pasquina, and l. fici-pasquina, "sensor technology for smart homes", maturitas, vol. 69, pp. 131-136, 2011.
[3] l. c. desilva, c. morikawa, and i. m. petra, "state of the art of smart homes", engineering applications of artificial intelligence, vol. 25, pp. 1313-1321, 2012.
[4] s. s. yau, s. k. s. gupta, e. k. s. gupta, f. karim, s. i. ahamed, y. wang, and b. wang, "smart classroom: enhancing collaborative learning using pervasive computing technology", in proceedings of the asee 2003 annual conference and exposition, 2003, pp. 13633-13642.
[5] c. le gal, j. martin, a. lux, j. l. crowley, "smartoffice: design of an intelligent environment", intelligent systems, vol. 16, no. 4, pp. 60-66, 2005.
[6] m. brettel, n. friederichsen, m. keller, and m. rosenberg, "how virtualization, decentralization and network building change the manufacturing landscape: industry 4.0 perspective", international journal of mechanical, aerospace, industrial, mechatronic and manufacturing engineering, vol. 8, no. 1, pp. 37-44, 2014.
[7] j. jin, j. gubbi, s. marusic, and m. palaniswami, "an information framework for creating a smart city through internet of things", ieee internet of things journal, vol. 1, no. 2, pp. 112-121, 2014.
[8] j. barceló, e. codina, j. casas, j. l. ferrer, d. garcía, "microscopic traffic simulation: a tool for the design, analysis and evaluation of intelligent transport systems", journal of intelligent and robotic systems, vol. 41, no. 2, pp. 173-203, 2005.
[9] t.
samad, s. kiliccote, "smart grid technologies and applications for the industrial sector", computers and chemical engineering, vol. 47, pp. 76-84, 2012.

marijana despotović-zrakić, university of belgrade, serbia
zorica bogdanović, university of belgrade, serbia
huansheng ning, university of science and technology beijing, china
božidar radenković, university of belgrade, serbia
guest editors

facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 129-136 doi: 10.2298/fuee1401129t

thz technology for vision systems

j. trontelj 1, a. sešek 1, a. švigelj 2
1 faculty of electrical engineering, laboratory for microelectronics, tržaška 25, 1000 ljubljana, slovenija
2 letrikalab d.o.o, polje 15, 5290 šempeter pri gorici, slovenija

abstract. thz radiation brings new technology challenges and new opportunities to overcome some of the current application obstacles. in the paper a portable thz system operating at room temperature is presented. the presented solution is robust and inexpensive, and convenient for many applications. the thz sensor fabricated at the faculty of electrical engineering in the laboratory for microelectronics is currently one of the best sensors in its frequency operating range. it reaches a sensitivity of up to 1000 v/w and an nep down to 5 pw/√hz in vacuum. with the proposed system solution a variety of applications can be covered. some imaging results captured with the proposed system at different stand-off distances are shown in the paper.

key words: thz sensors, thz systems, stand-off thz detection, bolometer

1. thz technology

nowadays, many kinds of material inspection systems, used for example in the semiconductor industry, the paper industry, and medical and security applications, as well as many other see-through systems, are based on x-ray techniques. as x-rays are ionizing and thus extremely harmful for biological tissue, their usage is limited in area and time.
therefore, new technologies and non-destructive methods are emerging, especially for use on biological tissues. thz technology promises a suitable substitute, as it is nonionizing and offers some other properties which are available only in the thz frequency region of the electromagnetic spectrum. thz waves propagate through different non-metal materials such as plastics, clothes, paper, ceramics, some thermal insulation materials, and also dry wood. furthermore, very good reflection can be obtained from flat surfaces, and especially from metals. the main obstacle is humidity, which brings very high attenuation in the thz spectral range. air humidity absorbs thz radiation and makes thz waves unsuitable for use at long stand-off distances. it was also shown that when thz waves propagate through or reflect from different materials, especially drugs and explosives, a specific fingerprint of each material can be recognized. all of these facts open a lot of new possibilities for the use of thz waves in e.g. material quality control, the pharmaceutical industry, medical imaging, and security, which is one of the fastest emerging fields.

received january 10, 2014. corresponding author: janez trontelj, faculty of electrical engineering, laboratory for microelectronics, tržaška 25, 1000 ljubljana, slovenija (e-mail: janez.trontelj1@guest.arnes.si)

generally, there are two approaches to the generation and detection of thz waves. the first is the electro-optical approach, which is based on an ultra-short laser pulse, where the pulse width is in the femtosecond range. such a pulse illuminates a special crystal or semiconductor material, and an electromagnetic wave in the thz region is emitted. the same material can also be used for detection. the time domain spectroscopy (tds) principle is mostly used for spectroscopy, but imaging can also be done. imaging is very time consuming, as systems consist of one source and only one detector.
with such setups, frequencies from hundreds of ghz up to 8 thz and above can be achieved, in the power range of microwatts. the second is the microwave approach, where signals are generated at frequencies as low as 10 ghz with discrete rf components. with these techniques, continuous wave signals with frequencies up to 1.2 thz can be generated. the typical power at 300 ghz can reach up to 20 milliwatts. commonly used detectors are schottky diodes and bolometers with an antenna, where an incident wave causes a temperature change of the detector. such detectors are small enough to be merged into arrays, and can be integrated into systems used for imaging. it should be noted that spectroscopy is also emerging as a future application.

1.1. time domain thz spectroscopy system

an optical approach to thz generation and detection is used in many different types of time domain spectroscopy (tds) thz systems. the common basis of all of them is to split a femtosecond laser beam into two signals. one is used for thz wave generation, and the second, also called the probing signal, is used for thz pulse reconstruction on the detector. for generation and detection of thz waves various methods are used, such as photoconductive generation and optical rectification. to reconstruct the whole thz pulse, the probing signal has to be time shifted over the whole thz pulse. this is done by delaying the probing signal with a motorized optical delay line. when using photoconductive generation and detection, the signals have to coincide directly on the detector itself. when the optical rectification detection method is used, both signals have to coincide on the crystal, where the probing signal is polarized according to the incident thz pulse intensity. the difference in polarization gives thz signal strength information.
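the time shift introduced by the motorized delay line can be related to the stage travel. the python sketch below assumes a double-pass (retroreflector) geometry in air, so that a stage displacement d lengthens the optical path by 2d; this geometry, the function names and the example values are assumptions, since the text does not specify the layout of the delay line.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum (air is approximated as vacuum)


def probe_delay_s(stage_displacement_m: float, passes: int = 2) -> float:
    """time delay added to the probing signal by moving the delay-line stage."""
    return passes * stage_displacement_m / C_M_PER_S


def stage_step_for_delay_m(delay_s: float, passes: int = 2) -> float:
    """stage displacement needed for a given time step of the probing signal."""
    return delay_s * C_M_PER_S / passes


# hypothetical example: a 0.15 mm stage move shifts the probe by about 1 ps
print(probe_delay_s(0.15e-3))
```

under this assumption, sampling a thz pulse with sub-picosecond time steps requires stage steps well below a tenth of a millimeter.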
at first, thz time domain spectroscopy setups were mostly built on optical tables, where various optical components such as beam splitters and focusing parabolic mirrors could be easily adjusted. such systems are very sensitive to mechanical stress, and the samples have to be small enough to fit into the measurement place in the system. they operated mostly in transmission mode. to change to reflection mode, usually a large part of the setup had to be changed. to obtain a visual thz image, a sample had to be moved with translation stages. therefore, new systems were built using fiber optics to improve the flexibility of measurements. now both the main and the probing laser signals can be transferred to a remote distance of a few meters using fiber optics, and thz pulses are generated at the remote location. this allows thz response measurements of larger objects. also, a visual thz image can easily be reconstructed with transmitting and receiving thz heads mounted on translation stages.

1.2. continuous wave thz systems

the core of a continuous wave (cw) thz source is a voltage controlled oscillator operating at a few ghz, with a precise pll to achieve low phase noise and sufficient output power. normally the chosen basic frequency is 12.5 ghz, which can easily be multiplied to several hundreds of ghz. during multiplication, which involves filtering and amplifying higher harmonics, the main issue is to keep the phase noise low and to achieve high output power. the power on the output horn antenna is usually up to 100 milliwatts and is highly dependent on the number of multiplications. with electronic sources a frequency of up to 1 thz can be reached, which is highly suitable for 2d and 3d imaging. for a detector, many sensors can be used, as the power is higher and the thz beam illuminates a larger area. the primary choices are a bolometer-type sensor array, a schottky diode array, a pyro sensor array, etc. [1].

2.
lmfe thz sensor and system

the thz system in the laboratory for microelectronics (lmfe) is a cw thz-based system with a scanning mechanism, thz lenses and a microbolometer detection array [2], [3].

fig. 1 block diagram of the lmfe thz imaging system (blocks: thz source, beam splitter, lo, object, pivoting mirror, sensor array, data acquisition)

the thz source is a 12.5 ghz source, multiplied by ×4, ×2, and ×3 (×24 in total), which gives an output frequency of 300 ghz with a peak power of 5 mw. the thz beam is split in a ratio of 40:60 with a silicon beam splitter. the larger part of the beam continues to the observed object. there it is reflected and redirected to the sensor array by a pivoting mirror. the pivoting mirror scans through the vertical dimension of the object. the sensor array gives the horizontal dimension, and both thz beams are merged on the sensor array to achieve heterodyne detection. the core of the system is a 2x16 thz sensor array, fabricated and assembled in lmfe.

2.1. thz sensor and sensor array

the sensors used in the thz array [4]-[6] are designed and fabricated in lmfe. lmfe owns a cmos technology which is able to produce sensors and systems on silicon down to 500 nm. the technological procedure of the thz sensor fabrication was described in patent [7]. materials for the sensor were evaluated with equation (1):

[equation (1), relating se to tc, rho and g and involving a square root, is not recoverable from the source]

where se is sensitivity, tc is the temperature coefficient of the material, rho is sheet resistance, and g is thermal conduction [8]. from the several appropriate materials, titanium was chosen. the main goal of the design was to achieve the highest sensitivity (se) and a low noise equivalent power (nep), and to match the sensor and antenna impedances.

fig. 2 lmfe thz sensor

as the detection principle is based on a titanium thermistor, a double dipole antenna is attached to it to receive thz energy and transfer it to the bolometer, which is therefore heated, and consequently its resistance changes according to the energy received.
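the source parameters quoted in this section can be checked with simple arithmetic. the python sketch below multiplies the 12.5 ghz base frequency through the ×4, ×2 and ×3 stages and splits the 5 mw peak power in the stated 40:60 ratio; the function names are illustrative, and the split ignores any loss in the silicon beam splitter, which is an assumption.

```python
from math import prod


def multiplied_frequency_hz(base_hz: float, factors: list[int]) -> float:
    """output frequency of a cascade of frequency multipliers."""
    return base_hz * prod(factors)


def split_power_w(total_w: float, ratio: tuple[float, float]) -> tuple[float, float]:
    """divide a beam power according to a ratio such as 40:60 (lossless assumption)."""
    total_parts = ratio[0] + ratio[1]
    return (total_w * ratio[0] / total_parts, total_w * ratio[1] / total_parts)


f_out = multiplied_frequency_hz(12.5e9, [4, 2, 3])  # 12.5 ghz multiplied by 24
smaller, larger = split_power_w(5e-3, (40, 60))     # the larger part goes to the object
print(f_out / 1e9, "ghz;", smaller * 1e3, "mw and", larger * 1e3, "mw")
```

with these numbers the output frequency is 300 ghz, and about 3 mw of the peak power would travel towards the object if the splitter were lossless.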
figure 2 shows the realization of such a sensor. the sensors are fabricated from a silicon wafer, which is partly etched in the thermistor-antenna area to achieve better parameters regarding thermal dissipation. doubled contact pads allow connection from both sides. the antenna and connection material is aluminum. equation (2) describes the power conditions on the bolometer:

$$p = \frac{u_b^2}{r} + \frac{u_n^2}{r} + \frac{u_s^2}{r} \qquad (2)$$

as can be seen from equation (2), three basic power components are present on the thermistor: biasing power (ub^2/r), noise power (un^2/r), and signal power (us^2/r) [9]. thz sensors fabricated at lmfe have a 300 ghz central frequency, a sensitivity of up to 1000 v/w, and an nep down to 5 pw/√hz when in a vacuum and at room temperature. the sensors are fabricated as quadruples for easier handling and simple array assembly. the sensors are biased with an i0/4 current, where i0 is the current limit above which electrical damage occurs.

fig. 3 thz sensor array

figure 3 presents the thz sensor array (2 x 16 sensors). the opening under the sensor can be clearly seen, while the 3 µm silicon nitride membrane is practically invisible. the cavity under the sensors is λ/4 deep and acts as a resonator which amplifies the thz signal.

2.2. lmfe thz system

the thz system was partly described in the block diagram in figure 1. the real setup is shown in figure 4.

fig. 4 thz sensor system

the image in figure 4 presents a portable thz system which consists of four main blocks, as presented in the block diagram in figure 1. some other system parts can be seen in figure 4, such as the thz receiving lens and the illumination focus lens. for its operation, the system also needs additional low noise amplifiers, which are located below the sensor array, a supply for each block, and an a/d converter. digitization is done with a 16-bit national instruments a/d card with 32 channels and a 2 mhz total sampling frequency.
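the quoted figures of merit can be connected through the usual definition of noise equivalent power, nep = (noise voltage density) / (sensitivity). the python sketch below uses that standard relation, which is not written out in the text, together with the assumption that the 2 mhz total sampling frequency is shared equally among the 32 channels; the function names and the 100 hz bandwidth are illustrative.

```python
import math


def noise_voltage_density(sensitivity_v_per_w: float, nep_w_per_rthz: float) -> float:
    """output noise voltage density in v/sqrt(hz) implied by sensitivity and nep."""
    return sensitivity_v_per_w * nep_w_per_rthz


def min_detectable_power_w(nep_w_per_rthz: float, bandwidth_hz: float) -> float:
    """input power giving a signal-to-noise ratio of 1 in the given bandwidth."""
    return nep_w_per_rthz * math.sqrt(bandwidth_hz)


def per_channel_rate_hz(total_rate_hz: float, channels: int) -> float:
    """sampling rate per channel if the total rate is shared equally (assumption)."""
    return total_rate_hz / channels


print(noise_voltage_density(1000.0, 5e-12))  # about 5 nv/sqrt(hz)
print(min_detectable_power_w(5e-12, 100.0))  # about 50 pw in a hypothetical 100 hz bandwidth
print(per_channel_rate_hz(2e6, 32))          # 62500.0 samples per second per channel
```

the first result indicates why low noise amplifiers are needed directly at the sensor array: the signal levels involved are in the nanovolt range.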
the collected and converted data are transferred to a pc and processed to produce the thz image of the hidden object.

3. imaging results

the thz system is capable of scanning through different materials which are opaque to visible light but transparent to thz radiation, such as cardboard, plastics, paper, and packaging material. it covers an area of approximately 0.1 m x 0.1 m at a 1 m distance with basic lenses and basic optical adjustments. with special stand-off lenses, a maximal area of 0.2 m x 0.2 m at 5 m was achieved. the test setup of the system is presented in figure 5, where the lenses, thz source, and thz detector array box are separated to allow different tests, different observation distances, and different operation modes (reflection and transmission mode).

(figure labels: thz receiving lens, illumination focus lens, thz pivoting mirror, thz beam splitter, thz scanning array)

thz lenses are made of polystyrene and have different diameters according to the distance of the observed object. in many cases, a lens can be included in the thz source block and/or in the receiving block. the design of the lens is important, as it can significantly influence image quality and the system resolution. for larger lens diameters a fresnel principle is used, and for smaller diameters continuous lenses are chosen. three different objects were chosen for imaging, to demonstrate all operational modes and stand-off operation.

fig. 5 test thz system

figure 6 was captured in transmission mode. the observed objects were plant leaves, where water vessels can be clearly seen in the thz image. the visual image is added for better understanding.

fig. 6 visual and thz image in transmission mode

the main vessel in the center is almost opaque due to its high water content (thz waves are totally absorbed), while other parts of the leaf are semi-transparent according to the water content level.
the next mode is the reflection mode, where two different objects at two different distances are presented. first, in figure 7, a paper clip at a distance of 0.36 m was scanned.

fig. 7 visual and thz image of a paper clip

the upper two images in figure 7 present a thz and a visual image of a paper clip without a barrier cover material, and the images below present the same clip with an additional two layers of textile cover, to demonstrate the penetration of thz waves. figure 8 presents the imaging result of the thz system for a small carpenter knife, taken at a 5 m stand-off distance.

fig. 8 visual and thz image of a carpenter knife

the upper image pair in figure 8 presents the uncovered object, and the bottom pair shows the object covered with two layers of textile. the knife is clamped in expanded polystyrene, which is transparent to thz radiation.

4. conclusions

in the paper the thz vision system developed in the university of ljubljana laboratory for microelectronics is described. both transmitted and reflected images are shown, giving excellent resolution at up to a 5 m stand-off distance. the core of the system is the thz thermistor sensor, which is fabricated in lmfe and achieves one of the best reported results in sensitivity (se = 1000 v/w) and noise equivalent power (nep = 5 pw/√hz) at room temperature.

acknowledgement: the thz research was partly funded by the ministry of defense of the republic of slovenia and the namaste center of excellence.

references
[1] a. rogalski, f. sizov, "terahertz detectors and focal plane arrays", opto-electronics review 19(3), 346-404, (2011)
[2] j. trontelj, a. sešek, a. švigelj, thz focal plane array for high resolution 3d imaging. in: leitner, raimund (ed.), arnold, thomas (ed.). international thz conference 2013: september 9-10, 2013, villach, austria. [wien]: österreichische computer gesellschaft, cop. 2013, pp. 37-41, illustr.
[3] j.
trontelj, room temperature antenna-sensor array and thz vision demonstrator, 7th nato set-124 task group business meeting, oslo, norway, 8. 9. 11. 2010; nato / rto 2010
[4] l. pavlovič, d. kostevc, a. pleteršek, m. maček, a. sešek, j. trontelj: 300 ghz microbolometer parallel-dipole antenna for focal-plane-array imaging, nato-otan, set-159, specialists meeting on terahertz and other electromagnetic wave techniques for defense and security; vilnius, lithuania, may 3-4 2010; nato / rto 2010
[5] j. trontelj, a. sešek, m. maček, thz sensor array operating at room temperature, set169 symposium on 8th nato military sensing symposium, friedrichshafen, germany, 16-18.5.2011; nato / rto 2011
[6] m. podhraški, a. švigelj, m. maček, j. trontelj, thermal analysis of thz microbolometer. in: belavič, darko (ed.), šorli, iztok (ed.), 48th international conference on microelectronics, devices and materials & the workshop on ceramic microsystems, september 19 - september 21, 2012, otočec, slovenia. proceedings. ljubljana: midem society for microelectronics, electronic components and materials, 2012, pp. 421-426, illustr.
[7] j. trontelj, m. maček, a. sešek. a detection system and a method of making a detection system: 1307052.9 20130418. [newport]: united kingdom intellectual property office, 2013.
[8] l. pavliček, d. kostovec, a. pleteršek, m. maček, a. sešek, j. trontelj, 300 ghz microbolometer parallel-dipole antenna for focal-plane-array imaging. nato-otan, set-159, specialists meeting on terahertz and other electromagnetic wave techniques for defense and security, vilnius, lithuania, may 3-4, 2010, rto/nato, 2010.
[9] j. trontelj, m. maček, a. sešek, a. švigelj, uncooled nanometric scale bolometer system for thz sensor array. in: 2011 nanoelectronic devices for defense & security (nano-dds) conference, 29 aug. - 1 sept. / brooklyn, new york.
technical program & abstract digest: 2011 nano-dds conference theme: present & future roles of nanotechnology in the forensic sciences, 2011

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 407-417 doi: 10.2298/fuee1603407s

a platform for a smart learning environment

konstantin simić 1, marijana despotović-zrakić 1, živko bojović 1, branislav jovanić 2, đorđe knežević 1
1 faculty of organizational sciences, university of belgrade, serbia
2 institute of physics, university of belgrade, serbia

abstract. in this paper, a modular platform which provides student services for a smart educational environment is described. the platform represents a point of mutual integration of various services, such as a hosting platform for students' projects, a platform for integrating an sms service with students' web applications, and an internet of things platform which enables acquiring data from sensors distributed within the university building and controlling various actuators. the platform is deployed as a part of a smart learning environment. it is integrated with a single sign-on service and it uses cas and oauth2. a rest api is also provided. the php symfony framework and relational and non-relational databases are used for deploying the platform. the platform was evaluated and tested.

key words: smart environment, e-learning, web application, platform as a service

1. introduction

internet of things (hereinafter: iot) enables interconnecting smart devices, such as sensors, actuators, microcontrollers and microcomputers, with other information-communication infrastructure [1][2]. using smart devices increases the level of automation of everyday tasks, which leads to better productivity in many different environments. smart environments, such as smart homes, smart classrooms or smart factories, are formed by connecting and adding a large number of smart devices to an existing communication infrastructure.
in the sphere of education, iot has wide application possibilities which have not been used enough. by using adequate sensors and actuators, it is possible to track different features of a physical environment, to detect whether these features are correlated with learning and teaching processes, and to dynamically change some features of the environment according to needs. iot technologies are an integral part of smart learning environments such as smart classrooms.

received july 3, 2015; received in revised form november 15, 2015. corresponding author: živko bojović, faculty of organizational sciences, university of belgrade, jove ilića 154, 11000 belgrade, serbia (e-mail: zivko@elab.rs)

a smart classroom is a concept which integrates several information and communication technologies to enable collaborative learning in order to improve the overall learning and teaching processes [3]. different technologies can be used for deploying a smart classroom, such as nfc, smart mobile devices, multimedia devices etc. furthermore, a learning environment should be a pleasant place for teaching and learning. therefore, a smart classroom should be equipped with systems for heating and cooling, light and presence management, and with the necessary equipment for the realization of the teaching process. the hardware equipment in smart classrooms is usually managed by adequate software. the existence of a platform that integrates all services within the scope of a smart learning environment is important for teachers and students.

this research presents the development of a platform for a smart learning environment. this iot platform integrates several student services and collects data from the smart learning environment through various sensors, actuators, microcomputers and microcontrollers. the aim of this research is to enhance the learning of internet of things in an academic environment by creating projects in the developed iot platform.
the research was conducted within the department of e-business (hereinafter: elab) at the faculty of organizational sciences, university of belgrade. the elab iot platform was developed to help students learn iot in an interesting way and achieve better learning outcomes.

2. literature review

the internet of things can be defined as a loosely coupled, decentralized system made of smart objects, that is, autonomous physical or digital network-equipped objects which are able to collect environmental data and to process these data [4]. the internet becomes a network of all devices, not solely computers. according to gartner's predictions, around 26 billion devices will be an integral part of the internet by the year 2020 [5].

the umbrella term "internet of things" is usually used for grouping sensors, actuators, microcontrollers and microcomputers into smart environments. sensors are analog or digital devices able to detect physical characteristics of the environment, such as temperature, humidity, pressure, noise levels etc. [6]. actuators are devices which work like switches: they can be used for controlling other devices. sensors and actuators are not enough by themselves for creating smart environments. they are often used together with more complex devices, such as microcontrollers and microcomputers. in the sphere of the iot, a widely used microcontroller platform is arduino and one of the best known microcomputers is raspberry pi.

an important aspect of the iot is connecting with other networks. in the era of broadband technologies such as wifi and lte, this issue is becoming especially important. data provided by different devices should be available anytime and anywhere. by creating an adequate platform in the cloud, it is possible to integrate multiple data sources and to analyze these large amounts of data.
the vision of the internet of things can be seen from two aspects: the internet aspect, which focuses on providing adequate internet services, and the things aspect, which includes collecting and processing data acquired from devices. smart devices are going to be key elements in software developed by using the object-oriented architecture.

the iot platform can connect sensor infrastructures, which act as data generators, with clients interested in obtaining the data, which act as consumers. a sensor infrastructure can contain one or many sensors. they can be mobile and connected to the same cloud wirelessly. data acquired by using the sensor infrastructure are stored in a non-relational database. clients are then able to access these data [7]. the database can be delocalized and distributed in order to exchange and store information.

nowadays, iot platforms are based on cloud infrastructure. these platforms are mainly used for collecting data from sensors and other smart devices in the environment in which they are deployed. cloud services and resources can be delivered by three cloud service models [8][9][10]: platform as a service (paas), software as a service (saas) and infrastructure as a service (iaas). in this research the focus is on iot paas. platform as a service enables developers to consume the resources of iaas and deploy their applications onto a virtualized cloud platform [9].

one of the most widely used paas is xively. it is a free online service enabling developers to deploy their own applications based on the iot and data acquired from sensors. xively manages large amounts of data every day. it is used by individuals, organizations and companies all around the world. data can be sent from different sensors and devices. xively has the following features:
- analysing and processing historical data acquired from sensors.
- sending real-time notifications and alerts related to any devices.
- calling custom scripts if user-defined conditions are fulfilled.
xively is built to encourage open ecosystems, such as digital electric meters, weather stations, biosensors and other devices.

besides iot platforms like xively, there are also iot platforms which are developed for specific purposes. most of these platforms are used in a business context for providing charged services. one example is aneka paas, an adaptable, extensible and flexible cloud platform that enhances the performance and efficiency of applications by harnessing resources from private, public or hybrid clouds [4]. aneka supports provisioning resources on different public cloud providers such as amazon ec2, windows azure and gogrid. application domains of aneka are science, finance, entertainment and media, manufacturing and engineering, telecommunication, health and life science [11].

the clem project has established a cloud-based ecosystem for e-learning, resource sharing and support for mechatronic vocational education teachers and learners [9]. clem is a platform that allows a large number of distributed mechatronic devices to become sharable and to be used for e-learning. this paas is used for project realization.

analysing studies from the literature, the authors concluded that there is a lack of iot paas platforms developed for educational purposes. the iot platform presented in this research was developed following the example of the xively platform. the elab iot platform is developed in the php language and the symfony framework. the main aim of the platform is collecting and evaluating data from the smart learning environment. furthermore, this platform is mainly developed to enhance the learning of iot and the realization of internet of things projects in an academic environment. the developed iot platform is an integral part of the elab student platform. the elab iot platform can be integrated with other platforms and deployed in different smart environments.

3. designing a platform for smart learning environment

3.1. platform architecture

the e-business department (elab) within the faculty of organizational sciences, university of belgrade, organizes courses in the following fields: internet technologies, e-business, computer simulation, mobile computing, and internet marketing. each year, e-courses are attended by more than 700 students [12]. the central student service of the department is moodle, which enables enrolling students in different courses for all levels of studies, downloading all teaching materials and managing assignments. due to the specific nature of different subjects, new student services which enable new functionalities need to be deployed. for example, students enrolled in the internet marketing course need to use web hosting and sms services, while students enrolled in the internet of things course need to use a platform which can control sensors and actuators. all new services and platforms should be integrated and they should use sso (single sign on).

the student platform should include various heterogeneous services and it should be extendable. for that reason, the platform architecture has to be modular and integrated with other services. the logical infrastructure is shown in figure 1. this solution should integrate a new platform with current services. by using the api, the platform can be integrated with moodle lms. this integration enables gathering information about students and using this information in various contexts.

fig. 1 a model of educational infrastructure based on the internet of things.

the model consists of four components. cloud computing infrastructure and virtualized resources can be used for creating a highly scalable and reliable infrastructure. identity management software is used for providing unique user accounts and single sign on services.
the lms is used for administering courses for students. for running the lms, relational databases and web servers are required. big data infrastructure and non-relational databases (nosql) are used for collecting various data about students and data from sensors. the second component is an iot platform infrastructure, which consists of two subcomponents. the platform for learning iot enables students to use data from sensors, to control different actuators and to deploy their own smart environments for testing and educational purposes. the other subcomponent is related to the production environment, where wireless sensor networks are used for enhancing students' experience and for introducing new educational services. the last two components are used for integrating other components of the infrastructure and for providing application programming interfaces (apis) to external users.

the elab student platform represents a point of integration of all student services. elab student is a modular platform. currently, the following modules are operational:
1. sms module, which enables integration with an sms gateway device and provides an api for sending and receiving short messages via the public mobile network;
2. hosting module, which enables publishing students' projects on the internet;
3. iot module, which is used for publishing and reading data from sensors and managing actuators and other smart devices.
the model of the proposed architecture is shown in figure 2.

fig. 2 the platform architecture

3.2.
digital identity management

problems of digital identities are important in constructing platforms and other software solutions. a separate identity layer is necessary in a complex information system made of various heterogeneous parts. without the identity layer, integration of these parts would not be possible. digital identity is related to managing the relationship between individuals and the objects that they use. it represents a set of digital subjects and their attributes. in other words, a digital identity includes a set of information about the owner of the identity, who can be an individual, a company or even a service [13].

for managing digital identities, several protocols are used. saml is the best-known protocol for managing distributed identities [14]. it is an xml-based framework which provides user authentication and authorization. other frameworks used for identity management include cas, oauth and openid.

a digital identity management tier enables centralized storage of all user accounts in ldap, as well as centralized authentication, authorization and single sign on / single sign out features. a cas server in combination with openldap and radius servers is used for digital identity management. cas (central authentication service) is a single sign on protocol and server developed by jasig. with this solution, users of the platform enter their authentication credentials only once; afterwards they have access to all pages they are authorized for (figure 3).

fig. 3 platform services and identity management

3.3. use cases

use cases of the elab iot platform are shown in figure 4. there are four main use cases: project management, device management, sensor management and working with data from sensors. the elab iot platform works with projects. students can create their iot projects and register team members.
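the four use cases can be pictured as a small containment hierarchy: a project holds team members and devices, a device holds sensors, and a sensor holds readings. the following is a minimal python sketch of that idea, not the platform's actual php/symfony code. all class, method and example names are hypothetical, and the rule that only registered team members may store readings is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sensor:
    name: str
    readings: list = field(default_factory=list)  # (timestamp, value) pairs


@dataclass
class Device:
    name: str
    sensors: dict = field(default_factory=dict)  # sensor name -> Sensor


@dataclass
class Project:
    name: str
    members: set = field(default_factory=set)    # registered team members
    devices: dict = field(default_factory=dict)  # device name -> Device

    def add_device(self, device_name):
        # device management: register a device under the project
        return self.devices.setdefault(device_name, Device(device_name))

    def add_sensor(self, device_name, sensor_name):
        # sensor management: attach a sensor to an already registered device
        device = self.devices[device_name]
        return device.sensors.setdefault(sensor_name, Sensor(sensor_name))

    def write(self, student, device_name, sensor_name, timestamp, value):
        # working with data (assumed rule): only team members may store readings
        if student not in self.members:
            raise PermissionError(f"{student} is not a member of {self.name}")
        sensor = self.devices[device_name].sensors[sensor_name]
        sensor.readings.append((timestamp, value))


# hypothetical example data
project = Project("smart classroom", members={"ana", "marko"})
project.add_device("raspberry-pi-1")
project.add_sensor("raspberry-pi-1", "temperature")
project.write("ana", "raspberry-pi-1", "temperature", "2015-05-01T10:00", 22.5)
print(project.devices["raspberry-pi-1"].sensors["temperature"].readings)
```

keeping devices and sensors in dictionaries keyed by name makes a lookup by project, device and sensor straightforward, which is the shape a request for writing or reading sensor data would naturally take.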
projects work similarly to folders in an operating system. each project can be private (viewable only by team members) or public (viewable by all users). in each project, devices such as raspberry pi or arduino can be registered. for each device, some metadata, such as ip address and location, can be added. under devices, the particular sensors and actuators which are physically connected to the device can be added. students can use the api to write data from devices, sensors and actuators to the platform.

fig. 4 the platform use cases

4. deploying a platform for smart learning environment

4.1. technologies used in deployment

the platform for iot is highly modular and it is created as a part of the elab student platform, which integrates all student services. the elab iot platform is developed in the php language and the symfony framework. symfony includes reusable php components which enable creating powerful web applications. it is an mvc-based framework. for rendering views, the twig templating engine is used, and for working with data, the preferred data-mapping tool is doctrine orm.

for developing the elab iot platform, both relational (mysql with doctrine orm) and non-relational (mongodb) databases are used. mongodb is a nosql database that stores data as collections of bson (binary json) documents. the json form of these documents is human-readable, and the format is convenient for working with large sets of data.

for communicating with the cas server, besimplessoauthbundle is used. this bundle can map user entities created in symfony to cas users. also, cas login and registration forms are used.

4.2. core component of the platform

the core component of the elab student platform integrates all modules, the identity service, templates, the identity module and tools for integration with other software. this component enables registering new users to the platform. also, it manages which services are available to particular students.
the administrator is able to deny access to particular services or to the whole platform. the elab student platform uses the bootstrap frontend framework for designing templates. the basic template is slightly modified to suit the needs of the platform. the homepage of the platform, where students can choose their desired service, is shown in figure 5.

fig. 5 the platform homepage

4.3. internet of things module

the module for the internet of things is one of the key features of the platform. using this module, students can add their iot projects, the devices they work with, and sensors and actuators. students are able to use the provided api to send current data collected from sensors and to read historical data measured at any earlier moment.

this platform stores various metadata related to devices and sensors. for devices, the user can define their type, latitude, longitude, image and description. for sensors, the user can set their reliability, type, measuring units, image and description.

figure 6 shows the procedure for creating a new project. first, a student can select team members from the list. the list shows all students who have a cas account, are registered on the elab student platform and have not yet created an iot project. afterwards, they can add the devices which they want to include in the project. finally, students can view values from sensors and the graph of historical values. they can also filter the graph by entering the start and end date and time they want to see in the graph (figure 7).

fig. 6 creating a project

fig. 7 graph with sensor data

4.4. results of using the platform

in order to evaluate the usability of the designed environment, the research was conducted within the course internet of things on undergraduate studies at the faculty of organizational sciences, university of belgrade. in this research, 37 students participated. all of them had similar backgrounds and interests in the sphere of business informatics. the course consisted of 12 lectures, which were grouped into the following topics: introduction to internet of things technologies, defining scenarios for automating smart environments, microcomputers and microcontrollers, developing web services, developing web and mobile applications for smart environments automation.

after completing the course, students' knowledge was evaluated. students' assignments, projects and knowledge tests were used to calculate the final grade. each assignment was a part of the final students' project. each project was related to designing and implementing a smart environment such as a smart home, smart classroom, smart parking, etc. for the implementation of a smart environment students used the elab iot platform. the average grade that students achieved on the course was 8.75 (on a scale of 6 to 10).

during the course, students were given a survey. questions were mostly based on the five-point likert scale. students were asked to assess the quality of the four topics studied during the course: arduino, raspberry pi, web applications and web services. all the topics were studied using the described platform. table 1 shows summary results of the survey analysis for each topic (x̄ mean grade, from 1 to 5; 𝛿 standard deviation).

table 1 survey analysis

parameter    topic             x̄     𝛿
interesting  arduino           4.31  0.62
interesting  raspberry pi      4.43  0.65
interesting  web applications  4.22  0.67
interesting  web services      3.89  0.81
simplicity   arduino           4.03  0.65
simplicity   raspberry pi      3.81  0.66
simplicity   web applications  3.30  1.27
simplicity   web services      3.62  1.01
motivation   arduino           4.22  0.64
motivation   raspberry pi      4.05  0.70
motivation   web applications  3.51  0.90
motivation   web services      3.59  0.801

evaluation results show that the students were interested in learning iot and developing smart environments using the described platform.
the designed platform could effectively support teaching and learning, leading to good results on knowledge tests and a high level of students' satisfaction and motivation.

5. conclusion

in this paper, we designed and deployed an internet of things platform. this platform was a part of the broader elab student platform and it was able to help students with their iot projects. students were able to register their iot devices and to send data from them to the platform. also, they were able to browse saved data. in the future, this platform is going to be extended with more features. one of the planned functionalities is better integration with big data infrastructure.

acknowledgement: the paper is a part of the research done within the project 174031. the authors would like to thank the mntrs for financial support.

references
[1] l. atzori, a. iera, g. morabito, the internet of things: a survey, computer networks, vol. 54, issue 15, 28 october 2010, pp. 2787-2805.
[2] e. borgia, the internet of things vision: key features, applications and open issues, computer communications, vol. 54, 2014, pp. 1-31.
[3] s.s. yau, s.k.s. gupta, e.k.s. gupta, f. karim, s.i. ahamed, y. wang, b. wang, smart classroom: enhancing collaborative learning using pervasive computing technology, in asee 2003 annual conference and exposition, 2003, pp. 13633-13642.
[4] j. gubbi, r. buyya, s. marusic, and m. palaniswami, "internet of things (iot): a vision, architectural elements, and future directions," futur. gener. comput. syst., vol. 29, no. 7, pp. 1645-1660, 2013.
[5] gartner, "gartner identifies the top 10 strategic technology trends for 2014," gartner, orlando, florida, 2013. [online]. available: http://www.gartner.com/newsroom/id/2603623.
[6] s. s. iyengar, n. parameshwaran, v. v. phoha, n. balakrishnan, and c. d. okoye, fundamentals of sensor network programming: applications and technology. wiley-ieee press, 2010.
[7] c.
floerkemeier, the internet of things: first international conference, iot 2008, march 26-28, 2008, proceedings, vol. 4952. zurich, switzerland: springer, 2008.
[8] l. george, developing software online with platform-as-a-service technology, ieee comput., vol. 41, 2008, pp. 13-15.
[9] k.-m. chao, a. e. james, a. g. nanos, j.-h. chen, s.-d. stan, i. muntean, g. figliolini, p. rea, c. b. bouzgarrou, p. vitliemov, j. cooper, j. van capelle, "cloud e-learning for mechatronics: clem", future generation computer systems, vol. 48, 2015, pp. 46-59.
[10] m. despotović-zrakić, k. simić, a. labus, a. milić, b. jovanić, scaffolding environment for adaptive e-learning through cloud computing, educational technology & society, 16(3), 2013, pp. 301-314.
[11] y. wei, k. sukumar, c. vecchiola, d. karunamoorthy and r. buyya, chapter 27. aneka cloud application platform and its integration with windows azure, in cloud computing methodology, systems, and applications, edited by boualem benatallah, crc press, 2011, pp. 645-679.
[12] a. labus, k. simić, m. vulić, m. despotović-zrakić, and z. bogdanović, "an application of social media in elearning 2.0," in proceedings of the 25th bled econference edependability: reliable and trustworthy estructures, eprocesses, eoperations and eservices for the future, 2012, pp. 557-572.
[13] y. zhang and j.-l. chen, "universal identity management model based on anonymous credentials," in proceedings of the 2010 ieee international conference on services computing (scc), 2010.
[14] k. d. lewis and j. e. lewis, "web single sign-on authentication using saml," int. j. comput. sci. issues, vol. 1, no. 8, pp. 41-48, 2009.

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp.
41-50 https://doi.org/10.2298/fuee1801041p

investigating the steady state mode of power inverters for induction heating

evgeniy popov, nikolay hinov
faculty of electronic engineering and technologies, department of power electronics, technical university of sofia, sofia, bulgaria

abstract. this article formulates a new unified interpretation of the analysis of electromagnetic processes in autonomous (usually resonant) inverters with power circuits having a serial rlc configuration, either with or without free wheeling diodes. the investigation starts with clarifying the parameters of the inverter circuit by reducing the fourth order power network to a second order one in a normalized form. on this basis, novel compact relationships between the most important internal inverter parameters are given. a matlab program calculates and displays the frequency characteristics of both types of inverters and simulates their steady state. the results from the characteristics and from the simulation confirm each other in many ways. they were also confirmed experimentally. the processed information supports better understanding and organization of the intelligent design, measurement and control of inverters for technological applications (induction heating).

key words: modeling, electromagnetic processes, frequency characteristics, rlc inverters, steady state, unified interpretation.

1. introduction

the voltage fed rlc inverter (fig. 1), its dual counterpart the current fed rlc inverter, and the serial rlc inverter without free-wheeling diodes (fig. 2) cover a very wide range of practical autonomous inverter circuits generally applied in electronic technology [1]-[4]. in this case, the important and complex mutually connected problems are the accurate design of the power circuit, appropriate matching between the inverter and the load, and adequate control providing stable, reliable and efficient operation of the converter when wide variations of the load are expected [5]-[7].
the mathematical relationships between the parameters of the mentioned inverters are rather complicated [5], [8]-[10]. it has been proved that all the quantities in these second order topologies depend on two variables: the ratio between the controlling angular frequency and the generalized angular frequency (frequency coefficient) $\nu = \omega/\omega_0$, on one hand, and the ratio between the damping coefficient and the generalized angular frequency $n = \delta/\omega_0$ of the power circuit, on the other. at the same time, engineering practice in the area of power electronics needs fast and accurate means for the simultaneous solution of the problems already stated.

received september 15, 2016; received in revised form september 29, 2017
corresponding author: nikolay hinov, faculty of electronic engineering and technologies, department of power electronics, technical university of sofia, sofia, bulgaria (e-mail: hinov@tu-sofia.bg)

fig. 1 voltage fed rlc inverter

for oscillatory mode ($r_i < 2\sqrt{L_i/C_i}$) of the inverter circuit (fig. 1, fig. 2) the coefficient is $c = 1$, for over damped mode ($r_i > 2\sqrt{L_i/C_i}$) $c = -1$ and for critical mode ($r_i = 2\sqrt{L_i/C_i}$) $c = 0$. the parameters that determine the development of the electromagnetic processes in the steady state in the power inverter circuits are

2 2 cos ( .; );1 ( ) (4 sin cos ) i i i i i i i b n n osc over damped critical c b b (1)

for resonant inverters (oscillatory mode) the coefficient of hesitation is

$$k = \frac{1}{1 - \exp(-\pi\delta/\omega_0)} = \frac{1}{1 - \exp(-n\pi)} \tag{2}$$

the frequency coefficient is

2 2 2 2 ( .; ); ( ) cos(4 sin cos ) i i ii i i i n n osc over damped critical bc b b (3)

fig. 2 a serial rlc inverter without free-wheeling diodes ($L_i = L_{i1} + L_{i2}$)

2.
the inverter analysis in the steady state mode

in contrast to the approaches used in [10]-[15], a unified approach to the analysis is provided here, which simplifies the description of the behavior of the power circuit. a constant $c_i$ reflecting the type of the inverter is introduced, with $c_i = +1$ for the rlc inverter with free wheeling diodes (fig. 1) and $c_i = -1$ for the rlc inverter without free wheeling diodes (fig. 2). the following designations are applied:
$f_s(x) = \sin x$, $f_c(x) = \cos x$, $\delta = r_i/(2L_i)$, $\omega_0 = \sqrt{1/(L_i C_i) - \delta^2}$ for oscillatory mode;
$f_s(x) = \sinh x$, $f_c(x) = \cosh x$, $\delta = r_i/(2L_i)$, $\omega_0 = \sqrt{\delta^2 - 1/(L_i C_i)}$ for over damped mode;
$f_s(x) = x$, $f_c(x) = 1$, $\delta = r_i/(2L_i)$, $\omega_0 = \delta$ for critical mode.

then the inverter current and the voltage across the capacitor $C_i$ can be written in the following manner

$$i(t) = \frac{U_d + U_0}{\omega_0 L_i}\, e^{-\delta t} f_s(\omega_0 t) - c_i I_0\, e^{-\delta t}\left[f_c(\omega_0 t) - \frac{\delta}{\omega_0} f_s(\omega_0 t)\right] \tag{4}$$

$$u(t) = U_d - (U_d + U_0)\, e^{-\delta t}\left[f_c(\omega_0 t) + \frac{\delta}{\omega_0} f_s(\omega_0 t)\right] - c_i \frac{I_0}{\omega_0 C_i}\, e^{-\delta t} f_s(\omega_0 t) \tag{5}$$

the angle $\psi_2 = \pi\omega_0/\omega = \pi/\nu$ corresponding to the half period is determined from the controlling angular frequency $\omega = 2\pi f$ and the generalized frequency $\omega_0$. the parameters of the inverter circuit (fig. 1, fig. 2) can be determined taking into account the initial conditions for the steady state

$$i(0) = -c_i\, i(\psi_2), \qquad u(0) = -u(\psi_2) \tag{6}$$

and

$$i(\psi_1) = 0 \tag{7}$$

then the determination of the parameters follows. the parameter $a = I_0\,\omega_0 L_i/(U_d + U_0)$ is

$$a = \frac{f_s(\psi_2)}{e^{n\psi_2} + c_i f_c(\psi_2) - c_i n f_s(\psi_2)} \tag{8}$$

for the inverter in fig. 1 only, the angle $\psi_1$ is determined from

$$a = \frac{f_s(\psi_1)/f_c(\psi_1)}{1 - n\, f_s(\psi_1)/f_c(\psi_1)} \tag{9}$$

the generalized coefficient of hesitation is

$$k = \frac{1}{1 + e^{-n\psi_2}\left[(c_i c\, a + n + c_i a n^2)\, f_s(\psi_2) + f_c(\psi_2)\right]} \tag{10}$$

it should be underlined that when calculating $a$ and $k$ for discontinuous inverter current mode for fig.
2, $\psi_2$ must be equal to $\pi$ in the already given expressions (8) and (10). the initial capacitor voltage is

$$U'_0 = \frac{U_0}{U_d} = 2k - 1 \tag{11}$$

the maximal voltage across the capacitor $U_{C_i m}$ for fig. 2 is also given by (11). for fig. 1 it is given by

$$U'_m = \frac{U_m}{U_d} = 2\left(k - \frac{k}{k_1}\right) + 1 \tag{12}$$

the expression for the coefficient $k_1$ is the same as (10), but the variable $\psi_2$ is exchanged with $\psi_1$ ($\psi_2 \to \psi_1$). the normalized inverter current and capacitor voltage are respectively

$$i'(\psi) = \frac{i(\psi)\,\omega_0 L_i}{U_d} = 2k\, e^{-n\psi}\left[(1 + c_i a n)\, f_s(\psi) - c_i a\, f_c(\psi)\right] \tag{13}$$

$$u'(\psi) = \frac{u(\psi)}{U_d} = 1 - 2k\, e^{-n\psi}\left[(c_i c\, a + n + c_i a n^2)\, f_s(\psi) + f_c(\psi)\right] \tag{14}$$

the average value of the input current (all values are normalized) is

$$I'_d = \frac{I_d\,\omega_0 L_i}{U_d} = \frac{1}{\psi_2}\int_0^{\psi_2} i'(\psi)\, d\psi = \frac{2(2k - 1)}{\psi_2\,(n^2 + c)} \tag{15}$$

the rms value of the inverter current is

$$I' = \frac{I\,\omega_0 L_i}{U_d} = \sqrt{\frac{I'_d}{2n}} \tag{16}$$

the output characteristic (fig. 1 and fig. 2) is

$$OCH = I'/I'_d \tag{17}$$

the input characteristic is

$$ICH = \frac{1}{n\, I'_d} \tag{18}$$

the characteristic of the coefficient of nonlinear distortion (klirr factor) of the inverter current is

$$K_F\,[\%] = 100\,\frac{\sqrt{I'^2 - I'^2_{(1)}}}{I'_{(1)}} \tag{19}$$

where $I'_{(m)}$ ($m = 1, 3, 5, 7, \ldots$) is the m-th harmonic component of the inverter current. (the mathematical expressions for calculating the harmonic components in the different circuits and modes of operation differ from each other and are rather complicated. they have been derived and published in the authors' earlier publications.) from that point on, the analysis of the power inverter may be continued without problems in a normalized or in a non-normalized form.

3. an example with a serial resonant inverter without free wheeling diodes

a half bridge circuit (fig. 3) is under study, but it can be easily converted into the bridge one. the power losses in the inverter are neglected.
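the relationships (8), (10), (11), (13), (15) and (16) can be checked numerically. the python sketch below is an illustration only and is not the authors' matlab program. it assumes the oscillatory mode (c = 1, sine and cosine), the normalization $i' = i\,\omega_0 L_i/U_d$, and the signs as read here from the equations. the function names and the sample values of n and ν are hypothetical.

```python
import math


def steady_state(n, nu, c_i):
    """normalized steady-state quantities, oscillatory mode (c = 1 assumed)."""
    psi2 = math.pi / nu                      # half-period angle
    fs, fc = math.sin(psi2), math.cos(psi2)
    a = fs / (math.exp(n * psi2) + c_i * fc - c_i * n * fs)           # (8)
    bracket = (c_i * a + n + c_i * a * n * n) * fs + fc
    k = 1.0 / (1.0 + math.exp(-n * psi2) * bracket)                   # (10)
    u0 = 2.0 * k - 1.0                                                # (11)
    i_d = 2.0 * (2.0 * k - 1.0) / (psi2 * (n * n + 1.0))              # (15)
    i_rms = math.sqrt(i_d / (2.0 * n))                                # (16)
    return {"psi2": psi2, "a": a, "k": k, "u0": u0, "i_d": i_d, "i_rms": i_rms}


def current(psi, n, a, k, c_i):
    """normalized inverter current i'(psi) as in (13), oscillatory mode."""
    return 2.0 * k * math.exp(-n * psi) * (
        (1.0 + c_i * a * n) * math.sin(psi) - c_i * a * math.cos(psi))


def mean_current(n, nu, c_i, steps=20000):
    """midpoint-rule average of i'(psi) over the half period."""
    s = steady_state(n, nu, c_i)
    h = s["psi2"] / steps
    total = sum(current((j + 0.5) * h, n, s["a"], s["k"], c_i)
                for j in range(steps))
    return total * h / s["psi2"]


# hypothetical operating point: inverter with free wheeling diodes, nu = 1.2
s = steady_state(n=0.2, nu=1.2, c_i=+1)
print(s["k"], s["i_d"], mean_current(0.2, 1.2, +1))
```

the averaged current from the waveform and the closed form (15) should agree, because the charge delivered in a half period swings the capacitor from $-U_0$ to $+U_0$. for ν = 1 and $c_i = -1$ the parameter a vanishes and k reduces to (2).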
the commutation of the power semiconductor devices is instantaneous. a parallel equivalent circuit represents the induction heater. the quality factor of the load circuit is sufficiently high that the voltage across the load is close to a sine wave.

a matlab program processes all the mathematical information describing the steady state operation of the inverter in the allowed frequency range. the frequency characteristics of the inverter are obtained and graphically displayed in fig. 4. these particular characteristics correspond to a practically implemented inverter (pt1-1002400) with the following data: ud=500 v; lk=45 µh; ck=84 µf; rlr=0.4739599 Ω, flr=2083 hz, llr=8.8717 µh, cl=657.88 µf, n1=0, n2=1, f=1600-2600 hz. the first graphic shows the average input current id [a] (solid line). the second graphic shows the maximal device voltage uvsm (solid line). other parameters of the circuit can also be calculated and displayed.

fig. 3 a practical serial resonant inverter without free-wheeling diodes

fig. 4 the inverter frequency characteristics (panels: the average input current (a) and the maximal thyristor voltage (v) versus frequency (hz), 1600-2600 hz; marked points: 200.3 a and 1005 v at 2196 hz)

the stability and efficiency of the inverter can be studied from the characteristics. in general there is a minimum of the input current and power, load voltage, serial capacitor voltage and device voltage around the resonance of the load circuit.
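the listed load data can be cross-checked against the stated load resonant frequency. the short python sketch below assumes that the unit prefixes dropped from the inductance and capacitance values are micro (µh and µf) and that the parallel equivalent load resonates at $1/(2\pi\sqrt{L C})$. the function and variable names are hypothetical.

```python
import math


def resonant_frequency_hz(inductance_h, capacitance_f):
    """resonant frequency of an ideal l-c pair, 1 / (2*pi*sqrt(l*c))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


L_LR = 8.8717e-6   # load inductance in henry (assumed microhenry in the data)
C_L = 657.88e-6    # compensating capacitance in farad (assumed microfarad)

f_lr = resonant_frequency_hz(L_LR, C_L)
print(f"load resonant frequency: {f_lr:.1f} hz")
```

under these assumptions the result lies within 1 hz of the quoted flr = 2083 hz, which supports reading the two values as microhenries and microfarads.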
if the parameters of the induction heater vary during the induction heating process, it is advisable to maintain an almost constant input current (power) at a slight capacitive detuning of the load circuit rl, ll, cl, where the operation is stable, by adjusting the controlling frequency. the calculated slope of the frequency characteristic of the input current helps in determining the parameters of the closed-loop automatic control system.

[fig. 5 the detailed simulation results at f = 2195 hz: the inverter current (a) and the device voltage (v) versus time (s), 0 to 1 ms; data marker on the device voltage: 1045 v at t = 0.386 ms]

the steady state is simulated by a matlab program based on the method described in [16] for direct determination of the steady state mode (here for the discontinuous inverter current mode), which is experimentally confirmed. the results from the frequency characteristics, from the simulation and from experiments are in good agreement according to table 1. that confirms the correctness of the whole study. the detailed simulation results of the power inverter are graphically displayed (fig. 5) for f = 2195 hz. the first diagram shows the inverter current ick [a] (solid line). the second diagram shows the voltage across a device uvs1 [v] (solid line). the simulation results for the same circuit at f = 694 hz are given in fig. 6. they display an interesting case: a non-typical but possible operation at an inductive detuning of the load, generally with a third harmonic component of the inverter current (a case that is difficult to study analytically).

table 1 results for fig. 3 (sim. = simulation, fr.ch. = frequency characteristics, exper. = experiment)

| f [hz] | 694 | 1823 | 1823 | 2018 | 2018 | 2018 | 2083 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| study | sim. | fr.ch. | sim. | fr.ch. | exper. | sim. | fr.ch. |
| load | ind. | ind. | ind. | ind. | ind. | ind. | res. |
| id [a] | 169.5 | 400.7 | 388.7 | 200 | 200 | 196.5 | 178.2 |
| ul [v] | 200.4 | 304.4 | 303.5 | 215.4 | 217 | 215.8 | 201.7 |
| tq.c. [µs] | 604 | 113.6 | 125.9 | 61 | 65 | 68.3 | 63.8 |
| uckm [v] | 1456 | 1308 | 1270 | 590 | 585 | 580 | 509 |
| uvsm [v] | 1927 | 1342 | 1342 | 866 | 870 | 875 | 863 |

| f [hz] | 2083 | 2195 (recom.) | 2195 (recom.) | 2195 (recom.) | 2394 | 2394 |
| --- | --- | --- | --- | --- | --- | --- |
| study | sim. | fr.ch. | exper. | sim. | fr.ch. | sim. |
| load | res. | cap. | cap. | cap. | cap. | cap. |
| id [a] | 180.1 | 199.8 | 200 | 208.2 | 400.6 | 416.3 |
| ul [v] | 206.6 | 211.9 | 218 | 222.1 | 302.2 | 314.1 |
| tq.c. [µs] | 72.5 | 82.9 | 85 | 89.9 | 97.3 | 99.6 |
| uckm [v] | 515 | 542 | 560 | 565 | 996 | 1034 |
| uvsm [v] | 886 | 1003 | 1030 | 1045 | 1636 | 1689 |

4. an example with a serial resonant inverter with free wheeling diodes

a real circuit is given in fig. 7. the inverter frequency characteristics are graphically displayed in fig. 8. they correspond to an inverter (pt2-50-4000) with: ud = 500 v; lk = 0.3 mh; ck = 4 µf; rlr = 4 ohm; flr = 4000 hz; cos φlr = 0.24254; n1 = 0; n2 = 1; f = 3000-4800 hz. the graphics show the average input current id [a] (solid line) and the load voltage ullm [v] (solid line). for each frequency there is a check whether the shape of the load voltage is close to a sine wave. an increase of the controlling frequency leads to an increase of the input current (power), load voltage, peak serial capacitor voltage and rms value of the inverter current, and to a decrease of the circuit turn-off time. around the load resonant frequency, however, the character of most functions is the opposite (inflection points) and the changes of the parameters are not so large. therefore, if the load rl, ll varies during the induction heating, it is advisable to maintain resonance of the load circuit rl, ll, cl by adjusting the controlling frequency. the results from the frequency characteristics, from the simulation of the steady state mode and from experiments are compared in table 2. they are in good agreement, confirming the correctness of the results. the steady state results for f = 4000 hz are shown in fig. 9. they are: ick [a] (solid line), ul [v] (solid line).

[fig. 6 the detailed simulation results for f = 694 hz: the inverter current (a) and the device voltage (v) versus time (s), 0 to 3 ms; data marker on the device voltage: 1926 v at t = 1.29 ms]

[fig. 7 a real resonant inverter with free-wheeling diodes]

[fig. 8 the inverter frequency characteristics: the average input current (a) and the load voltage (v) versus frequency (hz), 3000 to 4800 hz; data markers at 4000 hz: 74.92 a and 385.7 v]

table 2 results for fig. 8 (par. = parameter, fr.ch. = frequency characteristics, sim. = simulation, exper. = experiment)

| f [hz] | 3200 | 3200 | 3600 | 3600 | 4000 (recom.) | 4000 (recom.) | 4000 (recom.) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| par. | fr.ch. | sim. | fr.ch. | sim. | fr.ch. | exper. | sim. |
| id [a] | 16.74 | 16.62 | 88.53 | 88.39 | 74.92 | 75 | 74.34 |
| vl [v] | 182.3 | 182.3 | 420.2 | 420.5 | 385.7 | 387 | 385.6 |
| tq.c. [µs] | 69.83 | 70.31 | 35.98 | 36.88 | 22.33 | 22 | 23.19 |
| vckm [v] | 1683 | 1700 | 2129 | 2125 | 1320 | 1325 | 1320 |
| ick [a] | 94.20 | 94.58 | 137.7 | 137.8 | 96.77 | 97 | 96.75 |

| f [hz] | 4400 | 4400 | 4800 | 4800 |
| --- | --- | --- | --- | --- |
| par. | fr.ch. | sim. | fr.ch. | sim. |
| id [a] | 75.81 | 75.45 | 182.0 | 181.7 |
| vl [v] | 388.7 | 388.5 | 603.1 | 602.8 |
| tq.c. [µs] | 31.13 | 31.31 | 24.28 | 24.42 |
| vcsm [v] | 1548 | 1551 | 3120 | 3121 |
| is [a] | 122 | 122 | 268 | 268 |

[fig. 9 the detailed simulation results at f = 4000 hz: the inverter current (a) and the load voltage versus time (µs), 0 to 500 µs; data markers: 145.8 a at 288.8 µs and 546.1 at 291 µs]

5. conclusions

serial rlc inverters with or without free wheeling diodes for induction heating are investigated. a novel unified interpretation of the electromagnetic processes is applied based on the previously calculated inverter network parameters in a normalized form.
the frequency characteristics, the steady state simulation parameters and the experimental results are obtained and mutually confirmed. that proves the correctness of the whole study. the requirements for the control system are defined, which is useful for research based on offline simulation, hardware in the loop and rapid prototyping. the main contribution of the work is an approach to the analysis of series rlc dc/ac converters for induction heating. this allows the creation of engineering design methods that are mathematically simple, yet sufficiently precise for engineering practice.

references
[1] b. l. dokić, b. blanuša, power electronics converters and regulators, third edition, springer international publishing, switzerland, 2015, isbn 978-3-319-09401-4.
[2] e. i. berkovitch, g. v. ivenskiy, yu. s. yoffe, a. t. matchak, v. v. morgun, higher frequency thyristor converters for electro technological units, st. petersburg, energoatomizdat, (1973), 1983 (in russian).
[3] m. k. kazimierczuk and d. czarkowski, resonant power converters, 2nd edition, ieee press and john wiley & sons, new york, ny, pp. 1-595, isbn 978-0-470-90538-8, 2011.
[4] n. mohan, t. m. undeland, w. p. robbins, power electronics converters, applications, and design (3rd edition), john wiley & sons, 2003.
[5] a. dominguez, a. otin, l. a. barragan, j. i. artigas, d. navarro, and i. urriza, "frequency-to-output-power transfer function measurement of a resonant inverter for domestic induction heating applications", in proceedings of the ieee annual conference of the industrial electronics society iecon13, vienna, austria, 2013, pp. 5032-5037.
[6] e. popov, analysis, modeling and design of converter units (computer-aided design of power electronic circuits), technical university printing house, sofia, 2005 (in bulg.), chapters 2-3, pp. 59–80.
[7] o. lucía, p. maussion, e. dede, and j. m.
burdío, "induction heating technology and its applications: past developments, current technology, and future challenges", ieee transactions on industrial electronics, vol. 61, pp. 2509-2520, may 2014.
[8] a. dominguez, l. barragan, j. artigas, a. otin, i. urriza, d. navarro, "reduced-order models of series resonant inverters in induction heating applications", ieee transactions on power electronics, vol. 32, issue 3, pp. 2300–2311, march 2017.
[9] a. dominguez, l. a. barragan, j. i. artigas, a. otin, i. urriza, and d. navarro, "reduced-order model of a half-bridge series resonant inverter for power control in domestic induction heating applications", in proceedings of the ieee international conference on industrial technology (icit), 2015, pp. 2542-2547.
[10] y. yin, r. zane, r. erickson, and j. glaser, "direct modelling of envelope dynamics in resonant inverters", electronics letters, vol. 40, pp. 834-836, 2004.
[11] d. maksimovic, a. m. stankovic, v. j. thottuvelil, and g. c. verghese, "modeling and simulation of power electronic converters", in proceedings of the ieee, 2001, vol. 89, pp. 898-912.
[12] f. h. dupont, c. rech, r. gules, and j. r. pinheiro, "reduced-order model and control approach for the boost converter with a voltage multiplier cell", ieee transactions on power electronics, vol. 28, pp. 3395-3404, 2013.
[13] j. sun and h. grotstollen, "averaged modeling and analysis of resonant converters", in proceedings of the 24th annual ieee power electronics specialists conference (pesc '93), 1993, pp. 707-713.
[14] s. tian, f. c. lee, and q. li, "a simplified equivalent circuit model of series resonant converter", ieee transactions on power electronics, vol. 31, pp. 3922-3931, 2016.
[15] y. yin, r. zane, r. erickson, and j. glaser, "direct modeling of envelope dynamics in resonant inverters", in proceedings of the 34th annual ieee power electronics specialists conference (pesc '03), 2003, vol. 3, pp. 1313-1318.
[16] e. i.
popov, “direct determination of the steady-state mode in autonomous inverters”, scientific journal electrical engineering and electronics e+e, sofia, bulgaria, (in bulg.), vol. 11-12, 2004.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 363-373 doi: 10.2298/fuee1703363m

lte and wifi co-existence in 5 ghz unlicensed band

nenad milošević 1, bojan dimitrijević 1, dejan drajić 2, zorica nikolić 1, milorad tošić 1
1 university of nis, faculty of electronic engineering, nis, serbia
2 university of belgrade, school of electrical engineering, belgrade, serbia

abstract. since the future mobile networks will require significantly higher data throughput, and the long-term evolution (lte) licensed bands are already occupied, the frequency band extension and the data rate increase may be achieved by using some of the available unlicensed bands. the most appropriate unlicensed band for this purpose lies in the 5 ghz frequency range. however, this unlicensed band is already occupied by wifi networks, and special attention has to be paid to coordinating these two different networks in the shared spectrum usage. therefore, this paper considers the shared access co-existence in the 5 ghz unlicensed band between uncoordinated lte and wifi networks. more precisely, it considers the influence of the lte downlink transmission on the performance of the wifi networks. the experimental results show that lte significantly degrades the wifi network performance, which means that some of the coordination algorithms have to be employed.

key words: wifi, lte, co-existence, unlicensed band, shared access

1. introduction

the mobile communications industry has been growing rapidly over the past decade, and mobile data transfer has been almost completely based on the usage of the licensed spectrum.
having in mind the predictions of a 1000-fold growth of cellular data traffic by 2020 [1], and the fact that there is an increasing amount of machine to machine data transfer [2], it is clear that the licensed band communications would have problems supporting such a high bandwidth demand. one of the possible solutions to this problem is to use some additional spectrum outside the dedicated licensed band, while causing minimum interference to the existing systems in that frequency band. the co-existence of the mobile communication networks (global system for mobile (gsm) and long-term evolution (lte)) and digital terrestrial video broadcasting (dvb-t) systems is analyzed in [3]. the paper shows that there could be a significant mutual influence of these systems. besides, the available ultra high frequency (uhf) bandwidth is not very large. all of this indicates that uhf tv bands are not very appropriate for the mobile communication systems bandwidth increase.

received september 5, 2016; received in revised form december 7, 2016. corresponding author: dejan drajić, university of belgrade, school of electrical engineering, belgrade, serbia (e-mail: ddrajic@etf.rs)

on the other hand, the unlicensed bands are particularly suitable for the bandwidth extension. the unlicensed bands consist of industrial, scientific and medical (ism) and unlicensed national information infrastructure (u-nii) bands. ism bands occupy frequencies around 900 mhz, 2.4 ghz, and 5.8 ghz, whereas u-nii occupies frequencies from 5 to 5.8 ghz. the 2.4 ghz band provides around 80 mhz of bandwidth, but it is heavily occupied by 802.11b/g wifi networks, bluetooth and other wireless personal area networks. on the other hand, the 5 ghz band provides around 500 mhz of bandwidth and it is lightly occupied, mainly by wifi 802.11ac/n networks.
both 2.4 and 5 ghz wifi use carrier sense multiple access (csma) to access the channel, and they are possible victims of some other technologies operating in the same frequency range. bluetooth uses csma for data transmission and time division multiple access (tdma) for audio transmission. therefore, in the case of audio transmission, bluetooth may cause interference to other networks. having in mind the existing interference and the available bandwidth, the 5 to 5.8 ghz band was chosen to be used for the bandwidth extension [4]. however, the implemented technology should be flexible enough to support other frequency bands.

lte was first defined in 3rd generation partnership project (3gpp) release 8 [5]. it represents an evolving mobile communication standard that provides high data rates, higher capacity, smaller latency and new levels of user experience. in 3gpp release 10 [6], lte was improved to fulfil the requirements of 4g mobile networks and it was named lte-advanced (lte-a). the most important advancement of lte-a is the possibility of simultaneous use of multiple frequency bands by means of the carrier aggregation (ca) technology. ca is the key technology that enables the unlicensed spectrum usage by the lte devices. however, the unlicensed spectrum would only be used for data rate increase, both in downlink and uplink, while the licensed spectrum, having predictable performance, will still be used for the important operations, such as network management, or delivery of critical information and guaranteed quality of service.

although the unlicensed band may be freely used by the communication systems, there are some regulations that have to be followed, such as dynamic frequency selection (dfs) and listen-before-talk (lbt), which may use different technologies, such as carrier sense multiple access or spectrum sensing [7].
these coordination mechanisms, which are variants of dynamic spectrum access (dsa), are essential for achieving efficient co-existence between different systems that are operating in unlicensed spectrum. as the 5 ghz band is primarily used by ieee 802.11ac wifi networks, the focus should be on the coordination between lte and wifi. the main problem lies in the fact that lte was designed to operate in a dedicated, licensed band. therefore, it does not have shared access mechanisms, like wifi does. papers [8] and [9] provide, respectively, simulation and theoretical results on the co-existence of lte and wifi networks and show the need for some sort of coordination between these two networks. an experimental analysis of the 2.4 ghz band wifi communication influenced by lte is given in [10]. the lte is represented only by the base station, without any mobile stations. in this case, the lte enb waits for the ue and transmits mainly control signals.

there are two possible solutions to the problem of wifi and lte networks co-existence. the first approach is to modify the lte standard and adapt it to work in a frequency shared environment. lte-u (lte-unlicensed), proposed by the lte-u forum [11], uses an lte version with a duty cycle, i.e. with pauses in the transmission. in this way, wifi has the opportunity to transmit its data during the silent periods of the lte-u. besides, the lte-u access point listens to wifi transmissions, tries to predict the usage patterns and to adapt to them. licensed assisted access (laa) will be a part of the future 3gpp lte release 13 standard [12], [13], and includes a listen-before-talk (lbt) mechanism that transmits only when the channel is free. the standardization progress and a summary of laa are given in [14]. also, an operator level system performance is analyzed for indoor hotspot, indoor office, and outdoor small cell scenarios.
the analysis showed that a significant lte capacity increase may be obtained by using laa and lbt. paper [15] considers the design of lbt for the laa system and analyzes the influence of the laa clear channel assessment threshold on the performance of both lte and wifi networks. the paper shows that the proposed lbt algorithm is able to improve laa and to keep the interference to wifi low. however, both lte-u and laa require significant modifications of the lte standard and will not be available in the near future.

the second approach is to introduce a coordinated access to the shared channel. there are two general approaches to spectrum coordination [16]: reactive spectrum coordination and proactive spectrum coordination. the most straightforward reactive spectrum coordination concept is the so-called agile wideband radio scheme [17]. in this scheme, the transmitter analyzes the spectrum and chooses its frequency band and modulation scheme, having in mind the highest allowed interference level. there is no higher-level coordination with the neighboring nodes. this coordination scheme is very simple, but has one serious possible problem with the hidden nodes, i.e. with the nodes that may not be visible to the station, but may interfere with it. another simple coordination scheme is reactive control [18]. all the radio stations in a network control their transmit power, rate, or frequency band in a way that optimizes channel quality and interference levels. the name reactive comes from the fact that a station changes its parameters as a reaction to the changes in the wireless environment. although these schemes are simple, with low software and hardware complexity, their application is limited to some simple scenarios.

proactive spectrum coordination schemes are slightly more complex than the reactive ones. an example of proactive schemes is the spectrum etiquette protocol [19].
this scheme employs a distributed coordination by means of either internet services or a separate coordination radio channel reserved for this purpose within the frequency band common to all participating radio nodes. these schemes enable radio nodes using different radio access technologies to coordinate their activities and adjust transmit parameters for successful joint operation. the etiquette approach is capable of operating in more complex scenarios than the reactive schemes. the common spectrum coordination channel (cscc) variant of the etiquette approach is given in [19], [20] together with the demonstration of proof-of-concept experiments for co-existing ieee 802.11b/g and bluetooth networks in the shared 2.4 ghz unlicensed band. with the coordination approach, only minor modifications of the existing standards are needed. however, the best solution would be to use coordination together with lte-u or laa.

having in mind the analyzed literature, it may be noticed that there is a lack of experimental results for the scenario of lte and wifi networks co-existence in the 5 ghz band. this paper gives the experimental data regarding the interference caused by lte towards wifi in the 5 ghz unlicensed band. unmodified versions of the existing standards are used: 802.11a for wifi, and 3gpp release-10 for lte. since there is no commercial lte hardware available that operates in any unlicensed band, we used a software radio based lte implementation named openairinterface (oai) [21]. oai is also meant to be used in the licensed bands, so we had to modify the source code to allow usage in the 5 ghz unlicensed band. the experimentation is performed at the nitos testbed [22].

the rest of the paper is organized as follows. section 2 briefly describes the openairinterface as well as the nitos testbed.
the experiment description and the experimental results with discussion are given in section 3 (subsections 3.1 and 3.2, respectively). finally, the concluding remarks are presented in section 4.

2. openairinterface and nitos testbed

the openairinterface lte implementation represents the full real-time software implementation of 4th generation mobile cellular systems compliant with 3gpp lte standards release-8/10. oai is implemented in gnu-c and uses x86 single instruction, multiple data (simd) hardware acceleration. it is primarily targeted for the x86 real time application interface (rtai), but can be made to run on any gnu environment. oai implements both the lte enb, i.e. the lte base station, and the lte user equipment (ue), i.e. the lte mobile station. it supports both frequency-division duplexing (fdd) and time-division duplexing (tdd) configurations in 5, 10, and 20 mhz channel bandwidth. oai is designed to work with any hardware rf platform with minimal modifications. currently, two platforms are supported: eurecom exmimo2 [23], and universal software radio peripheral (usrp) x- and b-series [24]. in our experiments, we used the usrp b210. besides the usrp, an intel core i5 or i7 based pc with a usb 3.0 port is needed.

the experiment will be performed at the nitos testbed. the nitos testbed consists of several experimentation environments: outdoor, indoor rf isolated, and office testbeds, to meet different experimentation scenarios (fig. 1).

[fig. 1 nitos testbed block diagram; blocks: users, internet, nitos server, openflow switch, outdoor testbed, indoor rf isolated testbed, office testbed]

the experiments were executed at the indoor rf isolated testbed because it is the only testbed currently equipped with the usrp b210. it consists of 4 × 11 nodes arranged in a grid (11 rows with 4 nodes each), as shown in fig. 2. the distance between the neighboring nodes is 1 m.
[fig. 2 indoor rf isolated testbed topology: nodes 50 to 93]

the nodes are numbered from 50 to 93 because the first 49 nodes are in the outdoor and office testbeds. each node consists of a pc with different rf devices attached, such as wifi, usrp, bluetooth, and lte. after the reservation of a time slot, each node may be accessed online by the user and any software may be executed.

3. experiment setup and results

3.1. experiment description

the topology of the experiment setup is shown in fig. 3. nodes 50 and 68 create an ad-hoc 802.11a wifi network. the wireless network adapters are qualcomm atheros ar9580 (rev 01). due to the regulatory domain of the wifi cards, the available channels in the 5 ghz frequency band are 36, 40, 44, and 48. channel 48, with a central frequency of 5.24 ghz, was chosen. the output power of the wifi adapters was set to 0 or 10 dbm in order to make it less than or equal to the output power of the usrp devices. the transmission control protocol (tcp) throughput between these two stations is generated and measured using the iperf v2 [25] application during 60 seconds, without parallel streams. the lte enb and lte ue are run on nodes 59 and 60, respectively, using the oai software. it may be noticed that the lte nodes are close to each other. that is because oai is still in the development phase and the link quality between enb and ue is not very good. currently, eurecom is paying the most attention to the development of the oai enb in order to make it work correctly with different commercial lte devices, such as mobile phones.

[fig. 3 the experiment setup topology: wifi network (wifi nodes 50 and 68) and lte network (oai nodes 59 and 60)]

the lte channel width may be configured using the number of resource blocks (nrb) parameter.
possible channel widths are 1.4, 3, 5, 10, 15 and 20 mhz for nrb = 6, 15, 25, 50, 75 and 100, respectively. oai is configured to work in fdd mode with 5 mhz channel bandwidth, i.e. the number of resource blocks is set to 25, because oai works best with the 5 mhz channel width. the downlink frequency is set equal to the channel 48 central frequency, 5.24 ghz, and the uplink frequency offset is set to -100 mhz, i.e. the uplink frequency is 5.14 ghz. the throughput and the round-trip time (rtt) between the wifi stations are constantly measured while the lte traffic is varied. again, iperf is used, now to generate user datagram protocol (udp) traffic in the downlink of the lte network.

it should be noted that paper [10] and this paper consider a similar topic. however, the results in this paper may not be compared to those obtained in [10]. namely, paper [10] analyzes the influence of the oai enb (without ues) on the wifi transmission in the 2.4 ghz band. the wifi stations are located at the same testbed node, with 25 cm distance between the antennas. the distance between the oai enb and wifi was varied from 1 to 20 m. since we did not have physical access to the nitos testbed, we could not put two wifi cards on one node. also, we could not move usrps to different nodes, and therefore could not change the distance between the lte and wifi stations.

3.2. experimental results

this section presents some experimental results that show the influence of lte on the wifi network, based on the scenario described in the previous section. fig. 4 shows the wifi throughput over time for different lte traffic intensity: no lte network present, only the lte enb generating light load with control signals, and 1 mb/s and 10 mb/s of downlink lte traffic. the usrp b210 output power is around 10 dbm, so the wifi output power was chosen to be equal to that of the usrp (10 dbm) and 10 db lower (0 dbm). it may be noticed that the higher the lte throughput, the lower the wifi throughput is.
that is because wifi senses the lte transmission and postpones its own transmission. on the other hand, lte does not use carrier sensing and it transmits continuously. the wifi transmit power has almost no influence on the wifi throughput (curves a, c, and d), except for the case of light lte traffic with only the enb (curve b), because stronger wifi packets are more likely to reach the destination, even if they are hit by the lte signal during the transmission.

[fig. 4 wifi throughput [mb/s] over time t [s] for different lte traffic intensity: a) no lte, b) only lte enb, c) 1 mb/s, d) 10 mb/s; curves for wifi power 10 dbm and 0 dbm]

besides the throughput, the transmission delay is also an important parameter of a communication network. the round-trip time, i.e. the time needed for a packet to travel from source to destination and back to source, is shown for the wifi network in fig. 5. it is measured using the ping application, which sends internet control message protocol (icmp) echo request packets and waits for icmp echo response packets. the rtt is considered for different lte traffic intensity and for different icmp packet sizes: 100, 1000, and 10000 bytes. fig. 5 shows the average value and the standard deviation of the rtt. the conclusion from fig. 4 may be applied here: higher lte throughput increases both the average value and the standard deviation of the rtt. the average value increases significantly for 10 mb/s lte throughput. on the other hand, the rtt standard deviation increases approximately exponentially with the increase of the lte throughput.

[fig. 5 wifi network rtt [ms] as a function of lte throughput [b/s] (no lte, 0, 1m, 10m), for different values of packet size (100, 1000 and 10000 bytes); average value and standard deviation]

[fig. 6 wifi network average rtt [ms] as a function of lte throughput [b/s], for different values of the frequency offset Δf between the wifi and lte carrier frequencies (Δf = 0, 5 and 10 mhz), and wifi packet size a) 100 bytes, b) 1000 bytes, c) 10000 bytes]

finally, fig. 6 analyzes the influence of the carrier frequency offset Δf between the wifi channel central frequency (fwifi) and the lte downlink frequency (flte).

[fig. 7 mutual position of the wifi (solid line) and lte (dashed line) spectra (spectrum magnitude versus frequency) for different carrier frequency offset: a) 0 mhz, b) 5 mhz, c) 10 mhz]

we should have in mind that wifi occupies 20 mhz of bandwidth (fwifi ± 10 mhz), and lte occupies 5 mhz of bandwidth (flte ± 2.5 mhz, because nrb is chosen to be 25), as shown in fig. 7. as can be seen from fig. 7, for 0 and 5 mhz offset, the whole lte spectrum overlaps with the wifi spectrum and 25% of the wifi channel is occupied by lte. note that the lte carrier frequency lies within the wifi channel. for 10 mhz offset, a half of the lte spectrum (2.5 mhz) overlaps with the wifi spectrum, and the lte carrier frequency is on the edge, or practically out of the wifi channel. the results show that the higher the offset, the lower the influence of lte on the wifi network. if the offset is 10 mhz, lte has very little influence on the wifi network. figs. 6 and 7 show that the lte carrier itself is the main cause of the interference.

4. conclusion

the influence of the lte on the wifi network, sharing the same 5 ghz frequency range without coordination, is considered in this paper.
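the overlap figures quoted in the discussion of fig. 7 can be reproduced with a short calculation. the python sketch below assumes idealized rectangular spectra (20 mhz for wifi, 5 mhz for lte) and measures frequencies relative to the wifi channel centre; the function name and its arguments are illustrative and are not part of the measurement setup.

```python
def overlap_mhz(offset_mhz: float,
                wifi_bw_mhz: float = 20.0,
                lte_bw_mhz: float = 5.0) -> float:
    """Width (MHz) of the band shared by two idealized rectangular spectra.

    offset_mhz is the LTE carrier frequency minus the WiFi centre frequency.
    """
    wifi_lo, wifi_hi = -wifi_bw_mhz / 2.0, wifi_bw_mhz / 2.0
    lte_lo = offset_mhz - lte_bw_mhz / 2.0
    lte_hi = offset_mhz + lte_bw_mhz / 2.0
    # Intersection of the two intervals; zero when they do not touch.
    return max(0.0, min(wifi_hi, lte_hi) - max(wifi_lo, lte_lo))


for offset in (0.0, 5.0, 10.0):
    shared = overlap_mhz(offset)
    print(f"offset {offset:4.1f} MHz: {shared:.1f} MHz shared, "
          f"{100.0 * shared / 20.0:.1f}% of the WiFi channel")
```

with these assumptions the shared band is 5 mhz (25% of the wifi channel) for 0 and 5 mhz offset, and 2.5 mhz (half of the lte spectrum) for 10 mhz offset, in line with the text.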
the results show that the higher the lte throughput, the lower the wifi throughput. the lte similarly influences the round-trip time of the wifi network packets. the influence is the highest if the lte downlink frequency is equal to the wifi channel central frequency. if the difference between these two frequencies is higher, the influence is lower. having in mind the presented results, it can be concluded that the coordination between the lte and wifi networks is very important, and it will be the topic of our future research. we are currently developing spectrum coordination based on an ontological framework. the coordination process will be centralized on one coordination server. it will communicate with wifi and lte clients and provide them with all the parameters needed for the successful co-existence in a shared frequency band.

acknowledgement: the authors thank the anonymous reviewers for their valuable suggestions and comments. the research leading to these results has received funding from the european union's seventh framework programme under grant agreement no 612050 (flex project) and from the european union's horizon 2020 research and innovation programme under grant agreement no. 687860 (softfire project).

references
[1] qualcomm, extending the benefits of lte advanced to unlicensed spectrum, http://www.qualcomm.com/media/documents/files/extending-the-benefits-of-lte-advanced-to-unlicensed-spectrum.pdf; 2014 [accessed 29.08.16].
[2] a. prijić, lj. vračar, d. vučković, d. danković, z. prijić, "practical aspects of cellular m2m systems design", facta universitatis, series: electronics and energetics, vol. 28, pp. 541-556, december 2015.
[3] l. polak, o. kaller, l. klozar, j. sebesta, t. kratochvil, “mobile communication networks and digital television broadcasting systems in the same frequency bands: advanced co-existence scenarios”, radioengineering, vol. 23, pp. 375–386, april 2014.
[4] 3gpp.
lte in unlicensed spectrum, http://www.3gpp.org/news-events/3gpp-news/1603-lte_in_unlicensed; 2014 [accessed 29.08.16].
[5] 3gpp. 3gpp release 8, http://www.3gpp.org/specifications/releases/72-release-8; 2014 [accessed 29.08.16].
[6] 3gpp. 3gpp release 10, http://www.3gpp.org/specifications/releases/70-release-10; 2014 [accessed 29.08.16].
[7] r. deka, s. chakraborty, j. s. roy, "optimization of spectrum sensing in cognitive radio using genetic algorithm", facta universitatis, series: electronics and energetics, vol. 25, pp. 235-243, december 2012.
[8] j. jeon, h. niu, qc li, a. papathanassiou, g. wu, “lte in the unlicensed spectrum: evaluating coexistence mechanisms”, in the proceedings of the ieee globecom work. gc wkshps 2014, 2014, austin, tx (usa), pp. 740–745.
[9] a. babaei, j. andreoli-fang, y. pang, b. hamzeh, “on the impact of lte-u on wi-fi performance”, int j wirel inf networks, vol. 22, pp. 336–344, december 2015.
[10] s. sagari, s. baysting, d. saha, i. seskar, w. trappe, d. raychaudhuri, “coordinated dynamic spectrum management of lte-u and wi-fi networks”, in the proceedings of the ieee int. symp. dyn. spectr. access networks, dyspan 2015, stockholm, sweden, 2015, pp. 209–220.
[11] lte-u forum, http://www.lteuforum.org; [accessed 19.08.16].
[12] 3gpp, 3gpp release 13, http://www.3gpp.org/release-13; 2015 [accessed 29.08.16].
[13] 3gpp, rp-151045: new work item on licensed-assisted access to unlicensed spectrum, http://www.3gpp.org/ftp/tsg_ran/tsg_ran/tsgr_68/docs/rp-151045.zip; 2015 [accessed 29.08.16].
[14] r. ratasuk, n. mangalvedhe, a. ghosh, “lte in unlicensed spectrum using licensed-assisted access”, in proceedings of the ieee globecom work. gc wkshps, austin, tx, usa, 2014, pp. 746–751.
[15] y. li, j. zheng, q. li, “enhanced listen-before-talk scheme for frequency reuse of licensed-assisted access using lte”, in proceedings of the ieee int. symp. pers. indoor mob. radio commun. pimrc, hong kong, china, 2015, pp. 1918–1923.
[16] d. raychaudhuri, x. jing, i. seskar, k. le, j. b. evans, “cognitive radio technology: from distributed spectrum coordination to adaptive network collaboration”, pervasive mob comput, vol. 4, pp. 278–302, june 2007.
[17] k. challapali, s. mangold, z. zhong, “spectrum agile radio: detecting spectrum opportunities”, in the proceedings of the intern. symp. adv. radio technol, boulder, co, usa, 2004, pp. 61–65.
[18] x. jing, s. c. mau, d. raychaudhuri, r. matyas, “reactive cognitive radio algorithms for co-existence between ieee 802.11b and 802.16a networks”, in proceedings of the globecom ieee glob. telecommun. conf., st. louis, mo, usa, vol. 5, 2005, pp. 2465–2469.
[19] d. raychaudhuri, x. jing, “a spectrum etiquette protocol for efficient coordination of radio devices in unlicensed bands”, in proceedings of the ieee int. symp. pers. indoor mob. radio commun. pimrc, beijing, china, vol. 1, 2003, pp. 172–176.
[20] x. jing, d. raychaudhuri, “spectrum co-existence of ieee 802.11b and 802.16a networks using reactive and proactive etiquette policies”, mob networks appl, vol. 11, pp. 539–554, august 2006.
[21] openairinterface software alliance, openairinterface, http://www.openairinterface.org/; 2015 [accessed 29.08.16].
[22] nitlab, nitos, http://nitos.inf.uth.gr/; [accessed 29.08.16].
[23] eurecom, expressmimo2, https://twiki.eurecom.fr/twiki/bin/view/openairinterface/expressmimo2; [accessed 29.08.16].
[24] ettus, usrp x and b series, https://www.ettus.com/; [accessed 29.08.16].
[25] iperf, https://iperf.fr/; [accessed 29.08.16].

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 235-249 doi: 10.2298/fuee1402235s

execution time – area tradeoff in ga using residual load decoder: integrated exploration of chaining based schedule and allocation in hls for hardware accelerators

anirban sengupta 1, reza sedaghat 2, vipul kumar mishra 1
1 computer science and engineering, indian institute of technology, indore, india
2 electrical and computer engineering, ryerson university, toronto, canada

abstract. design space exploration is an indispensable segment of high level synthesis (hls) design of hardware accelerators. this paper presents a novel technique for area-execution time tradeoff using residual load decoding heuristics in genetic algorithms (ga) for integrated design space exploration (dse) of scheduling and allocation. this approach is also able to resolve issues encountered during dse of data paths for hardware accelerators, such as the accuracy of the solution found and the total exploration time. the integrated solution found by the proposed approach satisfies the user specified constraints of hardware area and total execution time (not just latency), while at the same time offering a twofold unified solution of chaining based schedule and allocation.
the cost function proposed in the genetic algorithm approach takes into account the functional units, multiplexers and demultiplexers needed during implementation. the proposed exploration system (expsys) was tested on a large number of benchmarks drawn from the literature for assessment of its efficiency. results indicate an average improvement in quality of results (qor) greater than 26 % when compared to a recent well known ga based exploration method.

key words: area; high level synthesis; exploration; scheduling; chaining; execution

received january 27, 2014
corresponding author: reza sedaghat, electrical and computer engineering, ryerson university, toronto, canada (e-mail: rsedagha@ee.ryerson.ca)

1. introduction

as the complexity of very large scale integration (vlsi) designs increases, the design of application specific integrated circuits (asic) should be addressed at higher levels of abstraction in order to meet the growing challenges. of late there has been a major shift among all well-known electronic design automation (eda) vendors from traditional register transfer level (rtl) designs to high level synthesis. however, for comprehensive high level system designs, efficient design space exploration techniques are required during hls that can concurrently meet the user specified constraints of hardware area and execution time. furthermore, design space exploration should also be able to concurrently resolve the orthogonal issues encountered during dse, such as minimizing the time of the exploration process and maximizing the precision required. hence, the tremendous advancement of highly complex digital vlsi circuits in the current generation of portable devices and other electronic products has mainly become possible owing to the efficient design techniques developed so far [1].

the process of hls can be broadly classified into three phases.
the first phase involves the conversion of the algorithm into a data flow graph (dfg). the second phase includes scheduling, which assigns operations to the appropriate control steps. allocation, the third phase in high level synthesis, is the data-path synthesis that allocates hardware resources such as registers and busses, and binds the operations of the dfg to functional units [1]. the hls phase consists of interdependent tasks such as scheduling and allocation. scheduling is the process of assigning the operations to specific control steps, while resource allocation refers to the assignment of the functional units that perform the operations and of the multiplexers and demultiplexers that switch between different inputs and outputs. however, solving the integrated scheduling and allocation problem by exhaustive analysis is computationally prohibitive [1].

2. related work

the problem of design space exploration was addressed in [2], where the authors proposed the use of a genetic algorithm in the binding and allocation phase of high level synthesis. this method involves crossover dependence on the force directed data path binding completion algorithm. one of the problems with [2] is that the method accepts a scheduled data flow graph as an input, which means that the approach does not resolve the scheduling problem. the authors in [3] have also proposed a genetic algorithm for time constrained scheduling. the chromosome is encoded with the permutation of operations, which a list decoder converts into a valid schedule. however, the approach does not handle chaining and execution time optimization. in addition, the authors in [4] have proposed a problem space genetic algorithm for design space exploration of data paths. the authors have used the concept of a heuristic/problem pair to convert a data flow graph into a valid schedule. the chromosome is encoded based on the 'work remaining' value of each node.
one of the problems with approach [4] is that the second special parent chromosome, built in correspondence with the minimum functional units (i.e. serial implementation), does not differ from the first special chromosome in the work remaining field. this may not always lead to the optimal solution. further, the cost function considers only latency and not total execution time. the problem of design space exploration was also addressed in [5] by suggesting order of efficiency, which assists in deciding preferences amongst the different pareto optimal points. research in [6] suggested that identification of a few superior design points from the pareto set suffices for an excellent design process. evolutionary algorithms such as the genetic algorithm (ga) have been suggested in [7] to yield better results for the design space exploration process. the use of ga has also been suggested as a framework for dse of data paths in high level synthesis in [8]. the authors of this approach have proposed a priority order based chromosome for the data schedules and an independent chromosome for the functional units. their work uses the robust search capabilities of the genetic algorithm for scheduling and allocation of the datapath, with the aim of finding a solution for both module selection and scheduling. one of the drawbacks of [8] is that the approach does not consider resource binding. thus, the proposed cost function does not reflect the multiplexer and demultiplexer resources. furthermore, like other ga design space exploration approaches, [8] only considers optimization of latency and area. another approach, introduced by researchers in [1], was also based on pareto optimal analysis. according to their work, the design space was arranged in the form of an architecture vector design space for architecture variant analysis and optimization of performance parameters.
though the results proved promising, the approach was unable to handle chaining based scheduling. furthermore, in [9] and [10] the authors described another approach to dse in high level systems based on binary encoding of the chromosomes. the work in [11] used an evolutionary algorithm for successful evaluation of the design for an application specific soc. approaches [9]-[11] only considered traditional latency and not the execution time constraint for data pipelining. the work shown in [12] discusses the optimization of area, delay and power in behavioral synthesis, but does not focus on high level design space exploration using a multi chromosomal genetic algorithm, nor does it consider execution time during data pipelining. furthermore, the authors in [13] introduce a tool called systemcodesigner that offers rapid design space exploration with rapid prototyping of behavioral systemc models. automated integration was developed by integrating behavioral synthesis into their design flow, while the authors in [14] describe current state-of-the-art high-level synthesis techniques for dynamically reconfigurable systems. additionally, the authors in [15]-[17] also used genetic algorithms for scheduling and resource allocation for data path synthesis. another class of scheduling methods employed previously was probabilistic in nature. for example, the simulated annealing (sa) and simulated evolution (se) based scheduling techniques have been used for the high level synthesis problem. the authors in [18], [19] have proposed a simulated annealing scheduling method called 'salsa', which uses many probabilistic search operators to enhance the performance of the sa-based technique for the high level synthesis problem. moreover, the authors have also proposed an extended binding model for handling the scheduling problem in high level synthesis. furthermore, the authors in [20] also used sa for the scheduling problem with simultaneous minimization of registers and function units.
se has been proposed by the authors in [21] for solving the combined problem of scheduling and resource allocation in high level synthesis. all aforementioned approaches [15]-[21], however, do not consider execution time, chaining and data pipelining. in contrast to the proposed approach, [15]-[17] do not incorporate a special seeding process based on serial and parallel implementation in order to efficiently guide the ga to an optimal/near-optimal solution. other previously proposed approaches [22], [23] are based on integer linear programming (ilp). here, the computational complexity is massive: although these approaches are able to provide good results, they consume an enormous amount of time. furthermore, the concept of data pipelining based on execution time was not shown during system trade-off. constructive approaches [24]-[27] are very straightforward to implement but suffer from the major drawback of leading to poor quality solutions owing to their greedy nature.

3. the proposed approach for genetic algorithm based exploration system (expsys)

the approach proposed in this paper for finding the optimal integrated scheduling, allocation, binding and module selection employs a special multi chromosomal compound chromosome structure that can search the design space efficiently. it provides an integrated solution to the problem of scheduling, allocation and binding by yielding a set of hardware resources that contains the details of functional units (e.g. number and kind). further, this solution reduces the cost function based on constraints provided for hardware area (consisting of function units, multiplexers, demultiplexers) and execution time (considering latency, cycle time and number of sets of data to be executed).
in order to reduce the final cost, the module selection indicates the optimal number of resources needed of each kind, as well as the right version of a specific resource needed from the module library during implementation. the expsys has been developed with a new chromosome encoding technique that consists of separate chromosome structures for each of the resources, rather than the traditional method consisting of a single chromosome structure to represent all the resources. moreover, the proposed approach also includes an independent chromosome representation of the module allocation fields.

3.1 the expsys overview

the input to the ga framework is the behavioral description of the dataflow graph (dfg), or the high level description of the algorithm in c language, that describes the behavior of the application. in addition to the behavioral description of the application, the input to the ga framework also includes the set of user specified design constraints for hardware area and execution time (with the user specified weight factors for the hardware area-execution time tradeoff), control parameters for the genetic algorithm, and the module library, which contains three kinds of information, viz. maximum resources available, clock cycles and area. the proposed framework comprises two basic units. the first unit is the proposed heuristic that acts as an input to the skeleton for the genetic algorithm. the second unit processes the information provided by the first unit to produce a final integrated scheduling, allocation and module selection solution. the proposed skeleton (algorithm) for the genetic algorithm is shown in fig. 1. it uses a new heuristic based on a residual load criterion that assigns a specific priority to each operation in the chromosome structure. the first parent (p1) chromosome of the nodal string (this string is defined later in section 3.2) is encoded based on the residual load (α) of each operation from the asap scheduling graph.
in contrast, each operation of the second parent (p2) nodal string is encoded based on the difference between the latency obtained by asap scheduling with maximum resources ($L_{asap}$) and the residual load (α) of that operation ($o_i$) obtained for the p1 chromosome. hence, the encoded value (β) of each operation ($o_i$) of the second parent chromosome is calculated using equation (1):

$$\beta(o_i) = L_{asap} - \alpha(o_i) \tag{1}$$

the rest of the parents of the population in the nodal string, encoded with residual load values, are obtained by random perturbation. the other parent chromosomes (p3…pn) obtained by the perturbation function should be individuals lying between the parent p1, derived from the schedule based on maximum resources, and the parent p2, derived based on minimum resources. this is logical because the optimal solution to the integrated problem lies somewhere between the maximum and the minimum resource configurations. the developed perturbation function, which yields the residual load values, is given in equation (2):

$$PF = \frac{\alpha + \beta}{2} \pm \mu \tag{2}$$

where 'µ' is a random value between 'α' and 'β'. the random value 'µ' is added to the perturbation function because, in order to have more diversity in the initial population, the residual load values for the rest of the parents (p3…pn) should differ (note: the residual load value determines the priority among nodes during the decoding process, so distinct residual load values are needed). moreover, greater diversity helps the search reach all the corners of the design space, thereby assisting in finding the optimal/near-optimal solution. ignoring 'µ' in the above function would encode the nodal string part of the rest of the parents (p3…pn) with the same residual load values, thereby reducing the diversity of the initial population.
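the encoding of the nodal string for p1, p2 and the remaining parents can be sketched as follows. this is only an illustration of equations (1) and (2), β = l_asap − α(oi) and (α + β)/2 ± µ; the function names, the sample residual loads and the final clamping step are assumptions made here (the clamp is one way to honour the statement that the perturbed values lie between those of p1 and p2), not details taken from the expsys implementation.

```python
import random


def encode_second_parent(alpha, l_asap):
    """Equation (1): beta(o_i) = L_asap - alpha(o_i) for every operation."""
    return [l_asap - a for a in alpha]


def perturb(alpha_i, beta_i, rng):
    """Equation (2): (alpha + beta) / 2 +/- mu, with mu drawn between alpha and beta."""
    low, high = min(alpha_i, beta_i), max(alpha_i, beta_i)
    mu = rng.uniform(low, high)
    value = (alpha_i + beta_i) / 2 + rng.choice((-1, 1)) * mu
    # Assumption (not in the text): clamp so the value stays between P1 and P2.
    return max(low, min(high, value))


def initial_nodal_population(alpha, l_asap, size, seed=0):
    """Build P1, P2 and (size - 2) perturbed parents for one nodal string."""
    rng = random.Random(seed)
    beta = encode_second_parent(alpha, l_asap)
    population = [list(alpha), beta]
    while len(population) < size:
        population.append([perturb(a, b, rng) for a, b in zip(alpha, beta)])
    return population


# Hypothetical residual loads (in cc) for four operations, with L_asap = 12cc.
population = initial_nodal_population([12, 8, 4, 2], l_asap=12, size=8)
print(population[1])  # second parent, from equation (1)
```

a fixed seed is used only to make the sketch reproducible; a real run would normally draw fresh random values.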
the function in equation (2) is used when encoding the values of the nodal string for the rest of the parents. on the other hand, the perturbation of the resource allocation string (this string is defined later in section 3.2) for the other parents is obtained by applying the perturbation rule listed after fig. 1.

algorithm
1) schedule the dfg using the asap algorithm and calculate the latency (l).
2) generation g = 1.
3) create the initial population by chromosome encoding with a priority list of nodes based on 'residual load', as follows:
a) encode the first parent (p1) of the nodal string using the residual load (α) based on the asap schedule. encode the first parent (p1) of the resource allocation string with maximum resources.
b) encode the second parent (p2) of the nodal string using the residual load (β) calculated as l asap – α (oi), based on minimum resources. encode the second parent (p2) of the resource allocation string with minimum resources.
c) create the rest of the parents (p3…pn) of the nodal string with residual load based on the perturbation function = (α + β)/2 ± µ, where 'µ' is a random value between 'α' and 'β'.
4) perform crossover with very high probability (pcross) among parents to create offspring.
5) decode the chromosomes using the proposed 'residual load heuristic' to find scheduling solutions by binding dfg operations to fus and allocating muxes and demuxes.
6) get information about the functional units (fu), such as versions, area occupied, clock cycles etc., from the module library.
7) calculate the global cost function and determine the fitness of each individual. the global cost function considers a) total area, which is a combination of i) area of fu, ii) area of mux and iii) area of demux; b) total execution time, which is a combination of i) latency, ii) cycle time and iii) number of sets of data.
8) perform mutation on the least fit nodal string chromosome and the resource allocation string chromosome with probability pm = 0.25.
mutation is performed once every generation.
9) decode the mutated chromosomes using the proposed 'residual load heuristic' to find scheduling solutions and then calculate the cost of the mutated chromosome again.
10) select the best population from the set of offspring and parents of this generation and take it forward to the next generation. increment g (g = g + 1) and repeat until g reaches the maximum generation.
11) end ga run.
fig. 1 the proposed skeleton for the expsys

perturbation rule for the resource allocation chromosome for the rest of the parents
1. randomly pick any two nodes (v1, v2) from the chromosome that represents the resource allocation.
2. randomly select any integer value (i) ranging between or equal to 'α' and 'β' for that specific operation (node). hence, α <= i <= β.

once the parents for the initial population are formed, direct crossover is applied. crossover results in the creation of offspring in that generation. for every mating between two parents, two offspring can be created. if, for example, the number of parents in the population is 8, then 16 offspring will be produced. therefore, the total population of the first generation is 24. the next task is to decode the generated individuals of the first generation by applying a new 'residual load heuristic' that always results in a valid schedule. during the formation of the schedule solution, the data dependency is strictly followed before any operation is selected for scheduling. the global cost function is then determined in order to judge the fitness of each individual solution. the least fit individual is mutated in the hope of obtaining a better solution. after mutation, the mutated chromosome is again decoded and its fitness is evaluated. the best fit individuals from this first generation are then forwarded to the next generation. this process continues until the specified maximum generation g(max) is reached.
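the perturbation rule for the resource allocation string can be illustrated with a short sketch. the maximum and minimum allocations below are the p1 and p2 values quoted in section 3.2; the function name, the dictionary layout and the choice to leave the unpicked resource types at their p1 value are assumptions made for this example, since the text does not state what the untouched positions hold.

```python
import random


def perturb_allocation(p1_alloc, p2_alloc, rng):
    """Pick two resource types at random and redraw their counts within [P2, P1]."""
    child = dict(p1_alloc)  # assumption: untouched types keep the P1 value
    for fu in rng.sample(sorted(child), 2):
        child[fu] = rng.randint(p2_alloc[fu], p1_alloc[fu])
    return child


# P1 (maximum) and P2 (minimum) resource allocations: multipliers, adders,
# subtractors and comparators.
p1 = {"m": 3, "a": 3, "s": 2, "c": 1}
p2 = {"m": 1, "a": 1, "s": 1, "c": 1}
rng = random.Random(7)

# Eight parents in total: P1, P2 and six perturbed allocations (P3...P8).
parents = [p1, p2] + [perturb_allocation(p1, p2, rng) for _ in range(6)]
for parent in parents:
    print(parent)
```

with 8 parents and two offspring per mating, the 16 offspring mentioned above bring the first-generation population to 24 individuals.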
3.2 chromosome representation

suitable encoding of the problem dictates the capability of the genetic algorithm to find optimal or near-optimal solutions. the proposed approach uses a multi chromosome structure consisting of independent strings to separately represent the priority of the nodes of the dfg for each fu type and the resource allocation information. the approach is called multi chromosomal because each fu (resource) is represented as an independent substring in the nodal string structure. it has two independent strings to separately represent the nodes of the dfg (called the 'nodal string') and the resource allocation (called the 'resource allocation string'). the 'nodal string' contains the residual load values of each node, which determine the priority of the nodes during scheduling. the 'residual load heuristic' is used when decoding the nodal string in order to obtain a valid scheduling solution. the 'resource allocation string' contains a list of integers, which indicate the maximum number of resources allowed during scheduling. the resource allocation string contains a substring with integers to represent the maximum number of functional units of each type available for scheduling in every time step of the schedule. this encoding scheme for both the resource allocation string and the nodal string ensures that the genetic algorithm always produces a valid schedule and can reach all the corners of the design space to explore the integrated solution of scheduling, allocation and binding. the encoding scheme for the 'nodal string' and the 'resource allocation string' is shown with the example of the 'differential equation solver' benchmark. for clarity, small delay values in clock cycles (cc) are used in this demonstration; real values were used during experimentation. the schedule of the dfg of the differential equation solver using asap is shown in fig. 2.
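asap scheduling is not spelled out in the text. as a minimal sketch, assuming the usual definition (each operation starts as soon as all of its predecessors have finished, with unlimited resources), the start times and the latency can be computed as below. the five-node graph is a hypothetical fragment shaped like a multiplier/subtractor chain, not the dfg of fig. 2, and it uses the demonstration delays of 4cc for multipliers and 2cc for adders/subtractors.

```python
def asap_schedule(predecessors, delay):
    """Return (start times, latency) for an ASAP schedule with unlimited resources."""
    start = {}

    def start_time(node):
        # A node starts when its latest predecessor has finished.
        if node not in start:
            start[node] = max(
                (start_time(p) + delay[p] for p in predecessors.get(node, [])),
                default=0,
            )
        return start[node]

    for node in delay:
        start_time(node)
    latency = max(start[n] + delay[n] for n in delay)
    return start, latency


# Hypothetical DFG: nodes 1-3 are multipliers (4cc), nodes 4-5 subtractors (2cc).
delay = {1: 4, 2: 4, 3: 4, 4: 2, 5: 2}
predecessors = {3: [1, 2], 4: [3], 5: [4]}

start, latency = asap_schedule(predecessors, delay)
print(start, latency)
```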
the latency (l) obtained is 12cc (note: this assumes that multipliers and adders/subtractors take 4cc and 2cc respectively). the corresponding chromosome encoding for the first parent (p1) of the nodal string is shown in fig. 3(a). the total residual load of each operation (node) is obtained by summation of the residual load of the successor operations following that node, e.g. for node 1 the residual load is (4+4+2+2) cc = 12cc. the second parent (p2) chromosome is encoded based on the residual load values obtained using equation (1). the second parent (p2) chromosome encoding is shown in fig. 3(b). the rest of the parents of the initial population are obtained using equation (2), which is a perturbation function used to encode the residual load values. the residual load values for the rest of the parents always lie between the values of the first parent and the second parent. this scheme has been developed because the optimal solution to the problem should always lie between the serial and the maximally parallel implementation [4]. on the other hand, the first parent (p1) shown in fig. 3(a) and the second parent (p2) of the resource allocation string shown in fig. 3(b) are based on the user specified maximum and minimum resources respectively. for example, the first parent (p1) of the resource allocation string shown in fig. 3(a) consists of three multipliers, three adders, two subtractors and one comparator. additionally, the second parent (p2) of the resource allocation string shown in fig. 3(b) consists of one multiplier, one adder, one subtractor, and one comparator. the rest of the parents (p3…p8) of the 'resource allocation string' are obtained using the algorithm in fig 3.
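the residual load example for node 1, (4+4+2+2) cc = 12cc, can be reproduced with a small sketch. it assumes one plausible reading of the definition: the residual load of a node is its own delay plus the delays along its longest chain of successors. the four-node chain below is hypothetical (multiplier, multiplier, subtractor, subtractor) and is chosen only because it matches the arithmetic of the example; it is not the full dfg of fig. 2.

```python
def residual_load(node, successors, delay, memo=None):
    """Own delay plus the largest residual load among the node's successors."""
    if memo is None:
        memo = {}
    if node not in memo:
        tail = max(
            (residual_load(s, successors, delay, memo)
             for s in successors.get(node, [])),
            default=0,
        )
        memo[node] = delay[node] + tail
    return memo[node]


# Hypothetical chain: node 1 (mul) -> node 3 (mul) -> node 4 (sub) -> node 5 (sub).
delay = {1: 4, 3: 4, 4: 2, 5: 2}
successors = {1: [3], 3: [4], 4: [5]}

loads = {n: residual_load(n, successors, delay) for n in delay}
print(loads)
```

under this reading, a node with a larger residual load has more work remaining behind it and therefore receives a higher priority during decoding.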
the 'resource allocation string' for the rest of the parents of the initial population is also encoded with multiplier, adder, subtractor, and comparator options (note: 'm', 'a', 's', 'c' refer to multipliers, adders, subtractors, and comparators respectively in the resource allocation string). thus, the final solution found by the proposed expsys indicates the final combination of multipliers, adders, subtractors, and comparators needed to implement the problem based on the user specified hardware area and execution time constraints. the nodal string and the resource allocation string for the rest of the parents are shown in fig. 4(a) and fig. 4(b) respectively. for example, in the case of fig. 4(a), the encoding of the third parent for the resource allocation string is obtained by first randomly picking any two nodes, m (multiplier) and a (adder), and then randomly selecting any integer value between '3' and '1' for m and between '3' and '1' for a. the randomly selected value for both m and a is '2'. similarly, the rest of the parent chromosomes can be built by perturbation. this type of perturbation for the 'resource allocation string', together with the perturbation function for the 'nodal string' described before, aids in searching all the possible combinations of the design space so that the ga can reach an optimal or near-optimal solution.

fig. 2 scheduling of differential equation solver using asap
fig. 3 chromosome encoding for the first parent (a) and second parent (b)
fig. 4 chromosome encoding for the third parent (a) and fourth parent (b)
fig. 5 crossover between p1 and p2

3.3 crossover technique

crossover is a technique for producing offspring when two parents mate. the parents are selected by a binary tournament selection method [28]. in this work, we propose the independent direct crossover of the two independent strings viz.
nodal string and resource allocation string to produce separate offspring for each with a very high crossover probability (pcross = 1.0). furthermore, the direct crossover is applied to each substructure of the nodal string structure. for example, direct crossover is independently applied to the adder substring, multiplier substring, subtractor substring, etc. of each nodal string as well as of the resource allocation string. since the nodal string encodes the residual load of each operation for a particular fu, the crossover results in crossing only the residual load values. hence the precedence relationship among the operations is not violated.

3.3.1 multi-point crossover of the nodal string

before the crossover scheme can be applied to the nodal strings, the two parents are randomly divided into two halves at point n. the crossover point selected during crossing is entirely random. because the nodal string is encoded with the residual load values of the nodes and the crossover operation only crosses these values, choosing a random cut point for crossover does not disturb the precedence relationship among the nodes. only a random cut point has been used in the proposed work, as this technique has been widely used by other approaches and has provided efficient results. the proposed crossover is called multi-point because each substring of the nodal string representing independent fus is divided at a different point. for example, applying the direct crossover operator to the nodal string between the first parent (fig. 3(a)) and the second parent (fig. 3(b)) at point 2 for the multiplier and point 1 for the adder and subtractor yields offspring 1 and offspring 2 respectively. offspring 1 inherits all the properties of the first half from the first parent, while the second half of the offspring is inherited from the second parent. the properties that are inherited from the parents are the residual load values and their corresponding node numbers (operations).
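the direct crossover applied independently to each fu substring can be sketched as follows. the substring contents are hypothetical residual load values (the real ones are in fig. 3, which is not reproduced here), and the dictionary-per-fu layout and function names are assumptions of this example; the cut points 2 (multiplier) and 1 (adder, subtractor) follow the example in the text.

```python
def direct_crossover(parent_a, parent_b, cut):
    """One-point crossover: swap the tails of two equally long substrings."""
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


def multipoint_crossover(nodal_a, nodal_b, cuts):
    """Apply the direct crossover to every FU substring at its own cut point."""
    off_1, off_2 = {}, {}
    for fu in nodal_a:
        off_1[fu], off_2[fu] = direct_crossover(nodal_a[fu], nodal_b[fu], cuts[fu])
    return off_1, off_2


# Hypothetical nodal strings: position k in each substring always refers to
# the same DFG node, so only residual load values are exchanged.
nodal_p1 = {"mul": [12, 12, 8, 4], "add": [2, 2], "sub": [4, 2]}
nodal_p2 = {"mul": [0, 0, 4, 8], "add": [10, 10], "sub": [8, 10]}
cuts = {"mul": 2, "add": 1, "sub": 1}

offspring_1, offspring_2 = multipoint_crossover(nodal_p1, nodal_p2, cuts)
print(offspring_1)
print(offspring_2)
```

because node positions never move, any cut point yields a string that the decoder can still turn into a precedence-respecting schedule.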
the offspring 1 obtained after crossover between p1 and p2 is shown in fig. 5(a), while offspring 2 obtained after crossover between p2 and p1 is shown in fig. 5(b). similarly, the other offspring are obtained by crossing between the rest of the parents. for the sake of brevity, the rest of the offspring obtained have been omitted in this paper.

3.3.2 crossover of the resource allocation string

the resource allocation string is responsible for encoding the number of hardware functional units of each type available for scheduling operations in each time step. since the numbers of allocated functional units of each type are totally independent of each other, the 1-point crossover can be easily applied. for instance, in the case of the dfg for the differential equation solver benchmark, the two parents (p1 and p2) for the resource allocation string are shown in fig. 3(a) and 3(b) respectively. p1 represents a solution with three multipliers, three adders, two subtractors and one comparator, while p2 represents a solution with one multiplier, one adder, one subtractor and one comparator. application of the direct crossover at a random cut point between p1 and p2 yields offspring 1, while crossing between p2 and p1 yields offspring 2, as shown in fig. 5(b).

3.4 mutation operation

3.4.1 mutation operator of the nodal string

the mutation algorithm for the resource allocation string is adopted from [8] and is based on random increment or decrement, while the mutation for the nodal string is shown below:

algorithm
1. randomly pick any two nodes (vi, vj) from the nodal string [k].
2. swap the residual load values of the two selected nodes. if vi = li and vj = lj, then after the swap vi = lj and vj = li.

according to the algorithm, any two nodes (vi, vj) in the string (k) are randomly selected for mutation. next, the residual load values of the two selected nodes are swapped.
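the swap mutation of the nodal string is simple enough to sketch directly. the function name and the sample residual loads are hypothetical; the logic follows the two steps of the algorithm above.

```python
import random


def mutate_nodal_string(loads, rng):
    """Swap the residual load values of two randomly chosen nodes."""
    mutated = list(loads)
    i, j = rng.sample(range(len(mutated)), 2)  # two distinct positions
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


# Hypothetical residual loads (in cc) of one nodal substring.
original = [12, 8, 4, 2]
mutated = mutate_nodal_string(original, random.Random(0))
print(original, mutated)
```

the mutated string holds the same values as before, only attached to different nodes, which is what changes the scheduling priorities.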
for example, let the residual load values of the two selected nodes (vi) and (vj) be 'li' and 'lj' respectively. after mutation, the new residual load value of node (vi) is 'lj' and that of node (vj) is 'li'. this mutation technique drastically alters the residual load values, which act as the priority for selecting the operations for scheduling. as a result of this alteration, the new operation to be scheduled can strongly affect the scheduling cost.

3.5 decoding process (determination of a valid schedule)

the decoding of chromosomes always results in a valid scheduling solution, which strictly obeys the data dependency present between the operations. for the decoding process, a 'residual load heuristic' is proposed. the residual load heuristic is shown in fig. 6. for example, in the case of offspring 1, the resource allocation string and the nodal string are shown in fig. 5(a) and fig. 5(b) respectively. the resource allocation string of offspring 1 represents an allocation solution containing three multipliers, three adders, one subtractor, and one comparator. on the other hand, the priority of each operation for a particular type of fu is indicated by the residual load values in the nodal string (fig. 5(b)). therefore, for the dataflow graph shown in fig. 3, the scheduling solution of offspring 1 is shown in fig. 7. the resulting solution is a valid schedule, allocation and binding obtained for offspring 1. it provides an integrated solution to the concurrent problem of scheduling, allocation and binding.

3.6 global cost function and fitness evaluation methodology

the objective of the proposed approach is to simultaneously reduce the execution time required for a specific set of data and the total hardware area occupied. most of the previous approaches [2], [4], [7], [8] have only considered latency as a design constraint and not total execution time, which considers the latency, the cycle time and also the number of sets of data to be executed.
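as a rough sketch of the quantities that this section goes on to define in eqs. (3)-(5), the execution time, total area and global cost can be evaluated as below, assuming the forms texe = l + (n − 1)·tc, at = afu + (amux + ademux) and cg = w1·(texe − tcons)/tmax + w2·(at − acons)/amax. the latency of 14cc and the cycle time of 12cc are the values quoted for the solution of fig. 7; every other number (n, the areas, the constraints and the weights) is hypothetical.

```python
def execution_time(latency, cycle_time, n_sets):
    """Eq. (4): first result after `latency`, then one result every `cycle_time`."""
    return latency + (n_sets - 1) * cycle_time


def total_area(a_fu, a_mux, a_demux):
    """Eq. (5): functional units plus multiplexers and demultiplexers."""
    return a_fu + (a_mux + a_demux)


def global_cost(t_exe, t_cons, t_max, a_total, a_cons, a_max, w1, w2):
    """Eq. (3): weighted, normalised deviation from the time and area constraints."""
    return w1 * (t_exe - t_cons) / t_max + w2 * (a_total - a_cons) / a_max


# Latency 14cc and cycle time 12cc as quoted for fig. 7; N = 100 is hypothetical.
t_exe = execution_time(latency=14, cycle_time=12, n_sets=100)

# Hypothetical areas, constraints, per-generation maxima and weights.
a_total = total_area(a_fu=400, a_mux=60, a_demux=40)
cost = global_cost(t_exe, t_cons=1000, t_max=2000,
                   a_total=a_total, a_cons=400, a_max=1000, w1=0.5, w2=0.5)
print(t_exe, a_total, cost)
```

with n = 2 the same function returns 26cc, which agrees with the second output arriving after 26cc in the pipelining example of this section.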
in the presented approach, a comprehensive cost function has been developed that considers the total execution time, taking data pipelining into account, as well as the total hardware area. the decoding process strictly follows the 'residual load heuristic' and hence always results in a feasible solution. the cost function ($C_g$), which considers total execution time and area, is shown in eq. (3).

fig. 6 flow chart for residual load heuristic

fig. 7 chaining schedule and allocation to offspring 1 (decoded)

$$C_g = W_1 \frac{T_{exe} - T_{cons}}{T_{max}} + W_2 \frac{[A_{fu} + (A_{mux} + A_{demux})] - A_{cons}}{A_{max}} \qquad (3)$$

where:

$T_{exe}$ = total execution time taken for execution of the given sets of data, calculated using the function from [1] given in eq. (4):

$$T_{exe} = \{L + (N - 1)\, T_c\} \qquad (4)$$

$L$ = latency of the scheduling solution.
$T_c$ = cycle time of the scheduling solution. (note: the cycle time is the difference in clock cycles between any two consecutive outputs of pipelined data instances. this information is not readily available in the module library and is therefore not extracted from it. as an example, consider the cycle time calculation for the integrated solution in fig. 7: the output for the first set of data arrives after 14 cc, while the output for the second instance of data arrives after 26 cc. thus, due to pipelining, there is a cycle time of 12 cc, which results from the initiation interval. the cycle time during pipelining has therefore also been taken into account during the exploration process.)
$A_T$ = total area, calculated using eq. (5):

$$A_T = A_{fu} + (A_{mux} + A_{demux}) \qquad (5)$$

$N$ = number of sets of data to be executed.
$C_g$ = global cost of the integrated solution.
$T_{cons}$ = execution time specified by the user.
$T_{max}$ = maximum execution time taken by a solution during the specific generation (g).
$A_{fu}$ = total area of the functional units.
$A_{mux}$ = total area of the multiplexers used during implementation.
$A_{demux}$ = total area of the demultiplexers used during implementation.
$A_{cons}$ = area constraint specified by the user.
$A_{max}$ = maximum hardware area of a solution during the specific generation (g).
$W_1$ and $W_2$ = user specified preference of the constraints.

the cost function requires input from various sources to evaluate the fitness of each solution found. for the calculation of the execution time, the sources are: a) module library information, b) data extracted for the hardware implementation, c) the data flow graph and d) the scheduling solution found after decoding the chromosome (latency), together with the number of sets of data and the cycle time.

3.7 termination criterion for the genetic algorithm

the maximum number of generations has been kept constant for each benchmark run. although making the number of generations proportional to the problem size is more logical, settling on a single maximum number of generations for both small and large benchmarks is a good compromise. the experiments indicated that keeping the maximum generation g(max) at 100 works well for all tested benchmarks.

4. experimental results

various dsp benchmarks [29], [30] were tested and verified: digital filter, auto regressive filter (arf), discrete wavelet transformation (dwt), digital butterworth filter, band pass filter (bpf), elliptic wave filter (ewf), mpeg motion vectors, mesa matrix multiplication and jpeg down sample. the proposed approach has been implemented in java and run on an intel core i5-2450m processor at 2.5 ghz with 3 mb l3 cache memory and 4 gb ddr3 ram. expsys finds optimal/near-optimal results for all the benchmark applications.
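to make the cost model of section 3.6 concrete, the following python sketch evaluates the pipelined execution time, the total area and the global cost. it assumes the reading of eq. (3) in which the distance of each quantity from its user constraint is normalised by the generation maximum and weighted by w1 and w2; the function names and the values of t_max and a_max in the usage lines are hypothetical.

```python
def execution_time(latency, cycle_time, n_sets):
    """Eq. (4): T_exe = L + (N - 1) * Tc, all in clock cycles or time units."""
    return latency + (n_sets - 1) * cycle_time


def total_area(a_fu, a_mux, a_demux):
    """Eq. (5): A_T = A_fu + (A_mux + A_demux)."""
    return a_fu + (a_mux + a_demux)


def global_cost(t_exe, a_total, t_cons, a_cons, t_max, a_max, w1=0.5, w2=0.5):
    """Assumed form of eq. (3): weighted, normalised distance from the constraints.

    A negative value means both constraints are met with some margin.
    """
    return w1 * (t_exe - t_cons) / t_max + w2 * (a_total - a_cons) / a_max


# fig. 7 example: first output after 14 cc, cycle time 12 cc
print(execution_time(14, 12, 2))     # second output instance
print(execution_time(14, 12, 1000))  # 1000 pipelined data sets

# arf figures from table 1 with hypothetical generation maxima
print(global_cost(54281, 10934, t_cons=70000, a_cons=15000, t_max=1e5, a_max=2e4))
```

for two data sets the sketch gives 26 cc, which agrees with the arrival time of the second output quoted for fig. 7.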
moreover, the proposed expsys was compared to [8] on the mentioned benchmarks under the same constraints in order to assess the quality and strength of the proposed approach. the proposed approach achieved a better quality of result (qor, determined by eq. (6)), as shown in table 1. furthermore, expsys also considers the cycle time resulting from the initiation interval, together with the latency, to create a genuinely pipelined functional data-path during performance calculation. [8], on the other hand, is not able to optimize the execution time considerably because it cannot create a genuinely pipelined functional data-path. thus, to determine the execution time in [8], the number n of sets of processing data is multiplied directly by the latency:

$$T_{exe}^{[8]} = N \cdot L$$

the qor is determined as:

$$QoR = \frac{1}{2}\left(\frac{A_T}{A_{max}} + \frac{T_{exe}}{T_{max}}\right) \qquad (6)$$

with respect to the achieved qor, expsys produces better solutions than [8] for all the benchmarks, as evident in table 1. for example, in the case of the arf benchmark, the optimal resource configuration found is 3(*) and 1(+), the area of the solution is 10934 au, the execution time is 54281 µs and the qor is 0.35. on the other hand, [8], under the same constraints, yields an optimal resource configuration of 4(*) and 1(+) with 13776 au area, 45630 µs execution time and 0.36 qor. expsys achieves an average improvement in qor greater than 26% (table 1).

5. conclusion

this paper proposed a novel technique for area-execution time tradeoff using a residual load decoding heuristic in a genetic algorithm (ga) for integrated design space exploration (dse). to the best of the authors' knowledge, this approach is the first ga-based dse method for area-execution time tradeoff in hls. based on the experimental results, the proposed expsys provides not only competitive but also superior results for almost all tested dsp benchmarks.
table 1 experimental results of comparison with [8] for the dsp benchmarks (note: us = microseconds and au = area unit; 1 au = 1 transistor, g(max) = 100 and w1 = w2 = 0.5; execution time for n = 1000)

| dsp benchmark | fu expsys | fu [8] | mux expsys | mux [8] | demux expsys | demux [8] | execution time expsys (us) | execution time [8] (us) | time constraint (us) | area expsys (au) | area [8] (au) | area constraint (au) | qor expsys | qor [8] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| auto regressive filter (arf) | 3(*),1(+) | 4(*),1(+) | 8 | 10 | 4 | 5 | 54281 | 45630 | 70000 | 10934 | 13776 | 15000 | 0.35 | 0.36 |
| discrete wavelet transformation (dwt) | 4(*),1(+) | 2(*),1(+) | 10 | 6 | 5 | 3 | 10844 | 66420 | 30000 | 13776 | 8092 | 10000 | 0.38 | 0.56 |
| digital butterworth filter | 2(*),1(+) | 3(*),1(+) | 6 | 8 | 3 | 2 | 22880 | 22410 | 30000 | 8092 | 10934 | 9000 | 0.42 | 0.49 |
| band pass filter (bpf) | 4(*),1(+) | 2(*),1(+) | 10 | 6 | 5 | 3 | 11642 | 68310 | 30000 | 13776 | 8092 | 15000 | 0.42 | 0.52 |
| elliptic wave filter (ewf) | 3(*),1(+) | 2(*),2(+) | 8 | 8 | 4 | 4 | 21085 | 46440 | 50000 | 10934 | 10500 | 8000 | 0.45 | 0.57 |
| jpeg downsample | 2(*),1(+) | 1(*),1(+) | 6 | 4 | 3 | 2 | 10818 | 29700 | 15000 | 8092 | 5250 | 15000 | 0.31 | 0.59 |
| mpeg motion vector | 4(*),1(+) | 5(*),1(+) | 10 | 12 | 5 | 6 | 32680 | 35640 | 40000 | 13776 | 16618 | 25000 | 0.24 | 0.27 |
| discrete cosine transformation (dct) | 4(*),1(+) | 2(*),2(+) | 10 | 8 | 5 | 4 | 31467 | 88290 | 50000 | 13776 | 10500 | 15000 | 0.33 | 0.47 |
| mesa horner | 3(*),1(+) | 2(*),1(+) | 8 | 6 | 4 | 3 | 10843 | 65070 | 25000 | 10934 | 8092 | 12000 | 0.35 | 0.59 |
| mesa matrix multiplication | 7(*),1(+) | 4(*),2(+) | 16 | 12 | 8 | 6 | 53628 | 132570 | 200000 | 32570 | 16184 | 40000 | 0.19 | 0.24 |

acknowledgement: this work is supported by the optimization and algorithm research lab (opral), ryerson university, canadian microelectronics corporation (cmc), motorola, nserc crsng, ontario innovation trust and sun microsystems. additionally, this work acknowledges the assistance provided by the science and engineering research board (serb), department of science and technology, govt. of india.

references
[1] anirban sengupta, reza sedaghat, zhipeng zeng, "a high level synthesis design flow with a novel approach for efficient design space exploration in case of multi parametric optimization objective", microelectronics reliability, elsevier, volume 50, issue 3, march 2010, pages 424-437.
[2] c. mandal, p. p. chakrabarti, and s. ghose, "gabind: a ga approach to allocation and binding for the high-level synthesis of data paths," ieee transaction on vlsi, vol. 8, no. 5, pp. 747-750, oct. 2000.
[3] m. j. m. heijlingers, l. j. m. cluitmans, and j. a. g. jess, "high-level synthesis scheduling and allocation using genetic algorithms," in proc. asp-dac., pp. 61-66, 1995.
[4] m. k. dhodhi, f. h. hielscher, r. h. storer, and j. bhasker, "datapath synthesis using a problem-space genetic algorithm," in ieee trans. comput.-aided des., vol. 14, pp. 934-944, 1995.
[5] i. das, "a preference ordering among various pareto optimal alternatives," structural and multidisciplinary optimization, 18(1):30-35, aug. 1999.
[6] alessandro g. di nuovo, maurizio palesi, davide patti, "fuzzy decision making in embedded system design," proc. of 4th intl conference on hardware/software codesign and system synthesis, pp. 223-228, october 2006.
[7] j. c. gallagher, s. vigraham, and g. kramer, "a family of compact genetic algorithms for intrinsic evolvable hardware," ieee trans. evolutionary computation, vol. 8, no. 2, pp. 1-126, apr. 2004.
[8] vyas krishnan and srinivas katkoori, "a genetic algorithm for the design space exploration of datapaths during high-level synthesis," ieee tran. on evolutionary computation, vol. 10, no. 3, 2006.
[9] e. torbey and j.
knight, "high-level synthesis of digital circuits using genetic algorithms," in proc. int. conf. evol. comput., pp. 224-229, may 1998.
[10] e. torbey and j. knight, "performing scheduling and storage optimization simultaneously using genetic algorithms," in proc. ieee midwest symp. circuits systems, pp. 284-287, 1998.
[11] giuseppe ascia, vincenzo catania, alessandro g. di nuovo, maurizio palesi, davide patti, "efficient design space exploration for application specific systems-on-a-chip," jrnl of systems architecture 53, pp. 733-750, 2007.
[12] a. c. williams, a. d. brown and m. zwolinski, "simultaneous optimisation of dynamic power, area and delay in behavioural synthesis", iee proc.-comput. digit. tech, vol. 147, no. 6, pp. 383-390, 2000.
[13] christian haubelt, thomas schlichter, joachim keinert, mike meredith, "systemcodesigner: automatic design space exploration and rapid prototyping from behavioral models", proceedings of the 45th annual acm ieee design automation conference, pages 580-585, 2008.
[14] xuejie zhang and kam w. ng, "a review of high-level synthesis for dynamically reconfigurable fpgas", microprocessors and microsystems, elsevier, volume 24, issue 4, pages 199-211,1 2000.
[15] n. wehn et al., "a novel scheduling and allocation approach to datapath synthesis based on genetic paradigms," in proc. ifip working conf. logic architecture synthesis, pp. 47-56, 1991.
[16] r. m. san and j. p. knoght, "genetic algorithms for optimization of integrated circuit synthesis," in proc. 5th int. conf. genetic algorithms, san mateo, ca, pp. 432-438, 1993.
[17] r. j. cloutier and d. e. thomas, "the combination of scheduling, allocation and mapping in a single algorithm," in proc. 27th design automation conf., pp. 71-76, jun. 1990.
[18] j. a. nestor and g. krishnamoorthy, "salsa: a new approach to scheduling with timing constraints," ieee trans. comput.-aided des., vol. 12, pp. 1107-1122, 1993.
[19] g. krishnamoorthy and j. a.
nestor, "data path allocation using extended binding model," in proc. 32nd acm/ieee design automation conf., pp. 279-284, 1992.
[20] s. devadas and a. r. newton, "algorithms for hardware allocation in data path synthesis," ieee trans. comput.-aided des., vol. 8, pp. 768-781, 1989.
[21] t. a. ly and j. t. mowchenko, "applying simulated evolution to high level synthesis," ieee trans. comput.-aided des., vol. 12, no. 2, pp. 389-409, feb. 1993.
[22] c. h. gebotys and m. i. elmasry, "global optimization approach for architectural synthesis," ieee trans. comput.-aided des., vol. 12, pp. 1266-1278, 1993.
[23] c. t. hwang, j. h. lee, y. c. hsu, and y. l. lin, "a formal approach to the scheduling problem in high-level synthesis," ieee trans. comput.-aided des., vol. 10, no. 2, pp. 464-475, feb. 1991.
[24] g. de micheli, synthesis and optimization of digital circuits. new york: mcgraw-hill, 1994.
[25] r. camposano, "path-based scheduling for synthesis," ieee trans. cad., vol. 10, pp. 85-93, 1991.
[26] p. g. paulin and j. p. knight, "force-directed scheduling for the behavioral synthesis of asics," ieee trans. comput.-aided des., vol. 8, no. 6, pp. 661-679, 1989.
[27] a. c. parker, j. t. pizarro, and m. mlinar, "maha: a program for datapath synthesis," in proc. 23rd acm/ieee design automation conf., 1986, pp. 461-466.
[28] t. blickle and l. thiele, "a mathematical analysis of tournament selection," in proc. 6th int. conf. genetic algorithms, pp. 9-16, 1995.
[29] http://www.cbl.ncsu.edu/benchmarks/.
[30] saraju p. mohanty, nagarajan ranganathan, elias kougianos and priyadarsan patra, "low-power high-level synthesis for nanoscale cmos circuits", chapter: high-level synthesis fundamentals, springer us, 2008.

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp.
161-178, doi: 10.2298/fuee1702161p

spice modeling of ionizing radiation effects in cmos devices

tatjana pešić-brđanin
faculty of electrical engineering, university of banja luka, republic of srpska, bosnia and herzegovina

abstract. the electrical characteristics of devices in advanced cmos technologies change over time under the impact of ionizing radiation. device aging is caused by the cumulative generation of defects in the gate oxide and/or at the silicon-oxide interface. the concentration of these defects depends on time and bias. existing models include these effects through a constant shift of the threshold voltage. this paper presents a method for including ionizing radiation effects in spice models of the mos transistor and the finfet. the method is based on an auxiliary diode circuit that is used to derive the surface potential and that also calculates the time-dependent correction voltage due to the concentration of trapped charges.

key words: ionizing radiation effects, trapped charges, spice model, cmos devices

1. introduction

with aggressive scaling of device dimensions in cmos technologies, which includes the decrease of oxide thickness and the increase of doping concentration in the channel, the susceptibility of most cmos technologies to ionizing radiation has been reduced. scaling of the oxide thickness decreased the concentration of fixed charge in the oxide, because this concentration is directly proportional to the oxide thickness. on the other hand, the increase of doping concentration in the channel reduced the effect of the oxide trapped charge on the surface potential of the channel, which also made the components more robust to ionizing radiation [1].
however, recent studies showed that negative bias temperature instability damage and hot carrier injection damage are attributable to charges trapped in the oxide (with areal density nox) and/or at the interface of the silicon and oxide layers (with energy density distribution dit) [2-4]. therefore, trapped charges still represent a potential radiation threat and have a measurable impact on the performance of integrated circuits [2,5].

received november 2, 2016
corresponding author: tatjana pešić-brđanin, faculty of electrical engineering, patre 5, 78000 banja luka, republic of srpska, bosnia and herzegovina (e-mail: tatjanapb@etfbl.net)

the harmful effect of ionizing radiation on cmos devices can be diminished by using well-known techniques, such as radiation-hardening-by-process (rhbp) and radiation-hardening-by-design (rhbd) [6,7]. however, even with significant efforts in rhbp and rhbd techniques, the capability of estimating the influence of ionizing radiation on the electrical characteristics of devices in advanced technologies is still inadequate [8]. testing of integrated circuits under ionizing radiation is quite expensive [7], so the incorporation of ionizing radiation effects into the compact device models used in standard circuit simulators is proposed as an alternative. incorporating these effects requires knowledge of the physical processes that lead to the generation of defects by ionizing radiation, and of the impact these defects have on the electrical characteristics of components in advanced cmos technologies [8,9].

numerous existing techniques for modelling these effects in circuit simulators are based on a fixed change of the threshold voltage (threshold voltage shift), and do not consider the specific impact that these defects have on the electrical characteristics of the transistors [2,10-12].
this paper describes how the previously derived surface-potential based non-quasi static mos model (nqs mos model) and non-quasi static soi model (nqs soi model) [13,14] can be modified to include the effects of oxide trapped charges and interface trapped charges.

2. ionizing radiation effects in cmos devices

the main cause of the damage that occurs in cmos devices after ionizing radiation is the generation of electron-hole pairs in the oxide (or another dielectric), which is the material most sensitive to ionizing radiation in cmos devices. after the generation of the electron-hole pairs, some of the pairs recombine immediately. since the electron mobility in the oxide is considerably higher than the hole mobility [15,9], the electrons are soon swept out of the oxide or the dielectric, while the holes move slowly through the oxide to the sio2-si interface, causing the long-term effects of ionizing radiation. fig. 1 shows the processes that follow the ionizing radiation.

fig. 1 processes in the oxide after the ionizing radiation [16]

vacancies in the oxide or the dielectric can trap the generated holes. the total amount of trapped charge in the oxide is nox. the trapped charge changes the threshold voltage $V_{th}$ of cmos devices by the threshold voltage shift [17]:

$$\Delta V_{th} = -\frac{q\, t_{ox}^{2}\, N_{ox}}{\varepsilon_{ox}}, \qquad (1)$$

where $q$ is the electron charge, $t_{ox}$ is the oxide thickness and $\varepsilon_{ox}$ is the oxide permittivity. the threshold voltage shift $\Delta V_{th}$ is negative, which means that in the case of the nmos transistor the off current increases, while in the case of the pmos transistor the absolute value of the threshold voltage $V_{th}$ increases, as shown in fig. 2(a). it follows from (1) that $\Delta V_{th}$ depends on the square of the oxide thickness, so with the decrease of the oxide thickness in nanometer cmos technologies the change of the threshold voltage due to the oxide trapped charge becomes smaller.

fig. 2 illustration of the threshold voltage shift $\Delta V_{th}$ due to the oxide trapped charges (a) and increase in subthreshold swing due to interface trapped charges (b) [17]

ionizing radiation also leads to the generation of interface traps, whose concentration is nit. the generated holes react with hydrogen atoms in the oxide and in this way produce h+ ions [18]. these ions drift to the sio2-si interface and create dangling bonds (i.e. pb centres) [2]. interface trapped charges are often linked with the permanent effects of component aging [2,10]. fig. 2(b) shows the impact of the generation of trapped charges at the sio2-si interface on the transfer characteristic of the transistors. it can be noted that these charges increase the swing in the device subthreshold region. for nmos and pmos transistors, the generation of interface trapped charges decreases the transistor off current.

3. nqs mos and nqs soi transistor models

the static and dynamic characteristics of transistors can be described by a set of basic equations, comprising poisson's equation, the drift-diffusion equations and the continuity equations [19]. since mos transistor modelling is a three-dimensional problem, solving this set of equations is complex and memory demanding. however, for numerous practical applications of mos transistors, changes in the third direction can be neglected and the problem can be reduced to a two-dimensional one (in the x and y directions).

3.1. nqs mos transistor model

in [13] a physically based nqs mos transistor model is described, which belongs to the group of models based on surface potential. fig. 3 shows the equivalent scheme of the model, which can be embedded as a subcircuit into circuit simulators. the external elements of the transistor model (resistors and capacitors) can be modelled in a similar way as in other stationary or non-stationary models.
unlike some known models [20-22], in the nqs mos model there are no analytical expressions for node currents, but they are obtained after the solution of equivalent circuit shown on fig. 3(a). this subcircuit has two parts, as shown on fig. 3(b):  internal part is connected to transistor gate terminal. this part of the model is, in fact, equivalent line that models drift-diffusion transport of electrons in transistor channel;  external part is connected to source, drain and gate terminals, and it contains current-controlled current sources is1 and isn. this part of the circuit is defined by the potential of source, drain and substrate that is obtained by mirroring the currents which flow through voltage sources s1 and sn. voltage generators s1 and sn copy values of boundary surface potentials to subcircuit in the source end and the drain end of channel. voltage generator vb serves to copy bulk polarisation to equivalent subcircuit. capacitance coxk represents gate-oxide capacitance (coxk = cox / n). the other model elements rk and ck, non-linear channel resistance and depletion region capacitance, are respectively defined by the equations: 3 31/ 1 2 1 4 5 6 (1 ( )) (1 ( ( )) ) , ( ) a a gs sk sk sk k gb fb sk sk a v a r a a v v a                (2) 1/ 20 7 ( ) , 2 bk si ch k sk sk sk q qn c a           (3) spice modeling of ionizing radiation effects in cmos devices 165 where the constants a1  a7 are physically based, nch is doping concentration in the channel and si is the silicon permittivity. surface potential of every cell is denoted with sk. the derivations for (2) and (3) and the expressions for a1  a7 are given in [13]. (a) (b) fig. 
3 nqs mos model (a) and the equivalent subcircuit (b)

in a surface charge-sheet model, which describes mos transistor operation [23], the boundary channel potentials $\varphi_{s1}$ and $\varphi_{sn}$ at the source and drain side are functions of the biasing voltages of the transistor terminals through the following recurrent relations [24]:

$$\varphi_{s1} = 2\varphi_f + V_{sb} + V_t \ln\left[\frac{1}{\gamma^2 V_t}\left(V_{gb} - V_{fb} - \varphi_{s1}\right)^2 - \frac{\varphi_{s1}}{V_t} + 1\right], \qquad (4)$$

$$\varphi_{sn} = 2\varphi_f + V_{sb} + V_{ds} + V_t \ln\left[\frac{1}{\gamma^2 V_t}\left(V_{gb} - V_{fb} - \varphi_{sn}\right)^2 - \frac{\varphi_{sn}}{V_t} + 1\right]. \qquad (5)$$

in the previous equations $\gamma$ is the body factor, $V_t$ is the thermal voltage, $\varphi_f$ is the fermi potential of the channel ($\varphi_f = V_t \ln(N_{ch}/n_i)$) and $V_{fb}$ is the flatband voltage. since equations (4) and (5) are implicit relations, several iterative methods for determining the surface potentials $\varphi_{s1}$ and $\varphi_{sn}$ have been proposed in the literature [25]. in the nqs mos model, relations (4) and (5) are solved by diode circuits. for any point y in the channel:

$$\exp\left(\frac{\varphi_{sy}}{V_t}\right) - 1 = \exp\left(\frac{\varphi_{fy}}{V_t}\right)\left[\frac{1}{\gamma^2 V_t}\left(V_{gb} - V_{fb} - \varphi_{sy}\right)^2 - \frac{\varphi_{sy}}{V_t} + 1\right] - 1. \qquad (6)$$

by comparing equation (6) with the diode current expression

$$I_d = I_0\left(\exp\left(\frac{\varphi_{sy}}{V_t}\right) - 1\right) = I_{ss} \qquad (7)$$

it follows that:

$$I_{ss} = \exp\left(\frac{\varphi_{fy}}{V_t}\right)\left[\frac{1}{\gamma^2 V_t}\left(V_{gb} - V_{fb} - \varphi_{sy}\right)^2 - \frac{\varphi_{sy}}{V_t} + 1\right] - 1, \qquad I_0 = 1. \qquad (8)$$

when determining the boundary surface potential at the source, $\varphi_{s1}$, the quantities $\varphi_{sy}$ and $\varphi_{fy}$ in equation (8) should be replaced with $\varphi_{sy} = \varphi_{s1}$ and $\varphi_{fy} = 2\varphi_f + V_{sb}$, while for determining the boundary surface potential on the drain side, $\varphi_{sn}$ and $2\varphi_f + V_{sb} + V_{ds}$ should be used instead of $\varphi_{sy}$ and $\varphi_{fy}$, respectively. owing to this analysis, it is possible to construct a circuit for solving equations (7) and (8), which consists of a diode (with unit current $I_0 = 1$) and a voltage-controlled current source whose current is calculated by equation (8). figure 4 shows this type of auxiliary diode circuit.
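as a purely numerical cross-check of relation (4), the implicit equation can also be solved by bisection. the python sketch below is not the diode-circuit method of the paper: it rewrites (4) as the residual (vgb - vfb - φs)² - γ²[φs - vt + vt·exp((φs - 2φf - vcb)/vt)], which is monotonic in φs between 0 and vgb - vfb. the material constants, the intrinsic concentration n_i and the bias values are assumptions made for the example; nch = 4·10^17 cm^-3 and tox = 5 nm are the device values quoted later for fig. 8, and all function names are hypothetical.

```python
import math

Q = 1.602e-19            # elementary charge [C]
EPS0 = 8.854e-14         # vacuum permittivity [F/cm]
EPS_SI = 11.7 * EPS0     # silicon permittivity (assumed value)
EPS_OX = 3.9 * EPS0      # oxide permittivity (assumed value)
VT = 0.02585             # thermal voltage at about 300 K [V]
NI = 1.45e10             # intrinsic concentration [cm^-3] (assumed value)


def fermi_potential(n_ch):
    """phi_f = Vt * ln(N_ch / n_i)."""
    return VT * math.log(n_ch / NI)


def body_factor(n_ch, t_ox_cm):
    """gamma = sqrt(2 q eps_si N_ch) / C_ox, with C_ox = eps_ox / t_ox."""
    return math.sqrt(2.0 * Q * EPS_SI * n_ch) / (EPS_OX / t_ox_cm)


def recurrence_rhs(phi_s, v_gb, v_fb, v_cb, gamma, phi_f):
    """Right-hand side of (4); v_cb is V_sb at the source, V_sb + V_ds at the drain."""
    arg = (v_gb - v_fb - phi_s) ** 2 / (gamma ** 2 * VT) - phi_s / VT + 1.0
    return 2.0 * phi_f + v_cb + VT * math.log(arg)


def surface_potential(v_gb, v_fb, v_cb, gamma, phi_f, tol=1e-12):
    """Solve the implicit surface-potential relation by bisection."""
    v_ov = v_gb - v_fb
    if v_ov <= VT:
        raise ValueError("sketch only covers V_gb - V_fb > V_t")

    def residual(phi):
        inversion = VT * math.exp((phi - 2.0 * phi_f - v_cb) / VT)
        return (v_ov - phi) ** 2 - gamma ** 2 * (phi - VT + inversion)

    lo, hi = 0.0, v_ov  # residual is positive at lo and negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma = body_factor(4e17, 5e-7)   # t_ox = 5 nm expressed in cm
phi_f = fermi_potential(4e17)
phi_s1 = surface_potential(2.0, 0.0, 0.0, gamma, phi_f)   # source end, V_sb = 0
phi_sn = surface_potential(2.0, 0.0, 0.5, gamma, phi_f)   # drain end, V_ds = 0.5 V
print(phi_s1, phi_sn)
```

bisection is used here instead of a plain fixed-point iteration of (4) because the argument of the logarithm can become very small or negative outside strong inversion, which makes the direct iteration fragile.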
to determine both boundary surface potentials, $\varphi_{s1}$ and $\varphi_{sn}$, two identical diode subcircuits are used, and the described method is applied to solve equations (4) and (5). the values of the boundary surface potentials determined in this way are copied by the voltage generators $\varphi_{s1}$ and $\varphi_{sn}$ (shown in fig. 3(b)) to the input and output of the equivalent circuit in order to solve the transport of electrons in the channel. knowing the boundary surface potentials makes it possible to calculate the values of the nonlinear resistors and capacitors rk and ck, and thus to determine the transistor currents.

fig. 4 diode subcircuit for solving surface potentials

the physical basis of the nqs mos model makes it easy to include effects that become significant with aggressive scaling of transistor dimensions, for example short channel effects and quantum-mechanical effects.

3.2. nqs soi transistor model

a compact model for the n-channel fully depleted soi mos transistor with double gate (fd soi transistor) has been developed on the basis of the nqs mos model, and it is applicable to asymmetrical and symmetrical planar structures [14]. in the non-stationary model of the fd soi mos transistor (nqs soi model), the transistor is represented by a parallel connection of two single-gate soi transistors, as shown in fig. 5, in order to model the current in the front and the back channel [14].

fig. 5 schematic presentation of fd soi transistor (a) and its electric equivalent (b)

in comparison with the nqs mos model, the recurrent expressions for calculating the boundary surface potentials in the nqs soi model also contain the influence of the biasing of both gates.
so the boundary surface potentials in the channel, $\varphi_{s1}$ and $\varphi_{sn}$, of the fd soi transistor are related to the biasing of the front ($V_{gf}$) and back ($V_{gb}$) gate and to the drain-source bias $V_{ds}$ by new recurrent relations [26,27]:

$$\left(V_{gf} - V_{fbf} - \varphi_{s1}\right)^2 - \left(\frac{t_{oxf}}{t_{oxb}}\right)^2\left(V_{gb} - V_{fbb} - \varphi_{b1}\right)^2 = \gamma^2 V_t\left[e^{-2\varphi_f/V_t}\left(e^{\varphi_{s1}/V_t} - e^{\varphi_{b1}/V_t}\right) + \left(e^{-\varphi_{s1}/V_t} - e^{-\varphi_{b1}/V_t}\right) + \frac{\varphi_{s1} - \varphi_{b1}}{V_t}\right], \qquad (9)$$

$$\left(V_{gf} - V_{fbf} - \varphi_{sn}\right)^2 - \left(\frac{t_{oxf}}{t_{oxb}}\right)^2\left(V_{gb} - V_{fbb} - \varphi_{bn}\right)^2 = \gamma^2 V_t\left[e^{-(2\varphi_f + V_{ds})/V_t}\left(e^{\varphi_{sn}/V_t} - e^{\varphi_{bn}/V_t}\right) + \left(e^{-\varphi_{sn}/V_t} - e^{-\varphi_{bn}/V_t}\right) + \frac{\varphi_{sn} - \varphi_{bn}}{V_t}\right], \qquad (10)$$

where, in the case of a fully depleted silicon layer, the boundary potentials of the back channel can be expressed as:

$$\varphi_{b1} = \varphi_{s1} - \frac{q N_{ch} t_{si}^2}{\varepsilon_{si}} \quad \text{and} \quad \varphi_{bn} = \varphi_{sn} - \frac{q N_{ch} t_{si}^2}{\varepsilon_{si}}, \qquad (11)$$

while for a fully symmetrical transistor $t_{oxf} = t_{oxb}$. in equations (9)-(11) the index f relates to the front gate, and the index b relates to the back gate. the recurrent relations (9) and (10) are derived under the assumption that the difference of the fermi potentials between the source and the drain is equal to the voltage $V_{ds}$. the electric potential distribution through the depth of the channel, i.e. along the x axis, is obtained by solving these recurrent relations (fig. 6).

fig. 6 electric potential distribution in the channel through depth of fd soi transistor

for use in the nqs soi model of a symmetrical fd soi mos transistor, the recurrent equations for calculating the boundary surface potentials can be rewritten with basic algebraic transformations [14] in the following form:

$$I_{s1}\, e^{\varphi_{sx}/V_t} + I_{s2}\, e^{-\varphi_{sx}/V_t} = I_s, \qquad (12)$$

with:

$$I_s = \frac{1}{\gamma^2 V_t}\left[\left(V_{gf} - V_{fbf} - \varphi_{sx}\right)^2 - \left(V_{gf} - V_{fbb} - \varphi_{sx} + \frac{q N_{ch} t_{si}^2}{\varepsilon_{si}}\right)^2\right] - \frac{q N_{ch} t_{si}^2}{\varepsilon_{si} V_t}, \qquad (13)$$

$$I_{s1} = e^{-\varphi_{fx}/V_t}\left[1 - \exp\left(-\frac{q N_{ch} t_{si}^2}{\varepsilon_{si} V_t}\right)\right], \qquad (14)$$

$$I_{s2} = 1 - \exp\left(\frac{q N_{ch} t_{si}^2}{\varepsilon_{si} V_t}\right), \qquad (15)$$

where on the source side $\varphi_{sx} = \varphi_{s1}$ and $\varphi_{fx} = 2\varphi_f$, while on the drain side $\varphi_{sx} = \varphi_{sn}$ and $\varphi_{fx} = 2\varphi_f + V_{ds}$. in the previous expressions, $t_{si}$ is the silicon film (body) thickness. auxiliary diode circuits, similar to those used in the nqs mos model for solving the recurrent relations, are used in this way for calculating the boundary values of the surface potentials in the nqs soi model. fig. 7 shows the equivalent diode circuit for solving equation (12) [14].

fig. 7 diode subcircuit for solving surface potentials in nqs soi model

4. inclusion of nox and dit in nqs mos and nqs soi models

the physical foundation of the previously described models allows easy inclusion of effects that are important for transistor operation. the effects of the generation of interface trapped charge with energy density distribution dit and of oxide trapped charge with areal density nox can be modelled in the nqs mos and nqs soi models by changing the surface potential equations. the impact of these effects on the transistor characteristics can then be modelled in two ways:
1. auxiliary diode circuits, with the included effects of nox and dit, are used for determining the surface potentials for use in the nqs mos and nqs soi models, or
2.
auxiliary diode circuits, with the included effects of nox and dit, are used for determining the surface potentials, and are then connected in series to the gate of a standard model (for example, bsim4 for the mos transistor or bsim-cmg for the finfet).

the total amount of electric charge trapped in the oxide is:

$$Q_{ox} = q N_{ox}, \qquad (16)$$

while the total amount of interface charge is [19]:

$$Q_{it} = -q \int_{E_g/2}^{E_f} D_{it}\, dE = -q D_{it}\left(E_f - \frac{E_g}{2}\right), \qquad (17)$$

where $E_g/2$ is the midgap energy level at the interface and $E_f$ is the energy of the fermi level. if we add and subtract the term $E_{gb}/2$, where $E_{gb}/2$ is the bulk midgap energy level, inside the parenthesis of equation (17), we obtain:

$$Q_{it} = -q D_{it}\left[\left(E_f - \frac{E_{gb}}{2}\right) - \left(\frac{E_g}{2} - \frac{E_{gb}}{2}\right)\right] = -q D_{it}\left(\varphi_s - \varphi_f\right). \qquad (18)$$

as stated in section 2, the charges $Q_{ox}$ and $Q_{it}$ change the threshold voltage of the transistor. this change can be expressed by the correction potential $\varphi_{nt}$ [6]:

$$\varphi_{nt} = \frac{Q_{ox} + Q_{it}}{C_{ox}} = \frac{q}{C_{ox}}\left[N_{ox} - D_{it}\left(\varphi_s - \varphi_f\right)\right]. \qquad (19)$$

in the nqs mos model, equations (4)-(6) are modified to include the correction potential $\varphi_{nt}$. eqn. (6) in its modified form, with the correction potential included, is:

$$\exp\left(\frac{\varphi_{sy}}{V_t}\right) - 1 = \exp\left(\frac{\varphi_{fy}}{V_t}\right)\left[\frac{1}{\gamma^2 V_t}\left(V_{gb} - V_{fb} - \varphi_{sy} + \varphi_{nt}\right)^2 - \frac{\varphi_{sy} - \varphi_{nt}}{V_t} + 1\right] - 1. \qquad (20)$$

to determine the surface potential $\varphi_{sy}$, two identical diode circuits are used, as shown in fig. 3(b).
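the correction potential of eq. (19) is simple to evaluate once the surface potential is known. the python sketch below assumes nox in cm^-2, dit in cm^-2 v^-1 and an oxide capacitance per unit area computed from an assumed sio2 permittivity; the numerical values of nox, dit and the potentials are hypothetical and serve only to show the sign of each contribution.

```python
Q = 1.602e-19              # elementary charge [C]
EPS_OX = 3.9 * 8.854e-14   # oxide permittivity [F/cm] (assumed SiO2 value)


def oxide_capacitance(t_ox_cm):
    """Gate-oxide capacitance per unit area, C_ox = eps_ox / t_ox [F/cm^2]."""
    return EPS_OX / t_ox_cm


def correction_potential(n_ox, d_it, phi_s, phi_f, c_ox):
    """Eq. (19): phi_nt = (Q_ox + Q_it) / C_ox.

    n_ox  areal density of oxide trapped charge [cm^-2]
    d_it  interface trap density [cm^-2 V^-1]
    """
    q_ox = Q * n_ox                      # eq. (16)
    q_it = -Q * d_it * (phi_s - phi_f)   # eq. (18)
    return (q_ox + q_it) / c_ox


c_ox = oxide_capacitance(5e-7)  # t_ox = 5 nm expressed in cm

# oxide trapped charge alone gives a constant positive correction potential
print(correction_potential(1e12, 0.0, phi_s=0.9, phi_f=0.44, c_ox=c_ox))

# interface traps alone give a correction that depends on the surface potential
for phi_s in (0.3, 0.44, 0.9):
    print(phi_s, correction_potential(0.0, 1e12, phi_s, 0.44, c_ox))
```

the second loop illustrates why the shift caused by dit is bias dependent: the contribution changes sign where the surface potential crosses the fermi potential, while the nox contribution stays constant.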
in the nqs soi model, for a symmetrical fd dg soi transistor, the equation for the surface potential is modified to include $\varphi_{nt}$ in the following way:

$$\left(V_{gf} - V_{fbf} + \varphi_{nt} - \varphi_{s1}\right)^2 - \left(V_{gf} - V_{fbb} + \varphi_{nt} - \varphi_{b1}\right)^2 = \gamma^2 V_t\left[e^{-(2\varphi_f + bV_{ds})/V_t}\left(e^{\varphi_{s1}/V_t} - e^{\varphi_{b1}/V_t}\right) + \left(e^{-\varphi_{s1}/V_t} - e^{-\varphi_{b1}/V_t}\right) + \frac{\varphi_{s1} - \varphi_{b1}}{V_t}\right]. \qquad (21)$$

the parameter b, which appears in equation (21), takes the value b = 0 for the source end of the channel and b = 1 for the drain end of the channel (in accordance with equations (9) and (10)). however, the main problem in modelling trapped charges with (21) is that the distribution of the surface potential in the channel depends not only on the gate voltage but also on the drain voltage $V_{ds}$, due to the split of the quasi-fermi levels [19]. this means that the charge $Q_{it}$ changes along the channel, even for constant nit. the impact of the changing charge $Q_{it}$ along the channel can be modelled with a modified value of the parameter $b \in (0,1)$. in equation (21), b is treated as a value known in advance, and fine tuning [28] makes it possible to achieve a better match of the model results with the results of the 2d tcad numerical simulator silvaco atlas [29]. equation (21) can also be solved with auxiliary diode circuits (fig. 7), with:

$$I_s = \frac{1}{\gamma^2 V_t}\left[\left(V_{gf} - V_{fbf} + \varphi_{nt} - \varphi_{sx}\right)^2 - \left(V_{gf} - V_{fbb} + \varphi_{nt} - \varphi_{sx} + \frac{q N_{ch} t_{si}^2}{\varepsilon_{si}}\right)^2\right] - \frac{q N_{ch} t_{si}^2}{\varepsilon_{si} V_t}, \qquad (22)$$

$$I_{s1} = e^{-(2\varphi_f + bV_{ds})/V_t}\left[1 - \exp\left(-\frac{q N_{ch} t_{si}^2}{\varepsilon_{si} V_t}\right)\right], \qquad (23)$$

$$I_{s2} = 1 - \exp\left(\frac{q N_{ch} t_{si}^2}{\varepsilon_{si} V_t}\right). \qquad (24)$$

the surface potential $\varphi_s$ obtained from the diode circuit in fig. 7 is the solution of equation (21) for any combination of the voltages $V_{ds}$ and $V_{gs}$.

5. simulation results and discussion

ionizing radiation changes the electrical characteristics of the transistor.
in this paper, the approaches described in section 3 are used to simulate the electrical characteristics of the transistor, and the results are compared with numerical results. 5.1. modeling of nox and dit effects in mos transistor the effects of $N_{ox}$ and $D_{it}$ are included in the nqs mos transistor model by incorporating the correction potential $\psi_{nt}$ into the surface potential equation (eqn. 20). as already stated, the boundary surface potentials are obtained with diode circuits as in fig. 4, using the mathematical functions available in spice, and the equivalent line (fig. 3) is then solved on that basis. in this paper, the equivalent line is divided into 10 equal segments. fig. 8 shows the resulting surface potentials, illustrating the impact of $N_{ox}$ (fig. 8(a)) and the impact of the interface trapped charges through $D_{it}$ (fig. 8(b)) on the surface potential. the results obtained with diode circuits are shown with solid lines, and the numerical results with open circles. the good agreement of the results confirms the efficiency of the diode circuit as a new method for solving iterative relations (21). as the figure shows, for $D_{it} = 0$ an increase of $N_{ox}$ shifts the surface potential curve by a constant negative voltage. when $D_{it}$ increases at $N_{ox} = 0$, the voltage shift depends on the value of the surface potential itself, because of the dynamic charge contribution at the sio2-si interface. namely, the interface traps have energies inside the forbidden gap. interface trapped charges with energies above the intrinsic energy level $E_i$ behave as acceptor-like charges, while all interface trapped charges with energies below the intrinsic energy level behave as donor-like charges, which is experimentally verified [2,30,31]. fig. 8 surface potential versus gate voltage for different values of $N_{ox}$ at $D_{it} = 0$ (a) and for different values of $D_{it}$ at $N_{ox} = 0$ (b), obtained from spice simulation of the proposed model (solid line) and tcad numerical results (open circles) for a mos transistor with $t_{ox} = 5$ nm and $N_{ch} = 4 \cdot 10^{17}$ cm$^{-3}$. fig. 9 shows the transfer characteristics of the mos transistor obtained from spice and compared with tcad numerical results; the nqs mos model with the applied method agrees well with the tcad results. it is important to note that the same expression for the correction potential due to the effects of ionizing radiation is used in [2], where a voltage-controlled voltage source (vcvs) with voltage $$V_{df} = f(V_{gb}, V_{sb}, N_{ox}, D_{it}) = f(\psi_s) \quad (25)$$ is connected in series with the transistor gate, and one of the standard models (for example, the bsim model) is used for the transistor. to determine $V_{df} = \psi_{nt}$, that is, to solve (19), the authors of [2] used a non-iterative algorithm inside a verilog-a model, while in our method the iterative equation for the surface potential is solved in a physical way, with diode subcircuits. fig. 9 transfer characteristics $I_d(V_{gs})$ for different values of $N_{ox}$ at $D_{it} = 0$ (a) and for different values of $D_{it}$ at $N_{ox} = 0$ (b), obtained from spice simulation of the proposed model (solid line) and tcad numerical results (open circles). 5.2. modeling of nox and dit effects in finfet with the scaling of device dimensions, conventional transistors have reached their limits, so new technological structures for future generations of integrated circuits are emerging. one such structure is the fully-depleted floating-body (fin) multi-gate fet (finfet) [32]. however, it has recently been shown that finfet technology ages rapidly, so that the degradation of finfets exceeds that of the planar technology node at higher stress voltages and longer stress times [33].
therefore, the modelling of ionizing radiation effects in these structures is important. the standard bsim.cmg model [34] for finfet, however, offers only the fitting parameter cit (interface trap capacitance parameter) in the sub-threshold region [35], and gives the user no way to specify oxide trapped charge. fig. 10 shows a schematic of the n-type finfet analysed in this paper, with the parameters $L = 0.9$ µm, $t_{ox} = 5$ nm, $t_{si} = 20$ nm, $N_{ch} = 2.4 \cdot 10^{18}$ cm$^{-3}$ and $N_d = 10^{20}$ cm$^{-3}$. fig. 10 schematic representation of n-type finfet. fig. 11 shows the output characteristics of the transistor obtained from tcad numerical simulation, from the bsim.cmg model with parameters acquired by fitting, and from the modified nqs soi model. to simplify the tuning of the bsim.cmg parameters, the simulated structure has a long channel, and the gate oxide and silicon fin thicknesses are chosen so that short-channel effects can be neglected and the silicon fin is fully depleted [28,36]. the same parameter set is used for the p-type finfet, except that the fin has the opposite doping (n-type fin). in the absence of ionizing radiation effects, the results of the different models agree [28]. fig. 11 output characteristics of n- and p-type finfets simulated for $N_{ox} = 0$ and $D_{it} = 0$ with spice using the bsim.cmg model (solid line), the nqs soi model (dashed line) and the tcad simulator silvaco atlas (open circles). the effects of $N_{ox}$ and $D_{it}$ can be modelled with auxiliary diode subcircuits (ads) that solve the surface potential equation (21) in two ways: by using the nqs soi model (time consuming), or, as shown in [2,6], by using the surface potential as the control voltage of a vcvs that produces $V_{df} = \psi_{nt} = f(V_{gb}, V_{sb}, N_{ox}, D_{it})$. this vcvs is connected in series with the gate node of the bsim.cmg model, as shown in fig. 12.
the second approach to modelling the ionizing radiation effects in finfet is more practical, because the simulation time is shorter and there are no convergence problems. the nqs soi model, however, is physically based and is therefore more convenient when other effects important for the operation of finfet have to be included (for example, quantum-mechanical effects). the second approach, the bsim.cmg model with ads, was used in this paper for modelling the ionizing radiation effects. fig. 12 schematic of the diode subcircuit together with the bsim.cmg finfet model, as implemented in spice simulations to include the effects of $N_{ox}$ and $D_{it}$. fig. 13 shows the transfer characteristics of n- and p-type finfets for different values of $D_{it}$ at $N_{ox} = 0$. fig. 14 shows the transfer characteristics for different values of $N_{ox}$ at $D_{it} = 0$, and fig. 15 shows the characteristics for combinations of different values of $N_{ox}$ and $D_{it}$. figs. 14 and 15 contain no results obtained with the bsim.cmg model alone, because the oxide trapped charge effect is not included in that model. all characteristics are generated for $V_{ds} = 1.2$ V. in the bsim.cmg model, the parameter cit is determined for the given $D_{it}$. the parameter $b$, which appears in equation (21), was set to $b = 0.05$, for the reason explained in section 4. all of these characteristics show a good match between the suggested approaches and the tcad numerical results [28,37]. fig. 13 transfer characteristics $I_d(V_{gs})$ for different values of $D_{it}$ at $N_{ox} = 0$. fig. 14 transfer characteristics $I_d(V_{gs})$ for different values of $N_{ox}$ at $D_{it} = 0$. fig. 15 transfer characteristics $I_d(V_{gs})$ for the combined influence of $N_{ox}$ and $D_{it}$ for n-type finfet. fig. 16 shows the changes of the threshold voltages of n- and p-type finfets after ionizing radiation, obtained from tcad and from the proposed method. the constant current method is used for threshold voltage extraction [28,38], with $I'_d = 100$ nA/µm.
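the constant current extraction mentioned above can be sketched in a few lines. this is a generic illustration, not the extraction script used for fig. 16: the function name, the synthetic sub-threshold curve and its slope are assumptions, and the criterion current is taken here as 100 nA per µm of gate width. the threshold voltage is read off as the gate voltage at which the drain current crosses the criterion, using linear interpolation of $\log_{10} I_d$ between the two neighbouring samples.

```python
import numpy as np


def vth_constant_current(vgs, i_d, i_crit):
    """Gate voltage at which i_d first reaches i_crit.

    Interpolates log10(i_d) linearly between the two samples that
    bracket the criterion current. vgs must be increasing, i_d > 0.
    """
    vgs = np.asarray(vgs, dtype=float)
    log_i = np.log10(np.asarray(i_d, dtype=float))
    target = np.log10(i_crit)
    above = np.nonzero(log_i >= target)[0]
    if above.size == 0 or above[0] == 0:
        raise ValueError("criterion current is not bracketed by the sweep")
    k = above[0]
    frac = (target - log_i[k - 1]) / (log_i[k] - log_i[k - 1])
    return vgs[k - 1] + frac * (vgs[k] - vgs[k - 1])


def synthetic_subthreshold(vgs, v0, swing=0.08, i0=100e-9):
    """Hypothetical exponential Id(Vgs): i0 at vgs = v0, 'swing' V/decade."""
    return i0 * 10.0 ** ((np.asarray(vgs, dtype=float) - v0) / swing)


if __name__ == "__main__":
    width_um = 1.0                       # assumed gate width, um
    i_crit = 100e-9 * width_um           # 100 nA/um criterion
    vgs = np.linspace(0.0, 1.2, 25)
    fresh = vth_constant_current(vgs, synthetic_subthreshold(vgs, 0.43), i_crit)
    shifted = vth_constant_current(vgs, synthetic_subthreshold(vgs, 0.33), i_crit)
    print(f"Vth fresh = {fresh:.3f} V, shift = {shifted - fresh:+.3f} V")
```

applying the same function to the pre- and post-irradiation transfer curves gives the threshold voltage shift directly as the difference of the two extracted values.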
the impact of this ionizing radiation effect is also experimentally confirmed [39]. 176 t. pešić-brđanin fig. 16 theshold voltages vth for p and n-type finfet as function of nox and dit. 6. conclusion the modelling of ionizing radiation effects for cmos devices is presented in this paper. it is shown how surface potential equations can be modified with correctional potential, which is a result of existence of oxide charges and interface trapped charges. auxiliary diode circuits were used for determining modified surface potentials, while for obtaining electric characteristics of devices, two approaches were used, previously developed non-stationary models for cmos devices and, second approach, vcvs (with controlled voltage obtained by diode circuits) in series with gate node of standard models. in comparison with tcad numerical simulations, the efficiency of suggested approaches for prediction of impacts of dynamic effects of both oxide and interface trapped charges on electrical characteristics of devices is shown. references [1] n. s. saks and m. g. ancona, "generation of interface states by ionizing radiation at 80k measured by charge pumping and subthreshold slope techniques," ieee trans. on nucl. sci., vol. 34, pp. 1348-1354, 1987. [2] i. esqueda, h. barnaby, "a defect-based compact modeling approach for the reliability of cmos devices and integrated circuits," solid-state circuits, vol. 91, pp. 81-86, 2014. [3] v. huard, cr. parthasarathy, a. guerin, e. pion, "cmos device design in reliability approach in advanced nodes," ieee irps conference, pp. 624-633, 2009. [4] v. huard, "two independent components modeling for negative bias temperature instability," ieee irps conference, pp. 32-42, 2010. [5] a.v. sogoyan, a.s. artamonov, a.y. nikiforov, d.v. boychenko, "method for integrated circuits total ionizing dose hardness testing based on combined gammaand x-ray irradiation," facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 
329-338, 2014. spice modeling of ionizing radiation effects in cmos devices 177 [6] h.j. barnaby, m.l. mclain, i.s. esqueda, v. xiao jie, "modeling ionizing radiation effects in solid state materials and cmos devices," ieee trans. on circuits and systems i, vol. 56, pp. 1870-1833, 2009. [7] d. boychenko, o. kalashnikov, a. nikiforov, a. ulanova, d. bobrovsky, p. nekrasov, “total ionizing dose effects and radiation testing,” facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp. 153-164, 2015. [8] t.p. ma and p.v. dressendorfer, ionizing radiation effects in mos devices and circuits, new york: wiley, 1989. [9] m.m. pejovic, "p-channel mosfet as a sensor and dosimeter of ionizing radiation," facta universitatis, series: electronics and energetics, vol. 29, no. 4, pp. 509-541, 2016. [10] t. grasser, b. kacter, w. goes, t. aichinger, "a twostage model for negative bias temperature instability," ieee irps conference, pp. 33-44, 2009. [11] j.p. campbell, p.m. lenahan, a.t. krishnan, "nbti: an atomic-scale defect perspective," ieee irps conference, pp. 442-447, 2006. [12] w. wang, s. yang, s. bhardwaj, s. vrudhula, f. liu, y. cao, "the impact of nbti effect on combinational circuit: modeling, simulation and analysis," ieee trans. on vlsi syst. vol. 18, pp. 173– 83, 2010. [13] t. pešić, n. janković, "a compact non-quasi-static mosfet model based on the equivalent nonlinear transmission line", ieee trans. on computer-aided-design of integrated circuits and systems, vol. 24, pp. 1550-1561, 2005. [14] n. janković, t. pešić, "non-quasi-static physics based circuit model of fully-depleted double-gate soi mosfet", solid-state electronics, vol. 49, pp. 1086-1089, 2005. [15] g. a. ausman and f. b. mclean, "electron-hole pair creation energy in sio2," appl. phys. lett., vol. 26, pp. 173-177, 1975. [16] f. b. mclean and t. r. oldham, "basic mechanisms of radiation effects in electronic materials and devices," harry diamond laboratories technical report, vol. 
hdl-tr, pp. 2129, 1987. [17] esko mikkola, "hierarchical simulation method for total ionizing dose radiation effects on cmos mixed signal circuits", doctorate thesis, university of arizona, 2008. [18] f. b. mclean, "a framework for understanding radiation-induced interface states in sio2 mos structures," ieee trans. on nucl. sci., vol. 27, no. 6, pp. 1651-1657, dec. 1980. [19] s. m. sze, semiconductor devices, physics and technology, wiley, new york, 2008. [20] a.s. porret, j.-m. sallese, c. enz, "a compact non-quasi-static extension of a charge-based mos model," ieee trans. on electron devices, vol. 48, pp. 1647-1654, 2001. [21] m. miyake et al., "hisim-igbt: a compact si-igbt model for power electronic circuit design," in ieee trans. on electron devices, vol. 60, no. 2, pp. 571-579, feb. 2013. [22] g. gildenblat et al., "psp: an advanced surface-potential-based mosfet model for circuit simulation," in ieee trans. on electron devices, vol. 53, no. 9, pp. 1979-1993, sept. 2006. [23] j. r. brews, "a charge-sheet model of the mosfet", solid-state electronics, vol. 21, pp. 345-355, 1978. [24] f. van de wiele, "a long channel mosfet model," solid-state electronics, vol. 22, no. 12, pp. 991997, 1979. [25] m. miura-mattausch, u. feldman, a. rahm, m. bollu, d. savignac, "unified complete mosfet model for analysis of digital and analog circuits", ieee trans. on computer-aided design of integrated circuits and systems, vol. 15, pp. 1-7, 1996. [26] j. sleight, r. rios, "a continuous compact mosfet model for fullyand partially-depleted soi devices", ieee trans. on electron devices, vol. 45, pp. 821-825, 1998. [27] s. bolouki, m. maddah, a. afzali-kusha, m. el nokali, "a unified i-v model for pd/fd soi mosfets with a compact model for floating body effects", solid-state electronics, vol. 47, pp. 19091915, 2003. 
[28] nebojsa jankovic, tatjana pesic-brdjanin, "spice modeling of oxide and interface trapped charge effects in fully-depleted double-gate finfets", springer journal of computational electronics, vol. 14, no. 3, pp. 844-851, 2015. [29] silvaco atlas user's manual, http://www.silvaco.com, 2010. [30] ch helms, eh poindexter, "the silicon–silicon-dioxide system: its microstructure and imperfections," rep progr phys., vol. 57, pp. 791-852, 1994. [31] nh thoan, k. keunen, vv. afanas’ev, a. stesmans, "interface state energy distribution and pb defects at si(110)/sio2 interfaces: comparison to (111) and (100) silicon orientations," journal of appl. phys., 2011; 109:013710. 178 t. pešić-brđanin [32] j.-p. colinge (ed.), finfets and other multi-gate transistors, springer, 2008. [33] h. kukner, p. weckx, p. raghavan, b. kaczer, f. catthoor, lauwereins r. van der perre, g. groeseneken, "bti reliability from planar to finfet nodes," in proc. of the 3rd workshop on manufacturable and dependable multicore architectures at nanoscale (median'14), pp.11-14, 2014. [34] n. paydavosi, s. venugopalan, y.s. chauhan, j.p. duarte, s. jandhyala, a.m. niknejad, c.c. hu, "bsim-spice models enable finfet and utb ic designs," ieee access, vol. 1, pp. 201-215, 2013. [35] s. yao, t.h. morshed, d.d. lu, s. venugopalan, w. xiong, c.r. cleavelin, a. m. niknejad, c. hu, "global parameter extraction for a multi-gate mosfets compact model," in proc. of the ieee international conference on microelectronic test structures (icmts), pp. 194-197, march 2010. [36] h r. khan, d. mamaluy, d. vasileska, "approaching optimal characteristics of 10-nm highperformance devices: a quantum transport simulation study of si finfet," ieee trans. on electron devices, vol. 55, no. 3, pp. 743-752, march 2008. [37] t. 
pesic-brdjanin and nebojsa janovic, "sub-circuit model of fully-depleted double-gae finfet including the effects of oxide and interface trapped charge", in proceedings of the 16 th edition of ieee region 8 eurocon conference, pp. 273-276, salamanca, spain, september 2015. [38] a. ortiz-conde, f.j. garcia sanchez, j.j. liou, a. cerdeira, m. estrada, y. yue, "a review of recent mosfet threshold voltage extraction methods," microelectronics reliability, vol. 42, pp. 583-596, 2002. [39] yang-kyu choi, daewon ha, e. snow, j. bokor and tsu-jae king, "reliability study of cmos finfets," in proc. of the ieee international electron devices meeting, 2003. iedm '03, washington, dc, usa, 2003, pp. 7.6.1-7.6.4. instruction facta universitatis series: electronics and energetics vol. 27, n o 1, march 2014, pp. 103 112 doi: 10.2298/fuee1401103k multiple quantum walkers on the line using hybrid coins: a possible tool for quantum search  ioannis g. karafyllidis 1 , paul isaac hagouel 2 1 democritus university of thrace, department of electrical and computer engineering, 671 00 xanthi, greece 2 optelec, 11 chrysostomou smyrnis street, 54 622 thessaloniki, greece abstract. in this paper discrete quantum walks with different coins used for odd and even time steps are studied. these coins are called hybrid. the calculation results are compared with the most frequently used coin, the hadamard transform. furthermore, quantum walks on the line which involve two or more quantum walkers with hybrid coins are studied. quantum walks with entangled walkers and hybrid coins are also studied. the results of these calculations show that the proposed types of quantum walks can be used for quantum search, because the walker can be directed towards preferred directions and can also be confined in certain segments of the line. key words: quantum walk, quantum computing, simulation max 1. introduction quantum walks are quantum versions of classical random walks. 
they were first introduced in 1993 [1], and considerable work has been done on the subject since then. quantum walks are useful models for physical processes such as brownian motion and may serve as a basis for the development of new quantum algorithms [2], [3]. furthermore, the quantum walk is a natural model for quantum search using parallel quantum computer architectures [4], [5]. quantum walks may also become an effective tool for studying biological systems [6]. several studies of continuous-time and discrete-time quantum walks on the line [7], [8] and some implementation proposals [9]-[11] have been published. the effect of noise on the discrete-time quantum walk has also been studied [12]. recently, a study of a quantum walk on the line with one walker and several coins has been published [13]. on the other hand, a quantum walk on the line with two entangled walkers and one coin, the hadamard transform, is studied in [14]. this paper presents a study of discrete quantum walks in which multiple walkers use hybrid coins. "hybrid coin" means the use of two different coins, one for odd and one for even time steps. the question to be answered by this study is: is it possible to use multiple quantum walkers and hybrid coins to direct the search towards a desired direction and, furthermore, is it possible to confine the search to a desired segment of the line? calculation results show that this is possible. received january 9, 2014. corresponding author: ioannis g. karafyllidis, democritus university of thrace, department of electrical and computer engineering, 671 00 xanthi, greece (e-mail: ykar@ee.duth.gr). 2. quantum walk on the line with hybrid coins in a discrete quantum walk a walker (which can be a particle or a state) moves on a one-dimensional periodic lattice.
the sites of this lattice are numbered by: $$i = 0, \pm 1, \pm 2, \ldots, \pm n \quad (1)$$ the hilbert space of the discrete quantum walk comprises two subspaces: the location subspace $H_L$, which is spanned by the basis $$|i\rangle:\ |-n\rangle, \ldots, |-2\rangle, |-1\rangle, |0\rangle, |1\rangle, |2\rangle, \ldots, |n\rangle \quad (2)$$ and the two-dimensional coin subspace $H_C$, which is spanned by the two coin basis states $|0\rangle$ and $|1\rangle$. the hilbert space $H$ of the quantum walk is: $$H = H_L \otimes H_C \quad (3)$$ the state of the quantum walker found at location $j$ with the coin in state $|0\rangle$ is $|j, 0\rangle$. the quantum walk usually starts with the walker in state $|0, 0\rangle$. at each step of the walk two operations are applied to the walker state. the coin toss operation $C$, which acts on the coin state, is applied first: $$C|j,0\rangle = c_{0,0}|j,0\rangle + c_{0,1}|j,1\rangle, \qquad C|j,1\rangle = c_{1,0}|j,0\rangle + c_{1,1}|j,1\rangle \quad (4)$$ any two-dimensional unitary transformation can be used as a coin toss operation. usually the hadamard transform $H$ is used. in this case: $$\begin{pmatrix} c_{0,0} & c_{0,1} \\ c_{1,0} & c_{1,1} \end{pmatrix} = H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \quad (5)$$ the second operation applied is the walker shift operation $S$, which acts on the location state and is given by: $$S = \sum_{j=-n}^{n} \left( |1\rangle\langle 1| \otimes |j+1\rangle\langle j| + |0\rangle\langle 0| \otimes |j-1\rangle\langle j| \right) \quad (6)$$ this operation shifts the walker to the right (towards $+n$) if the coin state is $|1\rangle$ and to the left if the coin state is $|0\rangle$. the probability distribution for a discrete quantum walk with initial walker state $|0, 0\rangle$, in which the hadamard transform is used as the coin, is shown in figure 1. the probability distribution is biased towards the left because of quantum interference. the initial walker state $|0, 1\rangle$ results in a probability distribution biased towards the right. fig. 1 probability distribution for a quantum walk after 40 steps. the initial state is $|0, 0\rangle$ and the hadamard transform is used for the coin toss. the dependence of the probability distribution on the initial coin state leads naturally to the question: what is the probability distribution when two coins are used alternately? the case where the coin used in odd steps is the hadamard transform and the coin used in even time steps is a phase shift, $P$, is considered first. the phase shift is given by: 1 1 1 i p e (7) figure 2 shows the probability distribution in this case. the initial walker state is $|0, 0\rangle$ and the phase angle $\phi$ is 60°. the probability distribution is biased towards the left, as in the case where only the hadamard transform was used, but the probability of finding the walker in certain locations, which are periodically distributed, is larger. on the other hand, there is zero probability of finding the walker in many more locations than in the case where only the hadamard transform was used. this is an expected result for all phase angles in the case of a single walker. the phase shift is important in the case of multiple walkers, because it affects quantum interference. fig. 2 probability distribution for a quantum walk after 40 steps in the case where two coins, namely $H$ and $P$, are used alternately. the initial walker state is $|0, 0\rangle$. the case where the coin used in odd steps is the hadamard transform and the coin used in even time steps is a more general transform, $G$, is considered next. the transform $G$ is given by: cos ( ) sin ( ) sin ( ) cos ( ) g (8) figure 3(a) shows the probability distribution after 40 steps in this case for $\varphi$ = 30°. the initial walker state is $|0, 0\rangle$. the walker is localized in the region [-10, +10]. the order of magnitude of the probability of finding the walker in locations outside this region is $10^{-3}$.
it is therefore acceptable to say that using the aforementioned hybrid coin we can confine the walker within a certain segment of the line. this walker can be confined in any segment of the line [x-10, x+10] by setting the initial walker state to , 0x . different values of φ result in walker confinement in regions with different sizes. for example, figure 3(b) shows the probability distribution after 40 steps in this case where φ = 55 o and initial walker state 20 , 0 . in this case the walker is confined in the region [18, 22] or [20-2, 20+2], that is ±2 around its initial location. multiple quantum walkers on the line using hybrid coins: a possible tool for quantum search 107 (a) (b) fig. 3 probability distribution for a quantum walk after 40 steps in the case where the coins h and p are used alternatively. (a) initial walker state 0 , 0 and φ=30 o . (b) initial walker state 20 , 0 and φ=55 o . 108 i. g. karafyllidis, p. i. hagouel 3. multiple walkers on the line more that one walker can be used in order to exploit quantum interference. a number of w walkers can be used. these walkers can be distinct particles each one with a different initial state. in this case the initial state of the quantum walk, in w , is given by: 1 1 1 2 2 2 3 3 3, , , ,in w w ww a w c a w c a w c a w c     (9) with 2 2 2 2 1 2 3 1 w a a a a     (10) multiple walkers can also be states of the same particle located initially at different locations. in this case: 2 2 2 2 1 2 3 1 w a a a a     (11) figure 4 shows the probability distribution after 40 steps of quantum walk. a hybrid coin h and p with ф = 40 o is used. the initial state for this walk is: 1 1 14 , 0 12 , 0 2 2 in w     (12) the calculation results show that the walk is directed towards left. fig. 4 probability distribution after 40 steps of quantum walk with a hybrid coin h and p with ф = 40 o . the initial state for this walk is given by equation (12). 
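the walks described above (coin toss followed by shift, with different coins on odd and even steps and an arbitrary superposition of starting locations) can be reproduced with a short numerical sketch. this is an illustration under stated assumptions rather than the code behind the figures: the function names are hypothetical, the lattice is made large enough that the periodic boundary is never reached, and the phase coin is taken to be the standard phase gate diag(1, exp(i·ф)), which may differ from the form used in eq. (7).

```python
import numpy as np

HADAMARD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def phase_coin(phi_deg):
    """Assumed standard phase gate diag(1, exp(i*phi)); phi in degrees."""
    phi = np.deg2rad(phi_deg)
    return np.array([[1, 0], [0, np.exp(1j * phi)]], dtype=complex)


def hybrid_walk(steps, coins, initial, n=None):
    """Discrete-time quantum walk on a line with cyclically alternating coins.

    coins   : list of 2x2 unitary matrices; coins[0] is used on step 1,
              coins[1] on step 2, and so on, cyclically.
    initial : dict {(position, coin_state): amplitude}; it is normalised here.
    Returns (positions, probabilities).
    """
    max_start = max(abs(pos) for pos, _ in initial)
    if n is None:
        n = steps + max_start + 1          # lattice large enough to avoid wrap-around
    psi = np.zeros((2 * n + 1, 2), dtype=complex)
    for (pos, coin_state), amp in initial.items():
        psi[pos + n, coin_state] = amp
    psi /= np.linalg.norm(psi)
    for t in range(steps):
        coin = coins[t % len(coins)]
        psi = psi @ coin.T                          # coin toss at every site
        shifted = np.empty_like(psi)
        shifted[:, 0] = np.roll(psi[:, 0], -1)      # coin state 0 moves left
        shifted[:, 1] = np.roll(psi[:, 1], 1)       # coin state 1 moves right
        psi = shifted
    positions = np.arange(-n, n + 1)
    prob = (np.abs(psi) ** 2).sum(axis=1)
    return positions, prob


if __name__ == "__main__":
    # single walker, Hadamard coin only, initial state |0, 0>
    pos, p_h = hybrid_walk(40, [HADAMARD], {(0, 0): 1.0})
    print("left/right probability:", p_h[pos < 0].sum(), p_h[pos > 0].sum())
    # hybrid coin (H on odd steps, assumed phase gate on even steps)
    pos, p_hp = hybrid_walk(40, [HADAMARD, phase_coin(60)], {(0, 0): 1.0})
    print("sites with non-negligible probability:", int((p_hp > 1e-12).sum()))
```

a multi-walker initial state is passed as several entries of the `initial` dictionary, for example two entries with equal amplitudes for a two-walker superposition. any other 2x2 unitary, such as the transform $G$, can be supplied in the `coins` list in the same way.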
multiple quantum walkers on the line using hybrid coins: a possible tool for quantum search 109 figure 5 shows the probability distribution after 40 steps of quantum walk in which a hybrid coin h and g with φ = 30 o is used. the initial state for this walk is: 1 1 18 , 0 18 , 1 2 2 in w    (13) the walkers are confined into two segments of the line. from figure 5 it is evident that the two probability patterns are located symmetrically to the left and to the right of the origin and are exactly the same. this walk results in two probability patterns which are the same and are displaced by 36 line sites. fig. 5 probability distribution after 40 steps of quantum walk with a hybrid coin h and g with φ = 30 o . the initial state for this walk is given by equation (13). a quantum walk with two entangled walkers will be considered next. the initial state of the walk is: 1 1 6 , 0 5 , 1 2 2 in w    (14) α hybrid coin h and g with φ = 50 o is used. the calculation results are shown in figure 6. the probability distribution pattern has mirror symmetry with respect to the origin. more than two walkers can be used. let us examine the case of a quantum walk with three walkers in which a hybrid coin h and g with φ = 30 o is used. the initial state is: 1 1 1 10 , 1 5 , 1 11 , 0 2 2 2 in w      (15) 110 i. g. karafyllidis, p. i. hagouel fig. 6 probability distribution after 40 steps of quantum walk with a hybrid coin h and g with φ = 50 o . the initial state for this walk is given by equation (14) the calculation results shown in figure 7 indicate that the walkers are confined within the region [-20, 20]. there is a non-zero probability to locate a walker in every location within the region [-15, 1]. fig. 7 probability distribution after 40 steps of quantum walk with a hybrid coin h and g with φ = 30 o . 
the initial state for this walk is given by equation (15) multiple quantum walkers on the line using hybrid coins: a possible tool for quantum search 111 the periodic structure of probability distribution shown in figure 8 was obtained using four walkers with hybrid coin h and g (φ = 30 o ). the initial state was: 1 1 1 1 20 , 0 10 , 1 10 , 0 20 , 1 2 2 2 2 in w       (16) a periodic probability distribution with more periods can be obtained using more walkers. figures 5 and 6 indicate that two periods correspond to two walkers. fig. 8 probability distribution after 40 steps of quantum walk with a hybrid coin h and g with φ = 30 o . the initial state for this walk is given by equation (16) 4. conclusions in this paper the study of discrete quantum walks involving multiple walkers and hybrid coins was presented. an analytical study of these quantum walks is probably impossible because the probability distribution patterns depend on the choice of hybrid coins, the number of the walkers and the initial states. a large variety of patterns can be achieved including walker confinement, walker direction and periodic patterns. the results presented here indicate that quantum walk on the line with multiple walkers using hybrid coins is an effective tool for quantum search. references [1] y. aharonov, l. davidovich and n. zagury, quantum random walks, physical review a, 48, 1993, 1687. [2] a. ambainis, quantum walks and their algorithmic applications, quant-ph/0403120. [3] j. kempe, quantum random walks: an introductory overview, contemporary physics, 44, 2003, 302. [4] i. g. karafyllidis, simulation of entanglement generation and variation in quantum computation, journal of computational physics, 200, 2004, 383. 112 i. g. karafyllidis, p. i. hagouel [5] i. g. karafyllidis, definition and evolution of quantum cellular automata with two qubits per cell, physical review a, 70, 2004, 044301. 
[6] tai-hsin hsu and su-long nyeo, diffusion coefficients of two-dimensional viral dna walks, physical review e, 67, 2003, 051911. [7] d. ben-avraham, e. m. bolt and c. tamon, one-dimensional continuous-time quantum walks, quantum information processing, 3, 2004, 295. [8] o. buerschper and k. burnett, stroboscopic quantum walks,quant-ph/0406039. [9] w. dur, r. raussendorf, v. m. kendon and h.-j. briegel, quantum walks in optical lattices, physical review a, 66, 2002, 052319. [10] j. du, h. li, x. xu, m. shi, j. wu, z. zhou and r. han, experimental implementation of the quantum random-walk algorithm, physical review a, 67, 2003, 042316. [11] h. jeong, m. paternostro and m. s. kim, simulation of quantum random walks using the interference of a classical field, physical review a, 69, 012310, 2004. [12] d. shapira, o. biham, a. j. bracken and m. hackett, one-dimensional quantum walk with unitary noise, physical review a, 68, 2003, 062315. [13] p. ribeiro, p. milman and r. mosseri, aperiodic quantum random walks, physical review letters, 93, 2004, 190503. [14] y. omar, n. paunkovic, l. sheridan and s. bose, quantum walk on a line with two entangled particles, quant-ph/0411065. instruction facta universitatis series: electronics and energetics vol. 29, n o 4, december 2016, pp. 689 700 doi: 10.2298/fuee1604689d a novel analytical method for the selective multiplierless linear-phase 2d fir filter function  jelena r. djordjević-kozarov, vlastimir d. pavlović university of niš, faculty of electronic engineering, niš, serbia abstract. in this paper, a novel analytical method for new class of selective linear-phase two-dimensional (2d) finite impulse response (fir) filter functions generated by applying a new modified 2d christoffel–darboux formula for classical orthogonal chebyshev polynomials of the first and the second kind is proposed. 
the fundamental research proposed in this paper is also illustrated by examples of 2d fir filters and by a comparison with the new class of multiplierless linear-phase 2d fir filter functions given in the literature. key words: 2d fir filter function, multiplierless, linear-phase, frequency response analysis, chebyshev polynomials, hilbert transform 1. introduction successful applications of orthogonal polynomials in filter theory are well known and are described in [1]. many problems in various scientific and technical areas have been solved by applying the classical christoffel-darboux formula for the classical orthogonal polynomials. the new class of explicit filter functions for continuous signals, generated by the classical christoffel-darboux formula for the classical jacobi orthogonal polynomials, is given in detail in [2]. the design of linear-phase fir filters for defined specifications is discussed in [3]. the grid density requirement for the design of fir filters, with a useful design rule, is presented in detail in [4]. in [5] the authors present the relationship between the accuracy and the frequency grid density in 2-d filter designs, and propose a new formula for determining the frequency grid spacing. a one-dimensional half-band linear-phase fir filter design approach is used efficiently in the realization of a 2d linear-phase fir filter in [6]. the paper [7] describes an approach for the design of 2d fir filters with multipliers. moreover, 2d fir filters with nonstandard specifications are designed using a transformation technique in [8, 9]. these designs are based on the transformation of one-dimensional fir filters, as well as on the direct application of approximation techniques in two dimensions. received october 13, 2015; received in revised form april 16, 2016. corresponding author: jelena r. djordjević-kozarov, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, serbia (email: jelena.djordjevic-kozarov@elfak.ni.ac.rs). a simple recurrence formula for computing the impulse response coefficients of the sinc$^N$ fir filter, consisting of a cascade of $N$ sinc filters, each of length $M$, is presented in [11]. the initial consideration for the synthesis of 2d fir filter functions is given in a short paper [12], where a christoffel-darboux formula for four orthogonal polynomials on two equal finite intervals is proposed for generating filter functions. the analytical method for the synthesis of multiplierless linear-phase 1d and 2d fir filter functions in an explicit form, using chebyshev orthogonal polynomials of the first kind, is described in detail in [13]. an analytical method for the synthesis of multiplierless linear-phase 2d fir filter functions in a compact form, which can have the effect of a hilbert transformer in the $z_2$ domain, is described in [14]. a novel analytical method for a new class of linear-phase 2d fir filter functions with the full effect of a hilbert transformer in the $z_1$ and $z_2$ domains is proposed in [15]. the main motivation for this research is the extreme property of the christoffel-darboux sum for the classical orthogonal polynomials, which provides new results in the continuous domain, in the 1d z domain and in the 2d z domain. these results are written in explicit form and are a significant contribution to filter theory. it should be emphasized that the multiplierless solutions are new solutions worthy of attention. quantization of the filter coefficients has no effect, because all the filter coefficients are equal in absolute value. this paper presents a further generalization of the previous research [4] in two dimensions. the proposed solution is a filter function in the $z_1$ domain and a hilbert transformer in the $z_2$ domain, and together with the solution from [14] it forms a complete whole.
an analytical method based on the christoffel-darboux formula for the classical orthogonal chebyshev polynomials of the first and the second kind is proposed in this paper in an explicit form in the continuous domain. in addition, the new class of linear-phase 2d fir digital filters, generated by the proposed modified formula and by direct mapping from the continuous domain into the 2d z domain, is given. examples of the efficient design of the new class of selective linear-phase 2d fir filter functions are also given for illustration.

2. review of the 2d fir filter function

a multiplierless linear-phase 2d fir filter function with two free real parameters is considered in this paper. a linear-phase 2d fir filter of order (m x m) is defined by

$$H(m, z_1, z_2) = K_1 \sum_{r=0}^{m} \sum_{k=0}^{m} h(m, r, k)\, z_1^{-r} z_2^{-k} \qquad (1)$$

where m is the desired order of the filter, $K_1$ is a real constant and $h(m, r, k)$ are the impulse response coefficients, which are real numbers. the squared filter frequency response, in absolute units, can be presented by

$$|H(m, z_1, z_2)|^2 = H(m, z_1, z_2)\, H(m, z_1^{-1}, z_2^{-1}), \quad \text{for } z_1 = e^{i\omega_1},\ z_2 = e^{j\omega_2} \qquad (2)$$

or

$$|H(m, e^{i\omega_1}, e^{j\omega_2})|^2 = H(m, e^{i\omega_1}, e^{j\omega_2})\, H(m, e^{-i\omega_1}, e^{-j\omega_2}) \qquad (3)$$

alternatively, in db, the squared filter frequency response can be presented by

$$a\,(\mathrm{dB}) = 20 \log \left| H(m, e^{i\omega_1}, e^{j\omega_2}) \right| \qquad (4)$$

3. new class of two-dimensional linear phase multiplierless fir filter functions

directly applying the formula proposed by eq.
(a.8), the new class of non-causal twodimensional symmetric fir filter functions can be obtained as 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 11 2 2 2 2 2 2 1 2 sin 12 2 2 ( ) ( ) ( , , ) sin 12 2 2 ( ) ( ) m r z z z z z z t u r ri i h r h r h m z z k z z z z z z t u r rj j h r h r                                                                      (5) or 2 2 2 2 1 2 1 1 1 2 2 (m, , ) [( ) ( )] 1 m r r r r r h z z ij k z z z z           (6) multiplying the eq. (6) with factor 2 2 1 2 n n z z    , the filter function can be generated as 2 2 2 2 2 2 2 2 1 2 1 1 1 2 2 1 ( , , ) [( ) ( )] m m r m r m r m r r h m z z ij k z z z z               (7) it is obvious from eq. (7) that the linear-phase fir filter contains no multipliers and has only adders. the frequency response, 1 2( , , ) i j h m e e   , can be defined as 1 2 ( 2 ) ( 2 ) 1 22 2 1 2 0 ( , , sin (2 ) sin (2 )) i m j m m i j r h m e e e e r r                 (8) and the magnitude characteristic is defined as 1 2 1 2 0 ( , , ) sin (2 ) sin (2 ) mi j r h m e e r r        (9) and the amplitude characteristic, 1 2 (2 , , )a m   , is defined as 1 2 1 2 0 ( , , ) sin (2 ) sin (2 ) m r a m r r       (10) the linear-phase function, 1 2 (2 , , )m   , can be defined as ( 2 ) ( 2 ) 1 22 2 1 2 ( , , ) i m j m e em             (11) 692 j. r. djordjević-kozarov, v. d. pavlović a filter function of k = 2r cascaded identical blocks can be written as (h(m, z1, z2)) 2r . if we propose that the filter function h(m, 2r, z1, z2) is performed as a product of three functions of successive orders 1m , m and 1m , than the form of the filter function can be given by: 2 1 2 1 2 1 2 1 2 (2 , , , ) [ ( 1, , ) ( , , ) ( 1, , )] r h r m z z h m z z h m z z h m z z   (12) 4. 
examples of the new class of two-dimensional linear phase fir filter functions the proposed design algorithm, for original 2d fir filter function, has limitations in addition to the filter dimension (6mr x 6mr) and in addition to the value of two free real integer parameters 2r and m. this means that the linear-phase characteristic forms are limited. in table 1, for some low values of 2r and m, the form of linear-phase characteristics and type of filter functions are given. when 2r is an even number, the proposed filter function has filter properties in both domains. table 1 explicit form of the linear phase characteristics of the proposed fir filter for some low values of free integer parameters 2r and m 2r m 1 ( 4 ) 2( 4 ) 2 2 1 2 2( , , , ) j mr i mr m r e e             type 4 4 ( 32 ) ( 32 )1 2 1 2 (4, 4, , ) i j e e             z1 filter z2 filter 4 5 40 401 2 1 2( 5, 4, , ) i i j j e e              z1 filter z2 filter 4 6 ( 48 ) ( 48 ) 1 2 1 2 (6, 4, , ) i j e e             z1 filter z2 filter 4 7 ( 56 )( 56 ) 1 2 1 2 (7, 4, , ) ji e e           z1 filter z2 filter 4 8 ( 64 )( 64 ) 1 2 1 2 (8, 4, , ) ji e e           z1 filter z2 filter 4 9 ( 72 ) ( 72 ) 1 2 1 2 (9, 4, , ) i j e e             z1 filter z2 filter 4 10 ( 80 )( 80 ) 1 2 1 2 (10, 4, , ) ji e e           z1 filter z2 filter 4 11 ( 88 ) ( 88 ) 1 2 1 2 (11, 4, , ) i j e e             z1 filter z2 filter using the standard technique, the amplitude, magnitude and phase characteristics are obtained from eq. (7) for the numerical values 2r = 4 and m = 6, and detailed characteristics of the filter function 1 2 ( , , ,2 )h m zr z are given in the following figures. a novel analytical method for the selective multiplierless linear-phase 2d fir filter function 693 (a) (b) fig. 
1 a) 3d plot of normalized amplitude characteristics of the proposed 2d fir filter for 2r = 4 and m = 6; b) zoomed view of panel a)

illustrative examples of the pass-band and stop-band characteristics of the considered fir filter function for given values of attenuation, $a(2r, m, \omega_1, \omega_2)$, are shown in fig. 2.

(a) (b)
fig. 2 2d contour plot of normalized magnitude characteristics: a) shape of the pass-band with attenuation of 0.28 db for 2r = 4 and m = 6; b) shape of the stop-band with attenuation of 100 db for 2r = 4 and m = 6

fig. 3 shows the initial part of the phase characteristic of the considered linear-phase multiplierless 2d fir filter function, for the same values of the free integer parameters, i.e., 2r = 4 and m = 6.

fig. 3 3-d plot of the phase characteristic of the proposed 2d fir filter for 2r = 4 and m = 6

5. comparison

in this part of the paper, the proposed solution for the multiplierless linear-phase 2d fir filter functions is compared with the solution described in [13]. for the same values of the free parameters, m and 2r, and the same value of the constant group delay, we compare the amplitude response characteristics, the cut-off frequencies of the pass-band and the cut-off frequencies of the stop-band. for the value of the free integer parameter 2r = 4, the solution described in [13] and the solution in this paper have the same filter property in both the z1 and the z2 domain. for the same value m = 6 we compare the amplitude response characteristics, as well as the cut-off frequencies of the pass-band with a defined attenuation of 0.28 db and the cut-off frequencies of the stop-band with an attenuation of 100 db.
we correctly compare two examples of filter functions that have the same values of free real integer parameters, and thus the same form of linear-phase characteristics which is given in a compact explicit form in the next expression ( 48 ) ( 48 )1 22 2 1 2 (6, 4, , ) i j e e             (13) in fig. 4 are shown the amplitude response characteristics of 2d fir filter of the solution from [13] and the proposed solution, respectively, for free parameters 2r=4 and m=6. a novel analytical method for the selective multiplierless linear-phase 2d fir filter function 695 (a) (b) fig. 4 3d contour plot of amplitude response characteristics, review of the comparison between: a) solution from [13], and b) the proposed filter from eq. (10) fig. 5 shows zoomed forms of pass-bands of filters with attenuation of 0.28 db of the solution from [13] and the proposed solution, respectively, for parameters 2r=4 and m=6. (a) (b) fig. 5 2d contour plot of normalized magnitude characteristics, shape of the pass-band with attenuation of 0.28 db; review of the comparison between: a) solution from [13], and b) the proposed filter from (10) in fig. 6 are shown the stop-bands of filters with attenuation of 100 db of the solution generated by expressions from [13] and proposed solution, respectively. 696 j. r. djordjević-kozarov, v. d. pavlović (a) (b) fig. 6 2d contour plot of normalized magnitude characteristics, shape of the stop-band with attenuation of 100 db; review of the comparison between: a) solution from [13], and b) the proposed filter from (10) in table 2 and table 3 are given the values of the surface area of pass-band and stopband, respectively, of considered 2d fir filter function for different values of given maximal attenuation compared with 2d fir filter function given in [13]. results in table 3 are given in (%) in relation to a total area of the amplitude characteristic. 
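the text does not state how the normalized pass-band and stop-band areas were computed. as a rough illustration only, the sketch below shows one plausible numerical approach: sample an amplitude response on a uniform grid over 0 ≤ ω1, ω2 ≤ π, convert it to attenuation relative to the peak, and count the fraction of grid points that satisfy each attenuation limit. the function names, the grid choice and the toy response are assumptions made for this example; the toy response is not the filter function of this paper.

```python
import math

def attenuation_db(amplitude, peak):
    """Attenuation in dB relative to the peak amplitude (inf where amplitude is 0)."""
    ratio = abs(amplitude) / peak
    return math.inf if ratio == 0.0 else -20.0 * math.log10(ratio)

def band_area_fractions(amplitude_fn, n, pass_db=0.28, stop_db=100.0):
    """Estimate pass-band and stop-band area fractions on an n x n midpoint grid
    over 0 <= w1, w2 <= pi. amplitude_fn(w1, w2) is any real amplitude response
    with a nonzero peak (assumption of this sketch)."""
    grid = [(k + 0.5) * math.pi / n for k in range(n)]
    samples = [abs(amplitude_fn(w1, w2)) for w1 in grid for w2 in grid]
    peak = max(samples)
    pass_count = sum(1 for a in samples if attenuation_db(a, peak) <= pass_db)
    stop_count = sum(1 for a in samples if attenuation_db(a, peak) >= stop_db)
    return pass_count / len(samples), stop_count / len(samples)

# Hypothetical toy response, NOT the filter from the paper:
# unit gain for w1 < pi/2 and 120 dB of attenuation elsewhere.
def toy_response(w1, w2):
    return 1.0 if w1 < math.pi / 2 else 1e-6

pass_frac, stop_frac = band_area_fractions(toy_response, n=10)
print(pass_frac, stop_frac)
```

for a real design, `toy_response` would be replaced by the amplitude characteristic under study, and n would be increased until the two fractions stop changing.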
table 2 normalized surface area of the pass-band for the proposed 2d fir filter function, for a given value of maximal attenuation, compared with the 2d fir filter function given in [13]

2r    m    a_pass (db)    proposed filter function    filter function proposed in [13]
4     6    0.28           24.32842191·10^-4           0.67792993·10^-4

(last two columns: normalized surface area of the filter function pass-band)

table 3 normalized surface area of the stop-band for the proposed 2d fir filter function, for a given value of maximal attenuation, compared with the 2d fir filter function given in [13]

2r    m    a_stop (db)    proposed filter function (%)    filter function proposed in [13] (%)
4     6    100            21.405425                       34.789375

(last two columns: normalized surface area of the filter function stop-band)

6. conclusion

this paper presents an original approach to the synthesis of multiplierless linear-phase 2d symmetric fir digital filter functions, which brings improvements to filter theory. the new christoffel-darboux formula for the classical orthogonal chebyshev polynomials of the first and the second kind is proposed in the appendix of this paper. the presented formula can be used for solving complicated problems of linear-phase 2d filter design with high selectivity and high order. the transition from the continuous domain into the 2d z domain is presented. this new formula can be directly applied in generating 2d filter functions. all parasitic effects, such as the gibbs phenomenon, are suppressed and there is no need for multipliers. filters designed by the proposed method can be applied in various areas, such as telecommunications, medicine, pharmacy, seismology, general localization and diagnostics, where they can be of special interest. illustrative examples of the 3d frequency responses and the corresponding 2d contour plots of the proposed linear-phase 2d fir filter are also presented.
these examples illustrate the advantages of the proposed approach and an efficient way of designing ultra-selective filters. the difference between the research described in [13] and the proposed new classes of filter functions, for even and odd values of the free parameter 2r, is the following. the formulae in the z domain proposed in this paper and in [13] are modeled on the extremal properties of the christoffel-darboux sum for the continuous classical orthogonal polynomials [3]-[5]. the proposed multiplierless filter functions do not suffer from the quantization of filter coefficients; they are well suited to real-time operation and require a minimum area in integrated technology implementation. the undesirable gibbs phenomenon, present in analog and in 1d digital filters, is completely eliminated by the proposed solution. in many practical applications that require minimum dc power dissipation, this solution realizes circuits without multipliers and without quasi-multipliers.

acknowledgement: the paper is a part of the research done within the project no. 32023, funded by the ministry of science of the republic of serbia.

references

[1] m. abramowitz, i. stegun, handbook of mathematical functions, national bureau of standards, applied mathematics series, usa, 1964.
[2] v. d. pavlović, “new class of filter functions generated directly by the modified christoffel–darboux formula for classical orthonormal jacobi polynomials”, international journal of circuit theory and applications, john wiley & sons, vol. 40, pp. 1059–1073, 2013.
[3] s. n. hazra, m. s. reddy, “design of circularly symmetric low–pass two–dimensional fir digital filters using transformation”, ieee transactions on circuits and systems, vol. 33, no. 10, pp. 1022–1026, 1986.
[4] r. h. yang, y. c.
lim, “grid density for design of one- and two-dimensional fir filters”, electronics letters, vol. 27, no. 22, pp. 2053–2055, 1991.
[5] s. h. low, y. c. lim, “frequency grid density for the design of 2-d fir filters”, electronics letters, vol. 32, no. 16, pp. 1460–1461, 1996.
[6] a. klouche-djedid, s. s. lawson, “simple design and realisation of linear phase 2d fir filters with diamond frequency support”, electron. letters, vol. 35, no. 14, pp. 1148–1150, 1999.
[7] n. vijayakumar, k. m. m. prabhu, “two-dimensional fir compaction filter design”, iee proc., vis. image process, vol. 148, no. 3, pp. 173–181, 2001.
[8] v. l. narayana murthy, a. makur, “design of some 2-d filters through the transformation technique”, iee proc., vis. image process, vol. 143, no. 3, pp. 184–190, 1996.
[9] b. g. mertzios, a. n. venetsanopoulos, “fast block implementation of two-dimensional fir digital filters via the walsh–hadamard decomposition”, international journal of electronics, vol. 68, no. 6, pp. 991–1004, 1990.
[10] s. k. mitra, digital signal processing, the mcgraw–hill companies, new york, usa, 1998.
[11] s. c. dutta roy, “impulse response of sinc^n fir filters”, ieee transactions on circuits and systems, vol. 53, no. 3, pp. 217–219, 2006.
[12] d. g. ćirić, v. d. pavlović, “linear phase two-dimensional fir digital filter functions generated by applying christoffel-darboux formula for orthonormal polynomials”, elektronika ir elektrotechnika, vol. 4, pp. 39–42, 2012.
[13] v. d. pavlović, n. doncov, d. g. ćirić, “1d and 2d economical fir filters generated by chebyshev polynomials of the first kind”, int. journal of electr., vol. 100, no. 11, pp. 1592–1619, 2013.
[14] j. r. djordjevic-kozarov, v. d. pavlovic, “an analytical method for the multiplierless 2d fir filter functions and hilbert transform in z2 domain”, ieee trans. on circuits and systems ii: express briefs, vol. 60, no. 8, pp. 527–531, 2013.
[15] v.d. pavlovic, j.r.
djordjevic-kozarov, “ultra-selective spike multiplierless linear-phase two-dimensional fir filter function with full hilbert transform effect”, iet circuits devices syst., vol. 8, no. 6, pp. 532–542, 2014.

appendix

1. proposed mathematical background

if $U_{r+1}(x)$ and $U_{r+1}(y)$ are two sets of the orthogonal chebyshev polynomials of the second kind [5], where x and y are real variables and r is the order of the continuous non-periodic polynomials on the finite intervals $-1 \le x \le 1$ and $-1 \le y \le 1$, respectively, with regard to the nonnegative continuous weight functions $w_1(x)$ and $w_2(y)$, defined as

$$w_1(x) = \sqrt{1 - x^2} \qquad (a.1)$$

and

$$w_2(y) = \frac{1}{\sqrt{1 - y^2}} \qquad (a.2)$$

then, for the orthogonal chebyshev polynomials of the first kind, $T_r(y)$, and of the second kind, $U_{r+1}(x)$, the following relations are valid

$$\int_{-1}^{1} w_1(x)\, U_{m+1}(x)\, U_{k+1}(x)\, dx = 0; \quad m \ne k; \quad m, k = 0, 1, 2, 3, \ldots \qquad (a.3)$$

and

$$\int_{-1}^{1} w_2(y)\, T_m(y)\, T_k(y)\, dy = 0; \quad m \ne k; \quad m, k = 0, 1, 2, 3, \ldots \qquad (a.4)$$

for the polynomial $T_r(y)$, the r-th order norm $h_2(r)$ is

$$h_2(r) = \int_{-1}^{1} w_2(y)\, (T_r(y))^2\, dy = \begin{cases} \pi, & r = 0 \\ \pi/2, & r = 1, 2, 3, \ldots \end{cases} \qquad (a.5)$$

and for the polynomial $U_{m+1}(x)$, the m-th order norm $h_1(m)$ is

$$h_1(m) = \int_{-1}^{1} w_1(x)\, (U_{m+1}(x))^2\, dx = \frac{\pi}{2}, \quad m = 1, 2, 3, \ldots \qquad (a.6)$$

a novel analytical method for the linear phase two-dimensional symmetric fir digital filter functions generated by applying the modified christoffel-darboux formula with alternating sign.
components of that sum are determined by multiplication of orthogonal classical chebyshev polynomials of the first and the second kind, with x and y as a real continual variables, ur+1(x), ur+1(y) and tr(x), tr(y), r = 1,2,...,n (where n is the order of the continual orthogonal polynomials), on the equal finite interval [1,1], is proposed in the following explicit form of the orthogonal components: or 1 1 1 1 2 1 2 ( ) ( ) ( ) ( ) ( , ) sin ( ) sin ( ) ( ) ( ) ( ) ( ) n r r r r n r t x u x t y u y x y x y h r h r h r h r       (a.7) i. e. 2 1 1 1 2 ( , ) sin ( ) sin ( ) ( ) ( ) ( ) ( ) n n r r r r r x y x y t x u x t y u y              (a.8) using the standard technique, the eq. (a.8) can be mapped into the new domains, analogue, s, and digital, z, [3, 4, 17-19]. thus, in the z1 domain for example, the following relations are always valid: 1 1 1 1 1 1 22 2 2 2 2 ( cos ) cos ( ) ( ) / 2 ( ) / 2 ( cos ) cos ( ) ( ) / 2 ( ) / 2 j k j k k k k j kj k k k k t x k e e z z t y k e e z z                        (a.9) where tk (x = cos 1) and tk (y = cos 2) are the orthogonal continuous chebyshev polynomials of the first kind, and 1 1 1 1 1 1 1 2 2 2 1 2 2 2 sin ( ) ( ) sin ( ) ( ) /(2 ) ( ) /(2 ) sin ( ) ( y ) sin ( ) ( ) /(2 ) ( ) /(2 ) i k i k k k k j k j k k k k u x k e e i z z i u k e e j z z j                         (a.10) where uk+1 (x = cos 1) and uk+1 (y = cos 2) are the orthogonal continuous chebyshev polynomials of the second kind. as we can see, for odd k, e.g. k = 9, following eq. (a.12) and eq. (a.13) the orthogonal chebyshev polynomials tk (y) and uk+1 (x), respectively, becomes 700 j. r. djordjević-kozarov, v. d. 
pavlović

$$T_9(y) = \cos(9\omega_2) = (e^{j9\omega_2} + e^{-j9\omega_2})/2 = (z_2^{9} + z_2^{-9})/2 \qquad (a.11)$$

and

$$\sin(\omega_1)\, U_{10}(x) = \sin(9\omega_1) = (e^{i9\omega_1} - e^{-i9\omega_1})/(2i) = (z_1^{9} - z_1^{-9})/(2i) \qquad (a.12)$$

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 223-234
doi: 10.2298/fuee1702223s

e-plane waveguide bandstop filter with double-sided printed-circuit insert

snežana stefanovski pajović 1, milka potrebić 1, dejan tošić 1, zoran stamenković 2
1 school of electrical engineering, university of belgrade, serbia
2 ihp, frankfurt (oder), germany

abstract. in this paper a novel design of an e-plane bandstop waveguide filter with a double-sided printed-circuit insert is presented. split-ring resonators are used as the resonating elements to obtain the bandstop response. the amplitude response of the waveguide resonator with a single resonating element on the insert is analyzed for various dimensions and positions of the split-ring resonator. the coupling between two resonators on the insert, in terms of their mutual distance, is considered as the next step in the filter design. various positions of the resonators are considered, including the case with the resonators on different sides of the insert, which is of interest for the proposed filter design. finally, a third-order bandstop filter with a double-sided printed-circuit insert, operating in the x-frequency band, is introduced. the filter response is analyzed for various distances between the resonators and for various positions of the resonator printed on the other side of the insert. the proposed filter design is simple and allows accurate fabrication, miniaturization and a relatively easy way to obtain a multi-band response, using resonators with different resonant frequencies on the different sides of the insert.

key words: e-plane waveguide filter, bandstop filter, split-ring resonator, double-sided printed-circuit insert

1.
introduction

waveguide filters are widely used components for communication systems operating with high-power signals. they are qualified as passive components with high quality factors and low losses [1]. for example, microwave waveguide filters are elements of various satellite and radar systems, either as bandpass or bandstop filters. e-plane waveguide filters, considered in this paper, are relatively simple to design, fabricate and measure. however, in spite of the simple design, there are many possibilities to implement a single e-plane insert using different resonating elements. various implementations, for different frequency bands, can be found in the available open literature, which confirms the great interest in waveguide filters among researchers in the area of microwave filter design.

received june 25, 2016; received in revised form october 3, 2016
corresponding author: milka potrebić, school of electrical engineering, bulevar kralja aleksandra 73, 11120 belgrade, serbia (e-mail: milka.p@mts.rs)

there are various solutions for the bandpass filter design, using simple or complex resonating elements on the insert. a bandpass filter with a ladder-type pattern on the substrate, for ka-band operation, can be found in [2]. an example of a bandpass filter with a t-shaped resonator operating in the x-band is introduced in [3]. furthermore, rectangular ring resonators (rrrs) are used for the ka-band bandpass filter design in [4], while the combination of c-shaped and central-folded stripline resonators (cfsrs) for the waveguide filter design is introduced in [5]. for the bandstop filters, solutions with split-ring resonators (srrs), quarter-wave resonators (qwrs) and other types of simple resonators can be found. bandstop filters using srrs with a single rejection band are proposed in [6]-[9], while multiple rejection bands are obtained in [10]-[11].
in [12], the authors have exemplified the use of the srr array for the waveguide filter design. folded srrs are used for the third-order ka-band bandstop filter in [13]. in [14], the possibility to obtain bandpass and bandstop filter responses using srrs with microstrip structures is explained and illustrated. a second-order bandstop filter with qwrs, combined with an srr as a coupling element, is introduced in [15]. a dual-band e-plane bandstop filter with qwrs is proposed in [16]. both of the latter filters are designed to operate in the x-band. simple rectangular resonating slots are used for single-band and dual-band filter design in [17]. for e-plane filters with multiple resonating elements on the insert it is important to couple them properly, as explained in [18].

the goal of our research is to design a novel e-plane filter using srrs. therefore, we propose a bandstop waveguide filter with a double-sided printed-circuit insert, using srrs with optimized parameters as the resonating elements, in order to obtain the bandstop response in the x-frequency band (f0 = 10 ghz). according to the available open literature, waveguide filter design with double-sided printed inserts is still not widespread. so far, several solutions for waveguide structures with double-sided printed-circuit inserts have been introduced. in [19], the operation of the x-band rectangular waveguide with a double-sided single ring resonator array is analyzed in the frequency range 2-10 ghz, in order to investigate the characteristics of metamaterials in the considered waveguides. furthermore, bandpass filters using various types of resonators (rrrs, c-shaped resonators and cfsrs), printed on different sides of the insert, are proposed in [20], for the w-band, and in [21], for the ka-band. the bandstop waveguide filter realization, using a double-sided printed-circuit insert with srrs, for x-band operation, as considered here, represents a novel solution.
the following steps are carried out to achieve the targeted filter design. the amplitude response of a waveguide resonator using a single srr is analyzed in terms of the dimensions and the position of the srr. furthermore, the coupling between two srrs on the same insert is considered in terms of their mutual distance. various positions of the srrs are observed, taking into account the possibility of having srrs on different sides of the insert as well. finally, a novel third-order bandstop filter with a double-sided printed-circuit insert is introduced. the filter response is analyzed in terms of the mutual distance between the srrs and the position of the srrs printed on the different sides of the insert. wipl-d software [22] is used to make three-dimensional electromagnetic (3d em) models of the considered structures and to perform 3d em full-wave simulations.

the advantage of the proposed design is simpler and more accurate fabrication when the distance between the resonators on the insert is critical. also, the novel design makes it possible to have so-called "overlapped" resonators, meaning that the srr on the other side of the insert does not necessarily have to be positioned between the other srrs, but may partly overlap with them. such a design contributes to the compactness of the structure and thereby meets demanding miniaturization requirements. another important aspect of the proposed design is that a multi-band filter response can be obtained relatively easily, by having resonators with different resonant frequencies on the different sides of the insert.

2. waveguide resonator using e-plane insert with srr

the amplitude response of the waveguide resonator using an e-plane insert with a single srr (figure 1a) is analyzed in terms of the parameters of the srr and its position.
the waveguide resonator and filter considered in this paper are designed using the standard rectangular waveguide wr-90 (width a = 22.86 mm, height b = 10.16 mm). they are excited by properly designed ports with probes (monopoles), placed at a distance of λg/4 from the short-circuited end of the port (λg is the guided wavelength in the waveguide). the te10 mode of propagation is observed. the printed-circuit insert is modeled using the copper clad ptfe/woven glass laminate tlx-8 (εr = 2.55, tanδ = 0.0019, h = 1.143 mm, t = 18 μm). the dimensions of the e-plane insert are apl = 22.86 mm, bpl = 10.16 mm. according to figure 1a, the parameters used for the srr centrally positioned on the insert are given in table 1. the obtained amplitude response is shown in figure 1b (f0 = 10 ghz, b3db = 193 mhz). the amplitude response is analyzed in terms of the dimensions of the srr and its position. the obtained results are presented in figure 2 and table 2.

(a) (b)
fig. 1 waveguide resonator using e-plane insert with a single srr: (a) 3d model, (b) amplitude response

table 1 dimensions of the srr in figure 1a

dimension [mm]    d1      d2     c      p      l
value             2.76    2.5    0.4    0.6    3.43

(a) (b) (c) (d) fig.
2 comparison of amplitude responses: (a) d1 varies, (b) c varies, (c) p varies, (d) l varies

table 2 comparison of amplitude responses for single srr

d1 varies (d2 = 2.5 mm, c = 0.4 mm, p = 0.6 mm, l = 3.43 mm)
d1 [mm]    f0 [ghz]    b3db [mhz]
2.6        10.276      190
2.8        9.945       194
3.0        9.641       192

c varies (d1 = 2.76 mm, d2 = 2.5 mm, p = 0.6 mm)
c [mm]     f0 [ghz]    b3db [mhz]
0.2        10.748      174
0.4        10.009      193
0.6        9.408       209

p varies (d1 = 2.76 mm, d2 = 2.5 mm, c = 0.4 mm, l = 3.43 mm)
p [mm]     f0 [ghz]    b3db [mhz]
0.4        9.750       193
0.6        10.009      193
0.8        10.238      193

l varies (d1 = 2.76 mm, d2 = 2.5 mm, c = 0.4 mm, p = 0.6 mm)
l [mm]     f0 [ghz]    b3db [mhz]
2.43       10.115      183
3.43       10.009      193
4.43       9.944       178

variation of the resonator length (d1) primarily influences the resonant frequency (a longer printed resonator provides a lower resonant frequency), while the 3-db bandwidth practically does not change. similarly, an increase of the gap width (p) moves the resonant frequency toward higher values, but the bandwidth remains the same. however, a change of the width of the printed trace (c) influences both the resonant frequency and the bandwidth: by increasing c, f0 decreases while the band becomes wider, and vice versa. it should be noted that a change of the trace width c causes a small change of l, in order to keep the srr centrally positioned regardless of its dimensions. furthermore, by moving the resonator up and down from its central position on the insert, both the resonant frequency and the bandwidth change. these results are important for the optimization of the parameters and positions of the srrs used for the filter design, in order to obtain the desired amplitude response.

3. coupling between two srrs differently positioned on e-plane insert

coupling between two srrs on the same e-plane insert, depending on their mutual distance, is analyzed for several different cases.
namely, possible solutions assume various orientations of the srrs in terms of gap position, and also various positions of the srrs, i.e. both srrs can be on the same side or on different sides of the insert. in order to be able to calculate the value of the coupling coefficient, the amplitude characteristic s21 [db] is observed when the resonators are practically decoupled from the ports, meaning that the excitation is weakened, as proposed in [23]. this is achieved by adding metal plates (s = 8 mm) on both ends of the insert, toward the ports (figure 3a). in this way, two characteristic frequencies (f1 and f2), denoting local maxima of the s21 characteristic (figure 3b), are obtained and used for the calculation of the coupling coefficient k, according to the following formula [24]:

$$k = \frac{f_2^2 - f_1^2}{f_2^2 + f_1^2}. \qquad (1)$$

(a) (b)
fig. 3 method of determining coupling coefficient: (a) 3d model with additional metal plates, (b) s21 characteristic with two local maxima

the inserts with two srrs considered for the coupling analysis are shown in figure 4. the srrs depicted using dashed lines are printed on the other side of the insert. both srrs have the same dimensions, given in table 1. however, for cases 1 and 2, l1 = l2 = 3.43 mm, and for case 3, l1 = 3.43 mm and l2 = 3.30 mm.

figure 5 shows the coupling coefficient k as a function of the distance d between the srrs. for all considered cases, the coupling gets weaker (i.e. k decreases) as the distance d increases. for cases 1 and 2, the coupling between the resonators is stronger compared to case 3, so it is analyzed for a wider range of values of the distance d. it can be noticed that the coupling is nearly the same for cases 1a and 1b, meaning there is no significant difference whether the srrs are printed on the same side or on different sides of the insert. however, for d ≤ 2 mm, there is a significant difference between the values of the coupling coefficient obtained for cases 2a and 2b. the same holds for cases 3a and 3b, for d ≤ 1 mm.
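as a small numerical illustration of eq. (1), the sketch below evaluates the coupling coefficient from two peak frequencies. the function name and the frequency values are hypothetical and are not taken from the simulations in this paper. the second line of output uses the usual small-splitting approximation k ≈ (f2 − f1)/f0, with f0 taken as the mean of the two peaks, which should agree closely with eq. (1) when the peaks are near each other.

```python
def coupling_coefficient(f1, f2):
    """Coupling coefficient k from the two split resonance frequencies (same units),
    following eq. (1): k = (f2^2 - f1^2) / (f2^2 + f1^2)."""
    lo, hi = sorted((f1, f2))          # make the result independent of argument order
    return (hi**2 - lo**2) / (hi**2 + lo**2)

# Hypothetical peak frequencies in GHz, NOT values from the paper.
f1, f2 = 9.9, 10.1
k = coupling_coefficient(f1, f2)

# Small-splitting approximation, shown only as a sanity check.
k_narrowband = (f2 - f1) / ((f1 + f2) / 2)

print(f"k = {k:.6f}")
print(f"narrowband estimate = {k_narrowband:.6f}")
```

repeating this calculation for each simulated distance d would give one point of a k(d) curve of the kind shown in figure 5.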
also, when both srrs are on the same side of the insert, the strongest coupling is obtained for case 2. on the other hand, when the srrs are printed on different sides of the insert, cases 1 and 2 provide stronger coupling, compared to case 3, for the same distance d.

fig. 4 srr inserts used for coupling analysis: (a) case 1, (b) case 2, (c) case 3

4. e-plane bandstop filter with double-sided printed-circuit insert using srrs

based on the aforementioned results, a third-order bandstop filter is developed using a double-sided printed-circuit insert with srrs. two srrs are printed on the same side of the insert, and the third one (central srr) is printed on the other side (depicted using a dashed line in figure 6a). the parameters of the srrs are given in table 3. dimensions of the insert are apl = 22.86 mm, bpl = 10.16 mm. the amplitude response of the proposed filter, for the distance d = 11 mm, is shown in figure 6b (f0 = 10 ghz, b3db = 277 mhz). the total length of the proposed filter is 0.456 λg.

the amplitude response of the filter is analyzed for various values of the distance d between the two outer srrs. according to the amplitude responses shown in figure 6c, an increase of the distance d results in a narrower bandwidth, while the center frequency remains practically the same.

furthermore, the influence of the position of the central srr on the filter response is investigated as well. the considered srr can be centrally positioned on the insert, as previously proposed, but it can also be shifted up or down, so it does not have to be in line with the outer srrs (figure 7a). the obtained amplitude responses, for various values of the shift, are compared in figure 7b. when the central srr is moved up or down by 1 mm, there is no significant change of the center frequency (less than 1 %).
however, the 3-db bandwidth is notably changed, particularly when the srr is moved up (in the considered case, the 3-db bandwidth is increased by 45 %). this property of the filter can be used for bandwidth tuning.

fig. 5 coupling coefficient k as a function of distance: (a) case 1, (b) case 2, (c) case 3 (horizontal axis: d [mm]; vertical axis: k; curves: cases 1a and 1b, cases 2a and 2b, cases 3a and 3b)

fig. 6 waveguide filter using e-plane insert with srrs: (a) 3d model and wipl-d model of the insert, (b) amplitude response, (c) comparison of amplitude responses for various values of distance d

table 3 dimensions of the srrs in figure 6a
dimension [mm]   di1    di2   ci    pi    li
r1 (i = 1)       2.76   2.5   0.4   0.6   3.43
r2 (i = 2)       2.77   2.5   0.4   0.6   3.43

fig. 7 waveguide filter using e-plane insert with shifted central srr: (a) models of the insert, (b) comparison of amplitude responses for various values of the shift

figure 8 shows a comparison of the amplitude responses of the filter with the double-sided printed-circuit insert (figure 6a) and the one in which all three srrs are printed on the same side of the insert. for both cases, dimensions of the corresponding srrs are the same, as well as the distance between them (d = 11 mm). as can be seen, there is no significant change of the filter response; the resonant frequency is the same, while the bandwidth is narrowed by 10 mhz, which is 3.6 % of the reference bandwidth.
however, a novel solution with srrs printed on both sides of the insert allows more accurate fabrication when the distance between the printed traces is critical, so the srrs can be closer to each other or can even overlap. in this manner, the requirements regarding device miniaturization can be easily met. also, multi-band filters can be developed having srrs with different resonant frequencies on the different sides of the insert, occupying less space compared to the solution when the srrs are printed on the same insert, next to each other, but separated enough to avoid undesired coupling. fig. 8 comparison of amplitude responses of e-plane bandstop filters with srrs (model 1: double-sided printed-circuit insert, model 2: single-sided printed-circuit insert) another possible solution with double-sided printed-circuit insert is shown in figure 9a. the outer srrs are oriented in such manner so their gaps are positioned on the left/right side. similarly as in the previous examples, the central srr is printed on the other side of the insert. dimensions of the srrs are given in table 4. the distance between the outer srrs is set to d = 9 mm. the filter length is equal to 0.392 λg. the amplitude response of the filter is shown in figure 9b (f0 = 10 ghz, b3db = 1027 mhz). as can be seen, a wide-band filter is obtained, using the proposed simple approach. the proposed filters are compared to the similar solutions from the available open literature (e-plane filters of the third order, with a single rejection band), in terms of the filter size on the printed insert. the filter given in figure 9a exhibits a smaller size than the ka-band filter presented in [13], whose length is 0.406 λg, while each of the filters given in figures 6a and 9a is shorter than filter in [7] (total length 0.501 λg) and x-band filters in [8] (total length 0.572 λg) and [17] (total length 1.766 λg). 
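the normalized lengths quoted above can be converted to physical lengths once the guided wavelength is known. the sketch below is a minimal illustration that assumes an air-filled standard wr-90 waveguide (width a = 22.86 mm) operating in the dominant te10 mode at f0 = 10 ghz; the waveguide width and mode are assumptions not stated in this section, and the function name is hypothetical.

```python
import math

C0_MM_PER_S = 299_792_458 * 1e3  # speed of light in mm/s


def guided_wavelength_mm(f_hz, a_mm):
    """Guided wavelength of the TE10 mode in an air-filled rectangular waveguide."""
    lambda0 = C0_MM_PER_S / f_hz   # free-space wavelength [mm]
    cutoff = 2.0 * a_mm            # TE10 cutoff wavelength [mm]
    if lambda0 >= cutoff:
        raise ValueError("frequency is below the TE10 cutoff")
    return lambda0 / math.sqrt(1.0 - (lambda0 / cutoff) ** 2)


# Assumed WR-90 width a = 22.86 mm and center frequency f0 = 10 GHz.
lambda_g = guided_wavelength_mm(10e9, 22.86)
for name, normalized in (("figure 6a", 0.456), ("figure 9a", 0.392)):
    print(f"{name}: {normalized} * lambda_g = {normalized * lambda_g:.1f} mm")
```

under these assumptions λg is about 39.7 mm, so the two proposed filters would be roughly 18.1 mm and 15.6 mm long. normalizing to λg is what makes the comparison with filters operating in other frequency bands meaningful.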
therefore, it can be concluded that compact structures are designed, with the possibility for further miniaturization. the filter order can be easily increased by adding resonators.

fig. 9 waveguide filter using e-plane insert with srrs of various orientations: (a) 3d model and wipl-d model of the insert, (b) amplitude response

table 4 dimensions of the srrs in figure 9a
dimension [mm]   di1    di2   ci    pi    li
r1 (i = 1)       2.8    2.5   0.4   0.6   3.28
r2 (i = 2)       2.76   2.5   0.4   0.6   3.43

5. conclusion

a novel design of an e-plane bandstop waveguide filter using a double-sided printed-circuit insert with srrs has been proposed. the design started with a model of the waveguide resonator using a single srr. the amplitude response has been thoroughly investigated in order to be able to optimize the parameters of the srrs for the filter design. the coupling between two srrs on the insert has been analyzed for various positions of the srrs and their orientation. since the double-sided printed-circuit insert is of interest for the presented research, the model with srrs printed on different sides has also been taken into account. for each considered case, it has been shown that the coupling becomes weaker as the distance between the srrs increases. based on these findings, the third-order e-plane filter with the double-sided printed-circuit insert is introduced. the amplitude response has been investigated in terms of the distance between the srrs and the position of the central srr. by moving the central srr up or down, the bandwidth can be tuned. it has been shown that the amplitude response of the filter with the double-sided printed-circuit insert matches relatively well with the response of the filter with all srrs printed on the same side of the insert.
however, the advantage of the novel solution has been recognized in the fact that printing resonators on different sides of the insert allows more accurate fabrication when the distance between the traces is critical, so the srrs can be closer to each other. proposed filter design provides the possibility to have various combinations of the resonators on the insert, resulting in different responses. in this manner, a wide-band filter using double-sided printed-circuit insert with srrs has been also introduced. besides the abovementioned advantage regarding fabrication precision, the proposed filter design allows overlapping resonators printed on the different sides of the insert, therefore providing for the device miniaturization. also, double-sided printing allows development of multi-band filters using single e-plane insert, having resonators with different resonant frequencies on the different sides of the insert. such layout of the srrs occupies less space on the insert, compared to the design assuming srrs on the same side, next to each other, separated enough to avoid undesired coupling of different bands. comparison with the similar solutions found in the available open literature has confirmed the proposed filter design in terms of the possibility for device miniaturization. it has been shown that the filters presented here occupy less space on the inserts than some previously reported filters of the same order, operating in different frequency bands, assuming that the filter length is normalized to the guided wavelength in the waveguide for the considered center frequency. the future work will be based on the different implementations of compact multi-band filters using e-plane double-sided printed-circuit inserts, which are recognized as relatively simple to design and fabricate, and can be used in real systems operating at microwave frequencies. 
acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under grant tr32005. references [1] r. j. cameron, c. m. kudsia, r. r. mansour, microwave filters for communication systems: fundamentals, design, and applications. new jersey: john wiley & sons, 2007. [2] z. wang, r. xu, b. yan, “a covering ka-band two-way switch filter module using a three-line and an e-plane waveguide band-pass filters“, int. j. rf microw. c. e., vol. 25, no. 4, pp. 305-310, may 2015. [3] d. budimir, o. glubokov, m. potrebic, “waveguide filters using t-shaped resonators“, electron. lett., vol. 27, no. 1, pp. 38-40, january 2011. [4] j. y. jin, x. q. lin, q. xue, “a miniaturized evanescent mode waveguide filter using rrrs”, ieee t. microw. theory, vol. 64, no. 7, pp. 1989-1996, july 2016. [5] j. y. jin, x. q. lin, y. jiang, q. xue, “a novel compact e-plane waveguide filter with multiple transmission zeroes”, ieee t. microw. theory, vol. 63, no. 10, pp. 3374-3380, october 2015. [6] a. shelkovnikov, dj. budimir, “miniaturized rectangular waveguide filters“, int. j. rf microw. c. e., vol. 17, no. 4, pp. 398-403, july 2007. [7] a. shelkovnikov, dj. budimir, “left-handed rectangular waveguide bandstop filters“, microw. opt. techn. let., vol. 48, no. 5, pp. 846-848, may 2006. [8] m. mrvić, m. potrebić, d. tošić, z. cvetković, “e-plane microwave resonator for realization of waveguide filters”, in proceedings of the 12th international saum conference on systems, automatic control and measurements. niš, serbia, 2014, pp. 205–208. [9] b. jitha, c. s. nimisha, c. k. aanandan, p. mohanan, k. vasudevan, “srr loaded waveguide band rejection filter with adjustable bandwidth”, microw. opt. techn. let., vol. 48, no. 7, pp. 1427-1429, july 2006. 234 s. stefanovski pajović, m. potrebić, d. tošić, z. stamenković [10] m. n. m. kehn, o. quevedo-teruel, e. 
rajo-iglesias, “split-ring resonator loaded waveguides with multiple stopbands“, electron. lett., vol. 44, no. 12, pp. 714-716, june 2008. [11] e. rajo-iglesias, o. quevedo-teruel, m. n. m. kehn, “multiband srr loaded rectangular waveguide”, ieee t. antenn. propag., vol. 57, no. 5, pp. 1571-1575, may 2009. [12] n. purushothaman, a. jain, w. r. taube, r. gopal, s. k. ghosh, “modeling and fabrication studies of negative permeability metamaterial for use in waveguide applications”, microsyst. technol., vol. 21, no. 11, pp. 2415-2424, november 2015. [13] j. y. jin, q. xue, “a type of e-plane filter using folded split ring resonators (fsrrs)”, in proceedings of asia-pacific microwave conference 2015. nanjing, china, 2015. [14] s.n. burokur, m. latrach, s. toutain, “influence of split ring resonators on the properties of propagating structures”, iet microw. antenna p., vol. 1, no. 1, pp. 94-99, february 2007. [15] s. stefanovski, m. potrebić, d. tošić, “a novel design of e-plane bandstop waveguide filter using quarterwave resonators”, optoelectron. adv. mat., vol. 9, no. 1-2, pp. 87-93, january 2015. [16] m. mrvić, s. stefanovski, m. potrebić, d. tošić, "novel implementation of dual-band bandstop waveguide filter using quarter-wave resonators", (in serbian), tehnika, vol. 64, no. 3, pp. 473-480, 2015. [17] r. lopez-villarroya, g. goussetis, “novel topology for low-cost dual-band stopband filters”, in proceedings of asia-pacific microwave conference 2009. singapore, singapore, 2009. [18] s. lj. stefanovski, “microwave waveguide filters using printed-circuit discontinuities”, ph.d. dissertation, school of electrical engineering, university of belgrade, belgrade, serbia, 2015. [19] c.-t. chiang, j.-c. liu, y.-c. huang, c.-p. kuei, y.-s. lee, k.-d. yeh, a. h.-c. chen, “both transversal negative permeability and backward-wave propagation in x-band waveguide with double-side srr metamaterials”, int. j. rf microw. c. e., vol. 26, no. 3, pp. 240-246, march 2015. [20] j. y. 
jin, q. xue, “novel w-band passband filters using the e-plane planar resonators”, ieee international workshop on electromagnetics: applications and student innovation competition (iwem) 2016. nanjing, china, 2016. [21] j. y. jin, x. q. lin, q. xue, “a novel dual-band bandpass e-plane filter using compact resonators”, ieee microw. wirel. co., vol. 26, no. 7, pp. 484-486, july 2016. [22] wipl-d pro 11.0, http://www.wipl-d.com, wipl-d d.o.o., belgrade, serbia, 2013. [23] r. l. villarroya, “e-plane parallel coupled resonators for waveguide bandpass filter applications“, ph.d. dissertation, heriot-watt university, edinburgh, scotland, uk, 2012. [24] j.-s. hong, microstrip filters for rf/microwave applications. new jersey: john wiley & sons, 2011. instruction facta universitatis series: electronics and energetics vol. 30, n o 1, march 2017, pp. 137 144 doi: 10.2298/fuee1701137k modified internal model control for a therapeutic robot  miloš d. kostić 1 , miroslav r. mataušek 2 , dejan b. popović 2,3 1 tecnalia, san sebastian, spain 2 university of belgrade, faculty of electrical engineering, belgrade, serbia 3 institute of technical sciences of the serbian academy of sciences and arts, belgrade, serbia abstract. we present the use of the modified internal model controller (mimc) and the “probability tube” (pt) action representation for robot-assisted upper extremities training of hemiplegic patients. the robot-assisted training session has two phases. during the first "demonstration" phase the robot learns from the therapist the target path through examples. in the second "exercise" phase the robot assists a patient to follow the target path. during this process, the control limits the interface force between the robot and the hand to be below the preset threshold (f = 50 n). 
the system allows the assessment of the range of movement, the positional error between the target and the reached position, the amount of added assistance (the interface force between the hand and the robot). we demonstrate the operation in two hemiplegic patients. the patients and therapist suggested after the tests that the new system is straightforward and intuitive for clinical applications. key words: stroke, disability, assistant robot, modified internal model control, assessment 1. introduction intensive repetition of functional movements is proven to be an efficient method of motor control relearning during the neurorehabilitation process [1]. robotic devices are inherently well suited for repetitive tasks as well as for providing the quantitative assessment of performed movements, which is why they are becoming the preferred tools to support such therapeutic modality [2]. two types of robot assistants are dominantly used for intensive exercise: 1) devices that assist the end-point movement of the arm and interface the patient at hand (e.g., mit-manus [3], braccio di ferro [4]) and 2) exoskeleton robots that assist individual arm joints and interface the arm at multiple points (e.g., armin [5], cozens arm robot [6]).  received may 24, 2016; received in revised form june 15, 2016 corresponding author: dejan b. popović institute of technical sciences of sanu, kneza mihaila 35, 11000 belgrade, serbia (e-mail: dbp@etf.rs) 138 m. d. kostić, m. r. mataušek, d. b. popović depending on the chosen therapy modality the robotic device can support, assist, resist or even perturb the movement of the arm/hand. to do so, robot assistants apply sophisticated methods for actuator control in position, velocity or impedance space. the rehabilitation gain is maximized when the device adapts to the patient’s performance in a manner which encourages the efforts, e.g., by providing "assistance-as-needed" or "faded guidance" [5 9]. 
to implement these complex assistance schemes the “haptic” approach, where the device acts on the patient with the force determined by the computer model, is frequently employed in the control of robot assistants [4, 5, 10, 11]. the essential elements of haptic control that are used in current robot assistants can be described with the following two equations:

$$T_{\mathrm{motor}}(t) = T_{\mathrm{intrinsic}}(q, \dot{q}, \ddot{q}, p) + T_{\mathrm{haptic}}(t) \tag{1}$$

$$T_{\mathrm{haptic}}(t) = J^{\top}(q)\, F_{\mathrm{haptic}}(t) \tag{2}$$

where $[q, \dot{q}, \ddot{q}]$ are kinematic variables and $p$ is a set of unknown parameters in the nonlinear model of intrinsic torque, $T_{\mathrm{intrinsic}}$ [4]. this torque relates to inertia, dissipative friction, and external forces (i.e., gravity). $J(q)$ is the jacobian of the device's geometry, and $F_{\mathrm{haptic}}$ is the targeted interface force between the arm and the apparatus. the application of such a system requires an adequate nonlinear model and experimental assessment of the unknown parameters $p$ for an extensive range of operating conditions. a difficulty is that on-line compensation of the intrinsic dynamics is highly complex [4].

another major practical problem for the implementation is the selection of the target trajectory for the hand that the robot needs to assist. we show here one possible method for solving two problems: 1) how to select a target trajectory which is suited to the current needs and dynamically changing abilities of the patient; and 2) how to translate this trajectory to the controller of a robot that is to be used in daily clinical work. we demonstrate a solution for both tasks in the case of point-to-point movements.

the demonstration is presented with a new 3d robot prototype (r3-beg), shown in fig. 1. the assumed principle for the operation of the r3-beg is the "teach-and-repeat" scenario [2, 7, 12], which is adopted in current clinical practice and present in some commercial devices [13].
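equation (2) above can be illustrated for a generic two-segment planar arm. the sketch below is only an illustration of the jacobian-transpose mapping from a target interface force to joint torques; the segment lengths, the angle convention (elbow angle measured relative to the first segment) and the function name are hypothetical and are not taken from the r3-beg design, and the intrinsic torque of equation (1) is not modeled.

```python
import math


def joint_torques(q_s, q_e, l1, l2, fx, fy):
    """Joint torques tau = J(q)^T F for a two-segment planar arm.

    q_s: shoulder angle [rad]; q_e: elbow angle relative to the first segment [rad];
    l1, l2: segment lengths [m]; (fx, fy): target interface force at the handle [N].
    """
    s1, c1 = math.sin(q_s), math.cos(q_s)
    s12, c12 = math.sin(q_s + q_e), math.cos(q_s + q_e)
    # Jacobian of the end-point position (x, y) with respect to (q_s, q_e).
    j11, j12 = -l1 * s1 - l2 * s12, -l2 * s12
    j21, j22 = l1 * c1 + l2 * c12, l2 * c12
    # Transpose product: each torque is one column of J dotted with the force.
    tau_s = j11 * fx + j21 * fy
    tau_e = j12 * fx + j22 * fy
    return tau_s, tau_e


# Hypothetical geometry: arm stretched along x, 10 N force along y.
print(joint_torques(0.0, 0.0, 0.30, 0.25, 0.0, 10.0))
```

in this stretched configuration the force acts perpendicular to both segments, so each joint torque is simply the force multiplied by the distance from that joint to the handle.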
the "teach-and-repeat" consists of the "demonstration" phase, in which the therapist and patient hold the endpoint of the robot, and the therapist selects a target trajectory based on heuristics; and the "exercise" phase, in which the robot assists the arm to move along the preferred trajectory with the force constraint (threshold maximum force) [13]. the following elements of the system are new: 1) the interface between the therapist, the patient and the robot used during the demonstration phase; 2) the action representation which translates the captured kinematics to the controller; 3) integration of the natural variability of the therapist’s movements into the target trajectory [14, 15]; 4) two-level control comprising at higher level velocity the set points selection in each movement phase, based on the “probability tube” (pt) action representation and at the low-level control implementation of the modified internal model control (mimc) [16, 17] to ensure offset-free following of the set point; and 5) motivating feedback based on the online assessment of the patient’s performance in the “exercise phase” (fig. 1). the presentation modified internal model control for a therapeutic robot 139 starts with the description of the robot and controller, and continues to the presentation of tests in two post-stroke patients. 2. the r3-beg robot assistant the r3-beg combines a two-segment planar manipulandum (arm) and a vertical slider (fig. 1). following the analogy with the patient’s arm, the joints were named shoulder (s) and elbow (e). fig. 1 the r3-beg robot for the arm exercise (left panel). the sketch of the robot arm showing the task (top left panel) and feedback presented to the therapist (bottom right panels). the handle (fig. 1) is instrumented by a set of force transducers allowing the estimation of the size and direction of the force acting at the handle in the plane orthogonal to the handle. 
this handle serves as the interface between the patient and the robot. the top part (extension) of the same handle is the interface between the therapist and the robot. this configuration allows the therapist to set the target trajectory by moving the end-point of the robot while the patient is holding the same handle. the force sensor is used in the second phase as the source of feedback for controlling the maximum assistive force constraint and for assessment of the added amount of assistance. high level control is based on methods described in [18], which suggested high rehabilitation potential, but required a sophisticated haptic platform. here the pt is used as a lookup table to determine velocity set point, based on current movement phase and performance. this can be presented as: 1k),i, k )i),t(v(pt1 )i),t(v(pt( 1 pt)t(refv      (3) 140 m. d. kostić, m. r. mataušek, d. b. popović where v(t) is current acceleration and i current phase. the factor k determines the level of allowed variability and is set up by the therapist. the low level control is based on two single-input-single-output mimc linear digital controllers [17] to control the shoulder and the elbow of the system. the essential characteristics of the mimc design and tuning concept from [18] are: it is well suited to exploit the benefits of prior knowledge and experience gained from the open-loop dynamics of the plant; the control system structure is directly obtainable from the model used to approximate process dynamics; a small number of tuning parameters, with clear meaning, followed by simple tuning rules, enough easy to apply. this concept also allows scalability of the presented solution, as it is suitable for designing multiple-input multipleoutput (mimo) neural network (nn) digital controllers [19]. 
measured variables on the plant are the elbow and shoulder positions, $p_e(t)$ [rad] and $p_s(t)$ [rad]; however, the controlled variables are the velocity of the elbow $v_e(t)$ [rad/s] and the velocity of the shoulder $v_s(t)$ [rad/s], which are obtained from

$$v_e(kT_s) = \bigl(p_e(kT_s) - p_e((k-1)T_s)\bigr)/T_s \tag{4}$$

$$v_s(kT_s) = \bigl(p_s(kT_s) - p_s((k-1)T_s)\bigr)/T_s \tag{5}$$

their dynamic characteristics are defined by the elbow velocity model $G_{mve}(s)$ and the shoulder velocity model $G_{mvs}(s)$, which are obtained from an open-loop step response test. models $G_{mve}(s)$ and $G_{mvs}(s)$ are defined by equations 6 and 7:

$$G_{mve}(s) = \frac{K_e e^{-L_e s}}{T_e^{2} s^{2} + 2\zeta_e T_e s + 1}, \tag{6}$$

$$G_{mvs}(s) = \frac{K_s e^{-L_s s}}{T_s^{2} s^{2} + 2\zeta_s T_s s + 1} \tag{7}$$

where $K_e = 0.00024$, $K_s = 0.00023$, $L_e = 0.07$, $L_s = 0.1$, $T_e = 0.04$, $T_s = 0.08$, $\zeta_e = \zeta_s = 0.7$ (in equation 7, $T_s$ denotes the shoulder time constant, whereas in equations 4 and 5 it denotes the sample time).

fig. 2 mimc controller block diagram, modified from fig. 2 in [17]

the velocity models $G_{mve}(s)$ and $G_{mvs}(s)$ were used to design and tune mimc velocity controllers, defined by the structure presented in fig. 2, modified from [17]. the elbow mimc velocity controller is defined by:

$$F_{re}(z) \equiv 1, \quad F_e(z) = \left(\frac{0.4z}{z - 0.6}\right)^{2}, \quad G_{le}(z) = z^{-4} \tag{8}$$

$$P_{m0e}^{-1}(z) = \frac{1}{0.00024}\,\frac{z^{2} - 1.3205z + 0.4966}{0.1761z^{2}} \tag{9}$$

where $z^{-1}$ represents the unit delay operator, $z^{-1} = e^{-sT_s}$. the shoulder mimc velocity controller is defined by

$$F_{rs}(z) = \frac{0.2z}{z - 0.8}, \quad F_s(z) = \left(\frac{0.2z}{z - 0.8}\right)^{2}, \quad G_{ls}(z) = z^{-5} \tag{10}$$

$$P_{m0s}^{-1}(z) = \frac{1}{0.00023}\,\frac{z^{2} - 1.6522z + 0.7047}{0.0525z^{2}} \tag{11}$$

both mimc controllers are implemented with the sample time of 0.02 s. we validated linear models of the r3-beg joints. the parameters of the models were estimated based on recordings of the open-loop step responses. the set-points to the elbow and shoulder controllers of the r3-beg are defined in the phase-plane by the procedure described in kostić et al. [14, 15].
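the numerical coefficients in equations 9 and 11 appear to be consistent with mapping the poles of the second-order models in equations 6 and 7 to the z-plane through z = exp(s·ts) with the 0.02 s sample time. the python sketch below checks this and also shows the backward-difference velocity estimate of equations 4 and 5. this is an observation based on the published numbers, not a statement of how the authors derived the controllers, and all function names are hypothetical.

```python
import math


def discrete_denominator(time_constant, zeta, sample_time):
    """Map the poles of 1/(T^2 s^2 + 2*zeta*T*s + 1) with z = exp(s*Ts).

    Returns (a1, a2) of the polynomial z^2 + a1*z + a2 (valid for zeta < 1).
    """
    sigma = zeta / time_constant                      # decay rate of the pole pair
    omega = math.sqrt(1.0 - zeta**2) / time_constant  # damped natural frequency
    radius = math.exp(-sigma * sample_time)
    a1 = -2.0 * radius * math.cos(omega * sample_time)
    a2 = radius**2
    return a1, a2


def backward_difference(positions, sample_time):
    """Velocity estimate from consecutive position samples (equations 4 and 5)."""
    return [(p1 - p0) / sample_time for p0, p1 in zip(positions, positions[1:])]


SAMPLE_TIME = 0.02  # s
for joint, time_constant in (("elbow", 0.04), ("shoulder", 0.08)):
    a1, a2 = discrete_denominator(time_constant, 0.7, SAMPLE_TIME)
    # 1 + a1 + a2 is the numerator factor that yields unit DC gain.
    print(f"{joint}: z^2 {a1:+.4f} z {a2:+.4f}, gain {1 + a1 + a2:.4f}")
```

the printed polynomials can be compared directly with the numerators of equations 9 and 11, and the printed gains with the factors 0.1761 and 0.0525 in their denominators.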
however, to test the closed-loop tracking performance of the low level control (mimc controllers, equations 8-11) without the influence of the higher level control algorithm, sinusoidal set-points defined in time were applied to the shoulder and the elbow control systems. results presented in fig. 3 were obtained for the control system defined in the loop with the mimc elbow velocity controller by equations 8 and 9.

fig. 3 closed-loop responses for the elbow in the loop with mimc elbow controllers (8) and (9): model (red line), plant (black line) and set-point (blue line).

3. implementation of the r3-beg

two hemiplegic patients signed the informed consent approved by the local ethics committee of the clinic for rehabilitation "dr miroslav zotović", belgrade, serbia. patient p1 had a small range of movement and was highly spastic, while patient p2 had a larger range of motion and less pronounced spasticity. the level of disability was assessed by an experienced clinician before the beginning of the tests (the ashworth spasticity scale (as), the action research arm test (arat), and the fugl-meyer (fm) motor test for upper extremities).

the session with r3-beg followed the previously described two-phase procedure. in the "demonstration phase", the therapist "presented" the movement to the patient and the robot by manipulating the handle while the patient held the instrumented handle and was instructed not to resist the imposed movement between the starting and end points. the robot was passive (decoupled motors), and sensors captured movement kinematics and interface force. each movement was repeated several times to create the action representation using a procedure described in detail elsewhere [13].
the obtained pt provided set-points to the elbow and shoulder mimc controllers of the r3beg in the phase-plane while the maximal force of assistance was defined as maximal interface force exerted by the therapist. in the "exercise phase", the robot assisted a patient to perform the desired point-topoint movement. there were two different movements, one in the ipsilateral direction and one in the contralateral direction. the starting position and the target were marked with a green and a red circle (diameters d = 4 cm), respectively. the handle was instrumented with a laser pointer which projected the position of the handle to allow the patient to know the position of the handle. data presented in figure 4 illustrate the performance of patients p1 and p2, respectively. the efficacy of the robotic intervention is documented by two objective measures: 1) the euclidian distance between the reached position and the target point, which relates to the range of movement, and 2) the interface force between the hand and the r3-beg, compared to the amount of provided assistance. these metrics were selected based on the recommendations of the european scientific community [19]. fig. 4 trajectories achieved by the patients p1 (severe spasticity left panels) for the two target points. f is the force. d is distance between the end point of the trajectory and the target t. right panels show the performance of patient p2 (mild spasticity) as shown in fig. 4 (left panels), the patient p1 was not able to completely perform the task and could not reach the target point in the case in which the handle needed to be moved to the contralateral side of his body (the distance between the endpoint of the movement and the target was 9.6 cm). however, he encountered fewer problems with the radial movement in the ipsilateral direction (d = 2.9 cm). the interface force indicates that the robot was assisting the movement all along the trajectory. 
the robot assisted the movement with significant force during the last 25 % of the movement (f ≈ 30 n). the force was gradually increasing to about 10 n during the first 75 % of the movement. this result is explained by the patient’s impairment (constraints introduced by spasticity and the decreased range of movement). fig. 4 (right panels) illustrates the performance of patient p2, characterized by mild spasticity. in this case, the interface force was substantially smaller compared with the interface force estimated during the tests with patient p1. the distance between the endpoint and the target was only 2 cm and the interface force never reached f = 15 n. this indicates that the patient used the robotic guidance to compensate for the lack of motor control, rather than the compromised range of motion, which supports the reported patient impairment.

4. conclusions

we developed a control method for a rehabilitation robot. the new system proved to be simple to tune and to implement in the clinical environment. the novel "teach-and-repeat" method for high-level control, described in [14, 15] and implemented in this scenario, was found to be useful for translating the therapist's skills and experience to the robot-assisted therapy. the signals from sensors used for control allow direct assessment of the differences between passive and active arm movements (range and smoothness of the movement and required force assistance). the force controlled interface (haptics) also allows the setup of the tasks that need to be trained to improve the performance.

acknowledgement: the work on this project was partly supported by the project no rs35003, ministry of education, sciences and technological development of serbia.

references
[1] g. kwakkel, "intensity of practice after stroke: more is better", schweizer archiv für neurologie und psychiatrie, vol. 160.7, pp. 295-298, 2009.
[2] t. nef, m. mihelj, and r.
riener, "armin: a robot for patient-cooperative arm therapy", medical & biological engineering & computing, vol. 45(9), pp. 887-900, 2007. [3] n. hogan, h. i. krebs, j. charnnarong, p. srikrishna and a. sharon, "mit-manus: a workstation for manual therapy and training i", in proceedings of the ieee international workshop robot and human communication, 1992, pp. 161-165. [4] m. casadio, v. sanguineti, p. g. morasso, and v. arrichiello, "braccio di ferro: a new haptic workstation for neuromotor rehabilitation", technology and health care, vol. 14(3), pp. 123-142, 2006. [5] t. nef, and r. riener, "armin-design of a novel arm rehabilitation robot", in proc. of the 9 th ieee international conference on rehabilitation robotics, 2005, pp. 57-60. [6] j. a. cozens, "robotic assistance of an active upper limb exercise in neurologically impaired patients", rehabilitation engineering, ieee transactions on, vol. 7(2), pp. 254-256, 1999. [7] l. marchal-crespo, and d. j. reinkensmeyer, "review of control strategies for robotic movement training after neurologic injury", journal of neuroengineering and rehabilitation, vol. 6(1), pp. 20, 2009. [8] m. casadio, p. giannoni, l. masia, p. g. morasso, g. sandini, v. sanguineti, v. squeri, and e. vergaro, "robot therapy of the upper limb in stroke patients: rational guidelines for the principled use of this technology", functional neurology, vol. 24 (4), pp. 195-202, 2009. [9] h. i. krebs, j. j. palazzolo, l. dipietro, m. ferraro, j. krol, k. rannekleiv, b. t. volpe, and n. hogan, "rehabilitation robotics: performance-based progressive robot-assisted therapy", autonomous robots, vol. 15 (1), pp. 7-20, 2003. [10] r. q. van der linde, p. lammertse, e. frederiksen, and b. ruiter, "the hapticmaster, a new high-performance haptic interface", in proc. eurohaptics, 2002, pp. 1-5. [11] r. loureiro, f. amirabdollahian, m. topping, b. driessen, and w. harwin, "upper limb robot mediated stroke therapy: gentle/s approach", autonomous robots, vol. 
15 (1), pp. 35-51, 2003.
[12] j. l. emken, s. j. harkema, j. a. beres-jones, c. k. ferreira, and d. j. reinkensmeyer, "feasibility of manual teach-and-replay and continuous impedance shaping for robotic locomotor training following spinal cord injury", biomedical engineering, ieee transactions on, vol. 55 (1), pp. 322-334, 2008.
[13] b. d. argall, s. chernova, m. veloso, and b. browning, "a survey of robot learning from demonstration", robotics and autonomous systems, vol. 57 (5), pp. 469-483, 2009.
[14] m. d. kostić, m. b. popović, and d. b. popović, "a method for assessing the arm movement performance: probability tube", medical & biological engineering & computing, vol. 51 (12), pp. 1315-1323, 2013.
[15] m. d. kostić, m. d. popović, and d. b. popović, "the robot that learns from the therapist how to assist stroke patients", new trends in medical and service robots. springer international publishing, pp. 17-29, 2014.
[16] m. r. mataušek and d. m. stipanović, "modified nonlinear internal model control", control and intelligent systems, vol. 26 (2), pp. 57-63, 1998.
[17] m. r. mataušek, a. d. mićić, and d. b. dacić, "modified internal model control approach to the design and tuning of linear digital controllers", international journal of systems science, vol. 33 (1), pp. 67-79, 2002.
[18] m. r. mataušek, d. m. miljković, and b. i. jeftenić, "nonlinear multi-input-multi-output neural network control of dc motor drive with field weakening", ieee transactions on industrial electronics, vol. 45 (1), pp. 185-187, 1998.
[19] "cost action td1006", http://www.rehabilitationrobotics.eu/2013.

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 451-460 doi: 10.2298/fuee1603451d

a smart home system based on sensor technology

boban davidović 1, aleksandra labus 2
1 vaimo norge as, oslo, norway
2 faculty of organizational sciences, university of belgrade, serbia

abstract.
this paper presents a new approach to using technology in a practical and meaningful manner within a smart home system that can be widely deployed in residential settings. in the modern world, people are rapidly turning to technology as a fast and cost-effective way of improving the quality of daily living. the primary goal is to address the needs of the end user by employing networked low-power sensors that are sensitive to the environment, so that the environment can be altered to the user's liking. the proposed system works in the following steps: direct environment sensing, collecting and analyzing data, and then allowing the user to customize the settings and initiate specific commands. this research presents the design and implementation of a practical and simple smart home system, which can be further extended. the system is based on a group of sensors, a raspberry pi device as a server system and bluetooth as a communication protocol. these devices can be easily controlled via a user-friendly interface for android phones. the main advantage of the proposed system is that it is a sensible, secure and easily configurable system that provides end users with a neat home automation solution.

key words: smart home, android, raspberry pi, bluetooth, sensor technology

1. introduction

a home is not "smart" because it is built in an environmentally friendly way, because of how it uses space, or because it uses solar power and recycles waste water; what makes it smart is the interactive technologies that it contains [1]. a smart home is called "intelligent" because its computer systems can monitor many aspects of daily life [2]. the concept of the smart home is a promising and cost-effective way of improving home care for the elderly and the disabled in a non-obtrusive way, allowing greater independence, maintaining good health and preventing social isolation [3]. a smart home consists of home appliances, sensors, actuators, and data processors and analyzers [4].
home automation of appliances can be either wired or wireless [5]. in this paper, a model of a smart home based on a raspberry pi and an android device is proposed. a mobile application has been developed in order to manage smart home behaviour. the main contribution of this paper is that it presents an easy to implement, flexible and scalable solution for building a smart home environment. the smart home system presented in this paper is based on a mobile device because of the constant growth in the usage of smartphones and tablets.

received june 12, 2015, received in revised form september 13, 2015
corresponding author: boban davidović, vaimo norge as, 0484 oslo, norway, nydalsveien 12 (e-mail: bobanbobanboban87@gmail.com)

2. literature review

a smart home is defined as a home that has programmable electronic controls and sensors that regulate heating, cooling, ventilation, lighting, and appliance and equipment operation in a way that responds to interior climate conditions in order to conserve energy [6][7]. smart homes use home automation technologies to provide homeowners with intelligent feedback and information by monitoring many aspects of a home on a daily basis [8]. the main elements of a smart home are [9][10]:
1. internal network – wire, cable, wireless.
2. intelligent control – gateway to manage the systems.
3. home automation – products within the homes and links to services and systems outside the home.
the range of different smart home technologies available is expanding rapidly along with developments in computer controls and sensors [11]. smart homes present exciting opportunities to change the way we live and work, and to reduce energy consumption at the same time [12]. there are already various implementations of smart homes. most of the implementations use wireless technologies for communication between home appliances and the main unit [13].
the main problem that people are trying to solve with a smart home is how to make a home that helps people automate regular daily activities [14], for example adjusting the home temperature, ensuring that the home has enough daylight and keeping the home secure. led by this idea, people have developed smart homes based on different technologies [15]:
1) smart home based on a custom microcontroller and a mobile application [16]. the smart home system uses bluetooth for communication between the mobile application and the system. its flexibility depends on the microcontroller used: some microcontrollers are used more widely than others, which makes the smart home systems built on them more flexible.
2) smart home based on a custom microcontroller and a computer [17]. the smart home system uses bluetooth for communication between appliances. it uses a computer as the entry point for communication between the user and the smart home system. the computer is connected to the microcontroller by wire.
3) smart home based on arduino and a mobile application [18]. the smart home system uses bluetooth for communication between the mobile application and the arduino. this system is flexible and scalable. the limitation of this system is the bluetooth range.
4) smart home based on a computer [19]. the smart home system uses wi-fi for communication between the appliances and the main computer. the main computer communicates with the appliances through a microcontroller. the main advantage of this system is that an unlimited number of appliances can be connected to it.
some of the solutions mentioned above use bluetooth for communication between the main computer / microcontroller and the appliances. also, some of the solutions are based on remote control using a mobile phone, which likewise uses bluetooth for communication between the phone and the main computer. smart homes today offer similar functionality to the end user. that functionality is based on the following [20]:
1) integration with smart appliances.
2) integration with sensors for tracking conditions in the smart home.
3) single point of control for the whole smart home.
4) remote control of the smart home [21].
the authors of [22] analyzed a smart home system based on an atmega microcontroller and bluetooth technology. this system consists of two main parts. the first part is a mobile application based on the android platform. the second part is an electronic circuit board that is used for control. in the main usage scenario, the android application communicates with the electronic circuit board using bluetooth as the communication channel. the android application required by this system can be downloaded free of charge from google play (the android application market). the electronic circuit board consists of a microcontroller, a bluetooth module, and a relay driver ic along with relays, which are used to switch the electrical loads on the circuit and to switch the power supply. the android application sends a command, which is received by the bluetooth module and forwarded over the usart serial interface to the microcontroller, which performs the necessary actions. the microcontroller used in this system is atmel's atmega 128, a high-performance, low-power 8-bit mcu based on a risc architecture. the operating voltage of this microcontroller is 5 v. a voltage regulator (lm7805) is used to obtain the desired voltage on this device [22]. the main advantage of this system is that it is easy to use and set up. the most important disadvantage of this system is that it is limited to bluetooth, so it is constrained by the bluetooth range. in another study, the authors of [23] analyzed an android-based smart home system based on bluetooth and arduino. this system uses an arduino micro web server as the main controller. it proposes the use of a mobile application based on the android os. it uses bluetooth and restful web services as an interoperable layer.
this home automation system is feature-rich and offers advanced features such as user authentication, bluetooth and internet connectivity, security, a fire system with siren and email alerts, and automated control of home appliances. the main controller is an arduino micro web server that consists of an arduino mega 2560 device and an arduino ethernet shield. it also includes other hardware, such as the siren, the nrf24l01+ radio module, which is used to communicate and coordinate actions with the other sensor nodes within the environment, and the bluetooth module. the system is mainly used to control security and surveillance, door locks, gate control, fire detection and intrusion detection with alarm and notifications. it also supports user authentication for accessing the smart home system and provides smart energy management and home environment control [23]. the main advantage of this system is that it is a flexible and scalable solution, because it is based on arduino. the most important disadvantage of this system is that it is limited to bluetooth, so it is constrained by the bluetooth range.
this paper presents a project for monitoring, tracking and analyzing a smart home. the values measured by the sensors can be read on an android mobile device and via a website that reads the values from the cloud. all smart appliances are connected to the same network through a wi-fi router and communicate with the main computer in the system, which is a raspberry pi device. all sensors are connected directly to the raspberry pi device, which collects all the information from the sensors and sends it to the cloud over the existing wi-fi network. this means that the values in the cloud show real-time statistics collected from the sensors. the same information is also sent to the android mobile device. the user is able to monitor the different factors read by the system, such as temperature, humidity, amount of light, smoke density, etc.
this gives the user a clear picture of the conditions in the smart home. if the user is within a radius of 100 m from the server, the user is able to send commands to the raspberry pi device directly from the mobile phone. if the user is not within 100 m, the user can still use the mobile phone to send commands to the raspberry pi device, and that communication goes via the cloud (which communicates with the raspberry pi device).

3. model of smart home system based on raspberry pi and android devices

this paper presents the model of a smart home system based on a raspberry pi and an android device. the system is designed to be scalable and easy to set up and extend. it is based on the powerful raspberry pi microcomputer. it includes sensors for sensing the environment and appliances that are controlled via the android device.

fig. 1 model of smart home system based on raspberry pi and android device.

there are six main parts of the system (as shown in fig. 1):
1. group of connected sensors.
2. raspberry pi device that acts as a server system.
3. android device as a remote client.
4. wi-fi router as the device which communicates with the cloud.
5. smart home appliances.
6. cloud server which is used for storing data for analyses.

3.1. components and technologies

among ever-changing technology trends, a few components are used in an attempt to make a more efficient, powerful and user-friendly smart home system. the components and technologies used for the development of this smart home system are:
1. raspberry pi.
2. android.
3. bluetooth.
4. sensors.
the raspberry pi is a small, barebones computer developed by the raspberry pi foundation, a uk charity, with the intention of providing low-cost computers and free software to students.
the allure of the raspberry pi comes from the combination of this powerful device's small size and affordable price, which makes it well suited for use in developing countries and in practically any institution that needs a low-cost programming solution. the raspberry pi is fully programmable and acts as the core of our home automation system. it communicates with the sensors that collect the data [24]. android is a mobile operating system (os) currently developed by google. it is used worldwide in the majority of smartphones and tablets on the market today. it gives users access to google services (youtube, google search, google maps, gmail, etc.) and offers a tremendous number of apps, which makes it so popular. it is primarily designed for touchscreen input, but it has also been used in game consoles, regular pcs, digital cameras and other new-age technology. nowadays, most mobile phones use the android operating system [25]. bluetooth technology is considered to be more practical inside a smart home system than other technologies for a few reasons. the main advantage is that it tends to be cheaper, since it has been widely used for quite some time. bluetooth also tends to work faster in practice than other technologies because it does not have to go through a hub, so commands can be executed with little added delay. it also has a higher data bandwidth than zigbee and z-wave (though lower than wi-fi), allowing bluetooth-enabled products to do more than simply flip a switch or report movement [26]. the new version of the technology will utilize mesh networking, meaning that one bluetooth radio can extend the distance over which it communicates by relaying through the nearest other bluetooth radio. that would allow users to just put a few bluetooth smart bulbs in their house and get house-wide coverage. finally, bluetooth le, which stands for “low energy,” is the newest version and uses very little power in comparison to wi-fi.
this is very important for devices such as portable light switches and door locks because they do not have ready access to a power outlet. the home automation system starts with sensors, devices that detect and respond to some type of input from the physical environment. they can sense things like window and door contact, the presence of a person (for lighting control, heating, security), movement, humidity, temperature, etc. the sensors used in our example include generic, light, motion, sound, humidity and temperature sensors. it is possible to add more sensors, but even with just these the system is very flexible and versatile. for example, a temperature and humidity sensor can be used as a plant monitor to let the end user know when a plant needs water, as a leak detector in the bathroom, or even be placed on a clothesline to send an alert when the laundry has finished drying. coupling temperature sensing with light sensing can yield a light-based alarm that sends the user a notification when the sun rises, or even a security system that alerts when a light comes on while nobody is home. simple temperature and motion sensing can be used as a baby monitor or as a fridge or window alarm. there is a plethora of potential uses, and the system is fully customizable.

3.2. example of scenario and response procedures

the main purpose of the proposed home automation system is to be able to read values from sensors within the smart home and to send commands in order to adjust those values, tailoring them to the end user's specific needs. in this way, the system helps provide a better living environment and makes the user's stay at home more comfortable. the raspberry pi device is used to control the flow between the mobile device and the sensors (and connected actuators). for our system we used the model b, which has a 700 mhz arm processor and 512 mb of ram.
raspbian os is used on the raspberry pi, and the server running on the raspberry pi device is written in python, since the raspberry pi has python already installed. our server contains a few separate threads:
1. thread for direct communication with sensors and for collecting data.
2. thread for communication with mobile device for sending data.
3. thread for communication with mobile device for receiving orders.
4. thread for sending orders to connected smart home appliances (communication via local wi-fi network).
5. thread for sending data to the cloud.

fig. 2 main screen of android application that is used to control smart home.

the android application provides the following functionality (the android application is shown in fig. 2 and fig. 3):
1. pairing with raspberry pi server (via bluetooth protocol).
2. sending commands to raspberry pi server.
3. receiving data from raspberry pi server.
4. receiving data from smart home appliances.
5. sending data to smart home appliances.
6. sending data to the cloud.

fig. 3 reading values from sensors and sending commands to home appliances via android application.

communication between the server (raspberry pi) and the devices (smart home appliances) takes place over the wi-fi network, so a wi-fi router must be part of this system. all requests can also be monitored through the cloud, as the server sends all information and communication details to the cloud storage. wi-fi is used only for communication between the raspberry pi and the home appliances. for communication between the mobile phone and the raspberry pi, the bluetooth protocol is used. whenever the mobile phone is within bluetooth reach, the android application uses bluetooth to communicate with the raspberry pi device. if the mobile phone is not close to the raspberry pi, all commands and data are sent via the cloud.
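the thread layout and the bluetooth-or-cloud rule described above can be sketched in python as follows. this is a minimal illustration, not the server used in this work: it keeps only two of the five threads (sensor collection and delivery of data), and the names choose_transport, read_sensors, sensor_worker, sender_worker and run_demo, as well as the sample sensor values, are hypothetical. real sensor drivers, the bluetooth link and the cloud upload are replaced by stand-ins that only record where each reading would be sent.

```python
import queue
import threading
import time


def choose_transport(phone_in_bluetooth_range):
    """Pick the link to the phone: Bluetooth when in range, otherwise the cloud."""
    return "bluetooth" if phone_in_bluetooth_range else "cloud"


def read_sensors():
    """Hypothetical stand-in for the sensor drivers; returns fixed sample values."""
    return {"temperature_c": 21.5, "humidity_pct": 40.0, "light_lux": 300.0}


def sensor_worker(readings, stop, period_s=0.01):
    """Collect readings until `stop` is set, then signal the end with None."""
    while True:
        readings.put(read_sensors())
        if stop.wait(period_s):  # wait() returns True once stop is set
            break
    readings.put(None)


def sender_worker(readings, log, phone_in_range):
    """Record where each reading would go: to the phone and to cloud storage."""
    while True:
        reading = readings.get()
        if reading is None:
            break
        log.append(("phone", choose_transport(phone_in_range()), reading))
        log.append(("cloud_storage", "wifi", reading))


def run_demo(duration_s=0.05, phone_in_range=lambda: True):
    """Run both threads briefly and return the delivery log."""
    readings = queue.Queue()
    log = []
    stop = threading.Event()
    workers = [
        threading.Thread(target=sensor_worker, args=(readings, stop)),
        threading.Thread(target=sender_worker, args=(readings, log, phone_in_range)),
    ]
    for worker in workers:
        worker.start()
    time.sleep(duration_s)
    stop.set()
    for worker in workers:
        worker.join()
    return log


if __name__ == "__main__":
    # Phone out of Bluetooth range: the phone entry is routed via the cloud.
    for entry in run_demo(phone_in_range=lambda: False)[:2]:
        print(entry)
```

placing a queue between the threads keeps sensor collection independent of a slow or unavailable link; this is one plausible reason for separating the threads, not a documented design decision of the presented system.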
the raspberry pi monitors the values that the sensors detect in the house and issues requests for actions, which are directed to the actuators connected to the raspberry pi and to the smart home appliances that communicate with the raspberry pi wirelessly over wi-fi. in this scenario, the raspberry pi is a server that handles all requests. it receives information from the sensors, processes that information and delivers the processed data to the mobile device. the raspberry pi also acts as a server in the other direction, when the user uses the mobile phone to send commands and change the environment in the smart home. the system provides the end customer with the following actions:
• reading data from sensors and showing results of readings on android device.
• sending direct commands to the actuators and smart home appliances.
• collecting reports from smart home appliances.
• processing data with raspberry pi device and uploading it to the cloud (via wi-fi).
• once collected in the cloud, the data can be further analyzed.
• automatic adjusting of the living environment based on the set preferences (via android application).
this system is intended to offer the following features to end users:
1. users can monitor and track values that are read via sensors. the raspberry pi device collects the values read by the sensors. these values are sent to the mobile phone and uploaded to the cloud. if the mobile device is within bluetooth range of the raspberry pi device, bluetooth is used for communication between the two devices. if the mobile device is not within bluetooth range, the raspberry pi communicates with the android application via the cloud. the android application reads the real-time values from the sensors.
2. users can send commands to the raspberry pi device. users can communicate with the home appliances and other actuators via the android application. users can control the whole smart home system using only the mobile device.
because of that, the user can make the home more comfortable and can prepare it before arriving. all communication is done via bluetooth or the cloud.
the main objective of the proposed system is to provide a cheap and reliable smart home system that can control and automate home appliances using an android device as a remote controller. the main advantages of the proposed system, compared to other systems, are:
• controlling the whole smart home using a simple remote device. this is especially useful for large homes, because it is not necessary to check each room separately.
• savings from adequate usage of power, as well as added efficiency and time savings.
• scalability, so the system can be used to communicate with numerous devices and other types of smart environments.
• data can be monitored and controlled in real time and is always available everywhere via the cloud server.

4. discussion

many studies have shown that using bluetooth technology in a smart home system is an effective way of communication between home appliances and the smart home controlling device. the "brain" of a smart home system can be a different microcontroller, as shown in the smart home systems based on an atmega microcontroller [22] or an arduino microcontroller [23], or any other custom-made microcontroller with a bluetooth transmitter and receiver. these systems focus on the same topic and try to solve the same problem as this paper, namely how to make an easily adjustable, scalable and configurable system that helps people feel more comfortable in their homes. the main advantage of the proposed system is that it is a flexible and scalable system, but also a powerful and easy to implement solution. also, the system is based on the raspberry pi device, and many different sensors can be easily connected to it. communication between the main system that controls the sensors and the mobile device is secure and fast. the system gives the user the opportunity to keep a history of all actions that were sent to the actuators.
the main focus is on the two ways of communication between the user and the smart home: via bluetooth (while the person is within bluetooth range) and via the cloud (while the person is not within bluetooth range). all communication is done via the mobile phone, using an approach similar to those described in several other studies [16][17]. we acknowledge that the proposed solution also has some limitations. the main disadvantage of the proposed system is the limitation of the raspberry pi device, which has a fixed maximum number of directly connected sensors. this limitation can be overcome by adding extensions to the raspberry pi device, which allow users to connect more sensors to it. the proposed system is also easier to set up and configure for small to medium size homes, because of the limited range of the bluetooth signal. this disadvantage can be overcome by installing multiple bluetooth receivers in the smart home.

5. conclusion

a smart home consists of different features that are oriented to the individual users of the smart environment. the range of options that can be adjusted for the user is wide. current trends in home automation include remote mobile control, wearable devices, automated lighting, automatic temperature adjustments, energy management, mobile or email notifications, streaming media, remote video surveillance and much more [27]. home automation systems are also very useful in the residential settings of the elderly or disabled, where they aim to support autonomous living [28]. like most technologies, smart home technology improves with age: it gets smarter, less expensive and easier to use each year. the smart home system can provide an easier, more ordered and more effective lifestyle, and is likely to be the direction in which future housing develops [29]. with new smart technology inventions appearing constantly, the future of the smart home is bright and the field is rapidly expanding.
at the time of writing, one of the most advanced smart home technologies relies on bluetooth as a communication protocol and the raspberry pi as a server. the ease of use, safety and seamless integration of this technology, although already at a high level, can and will only improve with time. some of the trends that are bound to shape the newest generation of smart homes are cloud-based smart environments, even cheaper and more versatile sensors, push notifications for just about anything, greater control and more functionality. it is also expected that our homes will soon be able to distinguish between family members by fingerprints, heart rate and body temperature, so that they can adapt to each individual's needs [30]. we will be able to use everyday objects as our personal helpers, and they will be incorporated into our homes in a non-intrusive way. with the cost of both sensors and devices approaching zero and their sizes approaching invisibility, we are about to enter the age of smart everything.

acknowledgement: the authors would like to thank the ministry of education, science and technological development, republic of serbia, for financial support, project number 174031.

references
[1] r. harper, ''inside the smart home: ideas, possibility and method'', inside the smart home, pp. 1-13, 2003.
[2] b. hamed, ''design & implementation of smart house control using labview'', international journal of soft computing and engineering (ijsce), vol. 1, no. 6, january 2012.
[3] b. henkemans, a. olivier, l. laurence, d. adrie, ''aging in place: self-care in smart home environments'', smart home systems, intech open access publisher, pp. 105-120, february 2010.
[4] h. ghayvat, s. mukhopadhyay, x. gui, n. suryadevara, ''wsn- and iot-based smart homes and their extension to smart buildings'', sensors, vol. 15, pp. 10350-10379, may 2015.
[5] w.s. lee, s. h.
hong, ''implementation of a knx-zigbee gateway for home automation'', in proceedings of the ieee 13th international symposium consumer electronics isce '09, pp. 545-549.
[6] e. burden, ''illustrated dictionary of architecture'', the mcgraw-hill companies, inc., 2012.
[7] w.d. werff, x. gui, ''a mobile-based home automation system'', in proceedings of the 2nd international conference mobile technology, applications and systems, 2005, pp. 1-5.
[8] d. bregman, ''smart home intelligence – the ehome that learns'', international journal of smart home, vol. 4, no. 4, october 2010.
[9] r. teymourzadeh, s. a. ahmed, k. w. chan, m. v. hoong, ''smart gsm based home automation system'', in proceedings of the ieee conference on systems, process & control, 2013, pp. 306-309.
[10] r. j. robles, t. h. kim, ''applications, systems and methods in smart home technology: a review'', international journal of advanced science and technology, vol. 15, february 2010.
[11] d. ding, r. a. cooper, p. f. pasquina, l. f. pasquina, ''sensor technology for smart homes'', maturitas the european menopause journal, vol. 69, no. 2, pp. 131-136, june 2011.
[12] m. xu, l. ma, f. xia, t. yuan, j. qian, m. shao, ''design and implementation of a wireless sensor network for smart homes'', in proceedings of the ubiquitous intelligence & computing and 7th international conference on autonomic & trusted computing (uic/atc), october 2010, pp. 239-243.
[13] n. sriskanthan, f. tan, a. karande, ''bluetooth based home automation system'', microprocessors and microsystems, vol. 26, pp. 281-289, may 2002.
[14] m. li, h. j. lin, ''design and implementation of smart home control systems based on wireless sensor networks and power line communications'', industrial electronics, ieee transactions, vol. 62, no. 7, 2014.
[15] j. xiao, r. boutaba, ''the design and implementation of an energy-smart home in korea'', journal of computing science and engineering, vol. 7, no. 3, pp. 204-210, 2013.
[16] r.
piyare, m. tazil, ''bluetooth based home automation system using cell phone'', in proceedings of the ieee 15th international symposium on consumer electronics, june 2011, pp. 192-195. [17] r. a. ramlee, m. h. leong, r. s. s. singh, m. m. ismail, m. a. othman, h. a. sulaiman, m. h. misran, m. a. said, ''bluetooth remote home automation system using android application'', the international journal of engineering and science (ijes), vol. 2, no. 1, pp. 33-43, 2013. [18] m. a. l. mowad, a. fathy, a. hafez, ''smart home automated control system using android application and microcontroller'', international journal of scientific & engineering research, vol. 5, no. 5, pp. 935-939, 2014. [19] d. yuan, s. fang, y. liu, ''the design of smart home monitoring system based on wifi electronic trash'', journal of software, vol. 9, no. 2, pp. 425-428, 2014. [20] d. retkowitz, s. kulle, ''dependency management in smart homes'', distributed applications and interoperable systems lecture notes in computer science, vol. 5523, no. 0302-9743, pp. 143-156, 2009. [21] y. zhai, x. cheng, ''design of smart home remote monitoring system based on embedded system'', in proceedings of the ieee 2nd international conference computing, control and industrial engineering (ccie), 2011, pp. 41-44. [22] m. rana, r. singh, ''smart homes for a better living using bluetooth communication based on atmega microcontroller'', international journal of research in engineering and technology, vol. 3, no. 6 pp. 210-213, 2014. [23] s. kumar, s.r. lee, ''android based smart home system with control via bluetooth and internet connectivity'', in proceedings of the 18th ieee international symposium on consumer electronics, june 2014, pp. 1-2. [24] m. richardson, s. wallace, "getting started with raspberry pi", december 2012. [25] e. 
smith, "small tablet vendors gain ground in q1 2015, says strategy analytics: apple and samsung led 8 percent year-on-year contraction of tablet market" (press release), strategy analytics, may 2015.
[26] j. h. shin and d. park, ''a virtual infrastructure for large-scale wireless sensor networks'', computer communications, vol. 30, no. 14-15, pp. 2853-2866, 2007.
[27] p. vigneswari, v. indhu, r. r. narmatha, a. sathinisha and j. m. subashini, ''automated security system using surveillance'', international journal of current engineering and technology, vol. 5, no. 2, pp. 882-884, 2015.
[28] l. liang, l. huang, x. jiang, y. yao, ''design and implementation of wireless smart-home sensor network based on zigbee protocol'', in proceedings of the international conference on communications, circuits and systems, icccas 2008, pp. 434-438.
[29] m. chan, e. campo, d. estève, j.-y. fourniols, ''smart homes – current features and future perspectives'', maturitas the european menopause journal, vol. 64, no. 2, pp. 90-97, october 2009.
[30] f. adib, h. mao, z. kabelac, d. katabi, r. c. miller, ''smart homes that monitor breathing and heart rate'', massachusetts institute of technology – in proceedings of the 33rd annual acm conference on human factors in computing systems, 2015, pp. 837-846.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 587-601 https://doi.org/10.2298/fuee2204587a
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

doherty amplifier linearization by digital injection methods*

aleksandar atanasković1, nataša maleš-ilić1, aleksandra đorić1, djuradj budimir2
1faculty of electronic engineering, university of niš, niš, serbia
2university of westminster, london, uk

abstract. two linearization methods are verified in this paper: in experiment on an asymmetrical two-way microstrip doherty amplifier, and in simulation on a symmetrical two-way doherty amplifier.
the laboratory set-ups are formed to generate the baseband nonlinear linearization signals of the second order. after being tuned in magnitude and phase in the digital domain, the linearization signals modulate the second harmonic of the fundamental carrier. in the first method, the adequately processed signals are then inserted at the input and output of the main doherty amplifier transistor, whereas in the second method they are injected at the outputs of the doherty main and auxiliary amplifier transistors. the experimental results are obtained for 64qam digitally modulated signals. as a proof of concept, the linearization methods are also verified in simulation for a doherty amplifier designed to work in the 5g band below 6 ghz, utilizing a 20 mhz lte signal.
key words: doherty amplifier, baseband signal, second harmonic, linearization, experimental verification.
1. introduction
in modern wireless communications, the efficiency of rf transmitters largely depends on the efficiency of power amplifiers (pa), so the development of 5g/6g systems requires new pa architectures that ensure high efficiency while maintaining good linearity. therefore, it is necessary to find a compromise between the key parameters of the pa, such as efficiency, output power and linearity. with the classic architecture of the pa this is not easy to achieve, and it is very difficult to optimize all the key parameters of the pa. usually the optimal design of the pa for one parameter leads to the degradation of another important parameter; therefore, the solution to this problem is to design an energy-efficient pa, which is then linearized by one of the appropriate linearization techniques.
the pa characterized by high efficiency is the doherty topology (da), which is widely used in contemporary wireless communication systems.
received march 31, 2022; revised june 10, 2022; accepted june 28, 2022
corresponding author: aleksandar atanasković, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia, e-mail: aleksandar.atanaskovic@elfak.ni.ac.rs
*an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]
recently, different linearization techniques [2] have been used for pa nonlinearity compensation, such as feedback linearization [3-4], feedforward linearization [5-6], digital predistortion [7-9] and digital injection methods [10-11]. in earlier work we deployed the digital linearization technique [10], [12-14], which processes the i and q signals to generate the adequate 2nd order baseband linearization signals adjusted in magnitude and phase angle. these signals are then driven at the gate and drain of the amplifier transistor, after modulating the 2nd harmonic of the fundamental carrier, in order to lower the nonlinearity of the single-stage pa [10], [12] and the two-way da [10], [13]. in [14], the da was linearized by inserting the modulated signals for linearization at the outputs of the main and auxiliary amplifier transistors. the comparison of the two digital linearization methods was carried out in simulation on the designed broadband microstrip da, for different two-tone signal power levels and a maximum tone separation of 30 mhz, as well as for an ofdm signal. in this paper, experiments are performed on a doherty amplifier fabricated in microstrip technology [15] to evaluate the two linearization methods. the tests were realized for a 64qam signal with a useful spectrum bandwidth of 2 mhz.
measured results show the adjacent channel power ratio (acpr) at the dominant third-order and fifth-order intermodulation products. to confirm the efficiency of the linearization methods, the verification was also performed in simulation for a da designed to operate at the frequency of 3.5 ghz [16]. the simulation was performed for a 20 mhz lte signal and various output power levels up to the 1-db compression point.
2. analysis
the applied linearization methods can be explained theoretically by modeling the nonlinearity of the amplifier transistor with a taylor-series polynomial model, which does not include memory effects. the fet output current (ids) in terms of the gate-source voltage (vgs) and drain-source voltage (vds) is given by eq. 1, [10], [12], [15].
$$
\begin{aligned}
i_{ds}^{m/a} ={}& g_{m1}^{m/a} v_{gs}^{m/a} + g_{m2}^{m/a} \left(v_{gs}^{m/a}\right)^{2} + g_{m3}^{m/a} \left(v_{gs}^{m/a}\right)^{3} \\
&+ g_{d1}^{m/a} v_{ds}^{m/a} + g_{d2}^{m/a} \left(v_{ds}^{m/a}\right)^{2} + g_{d3}^{m/a} \left(v_{ds}^{m/a}\right)^{3} \\
&+ g_{m1d1}^{m/a} v_{gs}^{m/a} v_{ds}^{m/a} + g_{m2d1}^{m/a} \left(v_{gs}^{m/a}\right)^{2} v_{ds}^{m/a} + g_{m1d2}^{m/a} v_{gs}^{m/a} \left(v_{ds}^{m/a}\right)^{2}
\end{aligned}
\tag{1}
$$
where gmx represents the transconductance terms, gdy the drain-conductance terms and gmxdy the mixed terms (the order of each coefficient can be calculated as x + y), and m/a relates to the main and auxiliary amplifiers in the doherty circuit. the nonlinear terms defined by the coefficients gd1 − gd3 can be neglected according to the previously performed analysis. also, the mixing terms of the 3rd order, gm1d2 and gm2d1, produce 3rd order intermodulation products (im3) that can be considered to reduce each other to some extent, so they are omitted from the final equations for the im3 of the da output current given in the text below, based on the results obtained in [10], [12], [15].
however, those mixing terms are included in the equations that describe the 5th order intermodulation products (im5) of the da output current, so that the influence of the injected 2nd order linearization signals on the im5 can be explained. the baseband signals for linearization are formed by adequate processing of the in-phase i and quadrature-phase q components of the digital signal, resulting in the in-phase linearization component iim2 = (i2 − q2) and the quadrature-phase component qim2 = 2iq, which are products of the 2nd order nonlinearity. those signals are tuned in magnitude by $a^{m/a}_{\{i,o\}}$ and in phase by $\theta^{m/a}_{\{i,o\}}$, where the subscripts i and o denote the adaptation coefficients for injection of the linearization signals at the input and output of the amplifier transistor, respectively. the baseband signals prepared in this manner then modulate the second harmonic of the fundamental carrier. in the first linearization approach applied in this paper for the doherty amplifier, the signals for linearization are inserted at the input (together with the fundamental signal), as given by eq. 2, and at the output of the main amplifier transistor in the da, eq. 3, whereas in the second approach the signals for linearization are led to the transistor outputs of the main, eq. 3, and auxiliary, eq. 4, stages in the da.
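the formation of the second-order baseband linearization signals described above can be sketched numerically. this is a minimal illustration, not the authors' labview implementation: the function names are hypothetical, and applying the magnitude and phase coefficients as a complex multiplication by a·exp(−jθ) on the pair (iim2, qim2) is an assumption about how the tuning is realized in the digital domain.

```python
import numpy as np

def second_order_baseband(i, q):
    """Return (I_IM2, Q_IM2) = (I^2 - Q^2, 2*I*Q) for baseband samples."""
    i = np.asarray(i, dtype=float)
    q = np.asarray(q, dtype=float)
    return i**2 - q**2, 2.0 * i * q

def tune(i_im2, q_im2, a, theta):
    """Scale by a and rotate by -theta (assumed: multiply by a*exp(-j*theta))."""
    z = a * np.exp(-1j * theta) * (i_im2 + 1j * q_im2)
    return z.real, z.imag

def modulate_second_harmonic(i_lin, q_lin, f0, t):
    """Place the tuned pair on the carrier second harmonic 2*f0 (factor 1/2 as in eq. 2)."""
    w2 = 2.0 * np.pi * 2.0 * f0
    return 0.5 * (i_lin * np.cos(w2 * t) - q_lin * np.sin(w2 * t))
```

the pair (iim2, qim2) is simply the real and imaginary part of (i + jq)², which is why it sits at the second harmonic once up-converted.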
$$
v_{gs}^{m} = v_{s}^{m}\left[I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right] + \frac{1}{2} a_{i}^{m} e^{-j\theta_{i}^{m}}\left[(I^2 - Q^2)\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)\right]
\tag{2}
$$
$$
v_{ds}^{m} = v_{o}^{m}\left[I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right] - \frac{1}{2} a_{o}^{m} e^{-j\theta_{o}^{m}}\left[(I^2 - Q^2)\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)\right]
\tag{3}
$$
$$
v_{ds}^{a} = v_{o}^{a}\left[I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right] - \frac{1}{2} a_{o}^{a} e^{-j\theta_{o}^{a}}\left[(I^2 - Q^2)\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)\right]
\tag{4}
$$
where $v_s^m$ and $v_o^m$ are the magnitudes of the input and output signal of the main amplifier transistor at the fundamental frequency, and $v_o^a$ is the magnitude of the output signal of the auxiliary amplifier transistor at the fundamental frequency. the distorted output current of the doherty amplifier analysed for the im3 products is expressed by eq. 5 when the first linearization method is applied and by eq. 6 for the second linearization method. the im5 products are included in eqs. 7 and 8 for the two linearization methods.
$$
\begin{aligned}
i_{out}^{1st}\Big|_{IM3} = \Big[ & \frac{3}{4}\left(v_s^m\right)^3 g_{m3}^m + \frac{3}{4}\left(v_s^a\right)^3 g_{m3}^a + \frac{1}{2} a_i^m e^{-j\theta_i^m} v_s^m g_{m2}^m \\
& - \frac{1}{4} a_o^m e^{-j\theta_o^m} v_s^m g_{m1d1}^m + \frac{1}{4} a_i^m e^{-j\theta_i^m} v_o^m g_{m1d1}^m \Big] (I^2 + Q^2)\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{5}
$$
$$
\begin{aligned}
i_{out}^{2nd}\Big|_{IM3} = \Big[ & \frac{3}{4}\left(v_s^m\right)^3 g_{m3}^m + \frac{3}{4}\left(v_s^a\right)^3 g_{m3}^a \\
& - \frac{1}{4} a_o^m e^{-j\theta_o^m} v_s^m g_{m1d1}^m - \frac{1}{4} a_o^a e^{-j\theta_o^a} v_s^a g_{m1d1}^a \Big] (I^2 + Q^2)\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{6}
$$
$$
\begin{aligned}
i_{out}^{1st}\Big|_{IM5} = \Big[ & \frac{5}{8}\left(v_s^m\right)^5 g_{m5}^m + \frac{5}{8}\left(v_s^a\right)^5 g_{m5}^a + \frac{3}{2}\left(a_i^m\right)^2 e^{-j2\theta_i^m} v_s^m g_{m3}^m \\
& + \frac{1}{2}\left( \left(a_o^m\right)^2 e^{-j2\theta_o^m} v_s^m g_{m1d2}^m - a_i^m a_o^m e^{-j(\theta_i^m + \theta_o^m)} v_o^m g_{m1d2}^m \right) \\
& + \frac{1}{2}\left( \left(a_i^m\right)^2 e^{-j2\theta_i^m} v_o^m g_{m2d1}^m - a_i^m a_o^m e^{-j(\theta_i^m + \theta_o^m)} v_s^m g_{m2d1}^m \right) \Big] (I^2 + Q^2)^2\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{7}
$$
$$
\begin{aligned}
i_{out}^{2nd}\Big|_{IM5} = \Big[ & \frac{5}{8}\left(v_s^m\right)^5 g_{m5}^m + \frac{5}{8}\left(v_s^a\right)^5 g_{m5}^a \\
& + \frac{1}{2}\left(a_o^m\right)^2 e^{-j2\theta_o^m} v_s^m g_{m1d2}^m + \frac{1}{2}\left(a_o^a\right)^2 e^{-j2\theta_o^a} v_s^a g_{m1d2}^a \Big] (I^2 + Q^2)^2\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{8}
$$
the 1st and 2nd terms in eqs. 5 and 6 represent the da linearity degradation caused by the 3rd order nonlinearity of the amplifier stages. the 3rd to 5th terms in eq. 5 are the second-order nonlinear products between the fundamental signals and the linearization signals injected at the input and at the output of the main amplifier transistor in the da. the 3rd and 4th terms in eq. 6 are the 2nd order mixing terms between the fundamental signals and the linearization signals put at the outputs of the main and auxiliary amplifiers in the da. it can be observed that the 2nd order nonlinear terms generated by the injection of the linearization signals can reduce the originally produced im3 distortion if the magnitude and phase of the linearization signals are adjusted adequately [10], [12], [15]. equations 7 and 8 define the im5 products of the da output current, where the 1st and 2nd terms are generated by the 5th order nonlinearity of the da main and auxiliary stages. the additional terms for the two linearization methods are the 3rd order products that mix the linearization signals with the fundamental signals, which can reduce the original im5 products if their magnitudes and phases are related appropriately [10], [12], [15].
3.
da design
two doherty amplifiers were designed to verify the proposed linearization methods: a two-way asymmetrical doherty amplifier operating at 900 mhz central frequency [15] and a symmetrical doherty amplifier operating at 3.5 ghz central frequency [16]. the linearization effects were examined in experiment on the fabricated two-way asymmetrical doherty amplifier shown in figure 1, which consists of:
1. main amplifier and frequency diplexers;
2. auxiliary amplifier and frequency diplexer;
3. offset line and output combining networks;
4. pi attenuator;
5. power combiner for injection of the signal for linearization at the auxiliary amplifier output;
6. port for the injection of the linearization signal at the main transistor input;
7. port for the injection of the linearization signal at the main transistor output.
a detailed description of the doherty amplifier design can be found in [15] (it should be pointed out that the wilkinson power combiner denoted as 5 in figure 1 was used for another purpose in the linearization method exploited in [15]). in this paper, one port of the combiner was utilized for the linearization. the maximal transducer gain of 9 db was measured for the fabricated two-way asymmetrical doherty amplifier with the main amplifier biased in class-ab (vd = 5 v, vgm =−3 v) and the auxiliary amplifier operating in the class-c regime (vd = 5 v, vga =−5 v), when the ap602a-2 gaas mesfet transistor was used in the amplifying cells. moreover, the measured 1-db compression point of the da is at 15 dbm output power, and a maximum output power of 18 dbm is achieved. the symmetrical two-way doherty amplifier that operates in the 5g band below 6 ghz was designed according to the instructions given in [16]. the da was designed by using the cgh40010f gan hemt transistor. the drain voltage is 28 v, whereas the gate voltages of the main and auxiliary amplifiers are -2.8 v and -5.7 v, respectively.
the main characteristics that relate to gain, 1-db compression point, power added efficiency (pae), dc power consumption etc. are presented in table 1 for the frequency of 3.5 ghz. additionally, the gain, gain compression, pae and supply current are shown in figure 2 over the range of da output power.
fig. 1 asymmetrical two-way doherty amplifier (all dimensions are in millimeters)
table 1 characterization of two-way symmetrical da 3.5 ghz
fig. 2 symmetrical two-way doherty amplifier characteristics: a) gain and gain compression; b) pae and supply current
4. measurement set-up
the measurement set-up shown in figure 3 was established to verify in experiments the linearization methods developed by our research group [17]. these power amplifier linearization methods are based on 2nd order baseband nonlinear digital signals which, after being adequately modified and processed in the baseband, modulate the second harmonic of the fundamental carrier. the measurement system was designed to enable verification of linearization methods that generally use two such linearization signals. the measuring system can generate three independent, synchronized signals: the fundamental signal and two linearization signals at the frequency of the second harmonic of the fundamental signal. two ni usrp 2920 models and one ni usrp 2922 model were used for the measurement system, connected to the computer via an ethernet switch. the labview environment was used to implement the interface for management and control of the ni usrp devices (figure 4). a particular challenge during the implementation of the measurement system was the synchronization of the ni usrp devices.
a mimo cable was used to synchronize two of the three usrp devices (using the mimo expansion input), while the third device was synchronized with the previous two via external 10 mhz reference and 1 pps signals. a rigol dg1022 two-channel function/arbitrary waveform generator was used as the generator of these signals. the outputs of the function generator were distributed to the corresponding inputs of the ni usrp device (ref in and pps in).
fig. 3 experimental verification of linearization methods 1 and 2: a) measurement set-up; b) schematic diagram
fig. 4 interface for management and control of the ni usrp devices
when using the ni usrp with the labview environment, the computer to which the ni usrps are connected must provide high processing power, a large amount of fast memory and a fast ethernet connection, especially if multiple usrp devices are used at the same time. the lack of any of these resources can significantly affect the reliable operation of the ni usrp and lead to frequent downtime [17]. to demonstrate the results of linearization, the useful 64qam signal and the signals for linearization were generated by the usrp platforms, which also controlled the linearization signals in magnitude and phase. the linearization effects were examined on the fabricated two-way asymmetrical doherty amplifier operating at 900 mhz central frequency, shown in figure 1. the output spectra and the adjacent channel power ratios (acprs) before and after the linearization, for the 64qam modulation format and different signal power levels, were measured with the exa signal analyzer n9010a.
5. results of linearization
the asymmetrical two-way doherty amplifier was tested for a 64qam signal with 2 mhz useful channel bandwidth.
the central frequency of operation is 900 mhz. the linearization effects were measured on the fabricated da for different input signal power levels, from 1 dbm to 5 dbm. the results shown in figures 5 to 7 compare the acprs obtained without and with applying the two digital linearization methods: 1) the first, standard method, which injects the signals for linearization at the gate and drain of the transistor in the main cell of the da, and 2) the second, modified method, where the linearization signals are put at the drains of the main and auxiliary amplifier transistors in the da.
fig. 5 output spectrum for 64qam signal of 2 mhz useful signal frequency bandwidth for input signal power 1 dbm: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method
the acprs are illustrated in the lower and upper adjacent channels (at ±2 mhz offset from the carrier, where im3 products are dominant) and in the alternate channels (at ±3 mhz offset from the carrier, where im5 products are dominant). for 1 dbm input power, the acpr in the adjacent channels is improved by 4 db for both linearization methods, whereas for 3 dbm input power it is improved by 6 db for the 1st method and 8 db for the 2nd method. with the power increase to 5 dbm, the acprs decrease by 3 db and 5 db in the 1st and 2nd methods, respectively. no evident improvement in the alternate channels can be noticed for 1 dbm and 3 dbm input power levels, but it is 4 db in the case of 5 dbm power. comparing the measured results with the simulated results presented in [14], we can infer that the 2nd linearization method achieves a slightly better acpr improvement in the adjacent channels, especially for higher power, as was also deduced in [14] when the simulated results were analyzed.
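the adjacent and alternate channel figures quoted above are ratios of integrated power. a minimal sketch of that calculation on an exported spectrum trace follows; it assumes a uniformly sampled power spectral density in linear units with frequencies given as offsets from the carrier, and the function names are hypothetical (the signal analyzer computes this internally with its own settings).

```python
import numpy as np

def band_power(freqs, psd, lo, hi):
    """Sum the PSD bins with lo <= f < hi (uniform bin spacing assumed)."""
    freqs = np.asarray(freqs)
    psd = np.asarray(psd, dtype=float)
    mask = (freqs >= lo) & (freqs < hi)
    return float(psd[mask].sum())

def acpr_db(freqs, psd, channel_bw, offset):
    """Power in the channel centred at `offset` relative to the main channel, in dB."""
    main = band_power(freqs, psd, -channel_bw / 2, channel_bw / 2)
    adj = band_power(freqs, psd, offset - channel_bw / 2, offset + channel_bw / 2)
    return 10.0 * np.log10(adj / main)

def improvement_db(acpr_before, acpr_after):
    """Positive value means the linearization lowered the adjacent channel power."""
    return acpr_before - acpr_after
```

calling `acpr_db` with a positive and a negative `offset` gives the upper and lower channel separately, which is how an asymmetry between the two sides would show up.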
even though the simulated results attained for the two-tone test show a more apparent improvement when the 2nd method is used, it should be noted that for the ofdm signal test in simulation, less divergence between the results accomplished with the two linearization methods can be observed for higher power, closer to the amplifier saturation region.
fig. 6 output spectrum for 64qam signal of 2 mhz useful signal frequency bandwidth for input signal power 3 dbm: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method
the symmetrical two-way doherty amplifier was simulated for a 20 mhz lte signal at 3.5 ghz central frequency of operation. both linearization methods were considered for different power levels up to the 1-db compression point. simulation results obtained without and with applying the linearization methods are shown in figures 8 to 10. it can be observed from the figures that the acpr at ±20 mhz offsets from the carrier frequency over 2 mhz bandwidth is improved by about 6 db at 32 dbm output power (near the 1-db compression point) for the 1st method. for lower power levels, the acpr improvement is much better: about 11 db at 27 dbm output power and nearly 12 db at 22 dbm output power for the 1st method. for all power levels, the 2nd method shows an acpr improvement that is 1 db to 2 db larger than that of the 1st method. also, a slight asymmetry in the acpr reduction can be observed between the lower (-20 mhz offset) and upper (+20 mhz offset) adjacent channels for both methods.
fig. 7 output spectrum for 64qam signal of 2 mhz useful signal frequency bandwidth for input signal power 5 dbm: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method
6.
conclusion
experimental results of the linearization of an asymmetrical doherty amplifier fabricated in microstrip technology, obtained by applying two digital linearization methods, are presented in this paper, as well as simulation results for a symmetrical doherty amplifier designed to operate in the 5g band below 6 ghz. the linearization methods utilize adequately processed baseband digital signals that modulate the second harmonic of the fundamental carrier. in the 1st linearization method, the formed signals for the linearization are injected at the input and output of the main transistor in the doherty amplifier, while in the 2nd method these signals are led to the outputs of the main and auxiliary amplifier transistors in the da circuit. the ni usrp platforms programmed by labview software were used for generation of the useful 64qam signals for the da test and measurements of acprs in adjacent and alternate channels for various input power levels. additionally, these platforms form the signals for linearization and process them in amplitude and phase. measurements performed by the signal analyzer illustrate the results of the linearization for the two applied linearization methods and compare them to the states before the linearization.
fig. 8 output spectrum for lte signal at 22 dbm output power: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method
on the basis of the achieved experimental results, it can be noticed that the 2nd method provides slightly better results for higher power than the 1st method regarding the adjacent channels, where the 3rd order im products are dominant.
the same conclusion can be derived for the alternate channels (the band of dominant 5th order im products), but these improvements of only 1 db or 2 db are insignificant, except for the 5 dbm input power, where higher improvements of acprs were attained in the case of both linearization methods. based on the linearization results obtained in simulation for the symmetrical da for the 20 mhz lte signal at 3.5 ghz, it can be assumed that the proposed linearization methods can be successfully used for 5g band signals with a bandwidth of 20 mhz. the test for wider 5g modulation formats is a subject of further analysis.
fig. 9 output spectrum for lte signal at 27 dbm output power: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method
fig. 10 output spectrum for lte signal at 32 dbm output power: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method
acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia (grant no. 451-03-68/2022-14/200102) and science fund of the republic of serbia (grant no. 6398983 serbian science and diaspora collaboration program: vouchers for knowledge exchange – project name: digital even-order linearization of 5g power amplifiers in bands below 6ghz delfin).
references
[1] a. atanasković, n. m. ilić, a. djorić and d. budimir, "doherty amplifier linearization in experiments by digital injection methods", in proceedings of 15th international conference on advanced technologies, systems and services in telecommunications telsiks 2021, niš, serbia, october 20-22, 2021, pp. 82-85.
[2] a. borel, v. barzdėnas and a.
vasjanov, "linearization as a solution for power amplifier imperfections: a review of methods", electronics, vol. 10, no. 9, may 2021. [3] s. kang, e. t. sung and s. hong, "dynamic feedback linearizer of rf cmos power amplifier", ieee microw. wirel. compon. lett., vol. 28, no. 10, pp. 915-918, oct. 2018. [4] j. li, r. shu, q. j. gu, "a fully-integrated cartesian feedback loop transmitter in 65nm cmos", in proceedings of the ieee mtt-s international microwave symposium digest, honolulu, hi, usa, june 2017, pp. 103-106. [5] h. choi, y. jeong, c. d. kim and j. s. kenney, "efficiency enhancement of feedforward amplifiers by employing a negative group-delay circuit", ieee trans. microw. theory techn., vol. 58, no. 5, pp. 1116-1125, may 2010. [6] r. n. braithwaite, "a comparison for a doherty power amplifier linearized using digital predistortion and feedforward compensation", in proceedings of the 2015 ieee mtt-s international microwave symposium, ims 2015, phoenix, az, usa, 17–22 may 2015, pp. 1-4. [7] s. jung, o. hammi and f. m. ghannouchi, "design optimization and dpd linearization of gan-based unsymmetrical doherty power amplifiers for 3g multicarrier applications", ieee trans. microw. theory techn., vol. 57, no. 9, pp. 2105-2013, sept. 2009. [8] p. l. gilabert, d. vegas, z. ren, g. montoro, j. r. perez-cisneros, m. n. ruiz, x. si and j. a. garcia, "design and digital predistortion linearization of a wideband outphasing amplifier supporting 200 mhz bandwidth", in proceedings of the ieee topical conference on rf/microwave power amplifiers for radio and wireless applications, pawr 2020, san antonio, tx, usa, 26–29 january 2020, pp. 46-49. [9] s. n. ali, p. agarwal, s. gopal and d. heo, "transformer-based predistortion linearizer for high linearity and high modulation efficiency in mm-wave 5g cmos power amplifiers", ieee trans. microw. theory techn., vol. 67, no. 7, pp. 3074–3087, may 2019. [10] a. đorić, a. atanasković, n. maleš-ilić and m.
živanović: "linearization of rf pa by even-order nonlinear baseband signal processed in digital domain", int. j. electron., vol. 106, no.12, pp. 1904-1918, dec. 2019. [11] d. bondar, n. d. lopez, z. popovic and d. budimir, "linearization of high-efficiency power amplifiers using digital baseband predistortion with iterative injection", in proceedings of the ieee radio and wireless symposium, new orleans, la, usa, 10–14 january 2010, pp. 148-151. [12] a. atanasković, n. males-ilić, k. blau, a. đorić and b. milovanović, "rf pa linearization using modified baseband signal that modulates carrier second harmonic", microw. rev., vol. 19, no. 2, pp. 119-124, dec. 2013. [13] a. đorić, n. maleš-ilić, a. atanasković and v. marković, "linearization of broadband doherty amplifier by baseband signal that modulates second harmonic" in proceedings of the ieee eurocon 2017, ohrid, macedonia, 6-8 july, 2017, pp. 206-211. [14] a. đorić, a. atanasković, b. alorda and n. maleš-ilić: "linearization of doherty amplifier by injection of digitally processed baseband signals at the output of the main and auxiliary cell", in proceedings of the 14th international conference on advanced technologies, systems and services in telecommunications telsiks 2019, niš, serbia, october 23-25, 2019, pp. 339-342. [15] n. maleš-ilić, a. atanasković, k. blau and m. hein, "linearization of asymmetrical doherty amplifier by the even-order nonlinear signals", int. j. electron., vol. 103, no. 8, pp. 1318-1331, aug. 2016. [16] z. zhang, z. cheng and g. liu, "a power amplifier with large high-efficiency range for 5g communication", sensors, vol. 20, no. 19, oct. 2020. [17] a. atanasković, n. m. ilić, a. đorić and d. budimir, "experimental verification of the impact of the 2nd order injected signals on doherty amplifiers nonlinear distortion", in proceedings of 29th telecommunications forum – telfor 2021, belgrade, serbia, november 23-24, 2021, pp. 1-4. 
facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 235-244 doi: 10.2298/fuee1702235g
performance analysis of dual-branch selection diversity system using novel mathematical approach
aleksandra golubović1, nikola sekulović2, mihajlo stefanović1, dejan milić1
1university of niš, faculty of electronic engineering, niš, republic of serbia 2college of applied technical sciences, niš, republic of serbia
abstract. in this paper, a novel mathematical approach for evaluation of the probability density function (pdf) of instantaneous signal-to-interference ratio (sir) at the receiver output in an interference-limited environment is proposed. a dual-branch selection combining (sc) receiver operating over correlated weibull fading channels and applying the sir algorithm is considered. an analytical expression for the joint pdf of desired signal and interference at the receiver output is derived and used for evaluation of the pdf of instantaneous sir. the expression for the pdf of sir is used for system performance analysis via outage probability, average bit error probability (abep) and average output sir as system performance measures. numerical results are graphically presented, showing the effects of fading severity, average sir at the input and level of correlation on the diversity receiver performance. in addition, the results obtained for the pdf of instantaneous sir in this paper are compared to the results when the pdf of instantaneous sir is directly calculated.
key words: cochannel interference, correlated channels, decision algorithms, selection diversity, weibull fading channels.
1. introduction
the main performance limitations in wireless communications systems are fading and cochannel interference (cci). fading emerges due to multipath propagation, while cci develops as a side effect of frequency reuse.
in order to make the system design as accurate as possible, several models are used, depending on the propagation environment, to describe the statistical behaviour of the multipath fading envelopes. the most frequently used in the literature are rayleigh, rice, nakagami-m and weibull. this paper focuses on the weibull distribution since it is simple and flexible, yet not exploited as much as the other models. it represents an excellent fit to experimental fading channel measurements for indoor [1], [2] and outdoor [3]-[5] environments.
received july 8, 2016; received in revised form october 16, 2016
corresponding author: aleksandra golubović, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: aleksandra321@gmail.com)
wireless communication system performance can be improved at relatively low cost by diversity techniques. the basic idea behind diversity systems is simultaneous reception of the same radio signal over two or more paths in order to increase the overall signal-to-noise ratio (snr) [6]. the diversity paths can be separated by space, frequency or time, and in all cases some redundancy in the time, frequency and/or spatial domain is required [7]. compared with other diversity techniques, space diversity is power- and bandwidth-efficient, which makes it the most commonly used diversity technique [8]. if the best of the received signals is selected or if they are properly combined, the outage time can be substantially reduced [9]. depending on the communication system complexity restrictions and the amount of channel state information (csi) available at the receiver, space diversity has several principal types of combining techniques.
combining techniques like maximal ratio combining (mrc) and equal-gain combining (egc) require some amount of channel state information of the received signal and a separate receiver chain for each branch of the diversity system, which increases system complexity. on the other hand, a selection combining (sc) receiver processes only one of the diversity branches at a time and is much simpler and cheaper for practical realization [6]. in an interference-limited environment, where the level of cci is sufficiently high compared to noise, the sc receiver can employ one of the combining algorithms: the desired signal (ds) algorithm, the signal-to-interference ratio (sir) algorithm and the total signal (ts) algorithm [10]. the sir algorithm is based on selecting the diversity branch that has the highest sir, and it usually provides the best results in the case of interference-limited systems. l-branch egc and mrc receivers operating over non-identical weibull fading channels have been considered in [11]. the performance of digital communications receivers over weibull fading channels that employ the sir algorithm was thoroughly investigated in [12]-[14]. the performance of an sc diversity system operating over correlated weibull fading channels that applies the sir decision algorithm is studied in [12] for a dual-branch system, in [13] for a triple-branch system and in [14] for an l-branch system. a system that uses the ds algorithm, where both desired signal and interference are correlated and under weibull fading, is presented in [15] for a dual-branch system and in [16] and [17] for a triple-branch system. this paper presents a novel mathematical approach for deriving an expression for the probability density function (pdf) of instantaneous sir at the output of a selection combining diversity system with two correlated weibull fading channels that applies the sir algorithm.
the mathematical approach used in [12] for the same system directly calculates the pdf of the instantaneous sir at the system output, whereas this paper first calculates the joint pdf of the desired signal and interference at the output and then uses that result to calculate the pdf of the instantaneous sir at the output. finally, the results obtained in this paper are compared to the results obtained in [12] and [15].

2. system and channel model

we consider an sc diversity system with two branches in an interference-limited weibull fading environment. in practice, diversity systems are applied in small-size terminals, so complete independence between the branches cannot be achieved, which degrades the diversity gain. in such a case, the desired signal envelopes (x1, x2) and the cci envelopes (y1, y2) experience correlated weibull fading with joint pdfs [18, eq. (11)]

$$p_{x_1x_2}(x_1,x_2)=\frac{\beta_1\beta_2\,x_1^{\beta_1-1}x_2^{\beta_2-1}}{\Omega_{d1}\Omega_{d2}(1-\rho)}\exp\left[-\frac{1}{1-\rho}\left(\frac{x_1^{\beta_1}}{\Omega_{d1}}+\frac{x_2^{\beta_2}}{\Omega_{d2}}\right)\right]I_0\left[\frac{2\sqrt{\rho}\,x_1^{\beta_1/2}x_2^{\beta_2/2}}{(1-\rho)\sqrt{\Omega_{d1}\Omega_{d2}}}\right] \tag{1}$$

$$p_{y_1y_2}(y_1,y_2)=\frac{\beta_1\beta_2\,y_1^{\beta_1-1}y_2^{\beta_2-1}}{\Omega_{c1}\Omega_{c2}(1-\rho)}\exp\left[-\frac{1}{1-\rho}\left(\frac{y_1^{\beta_1}}{\Omega_{c1}}+\frac{y_2^{\beta_2}}{\Omega_{c2}}\right)\right]I_0\left[\frac{2\sqrt{\rho}\,y_1^{\beta_1/2}y_2^{\beta_2/2}}{(1-\rho)\sqrt{\Omega_{c1}\Omega_{c2}}}\right] \tag{2}$$

where ρ represents the branch correlation coefficient (0≤ρ≤1) and β is the weibull fading parameter, which expresses fading severity (β>0). as the value of the weibull fading parameter increases, fading severity decreases. $\Omega_{di}=\overline{x_i^{\beta_i}}$ and $\Omega_{ci}=\overline{y_i^{\beta_i}}$ are the average powers of the desired and interference signal at the i-th branch (i=1,2), respectively. $I_n(\cdot)$ is the modified bessel function of the first kind and n-th order [19, eq. (8.445)]. the instantaneous values of sir on the first and second diversity branch are defined as z1=x1/y1 and z2=x2/y2, respectively.
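as a hedged illustration of the channel model in (1) and (2), the python sketch below draws correlated weibull envelope pairs by transforming two correlated complex gaussian variables. it assumes that ρ is the correlation coefficient between the powers $x_1^{\beta_1}$ and $x_2^{\beta_2}$ and that $\Omega_i$ is the mean of $x_i^{\beta_i}$, which is how (1) reads. the function names `sample_correlated_weibull` and `power_statistics` and all parameter values are illustrative assumptions, not part of the paper.

```python
import math
import random


def sample_correlated_weibull(beta1, beta2, omega1, omega2, rho, rng):
    """Draw one envelope pair (x1, x2) with correlated Weibull fading.

    Assumption: x_i**beta_i is exponentially distributed with mean omega_i
    and the two powers have correlation coefficient rho (0 <= rho < 1).
    """
    s = math.sqrt(0.5)  # per-component std of a unit-power complex Gaussian
    g1 = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    n = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    # Mix in an independent term so that |E[g1 * conj(g2)]|**2 == rho.
    g2 = math.sqrt(rho) * g1 + math.sqrt(1.0 - rho) * n
    w1 = omega1 * abs(g1) ** 2  # exponential power with mean omega1
    w2 = omega2 * abs(g2) ** 2  # exponential power with mean omega2
    return w1 ** (1.0 / beta1), w2 ** (1.0 / beta2)


def power_statistics(num, beta1, beta2, omega1, omega2, rho, seed=1):
    """Empirical means of x_i**beta_i and their correlation coefficient."""
    rng = random.Random(seed)
    p1, p2 = [], []
    for _ in range(num):
        x1, x2 = sample_correlated_weibull(beta1, beta2, omega1, omega2, rho, rng)
        p1.append(x1 ** beta1)
        p2.append(x2 ** beta2)
    m1 = sum(p1) / num
    m2 = sum(p2) / num
    cov = sum((a - m1) * (b - m2) for a, b in zip(p1, p2)) / num
    v1 = sum((a - m1) ** 2 for a in p1) / num
    v2 = sum((b - m2) ** 2 for b in p2) / num
    return m1, m2, cov / math.sqrt(v1 * v2)
```

the interference pair (y1, y2) can be drawn with the same function and an independent random stream, after which the branch ratios z1=x1/y1 and z2=x2/y2 follow directly.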
the joint pdf of these random variables is

$$p_{z_1z_2}(z_1,z_2)=\int_0^\infty\!\!\int_0^\infty y_1y_2\,p_{x_1x_2}(z_1y_1,z_2y_2)\,p_{y_1y_2}(y_1,y_2)\,dy_1\,dy_2. \tag{3}$$

an sc receiver based on the sir algorithm chooses and outputs the branch with the larger sir, i.e. z = max {z1, z2}. applying the concepts of probability, the pdf of the instantaneous sir at the output of the sc combiner can be obtained as

$$p_z(z)=\int_0^z p_{z_1z_2}(z,z_2)\,dz_2+\int_0^z p_{z_1z_2}(z_1,z)\,dz_1. \tag{4}$$

the approach described by (3) and (4) is used in previously published papers which study sc receivers. in this work, we propose a mathematical approach based on the calculation of the joint pdf of the desired and interference signal envelopes. since the desired signal and the interference are mutually independent, the joint pdf of the desired and interference signal envelopes on the input diversity branches can be expressed as

$$p_{x_1y_1x_2y_2}(x_1,y_1,x_2,y_2)=p_{x_1x_2}(x_1,x_2)\,p_{y_1y_2}(y_1,y_2). \tag{5}$$

when a dual-branch sc diversity system uses the sir algorithm, one of two conditions has to be fulfilled:
1. $x=x_1,\ y=y_1$ when $x_1/y_1>x_2/y_2$, i.e. $y_2>y_1x_2/x_1$;
2. $x=x_2,\ y=y_2$ when $x_2/y_2>x_1/y_1$, i.e. $y_1>y_2x_1/x_2$.

in that case, the joint pdf of the desired signal and interference envelopes at the output of the dual-branch sc receiver based on the sir algorithm can be obtained as

$$p_{xy}(x,y)=\int_0^\infty dx_2\int_{yx_2/x}^\infty p_{x_1y_1x_2y_2}(x,y,x_2,y_2)\,dy_2+\int_0^\infty dx_1\int_{yx_1/x}^\infty p_{x_1y_1x_2y_2}(x_1,y_1,x,y)\,dy_1, \tag{6}$$

which, by substituting (5), expanding the bessel functions in (1) and (2) into series and after some mathematical manipulations, yields

$$
\begin{aligned}
p_{xy}(x,y)={}&\sum_{i,j=0}^{\infty}\frac{\beta_1^{2}\,\rho^{\,i+j}\,\Gamma(i+j+2)}{(1-\rho)^{i+j}\,(i!\,j!)^{2}\,(i+1)}\,\frac{\Omega_{c2}^{\,i+1}\,\Omega_{d2}^{\,j+1}}{\Omega_{d1}^{\,i+1}\,\Omega_{c1}^{\,j+1}}\,\frac{x^{(\beta_1+\beta_2)(i+1)-1}\,y^{(\beta_1+\beta_2)(j+1)-1}}{\left(\Omega_{c2}x^{\beta_2}+\Omega_{d2}y^{\beta_2}\right)^{i+j+2}}\\
&\times\exp\left[-\frac{1}{1-\rho}\left(\frac{x^{\beta_1}}{\Omega_{d1}}+\frac{y^{\beta_1}}{\Omega_{c1}}\right)\right]{}_2F_1\left(1,\,i+j+2;\,i+2;\,\frac{\Omega_{c2}x^{\beta_2}}{\Omega_{c2}x^{\beta_2}+\Omega_{d2}y^{\beta_2}}\right)\\
&+\sum_{m,n=0}^{\infty}\frac{\beta_2^{2}\,\rho^{\,m+n}\,\Gamma(m+n+2)}{(1-\rho)^{m+n}\,(m!\,n!)^{2}\,(m+1)}\,\frac{\Omega_{c1}^{\,m+1}\,\Omega_{d1}^{\,n+1}}{\Omega_{d2}^{\,m+1}\,\Omega_{c2}^{\,n+1}}\,\frac{x^{(\beta_1+\beta_2)(m+1)-1}\,y^{(\beta_1+\beta_2)(n+1)-1}}{\left(\Omega_{c1}x^{\beta_1}+\Omega_{d1}y^{\beta_1}\right)^{m+n+2}}\\
&\times\exp\left[-\frac{1}{1-\rho}\left(\frac{x^{\beta_2}}{\Omega_{d2}}+\frac{y^{\beta_2}}{\Omega_{c2}}\right)\right]{}_2F_1\left(1,\,m+n+2;\,m+2;\,\frac{\Omega_{c1}x^{\beta_1}}{\Omega_{c1}x^{\beta_1}+\Omega_{d1}y^{\beta_1}}\right),
\end{aligned}
\tag{7}
$$

where ${}_2F_1(\cdot,\cdot;\cdot;z)$ represents the gaussian hypergeometric function [19, eq. (9.100)] and $\Gamma(\cdot)$ represents the gamma function [19, eq. (8.310.1)]. to the best of the authors' knowledge, the above expression for the joint pdf of the desired and interference signal envelopes at the output of the sir-based sc receiver is novel in the open technical literature. the pdf of the instantaneous sir at the sc output can be calculated using the following equation

$$p_z(z)=\int_0^\infty y\,p_{xy}(zy,y)\,dy. \tag{8}$$

by substituting (7) in (8) and integrating, the final expression for the pdf of the instantaneous sir at the receiver output is derived as

$$
\begin{aligned}
p_z(z)={}&\sum_{i,j=0}^{\infty}\frac{\beta_1\,\rho^{\,i+j}(1-\rho)^{2}\,\Gamma^{2}(i+j+2)}{(i!\,j!)^{2}\,(i+1)}\,\frac{(\Omega_{c1}\Omega_{c2})^{i+1}(\Omega_{d1}\Omega_{d2})^{j+1}\,z^{(\beta_1+\beta_2)(i+1)-1}}{\left(\Omega_{d1}+\Omega_{c1}z^{\beta_1}\right)^{i+j+2}\left(\Omega_{d2}+\Omega_{c2}z^{\beta_2}\right)^{i+j+2}}\\
&\times{}_2F_1\left(1,\,i+j+2;\,i+2;\,\frac{\Omega_{c2}z^{\beta_2}}{\Omega_{d2}+\Omega_{c2}z^{\beta_2}}\right)\\
&+\sum_{m,n=0}^{\infty}\frac{\beta_2\,\rho^{\,m+n}(1-\rho)^{2}\,\Gamma^{2}(m+n+2)}{(m!\,n!)^{2}\,(m+1)}\,\frac{(\Omega_{c1}\Omega_{c2})^{m+1}(\Omega_{d1}\Omega_{d2})^{n+1}\,z^{(\beta_1+\beta_2)(m+1)-1}}{\left(\Omega_{d1}+\Omega_{c1}z^{\beta_1}\right)^{m+n+2}\left(\Omega_{d2}+\Omega_{c2}z^{\beta_2}\right)^{m+n+2}}\\
&\times{}_2F_1\left(1,\,m+n+2;\,m+2;\,\frac{\Omega_{c1}z^{\beta_1}}{\Omega_{d1}+\Omega_{c1}z^{\beta_1}}\right).
\end{aligned}
\tag{9}
$$

the pdf of the instantaneous sir at the output of the same system, obtained using the mathematical approach described by (3) and (4), is given in [12] by (11).

table 1 comparison of the number of terms of (9) and of (11) in [12] needed to achieve accuracy at the fourth significant digit (β1=2, β2=3, s1=s2=10db)

          z=5                                  z=25
          (9) in this paper   (11) in [12]     (9) in this paper   (11) in [12]
ρ=0.2     4                   6                5                   6
ρ=0.5     12                  13               13                  15
ρ=0.8     34                  34               34                  41

since convergence is a significant problem in infinite-series expressions, table 1 summarizes the number of terms that need to be summed in the expressions for the pdf of the instantaneous sir at the sc output obtained in this paper and in [12] to achieve accuracy at the 4th significant digit after truncation of the infinite series. instead of the individual signal and interference powers that appear in (9), the table uses their ratio at the input of the i-th branch of the selection combiner, si=ωdi/ωci, i=1,2. the results show that the expression obtained in this paper converges more rapidly than expression (11) in [12], making it more manageable for system analysis.

3. system performance analysis

the performance of the dual-branch sc system operating over correlated weibull fading channels is analysed using the analytically obtained expression for the pdf of the instantaneous sir at the output. the performance indicators considered in this section are outage probability, average bit error probability (abep) and average output sir. the influence of fading severity, correlation coefficient and average powers is studied.
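the truncated double series for the output sir pdf, and the outage probability obtained by integrating it up to a threshold, can be sketched in python as follows. this is a minimal sketch under stated assumptions rather than the authors' implementation: it works with the ratios s_i = ω_di/ω_ci, truncates both summation indices at a hypothetical limit `terms`, evaluates the gaussian hypergeometric function by its power series (its argument stays below one), and integrates with a plain midpoint rule. the series coefficients were re-derived from (1), (2), (6) and (8) and should be checked against (9) before being relied on. for ρ = 0 the sum collapses to the pdf of one branch times the cdf of the other, which gives a simple sanity check.

```python
import math


def hyp2f1_unit_a(b, c, w, tol=1e-13, max_terms=200000):
    """Power series of 2F1(1, b; c; w), valid for 0 <= w < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (b + k) / (c + k) * w
        total += term
        if term <= tol * total:
            break
    return total


def _selected_branch_series(z, beta_a, beta_b, s_a, s_b, rho, terms):
    """Truncated double series for the case that branch 'a' is selected."""
    wa = z ** beta_a
    wb = z ** beta_b
    arg = wb / (wb + s_b)  # argument of the hypergeometric function
    log_den = math.log(wa + s_a) + math.log(wb + s_b)
    log_s = math.log(s_a) + math.log(s_b)
    log_z = math.log(z)
    total = 0.0
    for i in range(terms):
        for j in range(terms):
            if rho == 0.0 and i + j > 0:
                continue  # higher-order terms vanish without correlation
            n = i + j + 2
            # Logarithm of Gamma(n)^2 / ((i! j!)^2 (i+1)) and of the z terms.
            log_coef = (2.0 * math.lgamma(n) - 2.0 * math.lgamma(i + 1)
                        - 2.0 * math.lgamma(j + 1) - math.log(i + 1)
                        + (j + 1) * log_s
                        + ((beta_a + beta_b) * (i + 1) - 1.0) * log_z
                        - n * log_den)
            total += (rho ** (i + j)) * math.exp(log_coef) \
                * hyp2f1_unit_a(n, i + 2, arg)
    return beta_a * (1.0 - rho) ** 2 * total


def sir_pdf(z, beta1, beta2, s1, s2, rho, terms=20):
    """Truncated series for the pdf of the output SIR, z > 0."""
    return (_selected_branch_series(z, beta1, beta2, s1, s2, rho, terms)
            + _selected_branch_series(z, beta2, beta1, s2, s1, rho, terms))


def outage_probability(z_th, beta1, beta2, s1, s2, rho, terms=20, steps=200):
    """Midpoint-rule integral of the pdf over (0, z_th)."""
    h = z_th / steps
    return h * sum(sir_pdf((k + 0.5) * h, beta1, beta2, s1, s2, rho, terms)
                   for k in range(steps))
```

the truncation limit needed for a given accuracy grows with ρ, in line with the trend reported in table 1, so `terms` should be increased until the result stops changing at the digits of interest.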
moreover, the numerical results are compared with the numerical results in [12] to verify the mathematical approach proposed in this work.

3.1. outage probability

being a basic system performance measure in an interference-limited environment, the outage probability, pout, can be defined as the probability that the output sir drops below a specified threshold zth

$$P_{out}=\int_0^{z_{th}}p_z(z)\,dz. \tag{10}$$

fig. 1 depicts the outage probability of a balanced (s1=s2=s) dual-branch sc receiver as a function of the outage threshold for different system parameters. the results obtained in this paper match the results obtained in [12] perfectly. the outage probability decreases for lower values of the outage threshold and higher weibull fading parameters. for higher values of the outage threshold, when the desired signal is dominant, the system performance deteriorates as the weibull fading parameter increases. for fixed values of the weibull fading parameters, the system performance deteriorates as the correlation coefficient increases.

fig. 1 outage probability of dual-branch sc system (axes: outage probability versus outage threshold [db]; s=6db; the plot also shows the results obtained using [12] for corresponding parameters)

a comparison of the outage probability results for the ds algorithm [15] and the sir algorithm for different fading severity is illustrated in fig. 2. the branches of the receiver are correlated and balanced. it can be seen that the system with the sir algorithm shows slightly better performance in terms of outage probability than the system with the ds algorithm.

fig. 2 result comparison of outage probability for sir and ds decision algorithms (axes: outage probability versus outage threshold [db]; β=1 and β=4; ρ=0.6; s1=s2=2db)

3.2. average bit error probability

abep is one of the important first-order performance measures. it is often used for system performance evaluation because it is the most revealing of the nature of the system behaviour. abep is calculated using the conditional bit error probability (bep), which is a function of the modulation/detection scheme employed by the system. in this paper, two modulations are considered, bdpsk and bfsk. for these two cases, the conditional bep for a given sir is

$$P_e(z)=\frac{1}{2}e^{-gz^{2}}, \tag{11}$$

where g represents the modulation constant, with g=1 for bdpsk and g=1/2 for bfsk. abep at the sc output can be evaluated directly by averaging the conditional bep over the pdf of z

$$P_e=\int_0^\infty P_e(z)\,p_z(z)\,dz. \tag{12}$$

fig. 3 illustrates abep of a balanced dual-branch sc receiver for bfsk and bdpsk signalling for different values of the correlation coefficient. the results obtained in [12] perfectly match the results obtained in this paper. the system performance is better for lower values of the correlation coefficient, which means that the system performance improves as the distance between the antennas increases. when the correlation is too high, deep fades can occur in the branches simultaneously, which limits the improvement offered by the considered space diversity. the figure also shows that the system with bdpsk signalling performs better than the system with bfsk signalling, which agrees with the conclusion presented in [6].

fig. 3 the influence of correlation coefficient on abep of dual-branch sc system (axes: abep versus s [db]; curves for bfsk and bdpsk; the plot also shows the results obtained using [12] for corresponding parameters)

fig. 4 the influence of fading severity on abep of dual-branch sc system (axes: abep versus s [db]; curves for bfsk and bdpsk; the plot also shows the results obtained using [12] for corresponding parameters)

in fig. 4, abep of a balanced dual-branch sc receiver for bfsk and bdpsk signalling is presented for different fading intensity. it is obvious that system performance is better in the environment with lower fading parameter. it is interesting to note that for lower values of s, bfsk signalling with a lower value of β shows worse system performance than bdpsk signalling with a higher value of β, while for higher values of s the situation is reversed. this can be explained by the fact that in the considered scenario the desired signal and cci, which is inferior for higher values of s, are exposed to the same fading severity.

3.3. average output sir

the average output sir is one more useful parameter that is used in wireless communications when cci is present. it can be calculated by

$$\bar{z}_{sc}=\int_0^\infty z\,p_z(z)\,dz. \tag{13}$$

fig. 5 is plotted based on (9) and (13). it shows that the results obtained using (9) match perfectly the results obtained using the mathematical approach presented in [12]. the figure shows that the average output sir degrades rapidly for higher values of the correlation coefficient. it is also obvious that the system performance is better for higher values of s, which is more significant in the case of lower values of the fading parameters.

fig. 5 average output sir as a function of correlation coefficient (axes: average output sir versus ρ; curves for β1=β2=2.5, s=3db; β1=β2=2.5, s=6db; β1=β2=4.7, s=3db; β1=β2=4.7, s=6db; the plot also shows the results obtained using [12] for corresponding parameters)

4. conclusion

this paper studies the performance of a dual-branch sc receiver operating over correlated weibull fading channels in the presence of weibull distributed cci for the case when the sir algorithm is applied. the pdf of the instantaneous sir at the system output was derived using a mathematical approach based on the calculation of the joint pdf of the desired signal and interference signal envelopes at the output. using the pdf of the instantaneous sir at the system output, outage probability, abep and average output sir were evaluated as efficient system performance measures. numerical results were presented graphically, describing the influence of the correlation coefficient, fading severity and average sir at the input on the overall system performance. in addition, the obtained results were compared to the results in [12] and, as expected, matched them perfectly. it was shown that the expression for the pdf of the instantaneous sir obtained in this paper converges faster than the expression in [12]; therefore, the novel expression derived in this paper can be used more efficiently. moreover, the joint pdf of the desired signal and interference signal envelopes at the system output can be used to calculate other important distributions. for example, the pdf of the sum of the desired signal and interference signal envelopes can be obtained and applied in the performance analysis of a system with micro- and macrodiversity when the macrodiversity combiner uses the total signal power algorithm. motivated by these facts, the subject of our future work will be the generalization of the mathematical approach to an arbitrary order of diversity and to a macrodiversity system based on the ts algorithm.

references

[1] f. babich, g. lombardi, "statistical analysis and characterization of the indoor propagation channel," ieee trans. commun., vol. 48, pp. 455-464, mar. 2000.
[2] h. hashemi, "the indoor radio propagation channel," proc. ieee, vol. 81, pp. 943-968, july 1993.
[3] g. tzeremes, c. g. christodoulou, "use of weibull distribution for describing outdoor multipath fading," in proc. of the ieee antennas and propagation society international symposium 1, 2002, pp. 232-235.
[4] n. s. adawi, et al., "coverage prediction for mobile radio systems operating in the 800/900 mhz frequency range," ieee trans. veh. technol., vol. 37, no. 1, pp. 3-72, feb. 1988.
[5] n. h. shepherd, "radio wave loss deviation and shadow loss at 900 mhz," ieee trans. veh. technol., vol. 26, pp. 309-313, nov. 1977.
[6] m. k. simon, m. s. alouini, digital communications over fading channels, john wiley & sons, inc. 2000.
[7] y. li, x. g. xia, g. wang, "simple iterative methods to exploit the signal-space diversity," ieee trans. commun., vol. 53, no. 1, pp. 32-38, jan. 2005.
[8] j. boutros, e. viterbo, "signal space diversity: a power and bandwidth-efficient diversity technique for rayleigh fading channel," ieee trans. inf. theory, vol. 44, pp. 1453-1467, july 1998.
[9] s. h. lin, t. c. lee, m. f. gardina, "diversity protections for digital radio-summary of ten-year experiments and studies," ieee commun. magazine, vol. 26, no. 2, feb. 1988, pp. 51-64.
[10] w. jakes, microwave mobile communications, john wiley & sons, inc. 1974.
[11] g. k. karagiannidis, d. a. zogas, n. c. sagias, s. a. kotsopoulos, g. s. tombras, "equal-gain and maximal-ratio combining over nonidentical weibull fading channels," ieee trans. wireless commun., vol. 4, no. 3, pp. 841-846, may 2005.
[12] m. c. stefanovic, d. m. milovic, a. m. mitic, m. m. jakovljevic, "performance analysis of system with selection combining over correlated weibull fading channels in the presence of cochannel interference," int. j. aeü, vol. 62, no. 9, oct. 2008, pp. 695-700.
[13] p. spalevic, n. sekulovic, z. georgios, e. mekic, "performance analysis of sir-based triple selection diversity over correlated weibull fading channels," facta universitatis, series electronics and energetics vol. 23, no. 1, apr. 2010, pp. 89-98.
[14] m. stefanovic, d. draca, a. panajotovic, n. sekulovic, "performance analysis of system with l-branch selection combining over correlated weibull fading channels in the presence of cochannel interference," int. j. commun. systems, vol. 23, no. 2, pp. 139-150, feb. 2010.
[15] a. golubovic, n. sekulovic, m. stefanovic, d. milic, i. temelkovski, "performance analysis of dual-branch selection diversity receiver that uses desired signal algorithm in correlated weibull fading environment," tehnicki vjesnik-technical gazette, vol. 21 no. 5, pp. 953-957, 2014.
[16] n. sekulovic, m. stefanovic, a. golubovic, i. temelkovski, b. trenkic, m. peric, s. milosavljevic, "performance analysis of triple-branch selection diversity based on desired signal algorithm over correlated weibull fading channels," ttem technics technologies education management, vol. 7, no. 3, pp. 1013-1019, 2012.
[17] n. sekulović, a. golubović, č. stefanović, m. stefanović, "average output signal-to-interference ratio of system with triple-branch selection combining based on desired signal algorithm over correlated weibull fading channels," facta universitatis series automatic control and robotics, vol. 11, no 1, pp. 37-43, 2012.
[18] n. c. sagias, g. k. karagiannidis, "gaussian class multivariate weibull distributions: theory and applications in fading channels," ieee trans. inf. theory, vol. 51, no. 10, pp. 3608-3619, oct. 2005.
[19] i. gradshteyn, i. ryzhik, table of integrals, series and products, 7ed, ny: academic press, 2007.

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 209-221 doi: 10.2298/fuee1702209d

rf pa linearization by signals modified in baseband digital domain

aleksandra đorić 1, nataša maleš-ilić 2, aleksandar atanasković 2
1 innovation center of advanced technologies, niš, serbia
2 university of niš, faculty of electronic engineering, niš, serbia

abstract.
this paper presents the linearization of an rf power amplifier by a new approach that combines two different methods exploiting modified baseband signals. the signals for linearization in both methods are formed and processed in the digital domain. the required modified baseband signals for linearization are products of the second-order nonlinearity of a nonlinear system fed by the useful baseband signal. in the first method, an adequate part of the modified baseband signal is adjusted in amplitude and polarity and injected at the input and output of the amplifier transistor across a series lc resonant circuit. in the second method, the appropriate modified baseband signal, set to the appropriate amplitude and phase, modulates the second harmonic of the fundamental carrier, which is then inserted at the input and output of the amplifier transistor. the effects of the combined linearization method are considered on a single-stage power amplifier for quadrature amplitude modulated signals with frequency spacing between spectral components of up to 60 mhz at different input power levels, as well as for a wcdma digitally modulated signal.

key words: linearization, power amplifier, baseband signal, second harmonic, intermodulation products.

1. introduction

the new generation of communication technologies and standards imposes demanding requirements on new systems in order to increase bit rate, linearity and spectrum efficiency. these requirements present a serious task for transmitter designers regarding the power amplifier (pa) topology, which should support wideband operation, various modulation formats, a diversity of signal bandwidths and frequency ranges, high efficiency, as well as linear operation [1]. to achieve high power efficiency, the power amplifier should operate closer to its compression region, which distorts the output signals.
consequently, significant efforts have been devoted to the development of linearization techniques for nonlinear rf and microwave power amplifiers. various linearization methods for minimizing the nonlinear distortion of power amplifiers have been reported in the literature [2-5]: feedback, feed-forward, predistortion, etc.

received june 10, 2016; received in revised form september 12, 2016
corresponding author: aleksandra đorić, innovation center of advanced technologies, bulevar nikole tesle 61/5, 18 000 niš, serbia (e-mail: alexdjoric@yahoo.com)

in the previously deployed linearization technique that uses the second-order (im2) and fourth-order (im4) nonlinear signals of the fundamental signals at frequencies around the second harmonics [6-12], the signals for linearization were generated and prepared for injection at the input and output of the transistor amplifier in the rf analogue signal domain. the linearization effects were validated on single-stage rf power amplifiers through simulation [6]-[8] and experiments [9], as well as on doherty amplifiers [10-12]. in this paper, we combine two linearization methods that exploit modified baseband signals formed and processed in the digital domain [13], [14] in order to linearize the rf amplifier. in the first method, an adequately prepared baseband signal is adjusted in amplitude and polarity and injected at the input and output of the amplifier transistor across the lc resonant circuit. in the second method, specified baseband signals, after their amplitude and phase are set appropriately, modulate the second harmonic of the carrier, and the resulting signal is fed to the gate and drain of the amplifier transistor.
the injected signals for linearization and the fundamental signal are mixed by the second-order nonlinearity of the transistor, generating additional third-order nonlinear products that may suppress the original intermodulation products caused by the nonlinear characteristic of the transistor. the impact of the proposed linearization methods is considered on a single-stage power amplifier for qam signals whose i and q components are single tones with a maximum spectrum bandwidth of 60 mhz, as well as for a wcdma digitally modulated signal. the results obtained by the combined method are validated by comparison with the results achieved by the first and second linearization approaches, which are described in [13] and [14]. since the linearization results for the first method are presented in [13] for only one power level of the qam signal, in this paper we have analysed its linearization effect for a range of input signal powers.

this paper is organized as follows: section ii explains the theoretical background of the combined linearization approach; section iii presents the linearization results for the designed single-stage rf power amplifier obtained by the combined method proposed in this paper, which are also compared to the results obtained by the first linearization approach; conclusions are reported in section iv.

2. theoretical analysis of the combined linearization technique

the operating principle of the linearization method proposed herein can be understood from a theoretical analysis of the nonlinearity of the current at the transistor output in the amplifier circuit. when memory effects are neglected, the dominant nonlinearity of fets can be represented by a taylor-series polynomial model [15-17]

$$i_{ds}=g_{m1}v_{gs}+g_{m2}v_{gs}^{2}+g_{m3}v_{gs}^{3}+g_{d1}v_{ds}+g_{d2}v_{ds}^{2}+g_{d3}v_{ds}^{3}+g_{m1d1}v_{gs}v_{ds}+g_{m2d1}v_{gs}^{2}v_{ds}+g_{m1d2}v_{gs}v_{ds}^{2}+\dots \tag{1}$$

the drain current of the transistor (ids) depends on the voltage between the gate and source (vgs), which is expressed by the transconductance terms labeled gmx. the dependence of the drain current on the voltage between the drain and source (vds) is included through the drain conductance terms gdy. in addition, the drain current is a function of both the gate-source and the drain-source voltage, which is represented by the mixed coefficients gmxdy. the order of each coefficient is x + y. the digitally modulated signal is characterized by the magnitude c(t), phase φ(t), and carrier frequency ω0, as in

$$v(t)=c(t)\cos(\omega_0t+\varphi(t))=c(t)\left[\cos(\varphi(t))\cos(\omega_0t)-\sin(\varphi(t))\sin(\omega_0t)\right]=v_s\left(I\cos(\omega_0t)-Q\sin(\omega_0t)\right) \tag{2}$$

where $I=(c(t)/v_s)\cos(\varphi(t))$ and $Q=(c(t)/v_s)\sin(\varphi(t))$ are the in-phase and quadrature-phase components of the baseband signal. if the digital signal expressed by eq. 2 is fed to the input of a second-order nonlinear system, the system generates the output signal

$$v_{out\_2\_order}(t)=v_{in}^{2}=\frac{1}{2}v_s^{2}\left[I^{2}+Q^{2}\right]+\frac{1}{2}v_s^{2}\left[(I^{2}-Q^{2})\cos(2\omega_0t)-2IQ\sin(2\omega_0t)\right]. \tag{3}$$

the linearization approach suggested in this paper utilizes the complete modified baseband signal from eq. 3: the baseband signal in the appropriate form (the first term) together with the second harmonic of the fundamental carrier modulated by the adequately shaped baseband signal, whose modified (new) in-phase and quadrature-phase components have the forms $I_{new}=I^{2}-Q^{2}$ and $Q_{new}=2IQ$ (the second and third terms).

fig. 1 shows the schematic diagram of the amplifier with the linearization circuit that forms, processes, and injects the linearization signals at the input and output of the amplifier transistor in the case of the combined linearization method.
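the decomposition in eq. 3 is a plain trigonometric identity, and it can be checked numerically with a few lines of python. the sketch below squares the signal of eq. 2 sample by sample and compares it with the sum of the baseband term and the modulated second harmonic; the function names, the test waveforms for i and q and the numeric values are arbitrary assumptions chosen only for the check.

```python
import math


def squared_signal(i, q, vs, w0, t):
    """Output of an ideal squarer driven by vs*(i*cos(w0 t) - q*sin(w0 t))."""
    v = vs * (i * math.cos(w0 * t) - q * math.sin(w0 * t))
    return v * v


def decomposed_signal(i, q, vs, w0, t):
    """Baseband term plus modulated second harmonic, as in eq. 3."""
    baseband = 0.5 * vs ** 2 * (i * i + q * q)
    i_new = i * i - q * q  # new in-phase component
    q_new = 2.0 * i * q    # new quadrature-phase component
    harmonic = 0.5 * vs ** 2 * (i_new * math.cos(2.0 * w0 * t)
                                - q_new * math.sin(2.0 * w0 * t))
    return baseband + harmonic


def max_mismatch(samples=2000, vs=0.8, w0=2.0 * math.pi * 5.0):
    """Largest absolute difference between both forms over a time sweep."""
    worst = 0.0
    for k in range(samples):
        t = k * 1e-3
        i = math.cos(0.7 * t)        # arbitrary in-phase test waveform
        q = math.sin(1.3 * t + 0.2)  # arbitrary quadrature test waveform
        diff = abs(squared_signal(i, q, vs, w0, t)
                   - decomposed_signal(i, q, vs, w0, t))
        worst = max(worst, diff)
    return worst
```

calling `max_mismatch()` should return a value at the level of floating-point rounding, which confirms that the baseband part i² + q² and the pair (inew, qnew) together carry the whole second-order product.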
the baseband part of the signal for linearization in the form bbmod = i 2 + q 2 is multiplied by the coefficients a{ib/ob} for amplitude and polarity tuning. the baseband signals modified in this manner are then inserted over the series lc circuit at the gate and drain of the amplifier transistor. another modified components of the linearization signal, inew and qnew, which modulate the carrier second harmonic, are adjusted in amplitude by a{i2h/o2h} and phase by {i2h/o2h} across two branches. the prepared signals are then inserted at the input and at output of the amplifier transistor through the bandpass filters. the aforementioned indices consisting of the letters i and o in subscript are related to the signals injected at the input and output of the amplifier transistor, respectively. the voltage at the gate, given as, i2h 2 2 gs s 0 0 ib 2 2 i2h 0 0 g ( ) [ cos( ) sin( )] ( ) [( ) cos(2 ) 2 sin(2 )] j v t v i t q t a i q a e i q t iq t v                (4) is comprised of all signals injected: the fundamental useful signal, the gate bias signal, the modified baseband signal and the fundamental carrier second harmonic modulated by the adequately shaped baseband signal. the voltage at the drain, given as, o2h 2 2 ds o 0 0 ob 2 2 o2h 0 0 d ( ) [ cos( ) sin( )] ( ) [( ) cos(2 ) 2 sin(2 )] j v t v i t q t a i q a e i q t iq t v                (5) 212 a. đorić, n. maleš-ilić, a. atanasković consists of the fundamental signal amplified linearly, the drain bias voltage and the signals for linearization including the baseband signal and modulated second harmonic, which are appropriately modified, tuned, and injected together at the amplifier transistor drain. in eq. 5 o 0 0 ( cos( ) sin( ))v i t q t   is the output signal at the fundamental frequency, vg and vd are dc bias voltages supplied at the gate and drain of the amplifier transistor, respectively. fig. 
1 schematic diagram of the amplifier linearized by the injection of the modified baseband signals processed in digital domain the distorted output current is obtained by substituting the eq. 4 and eq. 5 in eq. 1. yielding i2h o2h i2h 3 ds s m3 ib s m2 i2h s m2 ob s m1d13 o2h s m1d1 ib o m1d1 i2h o m1d1 2 2 2 2 s o m1d2 s o m2d1 0 3 ( ) 2 4 1 1 2 2 3 3 ( )[ cos( ) 2 2 j im j j i t v g a v g a e v g a v g a e v g a v g a e v g v v g v v g i q i t q                         0 sin( )] (6)t   i2h i2h o2ho2h 25 2 2 2 ds s m5 ib s m3 i2h s m3 ob s m1d2 ib ob o m1d25 θ θ2θ2 o2h s m1d2 i2h o2h o m1d2 ib ob s m2d1 2 ib o m2d1 5 3 ( ) 3 2 8 2 1 2 2 1 j im jj i t v g a v g a e v g a v g a a v g a e v g a a e v g a a v g a v g                    i2h o2hi2h θ θ2θ2 i2h o m2d1 i2h o2h s m2d1 2 2 2 0 0 2 ( ) [ cos( ) sin( )] (7) jj a e v g a a e v g i q i t q t           rf pa linearization by signals modified in baseband digital domain 213 where eq. 6 and eq. 7 refer to the thirdand fifth-order intermodulation products of the drain current at the fundamental frequency, respectively. the nonlinearity of the drain current in terms of the voltage between the drain and source, vds, is expressed by the coefficients gd1  gd3 and according to [16] and [17] it is assumed to have an inessential impact on the intermodulation products and have been omitted from the equations. the first term in eq. 6 represents the signal distorted by the cubic term of the amplifier (gm3), which is considered as a dominant in arousing the third-order intermodulation products, im3, and spectral regrowth [16], [17]. the gm2 second-order transconductance nonlinear products of the fundamental signal and the linearization signals injected at the amplifier transistor gate are expressed as the second and third terms in the eq.6. 
the fourth and fifth terms, with gm1d1, are the mixing products between the gate-source voltage of the fundamental signal and the voltage of the linearization signals fed at the drain of the amplifier transistor. additionally, the fundamental signal at the output of the transistor mixes with the linearization signals driven at the input of the amplifier transistor, generating the sixth and seventh terms. the drain current at the im3 frequencies also includes the third-order mixing products, gm1d2 and gm2d1, between the drain and gate voltages of the fundamental signal (the eighth and ninth terms in eq. 6). since the output signal at the fundamental frequency is considered to be 180 degrees out of phase with respect to the input signal, these products reduce each other [17] due to their opposite phases.

according to the previous analysis, it is possible to reduce the spectral regrowth caused by the third-order distortion of the fundamental signal by selecting the appropriate amplitude and polarity of the modified baseband signal injected at the input (aib) and output (aob) of the amplifier transistor, as well as by choosing the adequate amplitude and phase of the modified baseband signal that modulates the second harmonic injected at the input (ai2h, θi2h) and output (ao2h, θo2h) of the amplifier transistor.

the first term in eq. 7 is formed due to the fifth-order nonlinearity of the amplifier (gm5) and expresses the fifth-order intermodulation products of the drain current of the amplifier transistor. the mixed terms between the drain and gate, gm1d2 and gm2d1, are the products between the fundamental signal and the baseband linearization signal, as well as between the fundamental signal and the modulated second harmonic, which exist at the amplifier transistor input or output.
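to illustrate the third-order cancellation mechanism in the simplest possible setting, the python sketch below keeps only a memoryless polynomial in the gate voltage, i = g1 v + g2 v² + g3 v³, and injects only the baseband signal aib (i² + q²) at the input. this is a deliberate simplification and not the amplifier model analysed above: the drain-voltage terms, the second-harmonic injection and memory effects are ignored, and the coefficient values are made up for the example. in this toy model the second-order mixing product 2 g2 aib vs cancels the cubic term (3/4) g3 vs³ when aib = −3 g3 vs² / (8 g2); the function names are hypothetical.

```python
import math


def tone_amplitude(samples, k):
    """Amplitude of the spectral component in DFT bin k of a real sequence."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * m / n) for m, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * m / n) for m, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def im3_level(a_ib, g1=1.0, g2=0.5, g3=-0.2, vs=0.5, n=4096, kc=400, kd=10):
    """Upper IM3 amplitude of the toy polynomial for a two-tone drive.

    Assumed test signal: I(t) = cos(kd-th bin), Q(t) = 0, carrier in bin kc,
    so the two tones sit in bins kc - kd and kc + kd and IM3 in kc + 3*kd.
    """
    out = []
    for m in range(n):
        i_bb = math.cos(2.0 * math.pi * kd * m / n)
        fundamental = vs * i_bb * math.cos(2.0 * math.pi * kc * m / n)
        v = fundamental + a_ib * i_bb ** 2  # injected baseband term, Q = 0
        out.append(g1 * v + g2 * v ** 2 + g3 * v ** 3)
    return tone_amplitude(out, kc + 3 * kd)


def cancelling_coefficient(g2, g3, vs):
    """Baseband amplitude that nulls the leading IM3 term of the toy model."""
    return -3.0 * g3 * vs ** 2 / (8.0 * g2)
```

with the default values, `im3_level(0.0)` gives the uncorrected im3 amplitude and `im3_level(cancelling_coefficient(0.5, -0.2, 0.5))` gives the residual after injection. the residual is not exactly zero because the injected signal also mixes through the cubic term, which is one reason why the coefficients have to be tuned rather than only computed.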
it is supposed that these terms of the drain current at the im5 frequencies neutralize each other to a certain extent, which depends on the phase relations between the linearization signals driven at the gate and drain as well as on the intensity of the mixing products. however, the second and third gm3 terms in eq. 7 may increase or decrease the im5 products, depending on the signs of the third- and fifth-order nonlinear coefficients gm3 and gm5 and on the linearization signal phase θi2h.

3. linearization results

in order to estimate the effects of linearization, the proposed combined approach is applied to the broadband rf amplifier designed in the agilent advanced design system (ads) software. the design is based on the nonlinear met model of the freescale mrf281s ldmosfet transistor and includes synthesis of the input and output broadband matching circuits with lumped elements [7]. the source and load impedances of the amplifier transistor were determined by source-pull and load-pull analysis in ads, yielding high drain efficiency and maximum output power [8]. the amplifier circuit was designed to operate over the frequency range 0.7 ghz to 1.1 ghz. we considered the influence of the bandpass filters (ideal elements from the ads library) connected to the gate and drain of the amplifier transistor to supply the linearization signal, which comprises the modulated second harmonic of the fundamental carrier, to the amplifier circuit. the series lc circuits that enable injection of the modified baseband signals for linearization into the amplifier were also included in the analysis. the gain, power-added efficiency (pae) and output power of the amplifier loaded by the lc circuits and bandpass filters are shown in fig. 2 as functions of the input power for single-tone excitation.
it can be noted that the maximum gain observed at 1 ghz is slightly greater than 22 db and varies by approximately 2 db with the excitation signal frequency. the power-added efficiency at the labeled frequencies deviates from the pae at 1 ghz by at most 5 %, whereas the maximum pae is 50 % at 1 ghz at the maximum output power of around 36 dbm.

fig. 2 gain, pae and pout of the designed amplifier with the series lc circuits and bandpass filters loading the gate and drain of the amplifier transistor

the designed power amplifier was tested with qam modulated signals whose spectrum contains two frequency components separated by 2 mhz up to 60 mhz, with a centre frequency of 1 ghz. in the ads simulations, the timed source component named qam was used as the source of the signals. the analysis was carried out for different fundamental signal power levels at the amplifier input: 0 dbm, 3 dbm and 7 dbm. the power levels of the third-order and fifth-order intermodulation products, before and after the linearization, are presented in fig. 3 and fig. 4 as functions of the frequency interval between the spectral components of the qam signal for the different input power levels. it should be noted that the values of the linearization coefficients a{ib/ob}, for amplitude and polarity tuning of the baseband signals, and the coefficients a{i2h/o2h} and θ{i2h/o2h}, for amplitude and phase adjustment of the linearization signals that modulate the carrier second harmonic, were obtained by the optimization process in ads for each considered input signal power level. the random optimization of the adjustable coefficients of the linearization signals was carried out with the aim of suppressing the third-order intermodulation products and keeping the fifth-order intermodulation products at levels below the reduced im3 products.
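for orientation, the positions of the intermodulation products in such a two-tone test follow directly from the tone spacing. the python sketch below is not part of the ads simulation; it assumes two equal tones placed symmetrically around the 1 ghz centre frequency (the function name and the mhz units are illustrative choices) and lists where the im3 and im5 products fall.

```python
def intermod_frequencies(f_center_mhz, spacing_mhz):
    """Return the two tones and their IM3/IM5 product frequencies in MHz.

    Assumption: two tones placed symmetrically around f_center_mhz and
    separated by spacing_mhz (a simple model of the two-component QAM
    test signal, not the ADS setup itself).
    """
    f1 = f_center_mhz - spacing_mhz / 2.0  # lower tone
    f2 = f_center_mhz + spacing_mhz / 2.0  # upper tone
    return {
        "tones": (f1, f2),
        "im3": (2 * f1 - f2, 2 * f2 - f1),          # third-order products
        "im5": (3 * f1 - 2 * f2, 3 * f2 - 2 * f1),  # fifth-order products
    }


if __name__ == "__main__":
    # spacings taken from the range considered in the text (2 MHz to 60 MHz)
    for spacing in (2.0, 20.0, 60.0):
        print(spacing, intermod_frequencies(1000.0, spacing))
```

the im3 products lie 1.5 times the tone spacing away from the centre frequency and the im5 products 2.5 times the spacing, so at 60 mhz spacing the im5 products sit at 850 mhz and 1150 mhz.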
we compared two cases: linearization achieved by inserting only the modified baseband signals at the input and output of the amplifier transistor (the first or baseband method) [13], and linearization performed by the combined approach, i.e. a simultaneous injection of the adequately modified baseband signals together with the second harmonic of the fundamental carrier modulated by another, differently modified baseband signal at the input and output of the amplifier transistor. the combined linearization approach encompasses the linearization methods referred to above as the first (baseband) and second methods [13], [14].

figure 3 shows the third-order intermodulation products, im3, before and after the linearization for the compared linearization cases. it can be noted that a greater reduction of the im3 products was achieved by the combined linearization approach proposed in this paper for all input power levels over the considered power range. the linearization method that exploits only the modified baseband signal was proposed and tested in [13] for a specific input power level of the qam signal and a range of input power for the wcdma signal. in this paper, we obtained the linearization results for a power range of the qam signal. the suppression of the im3 products attained with the combined approach suggested in this paper is greater by around 25 db to 15 db than the results of the baseband approach from [13] for frequency spacing between the qam spectral components from 2 mhz to 20 mhz. the im3 reduction is at least around 25 db for a frequency separation of 20 mhz when the combined approach is applied. a general observation is that, as the input power increases and the frequency span becomes wider, the reduction of the im3 products decreases.
the im3 products are lessened by 10 db in the case of 7 dbm input power and 60 mhz frequency span when the combined method is applied, which is still a much better result than the reduction of only a few decibels obtained with the baseband method. moreover, it should be stressed that the results achieved by the combined method are also notably better than the im3 reduction reported in [14], wherein the fundamental carrier second harmonic modulated by the shaped baseband signal was utilized for the linearization. better linearization results are obtained over the whole signal power range and for all frequency spacings between the spectral components, which is especially significant for larger spacing: e.g. for input power 7 dbm and spacing 20 mhz, the im3 products are lowered by barely a few decibels in the linearization approach from [14], whereas in this paper the combined approach decreases the im3 products by 26 db.

the influence of the performed linearization approaches on the fifth-order intermodulation products, im5, is presented in fig. 4. the simulation shows that the im5 products are lessened by at least 10 db for frequency intervals between the signals up to 20 mhz, while, by applying the modified baseband linearization signals, the im5 products stayed unaltered with respect to the state before the linearization for almost all considered input power levels and frequency spacings between the qam components. an exception is noted at the input power of 3 dbm, where the reduction of im5 is 6 db to 13 db.

fig. 3 third-order intermodulation products of the rf power amplifier for qam signal before and after the linearization for different input power levels: a) 0 dbm, b) 3 dbm, c) 7 dbm

fig. 4 fifth-order intermodulation products of the rf power amplifier for the qam signal before and after the linearization for different input power levels: a) 0 dbm, b) 3 dbm, c) 7 dbm

by application of the combined linearization method, the reduction of the im5 products increases significantly in relation to the previous method and depends on the power and frequency spacing between the qam signal spectral components in a manner similar to the results for the im3 products: the linearization results are significantly better than when only the modified baseband signal is used for linearization. the reduction ranges from 14 db at 10 mhz signal spacing for the specified power range to 8 db at 60 mhz spacing and 3 dbm input power level. at 7 dbm input power, the im5 products drop by 27 db at 10 mhz spacing, whereas they remain at the level before the linearization at 60 mhz spacing between the spectral components.

in comparison with the results achieved in [14], where only the modulated second harmonics carried out the linearization, better results concerning the im5 products reduction are obtained by the combined method proposed herein. namely, the results from [14] show that the im5 products stayed unchanged or were lowered by a few decibels in almost every analysed case (0 dbm to 7 dbm input signal power and 10 mhz to 20 mhz frequency spacing). it should be noted that in [14] the im5 products are not lessened with respect to the level before the linearization at 20 mhz spectral component spacing, whereas with the combined linearization approach we have accomplished im5 suppression up to a frequency spacing of 60 mhz for 0 dbm and 3 dbm input power.

additionally, the influence of the suggested linearization methods was also investigated for the wcdma signal, which has a 1 ghz centre frequency, a spectrum width of 3.84 mhz and a peak-to-average power ratio (papr) of 6 db, over a range of fundamental signal average output power.
the adequate values of the coefficients a{ib/ob}, a{i2h/o2h} and θ{i2h/o2h} for the required linearization results were determined by ads optimization. it should be noted that the linearization coefficients obtained for the wcdma signal differ from the values obtained for the qam signals. the linearization results obtained by the combined linearization approach are also compared with the results of the method that uses only the modified baseband signal [13], as indicated in fig. 5 and fig. 6. the same observation regarding the results of the baseband linearization approach and the combined approach holds for the wcdma signal as for the qam signal previously considered.

in the baseband linearization case, the adjacent channel power ratio (acpr) is improved by around 10 db at power levels greater than 24 dbm in the range of the dominant third-order intermodulation products at ±4 mhz offset from the carrier (fig. 5), while in the range of the dominant fifth-order intermodulation products at ±8 mhz offset from the carrier, the acpr remains at the levels before the linearization, with the exception of the higher observed power levels, where it is improved by a few decibels. compared with these results, the combined linearization method applied in this paper improves the acpr at ±4 mhz offset from the carrier by up to 5 db more than the baseband approach. additionally, the acpr improvement in the range of the dominant fifth-order intermodulation products is better by a few decibels when the combined method is utilized than in the first approach. moreover, in reference [14] we analysed the acpr of the wcdma signal before and after the linearization by applying the modulated fundamental carrier second harmonic for only 11 dbm input signal power (output power of 29 dbm), where the acpr was improved by more than 10 db at ±4 mhz offset from the carrier.
it can be seen from fig. 5 that the combined method gives around 15 db acpr improvement at that output power level.

fig. 5 acpr before the linearization (solid line) and after the linearization (dashed and dotted lines) at ±4 mhz offset from the carrier (the range of the dominant third-order distortion) for the wcdma digitally modulated signal as a function of average output power

fig. 6 acpr before the linearization (solid line) and after the linearization (dashed and dotted lines) at ±8 mhz offset from the carrier (the range of the dominant fifth-order distortion) for the wcdma digitally modulated signal as a function of average output power

4. conclusion

this paper presents a new linearization approach that combines variously modified baseband signals, one of which modulates the second harmonic of the fundamental carrier. the proposed linearization method uses the i and q signals that are adequately processed in the digital domain to form the linearization signals, which are inserted into the gate and drain of the rf power amplifier transistor. the impact of the proposed combined linearization technique on the suppression of the intermodulation products is assessed by simulation in ads for the qam signal whose i and q components are sinusoidal signals and whose spectrum contains two frequency components symmetrical around the carrier frequency. the linearization effects for different input power levels and different frequency spacings between the signal spectral components are examined for the proposed linearization method. also, the obtained results are compared with the results achieved when only the modified baseband signals are fed to the amplifier circuit.
it may be noted that significantly better results are achieved in the reduction of the third- and fifth-order nonlinearity of the amplifier by the combined linearization method in comparison with the method that uses only the modified baseband linearization signal. the same may be inferred regarding the nonlinearity suppression by the combined method with respect to the results given in the literature that were reached by the method that performs linearization by the second harmonics of the fundamental carrier modulated by adequately modified baseband signals. additionally, the linearization influence is also demonstrated for the wcdma digitally modulated signal. the combined linearization method also gives a greater improvement of the acpr for the wcdma digitally modulated signal in the range of the dominant third-order as well as the fifth-order distortions relative to the two mentioned linearization approaches.

acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia, the project number tr-32052.

references

[1] n. usachev, v. elesin, a. nikiforov, g. chukov, g. nazarova, d. sotskov, n. shelepin, v. dmitriev, “system design considerations of universal uhf rfid reader transceiver ics”, facta universitatis, series: electronics and energetics, vol. 28, no. 2, pp. 297-307, june 2015.
[2] p. kenington, high-linearity rf amplifier design. artech house, 2000, chapters 4-6, pp. 135-423.
[3] s. cripps, rf power amplifiers for wireless communications. artech house, 1999, chapter 9, pp. 251-282.
[4] m. k. kazimierczuk, rf power amplifiers. wiley, 2008, chapter 9, pp. 321-343.
[5] n. mizusawa, s. kusunoki, “third- and fifth-order baseband component injection for linearization of the power amplifier in a cellular phone”, ieee transactions on microwave theory and techniques, vol. 53, no. 11, pp. 3327-34, 2005.
[6] n. maleš-ilić, b. milovanović, đ. budimir, “effective linearization technique for amplifiers operating close to saturation”, international journal of rf and microwave computer-aided engineering, vol. 17, no. 2, pp. 169-78, 2007.
[7] a. đorić, n. maleš-ilić, a. atanasković, b. milovanović, “linearization of broadband microwave amplifier”, serbian journal of electrical engineering, vol. 11, no. 1, pp. 111-120, february 2014.
[8] a. đorić, a. atanasković, n. maleš-ilić, b. milovanović, “linearization of microwave power amplifier for broadband applications”, xlviii international scientific conference on information, communication and energy systems and technologies icest2013, ohrid, republic of macedonia, pp. 65-68, 2013.
[9] a. atanasković, n. maleš-ilić, b. milovanović, “linearization of power amplifiers by second harmonics and fourth-order nonlinear signals”, microwave and optical technology letters, wiley periodicals, inc., a wiley company, vol. 55, issue 2, pp. 425-430, february 2013.
[10] a. atanasković, n. maleš-ilić, b. milovanović, “linearization of two-way doherty amplifier”, in proc. of microwave integrated circuits conference (eumic), european 2011, pp. 304-307.
[11] n. maleš-ilić, a. đorić, a. atanasković, “linearization of broadband two-way microstrip doherty amplifier”, facta universitatis, series: electronics and energetics, vol. 29, no. 1, pp. 127-138, march 2016.
[12] a. đorić, n. maleš-ilić, a. atanasković, b. milovanović, “linearization of broadband doherty amplifier”, in proceedings of the 11th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2013). niš, serbia, october 16-19, 2013, vol. 2, pp. 509-512.
[13] a. atanasković, n. maleš-ilić, a. đorić, m. živanović, “power amplifier linearization by modified baseband signal injection”, in proceedings of the 12th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2015). niš, serbia, 14-17 october, 2015, pp. 102-105.
[14] a. atanasković, n. maleš-ilić, k. blau, a. đorić, b. milovanović, “rf pa linearization using modified baseband signal that modulates carrier second harmonic”, microwave review, vol. 19, no. 2, pp. 119-124, december 2013.
[15] j. c. pedro and j. perez, “accurate simulation of gaas mesfet’s intermodulation distortion using a new drain-source current model”, ieee trans. microwave theory tech., vol. 42, pp. 25-33, january 1994.
[16] j. p. aikio and t. rahkonen, “detailed distortion analysis technique based on simulated large-signal voltage and current spectra”, ieee mtt trans. microwave theory tech., vol. 53, pp. 3057-3065, 2005.
[17] a. heiskanen, j. aikio, and t. rahkonen, “a 5-th order volterra study of a 30w ldmos power amplifier”, in proceedings of the international symposium on circuits and systems (iscas'03), bangkok, thailand, 2003, vol. 4, pp. 616-619.

facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp. 93-106 doi: 10.2298/fuee1701093k

comparison of measured performance and theoretical limits of gaas laser power converters under monochromatic light

rok kimovec, marko topič
university of ljubljana, faculty of electrical engineering, ljubljana, slovenia

abstract. evaluation of gaas laser power converters (lpc) is reported in light of theoretical maximum limits calculated with the detailed balance method as proposed by shockley and queisser (sq). calculations were done for three different theoretical structures of lpcs homogeneously illuminated by monochromatic light. effects of lpc thickness, central wavelength of the monochromatic light source and various irradiance levels are discussed.
reflection of incident light from the interface between air and gaas is calculated, and countermeasures in the form of single and double layer antireflection coatings are theoretically studied. measurements of a single junction, single segment gaas lpc illuminated by monochromatic light with central wavelength λ0 = 808 nm are presented and compared with the theoretical maximum values. the conversion efficiency ηmeas = 54.4 % was measured for a gaas lpc illuminated with power density of monochromatic light pillum = 14.3 w/cm² at the temperature of the lpc casing t = 302 k. for the same parameters, conversion efficiency ηsq = 76.6 % was calculated, resulting in the utilization ratio ηmeas/ηsq = 0.71. measured jsc and voc achieve 88.5 % and 89.2 % of the theoretically calculated sq limit values.

key words: laser power converter, shockley-queisser limit, gaas, monochromatic efficiency

received march 15, 2016; received in revised form june 13, 2016
corresponding author: rok kimovec, university of ljubljana, faculty of electrical engineering, tržaška cesta 25, 1000 ljubljana, slovenia (e-mail: rok.kimovec@fe.uni-lj.si)

1. introduction

the shockley-queisser (sq) limit [1]–[3] is a fundamental, widely adopted figure of merit used for evaluating efficiency limits of photovoltaic devices. it is based on the detailed balance method and assumes radiative recombination as the sole loss mechanism in a solar cell. calculations of the sq limits have already been done under standard solar spectra (am1.5, am1.0 and am0) and for the black body radiation spectrum. for purposes of power beaming, where a photovoltaic cell is illuminated by an artificial light source in order to transfer energy with no electrically conductive path, the sq limit under monochromatic illumination [4] will be calculated, since those systems commonly employ laser diodes as a source of monochromatic illumination. light energy irradiated from a laser diode is converted to electrical energy by gaas laser power converters (lpc) optimized for monochromatic light sources at a specific wavelength. in practice, laser diodes with central wavelength λ0 between 800 nm and 850 nm are often utilized, due to their low price and good system efficiency when employing gaas lpcs as optical-to-electrical energy converters. currently, state-of-the-art gaas lpcs achieve efficiencies greater than 56 % [5-6] when illuminated with monochromatic light with central wavelength λ0 between 810 nm and 820 nm and pillum between 50 w/cm² and 124 w/cm². in this paper we present theoretically calculated efficiency limits based on the detailed balance principle and compare them with a measured gaas lpc. conversion of optical energy to electrical energy will be presented with loss analysis for both the theoretical and the measured lpc.

2. model

the sq current density limit is calculated as the difference between the photogenerated current density and the loss of available current density due to radiative recombination (1):

$$j_{sq} = j_{ph} - q\,r_r \tag{1}$$

jph – photogenerated current density
q – elementary charge of electron
rr – radiative recombination rate of electron-hole pairs (e–h)

2.1. photogenerated current density

photogenerated current density is calculated from the flux of e–h pairs generated by absorbed photons (2). in this paper we present the calculation of the sq limit under monochromatic illumination applied to a gaas photovoltaic cell.
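as a rough numerical illustration of the photogenerated current density introduced in this subsection, the python sketch below treats the source as a single spectral line and assumes one e–h pair per absorbed photon. the function names and the `absorptivity` parameter are illustrative assumptions, not the authors' implementation, which integrates over a finite laser linewidth.

```python
# Exact SI values of the defining constants.
Q_E = 1.602176634e-19  # elementary charge, C
H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8       # speed of light, m/s


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a given vacuum wavelength in nm."""
    return H * C / (wavelength_nm * 1e-9) / Q_E


def photocurrent_density(p_illum_w_cm2, wavelength_nm, absorptivity=1.0):
    """Photogenerated current density in A/cm^2.

    Assumptions (for illustration only): strictly monochromatic light and
    one e-h pair per absorbed photon; `absorptivity` is the fraction of
    incident photons absorbed (1.0 gives the upper bound).
    """
    photon_energy_j = H * C / (wavelength_nm * 1e-9)
    photon_flux = p_illum_w_cm2 / photon_energy_j  # photons per cm^2 per s
    return Q_E * absorptivity * photon_flux


if __name__ == "__main__":
    print(photon_energy_ev(808.0))            # photon energy in eV
    print(photocurrent_density(0.1, 808.0))   # A/cm^2 at 100 mW/cm^2
```

with full absorption, 100 mw/cm² at 808 nm corresponds to roughly 65 ma/cm²; any absorptivity below one scales this value down proportionally.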
$$j_{ph} = q\,\varphi_{e-h} \tag{2}$$

φe–h – flux of photogenerated e–h pairs

φe–h represents the number of e–h pairs generated in the absorber per unit time per unit area. it is calculated using the absorption coefficient α of gaas as measured by [7], including the urbach tail with slope e0 below eg and slope e' above eg [8], fitted to the following equation (3) [9]:

$$\alpha(e_{ph}) = \begin{cases} \alpha_0 \exp\left(\dfrac{e_{ph} - e_g}{e_0}\right), & e_{ph} \le e_g \\ \alpha_0 \left(1 + \dfrac{e_{ph} - e_g}{e'}\right), & e_{ph} > e_g \end{cases} \tag{3}$$

eph – energy of photons incident on the lpc surface
eg = 1.42 ev – band gap of gaas
α0 = 8000 cm⁻¹
e0 = 6.7 mev
e' = 140 mev

from the known absorption coefficient α, the absorptivity a was considered for three different hypothetical structures of gaas lpcs of thickness l, shown in fig. 1, as follows [3,4]:

a) planar front surface with complete absorption on the back surface (4), representing a single pass of photons through the absorber.

$$a(e_{ph}, l) = 1 - \exp\left(-\alpha(e_{ph})\, l\right) \tag{4}$$

b) planar front surface with a perfectly reflecting mirror on the back surface (5), representing a double pass of photons through the absorber.

$$a(e_{ph}, l) = 1 - \exp\left(-2\,\alpha(e_{ph})\, l\right) \tag{5}$$

c) random texture on the front surface with a perfectly reflecting mirror on the back surface (6), representing multiple passes of photons through the absorber.

$$a(e_{ph}, l) = \frac{4\, n_{gaas}^2\, \alpha(e_{ph})\, l}{4\, n_{gaas}^2\, \alpha(e_{ph})\, l + 1} \tag{6}$$

ngaas – refractive index of gaas

all considered structures depend on the thickness, denoted l, and are assumed to be exposed to air. for the randomly textured structure, the absorptivity also depends on the refractive index ngaas of gaas.

fig. 1 different theoretical structures of gaas lpcs considered in calculations. arrows show the light path through the gaas absorber. r – reflection of light from the bottom surface

with known a, jph can be calculated from the flux of photons incident on the lpc front surface. reflection of incident light from the front surface is not taken into account here, but is added and discussed later. the laser spectrum around the central wavelength λ0 was interpolated with a gaussian distribution (7).

[equation (7) is not recoverable from the source]

the laser spectrum was weighted with the power density of the incident light entering the front surface of the lpc, resulting in the spectral irradiance (8), (10).

[equations (8)–(10) are not recoverable from the source]

fwhm – full width at half maximum
λ0 – central wavelength
h – planck’s constant
c – speed of light

the flux of photons entering the front surface per unit energy (11) can be derived from the known spectral irradiance.

[equation (11) is not recoverable from the source]

integration of a multiplied by this photon flux over the photon energies present in the spectral irradiance results in the flux of e–h pairs (12) generated by the flux of photons for the defined laser parameters λ0, fwhm and laser power density, and for the lpc thickness l.

[equation (12) is not recoverable from the source]

the photogenerated current density can be calculated from the known flux of e–h pairs (13) as a function of incident photon energy and thickness of the lpc.

$$j_{ph}(e_{ph}, l) = q\,\varphi_{e-h}(e_{ph}, l) \tag{13}$$

2.2. radiative recombination rate

the sq limit assumes radiative recombination in thermal equilibrium as the sole loss mechanism present in the photovoltaic cell [1]. according to the detailed balance method used in the calculation of the sq limit, all absorbed energy must be emitted for the system to be in equilibrium. therefore the loss of energy due to thermal radiation is unavoidable. the derivation of rr can be found in the literature [1], and the recombination current density can be written as (14), with the term defined in (15):

[equations (14) and (15) are not recoverable from the source]

k – boltzmann’s constant
h – planck’s constant
v – voltage across device at open circuit condition
t – device temperature

note that rr in our case corresponds to an emission rate from the device surface (and not from the volume). consequently its unit is m⁻² s⁻¹ (instead of the more commonly used m⁻³ s⁻¹).

2.3. lpc model

performance of the lpc is expressed with the same parameters as used in the evaluation of solar cells.
efficiency η, fill factor ff, open-circuit voltage voc, short-circuit current density jsc and available electrical power density at the maximum power point pmax are derived from the current density – voltage dependency, j–v (16).

$$j(v) = j_{ph} - j_{rad}(v) \tag{16}$$

the maximum power density is calculated numerically as (17):

$$p_{max} = \max_{v}\left(j(v)\, v\right) \tag{17}$$

and vmpp and jmpp as (18):

$$v_{mpp} = \arg\max_{v}\left(j(v)\, v\right), \qquad j_{mpp} = j(v_{mpp}) \tag{18}$$

conversion efficiency is calculated as (19):

$$\eta = \frac{p_{max}}{p_{illum}} \tag{19}$$

and fill factor as (20):

$$ff = \frac{j_{mpp}\, v_{mpp}}{j_{sc}\, v_{oc}} \tag{20}$$

jsc is obtained as (21):

$$j_{sc} = j(v = 0) \tag{21}$$

and voc is calculated as (22):

$$j(v_{oc}) = 0 \tag{22}$$

3. simulation results

all simulations of the sq performance limit were done for the three different theoretical lpc structures discussed above, with lpc thickness l = 1 µm at lpc temperature t = 300 k. the source of monochromatic illumination was assumed to be homogeneous across the lpc front surface with illumination power density pillum = 100 mw/cm²; the spectral distribution around the central wavelength λ0 = 808 nm is gaussian with fwhm = 5 nm. simulation parameters different from those specified above are noted where necessary.

3.1. effect of lpc absorber thickness on efficiency

as seen in fig. 2, the absorber layer thickness plays a significant role in the sq efficiency limit for structures with thickness less than 3 µm. for thicker cells, there is less than 1 % difference between the best and worst performing structure, and efficiency saturates at η = 68.4 % for all structures for the given simulation parameters.

fig. 2 absorber thickness effect on efficiency of lpc

a similarly strong rise with increasing absorber thickness can be seen for jmpp, while the values of vmpp fall slightly (fig. 3). thickness is important to guarantee complete absorption of all photons, which results in increased jmpp. this is most notable in the structure with no reflection from the back surface, where only a single pass of light through the absorber occurs. the recombination rate of e–h pairs increases with increasing thickness, resulting in increased jrad and decreased vmpp. the product of jmpp and vmpp rises with the absorber thickness, resulting in increasing pmpp and efficiency, since the gain from increased absorption is much larger than the loss of voltage due to the increased recombination rate of e–h pairs.

fig. 3 absorber thickness effect on jmpp and vmpp of lpc

3.1. effect of central wavelength of monochromatic light on efficiency of lpc

the sq efficiency for the three different 1 µm thick theoretical gaas lpc structures as a function of the central wavelength λ0 of the monochromatic light is shown in fig. 4. the maximal efficiency of ηsq = 72.3 % is achieved for the randomly textured lpc with perfect back mirror at λ0 = 872 nm, which corresponds to eg = 1.42 ev of gaas. the lpc with planar front surface and perfect back mirror achieves ηsq = 65.0 % at λ0 = 808 nm, and the planar lpc with an absorbing mirror on the back has ηsq = 56.7 % at λ0 = 728 nm. it is clear that the lpc structure not only influences the absolute maximum of efficiency, but also shifts the peak of efficiency, marked with x in fig. 4. commercially available laser diodes with the best trade-off between price and output optical power that are suitable for illumination of gaas lpcs emit light with a spectral peak at approximately λ0 = 808 nm, marked with a vertical line in fig. 4.

fig. 4 effect of central wavelength λ0 of monochromatic source on sq efficiency limit

3.2. performance of lpc under high irradiance

lpcs are normally illuminated with high irradiance of monochromatic light, since efficiency increases with increasing illumination power density pillum. all three structures have a logarithmic dependence of efficiency on pillum, as shown in the semi-log plot in fig. 5. for comparison, efficiencies of high efficiency gaas lpcs are also plotted in fig. 5.
the highest lpc efficiency known to the authors was achieved by helmers et al. with η = 57.4 % at λ0 = 805 nm and pillum = 124 w/cm² [6]. a gaas lpc with similar efficiency, η = 56.0 % at λ0 = 820 nm and pillum = 56 w/cm², was reported by andreev et al. [5]. efficiency η = 52.8 % at λ0 = 810 nm and pillum = 14 w/cm² was reported by beaumont et al. [10]. peña et al. developed an lpc with efficiency η = 45.4 % at λ0 = 808 nm and pillum = 5 w/cm² [11]. for the same illumination parameters, shan et al. report efficiency η = 53.2 % [12]. the reported high efficiency lpcs are marked with circles in fig. 5.

fig. 5 influence of high irradiance on efficiency of lpcs. efficiencies of state-of-the-art gaas lpcs obtained from the literature are marked with circles.

3.3. single and double layer ar coating for reduced front surface reflection

so far in the paper, no reflection of incident light from the front surface was assumed in the calculations, so that all light reaches the absorption layer. in the real world, reflection from the interface between two media decreases the amount of light coupled into the photovoltaic structure. reflection of light perpendicular to the surface is determined by the refractive indices of the media on the interface (23). in our case the interface consists of air and gaas. since the refractive index of gaas, ngaas, depends on photon energy [13], the reflection r exhibits the same dependence. for photon energy eph = 1.6 ev, representing monochromatic light with λ0 = 808 nm, ngaas = 3.7 [13]. the refractive index of air is nair = 1.0 and is constant through a broad range of the light spectrum [14].

$$r = \left(\frac{n_{gaas} - n_{air}}{n_{gaas} + n_{air}}\right)^2 \tag{23}$$

the large difference of refractive indices between gaas and air leads to high reflection of light from the interface, and only 67.2 % of perpendicularly incident monochromatic light at λ0 = 808 nm is coupled into the absorption region of gaas. numerous schemes are deployed in order to reduce reflection, depending on the spectrum of incident light.
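the normal-incidence reflection described above can be checked with a few lines of python. this is a minimal sketch, assuming lossless media with real refractive indices and the standard fresnel expression for perpendicular incidence; the function name is an illustrative choice.

```python
def normal_incidence_reflectance(n_in, n_out):
    """Fresnel power reflectance at normal incidence between two media.

    Assumption: both refractive indices are real (absorption is ignored).
    """
    return ((n_in - n_out) / (n_in + n_out)) ** 2


if __name__ == "__main__":
    n_air, n_gaas = 1.0, 3.7  # values quoted in the text for 808 nm
    r = normal_incidence_reflectance(n_air, n_gaas)
    print(f"reflected fraction: {r:.3f}, coupled fraction: {1.0 - r:.3f}")
```

with the indices quoted in the text, about one third of the light is reflected and roughly two thirds is coupled into the gaas, in line with the value given above.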
for broadband white light, random texturing of the front surface reduces the reflectivity over a broad wavelength range to a few percent [15]. another approach, employed with monochromatic light, is to use a thin-film single-layer antireflection (ar) coating with refractive index nar (24) and a thickness dar equal to a quarter of the wavelength of the incident light in the coating (25).

$$n_{ar} = \sqrt{n_{air}\, n_{gaas}} \tag{24}$$

$$d_{ar} = \frac{\lambda_0}{4\, n_{ar}} \tag{25}$$

when using monochromatic light, a single-layer thin-film ar coating may completely eliminate reflection, as seen in fig. 6, while for a broad white-light spectrum a single-layer ar coating reduces reflectance to around 10 % [16]. reflectance r for perpendicularly incident light as a function of thickness dar and photon energy eph for a single-layer ar coating can be written as (27) [17]:

[equations (26) and (27) are not recoverable from the source]

λ0 – wavelength of monochromatic light in air

for monochromatic light with central wavelength λ0 = 808 nm, a 105.5 nm thick single-layer ar coating with refractive index nar = 1.92 reduces reflectance to zero, as seen in fig. 6, so that all incident light is coupled into the absorption layer. since a material with exactly this refractive index at the specified wavelength does not exist, it is informative to calculate the reflection from the front surface for ar coating materials already in use. fig. 6 shows reflection for three different ar coating materials deployed on gaas as a function of their thickness. the best material with regard to refractive index for ar coating on gaas is silicon nitride (si3n4), with refractive index 2.00 at 808 nm [18]. gaas with a 101 nm thick layer of si3n4 reflects around 0.1 % of incident light. another appropriate material for ar coating of gaas is al2o3, or alumina.
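the quarter-wave design rule used above can be sketched numerically. the python snippet below assumes lossless layers, normal incidence and the standard closed-form reflectance of a quarter-wave layer at its design wavelength; the function names are illustrative, and the index values are the ones quoted in the text for 808 nm.

```python
import math


def quarter_wave_thickness_nm(wavelength_nm, n_layer):
    """Physical thickness of a quarter-wave layer for a vacuum wavelength."""
    return wavelength_nm / (4.0 * n_layer)


def quarter_wave_reflectance(n_layer, n_substrate, n_ambient=1.0):
    """Reflectance of a quarter-wave single layer at its design wavelength.

    Assumptions: normal incidence and real refractive indices (no loss).
    """
    num = n_ambient * n_substrate - n_layer ** 2
    den = n_ambient * n_substrate + n_layer ** 2
    return (num / den) ** 2


if __name__ == "__main__":
    n_gaas = 3.7
    coatings = {"ideal": math.sqrt(n_gaas), "si3n4": 2.00}
    for name, n in coatings.items():
        d = quarter_wave_thickness_nm(808.0, n)
        r = quarter_wave_reflectance(n, n_gaas)
        print(f"{name}: n = {n:.3f}, d = {d:.1f} nm, r = {100 * r:.2f} %")
```

the ideal index gives vanishing reflectance, while a real material such as si3n4 leaves a small residual reflection of the order of 0.1 %; other coating materials can be compared by adding their index to the dictionary.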
the al2o3/gaas interface is widely studied [19], [20] since it has many uses in the semiconductor industry, such as the insulator layer in igfet transistors [21], diode laser coatings [22] and ar coatings for high efficiency solar cells [23]. a 114,8 nm thick layer of alumina on gaas, with refractive index $n_{Al_2O_3}$ = 1,76 [24] at 808 nm, results in front surface reflection under 1 %. another commonly used material for ar coatings on solar cells is sio2 or silica, with refractive index $n_{SiO_2}$ = 1,45 at 808 nm [25]. since the refractive index of silica is far from optimal for an ar coating on gaas, around 7,4 % of incident light is reflected in the best case scenario.

fig. 6 influence of single layer ar coating on reflection from interface gaas/air

a single layer ar coating provides sufficient reduction of reflection of monochromatic light from the gaas/air interface, but it puts strict requirements on the ar coating material, since an exactly specified refractive index is required to achieve good results. it also performs well only at the design wavelength, so the performance of a single layer ar coating decreases in a real world scenario where the wavelength of the diode laser varies due to manufacturing tolerances and operating temperature. to overcome these limits, a double layer ar coating can be deployed. for quarter wavelength thicknesses of both layers in the double layer ar stack and perpendicularly incident light, $R$ is defined as [17] (28):

$$R = \left( \frac{n_{AR1}^2\, n_{GaAs} - n_{AR2}^2\, n_{air}}{n_{AR1}^2\, n_{GaAs} + n_{AR2}^2\, n_{air}} \right)^2 \qquad (28)$$

$R$ is minimized when (29):

$$\frac{n_{AR2}}{n_{AR1}} = \sqrt{\frac{n_{GaAs}}{n_{air}}} \qquad (29)$$

where $n_{AR1}$ and $n_{AR2}$ are the refractive indices of thin layer one and two of the double layer ar coating.

to minimize reflection from the air/gaas interface when using monochromatic light with λ0 = 808 nm, a ratio of $n_{AR2}/n_{AR1}$ = 1,92 should be used. well suited materials for ar coatings that approach this ratio are mgf2 and tio2. the refractive indices of these two materials at 808 nm are $n_{TiO_2}$ = 2,52 [24] and $n_{MgF_2}$ = 1,37 [26], resulting in a ratio of $n_{AR2}/n_{AR1}$ = 1,84. fig.
7 shows the reflection of the double layer ar stack deployed on gaas as a function of the thicknesses of mgf2 and tio2. reflection is reduced to zero with thicknesses of $d_{MgF_2}$ = 72,2 nm and $d_{TiO_2}$ = 58,1 nm.

fig. 7 influence of double layer ar mgf2/tio2 coating on reflection from interface gaas/air

4. comparison of sq efficiency limit with measured lpc efficiency

following the theoretical calculations, measurements were done on the gaas lpc pictured in fig. 8. a single segment, single junction circular gaas lpc with radius 0,15 cm was fully illuminated with monochromatic light from a semiconductor laser with λ0 = 808 nm and total output power 1,06 w. light from the laser diode is coupled into a multimode (mm) 105/125 µm, na 0,22 fiber with its output positioned perpendicular to the surface of the lpc, so that the whole area is illuminated and spillage of light is minimized. the impinging profile of incident light is near gaussian, resulting in uniform irradiance of the front surface. the area of illumination was 0,074 cm², resulting in pillum = 14,3 w/cm². the lpc was mounted on a to-39 casing that was socketed and mounted on a heatsink for efficient heat dissipation. the i-v curve of the illuminated lpc was measured with a keithley 2602a. the scan through the whole i-v curve was done in under one second in order to minimize heating of the lpc. the measured temperature of the to-39 casing was 302 k. measurement results compared with theoretical sq limits for the same parameters can be seen in table 1.

fig. 8 picture of measured gaas lpc mounted on to-39 casing.
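the irradiance quoted for the measurement follows directly from the laser power and the illuminated area. the short sketch below reproduces that arithmetic and, as our own side check that is not part of the paper, compares the illuminated area with the geometric area of a circle of radius 0,15 cm. variable names are ours.

```python
import math

laser_power_w = 1.06            # total optical output power from the text
illuminated_area_cm2 = 0.074    # area of illumination from the text
cell_radius_cm = 0.15           # radius of the circular lpc from the text

# irradiance = optical power / illuminated area
irradiance_w_cm2 = laser_power_w / illuminated_area_cm2

# geometric area of the circular cell, for comparison only
cell_area_cm2 = math.pi * cell_radius_cm ** 2

print(f"irradiance: {irradiance_w_cm2:.1f} w/cm2")
print(f"cell area: {cell_area_cm2:.4f} cm2, illuminated: {illuminated_area_cm2:.4f} cm2")
```

the illuminated area comes out slightly larger than the geometric cell area, which is consistent with the statement that the whole cell is illuminated with only little spillage.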
table 1 measured and simulated results for lpc under monochromatic illumination for pillum = 14,3 w/cm²

parameter        gaas lpc measured   gaas lpc sq   ratio [%]
η [%]            54,4                76,6          71,0
ff [%]           82,3                90,3          91,1
voc [v]          1,16                1,30          89,2
jsc [a/cm²]      8,24                9,31          88,5
vmpp [v]         1,00                1,20          83,3
jmpp [a/cm²]     7,78                9,12          85,3
pmax [w/cm²]     7,78                10,96         71,0

the measured i-v curve, normalized to the calculated sq limit values jsc_sq and voc_sq [27] for the same parameters, can be seen in fig. 9. while jsc and voc of the fabricated lpc achieve around 90 % of the theoretical value, pmax at 71 % of the theoretical limit still needs to be optimized. the reason for the low measured pmax is the power lost on the series resistance rs, which is, besides grid shading, the dominant loss mechanism in manufactured single junction, single segment lpcs, as discussed in [28].

fig. 9 measured and simulated i-v curve of gaas lpc normalized to values of voc_sq and jsc_sq. measurements and calculations were done under monochromatic illumination at λ0 = 808 nm for pillum = 14,3 w/cm²

5. distribution of losses in lpc

following the sq limit we can divide the energy conversion from light to electrical energy in an lpc into groups. the loss analysis for a randomly textured, l = 1 µm thick lpc with a perfect mirror on the back, as the best case theoretical structure, at λ0 = 808 nm, pillum = 14,3 w/cm² and fwhm = 5 nm at t = 302 k is shown in the inner section of the pie chart in fig. 10. 76,6 % of the light energy is converted to useful electrical energy. 13,9 % of the light energy cannot be converted to electrical energy because the voltage at the maximal power point, vmpp, is lower than the bandgap voltage, vg. radiative recombination of e-h pairs accounts for 2,0 % of the energy, which is emitted from the lpc, and 7,5 % is transformed to heat through thermalization of carriers generated by photons with energy higher than the bandgap. thermal losses could be minimized if a monochromatic light source with central wavelength at the peak efficiency seen in fig. 4 were used. the outer section of the pie chart in fig.
10 shows the measured energy distribution in the lpc. rs contributes to a significant drop of vmpp, resulting in an increased loss of useful energy due to vmpp < vg. another 13,3 % of the energy is the sum of other electronic and optical losses.

fig. 10 distribution of energy conversion in lpc at pillum = 14,3 w/cm², λ0 = 808 nm and t = 302 k. the inner section of the pie chart presents energy conversion following the sq limit, while the outer section presents the measured lpc.

6. conclusion

calculation of sq limits for an lpc under monochromatic illumination is a method for evaluating the theoretically achievable limits of lpcs and comparing them to measured results of manufactured devices. we provided insights into how lpc design can be further optimized, together with an appropriate light source, in order to achieve high system efficiency. irradiance should be high, which leads to small lpc surfaces, and 80 % efficiency could theoretically be achieved for pillum = 100 w/cm². the comparison between calculated and measured values shows that 90 % of the theoretical values can already be achieved for jsc and voc, while the measured pmax achieves 71 % of the theoretical limit calculated with the sq method. further work should be done to include the effect of series resistance in the calculations, since it is a major loss mechanism in single junction, single segment lpcs.

acknowledgement: the authors acknowledge andreas w. bett and henning helmers from fraunhofer ise for valuable discussion and for providing us with samples of lpcs. the authors acknowledge the financial support from the slovenian research agency (program p2-0197). r. kimovec thanks the slovenian research agency for his phd funding.

references

[1] w. shockley and h. j. queisser, "detailed balance limit of efficiency of p‐n junction solar cells," j. appl. phys., vol. 32, no. 3, pp. 510–519, mar. 1961.
[2] m. jošt and m.
topič, "efficiency limits in photovoltaics: case of single junction solar cells," facta universitatis, series: electronics and energetics, vol. 27, no. 4, pp. 631–638, 2014.
[3] g. létay and a. w. bett, "etaopt – a program for calculating limiting efficiency and optimum bandgap structure for multi-bandgap solar cells and tpv cells," in proc. of the 17th european photovoltaic solar energy conference, munich, germany, 2001, pp. 178–181.
[4] a. w. bett, f. dimroth, r. lockenhoff, e. oliva, and j. schubert, "iii-v solar cells under monochromatic illumination," in proc. of the 33rd ieee photovoltaic specialists conference, 2008, pp. 362–366.
[5] v. andreev, v. khvostikov, v. kalinovsky, v. lantratov, v. grilikhes, v. rumyantsev, m. shvarts, v. fokanov, and a. pavlov, "high current density gaas and gasb photovoltaic cells for laser power beaming," in proceedings of the 3rd world conference on photovoltaic energy conversion, 2003, vol. 1, pp. 761–764.
[6] h. helmers, l. wagner, c. e. garza, et al., "photovoltaic cells with increased voltage output for optical power supply of sensor electronics," in proceedings of the ama conferences 2015, 2015, pp. 519–524.
[7] m. d. sturge, "optical absorption of gallium arsenide between 0.6 and 2.75 ev," phys. rev., vol. 127, no. 3, pp. 768–773, aug. 1962.
[8] f. urbach, "the long-wavelength edge of photographic sensitivity and of the electronic absorption of solids," phys. rev., vol. 92, no. 5, p. 1324, dec. 1953.
[9] o. d. miller, e. yablonovitch, and s. r. kurtz, "intense internal and external fluorescence as solar cells approach the shockley-queisser efficiency limit," arxiv preprint arxiv:1106.1603, 2011.
[10] b. beaumont, j. c. guillaume, m. f. vilela, a. saletes, and c. verie, "high efficiency conversion of laser energy and its application to optical power transmission," in proc. of the record of the twenty second ieee photovoltaic specialists conference, 1991, vol. 2, pp. 1503–1507.
[11] r. pena, c. algora, and i.
anton, "gaas multiple photovoltaic converters with an efficiency of 45% for monochromatic illumination," in proceedings of the 3rd world conference on photovoltaic energy conversion, 2003, vol. 1, pp. 228–231.
[12] t. shan and x. qi, "design and optimization of gaas photovoltaic converter for laser power beaming," infrared phys. technol., vol. 71, pp. 144–150, jul. 2015.
[13] d. e. aspnes, s. m. kelso, r. a. logan, and r. bhat, "optical properties of alxga1−x as," j. appl. phys., vol. 60, no. 2, pp. 754–767, jul. 1986.
[14] p. e. ciddor, "refractive index of air: new equations for the visible and near infrared," appl. opt., vol. 35, no. 9, pp. 1566–1573, mar. 1996.
[15] m.-j. huang, c.-r. yang, y.-c. chiou, and r.-t. lee, "fabrication of nanoporous antireflection surfaces on silicon," sol. energy mater. sol. cells, vol. 92, no. 11, pp. 1352–1357, nov. 2008.
[16] d. bouhafs, a. moussi, a. chikouche, and j. m. ruiz, "design and simulation of antireflection coating systems for optoelectronic devices: application to silicon solar cells," sol. energy mater. sol. cells, vol. 52, no. 1–2, pp. 79–93, mar. 1998.
[17] d. a. steck, classical and modern optics, 1.5.1 ed. 2013.
[18] h. r. philipp, "optical properties of silicon nitride," j. electrochem. soc., vol. 120, no. 2, pp. 295–300, feb. 1973.
[19] l. hong-liang, l. yan-bo, x. min, d. shi-jin, s. liang, z. wei, and w. li-kang, "characterization of al2o3 thin films on gaas substrate grown by atomic layer deposition," chin. phys. lett., vol. 23, no. 7, p. 1929, 2006.
[20] r. e. sah, c. tegenkamp, m. baeumler, f. bernhardt, r. driad, m. mikulla, and o. ambacher, "characterization of al2o3/gaas interfaces and thin films prepared by atomic layer deposition," j. vac. sci. technol. b, vol. 31, no. 4, p. 04d111, jul. 2013.
[21] w. s. lee and j. g. swanson, "switching behaviour of al2o3-n gaas misfets," electron. lett., vol. 18, no. 24, pp. 1049–1051, nov. 1982.
[22] p. v. bhore, a. p. shah, m. r.
gokhale, s. ghosh, a. bhattacharya, and b. m. arora, "effect of facet coatings on laser diode characteristics," indian j. eng. mater. sci., vol. 11, pp. 438–440, 2004.
[23] s. abdul hadi, t. milakovich, m. t. bulsara, s. saylan, m. s. dahlem, e. a. fitzgerald, and a. nayfeh, "design optimization of single-layer antireflective coating for gaas1−xpx/si tandem cells with x = 0, 0.17, 0.29, and 0.37," ieee j. photovolt., vol. 5, no. 1, pp. 425–431, jan. 2015.
[24] j. r. devore, "refractive indices of rutile and sphalerite," j. opt. soc. am., vol. 41, no. 6, pp. 416–417, jun. 1951.
[25] i. h. malitson, "interspecimen comparison of the refractive index of fused silica," j. opt. soc. am., vol. 55, no. 10, pp. 1205–1208, oct. 1965.
[26] h. h. li, "refractive index of alkaline earth halides and its wavelength and temperature derivatives," j. phys. chem. ref. data, vol. 9, no. 1, pp. 161–290, jan. 1980.
[27] r. m. geisthardt, m. topic, and j. r. sites, "status and potential of cdte solar-cell efficiency," ieee j. photovolt., vol. 5, no. 4, pp. 1217–1221, jul. 2015.
[28] e. oliva, f. dimroth, and a. w. bett, "gaas converters for high power densities of laser illumination," prog. photovolt. res. appl., vol. 16, no. 4, pp. 289–295, jun. 2008.

facta universitatis
series: electronics and energetics vol. 28, no 4, december 2015, pp. 557-570
doi: 10.2298/fuee1504557p

conversion model of the radiation-induced interface-trap buildup and its hardness assurance application

vyacheslav sergeevich pershenkov
national research nuclear university mephi (moscow engineering physics institute), russia

abstract. a model is presented which confirms that the interaction of trapped positive charges (hydrogenous species) in the oxide with electrons from the substrate is an important component of radiation-induced interface-trap buildup.
the “one-to-koi” relationship between the number of trapped holes annealed and the number of interface traps generated is used for the prediction of mos device response in a space environment. a model of the enhanced low dose rate sensitivity (eldrs) is proposed. the eldrs conversion model is based on the assumption that there are two types of traps: shallow and deep. the time constants of these traps are different and correspond to interface-trap buildup at high dose rates for shallow traps and at low dose rates for deep traps. a possible physical mechanism of eldrs elimination in silicon-germanium (sige) bipolar transistors is described. an original mechanism of interface-trap buildup saturation based on the radiation-induced charge neutralization (ricn) effect is presented.

key words: mos device, bipolar device, interface trap, conversion model, eldrs, hardness assurance

received april 30, 2015
corresponding author: vyacheslav sergeevich pershenkov, national research nuclear university mephi (moscow engineering physics institute), russia (e-mail: vspershenkov@mephi.ru)

1. introduction

total ionizing dose effects in mos and bipolar devices for space electronics are connected with the buildup of radiation-induced positive oxide trapped charge qot and interface traps nit. electron-hole generation, initial hole yield, continuous-time random walk, deep hole trapping and annealing are described in detail in [1]. the physical model of [1] is commonly used. the most developed model of radiation-induced interface-trap buildup is the two-stage “hydrogen” model [2-3]. the other model (the so-called “conversion” model [4,5]) is based on the assumption that the generation of interface traps is connected with the neutralization of positive charge by substrate or radiation-induced electrons.

in this work the conversion model of interface-trap buildup is used to estimate the long-term operation of mos and bipolar devices in a space environment. introducing a quantitative relationship between the two physical processes makes it possible to develop numerical prediction methods for estimating the long-term operation of mos and bipolar devices in space missions. the use of the conversion model for the description of the low dose rate effect in sige transistors and of interface-trap buildup saturation is also described.

2. conversion model of interface-trap buildup

radiation-induced buildup of interface traps nit is a problem that has been known for the last 35 years [2,3]. in addition to the work [4], where interface trap generation is connected with electron capture by trapped holes, there are the not widely known experimental results described in [5]. the experimental dependencies of the threshold voltage shift δvit (caused by the interface-trap buildup) on the annealing time for four different tests are presented in fig. 1. a maximum change of δvit is observed in test 1, when both electrons and hydrogenous species are present near the surface. in the other cases, when there are no electrons (test 2), no hydrogen species (test 3), or both are absent near the interface (test 4), the shift δvit is substantially reduced. these experimental data confirm the hypothesis that the presence of hydrogen alone is not enough for effective interface trap buildup. the interaction between hydrogen complexes and electrons from the substrate is an important component of this process.

fig. 1 interface-trap component of the threshold voltage shift δvit versus the annealing time in the hydrogen atmosphere (after ref. [5])

3.
prediction of mos devices response in space environment

the total radiation-induced threshold voltage shift $\Delta V_{th}$ is usually separated into components due to oxide trapped charge ($\Delta V_{ot}$) and interface trap charge ($\Delta V_{it}$) buildup:

$$\Delta V_{th} = \Delta V_{ot} + \Delta V_{it} \qquad (1)$$

to separate the accumulation and annealing effects, which occur simultaneously during irradiation, the technique of linear response theory can be used. at time $t$ the $\Delta V_{ot}$ response to an arbitrary irradiation starting at $t = 0$ and described by the dose rate function $\gamma(t)$ can be obtained through the convolution integral [6]

$$\Delta V_{ot}(t) = \int_0^t \gamma(t')\, \Delta V_r(t - t')\, dt' \qquad (2)$$

where $\Delta V_r(t - t')$ is the impulse response function. to describe the annealing process we use the equation for $\Delta V_r$ introduced in [6]. if after the end of irradiation, at $t \to \infty$, all trapped holes are completely annealed, the impulse response function $\Delta V_r$ is given by [6]

$$\Delta V_r(t) = \frac{\Delta V_0}{(1 + t/t_0)^{\nu}} \qquad (3)$$

where $\Delta V_0$, $t_0$ and $\nu$ are fitting constants. for irradiation time $t_{ir}$, using this impulse response function with $\gamma(t) = \gamma_0$ for $t < t_{ir}$ and $\gamma(t) = 0$ for $t > t_{ir}$, we have

$$\Delta V_{ot}(t) = c \left[ (1 + t/t_0)^{1-\nu} - 1 \right], \quad t < t_{ir} \qquad (4a)$$

$$\Delta V_{ot}(t) = c \left[ (1 + t/t_0)^{1-\nu} - (1 + (t - t_{ir})/t_0)^{1-\nu} \right], \quad t > t_{ir} \qquad (4b)$$

where $c = \gamma_0\, t_0\, \Delta V_0 / (1 - \nu)$. similar equations were derived in [6]. if no annealing occurs ($\nu = 0$), the threshold voltage shift reaches its maximum value

$$\Delta V_{ot\_max}(t) = \Delta V_0\, D \qquad (5)$$

where $D$ is the total absorbed dose.

the threshold voltage shift $\Delta V_{it}$ includes fast and slow components. we suppose that for times greater than about 10⁻³ s the fast component is proportional to the dose:

$$\Delta V_{it\_fast}(t) = \Delta V_i\, D \qquad (6)$$

where $\Delta V_i$ is a fitting constant. according to the conversion model of interface-trap buildup, the interface state density is proportional to the decrease of positive charge, i.e. there is a conversion coefficient $k_{oi}$ which reflects the strong correlation between the accumulation of slow interface states and trapped hole annealing.
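the closed forms (4a) and (4b) can be checked against a direct numerical evaluation of the convolution integral (2). the sketch below is a minimal illustration under our own assumptions: the function names are ours, the integral is evaluated with a simple midpoint rule, and the parameter values in the usage lines are illustrative only.

```python
def delta_v_ot(t, t_ir, gamma0, dv0, t0, nu):
    """oxide-trapped-charge voltage shift, eqs. (4a)/(4b), constant dose rate gamma0."""
    c = gamma0 * t0 * dv0 / (1.0 - nu)
    growth = (1.0 + t / t0) ** (1.0 - nu)
    if t <= t_ir:
        return c * (growth - 1.0)                                   # eq. (4a)
    return c * (growth - (1.0 + (t - t_ir) / t0) ** (1.0 - nu))     # eq. (4b)


def delta_v_ot_numeric(t, t_ir, gamma0, dv0, t0, nu, steps=20_000):
    """midpoint-rule evaluation of eq. (2) with the impulse response of eq. (3)."""
    upper = min(t, t_ir)          # the dose rate is zero after t_ir
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t_prime = (i + 0.5) * h
        total += gamma0 * dv0 / (1.0 + (t - t_prime) / t0) ** nu
    return total * h


# illustrative (hypothetical) parameters: rad/s, v/rad, s, dimensionless
params = dict(gamma0=100.0, dv0=14e-6, t0=1.5, nu=0.081)
t_ir = 1000.0  # irradiation time in s

for t in (500.0, 1000.0, 5000.0):
    closed = delta_v_ot(t, t_ir, **params)
    numeric = delta_v_ot_numeric(t, t_ir, **params)
    print(f"t = {t:6.0f} s: closed form {closed:.4f} v, numeric {numeric:.4f} v")
```

for ν = 0 the closed form reduces to the no-annealing limit (5), since the accumulated dose is $D = \gamma_0 t$ during irradiation.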
following this approach we can write for the slow interface density component $\Delta N_{it\_slow}$:

$$\Delta N_{it\_slow} = k_{oi} \left( \Delta N_{ot\_max} - \Delta N_{ot} \right) \qquad (7)$$

where $\Delta N_{ot\_max}$ corresponds to $\Delta V_{ot\_max}$. in this case for the slow component we have:

$$\Delta V_{it\_slow} = k_{oi} \left( \Delta V_{ot\_max} - \Delta V_{ot} \right) \qquad (8)$$

note that the process of interface trap annealing is ignored, because at room temperature interface traps decay with a time constant of several years. finally, we have the analytical equation for the interface voltage shift:

$$\Delta V_{it} = (k_{oi}\, \Delta V_0 + \Delta V_i)\, D - k_{oi}\, \Delta V_{ot} \qquad (9)$$

the practical formula for the hardness assurance application of the mosfet voltage shift response can be derived from equation (1):

$$\Delta V_{th} = (k_{oi}\, \Delta V_0 + \Delta V_i)\, D + (1 - k_{oi})\, \Delta V_{ot} \qquad (10)$$

where $\Delta V_{ot}$ is calculated using (4a). equation (10) has five fitting parameters: $k_{oi}$, $\Delta V_0$, $\Delta V_i$, $t_0$ and $\nu$, which can be found numerically using experimental data obtained in laboratory tests with high dose rate irradiation. there are several approaches to the fitting procedure: solving a nonlinear least squares problem for the five unknown parameters, applying separation techniques, and so on. a more convenient approach is to find the three constants $\Delta V_0$, $t_0$ and $\nu$ using the experimental data on $\Delta V_{ot}$, and the two constants $k_{oi}$ and $\Delta V_i$ from the analysis of $\Delta V_{it}(t)$. the constants can be extracted from at least three experimental points of $\Delta V_{ot}$ and $\Delta V_{it}$ versus $t$. a reasonable time for the first measurement is 1 s after the end of irradiation. monte-carlo simulation shows that the second point can correspond to the interval 2 $t_{ir}$ and the third measurement can be done at 100 $t_{ir}$ [7]. the results of parameter extraction for our experimental data as well as for data taken from [8-11] are listed in table 1.

table 1 parameters extracted from experimental data (after ref. [7]).
data                           vg (v)   ∆v0 (10⁻⁶ v/rad)   t0 (s)    ν       koi    ∆vi (10⁻⁷ v/rad)
[8], fig 2                     5.0      0.35               26        0.082   0.0    0.6
[9], fig 1                     6.0      14                 1.5       0.081   0.73   2.5
[9], fig 4                     6.0      3.6                0.018     0.078   0.44   4.5
[10], fig 5                    5.0      –                  8900      0.405   0.25   –
[11], fig 13                   2.5      21                 110       0.1     0.41   12
experiment: n-channel, 30nm    0        1.1                0.0004    0.074   0.0    1.5
                               2.5      0.83               0.0016    0.083   0.12   0.14
                               5.0      0.6                0.019     0.092   0.12   0.0028
experiment: n-channel, 100nm   0        20                 15        0.026   1.0    56
                               2.5      23                 16        0.035   1.0    59
                               5.0      22                 48        0.078   1.0    98

4. low dose rate effect in bipolar devices

the low dose rate effect in bipolar transistors, or enhanced low-dose-rate sensitivity (eldrs), consists in a more severe degradation of the current gain of a bipolar structure for a given total dose when it is delivered at a low dose rate [12]. the eldrs model in the present work is based on the hydrogen-electron (h-e) conversion model. the motivation for this development is the creation of a model that allows a quantitative numerical estimation of the radiation degradation of bipolar transistor current gain for an arbitrary dose rate and temperature. because the h-e model is based on the conversion of radiation-induced positive trapped charge to interface traps, the model described below is called the eldrs conversion model.

to explain the classical annealing of radiation-induced positive charge [13] and the reversibility of the annealing effect [14], it is necessary to consider two positions of positive centers in the oxide forbidden gap: the non-rechargeable centers located about 1 ev above the sio2 valence band [12], and the rechargeable part of the oxide trapped charge located opposite the silicon forbidden gap [13]. direct tunneling of substrate electrons to positive centers located opposite the silicon forbidden gap is impossible, because the energy of the tunneling electron must be constant (basic principles of quantum mechanics).
but tunneling to thermally activated positive centers is still possible. the energy level of a positive center can reach the silicon conduction band due to a thermally excited vibration of the lattice (fig. 2,a). the positive charge can also be neutralized by hole emission to the silicon valence band (fig. 2,b). below, the case of an interaction of positive charge and electron (fig. 2,a) is considered.

fig. 2 conversion of oxide charge (qot)rech to interface trap nit: capture of an electron e (a), emission of a hole h (b). ec and ev are the energy levels of the si conduction and valence bands

an interaction of thermally excited rechargeable positive charges and tunneling substrate electrons leads, according to the conversion model, to interface-trap buildup. the physical nature of the conversion process can be connected with a change of the distance between the positive si⁺ and neutral si⁰ atoms (e′γ center, hole trap) after electron capture by the e′γ center [15]. the probability of excitation of the oxide positive center up to the conduction band depends on its energy depth in the oxide relative to the si forbidden gap. the shallow oxide traps (near the conduction band) are converted in a short time, while the deep traps (opposite the middle of the si forbidden gap) need much more time for conversion.

for simplicity, it is supposed that there are two kinds of oxide traps: shallow traps with a short conversion time, responsible for the degradation at high dose rates, and deep traps, which determine the increase of the excess base current at long irradiation times, i.e. at low dose rates (fig. 3). the shallow traps are converted with time constant τs; the conversion time of the deep traps is τd. essentially, the conversion time of the deep traps, the constant τd, is responsible for eldrs.

fig.
3 the shallow (qot)s and deep (qot)d oxide trapped charges with conversion time τs and τd

as shown in [16], the degradation of the base current as a function of dose rate (for irradiation times much longer than 1 s) can be written as:

$$\Delta I_b = (K_d + K_s)\, D - K_d\, \gamma\, \tau_d \left[ 1 - \exp\left( -\frac{D}{\gamma\, \tau_d} \right) \right] \qquad (11)$$

where $K_s$ is the excess base current per unit dose at high dose rate; $K_d$ is the excess base current per unit dose at low dose rate; $\gamma$ is the dose rate; $D$ is the total dose.

the conversion of oxide charge to interface traps is a thermally stimulated process. to account for the effect of temperature on base current degradation, a temperature dependence of the deep trap conversion time is introduced. the temperature dependence of the time constant $\tau_d$ can be described by the arrhenius equation:

$$\tau_d = \tau_{d0} \exp\left( \frac{E_a}{kT} \right) \qquad (12)$$

where $\tau_d$ is the conversion time of deep traps; $T$ is the temperature; $E_a$ is the activation energy of the thermal excitation of the oxide trap; $k$ is the boltzmann constant; $\tau_{d0}$ is the pre-exponential coefficient.

thus the eldrs conversion model has 4 fitting parameters: $K_s$, $K_d$, $E_a$ and $\tau_{d0}$. they are extracted by the following steps, presented in [17]:

1. the constant $K_s$, which determines the contribution of shallow trapped charge conversion to base current degradation, is estimated as the ratio of base current degradation to the specified total dose at 10 rad(sio2)/s irradiation.

2. the deep trap conversion time, the constant $\tau_d$, is estimated from data of post-irradiation anneal following high dose rate irradiation to the specified total dose. the pre-exponential constant $\tau_{d0}$ and the activation energy $E_a$ in (12) are derived from the data for two different temperatures of elevated temperature post-irradiation anneal.

3. the constant $K_d$, which determines the contribution of deep trapped charge conversion to base current degradation at low dose rate, is estimated from elevated temperature irradiation data.
the constant kd is derived from (11), where the constant τd for the elevated temperature used is calculated from (12) (the values of τd0 and the activation energy ea are determined in step 2).

the eldrs conversion model was validated by comparison with previously reported experimental data. two examples are shown below. fig. 4 shows the results calculated from relationship (11) together with the experimental results of [18]. relationship (11) describes the experimental data [18] well for the following values of the fitting constants: ks = 1.35·10⁻³ na/rad(sio2), kd = 8.65·10⁻³ na/rad(sio2), τd = 2.2·10⁵ s (for the lateral pnp) and ks = 0.16·10⁻³ na/rad(sio2), kd = 1.49·10⁻³ na/rad(sio2), τd = 5.0·10⁵ s (for the substrate pnp). the corresponding results for [19] are shown in fig. 5. the fitting constants for that case are: ks = 0.33·10⁻³ na/rad(sio2), kd = 6.33·10⁻³ na/rad(sio2), τd = 3.0·10⁵ s.

fig. 4 excess base current versus dose rate. experimental [18] and calculated data from relationship (11).

fig. 5 excess input base current of lm158 versus dose rate. experimental [19] (dots) and calculated data from conversion model (11).

the proposed conversion model also explains why the base current starts growing 10⁵ s after the cessation of short-term, high dose rate irradiation [19]. the reason is that the charge at the deep oxide traps has no time to be converted into interface traps during the short-term, high dose rate irradiation. it is not accidental that the measured value τd = 3.0·10⁵ s is of the same order of magnitude as the start delay in [19].

5. eldrs in sige transistors

the activation energy of a deep positive oxide center with energy eot in the oxide (fig.
6) can be presented as the sum of the energy of thermal excitation $\Delta E_d$ from $E_{ot}$ to the electron energy at the conduction band edge $E_c$ and the energy of elastic coupling of the positive center with lattice atoms:

$$E_a = \Delta E_d + E_{latt} \qquad (13)$$

where $E_a$ is the activation energy of the positive oxide trap; $\Delta E_d = E_c - E_{ot}$; $E_c$ is the electron energy at the conduction band edge; $E_{ot}$ is the energy level of the positive trap in the oxide; $E_{latt}$ is the energy of elastic coupling of the positive center with lattice atoms.

in sige hbts, bandgap narrowing takes place in the base region due to the ge content. the bandgap narrowing $\Delta E_g$ reduces the energy interval $(\Delta E_d)_{SiGe}$ which is needed for an interaction of the thermally excited deep oxide traps and tunneling substrate electrons. this reduces the deep trap conversion time, so that during irradiation at any dose rate all oxide trapped charges have time to be converted into interface traps. as a result, deep traps can act as shallow traps, and eldrs is eliminated. the reduction of the excitation energy necessary for the conversion of deep traps in sige transistors depends on the bandgap narrowing $\Delta E_g$ of the base region under the base spacer interface:

$$(\Delta E_d)_{SiGe} = (\Delta E_d)_{Si} - \Delta E_g \qquad (14)$$

where $(\Delta E_d)_{SiGe}$ is the thermal excitation energy for conversion of the oxide deep traps in a sige transistor; $(\Delta E_d)_{Si}$ is the thermal excitation energy for conversion of the oxide deep traps in a conventional si transistor; $\Delta E_g$ is the bandgap narrowing of the base region under the base spacer interface of the sige hbt.

fig. 6 the energy of thermal excitation $\Delta E_d$ from the level of the positive trap in the oxide $E_{ot}$ to the conduction band edge $E_c$. $E_g$ is the bandgap of the semiconductor.

it can be shown using the results of [16] that for conventional bipolar devices the deep trap location is about 0.21 ev – 0.29 ev below the edge of the conduction band.
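the interplay of (11)–(14) can be illustrated numerically. the sketch below implements the dose rate dependence of the excess base current as we read (11) and estimates how much a bandgap narrowing shortens the deep trap conversion time, under the assumption (ours, not stated explicitly in the paper) that $\tau_{d0}$ and $E_{latt}$ are the same in si and sige devices, so that the ratio of conversion times is $\exp(-\Delta E_g / kT)$. function names, the total dose of 20 krad and the temperature of 300 k are hypothetical choices; the fitting constants are those quoted above for the data of [19].

```python
import math

K_B_EV = 8.617333262e-5  # boltzmann constant in ev/k


def excess_base_current(dose, dose_rate, k_s, k_d, tau_d):
    """excess base current as read from eq. (11); units follow k_s and k_d."""
    x = dose / (dose_rate * tau_d)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x
    return (k_s + k_d) * dose - k_d * dose_rate * tau_d * (-math.expm1(-x))


def tau_d_ratio_sige(delta_eg_ev, temperature_k=300.0):
    """tau_d(sige) / tau_d(si) from eqs. (12)-(14), assuming equal tau_d0 and e_latt."""
    return math.exp(-delta_eg_ev / (K_B_EV * temperature_k))


# fitting constants quoted in the text for the data of [19]
k_s, k_d, tau_d = 0.33e-3, 6.33e-3, 3.0e5   # na/rad, na/rad, s
dose = 20e3                                  # rad(sio2), hypothetical total dose

for dose_rate in (100.0, 1.0, 0.01, 1e-4):   # rad(sio2)/s
    di_b = excess_base_current(dose, dose_rate, k_s, k_d, tau_d)
    print(f"dose rate {dose_rate:8.4f} rad/s: excess base current {di_b:7.2f} na")

ratio = tau_d_ratio_sige(0.1)  # 0.1 ev bandgap narrowing, illustrative
print(f"conversion time shortened by a factor of about {1.0 / ratio:.0f}")
```

at high dose rates the result approaches $K_s D$, and at low dose rates it approaches $(K_s + K_d) D$, which is the dose rate enhancement that the model attributes to the deep traps.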
fig. 7 presents the effect of bandgap narrowing on the excitation energy ∆ed which is sufficient for the conversion of deep traps into interface traps. line 1 in fig. 7 corresponds to the initial value ∆ed = 0.29 ev, line 2 corresponds to the initial value ∆ed = 0.21 ev. the dotted line shows the boundary between the eldrs region and the region where eldrs is absent (eldrs-free).

fig. 7 the effect of bandgap narrowing ∆eg on the excitation energy ∆ed. the dotted line presents the boundary between the eldrs region and the region where eldrs is absent (eldrs-free).

we consider that the eldrs boundary (existence or absence of eldrs) corresponds to ∆ed = 0.12 ev. this is connected with the following physical reasoning. the spreading of the energy location of the positive oxide traps by thermal excitation can be estimated as ±(2-3) kt. this means that shallow and deep energy levels can be distinguished as different traps if the energy gap between their locations is more than approximately 5 kt, or 0.125 ev. for ∆ed more than 0.12 ev the shallow and deep oxide traps act as different traps and eldrs can be observed (above the dotted line in fig. 7). for ∆ed less than 0.12 ev the shallow and deep oxide traps are equivalent to one trap and eldrs cannot be observed (under the dotted line in fig. 7). in sige hbts the bandgap narrowing is of the order of 0.1 ev – 0.2 ev. fig. 8 shows the valence band offset as a function of ge content [20, fig. 9].

fig. 8 valence band offset as a function of ge content (after ref. [20]).

therefore, for sige devices eldrs will not be observed (no-eldrs region in fig. 7) if the bandgap narrowing is more than 0.1 ev or 0.18 ev, for the initial values ∆ed = 0.21 ev and ∆ed = 0.29 ev respectively. it is very probable that the parameters of modern sige hbts lie within the “no eldrs” region. this conclusion agrees with the experimental data of [20], where it is stated: “and to first order, enhanced low dose rate sensitivity (eldrs) is not observed in sige hbts, which is clearly good news since it is a traditional concern in most si bjt technologies” [20, page 2001].
The ELDRS conversion model can thus give a physical explanation of the absence of ELDRS in SiGe HBTs.

6. Saturation of the radiation-induced interface-trap buildup

The analysis of this section is based on the assumption that the positive charge of holes trapped in the oxide is transformed, through electron capture, into a new defect (the AD center) with two energy states in the forbidden gap of Si [21]. This is a point defect for which the higher energy level is acceptor-like and the lower energy level is donor-like. The following process of AD center generation and annihilation is proposed. The strained Si-Si bond (oxygen vacancy) serves as the precursor of this radiation-induced defect. This precursor can be treated as a non-activated donor center D. Radiation-induced holes are captured by deep D traps, creating a positively charged D⁺ center: D + h = D⁺. Capture of a free electron by the D⁺ center transforms it into the two-level AD center: D⁺ + e = A⁰D⁰.

The AD defect can be found in four different states: A⁰D⁰, A⁻D⁰, A⁰D⁺, A⁻D⁺. The superscripts after A and D designate the charge state of the acceptor and donor levels respectively: A⁰D⁰, acceptor level empty, donor level occupied; A⁻D⁰, acceptor and donor levels occupied; A⁰D⁺, acceptor and donor levels empty; A⁻D⁺, acceptor level occupied, donor level empty. Charge exchange of A⁰D⁰ with radiation-induced or substrate electrons leads to A⁻D⁰ and A⁰D⁺. The charge state A⁻D⁺ cannot be stable and is assumed to relax immediately back to the D precursor, owing to the energy released during the electron transition from the higher (A) to the lower (D) level. Therefore, the appearance of the A⁻D⁺ state leads to the annihilation of the AD center. The saturation can be explained by two competing processes: accumulation and annihilation (annealing).
In mathematical form it can be written as

$$\frac{dN_{it}}{dt} = G - \frac{N_{it}}{(\tau_{ann})_{it}}, \qquad (15)$$

where G is the accumulation rate of interface traps; N_it is the density of interface traps; (τ_ann)_it is the time constant of interface state annihilation. In saturation, dN_it/dt = 0 and N_it reaches the saturated value

$$(N_{it})_{sat} = G\,(\tau_{ann})_{it}. \qquad (16)$$

The accumulation rate of N_it buildup is proportional to the dose rate:

$$G = (K_{acc})_{it}\,\gamma, \qquad (17)$$

where (K_acc)_it is a coefficient characterizing interface trap accumulation and γ is the dose rate. Therefore

$$(N_{it})_{sat} = (K_{acc})_{it}\,\gamma\,(\tau_{ann})_{it}. \qquad (18)$$

The value of (N_it)_sat is proportional to the dose rate γ if (K_acc)_it and (τ_ann)_it are constants. However, as follows from experimental data, the interface trap concentration in saturation (N_it)_sat is a very weak function of the dose rate. A change in dose rate of more than four orders of magnitude, from 300 krad(Si)/min to 13 rad(Si)/min, leads to a very small variation of (N_it)_sat [22]. The same result is obtained in [23, 24], where saturation of N_it was observed when the dose rate changed from 333 rad(SiO2) to 5.25 rad(SiO2). The coefficient (K_acc)_it is a very weak function of the dose rate. This follows from the linear dependence of N_it buildup at small total doses, which agrees with numerous experimental data reported in [22, 24, 25].

The value (N_it)_sat does not depend on the dose rate γ if (τ_ann)_it is inversely proportional to γ, that is, if the annihilation (annealing) of interface traps depends on the dose rate. It is therefore necessary to consider the radiation-induced charge neutralization (RICN) effect. Usually the RICN effect refers to the annealing of oxide trapped charge. In the present work we propose the RICN effect as the basic mechanism of interface-trap annealing. Consider the case when annihilation takes place from the A⁰D⁺ configuration after capture of a radiation-induced electron. The A⁰D⁺ state transforms to the A⁻D⁺ state, which is not stable and is assumed to relax immediately back to the D precursor.
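The balance in Eqs. (15)-(18) can be illustrated with a short numerical sketch. It uses the closed-form solution of Eq. (15) for an initially trap-free interface and compares a fixed annihilation time constant with one that is inversely proportional to the dose rate. All numerical values, and the names `K_ACC`, `TAU_FIXED` and `TAU_SCALE`, are hypothetical and chosen only for illustration.

```python
import math


def n_it(t, g, tau_ann):
    """Solution of Eq. (15) for N_it(0) = 0: G * tau * (1 - exp(-t / tau))."""
    return g * tau_ann * (1.0 - math.exp(-t / tau_ann))


def n_it_sat(k_acc, dose_rate, tau_ann):
    """Eq. (18): saturated interface-trap density."""
    return k_acc * dose_rate * tau_ann


K_ACC = 1.0e5      # hypothetical accumulation coefficient
TAU_FIXED = 1.0e4  # hypothetical dose-rate-independent time constant
TAU_SCALE = 1.0e7  # hypothetical constant in tau_ann = TAU_SCALE / dose_rate

for dose_rate in (1.0, 10.0, 100.0):
    fixed = n_it_sat(K_ACC, dose_rate, TAU_FIXED)
    scaled = n_it_sat(K_ACC, dose_rate, TAU_SCALE / dose_rate)
    print(f"dose rate {dose_rate:6.1f}: constant tau -> {fixed:.3e}, "
          f"tau inversely proportional to dose rate -> {scaled:.3e}")
```

With a constant time constant the saturated density scales with the dose rate, whereas with a time constant inversely proportional to the dose rate it stays the same, which is the behaviour the experimental data call for.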
The N_it annihilation process can be described by the relationship from the Shockley-Read-Hall recombination theory [26]:

$$\left(\frac{dN_{it}}{dt}\right)_{ann} = v_{th}\,\sigma_t\,n\,N_{it}, \qquad (19)$$

where v_th is the thermal velocity; σ_t is the capture cross-section of the AD center; n is the concentration of radiation-induced electrons. The concentration of radiation-induced electrons equals

$$n = K_p K_y \gamma, \qquad (20)$$

where K_p is the generation rate per unit dose rate; K_y is the electron yield; γ is the dose rate. Substituting (20) into equation (19) gives

$$\left(\frac{dN_{it}}{dt}\right)_{ann} = v_{th}\,\sigma_t\,K_p K_y\,\gamma\,N_{it} = \frac{N_{it}}{(\tau_{ann})_{it}}, \qquad (21)$$

where

$$(\tau_{ann})_{it} = \frac{K_{AD}}{\gamma}, \qquad (22)$$

$$K_{AD} = \frac{1}{v_{th}\,\sigma_t\,K_p K_y}. \qquad (23)$$

It then follows from (18) that

$$(N_{it})_{sat} = (K_{acc})_{it}\,K_{AD}. \qquad (24)$$

The density of interface traps in saturation, as follows from (24), depends on the product of the interface trap accumulation coefficient (K_acc)_it and the constant K_AD, which is a function of the thermal velocity, the capture cross-section of the AD center, the generation rate and the electron yield of radiation-induced electrons.

Consider some results of work [25] in the light of relationship (24). N-channel metal-oxide-semiconductor field effect transistors (MOSFETs) from two vendors (vendor "A" and vendor "B") were irradiated with X-rays. The devices from the two vendors had different initial values of interface trap density and were irradiated at different dose rates, which are presented in Table 2 together with the estimated values of (K_acc)_it and K_AD.

Table 2 Experimental conditions and estimation results for the transistor vendors from [25].

Vendor | Dose rate, rad(SiO2)/s | Initial N_it, cm^-2 | (K_acc)_it, rad(SiO2)^-1 cm^-2 | K_AD, rad(SiO2) | (N_it)_sat, cm^-2
"A" | 170 | 2×10^10 | 6.4×10^4 | 1.6×10^7 | 1×10^12
"B" | 1700 | 2×10^11 | 1.15×10^6 | 1.7×10^7 | 2×10^13

The values of K_AD for the two vendors are practically the same despite the different initial N_it values and irradiation dose rates. This indicates that the model presented in this work describes the physical mechanism of interface-trap buildup saturation correctly.
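As a quick consistency check of relationship (24) against Table 2, the product of the two estimated coefficients can be compared with the tabulated saturation density for each vendor. This is a sketch using only the numbers printed in Table 2; the dictionary layout and the function name are illustrative.

```python
# Consistency check of Eq. (24) using the values listed in Table 2.
# The data structure and names are illustrative; the numbers are those in the table.

TABLE_2 = {
    "A": {"k_acc": 6.4e4, "k_ad": 1.6e7, "n_it_sat": 1e12},
    "B": {"k_acc": 1.15e6, "k_ad": 1.7e7, "n_it_sat": 2e13},
}


def n_it_sat_model(k_acc, k_ad):
    """Eq. (24): saturated interface-trap density as the product K_acc * K_AD."""
    return k_acc * k_ad


for vendor, row in TABLE_2.items():
    predicted = n_it_sat_model(row["k_acc"], row["k_ad"])
    deviation = (predicted - row["n_it_sat"]) / row["n_it_sat"]
    print(f"vendor {vendor}: K_acc * K_AD = {predicted:.3e} cm^-2, "
          f"tabulated {row['n_it_sat']:.1e} cm^-2, deviation {deviation:+.1%}")
```

For both vendors the product agrees with the tabulated (N_it)_sat to within a few percent, which is within the rounding of the table entries.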
The value of (K_acc)_it is determined by the initial N_it buildup rate and depends on the parameters of the manufacturing process and on the irradiation dose rate. Additional information concerning interface-trap buildup saturation can be found in [27].

7. Conclusion

The ELDRS conversion model for the radiation-induced degradation of bipolar device parameters under low dose rate irradiation is described. The model is based on the concept that radiation-induced interface-trap buildup is governed by the hydrogen-electron mechanism, in which both hydrogenous species and electrons are responsible for radiation-induced interface-trap formation. The interaction of trapped positive charges (hydrogenous species) with electrons from the substrate leads to the formation of interface traps. A main feature of the ELDRS conversion model is that it includes techniques for extracting the fitting parameters. The model was validated by comparison with previously reported experimental data for different technologies and devices.

According to the conversion model of interface trap buildup, bandgap narrowing in the base region of a SiGe bipolar transistor shortens the deep trap conversion time and, as a result, during irradiation at any dose rate all oxide trapped charges have enough time to be converted into interface traps. Therefore, there is no difference between test dose rate and low dose rate irradiation (ELDRS-free). The interface-trap buildup saturation is explained by the interaction of radiation-induced electrons with the centers formed during the conversion process.

References

[1] T.R. Oldham, F.B. McLean, "Total ionizing dose effects in MOS oxides and devices", IEEE Trans. Nucl. Sci. NS-50, no. 3, 483-499, 2003. doi: 10.1109/tns.2003.812927
[2] P.S. Winokur, H.E. Boesch Jr., "Interface state generation in radiation-hard oxides", IEEE Trans.
on Nuclear Science, vol. 27, no. 6, pp. 1647-1650, 1980. doi: 10.1109/tns.1980.4331083
[3] F.B. McLean, "A framework for understanding radiation-induced interface state in SiO2 MOS structures", IEEE Trans. on Nuclear Science, vol. 27, no. 6, pp. 1651-1657, 1980. doi: 10.1109/tns.1980.4331084
[4] S.K. Lai, "Interface trap generation in silicon dioxide when electrons are captured by trapped holes", J. Appl. Phys., vol. 54, pp. 2540-2546, May 1983. doi: 10.1063/1.332323
[5] A.V. Sogoyan, S.V. Cherepko, V.S. Pershenkov, "Hydrogen-electron model of radiation induced interface trap buildup on oxide-semiconductor interface", Russian Microelectronics, vol. 43, no. 2, pp. 162-164, 2014.
[6] F.B. McLean, "Generic impulse response function for MOS systems and its application to linear response analysis", IEEE Trans. on Nuclear Science, vol. 35, no. 6, pp. 1178-1185, 1988. doi: 10.1109/23.25436
[7] V.S. Pershenkov, V.V. Belyakov, S.V. Cherepko, I.N. Shvetzov-Sholovky, "Three-point method of prediction of MOS device response in space environments", IEEE Trans. Nuclear Science, vol. 40, no. 6, pp. 1714-1720, 1993. doi: 10.1109/23.273488
[8] M.P. Baze, R.E. Plaag, A.H. Johnston, "A comparison of methods for total dose testing of bulk CMOS and CMOS/SOS devices", IEEE Trans. on Nuclear Science, vol. 36, no. 6, pp. 1818-1824, 1990. doi: 10.1109/23.101195
[9] D.M. Fleetwood, P.S. Winokur, J.R. Schwank, "Using laboratory X-ray and cobalt-60 irradiations to predict CMOS device response in strategic and space environments", IEEE Trans. on Nuclear Science, vol. 35, no. 6, pp. 1497-1505, 1988. doi: 10.1109/23.25487
[10] B.J. Mrstik, R.W. Rendell, "Si/SiO2 interface state generation during X-ray irradiation and during postirradiation exposure to a hydrogen ambient", IEEE Trans. on Nuclear Science, vol. 38, no. 6, pp. 1101-1110, 1991. doi: 10.1109/23.124081
[11] A.J. Lelis, T.R. Oldham, W.M. DeLancey, "Response of interface traps during high-temperature anneal", IEEE Trans.
on Nuclear Science, vol. 38, no. 6, pp. 1590-1596, 1991. doi: 10.1109/23.124150
[12] R.L. Pease, R.D. Schrimpf, D.M. Fleetwood, "Recent advances in understanding total-dose effects in bipolar transistors", IEEE Trans. on Nuclear Science, vol. 57, no. 4, 1894-1908, 2009. doi: 10.1109/radecs.1995.509744
[13] P.J. McWhorter, S.L. Miller, W.M. Miller, "Modeling the anneal of radiation-induced trapped holes in a varying thermal environment", IEEE Trans. Nucl. Sci., vol. 37, no. 6, p. 1683, Dec. 1990. doi: 10.1109/23.101177
[14] V.V. Emelianov, A.V. Sogoyan, O.V. Meshurov, V.N. Ulimov, V.S. Pershenkov, "Modeling the field and thermal dependence of radiation-induced charge annealing in MOS devices", IEEE Trans. on Nucl. Sci., 1996, vol. NS-43, no. 6, pp. 2572-2578. doi: 10.1109/23.556838
[15] E.P. Reilly, J. Roberson, "Theory of defects in vitreous silicon dioxide", Phys. Rev. B, vol. 27, no. 6, p. 3780 (1981).
[16] V.S. Pershenkov, D.V. Savchenkov, A.S. Bakerenkov, V.N. Ulimov, A.Y. Nikiforov, A.I. Chumakov, A.A. Romanenko, "The conversion model of low dose rate effect in bipolar transistors", in Proceedings of RADECS, pp. 286-393, 2009. doi: 10.1109/radecs.2009.5994661
[17] A.S. Bakerenkov, V.V. Belyakov, V.S. Pershenkov, A.A. Romanenko, D.V. Savchenkov, V.V. Shurenkov, "Extracting the fitting parameters for the conversion model of enhanced low dose rate sensitivity in bipolar devices", Russian Microelectronics, vol. 42, issue 1, January, 2013, pp. 48-52. doi: 10.1134/s1063739712040026
[18] S.C. Witczak, R.D. Schrimpf, K.F. Galloway, D.M. Fleetwood, R.L. Pease, J.M. Puhl, D.M. Schmidt, W.E. Combs, J.S. Suehle, "Accelerated tests for simulating low dose rate gain degradation of lateral and substrate PNP bipolar junction transistors", IEEE Trans. on Nucl. Sci., vol. 43, no. 6, pp. 3151-3160, 1996.
[19] R.K. Freitag, D.B. Brown, "Study of low-dose-rate effects on commercial linear bipolar ICs", IEEE Trans. on Nucl. Sci., vol. 45, no.
6, pp. 2649-2658, 1998.
[20] John D. Cressler, "Radiation effects in SiGe technology", IEEE Transactions on Nuclear Science, vol. 60, no. 3, pp. 1992-2014, June 2013. doi: 10.1109/tns.2013.2248167
[21] V.S. Pershenkov, S.V. Cherepko, A.V. Sogoyan, V.V. Belyakov, V.N. Ulimov, V.V. Abramov, A.V. Shalnov, V.I. Rusanovsky, "Proposed two-level acceptor-donor (AD) center and nature of switching traps in irradiated MOS structures", IEEE Transactions on Nuclear Science, vol. 43, no. 6, pp. 2579-2586, 1996.
[22] M.P. Baze, R.E. Plaag, A.H. Johnston, "Dose dependence of interface traps in gate oxides at high levels of total dose", IEEE Transactions on Nuclear Science, vol. 36, no. 6, pp. 1858-1864, 1989. doi: 10.1109/23.556839
[23] J. Boch, Y.G. Velo, F. Sainge, N. Roche, R.D. Schrimpf, J. Vaille, L. Dusseau, C. Chatry, E. Lorfevre, R. Ecoffet, A.D. Touboul, "The use of dose rate switching technique to characterize bipolar devices", IEEE Transactions on Nuclear Science, vol. 53, no. 6, pp. 3347-3353, 2009. doi: 10.1109/tns.2009.2033686
[24] H.J. Barnaby, R.D. Schrimpf, R.L. Pease, P. Cole, T. Turflinger, J. Kreig, J. Titus, D. Emily, M. Gehlhausen, S.C. Witczak, M.C. Maher, D. Van Nort, "Identification of degradation mechanisms in bipolar linear voltage comparator through correlation of transistor and circuit response", IEEE Transactions on Nuclear Science, vol. 46, no. 6, pp. 1666-1673, 1999. doi: 10.1109/23.819136
[25] J.M. Benedetto, H.E. Boesch, Jr., F.B. McLean, "Dose and energy dependence of interface trap formation in cobalt-60 and X-ray environments", IEEE Transactions on Nuclear Science, vol. 35, no. 6, pp. 1260-1264, 1988. doi: 10.1109/23.25449
[26] S.M. Sze, Physics of Semiconductor Devices, New York, Wiley, 1981.
[27] V.S. Pershenkov, A.S. Bakerenkov, A.V. Solomatin, V.V. Belyakov, V.V. Shurenkov, "Mechanism of the saturation of the radiation induced interface buildup", Applied Mechanics and Materials, vol. 565, pp. 142-146, 2014.
doi: 10.4028/www.scientific.net/amm.565.142

Facta Universitatis Series: Electronics and Energetics Vol. 29, No 1, March 2016, pp. 1-10
doi: 10.2298/fuee1601001l

Enhanced dynamic voltage clamping capability of clustered IGBT at turn-off period

Hong Y. Long, Mark R. Sweet, E. M. Sankara Narayanan
Department of Electrical and Electronic Engineering, University of Sheffield, UK

Abstract. One of the critical requirements for high power devices is rugged and reliable capability against harsh operating conditions. In this paper, we present the dynamic voltage clamping capability of 3.3 kV field-stop clustered IGBT devices under an extreme inductive load condition. The PMOS trench gate CIGBT structure is shown to offer outstanding performance, with a fast turn-off time and low overshoot voltage. Further optimization of the current gain of the CIGBT structure is analyzed through numerical evaluation. A step further in the safe operating area has been achieved for high voltage devices by CIGBT technology.

Key words: insulated gate bipolar transistor (IGBT), power semiconductor devices, clustered IGBT

1. Introduction

Similar to short circuit device failure, dynamic latch-up of high voltage IGBTs represents another practical failure mode during device turn-off under dynamic avalanche conditions. The overshoot of anode voltage that occurs during device turn-off, especially for parallel connected power modules, is very critical for IGBT operation, and the device should be kept within the limited safe operating area (SOA). Manufacturers and circuit designers have been trying to suppress the peak voltage by reducing the anode current turn-off di/dt, by de-rating, and by using voltage clamping circuits and snubbers to achieve sustainable capability. However, these methods unavoidably increase the turn-off switching loss, the cost and the complexity of the system. The self-voltage clamping characteristics of IGBTs have been reported in [1-4].
It must be capable of absorbing all the energy stored in the inductance during abnormal conditions [5]. It is important to develop IGBTs that survive without destruction even under dynamic avalanche conditions [6].

Received June 08, 2015. Corresponding author: E. M. Sankara Narayanan, Department of Electrical and Electronic Engineering, University of Sheffield, United Kingdom (e-mail: s.madathil@sheffield.ac.uk)

During turn-off, the abrupt reduction of the gate voltage stops the injection of electrons from the n-channel. The anode current continues to flow because of the inductive load, and it must be sustained by the hole current. The holes carrying this current modify the effective carrier concentration in the n-drift region. The profile of the electric field is determined by the Poisson equation, Eq. (1):

$$\frac{dE}{dx} = \frac{q\,N_{eff}}{\varepsilon_s}, \qquad (1)$$

where N_eff is the effective carrier concentration in the n-drift region, q is the elementary charge and ε_s is the permittivity of the semiconductor. The extra carriers lead to an increase in N_eff. This can modify the profile of the electric field and may force the device into a dynamic avalanche mode through the high peak electric field. The process is stable if the extra generated electrons and holes are balanced in number; it then continues until all the remaining excess carriers are removed, at which point the dynamic avalanche mode ends abruptly. Otherwise, the avalanche-generated carriers can drive the process out of control and lead to device failure.

Due to the stray inductance in the circuit, the IGBT anode voltage overshoots and eventually the electric field can punch through the n-drift region. When the anode voltage reaches the DC bias voltage, the anode current begins to fall as the current transfers to the diode at a rate that depends on the stray inductance and the peak anode voltage. The capability of power devices to dissipate the large amount of power generated during this period could be improved by employing a high IGBT internal PNP gain, β [1].
More hole carriers will balance the effective carriers in the n-drift region, but this approach would increase the turn-off loss and the leakage current in the off-state. In this paper, we demonstrate the dynamic avalanche ruggedness of the 3.3 kV field-stop clustered IGBT (CIGBT) [7-10] with self-voltage clamping capability. The technology shows improved safe and efficient operation and will ease the design constraints at the system level.

2. Self-clamped inductive switching capability

2.1. Device structure

The CIGBT is a MOS-bipolar device employing a controlled thyristor concept to significantly reduce the on-state voltage drop. It has the unique capability to clamp the cathode cell potential by punch-through of an n-well region between the p-base and the p-well, termed "self-clamping". This feature improves the current saturation characteristics and enables better short circuit performance [11]. The single cell schematic structures of the 3.3 kV class conventional CIGBT, PMOS trench gate CIGBT and field-stop IGBT are shown in Fig. 1(a)-(c) respectively. The IGBT structure model is optimized for comparison purposes [11]. As a result, all structures have the same cell dimensions. The PMOS trench CIGBT [12], Fig. 1(b), is identical to the conventional CIGBT, Fig. 1(a), except that a PMOS trench gate (width = 1 µm, depth = 4 µm) connects the p-base to the gate. The PMOS and NMOS gates are connected together to form a three terminal device. The PMOS channels conduct only during the turn-off cycle, when the gate voltage is negative, and provide a path for the hole current. A constant lifetime of 50 µs is chosen for both electrons and holes, and it is assumed that the edge termination does not have any impact upon device performance under this condition. Only one full cell is considered in the simulated CIGBT structures, although in reality each cluster can consist of 50 to 100 cathode cells.

Fig.
1 Schematic structure diagram of (a) planar gate CIGBT, (b) planar gate CIGBT with deep PMOS trench channel and (c) conventional planar gate IGBT.

2.2. Device turn-off performance

The 3.3 kV FS CIGBT and IGBT structures shown in Fig. 1 are simulated to compare their capability to clamp voltage under an extreme inductive switching condition. The circuit configuration for the inductive turn-off is shown in Fig. 2. The devices are turned off at VDC = 2500 V, Ianode = 150 A and Tj = 25 °C. A large stray parasitic inductance of Lc = 2.4 µH is also included in the circuit. It is important to point out that no gate resistor is used in the circuit. Conventional technology normally requires a large gate resistance to suppress dynamic avalanche generation, but the turn-off loss then increases, mainly because of the reduced dv/dt and the longer turn-off time [13]. A further increase in turn-off losses and the application of a de-rating factor to power devices cause a significant loss in SOA capability. The reduction of Rg in the new technology provides much lower power losses and a shorter delay time during the turn-off transient when compared to conventional technology.

Fig. 2 Circuit setup of inductive load turn-off simulation.

Fig. 3 IGBT and CIGBT turn-off waveforms (VDC = 2500 V, IA = 150 A, Tj = 25 °C; solid line: anode voltage; dashed line: anode current).

Fig. 3 shows the turn-off waveforms of the planar gate CIGBT, the PMOS trench gate CIGBT and the conventional IGBT. The maximum voltage peak across the IGBT during the transient is about 400 V higher than that of the CIGBT devices and is associated with strong voltage oscillation. The planar gate CIGBT has a slow dv/dt in comparison with the IGBT device. This is because the CIGBT has several times higher conductivity modulation of the n-drift region due to thyristor conduction [10]. It therefore takes longer to remove the excess carriers from its n-drift region.
On the other hand, the low dv/dt helps to keep the current and voltage levels within the SOA and eases the high power stress across the device, so that a lower voltage peak and less oscillation are found. The PMOS trench gate CIGBT is the best performing device, displaying both a fast turn-off time and a low voltage peak in contrast to the other two structures.

The current flow lines of the planar gate IGBT, planar gate CIGBT and PMOS trench gate CIGBT at 200 ns after the gate turn-off, when the MOS channel of these devices has cut off and the device enters dynamic avalanche in the n-drift region, are shown in Fig. 4(a)-(c), respectively. The CIGBT devices behave differently from the IGBT because their current is carried by a controlled thyristor. The holes within the p-well region flow through the depleted n-well at the saturated hole velocity and are collected at the cathode contact. It should be noted that the n-well is completely depleted when the anode voltage exceeds the self-clamping value of the n-well. Avalanche-generated electrons and holes can also be identified from the laterally displayed current flow lines. The PMOS trench gate conducts during the turn-off period, when the gate voltage goes negative. It extracts holes vertically through the trench gate channels and enhances the capability of the CIGBT to remove charge underneath the cathode region. A lower current density can thus be achieved.

Fig. 4 Current flow lines of (a) IGBT, (b) CIGBT, and (c) PMOS trench gate CIGBT structure at time = 200 ns.

After the turn-off of the gate voltage, the DC voltage is supported within the structure by the formation of the depletion region. Depending on the concentration of excess carriers in the depletion region, the width of the depletion region expands with time, allowing the device to support a larger anode voltage.
The electric field profiles in the n-drift region during the turn-off period are plotted in Fig. 5. The electric field expands towards the anode contact to support higher voltages and eventually punches through to the n-buffer region at the maximum clamped voltage. It should be noted that the different positions of the electric field peaks at the cathode side arise because the forward blocking voltage is supported by the p-base/n-drift junction in the IGBT, whereas it is supported by the p-well/n-drift junction in the CIGBT devices.

Fig. 5 Simulated electric field distribution of the structures after gate turn-off (solid line: time = 200 ns, dash line: time = 400 ns, and dotted line: time at maximum anode voltage).

Fig. 6 Simulated effective carrier concentration of the structures based on the results shown in Fig. 5 (solid line: time = 200 ns, dotted line: time = 400 ns, and dash line: time at maximum anode voltage).

The electric field of the planar gate CIGBT expands at a slower rate in comparison with the other two devices. This can be explained by the N_eff concentration across the structures, as shown in Fig. 6. In the planar gate CIGBT, a significantly higher portion of the carriers is concentrated at the cathode side than in the other two structures. Over time, N_eff moves towards the anode contact and becomes more evenly distributed across the whole region. It is important to notice that, with the help of the PMOS trench gate, the number of hole carriers at the cathode side is greatly reduced in comparison with the conventional CIGBT. This technology provides an efficient way to remove excess carriers.

3. Optimization of current gain of FS-CIGBT

Self-clamping of the overshoot voltage can be achieved by optimization of the n-buffer layer in FS technology. This results in the larger SOA required for high voltage devices.
As in the short circuit condition, the self-clamped voltage is influenced by the internal PNP current gain, βpnp, which is a function of the anode emitter efficiency, γanode, the base transport factor, αT, and the effective n-buffer thickness, Weff, as stated in Eq. (2):

[Eq. (2): expression not recoverable from the source text]

where Lp is the hole diffusion length, which depends on the carrier mobility, lifetime and temperature. Optimum parameters of βpnp are therefore required for the FS CIGBT to withstand the overshoot voltage successfully.

The 3.3 kV planar gate FS CIGBT is simulated under the same circuit configuration as in Section 2.2 to determine the influence of the PNP current gain on the dynamic clamping performance of the CIGBT with different n-buffer thicknesses and anode peak doping concentrations. Fig. 7 shows the turn-off waveforms with n-buffer thicknesses varying from 5 µm to 30 µm at the same peak concentration of 5.0×10^15 cm^-3. It can be observed that the dv/dt of the anode voltage after MOS channel turn-off is greatly influenced by the n-buffer thickness. The overshoot voltage is also reduced as the n-buffer thickness is reduced, but at the expense of the current fall time during the transient. In comparison, the turn-off waveforms of the structure with varying anode peak concentration at a constant n-buffer thickness (15 µm) and doping concentration (5.0×10^15 cm^-3) are shown in Fig. 8. As expected from the increase in current gain obtained by increasing the peak anode doping concentration, a reduced self-clamped voltage is achieved at the expense of turn-off loss.

Fig. 9 illustrates the peak power density during the turn-off transient as a function of n-buffer thickness and anode peak doping concentration. The peak power density decreases with decreasing n-buffer thickness. The same trend can be found for the IGBT, plotted for comparison. As a thinner buffer increases the number of holes injected into the n-drift region during the transient, the peak power density is reduced.
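The dependence of the PNP gain on buffer thickness and emitter efficiency can be illustrated with the standard textbook relations for a bipolar transistor with a uniform base: a base transport factor of 1/cosh(W/Lp) and a common-emitter gain of α/(1 − α) with α the product of emitter efficiency and base transport factor. The sketch below assumes these textbook forms, which may differ from the paper's own Eq. (2); the diffusion length `L_P_UM` and emitter efficiency `GAMMA` are hypothetical values chosen only for illustration.

```python
import math


def base_transport_factor(w_eff_um, l_p_um):
    """Textbook base transport factor for a uniform base: 1 / cosh(W / Lp)."""
    return 1.0 / math.cosh(w_eff_um / l_p_um)


def pnp_gain(gamma_anode, w_eff_um, l_p_um):
    """Common-emitter gain from alpha = gamma * alpha_T (textbook relation)."""
    alpha = gamma_anode * base_transport_factor(w_eff_um, l_p_um)
    return alpha / (1.0 - alpha)


L_P_UM = 20.0  # hypothetical hole diffusion length in the buffer, micrometres
GAMMA = 0.8    # hypothetical anode emitter efficiency

for w_eff in (5.0, 10.0, 15.0, 20.0, 30.0):  # thickness range studied in Fig. 7
    print(f"W_eff = {w_eff:4.1f} um: alpha_T = "
          f"{base_transport_factor(w_eff, L_P_UM):.3f}, "
          f"beta = {pnp_gain(GAMMA, w_eff, L_P_UM):.2f}")
```

Under these assumptions the gain falls steeply as the buffer thickens, since the thickness enters through the hyperbolic cosine, while a change in emitter efficiency acts on the gain much more gently.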
However, the reduction is less significant when the n-buffer thickness is less than 15 µm. Other constraints, such as stray inductance and carrier mobility, limit further improvement in the peak power density once there are sufficient holes to maintain a normal electric field distribution in the n-drift region. In the case of the IGBT, the peak power density is higher for the same n-buffer thickness because the electric field peak across the n-drift region is higher than in the CIGBT, as explained in the previous section.

Fig. 7 CIGBT turn-off waveforms with n-buffer thickness varying from 5 µm to 30 µm (VDC = 2500 V, IA = 200 A, Tj = 25 °C, Rg = 0 Ω).

Fig. 8 CIGBT turn-off waveforms with variable anode peak concentration (VDC = 2500 V, IA = 200 A, Tj = 25 °C, Rg = 0 Ω).

For a constant n-buffer thickness of 15 µm, the peak power density of the CIGBT with increasing peak anode doping concentration is also plotted in the same figure. A higher anode peak concentration also increases the PNP current gain, but the peak power density shows only a slight reduction when compared with the variation of n-buffer thickness. This is because, as Eq. (2) suggests, βpnp changes exponentially with the n-buffer thickness, whereas it changes only linearly with γanode.

The trade-off relationship between turn-off power loss and maximum self-clamped voltage is plotted in Fig. 10 for n-buffer thicknesses from 5 µm to 30 µm. By controlling the n-buffer thickness, the trade-off between voltage clamping capability and turn-off loss can be optimized. As can be concluded from the above results, the 3.3 kV FS CIGBT device exhibits good voltage clamping capability and turn-off loss.

Fig. 9 Peak power density during turn-off.

Fig. 10 Turn-off loss and clamped voltage dependence on the n-buffer thickness.

4.
Conclusion

This paper has shown the dynamic voltage clamping capability of the planar gate CIGBT, the PMOS trench gate CIGBT and the conventional IGBT under extreme stray inductance and zero gate resistance. The removal of excess charge stored in the n-drift region determines the turn-off time and the maximum clamped voltage. The PMOS trench gate provides a more efficient way to extract the hole carriers, through the induced p-channel, when the gate voltage goes negative. Of the three types of structure simulated, it exhibits low losses, a fast turn-off time and smooth switching waveforms.

The self-voltage clamping feature of the CIGBT can be further improved through structural optimization of the internal PNP current gain. A high current gain gives better over-voltage protection, but increases the turn-off power loss. A low current gain should also be avoided, as it shifts the peak electric field from the cathode to the anode side and induces oscillation during the process. The simulation analysis has shown that the freedom provided by the n-buffer allows greater optimization of the performance of FS devices than is possible with NPT technology. This has a considerable impact on the SOA capability and power losses of the FS CIGBT. The new protection feature of the FS CIGBT can simplify system design and offer greater optimization of the performance of high voltage devices.

References

[1] M. Rahimo, A. Kopta, S. Eicher, U. Schlapbach, and S. Linder, "A study of switching-self-clamping-mode "SSCM" as an over-voltage protection feature in high voltage IGBTs," in Proceedings of the 17th International Symposium on Power Semiconductor Devices & ICs, pp. 67-70, 2005.
[2] A. Rahimo, A. Kopta, S. Eicher, U. Schlapbach, and S. Linder, "Switching-self-clamping-mode "SSCM", a breakthrough in SOA performance for high voltage IGBTs and diodes," ISPSD '04, in Proceedings of the 16th International Symposium on Power Semiconductor Devices & ICs, pp. 437-440, 2004.
[3] M. Otsuki, Y. Onozawa, S. Yoshiwatari, and Y. Seki, "1200V FS-IGBT module with enhanced dynamic clamping capability," ISPSD '04, in Proceedings of the 16th International Symposium on Power Semiconductor Devices & ICs, pp. 339-342, 2004.
[4] J. Yedinak, J. Wojslawowicz, B. Czeck, R. Baran, D. Reichl, D. Lange, P. Shenoy, and G. Dolny, "Enhanced IGBT self clamped inductive switching (SCIS) capability through vertical doping profile and cell optimization," in Proceedings of the 14th International Symposium on Power Semiconductor Devices & ICs, pp. 289-292, 2000.
[5] J. Yedinak, J. Merges, J. Wojslawowicz, A. Bhalla, D. Burke, and G. Dolny, "Operation of an IGBT in a self-clamped inductive switching circuit (SCIS) for automotive ignition," ISPSD '98, in Proceedings of the 10th International Symposium on Power Semiconductor Devices & ICs, pp. 399-402, 1998.
[6] J. Lutz and R. Baburske, "Dynamic avalanche in bipolar power devices," Microelectronics Reliability, vol. 52, pp. 475-481, Mar 2012.
[7] E. M. S. Narayanan, M. R. Sweet, N. Luther-King, K. Vershinin, O. Spulber, M. M. De Souza, and J. V. S. C. Bose, "A novel, clustered insulated gate bipolar transistor for high power applications," in Proceedings of the International Semiconductor Conference, CAS 2000, vols 1 and 2, pp. 173-181, 542, 2000.
[8] M. Sweet, N. Luther-King, S. T. Kong, E. M. S. Narayanan, J. Bruce, and S. Ray, "Experimental demonstration of 3.3kV planar CIGBT in NPT technology," ISPSD 08, in Proceedings of the 20th International Symposium on Power Semiconductor Devices & ICs, pp. 48-51, 2008.
[9] K. Vershinin, M. Sweet, O. Spulber, S. Hardikar, N. Luther-King, M. M. De Souza, S. Sverdloff, E. M. S. Narayanan, and D. Hinchley, "Influence of the design parameters on the performance of 1.7kV, NPT, planar clustered insulated gate bipolar transistor (CIGBT)," ISPSD '04, in Proceedings of the 16th International Symposium on Power Semiconductor Devices & ICs, pp. 269-272, 477, 2004.
[10] N.
luther-king, e. m. s. narayanan, l. coulbeck, a. crane, and r. dudley, "comparison of trench gate igbt and cigbt devices for increasing the power density from high power modules," ieee transactions on power electronics, vol. 25, pp. 583-591, mar 2010. [11] h. y. long, l. ngwendson, e. sankara narayanan, and m. sweet, "numerical evaluation of the short-circuit performance of 3.3-kv cigbt in field-stop technology," ieee transactions on power electronics, vol. 27, pp. 2673-2679, 2012. [12] n. luther-king, m. sweet, and e. m. s. narayanan, "performance of a trench pmos gated, planar, 1.2 kv clustered insulated gate bipolar transistor in npt technology," in proceedings of the 21st international symposium on power semiconductor devices & ics, pp. 164-167, 2009. [13] t. ogura, h. ninomiya, k. sugiyama, and t. inoue, "turn-off switching analysis considering dynamic avalanche effect for low turn-off loss high-voltage igbts," ieee transactions on electron devices, vol. 51, pp. 629-635, apr 2004.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 403-416 doi: 10.2298/fuee1703403p

analysis of two low-cost and robust methods for indoor localisation of mobile robots *

miloš petković, vladimir sibinović, dragiša popović, vladimir mitić, darko todorović, goran s. đorđević
university of niš, faculty of electronic engineering, niš, serbia

abstract. this paper presents two simple and cost-effective indoor localisation methods. the first method uses a ceiling-mounted wide-angle webcam, computer vision and coloured circular markers placed on top of a robot. the main drawbacks of this method are lens distortion and sensitivity to lighting conditions. after solving these problems, a high localisation accuracy of ±1 cm is achieved at a sampling rate of about 4 hz. the second method is a version of trilateration, based on ultrasound time-of-flight distance measurement.
an ultrasonic beacon is placed on the robot, while wall detectors are strategically placed to avoid excessive occlusion. a zigbee network is used for inter-device synchronisation and for broadcasting measured data. the robot location is determined as the solution to a minimisation of measurement errors. using the nelder-mead algorithm and low-cost distance measuring devices, a solid sub 5 cm localisation accuracy is achieved at 10 hz.

key words: robot localization, nelder-mead, gnu scientific library, usb camera, opencv

received october 7, 2016; received in revised form december 15, 2016
corresponding author: miloš petković, faculty of electronic engineering, university of niš, serbia, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: milos.petkovic@elfak.ni.ac.rs)
* an earlier version of this paper received the best section paper award at the electronics section of the 3rd international conference on electrical, electronic and computing engineering icetran 2016, zlatibor, serbia, 13-16 june, 2016 [1]

1. introduction

indoor localisation of robots or objects is a vital research area, intrinsically important for expanding the competences of future low-cost home robots. a comprehensive research overview is best gained by browsing the entries in microsoft's indoor localisation competition, held three years in a row [2], starting in 2014. the best scores are often achieved with expensive components such as lidars. however, when it comes to a low-cost mobile robot, localisation has to be both reliable and inexpensive. consequently, the compromise comes down to the ratio between positioning accuracy and the cost of producing and deploying the localisation system. this is not difficult to achieve for service robots. for example, home cleaning robots do not require high precision localisation for wandering.
however, if servicing an arbitrary point in the workspace is required, comprehensive research is needed in order to stay below the price tag. furthermore, indoor localisation is especially challenging [3] because the gps signal is weak or non-existent, and because the variety of objects and their placement within a room causes occlusion. thus, any method that needs a straight line of sight between two parts requires a redundant solution. on the other hand, such added complexity increases the overall cost. therefore, careful consideration is needed before choosing the right method.

our localisation is based on a low-cost, ultrasonic, time-of-flight distance measuring system, similar to cricket [4, 5]. the robot emits an ultrasonic beacon signal, while fixed wall-mounted devices measure the time of flight. this kind of system is inexpensive, so increasing redundancy by adding more wall devices does not increase the overall system cost considerably.

straightforward trilateration poses a few problems. the first one appears when, due to measurement error, three or more spheres do not intersect at a single point. for small measurement errors this could be neglected and treated as a rounding error. since the measurement errors of our system can reach ±10 cm, they cannot be neglected. the other problem, a special case of the first one, is the absence of any intersection between the spheres in the case of negative errors. mathematically speaking, the trilateration solution is then imaginary. arguably, accuracy could be improved by calibrating each wall unit separately and by ensuring that their coordinates are precise. however, in cases of occlusion and reflections, these problems would reappear. therefore, we seek a solution through criterion-based optimisation, to get as close as possible to the point that minimises the measurement error.
further improvement could be achieved by using a secondary, more accurate, localisation system. when the two systems run in parallel, the second system is a good reference for the calibration of the first one. for this purpose, the localisation rate does not even need to be high. therefore, we decided to base the secondary system on computer vision and recognition of passive markers. low cost was a priority here as well, so overcoming the typical drawbacks of such image processing methods was important. fisheye lens distortion was removed by using known geometry [6], and the complexity of object recognition was avoided by simplification and colour coding of the markers [7]. the remaining drawbacks are addressed in detail in the following section.

2. visual feedback mapping for localisation

2.1. materials and method

we placed a fish-eye webcam on the ceiling in the middle of the test room. in order to make this system affordable, we based it on a full hd webcam, genius f100, with a 120° view angle lens, and a moderately powerful pc with an amd athlon ii x3 455 3.30 ghz, ati radeon hd 6450 and 4 gb of ddr3 ram. the distance between the camera lens and the floor is 3.1 metres; therefore, the camera with the 120° wide-angle lens can cover an area of 4 × 3 m. a grid of 0.5 × 0.5 m was drawn on the floor to ease calibration and provide a visual cue during measurement. the grid is highly accurate, with only 5 mm of distortion error over its 5 m diagonals.

rectification was crucial for this system because the wide-angle lens has intrinsic distortion. its removal is easy, since the camera itself is stationary and the marker height is assumed to be constant. sampling images of the marker at different positions reveals the level of distortion. this data is then used to invert the effect. we gathered those samples at the drawn grid points.
as an aid we used a tripod, and as a marker we used an orange ball, as shown in fig. 1. the height of this customised calibration tool was set to 1.1 m, which reduced the distance between the camera lens and the marker to exactly 2 m. after relocating the tripod around the grid, and overlaying all images one on top of the other, we generated fig. 1. the central part of the grid, which aligns with the middle of the camera, is free from lens distortion. that is why we dropped some of the middle points but kept the ones on the edges and in the corners, where the distortion is largest.

fig. 1 overlay of tripod with marker as calibration points in our test room.

we found it fitting to divide the frame into 9 regions and linearise them independently. this keeps rectification simple and calibration easy. the number of pixels between sampled points was counted manually and converted to centimetres. later on, calibration constants and offsets for each region were calculated and embedded in the positioning algorithm.

distortional displacement within the camera image is not the same for close and distant objects. obviously, an additional calibration is required if the height of the marker is changed. however, there is no need for this if the marker placement is optimal. the best place for the marker is on top of the tracked object, where the chance of occlusion is negligible. we should note that markers placed higher do require more linearisation sectors, as the difference between the real position of the object on the floor and its position in the camera frame varies.

an important part of simplifying marker recognition is colour coding, which makes identification easy. in addition, the extracted marker shape is more accurate, which enhances precision in the calculation of the marker centre. we implemented this extraction through pixel classification. the classification of pixels generates a black and white image, where white pixels are those whose original colour falls within the colour range of the marker.
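the pixel classification step can be illustrated with a minimal, pure-python sketch. it is not the authors' implementation: the hue, saturation and value thresholds, the function name classify_pixels and the tiny test image are all assumptions chosen only to show the idea of turning a colour image into a binary mask.

```python
import colorsys


def classify_pixels(image, hue_range, min_sat=0.5, min_val=0.3):
    """Return a binary mask: 255 where a pixel lies in the marker colour range, else 0.

    image: rows of (r, g, b) tuples with components in 0..255.
    hue_range: (low, high) hue bounds in degrees (illustrative values, not from the paper).
    """
    low, high = hue_range
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in 0..1 and returns hue in 0..1.
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hue_deg = h * 360.0
            inside = low <= hue_deg <= high and s >= min_sat and v >= min_val
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask


# Hypothetical 2x2 image: bright orange, grey, green, very dark orange.
image = [[(255, 128, 0), (128, 128, 128)],
         [(0, 200, 0), (40, 20, 0)]]
orange_mask = classify_pixels(image, hue_range=(15, 45))
```

only the bright orange pixel passes all three thresholds here; the dark orange one has the right hue but is rejected by the value threshold, which is one way of coping with shadows.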
this new image contains slightly etched shapes of the markers, with some artefacts as well. another layer of smoothing filter corrects this. we suggest gaussian blur, as it produced quite useful results for us. larger artefacts, if they happen to persist, are filtered out by shape and size classification. we opted for a circular marker design.

marker colour distinction also enables multi-object tracking, or orientation recognition by engaging two markers per object. in particular, we used the larger, orange marker for tracking position, while the smaller, green one was an aid in tracking the robot heading. this marker combination proved to be the most desirable with respect to program execution time.

after the marker position in pixels is extracted, in our case after the centre of the only remaining circle is calculated, it is converted to an absolute position in centimetres by using formula (1) and the calibration constants.

\[ mpc = \frac{mp - os_1}{c_{calib}} + os_2 \qquad (1) \]

mp is the marker position in pixels, while os1 is the marker offset in pixels for the region it belongs to. ccalib and os2 are the linearity gain and the offset in centimetres for the region. their values, for all nine calibration regions, are given in table 1. finally, mpc is the marker position in centimetres, in a coordinate system whose centre is placed at the bottom left calibration point of fig. 1.

table 1 calibration constants and offsets for conversion into cm

marker position   os1 x   os1 y   ccalib x   ccalib y   os2 x   os2 y
upper left        285     28      3.44       3.64       0       0
centre left       285     28      3.44       3.43       0       0
lower left        285     900     3.36       3.23       0       250
upper middle      620     20      2.29       2.26       100     0
centre            620     200     3.5        3.5        100     50
lower middle      620     1319    3.6        3.6        50      0
upper right       1319    28      3.44       3.43       300     0
centre right      1319    28      3.44       3.43       300     0
lower right       285     900     3.36       3.23       0       250

2.2.
implementation and results

the program was developed under windows 10 with microsoft visual studio community 2015 and the opencv library, version 3.0. at program start-up, camera parameters such as brightness, contrast, saturation, hue, gamma, sharpness and exposure are pre-set to suitable values. this parameter tweaking improves pixel colour classification under the given lighting conditions. we determined the values experimentally for our neon-lit test room with west-facing windows. prior to pixel classification, the image is converted from rgb into hsv. the inRange function is then used as the classifier to generate a black and white image. as already stated, we used GaussianBlur to smooth the black and white image and remove smaller artefacts. in the next step, we extract the marker position using SimpleBlobDetector. its parameters are set to ignore everything but circles of a particular size, thus filtering out any larger artefacts. the centre point of the detected blob is taken as the marker position in pixels.

to speed up the program, we trim sampled frames to a region of interest (roi). this way, computationally intensive functions like SimpleBlobDetector execute faster. during the initialisation phase, the program searches the whole frame for the marker until it is found. afterwards, the roi is extracted from each frame based on the previous marker position and the maximum expected movement. this roi trimming not only shortens calculation time but also filters out other objects with visual properties similar to the marker's. one precaution is that no such objects may be present when the program starts. otherwise, some other object could be recognised for tracking instead of the marker, and the wrong roi would then be extracted.
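the roi selection described above can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' code: the function names roi_around and to_frame_coords, the frame size of 1920 × 1080 pixels and the numeric values for the maximum movement and marker radius are all hypothetical.

```python
def roi_around(prev_xy, max_move_px, marker_radius_px, frame_w, frame_h):
    """Return (x0, y0, x1, y1): a search window around the previous marker
    position, enlarged by the maximum expected movement and clamped to the frame."""
    half = max_move_px + marker_radius_px
    px, py = prev_xy
    x0 = max(0, px - half)
    y0 = max(0, py - half)
    x1 = min(frame_w, px + half)
    y1 = min(frame_h, py + half)
    return x0, y0, x1, y1


def to_frame_coords(roi, xy_in_roi):
    """Map a position found inside the ROI back to full-frame pixel coordinates."""
    return roi[0] + xy_in_roi[0], roi[1] + xy_in_roi[1]


# Hypothetical values: full hd frame, marker previously seen at (100, 540).
roi = roi_around((100, 540), max_move_px=60, marker_radius_px=20,
                 frame_w=1920, frame_h=1080)
# A blob centre detected at (85, 70) inside the ROI, in full-frame coordinates:
marker_xy = to_frame_coords(roi, (85, 70))
```

the clamping matters near the borders of the frame, where the window would otherwise extend outside the image; the offset in to_frame_coords is needed because a detector run on the cropped image reports positions relative to the crop.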
in the last step, the marker position in pixels is converted into the actual position in centimetres, in an absolute coordinate frame attached to the floor. approximately, one centimetre corresponds to 2.5 pixels.

initial verification of the system consisted of repeated measurements with the marker placed on a tripod at an arbitrary point in the workspace. this tests calibration accuracy and system repeatability. after a large number of consecutive measurements, we can confirm that the system is reliable and repeatable at an acceptable level. a total of 1572 location samples of a still marker were acquired. on average, one localisation cycle took 235 ms. with a localisation rate of 4.26 hz, such a system is not suitable for localising high speed mobile platforms. nevertheless, a robot that travels at a comfortable speed of 0.3 m/s would be localised at points 7 cm apart. this can be considered acceptable in applications such as fetching objects for a customer or telepresence, but not in precise object handling.

repeatability for all 1572 measurements was within a one-centimetre range, which corresponds to 2 to 3 camera pixels. due to small variations in lighting and inherent camera noise, there is jitter in the marker position found by the simple blob detector. when the position in pixels is converted into a position in centimetres and rounded, the jitter passes on to the marker position in centimetres. an improvement is possible by increasing the camera resolution, or perhaps by increasing the number of linearisation sections. however, we find the static performance of this system quite satisfactory for the calibration and support of a low-cost, ultrasound based, time-of-flight localisation system.

for dynamic testing of the camera localisation system, we decided to make several circular motions in the centre of the test room. there are two reasons for this. the first is the simplicity of the trajectory equations, which allows easier data analysis later on.
the second is the trajectory length, which should provide enough time to acquire a sufficient amount of data. since the test room was not large enough for straight line movements, the most logical trajectory was a circular one. also, it can be easily performed without the need for an expensive setup. for example, a simple remotely driven mobile platform, such as a more powerful homemade rc car, suffices. another option is a motor driven rotating stand. at our disposal was a small, student grade, robotic platform. after attaching the marker to it, we started the localisation and made 30 laps at an approximately constant speed of 20 cm/s. the program was set to log the marker positions with the time stamps of frame acquisition. the time stamps are expressed in milliseconds, and the local time is measured from the beginning of the test. fig. 2 shows the plotted positions of the marker. as can be noted, the trajectory is circular, but there is some slight movement of the centre.

fig. 2 logged trajectory of circular motion of marker. number of repetitive cycles is 30.

in the next step, we did a time stamp analysis of the measurement run, which was about 426 s long. this was necessary for the later analysis of the trajectory. the logged time appeared rather linear when plotted. the average time between processed frames is 236 ms, with a standard deviation of 22.9 ms. an interesting observation is the structure of the histogram of time differences between two successively grabbed frames, shown in fig. 3.

fig. 3 histogram of time differences, dt, between two successively grabbed frames.

the histogram peaks are at an equal distance of approx. 15 ms. since the camera streams at about 30 fps, this 15 ms seems like half of a frame time. the average period of 236 ms then corresponds to 7 frames.
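the arithmetic behind these statements can be checked with a short python sketch. the constants are taken from the text above; the helper to_frames is a hypothetical name introduced only for illustration, and the stream rate is treated as exactly 30 fps, which is an assumption.

```python
FPS = 30.0               # nominal camera stream rate quoted in the text
MEAN_PERIOD_MS = 236.0   # average time between processed frames
PEAK_SPACING_MS = 15.0   # spacing between histogram peaks

frame_ms = 1000.0 / FPS                        # duration of one stream frame
frames_per_cycle = MEAN_PERIOD_MS / frame_ms   # localisation period in frames
peak_fraction = PEAK_SPACING_MS / frame_ms     # peak spacing as a fraction of a frame


def to_frames(dt_ms, fps=FPS):
    """Round a time difference in milliseconds to a whole number of stream frames."""
    return round(dt_ms * fps / 1000.0)


print(f"frame time: {frame_ms:.1f} ms")
print(f"mean period: {frames_per_cycle:.2f} frames")
print(f"peak spacing: {peak_fraction:.2f} of a frame")
```

one frame lasts about 33 ms, so the 15 ms peak spacing is a little under half a frame and the 236 ms period rounds to 7 frames, in line with the text.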
a slight variance in the stream frame rate and in code execution could lead to frame grabbing jitter. the jitter would be only one frame. its effect would be an increase in localisation uncertainty equal to one frame time multiplied by the speed of the marker. if the speed is low, the added uncertainty stays below a few centimetres. in our case, for a speed of just under 20 cm/s, it evaluates to 0.6 cm. when the time stamps are converted to an integer number of frames from the beginning of the test, and the time differences are recalculated, the histogram looks as in fig. 4. now it is much clearer that almost half of the samples are taken with a 7 frame difference. of the remaining samples, about one third have a 6 frame difference and one third an 8 frame difference. in other words, the standard deviation is 0.77 frames. to conclude, as far as the timing analysis is concerned, since no real-time os was used, a variance in frame processing and sampling does exist. however, it is not more than 10 frames, or one third of a second.

fig. 4 histogram of time stamp differences, when time is converted to frames with 30 fps rate.

in parallel with the dynamic performance test, we did an additional timing analysis. we wondered whether this kind of localisation system could be integrated as a small localisation device capable of broadcasting the tracked object location via wi-fi. thus, the image processing pc was set to send the position in udp packets to a pc within the same wireless network. comparing the frame sampling times with the arrival times of the udp packets, we again got 236 ms as the average time difference between location updates. on the other hand, the standard deviation is now 133 ms, which is almost 6 times more than for the localisation alone. the main culprits are packet buffering and wireless signal quality. due to them, a considerable number of packets were late.
note also that this differential analysis excludes the fixed amount of latency from wi-fi, as it did with camera frame grabbing. since we are using low-cost off-the-shelf components, it is not possible to determine this kind of delay accurately, at least not without special setups. nevertheless, we find sending the location via udp packets and wi-fi for control purposes plausible. however, control algorithms must either be robust enough to tolerate variable time delays, or take advantage of the frame time stamp and perform small corrections of the received location.

in the following step, we did the trajectory analysis in two stages. firstly, we found the trajectory radius r and centre (x0, y0), as well as the speed of the centre movement (vx, vy). this was achieved by finding the best fit for function (2).

\[ f(x, y, t) = r - \sqrt{(x_0 + v_x t - x)^2 + (y_0 + v_y t - y)^2} \qquad (2) \]

basically, function (2) represents the difference between the estimated radius and the radius of the acquired location. for any measured point it should be equal to zero. the best fit gave r of 41.9 cm, (x0, y0) of (191.1, 149.6) cm, and (vx, vy) of (0.264, -0.096) mm/s. the average error of the best fit is 6e-16, while the standard deviation is 0.633 cm. it is interesting to note that the standard deviation is at the level of the frame jitter mentioned above, for an object moving at 20 cm/s. nevertheless, we state that the accuracy of this system for a moderate speed of the tracked marker is ±1.5 cm, or ±2.25 cm if absolute limits are applied. so the performance of the system when tracking a moving object does not differ much from the static measurements.

now, if we take into consideration that the speed of the marker was constant, we can assume that the coordinates (x, y) change as in (3), where ω is the constant angular velocity and ϕ is the initial angular offset. formula (3) is our ideal mathematical model of the real trajectory.
\[ (x(t), y(t)) = \left( x_0 + v_x t + r\cos(\omega t + \varphi),\; y_0 + v_y t + r\sin(\omega t + \varphi) \right) \qquad (3) \]

the difference between the trajectory given by formula (3) and the measured data is given by function (4). ideally, it equals zero.

\[ g(x, y, t) = \sqrt{(x_0 + v_x t + r\cos(\omega t + \varphi) - x)^2 + (y_0 + v_y t + r\sin(\omega t + \varphi) - y)^2} \qquad (4) \]

the best fit gives an angular velocity of -0.439 rad/s, which translates to a peripheral speed of 18.4 cm/s, and an angular offset of 3.163 rad. the negative velocity comes from the clockwise direction of the trajectory. the average fitting error is 2.8 cm and the standard deviation is 1.8 cm. since this result is much worse than the one from the trajectory path analysis, we conclude that the method is accurate for localisation within a frame, but that a larger margin of error occurs when the tracked object is moving, due to unsynchronised frame grabbing. indeed, when we calculated the travelled distances between successive sampled frames, we got 4.4 cm on average with a standard deviation of 0.5 cm. this is a large variance, considering that the marker speed was nearly constant. after calculating the momentary velocities, we found an average speed of 18.6 cm/s with a standard deviation of 0.6 cm/s. so generally, due to the variance in the exact moment of image capture, a velocity estimate based on only two samples is very rough. however, after filtering, this information is quite accurate.

3. time-of-flight localisation method

3.1. materials and method

a simplified block diagram of the time-of-flight distance measurement system is presented in fig. 5. there is a beacon that emits ultrasound on the left and a wall-mounted device on the right. the minimum number of wall devices necessary for successful trilateration is three. before the beacon fires a burst of waves, it notifies the wall device via the radio module, and the wall device starts its counter. when the wall device detects the emitted sound, it stops the counter. the time-of-flight information is then sent via radio.
the distance is calculated by multiplying the time of flight by the speed of sound. since the device is for indoor use only, changes in the speed of sound due to temperature variations are neglected. multiple ultrasonic transducers are used in both devices. the beacon covers 360 degrees horizontally and about 45 degrees vertically. the wall device covers about 140 degrees horizontally and 45 degrees vertically. therefore, proper redundancy is needed for a specific coverage. currently we use 4 wall devices placed in the corners of a rectangle and oriented toward the common centre. we made sure to do the measurements only in areas covered with more than 3 wall units. although the devices are cheap to make, this is only an initial accuracy test, and we find it irrelevant to have coverage of any preferred size or shape.

fig. 5 simplified block diagram of system: ultrasound emitting beacon on the left and time-of-flight measuring wall mount device on the right.

in order to overcome the problem of trilateration with a low-accuracy, but also low-cost, distance measuring system, we calculate the solution by minimising the sum of squares of the measurement errors. in the minimisation function

\[ f(p_r) = \sum_{i=1}^{n} \left( \lVert p_r - p_i \rVert - d_i \right)^2 , \qquad (5) \]

n represents the number of wall devices that responded to the ultrasonic beacon. the position vector of the beacon, pr, and the position vectors of the wall devices, pi, are defined in 3d with respect to some ground reference point. the vectors pi, where i runs from 1 to n, are known, as they are measured during the installation of the localisation system. the x and y axes are in the plane of the floor, while the z axis is oriented toward the ceiling. the measured distances di are obtained shortly after the beacon signal is emitted. the minimum of the function is located around the beacon's position. the function is equal to zero when no measuring error is present.
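function (5) and the time-of-flight conversion can be written down directly. the following python sketch is only an illustration: the speed of sound value, the wall device coordinates and the beacon position are assumptions made up for the example, and the names tof_to_distance and cost are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed constant (roughly dry air at 20 °C)


def tof_to_distance(tof_s):
    """Convert a measured time of flight in seconds to a distance in metres."""
    return tof_s * SPEED_OF_SOUND_M_S


def cost(p_r, wall_positions, distances):
    """Function (5): sum over responding wall devices of (|p_r - p_i| - d_i)^2."""
    total = 0.0
    for p_i, d_i in zip(wall_positions, distances):
        total += (math.dist(p_r, p_i) - d_i) ** 2
    return total


# Hypothetical setup: four wall devices in the corners of a 4 x 3 m rectangle,
# all mounted 2.5 m high, and a beacon at a height of 1.3 m.
walls = [(0.0, 0.0, 2.5), (4.0, 0.0, 2.5), (4.0, 3.0, 2.5), (0.0, 3.0, 2.5)]
beacon = (1.0, 2.0, 1.3)

exact = [math.dist(beacon, w) for w in walls]
noisy = [exact[0] + 0.10] + exact[1:]   # +10 cm error on the first device only

cost_exact = cost(beacon, walls, exact)
cost_noisy = cost(beacon, walls, noisy)
```

with exact distances the cost at the true beacon position is zero, and a single 10 cm error raises it to the square of that error. the function accepts any number of responding devices, which is why adding wall units does not change the formulation.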
otherwise, in the case of measurement errors, a small positional uncertainty will occur. when the noise in the measured data is random in nature, there is no possibility of narrowing down the solution search area, at least not statically.

in order to test this method, we created a wolfram mathematica script. it simulates a system of 3 or 4 wall devices and a beacon. the distance measuring error is randomly generated and added to the precise value. we set the x and y plane to correspond to the floor and the z axis to point to the ceiling. although this method allows finding the position of the beacon in 3d, we are more interested in keeping its height constant. this would be the most probable use case in mobile robotics. therefore, the script visualises the 2d plane at the fixed beacon height of z = 1.3 m, as in fig. 6. the possible beacon positions in that plane are circles, designated with thick circular arcs in fig. 6. note that both positive and negative measurement errors were introduced. the dot represents the calculated position, while the short lines that connect it to the arcs are the estimated measurement errors. the squares represent the projections of the wall devices onto the plane. they are also the centres of the circles. the lower left part contains a magnified detail around the dot.

fig. 6 a plane, where the z coordinate is constant at 1.3 m, that contains the calculated robot position, shown with a dot. the possible beacon positions in that plane according to the measured data are circles, shown partially with thick arcs. the short lines represent the estimated measurement error. the squares represent the projections of the wall devices onto the plane. they are also the centres of the circles. the zoomed detail around the solution point is presented at the bottom left.

visual checks were only used as an aid, for a better understanding of the behaviour of the solution in response to errors and device placement.
for example, the actual and calculated positions are identical when there is no measurement error. equal errors in all wall devices tend to cancel each other out.

numeric evaluation was done as well. we used the NMinimize function for minimisation. the available minimisation methods are nelder-mead [8], differential evolution [9], simulated annealing [10] and random search [11]. we ran all of them on the same data in order to compare them with respect to efficiency and accuracy. the wall devices were placed in a rectangular pattern at the same height, as they would commonly be used. we generated random beacon positions, calculated the accurate distances to the wall devices, and then added a gaussian error in the range of ±10 cm. the beacon location found by minimisation of function (5) was accurate enough, mostly below 5 cm of error. however, in some cases, the error went up to extremes of almost 20 cm. that occurs when two adjacent wall devices have the maximal error of +10 cm while the opposite two have -10 cm of error. the probability of this is rather low, and the general conclusion is that the method works quite well. it is robust to both positive and negative measurement errors. a solution exists independently of the number of wall devices. increasing their number to overcome temporary occlusion problems does not affect the solution calculation, neither in complexity nor in time.

the comparison of the results of the four minimisation methods showed no significant difference between them. the difference in accuracy was well below 1 cm. the same could be said about efficiency. so we chose nelder-mead for the practical implementation.

3.2. implementation and results

after successful validation of the method in wolfram mathematica, we wrote the c++ code. we chose to use the nelder-mead solver from the gnu scientific library. the program was run on the minnowboard computer with the non-commercial ubuntu os.
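to make the approach concrete, the following pure-python sketch minimises function (5) with a simplified textbook nelder-mead search (reflection, expansion, inside contraction and shrink). it is neither the mathematica script nor the gsl-based c++ program described here; the wall coordinates, the beacon position, the fixed error values and all function names are assumptions introduced for illustration, and the beacon height is held at 1.3 m so that only x and y are searched.

```python
import math


def nelder_mead(f, x0, step=0.5, tol=1e-12, max_iter=500):
    """Minimise f starting from x0 with a simplified Nelder-Mead simplex search."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):                       # initial simplex: x0 plus axis steps
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break
        # Centroid of all vertices except the worst one.
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_ref = f(reflected)
        if f_ref < values[0]:                # better than the best: try expanding
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_exp = f(expanded)
            if f_exp < f_ref:
                simplex[-1], values[-1] = expanded, f_exp
            else:
                simplex[-1], values[-1] = reflected, f_ref
        elif f_ref < values[-2]:             # better than the second worst: accept
            simplex[-1], values[-1] = reflected, f_ref
        else:                                # contract toward the centroid
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_con = f(contracted)
            if f_con < values[-1]:
                simplex[-1], values[-1] = contracted, f_con
            else:                            # shrink every vertex toward the best
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k_best = min(range(n + 1), key=lambda k: values[k])
    return simplex[k_best], values[k_best]


# Hypothetical geometry: four wall devices at 2.5 m, beacon at a height of 1.3 m.
WALLS = [(0.0, 0.0, 2.5), (4.0, 0.0, 2.5), (4.0, 3.0, 2.5), (0.0, 3.0, 2.5)]
BEACON_Z = 1.3
TRUE_XY = (1.0, 2.0)


def locate(distances, start=(2.0, 1.5)):
    """Estimate the beacon (x, y) by minimising function (5) at fixed height."""
    def cost(xy):
        p = (xy[0], xy[1], BEACON_Z)
        return sum((math.dist(p, w) - d) ** 2 for w, d in zip(WALLS, distances))
    xy, _ = nelder_mead(cost, start)
    return xy


exact = [math.dist((TRUE_XY[0], TRUE_XY[1], BEACON_Z), w) for w in WALLS]
errors = [0.05, -0.03, 0.02, -0.04]          # assumed distance errors in metres
est_exact = locate(exact)
est_noisy = locate([d + e for d, e in zip(exact, errors)])
err_exact = math.dist(est_exact, TRUE_XY)
err_noisy = math.dist(est_noisy, TRUE_XY)
```

the search is started from the centre of the assumed room. a solution is returned whether or not the distance spheres actually intersect, which is the property that motivates the criterion-based formulation over direct trilateration.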
the minnowboard is an open-source, 64-bit intel® atom™ based mini/embedded pc. initial tests were done with pre-calculated examples generated with the mathematica script. the execution time was about 1 ms on average. it sometimes reached 3 ms; however, this was not the only program running. nevertheless, we find this quite satisfactory. at the speed of a service type robot, this introduces a localisation error of less than one millimetre. delays in the distance measuring system are much greater, and position sampling is below 10 hz. if the execution time ever has to be reduced, this can be done by lowering the solver precision. we noticed that in most cases 10 to 20 iterations were enough to get the right position at centimetre resolution.

as with the visual feedback localisation in section 2, we initially verified the system through repeated measurements with the beacon fixed at an arbitrary position in the workspace. this verification helps in understanding the repeatability of the measurement and also gives reasonable confidence in usability for further implementation on a mobile robot. after a large number of consecutive measurements, we can confirm that the system is reliable and repeatable at an acceptable level. the beacon firing rate was fixed, with a period of 150 ms, which is a frequency of 6.67 hz. although we could set it up to 10 hz, we did not want to use it at its limits. a set of 1172 measurements at a fixed position is presented as a histogram in fig. 7. the average point is (213 cm, 169 cm) and the standard deviation is 0.62 cm, or 0.38 cm for the x axis data and 0.49 cm for the y axis data. overall, only 0.26%, or 3 points, are outside the ±1.5 cm accuracy region. these data show a satisfactory initial accuracy of the method. although it returns a slightly more scattered location than the camera based method, it works faster.

for dynamic testing of the ultrasonic localisation system, we did the same test as with the camera based localisation system. furthermore, we decided to do both tests in parallel.
this makes the comparative analysis easier. so the ultrasonic beacon was placed on the same platform as the marker. since the platform, which was in the centre of the test room, was making circular motions, both the marker and the beacon had the same centre of rotation. since the beacon must not occlude the marker, it was placed as close as possible to it. nevertheless, there was still a slight difference of almost 3 cm in their radii. the initial trajectory analysis confirmed a slightly lower localisation accuracy of this system compared to the camera based one. therefore, we decided to use the centre of rotation (x0, y0) calculated from the trajectory analysis of the camera based system, as well as the speed values (vx, vy), and to repeat the fitting process with (2). the best result gave r of 44.8 cm, an average error of -3e-14, and a standard deviation of 7.44 cm. this deviation is much higher than in the static test. it stems from the poor choice of rf modules for the system. these are low power zigbee modules. several studies indicate low performance of zigbee communication in the presence of wi-fi signals. this is nicely summarised in [12], where it is clearly stated that a wi-fi signal can corrupt a zigbee signal at the bit level or cause a drastic increase in retransmissions. since our setup room had one wi-fi router, and there were plenty more distributed in nearby offices, we noticed both effects. when we analysed the time of arrival of packets from a single wall device, we discovered that the variation in latency between packets is quite drastic. instead of packets arriving at the regular beacon firing interval of 150 ms, plus or minus the time of flight of ultrasound over up to 5 m, packets were buffered and arrived less than 30 ms apart.
since packets with distance information were not time stamped at the transmitter side, it was impossible to determine whether the wall device failed to transmit after one beacon firing or the measured distance information arrived after the following beacon firing. in such cases mixing of data occurred. this can also be interpreted as higher inaccuracy in distance measurement, which leads to a higher localisation error. at some rare moments, packets from an unknown wall unit address were received, which we interpret as obvious pollution of data. it is quite possible that the lower performance of the zigbee modules is also due to their quality, since they were among the cheapest on the market.

fig. 7 histogram of 1172 measurements at a single beacon point. the most often measured position is (213, 169) cm.

problems associated with zigbee modules could perhaps be overcome by using better and more reliable modules, and by implementing a better protocol for sending data over zigbee, as suggested in [12]. another solution could be to use modules that avoid the overcrowded 2.4 ghz band altogether. since we had already identified the problematic latency in our system, we skipped the second part of the trajectory accuracy analysis that we did with the camera based system, as it would not add any value to the results.

4. conclusion

we have implemented two methods for indoor localisation, and tested them against each other under identical conditions in our testing facility. after initial static testing and validation of the systems' accuracy with a laser range finder, we determined that the first method, the camera-based one, has better accuracy. although its localisation rate is half that of the time-of-flight method, we decided to use it as the reference system during dynamic testing. since mobile service robots have moderate speeds, the localisation rate of the visually based system is quite adequate.
the dynamic test showed that the ultrasonic based localisation system has lower accuracy and a lower measurement success rate, due to communication glitches of the zigbee modules that require additional attention and improvements. on the other hand, the first method has its own pitfalls. foremost among them is its sensitivity to changes in lighting conditions. it also requires a comprehensive calibration, which should be automated in order to make it an off-the-shelf localisation solution. the standard pc could easily be replaced with an embedded pc, for example, any of the newer raspberry pi series. nevertheless, both systems proved simple to set up and use. their low implementation cost makes them affordable for use in education and some less demanding real life applications, such as service robots. in conclusion, the camera-based system is better for laboratory conditions due to its high accuracy. the other system, although less accurate, is more suitable for a variety of other locations.

references

[1] m. petković, v. sibinović, d. popović, v. mitić, d. todorović and g. s. đorđević, “robust indoor localisation methods of mobile robots: direct visual feedback and time-of-flight trilateration”, in proceedings of the 3rd international conference on electrical, electronic and computing engineering (icetran 2016), zlatibor, serbia, june 13-16, 2016, eli2.6 1-6.
[2] "microsoft indoor localisation competition". research.microsoft.com. n.p., 2016. web. 15 apr. 2016.
[3] j. borenstein, et al., “mobile robot positioning sensors and techniques”, invited paper for the journal of robotic systems, special issue on mobile robots, vol. 14, no. 4, pp. 231–249, 1996.
[4] "the cricket indoor location system: an nms project". cricket.csail.mit.edu. n.p., 2016. web. 15 apr. 2016.
[5] n. b. priyantha, a. chakraborty and h. balakrishnan, “the cricket location-support system”, in proceedings of the 6th acm mobicom, boston, ma, august 2000.
[6] c.
hughes, et al., “wide-angle camera technology for automotive applications: a review”, iet intelligent transport systems, vol. 3, no. 1, pp. 19-31, 2009.
[7] z. garofalaki, et al., “object motion tracking based on color detection for android devices”, international journal of computer, electrical, automation, control and information engineering, vol. 9, no. 4, pp. 970-973, 2015.
[8] j. a. nelder and r. mead, “a simplex method for function minimization”, computer journal, no. 7, pp. 308–313, 1965.
[9] r. storn and k. price, “differential evolution a simple and efficient heuristic for global optimization over continuous spaces”, journal of global optimization, no. 11, pp. 341–359, 1997.
[10] s. kirkpatrick, c. d. gelatt jr and m. p. vecchi, “optimization by simulated annealing”, science, vol. 220, no. 4598, pp. 671–680, 1983.
[11] l. a. rastrigin, “the convergence of the random search method in the extremal control of a many parameter system”, automation and remote control, vol. 24, no. 10, pp. 1337–1342, 1963.
[12] c. m. liang, n. b. priyantha, j. liu and a. terzis, “surviving wi-fi interference in low power zigbee networks”, in proceedings of the 8th acm conference on embedded networked sensor systems, acm ny, 2010, pp. 309-322.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 209-226
https://doi.org/10.2298/fuee2302209g
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

a reliable routing mechanism with energy-efficient node selection for data transmission using a genetic algorithm in wireless sensor network

sateesh gudla1, 2, nageswara rao kuda3

1 dept. of computer science and engineering, jntuk, kakinada, ap, india
2 dept. of computer science and engineering, lendi institute of engineering and technology (a), jntu kakinada, vizianagaram, ap, india
3 dept.
of computer science and systems engineering, auce (a), andhra university, ap, india

abstract. energy-efficient and reliable data routing is critical in wireless sensor network (wsn) application scenarios. due to fluctuations of wireless links in adverse environmental conditions, packet loss may occur and sensed data may not reach the sink node. retransmission-based approaches are used to improve reliable data delivery, but they need a large number of data transmissions, which increases energy usage and packet delivery delay. an energy-efficient data collection approach based on a genetic algorithm is proposed in this paper to determine the most energy-efficient and reliable data routes in wireless sensor networks. the proposed algorithm reduced the number of data transmissions, energy consumption, and packet delivery delay, and increased the network lifetime. simulation results demonstrated the efficacy of the proposed method with respect to energy consumption, network lifetime, number of data transmissions, and average delivery delay.

key words: genetic algorithm, energy-efficient routing path, data transmissions, lifetime, wireless sensor networks

received august 29, 2022; revised november 26, 2022, december 02, 2022 and december 24, 2022; accepted january 18, 2023
corresponding author: sateesh gudla, department of computer science and engineering, jntuk, kakinada, ap, india
e-mail: sateesh.research@gmail.com

1. introduction

wireless sensor networks (wsns) are networks of many static or mobile sensors that use self-organization and multi-hop communication. wsns use cooperative sensing, collecting, computation, and transmission to cover the data of sensing objects in a specific region and then deliver it to the sink node or base station. wsn emerged as a significant paradigm for the rapid collection of data because of its widespread use in time-critical applications such as tsunami warnings, chemical assault detection, forest fire detection, and infiltration detection in military surveillance [1, 2, 3], etc. considering the significance of the data acquired, many of these applications require reliable information quickly. a sensor network's primary purpose is information dissemination, but this cannot take place if essential data is lost due to unforeseen node failure or the unpredictable nature of a wireless communication channel. hence, achieving reliable data delivery is a challenging issue. on the other hand, the wireless sensor node is built on tiny circuits, and the sensors are low-cost and power-restricted [4]; energy is a critical resource since it affects the lifetime of individual sensor nodes and the entire network. hence, current wsn routing techniques focus on discovering an efficient route to the sink to reduce energy usage and prolong the network lifetime. it is therefore challenging in wsns to deliver energy-efficient, reliable, low-latency data routing. reliable data delivery can be obtained by reducing packet drops in data communication. a packet could be lost for several reasons, including poor connectivity, overflowing buffers, or a node running out of energy. retransmission and redundancy are two standard methods used in wsns to ensure network reliability. network energy usage and packet delivery delays are exacerbated by retransmission-based techniques because the number of transmissions increases [5]. since a bit lost within a packet can be retrieved using a coding scheme, the research community needs to pay more attention to employing redundancy to achieve reliability in wsns. if packets could be repaired to recover any lost or incorrect bits, the transmission overhead created by retransmitting an entire packet would be significantly reduced.
however, selection of any node as a successor node in the routing path plays a vital role in designing an energy-efficient routing protocol. the packet dropping rate may be reduced by selecting a resourceful (which has the highest residual energy, free available buffer, good link quality, and less distance) node as the successor node. the current work aims to design energy-efficient and reliable data-gathering algorithms in wsns while adhering to strict data delivery time constraints by considering parameters such as residual energy, buffer capacity, link quality, and distance between nodes. in recent years, there has been considerable research to find the solution to reliable communication between sensor nodes while managing their energy consumption [6]. the primary keys in this area are addressed by familiar optimization techniques inspired by swarm intelligence, fuzzy logic system, heuristic search algorithms, reinforced learning approach, and genetic algorithms (ga). genetic algorithm (ga) is an evolutionary heuristic, stochastic optimization algorithm that learns about its universe by analyzing data to eliminate wrong solutions and increase the number of excellent ones. it discovers near-optimal solutions quickly and heuristically for either massive or substantially smaller populations [7]. the literature states that ga provided optimized solutions for node placement, network coverage, clustering, data aggregation, routing, etc, in wsns. hence, ga is considered in the present research for developing a heuristic approach-based efficient and reliable routing mechanism. the proposed research's novelty lies in developing a heuristic approach-based energyefficient and reliable routing mechanism using a genetic algorithm that ensures the reduction of packet drops by selecting a resourceful node as a successor node in the route. 
the authors proposed a genetic algorithm with a roulette wheel selection process, value encoding, and fitness function evaluation using the quality of the link. the network parameters considered to state the resourcefulness of a node in the evaluation of the proposed algorithm are residual energy, available buffer of the node, link quality, and distance. the above approach demonstrated a decrease in retransmissions, packet delivery delays, and energy consumption, and an increase in the network lifetime.

the rest of this document is structured as follows. section 2 discusses related work and motivation. the proposed mechanism is described in section 3. the chromosome representation is presented in section 3.2.1, initialization of the population is shown in section 3.2.2, the fitness function is defined in section 3.2.3, section 3.2.4 covers the selection of chromosomes, the crossover and mutation are presented in sections 3.2.5 and 3.2.6, respectively, and the repairing of chromosomes is explained in section 3.2.7. performance is evaluated using simulations in section 4. finally, in section 5, we summarise our findings.

2. related work and motivation

this part of the manuscript provides an overview of existing works on network lifetime improvement and energy-efficient and reliable routing algorithms. celimuge wu et al. [5] proposed a redundancy-based technique for reliable and rapid data collection in sensor networks to achieve high reliability and minimal end-to-end delay. the proposed protocol employs a network-coding-based strategy to increase packet redundancy when a link is defective or when there is a strict end-to-end latency requirement. the protocol automatically modifies the redundancy level to suit the application requirements and the link failure rate. fatma h. elfouly et al.
[6] used swarm intelligence to propose a routing scheme that extends the lifetime of wsns while reducing energy consumption per node. in addition, this model accounts for data reliability by guaranteeing that the sensed data will arrive at the sink node consistently. finally, taking the buffer size into account reduces packet loss and the energy spent on retransmitting packets. mahmood et al. [8] addressed several reliability schemes that use retransmission and redundancy techniques to recover lost data through either hop-by-hop or end-to-end procedures. they analyzed these schemes by looking into the optimal mix of these techniques, methods, and desired reliability levels to propose an efficient mechanism for resource-constrained wsns. bhardwaj et al. [9] discussed the factors affecting the lifetime of wsns. they determined that one of the most critical issues in wireless sensor networks is how long a network can survive if each node is limited in energy consumption. using an evolutionary genetic algorithm, bhatia et al. [10] suggested an approach termed gada-leach, which seeks to enhance ch selection in the conventional leach routing protocol for wsns and to facilitate communication between the cluster head (ch) and the base station (bs) by considering a relay node. wang et al. [11] proposed a multipath routing technique for wireless sensor networks that uses genetic algorithms to boost the network's fault tolerance and reduce energy consumption. they used only the distance between nodes to compute the genetic algorithm’s fitness function. muruganantham et al. [12, 13] presented a simulation-based analysis of classic and genetic-based routing strategies to examine the performance of a wireless sensor network. the genetic approach has been shown to enhance the wsn’s lifespan in the presence of faulty nodes. mujtaba romoozi et al. [14] explored innovative strategies for node positioning to reduce power consumption without compromising coverage.
they proposed a genetic algorithm-based node positioning in wireless sensor networks to optimize energy consumption and extend the network’s lifetime. t. abirami et al. [15] used a genetic algorithm to improve the network lifetime by creating spanning trees for data aggregation, which are stable and economical in power consumption. hasanien ali talib et al. [16] presented a honey-bee optimization with a genetic algorithm approach to developing a system for sharing data among individual nodes in a one-to-one network setup. ioana apetroaei et al. [17] applied genetic algorithms for routing protocols in wsns. to improve fnd, hnd, and lnd, b. baranidharan et al. [18] presented a new clustering technique called the genetic algorithm based energy-efficient clustering hierarchy (gaech). their fitness function is computed by considering the most critical aspects of a cluster. ajay khunteta et al. [19] designed an approach using a genetic algorithm with leach protocol for cluster head selection in wsn to mitigate energy consumption. trong-the nguyen et al. [20] proposed an approach based on a genetic algorithm with self-configuration chromosomes in the cluster formation of a sensor network. m. k. somesula et al. [21] established contact duration-aware cooperative cache placement using a genetic algorithm-based heuristic search technique for practical scenarios to improve the hit and acceleration ratios.

table 1 summary of existing work vs proposed work

| reference | algorithms / techniques used | parameters used | comparison of the proposed work with related work |
|---|---|---|---|
| proposed work | genetic algorithm with roulette wheel selection, value encoding, and route quality as the fitness function. | available buffer, residual energy, link quality, and distance. | ga-based route construction with resourceful successor nodes in the path by considering available buffer, residual energy, link quality, and distance. the proposed work significantly decreased packet drops, improved the network’s lifetime, reduced energy consumption and packet delivery delay, and decreased the number of transmissions. |
| [5] | network coding assisted redundancy, with the redundancy level adapted to link quality. | link quality. | a network coding-based approach to improve packet redundancy when a link is unreliable, or there is a strict end-to-end delay requirement. it considers only link quality to update redundancy levels. available buffer, residual energy, and distance are also important parameters for further reducing the packet dropping rate. |
| [11] | genetic algorithm where the fitness function is determined using distance. | distance between nodes. | multipath routing using a genetic algorithm with only distance as a parameter. available buffer, residual energy, and link quality, which play a vital role in reducing packet drops, are not considered in routing decisions. |
| [26] | fuzzy approach and the a-star algorithm. | residual energy, traffic load, hop count. | a-star routing approach based on fuzzy logic to evaluate node weight using residual energy, traffic load, and hop count. the node’s available buffer is not considered, which may lead to packet dropping. |

park et al. [22] introduced a scalable architecture to achieve reliability in downstream data delivery efficiently by considering the unique characteristics of wireless sensor networks to ensure the reliability of data transfer from sensing devices to base nodes. le et al. [23], with the help of a statistical reliability metric, suggested an energy-efficient and reliable transport protocol (ertp) to minimize the number of retransmissions in wsns. ertp guarantees that more than enough data packets are sent to the sink. d. jiang et al.
[24] described a multi-constraint routing strategy for smart city applications using load balancing to achieve significant energy efficiency. lee et al. [25] investigated cluster-based wireless sensor networks’ upper bound on network lifespan by addressing a solution to the energy hole problem with spatial correlation in networked clusters. alshawi et al. [26] suggested a new routing strategy for wsns that combines the fuzzy approach and the a-star algorithm to increase the network's lifetime. the concept seeks to find the best route from the source to the destination, prioritizing routes with the most available energy, the fewest possible hops, and the low traffic loads. the redundancy-based technique for reliable and rapid data collecting in sensor networks employed a network-coding-based strategy to increase packet redundancy when a link is defective or when there is a strict end-to-end latency requirement. the ga-based technique used only distance as a parameter to choose which routing paths to explore. but when selecting the most efficient routes for data transmission, it is crucial to consider the network link quality, remaining energy, and available buffer, as these are the primary factors that cause packet loss. the network's lifetime increases when packet loss decreases due to fewer transmissions. in this work, a data routing strategy with parameters remaining energy, the available node buffer, link quality, and distance based on a genetic algorithm that is both energyefficient and reliable has been proposed. this study's significance resides in using a genetic algorithm with a roulette wheel selection process, value encoding, and fitness function evaluation using the quality of the link. the resulting route ensures a selection of a resourceful node as a successor node in the routing path to minimize the number of endto-end data transmissions, energy consumption, packet delivery delay, and an improved network lifetime. 2.1. 
motivation obtaining accurate data collection (packet level reliability) with carefully enforced end-toend delay requirements is a substantial difficulty in several time-critical applications of wireless sensor networks, such as tsunami warnings, chemical assault detection, forest fire detection, and infiltration detection in military surveillance. hence apart from improving the lifetime of the wsn, reliable data should be delivered within the time boundaries. retransmission and redundancy are two standard methods used in wsns to ensure network reliability. even though the retransmission approach achieves reliable data transfer as it consumes more energy, it is unsuitable for wsns. the research community has placed less focus on using redundancy to achieve reliability in wsns since, in redundancy-based reliability mechanisms, a bit lost within a packet can be recovered by adopting some form of the coding scheme. the transmission overhead generated by retransmitting a whole packet would be drastically reduced if packets could be repaired to restore any lost or malformed bits. so that, to meet the challenges here, we need to build a route such that the nodes in the route should be strong enough to mitigate packet drops. the primary causes of packet drops are bad connectivity, overflowing buffers, or a node running out of energy. 214 s. gudla, k. nageswara rao the genetic algorithm (ga) is well-known in stochastic optimization because it employs the idea of natural evolution to develop optimization solutions. thus, a genetic algorithm is suitable for representing and resolving many complex issues. it is even proved that ga is used in wireless sensor networks to improve the positioning of nodes, the extent to which a network is covered, the organization of clusters, and the collection and aggregation of data. hence, in this paper, an energy-efficient and reliable data routing solution using a genetic algorithm has been proposed to tackle the stated challenges in wsns. 
the proposed technique improves the network's lifetime and minimizes the number of data transmissions while achieving reliable data delivery and strict packet delivery delay. the proposed mechanism also considered the available buffer, residual energy, and link quality while selecting the data routing paths. the primary objective of this study is (a) to design a wsn data routing method based on a genetic algorithm that is both energy-efficient and reliable and (b) evaluation of the proposed algorithm's performance using simulation results. 3. proposed mechanism the proposed research work develops an energy-efficient and reliable routing mechanism using the genetic algorithm with a roulette wheel selection process and value encoding; by considering the sensor node’s residual energy, available buffer, link quality, and distance. this section of the manuscript describes the proposed work and associated methodologies genetic algorithm-based routing scheme, chromosome representation, initialization of population, fitness function, selection, recombination (crossover), mutation, and repair. 3.1. network model as indicated in fig.1, a wsn has been considered as a graph g (v, e), where v is a set of vertices of the graph representing a set of sensor nodes of the wsn, and e is the set of edges of the graph illustrating a set of wireless communication links of wsn. each node is shown in fig.1. with associated residual energy ‘e’ and buffer availability ‘b.’ using multi-hop communication, a sensor node gets the data and directs it to the sink node. retransmissions were expected to be repeated until all the packets arrived at the sink node. fig. 1 network diagram a reliable routing mechanism with energy-efficient node selection for data transmission 215 3.2. genetic algorithm-based routing scheme the genetic algorithm (ga) is a stochastic optimization tool based on natural evolution [21,32,33,35]. 
ga was found to be suitable for parallel optimization [29]. ga is an iterative method in which each round is referred to as a generation. ga is a population-based method that considers all possible individuals. ga derives individuals randomly from a specified population, and these individuals are encoded into genetic form. every chromosome can be viewed as a single string or an array of genes covering a piece of the solution. alleles are different variations of a gene’s value. the evolution of encoded individuals is accomplished by continuing the processes below until the termination conditions are met.
▪ the fitness function identifies the fittest members with the best fitness levels, and these most qualified individuals are chosen as parents for the following generation.
▪ the selected parents are subjected to genetic operators (crossover and mutation) to generate new offspring from the existing ones.
the chromosomes for the following generation are chosen after the population has undergone crossover and mutation. some of the worst performers of the current generation may be replaced, in the same proportion, by the best performers of the previous generation to ensure that the current generation is at least as fit as the preceding one. this is referred to as elitism. this process is continued until the algorithm’s halting condition is fulfilled. based on this survival of the fittest concept, we present a genetic method for energy-efficient routing in wsns in algorithm 1.
algorithm-1: genetic algorithm for routing
input: ‘p(n)’: size of population; ‘cp’: crossover-probability; ‘mp’: mutation-probability; ‘g’: number of iterations
output: routing path from sensor node to sink node
1: for all nodes n ∈ N, do
2:   t=0;
3:   generate the initial population by randomly initializing p(n);
4:   repair the randomly generated population p(n);
5:   evaluate the fitness of the individuals in the population p(n);
6:   store the best solutions of p(n) in old b(m);
7:   while t < g do
8:     choose individuals from p(n) as parent chromosomes (i.e., selection);
9:     perform the crossover on the selected individuals to produce new offspring (i.e., recombination);
10:     perform mutation on the new offspring based on mp;
11:     repair the individual chromosomes produced after mutation;
12:     evaluate the fitness of the individuals in the new population;
13:     store the best fitness individuals of p(n) in new b(m);
14:     if fit(old b(m)) > fit(new b(m)) then
15:       new b(m)=old b(m);
16:     end if
17:     old b(m)=new b(m);
18:     find the individual with the worst fitness value in p(n) and replace it with new b(m);
19:     t=t+1;
20:   end while
21: end for

ga aims to discover a strategy for gathering data in wsn that maximizes the network’s lifetime, given a set of intermediate nodes and a base station with their coordinates. every data collection period is referred to as a round, and the lifetime is the number of rounds completed before the first intermediate node runs out of energy. we further assume that each relay node has the same initial energy.

3.2.1. chromosome representation

in the initial population, every chromosome represents a feasible solution. a sequence of positive integers indicating the ids of sensor nodes in a route from the source to the destination is called a chromosome [28]. the structure of the initial routing path depends on the position of the intermediate nodes, which can forward the content from the source to the destination (fig. 2).
the first locus gene is usually given the location of the source. the chromosome size is sensitive, but it must stay within the allowable level, the number of nodes in the network because the channel will never have more than the value. based on virtual network information (route table), the chromosome (path) represents a problem by assigning node ids from its source location to the destination. the first locus's gene encodes the source node, and the nodes linked with the source node designated by the front gene's allele are picked randomly or heuristically by the succeeding locus's gene. a selected node is removed from the topological information database to prevent it from being selected again. this procedure will be repeated until the destination node has been reached. it is worth noting that encoding is only possible if each route step is carried out over an actual network link. to represent a chromosome, we used the ids of nodes in the routing sequence from source to destination. the same is encoded, as shown in fig.2. there are different encoding schemes from the literature [7], such as binary encoding, octal encoding, hexadecimal encoding, permutation encoding, value encoding, and tree encoding. in the proposed work, the chromosome is a sequence of node ids of nodes in the route from the source node to the destination node. hence value encoding is considered to avoid complexities and further conversions. fig. 2 encoding scheme consider an example of a chromosome from source to destination. the chromosome is a set of nodes along the formed path (s1 − s4 − s8 −s11 − sink), as shown in fig. 1. the chromosome length is denoted as the count of nodes (genes) in the constructed route. 3.2.2. initialization of population population initialization of genetic algorithm assumes two points: the process to initialize the population and its size. 
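the value encoding of section 3.2.1 and a random construction of routes can be sketched in python as follows. this is a minimal illustration, not the authors' implementation: the adjacency map `LINKS`, the helper names `random_route` and `init_population`, and the retry-on-dead-end policy are all assumptions introduced here; only the example route s1 − s4 − s8 − s11 − sink is taken from the text.

```python
import random

# hypothetical topology: node id -> ids reachable over one wireless link
LINKS = {
    "s1": ["s2", "s4"],
    "s2": ["s1", "s5"],
    "s4": ["s1", "s8"],
    "s5": ["s2", "s8"],
    "s8": ["s4", "s5", "s11"],
    "s11": ["s8", "sink"],
    "sink": ["s11"],
}


def random_route(links, source, sink, rng):
    """Build one chromosome: a loop-free list of node ids from source to sink.

    Returns None when the random walk runs into a dead end.
    """
    route = [source]
    visited = {source}  # a selected node may not be selected again
    while route[-1] != sink:
        candidates = [n for n in links[route[-1]] if n not in visited]
        if not candidates:
            return None
        nxt = rng.choice(candidates)
        route.append(nxt)
        visited.add(nxt)
    return route


def init_population(links, source, sink, size, rng=None, max_tries=1000):
    """Collect `size` random routes, retrying whenever a walk dead-ends."""
    rng = rng or random.Random()
    population = []
    tries = 0
    while len(population) < size and tries < max_tries:
        tries += 1
        route = random_route(links, source, sink, rng)
        if route is not None:
            population.append(route)
    return population


if __name__ == "__main__":
    for chromosome in init_population(LINKS, "s1", "sink", 4, random.Random(1)):
        print(" - ".join(chromosome))
```

because every step follows an existing link and visited nodes are excluded, each generated chromosome is a valid, loop-free route, so its length never exceeds the number of nodes in the network.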
to develop sensible solutions, it was believed that the size of the population needs to grow significantly with the complexity of the problem. however, recent research has demonstrated that satisfactory results can be produced with substantially smaller populations. to conclude, having a large population is beneficial, but it comes with a high cost in terms of memory and time [35, 36]. as one might expect, determining an appropriate population size is critical for effectiveness. furthermore, the initial population can be generated by random or heuristic initialization. with heuristic initialization, the population lacks diversity, so only a tiny portion of the solution space is examined and the global optimum is never reached. hence, random initialization is used in this study to construct the initial population using the encoding approach described in section 3.2.1.

3.2.3. fitness function

the fitness function analyses chromosomes regarding the physical description and assesses their viability in the solution based on desirable features. the fitness function should precisely estimate the quality of the chromosomes in the population; hence, it is crucial. in our work, the fitness function is the number of data transmissions (including ack and retransmissions) needed to transmit the packet to the sink node and is defined as follows.

$$ N = \left(1 + \sum_{x=1}^{nr} g_x\right) + \left(1 + \sum_{x=1}^{nr} g_x\right) \cdot p \qquad (1) $$

where $g_x = (1 - p^h)^x$, 'N' signifies the total number of transfers (including ack and data transfer), ‘nr’ represents the largest number of retransmissions (‘nr’ = 8), ‘p’ represents the probability of successful link transfer, ‘h’ represents the number of hops between the source and destination, and $g_x$ represents the probability that a node will not get an ack for the x-th data transmission.
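as an illustration, the fitness of a route can be evaluated numerically as in the python sketch below. it assumes that equation (1) reads N = (1 + Σ g_x) + (1 + Σ g_x)·p with g_x = (1 − p^h)^x, and that every hop has the same link success probability p; the function names are ours and do not come from the paper.

```python
def no_ack_probability(p: float, hops: int, x: int) -> float:
    """g_x: assumed probability that no ack arrives for the x-th transmission."""
    return (1.0 - p ** hops) ** x


def expected_transmissions(p: float, hops: int, nr: int = 8) -> float:
    """Fitness N of a route: expected data plus ack transmissions per packet.

    p    -- probability of a successful transfer over one link
    hops -- number of hops between source and sink
    nr   -- largest number of retransmissions (8 in the text)
    """
    # first part of the equation: one transmission plus expected retransmissions
    data = 1.0 + sum(no_ack_probability(p, hops, x) for x in range(1, nr + 1))
    # second part: ack transfers per data packet
    ack = data * p
    return data + ack


if __name__ == "__main__":
    for hops in (2, 4, 6):
        print(hops, round(expected_transmissions(0.9, hops), 3))
```

under these assumptions a route with perfectly reliable links costs exactly two transmissions (one data packet and one ack), and the cost grows with the hop count, which is why shorter routes over better links obtain a lower transmission count.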
in eq. (1), the first part specifies the data transfer value, and the second part specifies the ack transfer value per packet of data.

3.2.4. selection

the selection (reproduction) operation aims to increase the population's average quality by increasing the possibility of high-quality chromosomes being transferred to the subsequent generation [35,38]. as a result of the selection, the exploration is focused on profitable regions in the solution space.

algorithm-2: selection process
1: compute the selection probability of each individual using eq. (2);
2: for all generations g ∈ G, do
3: ci=0; /*chromosome index*/
4: p(r)=0; /*accumulation probability of roulette*/
5: while p(r) < random(0,1) do
6: ci=ci+1;
7: p(r)=p(r)+ps(ci);
8: end while
9: selected individual=ci;
10: end for

selection schemes are characterized by selection pressure, the ratio of the probability of selecting the best chromosome in a population to the probability of selecting an average chromosome. under high selection pressure, the population converges swiftly, although genetic diversity is inevitably sacrificed. various approaches for selection are available in the literature [45]: proportionate selection (roulette wheel selection), stochastic selection, tournament selection, and truncation selection. roulette wheel selection is adopted for the proposed genetic algorithm because it is simple to understand and to code correctly, and because it has a strong bias toward selecting the fittest elements. hence the proposed work uses the roulette wheel approach to choose the elite individuals. here individuals are picked according to a selection probability derived from the fitness function. the selection probability of an individual is defined as:

$$p_s(j) = \frac{fit(j)}{\sum_{j \in n_{pop}} fit(j)} \qquad (2)$$

where 'fit(j)' represents the fitness value of j.

3.2.5. recombination (crossover)

crossover (recombination) blends the genetic information of existing chromosomes (parents) to produce new chromosomes (children). fig. 3 shows the crossover process. in the routing problem, recombination is defined as swapping the partial routes of two individuals so that each child produced by the crossover represents a single route.

fig. 3 crossover example

as a result, the one-point crossover is a suitable method for the proposed ga. one partial route connects the source node to a relay node, and the other partial route connects the relay node to the destination node. the recombination of two dominant parents picked through the selection process (algorithm 2) increases the likelihood of generating children with dominant characteristics. the one-point crossover used in the proposed ga differs from the traditional one-point recombination scheme. since the problem deals with routing, a solution must contain the sequence of nodes that form a path from the source to the target node. hence, the two chromosomes being recombined must share at least one common node (gene) other than the source and destination nodes. however, the common gene need not occupy the same position in both chromosomes. algorithm 3 shows the steps to perform the crossover operation.

3.2.6. mutation

the mutation operation either flips a randomly chosen gene, based on the mutation probability, or modifies it explicitly, and it causes a slight bias. fig. 4 shows mutation. mutation helps the search reach the global optimum by escaping local optima. in the suggested scheme, flipping a gene may produce a partial and incomplete path. hence, when a random node is picked, we identify the nodes that are connected to both the node before and the node after the chosen node.
the chosen node is replaced with one of the common nodes identified. algorithm 4 specifies the steps involved in the mutation process.

fig. 4 mutation example

algorithm-3: crossover process
input: each chromosome contains a variable number d of genes, e.g., ci={g1,g2,···,gd}; d: length of the chromosome
output: cm, cn: new chromosomes after crossover
1: for all chromosomes do
2: pick two chromosomes (ci,cj) from the given population;
3: cps=find the set of crossing point pairs by calling common(ci,cj);
4: if length(cps) ≥ 1 then
5: if cp > random[0,1] then
6: ex=choose randomly a pair from cps;
7: produce the two new routes by switching the partial routes of ci and cj with each other (i.e., cm=(ci[1] to ci[ex[1]])+(cj[ex[2]+1] to cj[d]) and cn=(cj[1] to cj[ex[2]])+(ci[ex[1]+1] to ci[d]));
8: end if
9: end if
10: end for

algorithm-4: mutation process
1: if mp > random[0,1] then
2: randomly choose a gene ci[m] (i.e., a node in the routing path);
3: identify the previous and next genes to ci[m] (i.e., ci[m−1] and ci[m+1]) in the chromosome;
4: choose a node randomly from the list of nodes that are common to ci[m−1] and ci[m+1];
5: end if

3.2.7. repairing

recombination could produce infeasible individuals that violate the proposed buffer and residual energy availability constraints. crossover operations may even generate loops. each chromosome produced after crossover and mutation should be feasible. hence, violations must be repaired to make infeasible chromosomes feasible. first, the nodes involved in a looping path should be removed to eliminate loops in the routing path. second, each node should have buffer availability (ab) and residual energy (re) greater than the given thresholds; otherwise, using those nodes would result in a lost path or dropped packets.
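the crossover at a common node (algorithm 3) and the loop-removal step of the repair described above can be sketched together in python. this is an illustrative sketch, not the authors' code: it uses 0-based list indices instead of the 1-based indices of algorithm 3, assumes each parent route is loop-free, and the names `common_points`, `crossover`, and `remove_loops` are hypothetical.

```python
import random

def common_points(ci, cj):
    """Index pairs (a, b) with ci[a] == cj[b], ignoring source and destination."""
    pos_in_cj = {node: b for b, node in enumerate(cj[1:-1], start=1)}
    return [(a, pos_in_cj[node])
            for a, node in enumerate(ci[1:-1], start=1) if node in pos_in_cj]

def crossover(ci, cj, cp=1.0, rng=random):
    """Swap the partial routes of two parents at a randomly chosen common node."""
    cps = common_points(ci, cj)
    if not cps or rng.random() >= cp:     # no crossing point, or crossover skipped
        return list(ci), list(cj)
    a, b = rng.choice(cps)
    cm = ci[:a + 1] + cj[b + 1:]          # head of ci, tail of cj
    cn = cj[:b + 1] + ci[a + 1:]          # head of cj, tail of ci
    return cm, cn

def remove_loops(route):
    """Cut out every cycle so that each node appears at most once."""
    cleaned, position = [], {}
    for node in route:
        if node in position:              # node revisited: drop the cycle body
            cut = position[node]
            for dropped in cleaned[cut + 1:]:
                del position[dropped]
            cleaned = cleaned[:cut + 1]
        else:
            position[node] = len(cleaned)
            cleaned.append(node)
    return cleaned

parent1 = ["s", "a", "b", "d"]
parent2 = ["s", "c", "b", "e", "d"]
child1, child2 = crossover(parent1, parent2, cp=1.0, rng=random.Random(0))
```

in this toy case the only common relay node is `b`, so the children are the two routes obtained by exchanging the tails after `b`. a child that revisits a node would then be passed through `remove_loops` before the buffer and energy checks.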
therefore, each gene should be evaluated in the repairing process to determine whether its 'ab' and 're' are sufficient. if a gene does not meet the criteria, the gene (node) in position 'g' is replaced with one of the nodes that are common to its previous and next nodes and that have available buffer greater than τab and residual energy greater than τre. the repairing process is presented in algorithm 5.

algorithm-5: repairing process
1: for all chromosomes, do
2: if chromosome is infeasible (i.e., loops in routing path), then
3: identify nodes that form a cycle;
4: remove the cycle;
5: end if
6: for g ∈ ci do
7: if (gab < τab) or (gre < τre), then
8: sabcomm=nodes that are common to the previous and next nodes to a node in position g having available buffer greater than τab and residual energy greater than τre;
9: replace g with a node randomly chosen from sabcomm;
10: end if
11: end for
12: end for

4. evaluation of performance

the proposed data collection method has been simulated in network simulator-3 (ns3) [46] and compared with the retransmission-based strategy and multipath routing using ga in terms of power consumption, transmission rate, packet delivery delay, and network lifetime. to ensure the most reliable data transfer, each packet is repeatedly transferred in the simulation until it reaches the sink node. in this paper, a data collection round is a round in which each node senses a packet and sends it to the sink node.

4.1. simulation setup

in the simulation, the number of nodes in the network is between 50 and 500, the initial energy of a node is 25 kj, and the buffer capacity is 2.5 kb. the link quality is assigned a value between 0 and 1. table 2 describes the simulation parameters.
table 2 parameters for the simulation

| parameter | value |
| number of nodes | 50 to 500 nodes |
| transmission range | 40 meters |
| node's initial energy | 25 kj |
| packet size | 960 bits |
| es = α3 | α3 = 50 × 10^-9 joules/bit |
| er = α12 | α12 = 0.787 × 10^-6 joules/bit |
| et = α11 + α2·d^n | α11 = 0.937 × 10^-6 joules/bit; α2 = 10 × 10^-12 joules/bit/meter^2; d = 85 meters |

4.2. results and discussions

to analyze the network, the metrics considered are energy consumption, average delivery delay of a packet, the number of retransmissions, and network lifetime in terms of the first node dying round and half nodes dying round. the packet delivery delay, also known as latency, refers to the time a data packet takes to travel from one node to another. the average packet delivery delay for the three approaches is shown in fig. 5. it is observed from fig. 5 that, compared to the other two techniques, the retransmission-based strategy has the highest latency due to more retransmissions. the distance-based ga algorithm considered only the distance as the parameter for finding the route, whereas our proposed work also considered link quality, residual energy, and available buffer while finding the data routing paths.

fig. 5 performance comparison between the average packet delivery delay

fig. 6 compares the number of transmissions among the distance-based ga algorithm, the retransmission-based approach, and the proposed energy-efficient and reliable ga-based data gathering mechanism. the number of retransmissions in the network increases in proportion to the packet loss rate. each dropped packet is retransmitted in the simulation to achieve high data delivery reliability. as seen in fig. 6, our approach minimizes the number of retransmissions compared to the other two mechanisms. our mechanism identifies resourceful data routing paths to reduce total data transmissions.
the fitness function of the proposed ga-based approach considers the link quality while choosing the routing paths. hence, the proposed algorithm selects routing paths with good link quality; thereby, it minimizes the rate of packet dropping and reduces the number of retransmissions. the proposed mechanism also considered the residual energy and available buffer parameters in the genetic algorithm.

fig. 6 performance comparison between the number of data and ack retransmissions

energy consumption is the energy needed to send data from one node to another (transmission energy). fig. 7 shows the network's energy consumption for the proposed technique, the retransmission method, and the distance-based ga mechanism. the number of retransmissions is higher in the retransmission-based approach than in our proposed mechanism, so the network consumes more energy under the retransmission-based approach. the distance-based ga mechanism considered only distance as a parameter while constructing a route. hence, in contrast to the retransmission method and the distance-based ga mechanism, the proposed mechanism used less energy. the proposed ga-based mechanism considered the link quality as a parameter, which reduces the number of transmissions and results in reduced energy consumption.

fig. 7 performance comparison between network energy consumption

network lifetime is defined differently by various researchers. here the lifetime of a network is measured in terms of how long it takes for some predetermined fraction of sensors to die from lack of energy, such as the first node dying round and the half nodes dying round. fig. 8 and fig. 9 show the network lifetime in terms of the first dead node and half of the nodes dead, respectively. a half-node dead round is a data collection round in which half of the total number of nodes dies.
the proposed approach has improved network lifetime compared to the retransmission method and the distance-based ga algorithm.

fig. 8 performance comparison between first node dead round

the proposed technique provides the optimal data routing path in terms of available energy, available buffer, and quality of the link. thus, the packet loss rate is minimized. fig. 8 and fig. 9 show that as the node count increases, the network lifetime decreases due to increased network energy consumption. in comparison to existing techniques, the proposed mechanism exhibited considerable longevity improvement.

fig. 9 performance comparison of half nodes dead round

table 3 shows that the proposed method is an improvement over the retransmission-based strategy, the distance-based genetic algorithm, and the a-star-based approach. this is because the proposed work developed an energy-efficient and reliable data delivery routing using a genetic algorithm with a roulette wheel selection approach and value encoding for encoding chromosomes, considering parameters such as the node's current energy state, distance, link quality, and available buffer. compared with the related works in table 3, the delay in packet delivery is reduced, the network's lifetime is extended, and the energy consumption is reduced.

table 3 comparison of results with existing works

| reference | number of nodes | average packet delivery delay (seconds) | first node died round | half number of nodes died round |
| [5] | 50 | 0.24 | 15 | 750 |
| [11] | 50 | 0.12 | 19 | 880 |
| [26] | 50 | 0.13 | 16 | 860 |
| proposed work | 50 | 0.11 | 20 | 890 |
| [5] | 200 | 0.49 | 8 | 670 |
| [11] | 200 | 0.34 | 10 | 760 |
| [26] | 200 | 0.22 | 11 | 755 |
| proposed work | 200 | 0.21 | 12 | 780 |
| [5] | 300 | 0.67 | 5 | 620 |
| [11] | 300 | 0.58 | 7.5 | 690 |
| [26] | 300 | 0.44 | 7 | 700 |
| proposed work | 300 | 0.41 | 8 | 710 |
| [5] | 400 | 0.89 | 3 | 560 |
| [11] | 400 | 0.79 | 5 | 620 |
| [26] | 400 | 0.68 | 5 | 634 |
| proposed work | 400 | 0.63 | 6 | 650 |

5. conclusion

there are two primary challenges in wsn: power efficiency and the availability of reliable data transmissions.
in this study, a genetic algorithm-based data collection strategy is developed to provide robust and reliable data delivery in wsns and to extend network life. the genetic algorithm with a roulette wheel selection approach and value encoding determines the most energy-efficient and reliable routes by considering the node's remaining energy, the link quality, the node's free available buffer, and distance as key factors that cause packet loss. simulation results show that the proposed method reduces network energy consumption by 40 percent. in addition, the proposed technique significantly increases the network's lifetime and reduces the packet delivery delay. possible future extensions of the study include considering node mobility and energy harvesting.

references

[1] i. f. akyildiz, w. su, y. sankarasubramaniam and e. cayirci, "wireless sensor networks: a survey", comput. netw., vol. 38, no. 2, pp. 393-422, march 2002. [2] i. f. akyildiz and i. h. kasimoglu, "wireless sensor and actor networks: research challenges", ad hoc netw., vol. 2, no. 4, pp. 351-367, oct. 2004. [3] t. rault, a. bouabdallah and y. challal, "energy efficiency in wireless sensor networks: a top-down survey", comput. netw., vol. 67, pp. 104-122, april 2014. [4] b. singh and d. k. lobiyal, "an energy-efficient adaptive clustering algorithm with load balancing for wireless sensor network", int. j. sensor networks, vol. 12, no. 1, pp. 37-52, july 2012. [5] c. wu, y. ji, j. xu, s. ohzahata and t. kato, "coded packets over lossy links: a redundancy-based mechanism for reliable and fast data collection in sensor networks", comput. netw., vol. 70, pp. 179-191, sept. 2014. [6] f. h. elfouly, r. a. ramadan, m. i. mahmoud and m. i. dessouky, "swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network", fu: elec. energ., vol. 29, no. 3, pp. 339-355, sept. 2016.
[7] u. mehboob, j. qadir, s. ali and a. vasilakos,"genetic algorithms in wireless networking: techniques, applications, and issues", soft computing, vol. 20, no. 6, pp. 2467-2501, june 2016. [8] m. a. mahmood, w. k. g. seah and i. welch, "reliability in wireless sensor networks: a survey and challenges ahead", comput. netw., vol. 79, pp. 166-187, march 2015. [9] m. bhardwaj, t. garnett and a. p. chandrakasan, "upper bounds on the lifetime of wireless sensor networks", in proceedings of the ieee international conference on communications (icc), 2001, pp. 785-790. [10] t. bhatia, s. kansal, s. goel and a. verma, "a genetic algorithm-based distance-aware routing protocol for wireless sensor networks", comput. electr. eng., vol. 56, pp. 441-455, nov. 2016. [11] s. wang, "multipath routing based on genetic algorithm in wireless sensor networks", hindawi math. prob. eng., vol. 2021, pp. 1-6, june 2021. [12] n. muruganantham and h. el-ocla, "genetic algorithm-based routing performance enhancement in wireless sensor networks", in proceedings of the ieee 3rd international conference on communication and information systems (iccis), 2018, pp. 79-82. [13] n. muruganantham and h. el-ocla," routing using genetic algorithm in a wireless sensor network”, wirel. pers. commun., vol.111., pp. 2703-2732, jan. 2020. [14] m. romoozi and h. ebrahimpour-komleh, "a positioning method in wireless sensor networks using genetic algorithms", in proceedings of the international conference on medical physics and biomedical engineering, 2012, pp. 1042-1049. [15] t. abirami and p. priakanth, "energy efficient wireless sensor network using genetic algorithm based association rules", int. j. comput. appl., vol. 91, no. 10, april 2014. [16] h. a. talib, r. alothman and m. k. farhan, "optimization approach to optimal power efficient based on cluster top option in wireless sensor networks", turkish j. comput. math. education, vol. 12, no. 4, pp. 970-979, april 2021. [17] i. apetroaei, i.-a. 
oprea, b.-e. proca and l. gheorghe, "genetic algorithms applied in routing protocols for wireless sensor networks", in proceedings of the 10th roedunet international conference, 2011, pp. 1-6. [18] b. baranidharan and b. santhi, "gaech: genetic algorithm based energy efficient clustering hierarchy in wireless sensor networks", hindawi j. sensors, vol. 2015, pp. 1-8, aug. 2015. [19] a. khunteta and a. bajpai, "genetic algorithm with leach protocol for cluster head selection in wireless sensor networks", ictact j. commun. technol., vol. 11, no. 2, pp. 2182-2186, june 2020. [20] t.-t. nguyen, c.-s. shieh, m.-f. horng and t.-k. dao, "a genetic algorithm with self-configuration chromosome for the optimization of wireless sensor networks", in proceedings of the 12th international conference on advances in mobile computing and multimedia, 2014, pp. 413-418. [21] m. k. somesula, r. r. rout and d. somayajulu,"contact duration-aware cooperative cache placement using genetic algorithm for mobile edge networks", comput. netw., vol. 193, april 2021. [22] s. j. park, r. vedantham, r. sivakumar and i. f. akyildiz,"garuda: achieving effective reliability for downstream communication in wireless sensor networks", ieee trans. mobile comput., vol. 7, no. 2, pp. 214-230, feb. 2008. [23] t. le, w. hu, p. corke and s. jha, "ertp: energy efficient and reliable transport protocol for data streaming in wireless sensor networks", comput. commun., vol. 32, pp. 1154-1171, jan. 2009. [24] d. jiang, p. zhang, z. lv and h. song, "energy-efficient multi-constraint routing algorithm with load balancing for smart city applications", ieee internet of things j., vol. 3, no. 6, pp. 1437-1447, sept. 2016. [25] s. lee and h. s. lee, "analysis of network lifetime in cluster-based sensor networks", ieee commun. letters, vol. 14, no. 10, pp. 900-902, oct. 2010. [26] i. s. alshawi, l. yan, w. pan, b. luo, "lifetime enhancement in wireless sensor networks using fuzzy approach and a-star algorithm". 
ieee sensors j., vol. 12, no. 10, pp. 3010-3018. oct. 2012. [27] s. k. a. imon, a. khan, m. d. francesco and s. k. das, "energy-efficient randomized switching for maximizing lifetime in tree-based wireless sensor networks", ieee/acm trans. netw., vol. 23, no. 5, pp. 1401-1415, oct. 2015. [28] c. w. ahn and r. s. ramakrishna, "a genetic algorithm for shortest path routing problem and the sizing of populations", ieee trans. evolutionary comput., vol. 6, no. 6, pp. 566-579, dec. 2002. [29] p. lin, q. song and a. jamalipour,"multidimensional cooperative caching in comp-integrated ultradense cellular networks", ieee trans. wirel. commun., vol. 19, no. 3, pp. 1977-1989, dec. 2019. [30] w. shen, t. zhang, f. barac and m. gidlund,"priority-mac: a priority-enhanced mac protocol for critical traffic in industrial wireless sensor and actuator networks", ieee trans. industr. inform., vol. 10, no. 1, pp. 824-835, feb. 2014. [31] n. alsindi and k. pahlavan, node localization: wireless sensor networks: a networking perspective, john wiley & sons, 2009, chapter 8. [32] j. patel and h. el-ocla, "energy efficient routing protocol in sensor networks using genetic algorithm", mdpi sensors, vol. 21, no. 21, p. 7060, oct. 2021. [33] m. shokouhifar and a. hassanzadeh, "an energy efficient routing protocol in wireless sensor networks using genetic algorithm", adv. environ. biol., vol. 8, no. 21, pp. 86-93, oct. 2014. [34] y. liu, a. liu, y. li, z. li, y. june choi, h. sekiya and j. li, "apmd: a fast data transmission protocol with reliability guarantee for pervasive sensing data communication", pervasive mob. comput., vol. 41, pp. 413-435, 2017. [35] k. sastry, d. goldberg and g. kendall, genetic algorithms in search methodologies, springer, 2005, chapter-4, pp. 97-125.
[36] u. dohare, d. k. lobiyal and s. kumar, "energy balanced model for lifetime maximization in randomly distributed wireless sensor networks", wirel. pers. commun., vol. 78, no. 1, pp. 407-428, april 2014. [37] b. singh and d. k. lobiyal, "an energy-efficient adaptive clustering algorithm with load balancing for wireless sensor network", int. j. sensor networks, vol. 12, no. 1, pp. 37-52, july 2012. [38] d. e. goldberg, genetic algorithms in search, optimization, and machine learning, addison-wesley publishing, october 1989. [39] s. gudla and n. r. kuda, "learning automata-based energy efficient and reliable data delivery routing mechanism in wireless sensor networks", j. king saud univ. – comput. inform. sci., vol. 34, no. 8, pp. 5759-5765, april 2021. [40] a. rastogi and s. rai, "a novel protocol for the stable period and lifetime enhancement in wsn", int. j. inform. technol., vol. 13, pp. 777-783, jan. 2021. [41] d. k. sharma, d. kukreja, s. bagga et al., "gauss-sigmoid based clustering routing protocol for wireless sensor networks", int. j. inform. technol., vol. 13, pp. 2569-2577, nov. 2019. [42] f. ullah, m. zahid khan, m. faisal, h. u. rehman, s. abbas and f. s. mubarek, "an energy-efficient and reliable routing scheme to enhance the stability period in wireless body area networks", comput. commun., vol. 165, no. 1, pp. 20-32, jan. 2021. [43] d. deepakraj and k. raja, "markov-chain based optimization algorithm for efficient routing in wireless sensor networks", int. j. inform. technol., vol. 13, pp. 897-904, march 2021. [44] j. agarkhed, v. kadrolli and s. patil, "fuzzy based multi-level multi-constraint multi-path reliable routing in a wireless sensor network", int. j. inform. technol., vol. 12, pp. 1133-1146, june 2020. [45] r. champlin, "selection methods of genetic algorithms", student scholarship computer science,2018. available at: https://digitalcommons.olivet.edu/csis_stsc/8 (accessed: 2022-01-02). [46] network simulator 3. 
available at: https://www.nsnam.org (accessed: 2022-01-02).

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 417-427 doi: 10.2298/fuee1703417s

design and implementation of non-uniform quantizers for discrete input samples and its application to an image processing algorithm

nikola simić 1, zoran h. perić 1, milan savić 2
1 university of niš, faculty of electronic engineering, department of telecommunications, niš, republic of serbia
2 university of pristina, faculty of natural science and mathematics, department of informatics, kosovska mitrovica, republic of serbia

abstract. this paper describes an algorithm for grayscale image compression based on non-uniform quantizers designed for discrete input samples. non-uniform quantization is performed in two steps for unit variance, whereas the design is done by introducing a discrete variance. the best theoretical and experimental results are obtained for those discrete values of variance for which the operating range of the quantizer lies in the vicinity of the maximal signal value that can appear at the input. the experiment is performed by applying the proposed quantizers to the compression of standard test grayscale images as a classic example of a discrete input source. the proposed fixed non-uniform quantizers, designed for discrete input samples, provide up to 4.93 [db] higher psqnr compared to the fixed piecewise uniform quantizers designed for discrete input samples.

key words: discrete input samples, grayscale image processing, non-uniform quantization, optimal input range.

1. introduction

the interest in methods of digital image processing comes from two basic ideas. first of all, rapidly growing information systems aim at reducing the amount of data required for data processing in order to use narrower bandwidth, as well as to save available storage.
next, visual interpretation has to be improved since digital images are widely used in a number of applications [1]. generally, all compression algorithms may be classified in two groups: 'lossless' compression algorithms if there is no loss of information, and 'lossy' methods if some information is lost irreversibly [1], [2]. even though there is a variety of compression algorithms for different purposes [3], research areas are still expanding. in recent years, schemes which incorporate compressive sensing became very important and some restoration as well as image reconstruction schemes made an impact in the image processing field [4]-[5]. besides schemes developed for software applications, some effort is also paid to fpga based solutions [6].

received november 6, 2016; received in revised form january 17, 2017
corresponding author: nikola simić, faculty of electronic engineering, university of niš, serbia, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: simicnikola90@gmail.com)

this paper deals with a type of improved btc (block truncation coding) algorithm, a kind of 'lossy' method used for compression of grayscale images [7]. although the basic algorithm has been well-known for years, some upgrades proposed in recent years have found application in modern systems [8]. moreover, an improved block truncation coding algorithm based on optimized dot diffusion was proposed by guo et al. [9], whereas an effective image retrieval system was presented a year later [10]. also, a data hiding scheme based on the btc algorithm, designed to embed a huge amount of watermarks, was presented in the paper [11], so it can be concluded that the core algorithm can still be improved and implemented in modern systems.
although the core algorithm and its modifications usually cannot improve the coding gain compared to modern state-of-the-art techniques such as jpeg and jpeg2000, the computational complexity of those schemes is much lower than that of the aforementioned state-of-the-art solutions, which makes them very suitable for image retrieval purposes [10]. the difference in designing fixed uniform quantizers for continual and discrete input was observed in papers [12], [13]. further research in this direction included the design of fixed piecewise uniform quantizers described in [14]. this paper is a logical continuation of that research. we expect that the gain due to different non-uniform quantizer designing for discrete and continual input is higher than the maximal difference of psqnr (peak signal-to-quantization-noise ratio) between the fixed piecewise uniform (l=16) and the optimal non-uniform quantizer, which is equal to 0.7 [db] (for continual input signal) [14], [15]. the proposed design is fixed; it was tested on a set of standard test grayscale images, and optimal parameters were found. a non-uniform quantizer can also be designed by using the lloyd-max algorithm, which represents a very powerful iterative solution [16]. moreover, high-quality performance can be achieved by introducing variance adaptation, which would provide better quality of the reconstructed image [17]. on the other hand, the proposed design is less complex and requires less processing time, as it represents a kind of fixed scalar quantization.

the paper is organized as follows. in section 2 basic modelling of the discrete input source is shown, improved by introducing non-uniform quantization. section 3 describes an algorithm for grayscale image compression that is used for experimental analysis. finally, the obtained theoretical and experimental results as well as the obtained gain in comparison to other models are presented in section 4.

2. system model

the considered system consists of two stages: the uniform quantizer $Q_0$ used in the first stage converts the analog input signal to discrete samples, whereas the proposed quantizer $Q$, designed for discrete input, is used in the second stage in order to perform additional data compression. in the first step, samples with a continual amplitude have to be quantized with a fixed uniform quantizer $Q_0$, which is described with $N_0$ output levels, $X = \{x_1, x_2, \ldots, x_{N_0}\}$, and the maximal amplitude $x_{max}$, which depends on the input signal range [14], [17]. the considered pixel values of standard grayscale images are described with 8 bits and they can take values from 0 to 255, so $x_{N_0} = 255$. furthermore, the quantization process in the btc algorithm is based on quantization of the difference between the original pixel value and the mean value of all pixels in a block. therefore, the number of output levels $N_0$ is equal to 512.

on the other hand, samples with continuous amplitude can be described only as random variables, since the input information is unknown. in probability theory, random variables are described by using the probability density function (pdf), which provides the relative likelihood for the observed random variable to take on a given value. it has been shown in the literature that a laplacian source ensures good matching between a btc model and reality [1], [7]. consequently, in the rest of the paper we will suppose that the information source is laplacian with a memoryless property and mean value equal to zero. it is defined with:

$$p(x) = \frac{1}{\sqrt{2}\,\sigma} \exp\left(-\frac{\sqrt{2}\,|x|}{\sigma}\right), \qquad (1)$$

where $\sigma$ represents the standard deviation of the random variable $x$. the second step of the quantization process involves quantization of the discrete output samples from the quantizer $Q_0$ using $N$ quantization levels, where $N < N_0$.
probabilities of these discrete input levels for laplacian distribution are:

$$P_i = \int_{x_i}^{x_{i+1}} p(x)\,\mathrm{d}x = \frac{1}{2}\left[\exp\left(-\frac{\sqrt{2}\,x_i}{\sigma}\right) - \exp\left(-\frac{\sqrt{2}\,x_{i+1}}{\sigma}\right)\right], \qquad (2)$$

where $i = 0, \ldots, N_0 - 1$. the main goal of this phase is additional data compression. in the rest of the paper the quantizer from the second step is denoted with $Q$. this paper deals with designing and optimization of quantizer $Q$. so far in literature, the application of both uniform and piecewise uniform quantizers was described, and in this paper we propose application of a non-uniform quantizer since it provides better quality of reconstructed signal for the equal number of quantization levels [1].

the design of the non-uniform quantizer $Q$ is done as follows. firstly, we design the optimal compandor with $N$ quantization levels for the unit standard deviation ($\sigma = 1$). its compressor function maps the range $(-\infty, \infty)$ to $(-1, 1)$. the compressor function formed in this way can be defined with:

$$c(x) = -1 + 2\,\frac{\int_{-\infty}^{x} p^{1/3}(t)\,\mathrm{d}t}{\int_{-\infty}^{\infty} p^{1/3}(t)\,\mathrm{d}t}. \qquad (3)$$

decision thresholds obtained in this way can be calculated as [15]:

$$t_i = \frac{3}{\sqrt{2}} \log\left(\frac{2i}{N}\right), \quad 0 < i \le \frac{N}{2}, \qquad (4)$$

$$t_i = \frac{3}{\sqrt{2}} \log\left(\frac{N}{2(N-i)}\right), \quad \frac{N}{2} < i < N. \qquad (5)$$

furthermore, representational levels (written here as $\ell_i$) are determined with [15]:

$$\ell_i = \frac{3}{\sqrt{2}} \log\left(\frac{2i-1}{N}\right), \quad 1 \le i \le \frac{N}{2}, \qquad (6)$$

$$\ell_i = \frac{3}{\sqrt{2}} \log\left(\frac{N}{2(N-i)+1}\right), \quad \frac{N}{2}+1 \le i \le N. \qquad (7)$$

in all previous equations $\log(x)$ represents the natural logarithm of $x$. the range of the quantizer designed in this way is $(-t_N, t_N)$. since the obtained range is not adjusted to the theoretical range of pixel values, denormalization is required. due to the fact that for a low number of quantization levels $t_N < x_{N_0}$ [15] is always valid, denormalization is performed by introducing a discrete variance $\hat{\sigma}_d$. it is carried out by multiplying the decision thresholds $t_i$ and the representational levels $\ell_i$ by the discrete variance $\hat{\sigma}_d$ that is used for quantizer designing.
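the compandor design and the denormalization step can be illustrated with a short python sketch. it is a sketch under stated assumptions rather than the authors' implementation: it assumes the inner decision thresholds are (3/√2)·ln(2i/N) for i ≤ N/2 and (3/√2)·ln(N/(2(N−i))) for i > N/2, and the representational levels are (3/√2)·ln((2i−1)/N) for i ≤ N/2 and (3/√2)·ln(N/(2(N−i)+1)) for i > N/2, which is what inverting the compressor function for a unit-variance laplacian source gives. the outermost thresholds (the support limits) are left out because they are chosen separately. the names `laplacian_compandor` and `denormalize` are hypothetical.

```python
import math

def laplacian_compandor(n):
    """Inner thresholds t_1..t_{n-1} and levels 1..n for unit variance.

    Follows the assumed closed forms stated in the lead-in; the outermost
    thresholds are not returned because the support limit is a separate choice.
    """
    if n < 2 or n % 2:
        raise ValueError("n must be an even number of levels, at least 2")
    k = 3.0 / math.sqrt(2.0)
    half = n // 2
    thresholds = []
    for i in range(1, n):
        if i <= half:
            thresholds.append(k * math.log(2.0 * i / n))
        else:
            thresholds.append(k * math.log(n / (2.0 * (n - i))))
    levels = []
    for i in range(1, n + 1):
        if i <= half:
            levels.append(k * math.log((2.0 * i - 1.0) / n))
        else:
            levels.append(k * math.log(n / (2.0 * (n - i) + 1.0)))
    return thresholds, levels

def denormalize(values, sigma_d):
    """Scale normalized thresholds or levels by the discrete variance parameter."""
    return [sigma_d * v for v in values]

t, y = laplacian_compandor(4)
```

the returned lists are symmetric about zero and interleave (level, threshold, level, ...), which is a quick sanity check on the assumed formulas.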
Finally, decision thresholds and representational levels of the quantizer $Q$ are determined with:

$$x_i = \hat{\sigma}_d\, t_i, \quad 0 \le i \le N, \tag{8}$$

$$y_i = \hat{\sigma}_d\, \tilde{y}_i, \quad 1 \le i \le N, \tag{9}$$

where $t_i$ and $\tilde{y}_i$ are the normalized thresholds and levels of Eqs. (4)-(7) and $\hat{\sigma}_d$ is the discrete variance. The maximal support $x_N$ can be defined in different ways [15]. In this paper we choose the simplest form, which places the last representational level at the middle of the last decision interval. If $x_N < x_{N_0}$, overload distortion will exist. On the other hand, if $x_N > x_{N_0}$, the range $[x_{N_0}, x_N]$ will be unused and, as a result, the granular distortion will be higher. If the system conditions require a fixed quantizer with an unused range (case $x_N > x_{N_0}$), we propose an additional modification by introducing another denormalization parameter $\alpha$. Its function is to adapt the range $[-x_N, x_N]$ formed in the previous step to the range $[-x_r, x_r]$, where $x_r$ denotes the desired maximal value of the range. Consequently, we define the parameter $\alpha$ with:

$$\alpha = x_r / x_N. \tag{10}$$

Finally, decision thresholds and representational levels of the quantizer $Q$ in the case $x_N > x_{N_0}$ are equal to:

$$x'_i = \alpha\, x_i, \quad 0 \le i \le N, \tag{11}$$

$$y'_i = \alpha\, y_i, \quad 1 \le i \le N. \tag{12}$$

As this is a "lossy" compression method, some information is lost irreversibly during the quantization process. As a standard measure of reconstructed signal quality we estimate the distortion $D$, which consists of granular ($D_g$) and overload ($D_o$) distortion and can be calculated with [8], [9]:

$$D_g = 2\sum_{i=1}^{N/2}\sum_{j=1}^{k_i}\left(x_{ij} - y_i\right)^2 P(x_{ij}), \tag{13}$$

$$D_o = 2\sum_{j=1}^{s}\left(x_j - y_{N/2}\right)^2 P(x_j). \tag{14}$$

In Eq. (13), the parameter $k_i$ denotes the number of input levels mapped to $y_i$, and $x_{ij} \in X$. In Eq. (14), $x_j \in X$ and the parameter $s$ denotes the total number of pixel values from the theoretical range that are not placed within the designed range. This parameter can be calculated as:

$$s = x_{N_0} - x_N. \tag{15}$$

Finally, the total distortion is equal to:

$$D_t = D_g + D_o. \tag{16}$$

3. Algorithm for image processing

The proposed design of the second-stage quantizer $Q$ from Section 2 is tested by analyzing its application to the image processing algorithm defined as follows.

1. The image is divided into $M$ non-overlapping blocks of dimensions $m \times m$.
2. Each block is processed separately by sending data and reconstructing information at the receiver side. The algorithm processes pixels from left to right and from top to bottom.
3. The mean value of all pixels in the block ($x_{av}$) is calculated and then quantized ($\hat{x}_{av}$) with a fixed uniform quantizer. In order to minimize the error in the reconstruction process, the coding process uses only values that are available to the decoder.
4. The difference blocks of $m \times m$ pixels are formed. Elements of a block are denoted with $d_{i,j}$ and obtained as:

$$d_{i,j} = x_{i,j} - \hat{x}_{av}, \tag{17}$$

where $x_{i,j}$ is the original pixel value and $i = 1, \ldots, m$; $j = 1, \ldots, m$. Elements of a difference block have a Laplacian distribution [1], and they can take integer values from $[-x_{N_0}, x_{N_0}]$.
5. Elements of the difference blocks are quantized using the proposed fixed non-uniform quantizers from Section 2. The quantized elements are denoted with $\hat{d}_{i,j}$, coded with $\log_2 N$ bits and transmitted to the receiver.
6. In the receiver, pixel reconstruction is done as:

$$\hat{x}_{i,j} = \hat{d}_{i,j} + \hat{x}_{av}. \tag{18}$$

The quantization in step 5 introduces distortion of the original image. It can be experimentally measured as [9]:

$$D = \frac{1}{m \cdot m}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(x_{i,j} - \hat{x}_{i,j}\right)^2 = \frac{1}{m \cdot m}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(d_{i,j} - \hat{d}_{i,j}\right)^2. \tag{19}$$

The flow chart of this algorithm is shown in Fig. 1.

Fig. 1 Flow chart of the proposed grayscale image compression method

4. Numerical results

To demonstrate the performance of the proposed algorithm for image compression, we compare theoretical results with experimental results obtained for a set of standard test grayscale images, and also with the results available in the literature for the piecewise uniform quantization model [14]. All theoretical calculations and experimental results are obtained for a set of three standard test grayscale images (Lena, Street and Boat). We estimate system performance using the average bit-rate $R_b$ and PSQNR, which are standard measures. Since we discuss fixed non-uniform quantizers, the average bit-rate depends on the number of quantization levels $N$ and the number of bits required for transmitting the mean value $\hat{x}_{av}$. On the other hand, PSQNR is defined with [13], [14], [17]:

$$\mathrm{PSQNR} = 10\log_{10}\left(\frac{x_{N_0}^2}{D}\right)\ [\mathrm{dB}]. \tag{20}$$

For measuring the experimental $\mathrm{PSQNR}_{ex}$ we use Eq. (20), with $D$ defined by Eq. (19). However, theoretical results have to include a weighting function, since input samples do not occur with the same probabilities [14]. The weighting function in the linear domain for the tested images is shown in Fig. 2.

Fig. 2 The weighting function

In Fig. 2, $\sigma_i$ represents the standard deviation of the difference between pixels and the mean value of the block that the pixel belongs to. Taking this into account, including weighted averaging for the observed test grayscale images and considering that the total distortion is defined with Eq. (16), theoretical results are denoted with $\mathrm{PSQNR}_{wav}$. This measure is defined with [14]:

$$\mathrm{PSQNR}_{wav} = \sum_{\sigma_i = 1}^{255} w(\sigma_i)\,\mathrm{PSQNR}(\sigma_i)\ [\mathrm{dB}]. \tag{21}$$

Table 1 shows the experimental results obtained by applying the proposed algorithm for grayscale image compression, as well as the corresponding theoretical results.
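As a rough illustration of steps 1-6 of the algorithm and of the quality measures in Eqs. (17)-(20), the sketch below codes an image block by block and evaluates the resulting PSQNR. Several choices are assumptions made only for this example: the helper names, the 4 x 4 block size, the step of 4 used for the uniform quantization of the block mean, the nearest-level rule used in place of explicit decision thresholds, and the averaging of the squared error over the whole image rather than over a single block.

```python
import math


def nearest_level(value, levels):
    """Return the entry of `levels` closest to `value` (simplified quantizer)."""
    return min(levels, key=lambda y: abs(value - y))


def code_image(image, levels, block=4, mean_step=4):
    """Code and reconstruct an image given as a list of rows of 0..255 values."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, rows))
                     for c in range(c0, min(c0 + block, cols))]
            mean = sum(image[r][c] for r, c in cells) / len(cells)
            # Step 3: uniformly quantized block mean (assumed step size).
            mean_q = mean_step * round(mean / mean_step)
            for r, c in cells:
                d = image[r][c] - mean_q          # Eq. (17)
                d_q = nearest_level(d, levels)    # step 5
                out[r][c] = d_q + mean_q          # Eq. (18)
    return out


def distortion(image, coded):
    """Mean squared error between original and reconstructed pixels."""
    count = len(image) * len(image[0])
    total = sum((x - y) ** 2
                for row_in, row_out in zip(image, coded)
                for x, y in zip(row_in, row_out))
    return total / count


def psqnr(d, x_max=255):
    """Peak signal to quantization noise ratio in dB, as in Eq. (20)."""
    return 10 * math.log10(x_max ** 2 / d)
```

The `levels` argument can be any list of representational levels, for example the levels of a quantizer designed as in Section 2. A zero distortion has to be handled separately, since Eq. (20) is not defined for $D = 0$.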
It can be seen that the experimental results follow the changes of the theoretical values very well. The relative difference between theoretical and experimental values occurs due to non-ideal modelling with the Laplacian source, as well as because of averaging over a set of images [18]. From Table 1, it can be seen that the best theoretical and experimental results are obtained for those values of the discrete variance ($\hat{\sigma}_d = 17$ for $N = 32$ and $\hat{\sigma}_d = 15$ for $N = 64$) which ensure that the input range of the quantizer $Q$ is as close as possible to the range $(-152, 152)$ [14], [17]. Consequently, this means that the parameter $x_r = 152$.

Table 1 Comparison of experimental and theoretical results for the proposed model

N     σ̂_d    PSQNR_wav [dB]    PSQNR_ex [dB]    x_N    R_b [bpp]
32    15     46.82             47.57            132    5.375
32    17     46.43             46.94            149    5.375
32    29     44.59             44.51            255    5.375
64    15     49.38             51.57            154    6.375
64    24     49.00             50.85            247    6.375
64    29     48.01             48.50            298    6.375

Moreover, it can be noticed that for the case $N = 64$ and $\hat{\sigma}_d = 29$ overload distortion does not exist, since the range $(-298, 298)$ is wider than the theoretical range $(-255, 255)$ and the support region is not adapted to the theoretical one. In this case, decision thresholds and representational levels could be calculated using Eqs. (10)-(12). However, this modification involves additional hardware requirements and processing time, as well as information about $x_r$ for the specific system, which depends on the nature of the input signal.

Fig. 3 shows the original test grayscale images of resolution 512 × 512 pixels, while Fig. 4 shows the corresponding images after processing with the proposed algorithm for $N = 32$ quantization levels and $\hat{\sigma}_d = 15$.

Fig. 3 Standard test grayscale images: (a) Lena, (b) Boat and (c) Street

Fig. 4 Standard test grayscale images from Fig. 3, after compression with the proposed algorithm (N = 32): (a) Lena, (b) Boat and (c) Street

In order to compare the obtained results with models available in the literature, we compare both experimental and theoretical results with the performance of the model based on fixed piecewise uniform quantizers designed for discrete input, as it represents a model of similar complexity. The experimental comparison is expressed as the experimental gain of the proposed method, i.e. the difference in PSQNR between the proposed method and the equivalent results from Savic et al. [14]:

$$\mathrm{Gain}\ [\mathrm{dB}] = \mathrm{PSQNR}_{ex}\ [\mathrm{dB}] - \mathrm{PSQNR}^{inf}_{eq(N)}\ [\mathrm{dB}],$$

where equivalent results are provided for $N = 32$ and $N = 64$ quantization levels. In [14], the experimental results closest to non-uniform quantization are achieved for $N = 32$ and $L = 16$ ($\mathrm{PSQNR}^{inf}_{ex(32)} = 42.64$ dB, $R_b = 5.375$ bpp), whereas the corresponding theoretical performance is $\mathrm{PSQNR}^{inf}_{th(32)} = 42.29$ dB. Since the paper [14] did not deal with systems that use $N = 64$ levels, the comparison for these results is made using the rule that PSQNR changes by 5.5 dB when the bit-rate changes by 1 bit [13], [14]. Given that the bit-rate difference between quantizers designed for $N = 32$ and $N = 64$ quantization levels is 1 bpp, the corresponding result for $N = 64$ used for comparison is $\mathrm{PSQNR}^{inf}_{eq(64)} = 42.64 + 1 \cdot 5.5 = 48.14$ dB. Comparing the results from Table 1 with the corresponding results ($\mathrm{PSQNR}^{inf}_{ex(32)}$ and $\mathrm{PSQNR}^{inf}_{eq(64)}$) from [14], the obtained experimental gain is shown in Table 2 for the same number of quantization levels.

Table 2 Experimental gain of the proposed model in comparison to the piecewise uniform quantization model

N     σ̂_d    Gain [dB]
32    15     4.93
32    17     4.30
32    29     1.87
64    15     3.43
64    24     2.71
64    29     0.36

By observing Table 2, it can be concluded that fixed non-uniform quantizers designed for discrete input samples with $N = 32$ and $N = 64$ quantization levels give from 0.36 to 4.93 dB higher PSQNR than the fixed piecewise uniform quantizers designed for discrete input samples. In addition, comparing the theoretical results from Table 1 with $\mathrm{PSQNR}^{inf}_{th(32)}$, it can be concluded that, besides the experimental gain, the proposed improved theoretical model that uses the discrete variance predicts a gain of up to 4.52 dB compared to the similar system, confirming the experimental results.

5. Conclusion

In this paper we described a novel method for non-uniform quantizer design for discrete input samples, and we tested the proposed quantizer for grayscale image coding. Considering that quantizers designed for continuous and discrete signals have a different nature, we have introduced the discrete design variance as an additional and effective parameter in the process of quantizer design for discrete input samples. System performance was discussed using weighted averaging of PSQNR for a set of three standard test grayscale images. The experimental results demonstrate that the proposed method outperforms other similar models. The obtained gain of the proposed discrete solution is, for most of the discussed cases, much higher than the maximal difference in PSQNR between the piecewise uniform ($L = 16$) and the optimal non-uniform quantizer, which is equal to 0.7 dB (for a continuous input signal). This justifies the introduction of the proposed quantizer design. Furthermore, an additional system modification was proposed to adjust the quantizer design in special cases. However, this modification requires additional computing time as well as information about a set of input images.
To generalize this approach, future work will include testing of specific images in order to find optimal values of the input range support, as well as implementation for different types of discrete input source.

Acknowledgments: This work is supported by the Serbian Ministry of Education and Science through the Mathematical Institute of the Serbian Academy of Sciences and Arts (Project III44006) and by the Serbian Ministry of Education, Science and Technological Development (Project TR32035).

References

[1] N. S. Jayant, P. Noll, Digital Coding of Waveforms, Prentice Hall, 1984.
[2] Y. Q. Shi, H. Sun, Image and Video Compression for Multimedia Engineering, Taylor & Francis Group, 2008.
[3] M. Savic, Z. Peric, N. Simic, "Coding algorithm for grayscale images based on linear prediction and dual mode quantization", Expert Systems with Applications, vol. 42, pp. 7285-7291, 2015.
[4] N. Eslahi, A. Aghagolzadeh, "Compressive sensing image restoration using adaptive curvelet thresholding and nonlocal sparse regularization", IEEE Transactions on Image Processing, vol. 25, no. 7, pp. 3126-3140, July 2016.
[5] J. Musić, T. Marasović, V. Papić, I. Orović, S. Stanković, "Performance of compressive sensing image reconstruction for search and rescue", IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 11, pp. 1739-1743, Nov. 2016.
[6] A. Napieralski, J. Cłapa, K. Grabowski, M. Napieralska, W. Sankowski, P. Sękalski, M. Zubert, "Image and video processing with FPGA support used for biometric as well as other applications", Facta Universitatis, Series: Electronics and Energetics, vol. 28, no. 2, pp. 165-175, June 2015.
[7] Y. Yang, Q. Chen, Y. Wan, "A fast near-optimum block truncation coding method using a truncated K-means algorithm and inter-block correlation", International Journal of Electronics and Communications (AEU), no. 65, pp. 576-581, 2011.
[8] S. Kim, D. Lee, J-S. Kim, H-J. Lee, "A block truncation coding algorithm and hardware implementation targeting 1/12 compression for LCD overdrive", Journal of Display Technology, vol. 12, no. 4, pp. 376-389, April 2016.
[9] J-M. Guo, Y-F. Liu, "Improved block truncation coding using optimized dot diffusion", IEEE Transactions on Image Processing, vol. 23, no. 3, pp. 1269-1275, March 2014.
[10] J-M. Guo, H. Prasetyo, N-J. Wang, "Effective image retrieval system using dot-diffused block truncation coding features", IEEE Transactions on Multimedia, vol. 17, no. 9, pp. 1576-1590, September 2015.
[11] J-M. Guo, Y-F. Liu, "High capacity data hiding for error-diffused block truncation coding", IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 4808-4818, December 2012.
[12] M. Savić, Z. Perić, M. Dinčić, "Design of forward adaptive uniform quantizer for discrete input samples for Laplacian source", Electronics and Electrical Engineering, no. 9 (105), pp. 73-76, 2010.
[13] M. Savić, Z. Perić, M. Dinčić, "An algorithm for grayscale image compression based on the forward adaptive quantizer designed for signals with discrete amplitudes", Electronics and Electrical Engineering, no. 2 (118), pp. 13-16, 2012.
[14] M. Savic, Z. Peric, M. Dincic, "Coding algorithm for grayscale images based on piecewise uniform quantizers", Informatica, vol. 23, no. 1, pp. 125-140, 2012.
[15] Z. Peric, M. Petkovic, M. Dincic, "Simple compression algorithm for memoryless Laplacian source based on the optimal companding technique", Informatica, vol. 20, no. 1, pp. 99-114, 2009.
[16] Z. Peric, J. Nikolic, "An effective method for initialization of Lloyd-Max's algorithm of optimal scalar quantization for Laplacian source", Informatica, vol. 18, no. 2, pp. 279-288, 2007.
[17] N. Simic, Z. Peric, M. Savic, "Improved algorithm for grayscale image compression based on multimode coding algorithm", Revue Roumaine des Sciences Techniques - Serie Electrotechnique et Energetique, tome 59, issue 3, pp. 315-323, October 2014.
[18] Z. Peric, N. Simic, M. Savic, "Analysis and design of two stage mismatch quantizer for Laplacian source", Elektronika ir Elektrotechnika, vol. 21, no. 3, pp. 49-53, 2015.

Facta Universitatis
Series: Electronics and Energetics Vol. 29, No 4, December 2016, pp. 631-646
DOI: 10.2298/FUEE1604631J

Multi-criteria assessment of the smart grid efficiency using the fuzzy analytic hierarchy process

Aleksandar Janjic¹, Suzana Savic², Goran Janackovic², Miomir Stankovic², Lazar Velimirovic³
¹ University of Niš, Faculty of Electronic Engineering, Niš, Serbia
² University of Niš, Faculty of Occupational Safety, Niš, Serbia
³ Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade, Serbia

Abstract: In this paper, the key performance indicators related to smart grid efficiency, as the key factor of any energy management system implementation, are analysed. The authors propose a multi-criteria fuzzy AHP methodology for the determination of overall smart grid efficiency. Four criteria (technology, costs, user satisfaction, and environmental protection) and seven performances (according to EU and US initiatives for the analysis of benefits and effects of smart grid systems) are defined for the selection of the optimal smart grid project. The analysis shows that the dominant performances of the optimal smart grid project are efficiency, security and quality of supply. The methodology is illustrated on the choice of a smart grid development strategy for a medium size power distribution company.

Key words: smart grid, multi-criteria analysis, fuzzy analytical hierarchy process

Received June 28, 2015; received in revised form August 25, 2015
Corresponding author: Lazar Velimirovic, Mathematical Institute of the Serbian Academy of Sciences and Arts, Kneza Mihaila 36, 11001 Belgrade, Serbia (email: lazar.velimirovic@mi.sanu.ac.rs)

1. Introduction

A smart grid is usually defined as an electrical grid that intelligently integrates the actions of all users connected to it (producers, consumers, and those who are both) with the purpose of efficiently producing electricity and delivering it sustainably, economically, and safely [1]. The smart grid promises a variety of efficiency gains for utilities, such as reducing distribution line losses through minimization of reactive power and more precise voltage control [2]. Furthermore, the smart grid should enhance utilities' ability to monitor and measure the effectiveness of end-use energy-efficiency programs, and to better manage energy costs on the customer side, which is confirmed by the numerous projects and organizations that were initiated to facilitate the evolution of the smart grid [3], [4].

In the EU, the concept of smart grids was adopted in 2005, as an official document of the European Commission through the European Technology Platform Smart Grids, and more precisely defined in [5] and [6]. In early April 2010, the European Commission issued a statement reiterating the need to improve the existing grids, listing the following as the main objectives [7]: increased use of renewable electricity sources, grid security, energy conservation and energy efficiency, and a deregulated energy market. Therefore, the strategy for sustainable, competitive, and safe energy primarily implies: competitiveness, use of different energy sources, sustainability, innovation, and technological improvement [8].

The result of energy system development is reflected in energy performance, with quantifiable results pertaining to energy (e.g. energy efficiency, energy intensity, or specific energy consumption), and in energy performance indicators as quantitative indexes of energy performance. Energy efficiency is a way of managing and restraining the growth in energy consumption. The key energy performance indicators were defined in 2005 as a result of cooperation between several international organizations that are global leaders in energy and environmental statistics and analysis: the International Atomic Energy Agency (IAEA), the United Nations Department of Economic and Social Affairs (DESA), the International Energy Agency (IEA), the European Environment Agency (EEA), and the Directorate-General of the European Commission for statistics, Eurostat [9]. The key energy performance indicators include a set of 30 indicators: 4 social indicators, 16 economic indicators, and 10 environmental indicators. The values of the U.S. energy security risk index were determined based on the data for the period between 1970 and 2010, and predicted for the period between 2011 and 2035 [10]. The indicator values do not merely represent data but also the basis for communication between stakeholders regarding sustainable energy use. Each set of indicators (social, economic, or environmental) expresses specific aspects or impacts of energy production and use.

The lack of a systematic approach to the classification of these indicators is the main reason why smart grids have been evaluated on individual indicators only. The cyber security indicator has been explored in [11]-[13], while the cost/benefit assessment of a smart distribution system with intelligent electric vehicle charging has been analysed in [14], [15]. In the smart grid context, three main assessment frameworks based on key performance indicators (KPIs) have been introduced. The EC Task Force for Smart Grids has introduced the characteristics of the ideal smart grid (services) and the outcomes of the implementation of the ideal smart grid (benefits) [16], [17].
A measure of the contribution of projects to the ideal smart grid is quantified in terms of benefits, via a set of KPIs. The European Electricity Grid Initiative has divided the ideal smart grid system into thematic areas (clusters) and is currently mapping smart grid projects into clusters [18]. In the USA, the ideal characteristics of the smart grid and a set of metrics to measure progress toward the ideal smart grid have been defined [19]: build metrics, which describe attributes that are built in support of a smart grid (e.g. percentage of substations using automation), and value or impact metrics, which describe the value that may derive from achieving a smart grid (e.g. percentage of energy consumed to generate electricity that is not lost, or quantity of electricity delivered to consumers compared to electricity generated, expressed as a percentage).

However, because of the proliferation of these energy indicators, it is still very difficult for a decision maker to answer simple questions such as:
• Among different smart grid projects, which alternative should be chosen?
• Which alternative will be the most beneficial to different stakeholders?
• How can the efficiency of an already implemented smart grid project be monitored?

The contribution of this paper is the introduction of a multi-criteria approach to smart grid efficiency assessment. Unlike the approach used in [1], the fuzzy AHP method is proposed, offering much more flexibility in the selection of criteria and in the evaluation of both criteria and alternatives. Furthermore, a new hierarchy of four criteria and seven performances is introduced in order to obtain a more consistent evaluation framework. We demonstrate that the method is highly successful in the evaluation of alternatives in the presence of heterogeneous criteria.
Because of the main characteristic of the adopted smart grid evaluation framework and its complex hierarchical structure, we propose the fuzzy AHP methodology for project evaluation, which structures a decision into a hierarchy of criteria, sub-criteria and alternatives. By means of pair-wise comparisons of two (sub-)criteria or alternatives, it generates inconsistency ratios and weighting factors to prioritise the criteria and alternatives. After a brief overview of key performance indicators for smart grid evaluation, the fuzzy AHP methodology is presented. The methodology is illustrated on the choice of smart grid project deployment for one medium size power distribution company.

2. Smart grid assessment frameworks

The implementation of a smart grid is useful for achieving strategic policy goals, such as the smooth integration of renewable energy sources, a more secure and sustainable electricity supply and full inclusion of consumers in the electricity market. Smart grids help consumers to better understand their own energy use, which in turn allows them to identify energy saving opportunities. Smart grid and advanced metering infrastructure (AMI) systems could open up opportunities for energy management companies, hired by consumers, to use data from consumers' smart meters to identify opportunities for energy savings or to measure the success of energy saving measures after they are undertaken [20]. For utilities, a better understanding of the electrical grid's status at a second-by-second level allows the grid to be operated at much tighter tolerances, resulting in greater efficiency and reliability. Steering the smart grid transition is a challenging, long-term task, which requires balancing energy policy goals, environmental constraints and market profitability.
In this perspective, a first approach to smart grid assessment is to evaluate to what extent smart grid projects contribute to progress toward the "ideal smart grid" and its expected outcomes (e.g. sustainability, efficiency, consumer inclusion), which are directly linked with the policy goals that have triggered the smart grid transition. This first approach is conducted via the definition of suitable metrics and key performance indicators. A second, complementary approach is to assess the profitability of smart grid solutions and investments through an appropriate multi-criteria decision analysis methodology.

2.1. Key performance indicators

The progress of smart grid development can be measured by formulating a set of key performance indicators (KPIs) and applying them to the electricity network. In [17]-[19], the characteristics of the ideal smart grid have been defined, together with metrics to measure the progress and outcomes resulting from the implementation of smart grid projects. The ideal smart grid has been defined in terms of characteristics in the US and in terms of services in the European Union. Build/value metrics in the USA and benefits/KPIs in Europe are used to measure progress toward the ideal smart grid.
The EC Smart Grid Task Force has identified a list of benefits deriving from the implementation of a smart grid [16]:
• increased sustainability;
• adequate capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers;
• adequate grid connection and access for all kinds of grid users;
• satisfactory levels of security and quality of supply;
• enhanced efficiency and better service in electricity supply and grid operation;
• effective support of transnational electricity markets by load flow control to alleviate loop flows and increased interconnection capacities;
• coordinated grid development through common European, regional and local grid planning to optimise transmission grid infrastructure;
• enhanced consumer awareness and participation in the market by new players;
• enabling consumers to make informed decisions related to their energy use in order to meet the EU energy efficiency targets;
• creation of a market mechanism for new energy services such as energy efficiency or energy consulting for customers;
• consumer bills that are either reduced, or on which upward pressure is mitigated.

Each benefit is expressed via a set of KPIs including both quantitative and qualitative indicators. For illustration, the first benefit, increased sustainability, is evaluated by the quantified reduction of carbon emissions, the environmental impact of the electricity grid infrastructure, and the quantified reduction of accidents and risk associated with generation technologies. The complete list of indicators can be found in [16]. The KPIs can be applied to evaluate the results of smart grid projects as well. A clearly defined framework can specify where exactly a project contributed to a smart electricity grid. The mixture of quantitative and qualitative indicators is one of the major reasons for introducing multi-criteria decision analysis techniques.
Another reason is the shortcoming of cost benefit analysis, which is explained below.

2.2. Smart grid development assessment model

The implementation of the smart grid should be market-driven. Another necessary approach in smart grid assessment is therefore to assess the costs, the benefits and the beneficiaries of different smart grid solutions. A comprehensive methodology for cost benefit analysis of smart grid projects has been defined in [21], while the European Commission has adapted and expanded the DOE/EPRI methodology to fit the European context [22]-[24]. However, the traditional cost benefit analysis approach does not capture all the effects involved in development policies, where intangible aspects are not secondary but dominant [25]. The main disadvantage of cost benefit analysis is the translation of all effects into a common numerical scale and a single aggregate measure. It is crucially important to ensure that project proposals are evaluated against a common reference system, to integrate the outcome of the KPI analysis and of the economic analysis, and to arrive at an overall project evaluation. Therefore, multiple criteria analysis seems to be better at measuring intangibles and soft impacts than cost benefit analysis, since it uses more than one criterion and introduces qualitative aspects into the analysis.

In order to get a thorough understanding of the status of smart grid development, the main SMART criteria (they have to be specific, measurable, attainable, relevant and time-bound) can be defined.
Starting from the eleven main benefits presented in the previous section, an adapted list of main criteria is defined in our approach, including:
• technology, covering all aspects of advanced services and new requirements imposed on the distribution and transmission network;
• costs;
• customer satisfaction, encompassing different options of customer choice, new energy services and market participation;
• environmental impact.

By introducing these four main criteria above the first set of benefits, which is defined at the base level of efficiency assessment, a higher level of assessment can be established. Relations between the levels can be set up according to the degree of their interconnectedness.

Multi-criteria methods differ in the way the idea of multiple criteria is treated. Each method has its own properties with respect to the way of assessing criteria, the application and computation of weights, the mathematical algorithm utilised, the model describing the system of preferences of the decision maker and, finally, the level of uncertainty embedded in the data set. Because of the main characteristic of the adopted smart grid evaluation framework and its complex hierarchical structure, we propose the fuzzy AHP methodology for project evaluation, which structures a decision into a hierarchy of criteria, sub-criteria and alternatives. By means of pair-wise comparisons of two (sub-)criteria or alternatives, it generates inconsistency ratios and weighting factors to prioritise the criteria and alternatives. Sensitivity analysis can be applied to test the robustness of the priorities. The main characteristics of this methodology are presented below.

3. Methodology

Thomas L. Saaty developed the original AHP in the late 1970s [26]. In this method, human judgments are represented as crisp values.
however, in many practical cases the human preference model is uncertain and decision makers cannot to assign crisp values to the comparison judgments. in these cases it is useful implementation of fuzzy ahp method. fuzzy ahp method is designed to improve decision support for uncertain valuations and priorities. in this method the data and preferences of experts are evaluated under fuzzy set environment [27]. the use of fuzzy set theory allows the decision makers to incorporate unquantifiable information, incomplete information, non-obtainable information and partially ignorant facts into decision model [28]. the basic notions of fuzzy arithmetic are given in the appendix. many authors have used fuzzy ahp method for solving problems in different areas: to solve multi-criteria problems involving qualitative data [29], [30]; water management [31]-[33]; evaluation naval tactical missile systems [34]; hazardous waste management [35]; prioritization of human capital measurement indicators [36]; shipping asset management [37]; occupational safety management [38], [39]. in this paper the fuzzy ahp method is used for smart grid projects ranking and selection, precisely because of many uncertain and non-tangible benefits and criteria involved in the smart grid projects. 636 a. janjic, s. savic, g. janackovic, m. stankovic, l. velimirovic the fuzzy ahp method involves the following steps: (1) the overall goal (objective) is identified and clearly defined; (2) the criteria, sub-criteria, and alternatives are identified; (3) the hierarchical structure is formed; (4) pair-wise comparison is made using fuzzified saaty‟s evaluation scale; (5) the priority weighting vectors are evaluated; (6) the defuzzification and the final ranking of alternatives are defined. in this study, the fuzzy ahp method is applied to the ranking of smart grid projects, according to following steps. 1. goal identification. the goal is to rank different smart grid projects. 2. 
identification of criteria, sub-criteria, and alternatives. the criteria for smart grid project selection are: technology, costs, user satisfaction and environmental protection. the sub-criteria are the project performances: sustainability; capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers; possibility of grid connection and access for all kinds of grid users; security and quality of supply; efficiency and good service in electricity supply and grid operation; effective support of transnational projects and electricity markets; transparent information to consumers. finally, the smart grid projects are identified as alternatives.
3. hierarchical structure formation. the fuzzy ahp method presents a problem in the form of a hierarchy: the first level represents the goal; the second level considers relevant criteria (four identified criteria); the third level considers relevant sub-criteria (seven identified sub-criteria); and the fourth level defines the smart grid projects.
4. pair-wise comparison. pairs of elements at each level are compared according to their relative contribution to the elements at the hierarchical level above, using the fuzzified saaty's scale, as shown in table 1.
table 1 crisp and fuzzified saaty's scale for pairwise comparisons [30].
| crisp value (x) | judgment description | fuzzy value |
|---|---|---|
| 1 | equal importance | (1, 1, 1+δ) |
| 3 | weak dominance | (3-δ, 3, 3+δ) |
| 5 | strong dominance | (5-δ, 5, 5+δ) |
| 7 | demonstrated dominance | (7-δ, 7, 7+δ) |
| 9 | absolute dominance | (9-δ, 9, 9) |
| 2, 4, 6, 8 | intermediate values | (x-1, x, x+1) |
in this paper fuzzification is implemented by triangular fuzzy numbers. a fuzzy distance of δ = 2 is used for the odd values (3, 5, 7) and a fuzzy distance of 1 for the even values (2, 4, 6, 8), as recommended in [33], because the most consistent results can be expected; on the boundaries, (1, 1, 3) is used for 1 and (7, 9, 9) is used for 9.
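the fuzzification rule of table 1 can be sketched in a few lines of python. this is a minimal illustration, not the authors' implementation: the function names `fuzzify` and `reciprocal` are hypothetical, the clamping to the interval [1, 9] is an assumption that generalises the boundary cases (1, 1, 3) and (7, 9, 9) stated above, and reciprocal judgments are assumed to follow equation (a.6) of the appendix.

```python
def fuzzify(x, delta=2):
    """Triangular fuzzy number (a, b, c) for a crisp Saaty judgment x in 1..9."""
    if x not in range(1, 10):
        raise ValueError("crisp judgment must be an integer from 1 to 9")
    # Odd values (1, 3, 5, 7, 9) use the fuzzy distance delta, even values use 1.
    spread = delta if x % 2 == 1 else 1
    # Clamp to the scale limits so that 1 -> (1, 1, 1+delta) and 9 -> (9-delta, 9, 9).
    return (max(x - spread, 1), x, min(x + spread, 9))


def reciprocal(m):
    """Fuzzy reciprocal, as in equation (a.6): (a, b, c)^-1 = (1/c, 1/b, 1/a)."""
    a, b, c = m
    return (1 / c, 1 / b, 1 / a)


if __name__ == "__main__":
    for x in range(1, 10):
        print(x, fuzzify(x), reciprocal(fuzzify(x)))
```

a judgment of 1/5 in a comparison matrix would then be represented as `reciprocal(fuzzify(5))`.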
pair-wise comparisons at each level, starting from the top of the hierarchy, are presented in the square matrix form $A = [a_{ij}]$, $i, j = 1, \dots, n$, where $a_{ij}$ is the fuzzy value of the relative importance of criterion/sub-criterion/alternative i over criterion/sub-criterion/alternative j, with $a_{ij} = 1$ for i = j and $a_{ij} = 1/a_{ji}$ for i ≠ j.
5. priority weights vectors evaluation. the ranking procedure starts with the determination of the criteria weighting vector:
$$W_C = (w_{c1}, w_{c2}, w_{c3}, w_{c4})^T. \qquad (1)$$
the elements of the criteria weighting vector, with respect to equation (a.7), are determined as:
$$w_{ci} = \sum_{j=1}^{4} a_{ij} \otimes \left[ \sum_{i=1}^{4} \sum_{j=1}^{4} a_{ij} \right]^{-1}, \quad i = 1, 2, 3, 4. \qquad (2)$$
performance weighting vectors are defined by pair-wise comparison of the performances according to every single criterion. the elements of these vectors, according to equation (a.7), are calculated as follows:
$$x_{ij} = \sum_{j=1}^{4} a_{ij} \otimes \left[ \sum_{l=1}^{7} \sum_{j=1}^{4} a_{lj} \right]^{-1}, \qquad (3)$$
where $x_{ij}$ represents the fuzzy weight of the i-th performance with respect to the j-th criterion. final performance weights are derived through the aggregation of the weights at two consecutive levels, i.e. by multiplying performance weights by criteria weights:
$$W_{SC} = X \otimes W_C = (w_{sc1}, w_{sc2}, w_{sc3}, w_{sc4}, w_{sc5}, w_{sc6}, w_{sc7})^T. \qquad (4)$$
finally, the smart grid projects are compared according to the relevant performance. the weights of projects for individual performances are determined according to equation (a.7), as follows:
$$y_{ij} = \sum_{j=1}^{7} a_{ij} \otimes \left[ \sum_{l=1}^{3} \sum_{j=1}^{7} a_{lj} \right]^{-1}, \qquad (5)$$
where $y_{ij}$ represents the fuzzy weight of the i-th project with respect to the j-th performance. final smart grid project weights are obtained by multiplying the weights of the projects by the final performance weights:
$$W_A = Y \otimes W_{SC} = (w_{a1}, w_{a2}, w_{a3})^T. \qquad (6)$$
6. defuzzification and the final ranking of alternatives.
in this paper triangular fuzzy numbers are ranked by applying the total integral value method. this method is used for ranking the smart grid projects according to a moderate and an optimistic attitude toward risk.
4. results and discussion
the proposed methodology is illustrated on the choice of the smart grid deployment strategy in a hypothetical power distribution company of medium size. the company is assumed to supply 50 000 consumers, and the list of alternatives with the description of proposed actions and appropriate indicators is given in table 2.
table 2 different development alternatives.
| no | description of the proposed action | performance indicator | alternative 1 | alternative 2 | alternative 3 |
|---|---|---|---|---|---|
| 1 | advanced meter installation | number of advanced meters installed | 20 000 | 10 000 | 5 000 |
| 2 | substation automation | percentage of substations applying automation technologies | 20% | 30% | 40% |
| 3 | introduction of dynamic line rating technology | number of lines operated under dynamic line ratings | 2 | 3 | 4 |
| 3 | introduction of dynamic line rating technology | percentage of kilometres of transmission circuits operated under dynamic line ratings | 15% | 20% | 15% |
| 4 | solar power plant connection | total installed power (MW) | 3 | 5 | 7 |
three alternatives are evaluated, encompassing four activities that introduce new technologies in the distribution network: replacement of old meters with remotely read meters; remote control and integration of substations into the scada system; dynamic line rating of transmission lines; construction of a new photovoltaic plant embedded in the distribution network. all activities are planned within the same approximate budget of 5 000 000 €, and the planners proposed three different development strategies. using the presented methodology, experts (in the field of smart grid technologies and multi-criteria decision-making) ranked three smart grid projects whose characteristics are presented in table 3.
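as an illustration of how the presented methodology is applied numerically, the following python sketch runs steps 4 to 6 for the four criteria, using the crisp judgments reported in table 4. it is a minimal sketch under stated assumptions, not the authors' code: all function names are hypothetical, diagonal entries are taken as the crisp number (1, 1, 1), lower-triangle entries are taken as fuzzy reciprocals per equation (a.6), and the final weights are assumed to be the total integral values of equation (a.11) normalised to sum to one. under these assumptions the computed values agree with the fuzzy weights and final weights listed in table 4 to within rounding.

```python
def fuzzify(x, delta=2):
    """Fuzzified Saaty scale of table 1 for an integer judgment x in 1..9."""
    spread = delta if x % 2 == 1 else 1
    return (max(x - spread, 1), x, min(x + spread, 9))


def fuzzy_matrix(upper, delta=2):
    """Build the reciprocal fuzzy matrix from judgments above the diagonal.

    Assumption: the diagonal is crisp (1, 1, 1) and entry (j, i) is the
    fuzzy reciprocal of entry (i, j), as in equation (a.6).
    """
    n = len(upper)
    m = [[(1.0, 1.0, 1.0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b, c = fuzzify(upper[i][j], delta)
            m[i][j] = (a, b, c)
            m[j][i] = (1 / c, 1 / b, 1 / a)
    return m


def synthetic_extent(m):
    """Fuzzy weights by the synthetic extent of equation (a.7)."""
    rows = [tuple(sum(cell[k] for cell in row) for k in range(3)) for row in m]
    total = tuple(sum(r[k] for r in rows) for k in range(3))
    # (a, b, c) (x) (A, B, C)^-1 = (a / C, b / B, c / A)
    return [(r[0] / total[2], r[1] / total[1], r[2] / total[0]) for r in rows]


def total_integral(m, lam):
    """Total integral value of equation (a.11) with optimism index lam."""
    a, b, c = m
    return 0.5 * (lam * c + b + (1 - lam) * a)


def crisp_weights(fuzzy_weights, lam):
    """Defuzzify and normalise (the normalisation is an assumption)."""
    values = [total_integral(w, lam) for w in fuzzy_weights]
    s = sum(values)
    return [v / s for v in values]


# Crisp judgments above the diagonal, taken from table 4 (c1..c4).
CRITERIA = [
    [1, 3, 5, 5],
    [None, 1, 3, 3],
    [None, None, 1, 1],
    [None, None, None, 1],
]

if __name__ == "__main__":
    w_c = synthetic_extent(fuzzy_matrix(CRITERIA))
    for name, w in zip(("c1", "c2", "c3", "c4"), w_c):
        print(name, tuple(round(v, 4) for v in w))
    for lam in (0.5, 1.0):
        print(lam, [round(v, 4) for v in crisp_weights(w_c, lam)])
```

only the upper triangle of `CRITERIA` is read, so the placeholders below the diagonal are never used.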
the proposed set of actions brings both qualitative and quantitative benefits. for instance, the increased number of advanced meters installed in the first alternative will strongly affect both the adequacy of grid connection, because of the enhanced low voltage network management, and the transparency of information to consumers. the quantitative aggregated performance indicators for the different alternatives are calculated and presented in table 3.
table 3 quantitative aggregated performance indicators for different alternatives.
| no | performance indicator | alternative 1 | alternative 2 | alternative 3 |
|---|---|---|---|---|
| 1 | energy losses reduction [MWh/year] | 3000 | 8000 | 11000 |
| 2 | quantified reduction of carbon emissions (t) | 5 400 | 14 000 | 19 000 |
| 3 | probability of injuries reduction (in percentage) | 10 | 15 | 20 |
although the calculation of these parameters is outside the scope of this paper, the relation between the proposed actions and the expected results is straightforward. energy loss reduction is caused by the dynamic line rating, which enables more economic line loading, and by the connection of the photovoltaic plant (row 1). this renewable source reduces the carbon emissions according to the installed plant power (row 2). finally, the automation of substations reduces the probability of injuries during equipment manipulation (row 3).
experts first performed pair-wise comparison of the following criteria: technology (c1), costs (c2), customer satisfaction (c3) and environmental protection (c4). the results of the comparison, fuzzy weights, final weights (fws) and ranks of criteria are shown in table 4.
table 4 the pairwise comparison, fuzzy weights, final weights and ranks of criteria.
| | c1 | c2 | c3 | c4 | fuzzy weights wci | fws (λ=0.5) | rank (λ=0.5) | fws (λ=1.0) | rank (λ=1.0) |
|---|---|---|---|---|---|---|---|---|---|
| c1 | 1 | 3 | 5 | 5 | (0.1967, 0.5303, 1.3141) | 0.5096 | 1 | 0.5023 | 1 |
| c2 | 1/3 | 1 | 3 | 3 | (0.0787, 0.2778, 0.7885) | 0.2819 | 2 | 0.2904 | 2 |
| c3 | 1/5 | 1/3 | 1 | 1 | (0.0576, 0.0960, 0.3504) | 0.1189 | 3 | 0.1216 | 3 |
| c4 | 1/5 | 1/3 | 1 | 1 | (0.0412, 0.0960, 0.2190) | 0.0896 | 4 | 0.0858 | 4 |
then the experts compared the following performance indicators in relation to every criterion: sustainability (sc1), capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers (sc2), possibility of grid connection and access for all kinds of grid users (sc3), security and quality of supply (sc4), efficiency and good service in electricity supply and grid operation (sc5), effective support of transnational projects and electricity markets (sc6), and transparent information to consumers (sc7). this step is necessary because economic, social and political conditions differ between distribution companies. as stated above, the pairwise comparison made by the experts relies on both qualitative and quantitative indicators. for instance, the security sub-criterion (sc4) can be supported by the reduction of injuries (table 3), while the market development sub-criterion (sc6) is much more susceptible to the subjective judgments of the experts. the results are presented in tables 5 to 8.
table 5 the pairwise comparison matrix of sub-criteria in relation to the technology.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi1 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 1/3 | 3 | 1/7 | 1/7 | 1/5 | 1/5 | (0.0179, 0.0489, 0.1307) |
| sc2 | 3 | 1 | 5 | 1/5 | 1/5 | 1/3 | 1/3 | (0.0376, 0.0981, 0.2539) |
| sc3 | 1/3 | 1/5 | 1 | 1/7 | 1/7 | 1/5 | 1/5 | (0.0122, 0.0216, 0.0551) |
| sc4 | 7 | 5 | 7 | 1 | 1 | 3 | 3 | (0.1125, 0.2631, 0.6320) |
| sc5 | 7 | 5 | 7 | 1 | 1 | 3 | 3 | (0.1081, 0.2631, 0.5996) |
| sc6 | 5 | 3 | 5 | 1/3 | 1/3 | 1 | 1 | (0.0622, 0.1526, 0.4051) |
| sc7 | 5 | 3 | 5 | 1/3 | 1/3 | 1 | 1 | (0.0578, 0.1526, 0.3727) |
table 6 the pairwise comparison matrix of sub-criteria in relation to the costs.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi2 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 1/5 | 3 | 1/5 | 1/5 | 1/3 | 5 | (0.0364, 0.0939, 0.2361) |
| sc2 | 5 | 1 | 7 | 3 | 3 | 5 | 7 | (0.1230, 0.2929, 0.6769) |
| sc3 | 1/3 | 1/7 | 1 | 1/5 | 1/5 | 1/3 | 3 | (0.0181, 0.0492, 0.1396) |
| sc4 | 5 | 1/3 | 5 | 1 | 1 | 3 | 7 | (0.0919, 0.2110, 0.5195) |
| sc5 | 5 | 1/3 | 5 | 1 | 1 | 3 | 7 | (0.0876, 0.2110, 0.4880) |
| sc6 | 3 | 1/5 | 3 | 1/3 | 1/3 | 1 | 5 | (0.0424, 0.1216, 0.3201) |
| sc7 | 1/5 | 1/7 | 1/3 | 1/7 | 1/7 | 1/5 | 1 | (0.0118, 0.0204, 0.0514) |
table 7 the pairwise comparison matrix of sub-criteria in relation to the customer satisfaction.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi3 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 1 | 1/7 | 1/7 | 1/3 | 1/3 | 1/5 | (0.0186, 0.0319, 0.1164) |
| sc2 | 1 | 1 | 1/7 | 1/7 | 1/3 | 1/3 | 1/5 | (0.0141, 0.0319, 0.0819) |
| sc3 | 7 | 7 | 1 | 1 | 5 | 3 | 3 | (0.1145, 0.2730, 0.6744) |
| sc4 | 7 | 7 | 1 | 1 | 5 | 3 | 3 | (0.1100, 0.2730, 0.6399) |
| sc5 | 3 | 3 | 1/5 | 1/5 | 1 | 1/3 | 1/5 | (0.0244, 0.0802, 0.2248) |
| sc6 | 3 | 3 | 1/3 | 1/3 | 3 | 1 | 1/3 | (0.0310, 0.1112, 0.3286) |
| sc7 | 5 | 5 | 1/3 | 1/3 | 5 | 3 | 1 | (0.0768, 0.1988, 0.5015) |
table 8 the pairwise comparison matrix of sub-criteria in relation to the environmental protection.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi4 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 3 | 3 | 5 | 5 | 3 | 7 | (0.1101, 0.3115, 0.8074) |
| sc2 | 1/3 | 1 | 1 | 3 | 3 | 1 | 5 | (0.0602, 0.1654, 0.5176) |
| sc3 | 1/3 | 1 | 1 | 3 | 3 | 1 | 5 | (0.0553, 0.1654, 0.4762) |
| sc4 | 1/5 | 1/3 | 1/3 | 1 | 1 | 1/3 | 1/5 | (0.0422, 0.0946, 0.2967) |
| sc5 | 1/5 | 1/3 | 1/3 | 1 | 1 | 1/3 | 3 | (0.0226, 0.0715, 0.2139) |
| sc6 | 1/3 | 1 | 1 | 3 | 3 | 1 | 5 | (0.0504, 0.1654, 0.4348) |
| sc7 | 1/7 | 1/5 | 1/5 | 5 | 1/3 | 1/5 | 1 | (0.0138, 0.0263, 0.0732) |
the final vector of fuzzy weights of the performance of the projects, according to equation (4) and tables 4-8, is:
$$W_{SC} = X \otimes W_C = [x_{ij}]_{7 \times 4} \otimes [w_{ci}]_{4 \times 1} = [w_{sci}]_{7 \times 1} = \begin{bmatrix} (0.0120, 0.0850, 0.5756) \\ (0.0204, 0.1523, 1.0094) \\ (0.0127, 0.0672, 0.5231) \\ (0.0374, 0.2334, 1.5294) \\ (0.0305, 0.2127, 1.2984) \\ (0.0194, 0.4113, 0.9951) \\ (0.0173, 0.1082, 0.7221) \end{bmatrix} \qquad (7)$$
at the end, three smart grid projects (project 1 [a1], project 2 [a2], and project 3 [a3]) are compared in relation to the performance presented in tables 3 and 4, as shown in table 9.
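the aggregation of equation (4) that produces the vector in equation (7) can be checked with a short python sketch. this is an illustration under stated assumptions rather than the authors' procedure: the helper names are hypothetical, the fuzzy product and sum are the approximate operations of equations (a.4) and (a.5), and the inputs are the rounded fuzzy weights printed in tables 4, 5, 6, 7 and 8, so the results match the first and fourth rows of equation (7) only to within rounding.

```python
def fuzzy_add(m1, m2):
    """Fuzzy addition of triangular numbers, equation (a.4)."""
    return tuple(x + y for x, y in zip(m1, m2))


def fuzzy_mul(m1, m2):
    """Approximate fuzzy multiplication of positive triangular numbers, equation (a.5)."""
    return tuple(x * y for x, y in zip(m1, m2))


def aggregate(local_weights, parent_weights):
    """One row of X (x) W_C: sum over criteria j of x_ij (x) w_cj."""
    total = (0.0, 0.0, 0.0)
    for x, w in zip(local_weights, parent_weights):
        total = fuzzy_add(total, fuzzy_mul(x, w))
    return total


# Fuzzy criteria weights w_c1..w_c4 as printed in table 4.
W_C = [(0.1967, 0.5303, 1.3141), (0.0787, 0.2778, 0.7885),
       (0.0576, 0.0960, 0.3504), (0.0412, 0.0960, 0.2190)]

# Fuzzy weights of sc1 and sc4 under each criterion (rows of tables 5 to 8).
X_SC1 = [(0.0179, 0.0489, 0.1307), (0.0364, 0.0939, 0.2361),
         (0.0186, 0.0319, 0.1164), (0.1101, 0.3115, 0.8074)]
X_SC4 = [(0.1125, 0.2631, 0.6320), (0.0919, 0.2110, 0.5195),
         (0.1100, 0.2730, 0.6399), (0.0422, 0.0946, 0.2967)]

if __name__ == "__main__":
    print("w_sc1 =", tuple(round(v, 4) for v in aggregate(X_SC1, W_C)))
    print("w_sc4 =", tuple(round(v, 4) for v in aggregate(X_SC4, W_C)))
```

the same `aggregate` helper applies unchanged to equation (6), with the rows of table 9 as local weights and the vector of equation (7) as parent weights.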
table 9 the pairwise comparison of alternatives in relation to performance.
| sc | alternative | a1 | a2 | a3 | fuzzy weights yij |
|---|---|---|---|---|---|
| sc1 | a1 | 1 | 1/3 | 1/5 | (0.0601, 0.1031, 0.2731) |
| sc1 | a2 | 3 | 1 | 1/3 | (0.0985, 0.2915, 0.8194) |
| sc1 | a3 | 5 | 3 | 1 | (0.2239, 0.6054, 1.5217) |
| sc2 | a1 | 1 | 1 | 1 | (0.2000, 0.3333, 1.0000) |
| sc2 | a2 | 1 | 1 | 1 | (0.1556, 0.3333, 0.7143) |
| sc2 | a3 | 1 | 1 | 1 | (0.1111, 0.3333, 0.4286) |
| sc3 | a1 | 1 | 3 | 5 | (0.2239, 0.6054, 1.5217) |
| sc3 | a2 | 1/3 | 1 | 3 | (0.0985, 0.2915, 0.8194) |
| sc3 | a3 | 1/5 | 1/3 | 1 | (0.0601, 0.1031, 0.2731) |
| sc4 | a1 | 1 | 1 | 1/3 | (0.1158, 0.2000, 0.7426) |
| sc4 | a2 | 1 | 1 | 1/3 | (0.0807, 0.2000, 0.4455) |
| sc4 | a3 | 3 | 3 | 1 | (0.1579, 0.6000, 1.6337) |
| sc5 | a1 | 1 | 1 | 1/3 | (0.1158, 0.2000, 0.7426) |
| sc5 | a2 | 1 | 1 | 1/3 | (0.0807, 0.2000, 0.4455) |
| sc5 | a3 | 3 | 3 | 1 | (0.1579, 0.6000, 1.6337) |
| sc6 | a1 | 1 | 1 | 1/3 | (0.0667, 0.1282, 0.4545) |
| sc6 | a2 | 1 | 1 | 1/3 | (0.1048, 0.3333, 1.0606) |
| sc6 | a3 | 3 | 3 | 1 | (0.1429, 0.5385, 1.6667) |
| sc7 | a1 | 1 | 1/5 | 1/5 | (0.0593, 0.0909, 0.1570) |
| sc7 | a2 | 5 | 1 | 1 | (0.2308, 0.4545, 1.0359) |
| sc7 | a3 | 5 | 1 | 1 | (0.2000, 0.4545, 0.8475) |
the final vector of fuzzy weights for the smart grid projects, according to equation (6), is:
$$W_A = Y \otimes W_{SC} = [y_{ij}]_{3 \times 7} \otimes [w_{sci}]_{7 \times 1} = \begin{bmatrix} (0.0168, 0.2079, 4.5137) \\ (0.0136, 0.2445, 4.1205) \\ (0.0190, 0.4394, 7.4556) \end{bmatrix} \qquad (8)$$
after the defuzzification of the final weight vectors of performance and projects, according to equation (a.11), the performance and the smart grid projects are ranked. ranking results are shown in table 10 (fws are final weights).
table 10 ranking of project performance and smart grid projects.
| | fws (λ=0.5) | rank (λ=0.5) | fws (λ=1.0) | rank (λ=1.0) |
|---|---|---|---|---|
| project performance | | | | |
| sustainability (sc1) | 0.0861 | 6 | 0.0863 | 6 |
| capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers (sc2) | 0.1516 | 3 | 0.1518 | 3 |
| possibility of grid connection and access for all kinds of grid users (sc3) | 0.0761 | 7 | 0.0771 | 7 |
| security and quality of supply (sc4) | 0.2310 | 1 | 0.2303 | 1 |
| efficiency and good service in electricity supply and grid operation (sc5) | 0.1993 | 2 | 0.1974 | 2 |
| effective support of transnational projects and electricity markets (sc6) | 0.1473 | 4 | 0.1485 | 4 |
| transparent information to consumers (sc7) | 0.1086 | 5 | 0.1085 | 5 |
| smart grid projects | | | | |
| project 1 (a1) | 0.2669 | 2 | 0.2781 | 2 |
| project 2 (a2) | 0.2580 | 3 | 0.2570 | 3 |
| project 3 (a3) | 0.4661 | 1 | 0.4647 | 1 |
based on the previous results, we can conclude the following:
1. the most important criterion for the selection of a smart grid project (for this particular distribution company) is the selected technology, followed by the costs, the customer satisfaction and the environmental protection (table 4). advanced technology increases the efficiency and security of energy supply, and thus increases user satisfaction and protects the environment.
2. in relation to the technology, the best ranked performance is security and quality of supply; in relation to the costs, capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers; in relation to the user satisfaction, possibility of grid connection and access for all kinds of grid users; and in relation to the environmental protection, sustainability.
3.
the final ranking of the project performance, based on all criteria, is:
• security and quality of supply
• efficiency and good service in electricity supply and grid operation
• capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers
• effective support of transnational projects and electricity markets
• transparent information to consumers
• sustainability
• possibility of grid connection and access for all kinds of grid users.
the best-ranked performances (security and quality of supply, and efficiency and good service in electricity supply and grid operation) are supported by the advanced technology.
4. the final ranking of the alternatives in table 10 indicates that the a3 project has the highest rank, followed by the a1 project; the a2 project has the lowest priority. this means that project 3 should be selected for the implementation of the smart grid.
5. conclusion
in this paper, starting from a general set of smart grid performance indicators, a new assessment framework for the evaluation of smart grid efficiency has been established, as one of the main conditions for the successful implementation of any energy management program. using the fuzzy ahp methodology with four main criteria and seven sub-criteria derived from the adopted set of smart grid benefits, we showed that the method is well suited to the evaluation of alternatives in the presence of heterogeneous criteria. this method allows the decision makers to incorporate unquantifiable information, incomplete information, non-obtainable information and partially ignorant facts into the decision model. the proposed methodology is illustrated on the choice of the smart grid deployment strategy in a medium size power distribution company.
the analysis shows that the dominant criterion in selecting the optimal smart grid project is the selected technology, followed by the costs, the customer satisfaction and the environmental protection. this methodology is applied to the general assessment of smart grid efficiency, while further research will be focused on particular aspects of the project implementation.
acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under grant iii 42006 and grant iii 44006.
appendix
a.1. fuzzy set, triangular fuzzy number and fuzzy arithmetic
the mathematical basis of the fuzzy ahp method consists of fuzzy sets and fuzzy arithmetic. in [40] a fuzzy set A is defined by the degree of membership $\mu_A(x)$ over a universe of discourse X as:
$$\mu_A(x): X \to [0, 1] \qquad (a.1)$$
a fuzzy number is a convex and normalized fuzzy set $A = \{(x, \mu_A(x))\}$, $x \in R$. a triangular fuzzy number can be denoted as $M = (a, b, c)$, and its membership function is:
$$\mu_M(x) = \begin{cases} \dfrac{x - a}{b - a}, & x \in [a, b] \\ \dfrac{c - x}{c - b}, & x \in [b, c] \\ 0, & \text{otherwise} \end{cases} \qquad (a.2)$$
where $a \le b \le c$, a and c stand for the lower and upper value of the support of M, respectively, and b is the modal value. when $a = b = c$, it is a "normal", crisp number. fuzzy arithmetic is based on zadeh's extension principle. if $f: X \to Y$ is a function, and A is a fuzzy set in X, then $f(A)$ is defined as:
$$\mu_{f(A)}(y) = \sup_{x \in X,\ f(x) = y} \mu_A(x), \qquad (a.3)$$
where $y \in Y$. the main laws for operations on two triangular fuzzy numbers $M_1 = (a_1, b_1, c_1)$ and $M_2 = (a_2, b_2, c_2)$ are:
$$M_1 \oplus M_2 = (a_1, b_1, c_1) \oplus (a_2, b_2, c_2) = (a_1 + a_2, b_1 + b_2, c_1 + c_2), \qquad (a.4)$$
$$M_1 \otimes M_2 = (a_1, b_1, c_1) \otimes (a_2, b_2, c_2) = (a_1 a_2, b_1 b_2, c_1 c_2), \quad a_1, a_2 > 0, \qquad (a.5)$$
$$M_1^{-1} = (a_1, b_1, c_1)^{-1} = (c_1^{-1}, b_1^{-1}, a_1^{-1}). \qquad (a.6)$$
a.2.
fuzzy synthetic extent
the value of the fuzzy synthetic extent, according to chang's extent analysis method, is defined as [41]:
$$S_i = \sum_{j=1}^{m} M_{g_i}^{j} \otimes \left[ \sum_{i=1}^{n} \sum_{j=1}^{m} M_{g_i}^{j} \right]^{-1}, \quad i = 1, 2, \dots, n, \qquad (a.7)$$
where $M_{g_i}^{j}$ is a triangular fuzzy number representing the extent analysis value for decision element i with respect to goal j, and $\otimes$ is the fuzzy multiplication operator. the sums in equation (a.7) are determined using equations (a.4) and (a.6):
$$\sum_{j=1}^{m} M_{g_i}^{j} = \left( \sum_{j=1}^{m} a_j, \sum_{j=1}^{m} b_j, \sum_{j=1}^{m} c_j \right) = (a_i, b_i, c_i), \qquad (a.8)$$
$$\sum_{i=1}^{n} \sum_{j=1}^{m} M_{g_i}^{j} = \left( \sum_{i=1}^{n} a_i, \sum_{i=1}^{n} b_i, \sum_{i=1}^{n} c_i \right), \qquad (a.9)$$
$$\left[ \sum_{i=1}^{n} \sum_{j=1}^{m} M_{g_i}^{j} \right]^{-1} = \left( \frac{1}{\sum_{i=1}^{n} c_i}, \frac{1}{\sum_{i=1}^{n} b_i}, \frac{1}{\sum_{i=1}^{n} a_i} \right). \qquad (a.10)$$
a.3. total integral value method for defuzzification
for the given triangular fuzzy number $M = (a, b, c)$ the total integral value is defined as follows [38]:
$$I_T^{\lambda}(M) = 0.5 \, (\lambda c + b + (1 - \lambda) a), \quad \lambda \in [0, 1], \qquad (a.11)$$
where λ represents an optimism index. it describes the decision maker's attitude toward risk. the values 0, 0.5 and 1 are used to represent the pessimistic, moderate and optimistic views of the decision maker, respectively. if $I_T^{\lambda}(M_1) < I_T^{\lambda}(M_2)$, then $M_1 < M_2$; if $I_T^{\lambda}(M_1) = I_T^{\lambda}(M_2)$, then $M_1 = M_2$; if $I_T^{\lambda}(M_1) > I_T^{\lambda}(M_2)$, then $M_1 > M_2$.
references
[1] v. giordano, s. vitiello, j. vasiljevska, definition of an assessment framework for projects of common interest in the field of smart grids, jrc science and policy reports, 2014.
[2] d. tasic et al., "conception of low voltage network loss reduction based on integrated information", facta universitatis, series: electronics and energetics, vol. 24, no. 1, pp. 59-71, april 2011.
[3] european commission, r&d investment in the priority technologies of the set-plan, sec, 1296, 2009.
[4] the european electricity grid initiative (eegi), a joint tso-dso contribution to the european industrial initiative (eii) on electricity networks, 2009. [5] european commission, toward smart power networks, lessons learned from european research fp5 projects, 2005. [6] european commission, strategic research agenda for europe electricity networks of the future european technology platform, 2007. [7] european commission, 2010. strategic deployment document for europe electricity networks of the future european technology platform. available from www.smartgrids.eu/documents/smartgrids_ sdd_final_april2010.pdf [8] european network for the security of control and real time systems, r&d and standardization road map, final deliverable 3.2, 2011. [9] commission of european communities, green paper – a european strategy for sustainable, competitive and secure energy, brussels, 2006. [10] institute for 21 st century energy, index of u.s. energy security risk, assessing america‟s vulnerabilities in a global energy market, 2011. [11] b. falahati, f. yong, w. lei, "reliability assessment of smart grid considering direct cyber-power interdependencies," ieee transactions on smart grid, vol. 3, no. 3, pp. 1515-1524, september 2012. [12] b. falahati, f. yong, "reliability assessment of smart grids considering indirect cyber-power interdependencies," ieee transactions on smart grid, vol. 5, no. 4, pp. 1677-1685, july 2014. [13] m. dimitrijevic et al. “implementation of artificial neural networks based ai concepts to the smart grids”, facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 411-424, september 2014. [14] z. lin, l. furong, g. chenghong, h. zechun, b. le, "cost/benefit assessment of a smart distribution system with intelligent electric vehicle charging," ieee transactions on smart grid, vol. 5, no. 2, pp. 839847, march 2014 [15] a. janjic and l. z. 
velimirovic, “optimal scheduling of utility electric vehicle fleet offering ancillary services”, etri journal, vol. 37, no. 2, april 2015. [16] european commission task force for smart grids, 2010a. “expert group 2: regulatory recommendations for data safety, data handling and data protection”, available from http://ec.europa.eu/energy/gas_electricity/smartgrids/doc/expert_group2.pdf [17] european commission task force for smart grids, 2010b. “expert group 3:roles and responsibilities”, available from http://ec.europa.eu/energy/gas_electricity/smartgrids/doc/expert_group3.pdf [18] european electricity grid initiative (eegi), roadmap 2010-18 and detailed implementation plan 201012, 2010. available from http://ec.europa.eu/energy/technology/initiatives/doc/grid_implementation_ plan_final.pdf [19] u.s. department of energy (doe), guidebook for arra sgdp/rdsi metrics and benefits, doe report, 2010. available from http://www.smartgrid.gov/sites/default/files/pdfs/sgdp_rdsi_metrics_ benefits.pdf [20] d. stevanovic, p. petkovic, “utility needs smarter power meters in order to reduce economic losses” facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 407-421, september 2015. [21] b. dupont, l. meeus, and r. belmans, "measuring the “smartness” of the electricity grid", in proc. of the 7th international conference on the energy market (eem), pp. 1-6, 2010. [22] epri (electric power research institute), "methodological approach for estimating the benefits and costs of smart grid demonstration projects", palo alto, ca: epri, 1020342, 2010. [23] european commission, guidelines for conducting cost-benefit analysis of smart grid projects. reference report joint research centre, institute for energy and transport, 2012. available from http://ses.jrc.ec. europa.eu/ [24] european commission, guidelines for cost-benefit analysis of smart metering deployment. scientific and policy report joint research centre, institute for energy and transport, 2012. 
available from http://ses.jrc.ec.europa.eu/
[25] p. beria, i. maltese, and i. mariotti, "multi-criteria versus cost benefit analysis: a comparative perspective in the assessment of sustainable mobility", eur. transp. res. rev., vol. 4, pp. 137-152, 2012.
[26] t.l. saaty, the analytic hierarchy process. new york: mcgraw-hill, 1980.
[27] o. duru, e. bulut, and s. yoshida, "regime switching fuzzy ahp model for choice-varying priorities problem and expert consistency prioritization: a cubic fuzzy-priority matrix design", expert systems with applications, vol. 39, pp. 4954-4964, 2012.
[28] o. kulak, b. durmusoglu, and c. kahraman, "fuzzy multi-attribute equipment selection based on information axiom", journal of materials processing technology, vol. 169, pp. 337–345, 2005.
[29] p. j. m. van laarhoven, w. pedrycz, "a fuzzy extension of saaty's priority theory", fuzzy sets and systems, vol. 11, pp. 229-241, 1983.
[30] j.
buckley, "fuzzy hierarchical analysis", fuzzy sets and systems, vol. 17, no. 3, pp. 233-247, 1985. [31] l. fatti, water research planning in south africa. in: b. golden et al. (eds) application of the analytic hierarchy process, springer: new york, pp. 122–137, 1989. [32] m. ridgley, "a multicriteria approach to allocating water under drought", resource management and optimization, vol. 92, pp. 112–132, 1993. [33] b. srdjevic and y. medeiros, "fuzzy ahp assessment of water management plans", water resources management, vol. 22, pp. 877-894, 2008. [34] c.h. cheng, "evaluating naval tactical missile systems by fuzzy ahp based on the grade value of membership function", european journal of operational research, vol. 96, pp. 343-350, 1996. [35] a.t. gumus, "evaluation of hazardous waste transportation forms by using a two step fuzzy ahp and topsis methodology", expert systems with applications, vol. 36, no. 2, pp. 4067-4074, 2009. [36] f. bozbura, a. beskese, and c. kahraman, "prioritization of human capital measurement indicators using fuzzy ahp", expert systems with applications, vol. 32, pp. 1100-1112, 2007. [37] e. bulut, o. duru, t. kececi, and s. yoshida, "use of consistency index, expert prioritization and direct numerical inputs for generic fuzzy-ahp modeling: a process model for shipping asset management", expert systems with applications, vol. 39, pp. 1911-1923, 2012. [38] m. dağdeviren and i. yüksel, "developing a fuzzy analytic hierarchy process (ahp) model for behaviorbased safety management", information sciences, vol. 178, no. 6, pp. 1717-1733, 2008. [39] g. janackovic, s. savic, and m. stankovic, "selection and ranking of occupational safety indicators based on fuzzy ahp: case study in road construction companies", south african journal of industrial engineering, vol. 24, no. 3, pp. 175-189, 2013. [40] l.a. zadeh, "fuzzy sets", information and control, vol. 8, pp. 338-353, 1965. [41] d.y. 
chang, "applications of the extent analysis method on fuzzy ahp", european journal of operational research, vol. 95, pp. 649–655, 1996.
facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp. 51-61 https://doi.org/10.2298/fuee1801051j
feedback linearization for decoupled position/stiffness control of bidirectional antagonistic drives*
kosta jovanović 1, branko lukić 1, veljko potkonjak 2
1 laboratory for robotics etf robotics, school of electrical engineering, university of belgrade, serbia
2 school of information technologies, metropolitan university, belgrade, serbia
abstract. to ensure safe human-robot interaction, impedance control of robots has arisen as one of the key challenges in robotics. this paper elaborates the control of a bidirectional antagonistic drive, the qbmove maker pro. owing to their mechanical structure, the position and stiffness of bidirectional antagonistic drives can be controlled independently. to that end, we applied feedback linearization, which decouples the system into two linear single-input-single-output subsystems: a position subsystem and a stiffness subsystem. the paper elaborates the preconditions for feedback linearization and its implementation. it presents simulation results that prove the concept, but also points out application issues due to the complex mechanical structure of bidirectional antagonistic drives.
key words: bidirectional antagonistic drives, variable stiffness actuators, puller-follower control, stiffness control.
1. introduction
this paper extends the approach for stiffness control of classical antagonistic drives in robotics [1] to bidirectional antagonistic drives. the long-term desire of scientists to design and build a faithful copy of a human being finally coincides with the latest efforts in in-house service robotics to design a robot which fully matches the house environment.
because humans shape their living environment to fully meet their comfort and necessities, home robots have to be built to fit such areas, and therefore they must move and behave in the same manner as humans.
received november 16, 2016; received in revised form may 4, 2017
corresponding author: kosta jovanović, laboratory for robotics etf robotics, school of electrical engineering, university of belgrade, 11000 belgrade, serbia (e-mail: kostaj@etf.rs)
*initial research related to this paper received the best paper award at the 3rd international conference on electrical, electronic and computing engineering (icetran '16), section robotics and flexible automation [1].
therefore, there are a number of current research projects with the ultimate goal of creating musculoskeletal, or so-called anthropomimetic, robots [2], [3]. the most popular among them are the japanese robot kenshiro [4] and eccerobot, the anthropomimetic robot of a european consortium [5]. following the anthropomimetic approach, the key issues are human-like actuators and their control. the design of an anthropomimetic actuator has to follow guidelines set by its human paragon: it should be tendon driven and compliant (a variable stiffness actuator, vsa, with changeable compliance), and therefore it has to be driven by at least two motors in order to control both position and compliance (the opposite of stiffness). the control of such drives, which are inevitably multivariable and non-linear, has to be reliable, safe and robust. this paper presents one instance of a bio-inspired robotic drive of changeable stiffness, the bidirectional qbmove maker pro, and an approach to controlling such a drive, initially based on our work on the puller-follower approach [6]. a brief overview of bidirectional antagonistic joints in robotics, as well as of our target one, is given in section 2.
our group at the robotics laboratory of the school of electrical engineering, university of belgrade, pays special attention to the control of novel bio-inspired robot actuators in general, and to the control of bidirectional antagonistic drives as one of the instances available in the laboratory. a generalized puller-follower approach to the control of the qbmove maker pro, based on feedback linearization, is introduced in section 3. the validity of the proposed control algorithm is demonstrated via simulation in section 4. section 5 draws conclusions about prospective applications of the proposed methodology, outlines future work and points out alternative, already tested approaches for stiffness control of bidirectional antagonistic drives.

2. bidirectional antagonistic drives qbmove maker pro

antagonistic actuators are a subgroup of vsas that mimics the biological paragon of mammals. although classical antagonistic actuation is the prime example of fully biologically inspired actuation, engineers have lately turned to bidirectional antagonistic actuation as a big step towards real antagonistic actuation. the most significant advantage of bidirectional antagonistic actuation is the bidirectional torque achieved by two antagonistically coupled motors. namely, both motors can either pull or push, contrary to classical antagonistic, tendon-driven actuators and human muscles. therefore, slacking of the tendons is not possible, and controllability of such drives is ensured. pioneering works in antagonistic actuation exploited the intrinsic compliance of hydraulic and pneumatic actuators as antagonistically coupled drives. the first widely known implementations of antagonistic drives were therefore: the utah/m.i.t. dexterous hand [7], mckibben pneumatic artificial muscles in antagonistic arrangements [8] such as the work of tondu et al. [9] or boblan et al.
[10], biped walking robots with antagonistically actuated joints at waseda university [11], or the european pneumatic biped lucy built at vrije university of brussels [12]. in parallel, electric drives have been gradually developed and have prevailed in antagonistic drives because of the control issues that arise when pneumatic actuators are employed [13]. to achieve variable stiffness, a non-linear tendon transmission has to be designed [14]. the non-linear transmission can be obtained either by placing non-linear elastic elements ([15] and [16]) or by placing linear elastic elements together with a controlled system dedicated to shaping the non-linearity in the transmission. the latter approach was employed by migliore [17], hurst [18], and tonietti [19]. in this research we opted for the first approach. although vsa is a topic of increasing importance for safe human-robot interaction, only a limited number of vsas are available on the market due to high costs and complex mechanical design. with the idea of bringing an instance of such a compliant actuator to a broad audience of researchers and academia, the natural motion initiative [20] developed the qbmove maker series of actuators. their latest prototype, the qbmove maker pro, is a low-cost, 3d-printed bidirectional spring antagonistic actuator which is affordable and has all the features of a bidirectional antagonistic vsa. all parts of the actuator are off-the-shelf and can either be purchased from the natural motion initiative, or their models can be downloaded from the internet free of charge. furthermore, all software dedicated to real-time control of the qbmove maker pro is open source [21]. a prototype of the qbmove maker pro actuator and its functional scheme are depicted in fig. 1. therefore, both motors can contribute to the overall shaft torque symmetrically.
this is the basic difference compared to the traditional antagonistic structure, where each motor can contribute in only one direction due to the pulling constraint. the joint shaft and the motors are coupled via non-linear springs. the non-linear force-deflection characteristic is of fundamental importance since it enables variable stiffness of the joint, which depends on spring pretensions [17]. experiments which confirm this non-linear coupling are given in [22].

fig. 1 qbmove maker pro: prototype (left), functional scheme (right)

a mathematical model of the qbmove maker pro actuator is given by equations (1)-(7). because of the non-linearity of the force-deflection characteristic, a relatively small displacement of the motor positions and/or the output shaft induces a significant change in stiffness at high stiffness values. equation (1) describes the joint/shaft dynamics and equations (2)-(3) the motor dynamics. the resulting driving torques are given by (4)-(7). here $q$ is the joint position, $\theta_1$ and $\theta_2$ are the motor positions, $\tau_{m1}$ and $\tau_{m2}$ are the motor torques, and $a$ and $k$ are the spring coefficients.

$$I(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + D\,\dot{q} + G(q) + \tau(q,\theta_1,\theta_2) = 0 \qquad (1)$$
$$B\,\ddot{\theta}_1 + D_m\,\dot{\theta}_1 - \tau_1(q,\theta_1) = \tau_{m1} \qquad (2)$$
$$B\,\ddot{\theta}_2 + D_m\,\dot{\theta}_2 - \tau_2(q,\theta_2) = \tau_{m2} \qquad (3)$$
$$\tau(q,\theta_1,\theta_2) = \tau_1(q,\theta_1) + \tau_2(q,\theta_2) \qquad (4)$$
$$\tau_1(q,\theta_1) = k \sinh\big(a\,(q - \theta_1)\big) \qquad (5)$$
$$\tau_2(q,\theta_2) = k \sinh\big(a\,(q - \theta_2)\big) \qquad (6)$$
$$\tau(q,\theta_1,\theta_2) = k \sinh\big(a\,(q - \theta_1)\big) + k \sinh\big(a\,(q - \theta_2)\big) \qquad (7)$$

the actuator dynamics is specified by the shaft inertia $I(q)$, the velocity related terms (centrifugal and coriolis) $C(q,\dot{q})\,\dot{q}$, the viscous damping $D$, the gravity load $G(q)$, and the overall actuator torque $\tau$ as the sum of both bidirectional antagonistic tendon/drive torques $\tau_1$ and $\tau_2$. the bidirectional antagonistic drives are assumed to be symmetric, with inertia $B$ and damping term $D_m$. note that the non-linearity in the transmission given by (5) and (6) is a prerequisite for variable stiffness of the qbmove maker pro actuator. since both drives influence the actuator position as well as the actuator stiffness, decoupling of the position and stiffness subsystems is a demanding control challenge, which is considered in this paper. since our final goal is to control joint stiffness, let us briefly recall the definition of joint stiffness, which is equivalent to the stiffness of a translational spring.
the force $F$ acting on the spring depends on its extension, and this static dependence is defined as the spring stiffness $k = \partial F / \partial x$. thus, a spring of length $x_0$ in its equilibrium position ($F = 0$) stays undeformed, whereas if the spring is extended to a length $x$, it generates a force $F$. if this relation is linear, then we consider the spring as linear (8) and the stiffness is constant. otherwise, the spring is considered non-linear (9) and the stiffness is variable. likewise, the stiffness of the robot joint (denoted here by $S$) is defined by (10), where $\tau$ stands for the torque generated in the joint and $q$ denotes the joint position.

$$F = k\,(x - x_0), \qquad k = \partial F / \partial x = \mathrm{const} \qquad (8)$$
$$F = F(x), \qquad k(x) = \partial F(x) / \partial x \qquad (9)$$
$$S = \partial \tau / \partial q \qquad (10)$$

analogously, joint stiffness can be constant or changeable. changeable stiffness is a desirable feature from an exploitation point of view, since it enables tradeoffs between safe and precise manipulation. since we focus on robot joints that exploit antagonism, the stiffness of such joints is presented in accordance with the source of mechanical stiffness in the antagonistically coupled tendons. therefore, the overall shaft/joint stiffness of the qbmove maker pro actuator is estimated as in (11). for an unloaded shaft, the equilibrium position is given by (12).

$$S(q,\theta_1,\theta_2) = a\,k \cosh\big(a\,(q - \theta_1)\big) + a\,k \cosh\big(a\,(q - \theta_2)\big) \qquad (11)$$
$$q_{eq} = (\theta_1 + \theta_2)/2 \qquad (12)$$

3. feedback linearization for decoupled position/stiffness control of bidirectional antagonistic drives

since both bidirectional antagonistic motors contribute to joint position and joint stiffness, static feedback linearization is employed to decouple this multivariable system into two decoupled and linearized single-input-single-output systems. the original system can be written in the state-space representation (13).

$$\dot{x} = f(x) + g(x)\,u \qquad (13)$$

here, the joint and motor positions and velocities are considered as state space variables, $x = [q \;\; \dot{q} \;\; \theta_1 \;\; \dot{\theta}_1 \;\; \theta_2 \;\; \dot{\theta}_2]^T$, while the motor torques are considered as control inputs, $u = [\tau_{m1} \;\; \tau_{m2}]^T$. the joint position and the overall joint stiffness are the outputs: $y = [q \;\; S]^T$.
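the stiffness relations (10)-(12) can be made concrete with a short numerical sketch. the snippet below is only an illustration under stated assumptions: it assumes sinh-type springs with an argument of the form a(q - theta_i), and it assumes that the two spring coefficients of table 1 are the exponent scale (6.7328) and the torque scale (0.0227), which the extracted text does not confirm. all function names are hypothetical.

```python
import math

# Spring coefficients taken from Table 1. Which value is the exponent scale and
# which is the torque scale is an ASSUMPTION of this sketch.
A = 6.7328   # assumed exponent scale [1/rad]
K = 0.0227   # assumed torque scale [N m]


def elastic_torque(q, theta1, theta2, a=A, k=K):
    """Elastic torque of two assumed sinh-type springs between shaft and motors."""
    return k * math.sinh(a * (q - theta1)) + k * math.sinh(a * (q - theta2))


def joint_stiffness(q, theta1, theta2, a=A, k=K):
    """Analytic stiffness: derivative of the elastic torque with respect to q."""
    return a * k * (math.cosh(a * (q - theta1)) + math.cosh(a * (q - theta2)))


def equilibrium(theta1, theta2):
    """Unloaded shaft: the elastic torque vanishes at the motor midpoint."""
    return 0.5 * (theta1 + theta2)


def numeric_stiffness(q, theta1, theta2, h=1e-6):
    """Central-difference check of the definition S = d(torque)/dq."""
    upper = elastic_torque(q + h, theta1, theta2)
    lower = elastic_torque(q - h, theta1, theta2)
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    # Moving both motors by the same amount shifts the equilibrium only.
    for t1, t2 in [(0.1, -0.1), (0.4, 0.2)]:
        q0 = equilibrium(t1, t2)
        print("equilibrium", q0, "stiffness", joint_stiffness(q0, t1, t2))
    # Moving the motors apart keeps the equilibrium and raises the stiffness.
    print("stiffer joint", joint_stiffness(0.0, 0.3, -0.3))
```

under these assumptions, a common shift of both motors changes only the equilibrium position, while an opposite shift changes only the stiffness, which is the coupling that the feedback linearization has to resolve for a moving, loaded joint.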
by straightforward application of feedback linearization [23], the outputs $q$ and $S$ were differentiated until a linear relation to the inputs $\tau_{m1}$ and/or $\tau_{m2}$ was obtained. to that end, the outputs $q$ and $S$ were differentiated four times (14) and two times (15), respectively. since the sum of the relative degrees (=4+2) of the outputs is equal to the state dimension of the system (=6), zero dynamics does not exist and all states are fully observable.

$$q^{(4)} = L_f^4 h_1(x) + L_g L_f^3 h_1(x)\,u \qquad (14)$$
$$\ddot{S} = L_f^2 h_2(x) + L_g L_f h_2(x)\,u \qquad (15)$$

$L_f h(x)$ denotes the lie derivative of $h(x)$ along the vector function $f(x)$, with $h_1(x) = q$ and $h_2(x) = S$. the lie derivatives for the position and the stiffness of the model representing the qbmove maker pro are given in (16) and (17), respectively. the decoupling matrix $E(x)$, defined in (18), has to be non-singular to prove controllability of the system, which always holds for positive joint stiffness. at the same time, this is the second precondition for the application of static feedback linearization. for the sake of simplicity, the following notation is adopted: $s_1 = \sinh(a\,(q - \theta_1))$, $s_2 = \sinh(a\,(q - \theta_2))$, $c_1 = \cosh(a\,(q - \theta_1))$, and $c_2 = \cosh(a\,(q - \theta_2))$.

[equations (16) and (17), the explicit lie-derivative expressions, are not legible in the extracted text]

$$E(x) = \begin{bmatrix} L_{g_1} L_f^3 h_1(x) & L_{g_2} L_f^3 h_1(x) \\ L_{g_1} L_f h_2(x) & L_{g_2} L_f h_2(x) \end{bmatrix} \qquad (18)$$

finally, in accordance with [23], the original input $u$ can be transformed as in (19) to achieve independent control of both the joint position and the joint stiffness via the newly defined intermediate input $v = [v_q \;\; v_S]^T$. the result of this input transformation is two linear single-input-single-output systems controlled by the intermediate input $v$, which can be written in the linear state space form (20). the new state vector contains all output derivatives up to the highest order, $z = [q \;\; \dot{q} \;\; \ddot{q} \;\; q^{(3)} \;\; S \;\; \dot{S}]^T$.

$$u = E^{-1}(x) \left( -\begin{bmatrix} L_f^4 h_1(x) \\ L_f^2 h_2(x) \end{bmatrix} + \begin{bmatrix} v_q \\ v_S \end{bmatrix} \right) \qquad (19)$$
$$\dot{z} = A z + B v \qquad (20)$$

from (14) through (20) it follows that $[q^{(4)} \;\; \ddot{S}]^T = [v_q \;\; v_S]^T$. thus, if we choose $q_d$ as the desired joint position and $S_d$ as the desired joint stiffness, a basic control law (21) can be applied.
accordingly, state feedback linearization allows control of both the position and the stiffness of the bidirectional antagonistic robot joint using two fully independent linear controllers, each composed of static state feedback and feed-forward action. as demonstrated in [24] and [25], the stability of the proposed control law (21) is ensured if the gains $k_0, \dots, k_3$ and $k_{S0}, k_{S1}$ are chosen so that the polynomials in (22) are hurwitz.

$$v_q = q_d^{(4)} + k_3\,(q_d^{(3)} - q^{(3)}) + k_2\,(\ddot{q}_d - \ddot{q}) + k_1\,(\dot{q}_d - \dot{q}) + k_0\,(q_d - q)$$
$$v_S = \ddot{S}_d + k_{S1}\,(\dot{S}_d - \dot{S}) + k_{S0}\,(S_d - S) \qquad (21)$$
$$s^4 + k_3 s^3 + k_2 s^2 + k_1 s + k_0, \qquad s^2 + k_{S1} s + k_{S0} \qquad (22)$$

theoretically, if the desired joint position and stiffness are smooth trajectories, asymptotic trajectory/force tracking is possible. in this paper, the desired trajectories are set manually, without considering higher control levels and optimization issues. an illustrative scheme of the proposed algorithm is depicted in fig 2.

fig. 2 decoupled position/stiffness control scheme for qbmove maker pro actuator

4. results and discussion

the mathematical model (section 2) and the control approach (section 3) are implemented in a dedicated user-defined matlab/simulink model. the validation of the proposed approach is given in fig 3 through fig 6. fig 3 presents joint position tracking; the desired trajectory combines an interval of smooth increase in position with a sine trajectory. the desired and achieved stiffness are depicted in fig 4. the desired stiffness comprises a flat part and a sine part, in accordance with the desired position trajectory, to demonstrate simultaneous control of both joint position and stiffness for different trajectory patterns. theoretically, as elaborated by palli et al. [24], [25], if the desired joint position is continuous up to the 4th order ($q_d \in C^4$) and the desired stiffness is continuous up to the 2nd order ($S_d \in C^2$), asymptotic trajectory/force tracking is achieved.
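the hurwitz condition on the gains in (22) can be checked numerically before a simulation is run. the following is a minimal sketch, not the authors' implementation: it builds the routh array for a polynomial given by its coefficients and tests the sign of the first column. the pole locations in the usage part are hypothetical, since the numerical gains of the paper are not legible in the extracted text.

```python
def poly_from_roots(roots):
    """Coefficients (descending powers of s) of the monic polynomial with the given real roots."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (s - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs


def routh_first_column(coeffs):
    """First column of the Routh array; returns None if a zero pivot is met."""
    n = len(coeffs) - 1
    width = n // 2 + 1
    rows = []
    for start in (0, 1):
        row = [float(c) for c in coeffs[start::2]]
        rows.append(row + [0.0] * (width - len(row)))
    for _ in range(n - 1):
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0.0:
            return None
        new = [(lower[0] * upper[j + 1] - upper[0] * lower[j + 1]) / lower[0]
               for j in range(width - 1)]
        rows.append(new + [0.0])
    return [row[0] for row in rows]


def is_hurwitz(coeffs):
    """True if all roots lie strictly in the left half-plane (leading coefficient assumed positive)."""
    column = routh_first_column(coeffs)
    return column is not None and all(c > 0.0 for c in column)


if __name__ == "__main__":
    # Hypothetical pole placement: four poles at -10 (position), two at -20 (stiffness).
    position_poly = poly_from_roots([-10.0] * 4)   # s^4 + k3 s^3 + k2 s^2 + k1 s + k0
    stiffness_poly = poly_from_roots([-20.0] * 2)  # s^2 + kS1 s + kS0
    print(position_poly, is_hurwitz(position_poly))
    print(stiffness_poly, is_hurwitz(stiffness_poly))
```

choosing the gains from desired pole locations, as in the usage part, guarantees the condition by construction; the routh test is useful when gains are tuned by hand instead.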
fig 5 presents the coordinated actions of the two antagonistically coupled motors, which contribute to both the joint position and the joint stiffness. while the desired stiffness is constant, both motors move in the same direction, contributing equally to the joint position, which follows its pattern. when the stiffness starts changing, the motors act as follows: when the joint stiffens (rise in stiffness) the motors move in opposing directions, while a decrease in joint stiffness results in a decrease in the difference between the antagonistic motor positions. the overall resulting joint torque, depicted in fig 6, fits the pattern of the desired joint trajectory. the results were obtained for the parameters shown in table 1. the control parameters (23) and (24) are adopted from [6].

[control parameters (23) and (24) are not legible in the extracted text]

table 1 simulation parameters (numerical value, unit, description)
0.000003, [kg m^2], motor inertia
0.015, [kg m^2], joint inertia
0.000001, [N m s/rad], motor damping
0, [N m s/rad], joint damping
6.7328, spring coefficient
0.0227, spring coefficient

fig. 3 joint position tracking
fig. 4 joint stiffness tracking
fig. 5 positions of bidirectional antagonistically coupled motors
fig. 6 resulting joint torque as contribution of both bidirectional antagonistically coupled motors

5. conclusion

the paper elaborated the application of the stiffness control method proposed in [1] to a robot joint driven by a bidirectional antagonistic actuator, the qbmove maker pro. the increasingly important topic of variable stiffness actuation was thereby addressed. an approach which enables simultaneous decoupled control of joint position and joint stiffness was demonstrated, and the concept was validated through simulations. however, the key issue in the implementation of this feedback linearization based control approach is model dependence. the model itself is very complex and non-linear, so model identification must be considered comprehensively before the approach is used. moreover, it is well known that systems that are linearized by decomposing their structure into two or more linear subsystems are prone to behave erratically when disturbed. the robustness of the presented approach is discussed in the authors’ previous work [6]. to overcome the dependence on the model, alternative approaches to simultaneous position/stiffness control of bidirectional antagonistic drives were pointed out in the authors’ previous works [26] and [27], while neural networks for system modeling and feed-forward control were presented in [28]. future work on the topic will consider the implementation of the proposed approach for stiffness control on the laboratory setup driven by qbmove maker pro actuators and on a model-based multi-jointed robot with bidirectional antagonistic drives, as well as its implementation for cartesian stiffness control. the ultimate goal of this research is the development of a control scheme which shapes cartesian stiffness by a symbiosis of joint stiffness control and posture planning of the robot.

acknowledgment: research leading to these results was funded by the ministry of education, science and technological development, republic of serbia, under contract tr-35003.

references
[1] k. jovanovic, b. lukic, v. potkonjak, “enhanced puller-follower approach for stiffness control of antagonistically actuated joints”, in proceedings of international conference on electrical, electronic and computing engineering (icetran ’16), 13-16 jun 2016, pp. roi1.2.1-5.
[2] a. diamond, r. knight, d. devereux, o. holland, "anthropomimetic robots: concept, construction and modelling," international journal of advanced robotic systems, vol. 9, no. 209, pp. 1-14, 2012.
[3] k. jovanovic, v.
potkonjak, o. holland, "dynamic modelling of an anthropomimetic robot in contact tasks," advanced robotics, vol. 28, no. 11, pp. 793-806, 2014. [4] y. nakanishi, s. ohta, t. shirai, y. asano, t. kozuki, y. kakehashi, h. mizoguchi, t. kurotobi, y. motegi, k. sasabuchi, j. urata, k. okada, i. mizuuchi, m. inaba, "design approach of biologicallyinspired musculoskeletal humanoids", international journal of advanced robotics systems, vol. 10, no. 216, pp. 1-13, 2013. [5] s. wittmeier, c. alessandro, n. bascarevic, k. dalamagkidis, a. diamond, m. jäntsch, k. jovanovic, r. knight, h. g. marques, p. milosavljevic, b. svetozarevic, v. potkonjak, r. pfeifer, a. knoll, o. holland, "toward anthropomimetic robotics: development, simulation, and control of a musculoskeletal torso", artificial life, vol. 19, no. 1, pp. 171-193, 2013. [6] v. potkonjak, b. svetozarevic, k. jovanovic, o. holland, "the puller-follower control of compliant and noncompliant antagonistic tendon drives in robotic system", international journal of advanced robotics systems, vol. 8, no. 5, pp. 143-155, 2012. [7] s. c. jacobsen, e. k. iversen, d. knutti, r. johnson, k. biggers, "design of the utah/m.i.t. dextrous hand", in proceedings of ieee international conference on robotics and automation (icra 1986), san francisco, ca, usa, 7-10 april 1986. pp. 1520-1532. [8] g. c. klute, j. m. czerniecki, b. hannaford, "mckibben artificial muscles: pneumatic actuators with biomechanical intelligence", in proceedings of ieee/asme international conference on advanced intelligent mechatronics, atlanta, ga, usa, 19-23 september 1999, pp. 221-226. [9] b. tondu, s. ippolito, j. guiochet, a. daidie, "a seven-degrees-of-freedom robotarm driven by pneumatic artificial muscles for humanoid robots", the international journal of robotics research, vol. 24, no. 4, pp. 257-274, 2005. [10] i. boblan, j. maschuw, d. engelhardt, a. schulz, h. schwenk, r. bannasch, i. 
rechenberg, "a humanlike robot hand and arm with fluidic muscles: modelling of a muscle driven joint with an antagonistic setup", in proceedings of international symposium on adaptive motion in animals and machines, ilmenau, germany, 25-30 september 2005.
[11] j. yamaguchi, d. nishino, a. takanishi, "realization of dynamic biped walking varying joint stiffness using antagonistic driven joints", in proceedings of ieee international conference on robotics and automation (icra 1998), leuven, belgium, 16-20 may 1998, pp. 2022-2029.
[12] b. verrelst, r. van ham, b. vanderborght, f. daerden, d. lefeber, "the pneumatic biped "lucy" actuated with pleated pneumatic artificial muscles", autonomous robots, vol. 18, no. 2, pp. 201-213, 2005.
[13] s. čajetinac, d. šešlija, v. nikolić, m. todorović, "comparison of pwm control of pneumatic actuator based on energy efficiency", facta universitatis, series: electronics and energetics, vol. 25, no. 2, pp. 93-101, 2012.
[14] r. van ham, t. sugar, b. vanderborght, k. hollander, d. lefeber, "compliant actuator design: review of actuator with passive adjustable compliance/controllable stiffness for robotic applications", ieee robotics & automation magazine, vol. 13, no. 3, pp. 771-789, 2009.
[15] k. koganezawa, y. watanabe, n. shimizu, "stiffness and angle control of antagonistically driven joint", advanced robotics, vol. 12, no. 7-8, pp. 81-94, 1997.
[16] c. english, d. russell, "implementation of variable joint stiffness through antagonistic actuation using rolamite springs", mechanism and machine theory, vol. 34, no. 1, pp. 27-40, 1999.
[17] s. migliore, e. brown, s. deweerth, "biologically inspired joint stiffness control", in proceedings of ieee international conference on robotics and automation (icra ’05), 18-22 april 2005, pp. 4508-4513.
[18] j. hurst, j. chestnutt, a.
rizzi , "an actuator with physically variable stiffness for highly dynamic legged locomotion", in proceedings of ieee international conference on robotics and automation (icra 2004), new orleans, la, usa, 26 april-1 may 2004, pp. 4662-4667. [19] g. tonietti, r. schiavi, a. bicchi, "design and control of a variable stiffness actuator for safe and fast physical human/robot interaction", in proceedings of ieee international conference on robotics and automation (icra 2005), barcelona, spain, 18-22 april 2005. pp. 526-531. [20] m. catalano, g. grioli, m. garabini, f. bonomo, m. mancini, n. tsagarakis and a. bicchi, “vsa-cubebot: a modular variable stiffness platform for multiple degrees of freedom robots”, in proceedings of ieee international conference on robotics and automation (icra ’11), 9-13 may 2011. pp. 5090 5095. [21] natural motion machine initiative (nmmi) [qbmove maker pro assembly guide], last accessed november 13th, 2016 – https://sourceforge.net/projects/nmmiwebsite/files/qbmovev01/assembly%20guide%20v01.pdf/ download [22] k. melo, m. garabini, g. grioli, m. catalano, l. malagia, a. bicchi, “open source vsa-cubebots for rapid soft robot prototyping”, robot makers workshop in conjunction with 2014 robotics science and systems conference, berkeley, california, usa, july 12, 2014. [23] h. k khalil, "chapter 13: state feedback stabilization," in nonlinear systems, 3rd edition, upper saddle river, new jersey, usa, prentice hall, 2002, pp. 197-227. [24] g. palli, c. melchiorri, a. de luca, "on the feedback linearization of robots with variable joint stiffness", in proceedings of ieee international conference on robotics and automation (icra 2008), pasadena, ca, usa, 19-23 may 2008. pp. 1753-1759. [25] g. palli, c. melchiorri, t. wimböck, m. grebenstein, g. 
hirzinger, "feedback linearization and simultaneous stiffness-position control of robots with antagonistic actuated joints", in proceedings of ieee international conference on robotics and automation (icra '07), rome, italy, 10-14 april 2007. pp. 4367-4372.
[26] b. lukić, k. jovanović, a. rakić, “realization and comparative analysis of coupled and decoupled control methods for bidirectional antagonistic drives: qbmove maker pro,” presented at the 3rd international conference on electrical, electronic and computing engineering (icetran 2016), zlatibor, serbia, jun 13-16, 2016.
[27] b. lukić, k. jovanović, “minimal energy cartesian impedance control of robot with bidirectional antagonistic drives,” in proceedings of the iftomm/ieee/eurobotics 25th international conference on robotics in alpe-adria-danube region – raad 2016, belgrade, june 30th - july 2nd 2016.
[28] b. lukić, k. jovanović, g. kvaščev, “feedforward neural network for controlling qbmove maker pro variable stiffness actuator”, in proceedings of the 13th symposium on neural networks applications in electrical engineering (neurel 2016), belgrade, serbia, november, 2016.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 467-477 doi: 10.2298/fuee1403467p

automatic prosody generation in a text-to-speech system for hebrew

branislav popović 1, dragan knežević 1, milan sečujski 1, darko pekar 2
1 faculty of technical sciences, university of novi sad, serbia
2 alfanum – speech technologies, novi sad, serbia

abstract. the paper presents the module for automatic prosody generation within a system for automatic synthesis of high-quality speech based on arbitrary text in hebrew. the high quality of synthesis is due to the high accuracy of automatic prosody generation, enabling the introduction of elements of natural sentence prosody of hebrew.
automatic morphological annotation of text is based on the application of an expert algorithm relying on transformational rules. syntactic-prosodic parsing is also rule based, while the generation of the acoustic representation of prosodic features is based on classification and regression trees. a tree structure generated during the training phase enables accurate prediction of the acoustic representatives of prosody, namely, durations of phonetic segments as well as the temporal evolution of fundamental frequency and energy. such an approach to automatic prosody generation has led to an improvement in the quality of synthesized speech, as confirmed by listening tests.

key words: speech synthesis, speech processing, natural language processing, classification and regression trees

received february 25, 2014; received in revised form may 21, 2014
corresponding author: branislav popović, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: bpopovic@uns.ac.rs)

1. introduction

explicit modeling of prosodic features of synthesized speech, as well as prediction of the values of certain parameters of a model based on explicit morphological, phonetic, syntactic and other relevant rules, is considered to be a relatively poor solution in practice. this is due to the enormous number of factors that need to be considered, as well as their mutual influence, which is too complicated to be closely examined on reasonably large speech corpora [1]. on the other hand, inadequately determined prosodic features impair the naturalness, and in some cases even the intelligibility, of synthesized speech, significantly narrowing the field of its application. as the use of machine learning methods eliminates the need for explicit modeling of prosody, they have been widely adopted as a solution for automatic prosody generation within text-to-speech systems.
furthermore, they can also provide information about the mutual influence of specific linguistic factors (e.g. masking), which is of great interest to the linguistic community. in this paper, automatic training and subsequent prediction of prosodic features are carried out according to the methodology of classification and regression trees (cart) [2]. the idea of this methodology is to generate a tree structure through the process of automatic training based on a speech corpus of sufficient size. such a training should identify the most relevant factors that influence the prosodic features of speech and their acoustic representatives – phone durations as well as temporal evolution of fundamental frequency and energy. the speech corpus is marked for phone boundaries as well as relevant prosodic events, such as types and levels of boundaries between adjacent intonation units, as well as levels of emphasis. using regression trees trained on thus annotated speech corpus, the quality of synthesized speech is significantly improved compared to the quality obtained by conventional methods for prosody prediction in text-to-speech [3], [4], [5]. the paper is organized as follows. section 2 presents the particularities of the hebrew language, as it is well known that the properties of the target language significantly affect the development of a system for automatic speech synthesis (most notably the automatic prosody generation module). section 3 defines the procedure of automatic part-of-speech (pos) tagging and additional morphological annotation of input text. in section 4, prosody generation and synthesis are presented. section 5 presents the experimental results. in section 6, several conclusions are given. 2. language particularities the hebrew language, one of the most widely spoken semitic languages today, has a range of properties which drastically affect the design of a speech synthesis system. 
firstly, from the orthographical point of view, it belongs to the group of so-called abjad languages, where each symbol commonly stands for a consonant [6]. however, vowels can be indicated (1) by the use of "weak consonants" serving as vowel letters (for example, the letter vav indicates that the preceding vowel is either /o/ or /u/, yodh indicates an /i/, whereas aleph indicates an /a/), or (2) by using a set of diacritical symbols called niqqud. another thing that should be borne in mind is that abjad languages, including hebrew, suffer from very loose spelling rules. this means that for a number of words there can be more than one acceptable spelling, which is a very serious source of ambiguity. namely, the revival of the hebrew language in the late 19th century has left many unresolved issues [7]. as hebrew speakers were almost all native speakers of european languages and thus accustomed to the latin alphabet, two parallel spelling systems developed: the first, where vowel indicators are used according to the historic rules, and the second, where vowel indicators are used excessively. it should also be noted that even today, a vast majority of speakers commonly make spelling errors. therefore, if one aims at the design of a text-to-speech system which should be able to handle arbitrary texts, spelling errors have to be accepted as a part of the standard inventory. spelling errors are thus another source of ambiguity in hebrew, and are something that the design of a practically applicable speech synthesizer cannot dismiss. the hebrew alphabet has 22 letters, five of which have different forms when they are used at the end of a word. modern israeli hebrew has 5 vowel phonemes.
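the orthographic ambiguity described above can be illustrated with a small sketch. it is not part of the described system: it only shows, using the unicode encoding of hebrew, that two differently pointed words collapse to the same letter string once niqqud is removed, and that the five word-final letter forms map to their base letters. the function names and the idea of using such a normalization as a lexicon key are assumptions made for illustration; the two example words and their glosses are illustrative as well.

```python
import unicodedata

# The five Hebrew letters with a distinct word-final form, mapped to the base form.
FINAL_TO_BASE = {
    "\u05da": "\u05db",  # final kaf   -> kaf
    "\u05dd": "\u05de",  # final mem   -> mem
    "\u05df": "\u05e0",  # final nun   -> nun
    "\u05e3": "\u05e4",  # final pe    -> pe
    "\u05e5": "\u05e6",  # final tsadi -> tsadi
}


def strip_niqqud(text):
    """Remove Hebrew combining marks (points and cantillation, U+0591..U+05C7)."""
    return "".join(
        ch for ch in text
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    )


def consonant_skeleton(word):
    """Hypothetical lookup key: unpointed letters with final forms normalized."""
    return "".join(FINAL_TO_BASE.get(ch, ch) for ch in strip_niqqud(word))


if __name__ == "__main__":
    sefer = "\u05e1\u05b5\u05e4\u05b6\u05e8"        # pointed spelling, 'book'
    sapar = "\u05e1\u05b7\u05e4\u05bc\u05b8\u05e8"  # pointed spelling, 'barber'
    # Without niqqud both words are written with the same three letters.
    print(strip_niqqud(sefer) == strip_niqqud(sapar))
    print([hex(ord(ch)) for ch in consonant_skeleton(sefer)])
```

since most running text is written without niqqud, a synthesizer sees only the shared letter string and has to recover the vowels and the stress from context.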
however, the meaning of a word is carried not only by its phonological content, but also by its stress, and it is not uncommon to find pairs of words containing the same string of phonemes, but pronounced differently, the only difference being the stress. from the point of view of morphology, it should be noted that hebrew exhibits a pattern of stems consisting typically of consonantal roots from which nouns, adjectives, and verbs are formed in various ways. hebrew uses a range of very productive prefixes and a multitude of suffixes, dramatically increasing the number of possible morphological interpretations of each surface word form in the text. the syntactic structure of the sentence and the word ordering in hebrew can be considered as relatively flexible. although particular choices in word ordering can indicate specific literary styles or genres, one commonly encounters sentences where several orders of words can be considered equivalent. this is another source of difficulty for automatic morphological annotation of text. 3. morphological annotation after the text is preprocessed in order to locate sentence boundaries and reveal elements such as abbreviations, dates, punctuation, special characters, web addresses etc., it is submitted to automatic morphological annotation, aimed at assigning part-of-speech tags as well as some additional morphological information that may be of interest to any subsequent phase of automatic prosody generation. the morphological analysis begins by assigning an empty array of "readings" to every surface word form (token) in a sentence. the term "reading" denotes a morphological interpretation of this token together with its phonological representation, i.e. a particular inflected form of a word, together with the corresponding lemma, values of part-of-speech and corresponding morphological categories, its pronunciation as well as position and type of stress. 
in general, it is possible to derive several hundred morphological forms from a single lemma in hebrew. ideally, the lexicon should contain entries representing each and every possible surface word form. during the evaluation process, an evaluation score is assigned to each of the readings of a word token, in order to select the reading which is most likely to be correct. the aim of morphological analysis is thus to distinguish between the available readings and to assign a correct vocalization and stress pattern to each word, which is of utmost importance for the naturalness of synthesized speech. the novel approach to morphological analysis described in this paper is outlined in fig. 1 and uses a combination of active and passive methods [8]. the passive method presumes the selection of appropriate lexemes by using the hebrew lexicon, the lexicon of foreign words in hebrew transcription and, finally, the lexicon of frequent foreign words in latin transcription. the active method involves an automatic morphological analysis of the input text string, as well as the generation of appropriate readings by a complex expert algorithm relying on a set of transformational rules. the use of the active method reduces the initialization time as well as the number of inflected morphological forms in the lexicon by two orders of magnitude, enabling the use of the software component within real-time applications. on the other hand, the passive methodology reduces the error rate.

fig. 1 morphological annotation of input text

transformational rules in the form of complex tree structures are applied iteratively. branches are generated by using appropriate sets of morphological rules. word analysis is carried out morpheme by morpheme. every word is processed according to its left and right context. the aim is to correctly identify the surface form as a particular inflected form of a particular lemma.
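the notion of a token with an initially empty array of readings, from which the highest-scoring reading is eventually selected, can be sketched as a small data structure. this is a hypothetical illustration only: the class names, field names, transliterated example values and scores are assumptions, not the representation used in the described system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Reading:
    """One morphological interpretation of a surface word form (hypothetical layout)."""
    lemma: str
    pos: str
    categories: Dict[str, str]
    pronunciation: str
    stress_position: int      # index of the stressed syllable
    stress_type: str
    score: float = 0.0        # evaluation score assigned during disambiguation


@dataclass
class Token:
    """A surface word form; its array of readings starts out empty."""
    surface: str
    readings: List[Reading] = field(default_factory=list)


def best_reading(token: Token) -> Optional[Reading]:
    """Return the reading with the highest evaluation score, or None if there is none."""
    if not token.readings:
        return None
    return max(token.readings, key=lambda reading: reading.score)


if __name__ == "__main__":
    # Transliterated, made-up example: one unpointed string with two candidate readings.
    token = Token(surface="spr")
    token.readings.append(
        Reading("sefer", "noun", {"number": "singular"}, "sefer", 0, "primary", score=0.8))
    token.readings.append(
        Reading("sapar", "noun", {"number": "singular"}, "sapar", 1, "primary", score=0.3))
    print(best_reading(token).pronunciation)
```

keeping pronunciation and stress inside each reading means that selecting a reading fixes the vocalization and the stress pattern in one step.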
currently, the system supports more than 30 part-of-speech classes with more than 3000 corresponding morphological categories. the algorithm for the evaluation of particular readings, in order to select the most likely one, consists of a set of disambiguation tools, divided into individual scoring procedures. the scoring of syntactic structures assigns syntactic indexes to words using predefined statistical algorithms, aiming at establishing the similarity between the syntactic structure of input sentence and the predefined syntactic structures. the algorithm is coupled with an accurate comparison mechanism that allows the use of existing structures in order to project on unfamiliar ones. a syntactic score indicates the level of compatibility of a certain reading to the previously tagged syntactic environment. the scoring of semantic structures uses an analogous method, with only one difference: the structures represent semantic relations instead of syntactic ones. the index used is built over semantic attributes. the challenge in this process, besides building the most convenient set of indexes, is to determine the collection of a minimal number of morphological descriptors (tags) covering at the same time the maximum number of words. proximity scoring is the most efficient of the scoring processes. there are three types of proximity rules: generic to generic (this type of rules refers to the assignment of a relationship between linguistic items of non-specific identity, such as "there is a high probability that a verb in past tense of semantic category moving will be adjacent to a copula"; the attributes that can be used in composing these rules may be of grammatical and/or semantic nature), specific to generic (this type of rules would attach a generic rule to a specific word, e.g. "a verb in passive mood is likely to be followed by the word by") and specific to specific (this type of rules will attach two specific words, e.g. 
tel is likely to be followed by aviv). the effect of proximity scoring is clearly limited only to the words and entities for which proximity rules have been defined. full-niqqud scoring is a type of scoring unique to hebrew. it determines how close a certain reading of a word is to the most commonly used spelling version. due to the previously mentioned lack of a unique spelling standard, such a scoring procedure has to be taken into account as well. another scoring procedure used is frequency scoring, i.e. scoring readings according to their frequency in standard texts. although such a procedure is highly inaccurate on its own (it commonly serves as a baseline for establishing the performance of more sophisticated morphological annotation techniques), it can serve as an efficient tie-breaker, i.e. it can be used in cases where other scoring procedures have assigned approximately equal scores to multiple readings. every reading is also additionally evaluated in view of its context. context scores are obtained in compliance with the previously selected set of tags for the left context, as well as the set of tags for all possible readings in the right context. this is probably the most complex among all the applied scoring procedures. table 1 illustrates the effectiveness of the described scoring procedures, in terms of the overall accuracy of the automatic annotation process (selection of the correct reading), on the corpus of 3093 sentences (55046 words). table 1 the overall accuracy scoring type status syntactic on on on on semantic on on on on proximity on on on full niqqud on on frequency on on context on on on acc. [%] 92.3 85.9 44.7 45.1 32.1 46.9 99.3 99.4 99.6 table 2 presents the correlation matrix among the different scoring procedures. a high correlation between proximity, context and full-niqqud score can be noted.
although such an analysis of the correlation between different scoring procedures is not immediately aimed at the improvement of the quality of synthetic speech, it can give an insight into the directions of the future development of the scoring system. at the same time, high correlation between particular scoring procedures, besides giving a linguistic insight into the problem, confirms the validity of the algorithms.

table 2 the correlation matrix

scoring type   syntactic   semantic   proximity   full niqqud   context
syntactic      1           0.062      0.238       0.224         0.239
semantic       0.062       1          0.356       0.364         0.342
proximity      0.238       0.356      1           0.945         0.982
full niqqud    0.224       0.364      0.945       1             0.929
context        0.239       0.342      0.982       0.929         1

fig. 2 evaluation scores and manually selected readings

evaluation scores for an example sentence are presented in fig. 2. the sentence is given in the top right corner, and the readings with the highest scores (highlighted) match the actual correct readings. features recovered by automatic morphological annotation (primarily vocalization and stress pattern) constitute the symbolic representation of the prosody of a given input sentence. this representation will be used as an input to the cart prosody generator, which will, in turn, produce a corresponding sequence of values of fundamental frequency and energy, as well as phone durations.

4. prosody generation and synthesis

as has been mentioned before, it is well known that fully expert systems used for modeling of prosodic features are not of great practical use within speech synthesizers, mostly due to the large number of factors that influence prosody as well as their mutual effects, which are too complex to be sufficiently analyzed on speech corpora of reasonable size. speaker inconsistency represents an additional problem.
even a single speaker can be expected to pronounce the same sentence differently on different occasions, each of the resulting utterances being equally acceptable to the listener. for all these reasons, the prediction of prosodic features is performed using machine learning, namely the methodology of classification and regression trees (cart) [9]. the basic principle of cart prosody prediction will be shown on an example of predicting the durations of phonetic segments (phones). the initial and the most important step is to identify the features to be used for training. this step has some basis in expert knowledge but the rest of the procedure is completely automatic. the set of features considered to be relevant for the phone duration includes phonemic identity, primary and secondary stress (with values: stressed, unstressed; applicable to vowels only), position within the syllable and position within the intonation boundaries (expressed as number of syllables), but many others as well. the durations of phones and relevant features are known for the training set and this set is thus the basis for prediction of duration for all other phoneme instances. automatic prosody generation in a text-to-speech system for hebrew 473 fig. 3 the first 3 levels of the regression tree used for estimation of phone duration the tree branching is performed as follows. all the possible yes/no questions based on the selected features (e.g. "is the phone stressed?", "is the distance to the nearest phrase break more than 3 syllables?" etc.) are evaluated for each phone instance in the training set. every question splits the starting n phoneme instances ("root" node) into two distinct subsets ("child" nodes) based on the answer (yes or no), and every question generally splits the set differently. the most relevant question is the one that reduces the total diversity (in terms of duration) of both "child" nodes to the greatest possible degree. 
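the choice of the most relevant question can be sketched as follows. this is a minimal illustration, assuming that "diversity" is measured as the sum of squared deviations of the durations from the node mean (a common choice for regression trees, not confirmed by the text); the function names, feature names and toy durations are hypothetical.

```python
def squared_error(durations):
    """Sum of squared deviations from the mean: the assumed diversity measure."""
    if not durations:
        return 0.0
    mean = sum(durations) / len(durations)
    return sum((d - mean) ** 2 for d in durations)


def best_question(instances, questions):
    """Return (question, total child diversity) for the most relevant question.

    instances: list of (features, duration) pairs, features being a dict
    questions: dict mapping question text to a yes/no predicate on features
    """
    best_name, best_total = None, None
    for name, predicate in questions.items():
        yes = [d for f, d in instances if predicate(f)]
        no = [d for f, d in instances if not predicate(f)]
        if not yes or not no:
            continue  # the question does not split this node at all
        total = squared_error(yes) + squared_error(no)
        if best_total is None or total < best_total:
            best_name, best_total = name, total
    return best_name, best_total


# Toy training set: durations in milliseconds (invented values).
toy_instances = [
    ({"stressed": True, "syllables_to_break": 1}, 110.0),
    ({"stressed": True, "syllables_to_break": 5}, 100.0),
    ({"stressed": False, "syllables_to_break": 1}, 62.0),
    ({"stressed": False, "syllables_to_break": 5}, 58.0),
]
toy_questions = {
    "is the phone stressed?": lambda f: f["stressed"],
    "more than 3 syllables to the nearest phrase break?":
        lambda f: f["syllables_to_break"] > 3,
}
```

on this toy set the stress question leaves far less diversity in the two child nodes than the phrase-break question, so it would be chosen for the split.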
at this point, the initial node is split into two "child" nodes based on the most relevant question (e.g. "is the phone stressed?"), and the procedure is recursively repeated for every descendant node, until the tree is fully branched. every terminal node ("leaf" node) is assigned a value: the average duration of all instances assigned to that node. the final tree usually contains multiple phoneme instances assigned to each "leaf" node. although the branching procedure is computationally very complex, the final use of the tree is exceptionally simple and fast. during the synthesis phase, the instance of the phone with known answers to all the relevant yes/no questions is propagated through the tree, from the root node to one of the leaf nodes. the exact path to the leaf node and the final node itself depend on the answers to yes/no questions. the estimated phone duration is the one assigned to the "leaf" node during the training phase (average duration for all the instances assigned to that node). as an illustration, fig. 3 shows the first 3 levels of the regression tree for the prediction of phone duration. the number within the node indicates the occupancy, i.e. number of phone instances within the node. the module for automatic prediction of prosodic features of the synthesized speech based on the regression trees for the hebrew language is trained on the speech database which consists of approximately 4 hours of speech from one professional speaker (the same database is used for synthesis). the database is annotated for phone boundaries and phonological content, which corresponds to the phonological inventory of modern israeli hebrew. some phones are split into subphones (such as occlusions and explosions of stops and fricatives). stress is also marked (primary and secondary).
for the purposes of cart training, the database is marked for a number of prosodic events including types and levels of intonational phrase boundaries (up, down; none, weak, medium, strong, very strong) as well as levels of emphasis (very weak, weak, neutral, strong, very strong). regression trees are trained for duration, energy, the value of f0 and its derivative, log ratio of f0 values at 1/4 and 3/4 of the duration of a vowel, as well as log ratio of f0 values between two successive vowels (measured at 3/4 of the duration of the first vowel and 1/4 of the duration of the second one). energy and durations are directly obtained, while the final f0 curve is derived from the outputs of the 4 f0-related trees. a total of 600 different criteria (yes/no questions) are taken into account during the branching of the regression trees. these criteria are defined based on the phonetic context, type of phoneme, phoneme position within a word, the corresponding word’s position within the sentence, etc. a number of compound criteria are also used (e.g. "is the phone a vowel and stressed?"). in this case, with a training corpus of approximately 4 hours of speech, the maximum number of levels in the trees was 11. however, it should be pointed out that this value is, in general, greatly dependent on the criterion used for stopping the branching procedure (e.g., the number of instances in a node is less than some predefined threshold, or the reduction of the impurity of the node achieved by branching is less than some predefined threshold). after the trees have been built, at synthesis time, the expert systems analyze the input text and attempt to recover the correct reading for each word in it. by doing so, they recover the symbolic representation of the desired prosody for the input text, including the positions of stressed syllables as well as types and levels of intonational phrase boundaries and levels of emphasis for each word.
these features exactly correspond to the features used in cart questions, and will be used for “passing” each phoneme of the input sentence down the tree, thus providing the acoustic representation of the desired prosody. after the acoustic representation of prosody has been generated, segments used for speech signal synthesis are selected. the basic unit on which the segment selector operates is a half-phone. half-phones that are selected as candidates to be used for concatenation are assigned concatenation and target costs. a trellis structure is formed and the viterbi algorithm is used to find the optimal path (half-phone sequence) through the trellis, i.e. the one with the minimal accumulated cost. the cost assignment is performed based on multiple criteria, which can be classified into two basic groups: target criteria and concatenation criteria. the target criteria determine the mismatch between the acoustic features of the candidate half-phone and the required prosodic features, and express it through target cost, which is thus the measure of the unsuitability of the phonetic segment for being used in actual synthesis. the features taken into account for target cost are duration, f0 and its derivative, as well as energy. on the other hand, the concatenation criteria determine the cost of concatenating any two half-phones [10]. the quality of the synthesized speech greatly depends on the frequency of concatenation points, as well as the audibility of each of them. the concatenation cost, assigned to any ordered pair of half-phones, is defined as the measure of their acoustic mismatch at concatenation points and thus their incompatibility for being concatenated. for pairs of half-phones which are adjacent and in the same order as in the speech database this cost is equal to zero, which means that such pairs of segments will, whenever possible, be selected for concatenation.
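the search for the path with the minimal accumulated cost can be sketched as a small viterbi pass over the trellis. the code is a simplified illustration under stated assumptions: candidates are represented as hypothetical (database index, f0) pairs, the target cost uses f0 only, and every non-adjacent join is given a constant cost of 1.0, whereas the described system uses duration, f0 with its derivative, energy and an acoustic mismatch measure.

```python
def select_units(candidates, target_cost, concat_cost):
    """Viterbi search: candidates[i] lists the candidate units for position i.

    Returns (minimal accumulated cost, chosen unit sequence).
    """
    # best[i][j] = (accumulated cost of the best path ending in candidate j,
    #               index of its predecessor in column i - 1)
    best = [[(target_cost(0, c), None) for c in candidates[0]]]
    for i in range(1, len(candidates)):
        column = []
        for c in candidates[i]:
            cost, prev = min(
                (best[i - 1][k][0] + concat_cost(p, c), k)
                for k, p in enumerate(candidates[i - 1])
            )
            column.append((cost + target_cost(i, c), prev))
        best.append(column)

    # Backtrack from the cheapest candidate in the last column.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return total, path


# Toy example: required f0 per position and (database index, f0) candidates.
toy_targets = [100.0, 110.0, 120.0]
toy_candidates = [
    [(10, 100.0), (40, 101.0)],
    [(11, 112.0), (70, 110.0)],
    [(12, 121.0), (71, 125.0)],
]


def toy_target_cost(i, unit):
    return abs(unit[1] - toy_targets[i]) / 10.0


def toy_concat_cost(left, right):
    # Zero cost for units that are adjacent, in this order, in the database.
    return 0.0 if right[0] == left[0] + 1 else 1.0
```

in the toy example the sequence of database-adjacent units 10, 11, 12 is selected even though unit 70 matches the second f0 target exactly, because the joins to and from unit 70 would cost more than the small target mismatch of unit 11.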
since pairs of half-phones that are adjacent in the speech database incur no concatenation cost, the basic units for synthesis are, in fact, not limited to half-phones, but can include strings of half-phones of unlimited length. in practice, the strings of half-phones selected for concatenation are mostly between 3 and 5 half-phones long. the speech signal synthesis module performs signal concatenation. this module is based on the time-domain pitch synchronous overlap and add (td-psola) algorithm, as implemented previously in [11]. the outputs of the prosody generator module and the segment selection module are used as inputs for the concatenation module. since it is very unlikely to have segments that ideally match the prosody requirements, it is usually necessary to additionally adjust the selected segments as regards their durations, f0 and/or energy. 5. the quality of speech it should be noted that there are several independent sources of the differences between the prosody of synthesized speech and the prosody of natural human speech. besides the intrinsic variability of speech prosody (the fact that no speaker will pronounce the same utterance twice in the same way, and that a wide range of the values of prosodic parameters can be considered acceptable), there are two major factors that affect the accuracy of synthetic prosody. firstly, any error in morphologic annotation (and thus stress assignment) or the assignment of some other prosodic event such as phrase break or emphasis will lead to an error at the input of the cart-based prosody predictor. this would inevitably result in audible prosodic errors. on the other hand, even in cases when the input to cart is quite accurate, the output still may be of inferior quality due to corpus tagging errors (largely eliminated through manual inspection), data sparsity (insufficient training corpus size), an inadequately chosen feature set or simply the intrinsic inability of the cart technique to adequately cover all the peculiarities of spoken language.
the errors introduced by cart are most often less audible, and the final outcome is an intonation contour characteristic of accurate, albeit somewhat emotionless speech. the evaluation of the proposed automatic prosody generation module was carried out through the perceptual evaluation of the quality of synthesis. within the listening tests, 10 listeners (native speakers with no background in speech processing, text-to-speech synthesis or speech prosody) rated the tts system performance in terms of naturalness of synthesized speech on a scale from 1 (unnatural, robotic speech) to 5 (speech with apparently natural prosodic features). the listeners were presented with examples of synthesized speech using either the proposed cart-based generator or its previous version based on an expert system implementing explicit rules governing prosodic features. the utterances (a total of 20) were not marked, and their ordering was varied. the average score given to the cart-based system was 3.9, as opposed to 3.5 given to the rule-based version (the corresponding standard deviations were 0.39 and 0.41 respectively). figure 4 shows a comparison of three fundamental frequency contours for the sentence 'הודעה זאת מושמעת באמצעות מערכת הקראה אוטומטית', corresponding to the utterance as rendered by the native speaker (blue), the reference system [5] (grey) and the proposed system (green). the three contours have been manually time-aligned to the utterance as rendered by the human speaker (indicated by the waveform and the phonemic labelling). it can be observed that the intonation curve as generated by the reference system seems quite regular, unlike the curves corresponding to the native speaker and the proposed system, which seem to exhibit more variation. furthermore, it can be seen that a much greater percentage of frames in the speech signal generated by the reference system were identified as voiced, in comparison to the other two systems.
this is related to the characteristic buzziness present in the speech signal generated by the reference system, which (together with a rather monotonous intonation) was one of the major drawbacks of the reference system as reported by the listeners. however, most listeners also reported that the intonation contours of both synthesizers are adequately related to the positions of stressed syllables. fig. 4 fundamental frequency contours for an example sentence, corresponding to the native speaker (blue), reference system [5] (grey), and proposed system (green). 6. conclusion by using the expert system in combination with cart the quality of synthesized speech is considerably increased. based on the results of the listening tests, the system described in the paper provided more natural-sounding speech (an average score of 3.9 as opposed to 3.5) when compared to the previous version of the system, in which the prosody was estimated using the expert system. an additional benefit of automated prosody generation is in the fact that such an automated system can be adapted to different dialects of the hebrew language much more easily and in much less time than the expert system. namely, covering a different dialect of hebrew would require that a new speech corpus be recorded and tagged, and that the automatic training procedure be repeated, which is still widely considered to be far simpler than discovering new sets of expert rules related to prosody. the quality of synthesized speech could be further improved by widening the set of relevant questions as well as by improving the segment selection and signal concatenation modules. acknowledgement: this research work has been supported by the ministry of education, science and technological development of the republic of serbia, and it has been realized as a part of the research project tr 32035. references [1] j.p.h. van santen, "contextual effects on vowel duration", speech commun., 1992, vol.
11, no. 6, pp. 513-546. [2] m. seĉujski, n. jakovljević and d. pekar, "automatic prosody generation for serbo-croatian speech synthesis based on regression trees", in proceedings of the 12th annual conference of the international speech communication association, 2011, florence, italy, pp. 3157-3160. [3] ö. öztürk and t. çiloğlu, "segmental duration modelling in turkish", in proceedings of the 9th international conference on text, speech and dialogue, brno, czech republic, lect. notes comput. sc., springer, 2006, vol. 4188, pp. 669-676. [4] a. lazaridis, p. zervas, n. fakotakis and g. kokkinakis, "a cart approach for duration modeling of greek phonemes", in proceedings of the 12th international conference on speech and computer, 2007, moscow, russia, pp. 287-292. [5] d. kamir, n. soreq and y. neeman, "a comprehensive nlp system for modern standard arabic and modern hebrew", in proceedings of semitic’02, the acl-02 workshop on computational approaches to semitic languages, 2002, acl, stroudsburg, pa, usa, pp 1-9. [6] n. chomsky, morphophonemics in modern hebrew. routledge, 2012. [7] j. fellman, "concerning the "revival" of the hebrew language", anthropol. linguist., may 1973, vol. 15, no. 5, pp. 250-257. [8] b. popović, m. seĉujski, v. delić, m. janev and i. stanković, "automatic morphological annotation in a text-to-speech system for hebrew", in proceedings of the 15th international conference on speech and computer, pilsen, czech republic, lect. notes comput. sc., springer, 2013, vol. 8113, pp. 319-326. [9] l. breiman, j.h. friedman, c.j. stone and r.a. olsen, classification and regression trees. chapman & hall/crc, boca raton, london, new york, washington d.c., 1984. [10] a. black and n. campbell, "optimising selection of units from speech databases for concatenative synthesis", in proceedings of the 4th european conference on speech communication and technology, 1995, madrid, spain, pp. 581-584. [11] v. delić, m. seĉujski, n. jakovljević, m. janev, r. 
obradović and d. pekar, "speech technologies for serbian and kindred south slavic languages", adv. speech recognition, chapter 9, 2010.

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 299-316 doi: 10.2298/fuee1402299k

electrification of the vehicle propulsion system – an overview

vladimir a. katić, boris dumnić, zoltan čorba, dragan milićević university of novi sad, faculty of technical sciences, novi sad, serbia

abstract. to achieve eu targets for 2020, internal combustion engine cars need to be gradually replaced with hybrid or electric ones, which have low or zero ghg emission. the paper presents a short overview of the dynamic history of electric vehicles, which led to today's modern solutions. different possibilities for the realization of the electric power system are described. electric vehicle (ev) operation is analyzed in more detail. the market future of evs is discussed and plans for 2020, up to 2030, are presented. other effects of the electrification of vehicles are also analyzed. key words: ev short history, electric vehicles, ev power system

1. introduction

the transportation sector is a major energy consumer. as statistical data from 2009 and 2010 show, transportation accounted for as much as 19% of total global energy use in 2009 [1]. in the eu its share in 2010 went up to 31.7% or 365.2 million toe [2]. it also contributed 23% of the energy-related greenhouse gas (co2) emission (2012), which is a significant increase from 6.5% in 1990 [2], [3]. if the current trend continues, transportation energy use and co2 emission are projected to increase by nearly 50% by 2030 [1]. although the eu 2020 policy target is to decrease greenhouse gas (ghg) emission by 20% in 2020, the above data show that the transportation sector will not contribute much to it.
this future is not sustainable, as the effects of climate change resulting from global temperature increase and the fast rise of co2 concentration (fig. 1, left) are evident. such a negative trend needs to be addressed and some possible solutions should be pointed out. replacement of the internal combustion engine (ice) with electric propulsion in passenger cars is seen as a way to decrease ghg emission and to mitigate climate change problems [4]. electric propulsion is not new; it dates back to the mid-19th century, when the first electric vehicles (ev) were presented [5]. however, the market destiny was not favorable to evs and in the 1930s they were totally abandoned. the revival started in the late 1980s, when environmental awareness of the population increased due to the fast rise of ghg and especially co2 emission (fig. 1, left). at the same time, fast depletion of fossil fuels raised oil prices and put forward questions about the energy future of mankind (fig. 1, right). intensive research efforts and new improved power conversion technology enable rapid development and cost-effective solutions.

received february 12, 2014. corresponding author: vladimir katić, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: katav@uns.ac.rs)

(source: http://philebersole.wordpress.com/2012/03/19/the-epic-history-of-oil/) (source: http://www.climatechoices.org.uk) fig. 1 concentration of the co2 and temperature change since 1850 (left) and crude oil prices history 1861 – 2010 (right)

different structures of electric drive trains are possible: hybrid, as a combination of the ice and an electric motor, or fully electric [4]. propulsion using fuel cells or hydrogen energy is also offered, but it will not be discussed in more detail in this paper. nowadays, all major car makers are offering some models with hybrid or electric propulsion.
still, the motor type (induction, synchronous, reluctance, brushless dc or other) and battery packaging are not standardized and there is a lot of room for innovations and improvements. in this paper, an overview of the current status in this field, regarding the problems presented above, is given. additionally, market prospects and trends up to 2030 are considered, showing that the ill destiny of early electric cars will not be repeated. 2. short history the end of the 19th century brought great discoveries in the field of electrical engineering and raised enormous hopes for its rapid development and application. one of the fascinating presentations of that time was at the world exhibition in chicago in 1893, where a new product for transportation, the electric car, was shown. this was the result of more than 100 years of discoveries and innovations in the field of electricity, which led to the creation of a simple electric carriage powered by non-rechargeable primary battery cells by the scottish inventor robert anderson in 1838, the invention of the rechargeable lead-acid battery by the french physicist gaston planté in 1859 and its basic improvement for use in vehicles by another frenchman, camille faure, in 1881 [5]. the first electric car was made by the french engineer gustave trouvé in 1881. it was a three-wheel vehicle with a 70 w dc motor powered by a lead-acid battery. at the same time the englishmen william ayrton and john perry developed their solution, an electric tricycle with a motor power of 350 w and a maximum speed of about 15 km/h. the vehicle was supplied from lead-acid batteries, and speed control was achieved by changing battery connections [5]. the sudden development of electric traction using a dc drive and its commercialization in urban regions (electric trams, trains, ships, etc.) led several companies to start ev manufacturing in 1896. the first produced cars found application as new york city taxis (fig. 2). they had a maximum speed of 32 km/h and a range of up to 40 km [5].
(source http://en.kllproject.lv/new-york-yellow-taxi-photo.html) fig. 2 electric (yellow) taxi in new york around 1901 at the beginning of the 20th century motorized transport customers could choose between steam-powered, gasoline or electric vehicles. the market was divided, without any indication as to which drive would be dominant in the future. steam-powered vehicles were fast and cheaper, but they suffered from a long start-up time (they needed about 45 minutes to warm up) and needed to make frequent stops for water. vehicles with internal combustion engines had vibration, noise and smell due to exhaust gases. they needed manual operation to start, and changing gears presented a special problem during a drive, but the price was moderate and they could be used for longer trips at a reasonable speed without stopping. the popularity of electric vehicles was due to some advantages they had over their competitors. they were convenient for short distances, within the city limits, easy to start and to drive (no difficulties with gear shifting), and capable of reaching high speeds. the first man to break the 100 km/h speed barrier was camille jenatzy, a belgian race driver, with his electric car named la jamais contente in 1899. electric vehicles were also clean and quiet, but expensive, as they were built for the upper class in the form of massive carriages, with fancy interiors and from expensive materials. during this period, nearly fifty companies manufactured electric cars, covering 38% of the u.s. market. electric vehicles prospered until the 1920s, with the peak of production in 1912 [5]. electric drive technology did not keep pace with the population's need for long-distance travel, either in terms of speed or in terms of a suitable infrastructure for energy supply (battery charging stations).
already in 1913 the general observation was that electric vehicles were losing the competition with gasoline cars. the great depression at the end of the 1920s, in the united states and in the world, drastically limited the resources for innovation in this area, so production gradually decreased in the u.s. and at other companies in europe. the final blow came when the ford motor company developed a system of mass production and launched the famous ford model t at a price 50% lower than that of the corresponding electric cars. by the 1930s electric vehicles had disappeared from the car market in the usa. however, a series of circumstances brought electric cars back into the focus of researchers and general public attention. developments in power electronics and the wide application of semiconductor (solid-state) power converters in the late 1950s reduced losses, improved the operation of electric drives and increased energy efficiency to over 90%. new algorithms of analogue and later digital control using microprocessors enable high-quality and reliable motor speed control occupying small space. in the sixties the production was limited to small experimental types. models p50 and peel trident of the peel electric mini car company were suitable for city rides and parking, introducing a three-wheeler structure with fiberglass bodywork (fig. 3, left). similarly, the model enfield 8000 (fig. 3, right) was produced as a small city car in london and had two doors and four seats. a dc electric motor of 6 kw and 220 ah lead-acid batteries enabled a range of up to 90 km and a maximum speed of 60 km/h. only 106 units were manufactured, so the price was not competitive and it was no threat to the dominant gasoline-powered cars. (source: http://www.trendhunter.com) (source: www.veicolielettricinews.it) fig.
3 peel electric cars (1962) and model 8000 enfield (1969) one of the turning points in the automotive industry happened in the mid-1970s, when oil supply was restricted due to political instabilities in the middle east, resulting in a serious energy crisis and a sharp jump in oil prices (fig. 1). additionally, news that the oil reserves are limited and that they will soon be exhausted brought new concerns. at the same time, concerns for the environment and high air pollution in the big cities, due to exhaust gases from gasoline cars and the threat to population health, started a “green” movement in many countries. with increased environmental awareness and with the effects of ghg (especially co2) emission on climate change (fig. 1), the movement became worldwide. it encouraged the search for alternatives in transportation power trains and led to the reconsideration of electric cars or the proposal of hybrid solutions. during the 1980s the research efforts continued, but besides the various models of mini cars, no serious commercial attempts were made. a key problem has been the batteries, their great weight and relatively small energy capacity leading to a short driving range. an additional challenge has been the improvement of the competing heat engine, i.e. a significant reduction of exhaust gases and fuel consumption of the ice. still, the nineties brought the first model of electric car to the market, the ev1, which general motors produced from 1996 to 2003 (fig. 4, left). the ev1 accelerated from 0 to 100 km/h in 8 s, its maximum speed was 160 km/h and it had a range of 193 km. the first models in 1996 used 53 ah lead-acid batteries, with an internal voltage of 312 v, which enabled a range of 100 km. later models (2nd generation, 1999-2003) switched to nimh (nickel metal hydride) batteries, which reduced weight and increased the range to up to 240 km.
the batteries were assembled into a pack with a capacity of 26.4 kwh, which consisted of 26 modules of 13.2 v and 77 ah each, giving a bus voltage of 343 v. however, the price of the basic model was high and the batteries needed complete replacement after only 40,000 km, so it did not withstand the competition and was removed from the market.

another significant market attempt was the launch of honda's ev plus model in 1997, which used a 49 kw brushless dc motor and nimh batteries (fig. 4, right). the ev plus had a maximum speed of 130 km/h and a range of up to 160 km. only 340 units were produced, and they were offered exclusively on lease, as their price was high. in the end, honda pulled all the cars from the market and dismantled most of them in 1999.

(source: http://www.telegraph.co.uk/motoring/picturegalleries/5423182/a-history-of-general-motors-in-pictures.html) (source: http://www.barthworks.com)
fig. 4 general motors ev1 (1996) and honda model ev plus with nimh batteries

frequent price increases and the uncertainty of the oil market, as well as mature environmental awareness, especially in economically developed countries, have contributed to the fact that the beginning of the 21st century is marked by the decision of the majority of the world's great car manufacturers to start the production of hybrid and electric vehicles. the landmark is clearly marked by two models: a hybrid car, toyota’s prius, and an electric car, tesla motors’ roadster.

the toyota prius is the result of careful thinking and an evolutionary strategy of passenger car development from the ice, over hybrid, to electric propulsion. it is the first successful hybrid model, produced for more than 15 years. it has appeared in three generations (prius, prius + and prius plug-in) and more than 3,000,000 units have been sold (fig. 5).
the details of these models are widely known, but it is worth mentioning that it is a hybrid solution in which the 500 v electric motor (27 kw, 37 hp) is powered directly from an electric generator driven by the internal combustion engine (much as the ice drives the alternator in standard vehicles) [6]. the internal combustion engine has a displacement of 1.8 l and 100 hp, with an average fuel consumption of 4 l/100 km, and the excess energy is stored in batteries. the nimh battery has a capacity of 6.5 ah and consists of 28 modules of 7.2 v, with a total weight of 30 kg and an output voltage of 202 v. the specific battery power is 1300 w/kg, and the durability is 300,000 km. the latest generation, the plug-in prius, includes larger battery packs and the ability to charge from the public distribution network. other similar hybrid models, especially plug-in solutions, include the chevrolet volt (or vauxhall ampera or opel ampera), ford fusion energi phev, ford c-max energi phev, etc.

(source: http://www.automagazin.rs)
fig. 5 three generations of toyota prius, the top selling hybrid in the world

the tesla roadster appeared in 2008 and represents an innovation in the field of electric cars. its electric propulsion is based on a three-phase, four-pole ac electric motor, which is controlled by a microprocessor-controlled three-phase inverter and powered from a lithium-ion (li-ion) battery with a capacity of 60 kwh and 200,000 km of guaranteed operation (there is an option of 85 kwh batteries with an unlimited warranty). its driving performance is impressive: it needs only 3.7 s to accelerate to 100 km/h, and it may drive up to 400 km on a single battery charge (fig. 6). for charging it may use a garage battery charger, which can fill the battery in 4 h, or a mobile charger, with which charging takes 6 h.
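the battery figures quoted in this section can be cross-checked with simple arithmetic. the python sketch below is only an illustration under stated assumptions: the ev1 modules are read as 13.2 v and 77 ah each (the reading that reproduces the quoted 343 v and 26.4 kwh), all modules are taken as identical and connected in series, and the charging estimate ignores charger losses and the tapering of current near full charge. the function names are hypothetical.

```python
def series_pack(modules: int, module_voltage_v: float, capacity_ah: float):
    """Return (pack voltage in V, pack energy in kWh) for identical modules in series."""
    # In a series string the voltages add; the ampere-hour capacity is that of one module.
    voltage_v = modules * module_voltage_v
    energy_kwh = voltage_v * capacity_ah / 1000.0
    return voltage_v, energy_kwh


def average_charging_power_kw(capacity_kwh: float, charging_time_h: float) -> float:
    """Average power needed to deliver the full capacity in the given time (lossless)."""
    return capacity_kwh / charging_time_h


# EV1, second generation: 26 modules, assumed 13.2 V and 77 Ah each.
ev1_voltage_v, ev1_energy_kwh = series_pack(26, 13.2, 77.0)
# Prius NiMH pack: 28 modules of 7.2 V, 6.5 Ah.
prius_voltage_v, prius_energy_kwh = series_pack(28, 7.2, 6.5)
# Roadster figures as quoted in the text: 60 kWh filled in 4 h or 6 h.
garage_kw = average_charging_power_kw(60.0, 4.0)
mobile_kw = average_charging_power_kw(60.0, 6.0)

print(f"ev1:   {ev1_voltage_v:.1f} V, {ev1_energy_kwh:.1f} kWh")
print(f"prius: {prius_voltage_v:.1f} V, {prius_energy_kwh:.2f} kWh")
print(f"roadster charging: {garage_kw:.0f} kW (4 h), {mobile_kw:.0f} kW (6 h)")
```

the series-string results agree with the quoted 343 v and 26.4 kwh for the ev1 and with the 202 v output voltage of the prius pack, while the charging times correspond to average powers of about 15 kw and 10 kw.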
the new model (2014), the tesla model s, offers some improved options like fast battery swap and better charging.

(source: http://www.teslamotors.com/roadster)
fig. 6 tesla roadster model 2008

the tesla motors company is also making a significant investment in developing a network of charging stations for the energy supply of electric vehicles across the country. in 2014 there are charging stations in most metropolitan areas in the u.s. with fast chargers of 120 kw, where charging takes only 30 min. for 2015 the company is planning to cover the most important cities and routes in the usa with fast chargers, enabling an easy coast-to-coast ride (fig. 7).

in recent years, more and more serially produced electric car models have become available on the market, some of which are already well known: the aforementioned tesla roadster, then the mitsubishi i-miev, nissan leaf, reva (manufacturer reva electric car company from india), peugeot ion, citroen c-zero, renault zoe, bmw i3 and others. the leaf and i-miev, with total sales of over 15,000 units each, are now the best-selling full electric cars.

(source: http://www.teslamotors.com/supercharger)
fig. 7 the charging station infrastructure in usa – tesla motors plan for 2015

3. electric vehicular power system

a passenger vehicle or motor car has four power systems: mechanical, hydraulic, pneumatic and electrical, which operate a large number of different loads and assist the main power train of the propulsion system, the internal combustion engine (ice). however, all these systems are actually powered from the ice, except the batteries, which can be charged off-board (this option is rarely used, i.e. only when the batteries are depleted and need to be recharged). to improve efficiency and reduce the oil consumption and gas emissions of an ice, the involvement of an electrically powered drive train is needed.
for example, the efficiency of an ice reaches 30-33%, while that of an electric motor (machine) can be 80-90% or even higher.

motor car electrification, i.e. the application of electric energy for powering apparatus or equipment in ice-powered vehicles, started way back in 1908, when the first electric device was implemented. it was the electric horn, or klaxon, which was powered directly from dry battery cells. the problem of non-rechargeable dry cells and the need for better lighting were solved by the introduction of rechargeable batteries and a dynamo generator in 1912. the main goal at that time was to simplify driving, especially starting, and then to increase safety and to improve convenience and passenger comfort. therefore, many other electrical apparatus and electronic devices were introduced later on (electric starter, windscreen wipers, air-conditioning, modern entertainment systems, on-board computers, information systems, sensors, radars, parking assistance, and many others), demanding more electrical power and a secure and reliable supply. however, these earlier electrification goals did not include the main power train, i.e. the ice, whose electrification is the trend of the recent decade.

nowadays, motor car electrification has a wider goal, the one mentioned at the beginning of this paper, i.e. to improve efficiency, reduce emissions and improve performance. in addition, the need for a smarter, more reliable and safer vehicle, integrated into modern communication (internet) and social networks but also connected with other vehicles on the road, requires more electric, electronic, communication and computer systems on board.

the level of electrification is defined as the ratio of peak electric power to the peak power of all power generating systems (electric and ice) [7]. regarding this feature of a motor vehicle, several levels of electrification may be distinguished:
1. ice with non-propulsion electric systems
2. more electric vehicle (mev)
3. hybrid electric vehicle (hev)
4. plug-in hybrid electric vehicle (phev)
5. full electric vehicle or battery electric vehicle (ev or bev)
6. future evs – fuel-cell powered evs, solar assisted evs, etc.

3.1. ice with non-propulsion electric systems

an ice vehicle with non-propulsion electric systems uses electric power for the operation of a number of electrical loads (previously mentioned), but not for the drive train. all those loads are supplied from an alternator (12 v, 0.8 – 1.7 kwp), an ac electrical generator with ac/dc conversion, with backup from a lead-acid battery (12 v, 45-110 ah). as the number of loads increases, energy management and higher efficiency become essential. the long-debated proposal for increasing the battery voltage to 42 v has been abandoned due to significant additional costs and lower reliability, although many advantages have been pointed out [8].

3.2. more electric vehicle (mev)

a more electric vehicle (mev) is a car that keeps its ice propulsion system, but optimizes the other (non-propulsion) systems, especially the electrical one. the main characteristic of such vehicles is the integration of the starter and alternator (integrated starter/alternator – isa), which enables easy implementation of the start-stop function and regenerative braking. the start-stop function stops the engine during short stops at traffic lights and in similar situations where the engine would otherwise idle, resulting in lower fuel consumption and co2 emission per km. the regenerative braking function converts the kinetic energy of the vehicle during braking or downhill driving into electrical energy and charges the batteries. that function improves energy management and the utilization of the different electrical loads.

3.3. hybrid electric vehicles (hev)

hybrid electric vehicles (hev) are a step in the evolution towards full electric ones.
they are a compromise between the huge investments needed for developing a completely new vehicle model and the requirements of modern society to decrease co2 emission and fuel consumption. they may also be classified as low emission vehicles in compliance with new legislation, especially in the state of california (usa) [9].

the main idea of the hev is to apply electric energy for propulsion in addition to the ice. depending on the extent of electric propulsion, different levels of hybridization are defined. these levels are expressed with the vehicle’s hybridization factor, i.e. the ratio between its peak electrical power and its peak total electrical and mechanical power. in that sense, hybrid vehicles can be divided into micro hybrids, mild hybrids, power (full) hybrids and energy hybrids [7]. micro hybrids usually have a hybridization factor of 5-10%, mild hybrids 10-25%, and power and energy hybrids 30-50%.

the main advantages of hybridization, i.e. of having a dual power train, are that combining an electrical motor and an ice gives higher efficiency, better drive flexibility and improved driving autonomy, while fuel consumption and gas emissions are decreased. there are several ways of organizing such a dual power train, so hevs may be additionally classified into series, parallel and series-parallel [10]-[14].

the series hybrid architecture consists of three machines connected in series (fig. 8). the ice drives an ac electric generator that produces power for charging the batteries (dc) and driving the ac electric motor, which is attached to the transmission or directly to the differential or the wheels. to connect the different power systems and voltage levels, an ac/dc and a dc/ac converter are needed. in this way, the dc link decouples the two electrical machines, while the electrical system mechanically decouples the ice from the wheels.
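the hybridization factor defined earlier in this subsection is easy to evaluate numerically. the following python sketch is a minimal illustration, assuming the factor is peak electric power divided by the sum of peak electric and peak ice power, and using the percentage bands quoted in the text; the function names and the example power ratings are hypothetical and do not describe any particular vehicle.

```python
def hybridization_factor(peak_electric_kw: float, peak_ice_kw: float) -> float:
    """Ratio of peak electric power to peak total (electric + ICE) power."""
    return peak_electric_kw / (peak_electric_kw + peak_ice_kw)


def hybrid_class(factor: float) -> str:
    """Map a hybridization factor (0..1) onto the bands quoted in the text."""
    if 0.05 <= factor <= 0.10:
        return "micro hybrid"
    if 0.10 < factor <= 0.25:
        return "mild hybrid"
    if 0.30 <= factor <= 0.50:
        return "power or energy hybrid"
    # The quoted bands leave gaps (e.g. 25-30%), so some values stay unclassified.
    return "outside the quoted ranges"


# Hypothetical power ratings, chosen only to land inside each band.
for electric_kw, ice_kw in [(8.0, 92.0), (15.0, 85.0), (50.0, 75.0)]:
    factor = hybridization_factor(electric_kw, ice_kw)
    print(f"{electric_kw:.0f} kW electric + {ice_kw:.0f} kW ice: "
          f"{factor:.0%} -> {hybrid_class(factor)}")
```

values that fall between the quoted bands are deliberately reported as unclassified rather than forced into the nearest class, since the text does not say how such cases are treated.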
the problem is that the overall efficiency gain is not significant due to the multiple power conversions.

fig. 8 series hybrid electric vehicle architecture

in a parallel hybrid vehicle both the electrical machine and the ice contribute to the propulsion, as they are mechanically coupled to the transmission or the wheels (fig. 9). the electrical machine assists the ice in order to reduce fuel consumption, so it is used mainly during start-up and acceleration. it also enables regenerative braking and battery charging through a power electronics converter. the battery, i.e. the energy storage system, is relatively small, providing enough power for short operation, but not enough to sustain the all-electric mode, especially at high speeds. for such an application, supercapacitors (ultracapacitors) have recently been proposed [15].

fig. 9 parallel hybrid electric vehicle architecture

in a series-parallel hybrid two electrical machines are combined with the ice to provide both series and parallel paths for power (fig. 10). the electrical motor and the ice are mechanically coupled for delivering power to the wheels. the ice is also coupled with an electrical generator to generate electricity for charging the batteries. the batteries provide power to the electric motor. regenerative charging of the batteries is also possible.

fig. 10 series-parallel hybrid electric vehicle architecture

3.4. plug-in hybrid vehicles (phev)

while hevs have electric assistance to the ice, phevs have a high-energy-density energy storage system that can be externally charged (fig. 11). this enables the vehicle to run in full electric mode or in electric assisted mode. again, series, parallel or series-parallel hybrid power trains are possible. as the batteries are of higher energy capacity than in hevs, such vehicles are also called range-extended hevs.

(source: http://www.mge.com/environment/innovative/hybrid-vehicles.htm) fig.
11 phev in comparison to hev

phevs may have an on-board charger, which can be bidirectional and enable smart charging, recognized as the vehicle-to-grid (v2g) charging mode. off-board or public chargers in v2g mode may also be used, together with home chargers (vehicle-to-home, v2h, charging mode). this gives phevs more flexibility and the possibility of running in two operating modes: charge-depleting and charge-sustaining. in the first mode, electrical energy from the batteries is used to provide power until it is consumed, i.e. until the state of charge of the batteries reaches a predefined minimal value. after that the ice is turned on and the vehicle runs in a hybrid mode. if the state of charge of the batteries is sustained within a predefined range, the operating mode is called charge-sustaining.

3.5. full electric vehicle or battery electric vehicle (ev or bev)

full electric vehicles, also called electric vehicles (evs) or battery electric vehicles (bevs), have an all-electric propulsion system. there is no engine other than the electric motor. the electric power system follows the same concept as at its beginning at the end of the 19th century, when energy was supplied from rechargeable batteries and an energy conversion block adapted it to the needs of the dc electric motor that powered the car. however, the technology and the efficiency of the whole system have been improved through intensive research and innovation using computer modeling, simulation, emulation and laboratory and prototype testing [4], [7], [11], [13], [14], [16]-[19].
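the two phev operating modes described in section 3.4 amount to a simple rule on the battery state of charge (soc). the python sketch below is a toy illustration of that rule, not a model of any real energy management system: the 30% minimum soc, the 20% drain per step and the assumption that the ice holds the soc exactly constant in charge-sustaining mode are all hypothetical simplifications, and the function name is invented for this example.

```python
def simulate_phev(soc_percent: int, steps: int,
                  drain_percent_per_step: int = 20,
                  soc_min_percent: int = 30):
    """Return a list of (mode, soc_percent) pairs, one per simulated step."""
    mode = "charge-depleting"
    history = []
    for _ in range(steps):
        # Leave charge-depleting mode once the SOC reaches the predefined minimum.
        if mode == "charge-depleting" and soc_percent <= soc_min_percent:
            mode = "charge-sustaining"  # the ICE is turned on, hybrid operation
        if mode == "charge-depleting":
            # Propulsion energy comes from the battery only.
            soc_percent = max(soc_percent - drain_percent_per_step, 0)
        # In charge-sustaining mode the SOC is assumed to be held constant.
        history.append((mode, soc_percent))
    return history


history = simulate_phev(soc_percent=90, steps=5)
for mode, soc in history:
    print(f"{mode:17s} soc = {soc}%")
```

integer percentages are used so that the threshold comparison is exact; a real controller would work with a soc band and hysteresis rather than a single switching point.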
nowadays power train components include improved high voltage batteries of high energy density, supercapacitors (or ultracapacitors) of high power density, a battery (energy) management system, a high voltage dc grid (130 v – 400 v), a power conversion system (dc/dc converters and dc/ac inverter), an on-board battery charger (ac/dc converter), an ac motor/generator, a low voltage battery and a low voltage (12 v) dc power grid for non-propulsion loads. off-board chargers and plug-in features are also included in the system. the complete electric power system of an ev is shown in fig. 12.

fig. 12 typical power system architecture in an ev

3.5.1. batteries and supercapacitors

the batteries and supercapacitors form the energy supply and storage system of an ev, enabling high energy and power supply. normally, the batteries (packed to produce a high voltage output) are the main energy source of the propulsion system, determining the operation of the vehicle and its driving range. besides this one, the ev has a separate low voltage battery (12 v) for supplying non-propulsion loads. to improve the performance, especially in driving moments with high power demand, like starting the car or fast acceleration, a supercapacitor is considered as an additional power source in the power train, in parallel to the batteries [16], [20]-[22].

the batteries have evolved from the lead-acid ones, which are reliable, low-priced and standardized, but heavy (9-15 kg), short-lived (500-800 cycles) and of lower specific energy (33-42 wh/kg) and specific power (0.18 kw/kg). today’s li-ion batteries are more convenient for ev and phev applications, having 3,500 cycles in a lifetime, a specific energy of 130-140 wh/kg and a specific power of 2.4 kw/kg (fig. 13).
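the practical meaning of the specific energy figures above becomes clearer when they are converted into battery mass. the python sketch below does this for a hypothetical 24 kwh pack; the pack size is an assumption chosen only for illustration, the specific energy ranges are the ones quoted in the text, and the result covers cell mass only (no casing, cooling or management electronics).

```python
# Specific energy ranges in Wh/kg, as quoted in the text.
SPECIFIC_ENERGY_WH_PER_KG = {
    "lead-acid": (33.0, 42.0),
    "li-ion": (130.0, 140.0),
}


def cell_mass_range_kg(energy_kwh: float, chemistry: str):
    """Return (lightest, heaviest) cell mass in kg for the requested energy."""
    low_wh_per_kg, high_wh_per_kg = SPECIFIC_ENERGY_WH_PER_KG[chemistry]
    energy_wh = energy_kwh * 1000.0
    # A higher specific energy gives a lighter pack, so the bounds swap.
    return energy_wh / high_wh_per_kg, energy_wh / low_wh_per_kg


PACK_ENERGY_KWH = 24.0  # hypothetical pack size, not taken from the text

for chemistry in SPECIFIC_ENERGY_WH_PER_KG:
    lightest, heaviest = cell_mass_range_kg(PACK_ENERGY_KWH, chemistry)
    print(f"{chemistry:9s}: {lightest:.0f}-{heaviest:.0f} kg for {PACK_ENERGY_KWH:.0f} kWh")
```

for this assumed pack size the lead-acid cells would weigh roughly 570-730 kg against roughly 170-185 kg for li-ion, which illustrates why battery weight was the key obstacle for earlier electric cars.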
increased production and innovation will lead to further improvements, so in 2015 li-ion batteries are expected to reach a specific energy of 250-300 wh/kg and a specific power of 3.5 kw/kg, while the costs will decrease from 0.5-0.6 €/wh in 2011 to 0.15-0.25 €/wh in 2015 [16].

however, the main problem of long battery charging time still remains to be solved. improvements of the ac/dc converters that convert ac power from the public grid to dc, either as off-board (public or home) or as on-board chargers, resulted in four charging modes, from the fast one (20-30 min) to the slow one (6-8 h) [23]. however, this is still not satisfactory, as people are used to the short refuelling time of the ice. one possible solution is offered by flow batteries, which are a kind of rechargeable fuel cell [4], [24]. there are several types (like redox, hybrid and membrane-less), but the most promising are the vanadium redox batteries, which have a lifetime of 10,000 cycles and quick and easy recharging, similar to refuelling an ice, as it is done simply by replacing the electrolyte. the main disadvantages are a relatively poor specific energy of 10-20 wh/kg and the complexity and size of the system.

(source: http://en.wikipedia.org/wiki/file:supercapacitors-vs-batteries-chart.png#file)
fig. 13 energy storage devices: energy vs power density

fast charging is achieved with supercapacitors or ultracapacitors (electrical double-layer capacitors) for power applications, which are energy storage devices like electrolytic capacitors, but with capacitance values of up to several thousand farads (fig. 13) [15], [21].
their main advantages are a very high specific power (from 2 kw/kg to 15 kw/kg in 2013, with an expected further rise up to 30 kw/kg), a long lifetime (10^5 – 10^6 charge/discharge cycles) and a very short charging time (several seconds). on the other hand, their specific energy is relatively low (10-15 wh/kg), so they are not suitable for use as the sole energy storage unit.

3.5.2. energy conversion

the standard electric power system of ice-powered cars has a 12 v dc bus, which is appropriate for powering all loads. due to the increased power demand in modern vehicles, a 42 v dc bus was considered, but this idea has been abandoned for economic reasons. however, in hevs and evs, besides the conventional vehicular loads, there is an ac electric motor as the main propulsion, which needs a higher operating voltage. therefore, a separate high voltage dc bus, supplied from a high voltage battery, is needed [13].

the battery output voltage depends on the state of charge, i.e. on the depletion level, and may vary between 125 v and 200 v. a regenerative dc/dc converter is used to boost the voltage up to the dc bus level of 400 v. if the battery voltage is below the nominal 200 v, then the dc bus voltage is also decreased, down to a minimum of 267 v. the on-board or off-board charger is a three-phase ac/dc converter in h-bridge or back-to-back topology connected to the high voltage dc bus. the dc/dc converter operating in buck mode transfers the energy to the high voltage battery and, through the 12 v dc/dc converter, to the low voltage battery as well. the traction inverter, whose function is to provide ac power to the main traction ac motor, has an input voltage range between 190 v and 400 v. the inverter has an h-bridge topology, composed of igbt switches with free-wheeling diodes, and is controlled with space vector pwm or other control algorithms. all these converters operate in switch mode, resulting in high efficiency and low losses.
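to give a feeling for what boosting the battery voltage to the dc bus level demands from the dc/dc converter, the python sketch below evaluates the textbook duty-cycle relation of an ideal boost converter, d = 1 − v_in / v_out. this is an assumption made for illustration: it presumes a lossless converter in continuous conduction and says nothing about the actual converter topology or control used in the cited designs; the function name is hypothetical.

```python
def ideal_boost_duty_cycle(v_in: float, v_out: float) -> float:
    """Duty cycle of an ideal (lossless, continuous-conduction) boost converter."""
    if not 0 < v_in <= v_out:
        raise ValueError("a boost converter needs 0 < v_in <= v_out")
    return 1.0 - v_in / v_out


# Battery voltage limits and DC bus level as quoted in the text.
duty_depleted = ideal_boost_duty_cycle(125.0, 400.0)
duty_nominal = ideal_boost_duty_cycle(200.0, 400.0)

print(f"125 V -> 400 V: duty cycle {duty_depleted:.4f}")
print(f"200 V -> 400 V: duty cycle {duty_nominal:.4f}")
```

under these idealized assumptions the duty cycle rises from 0.5 at the nominal 200 v to about 0.69 for a depleted 125 v battery; very high duty cycles are generally avoided in practice, which is one plausible reading of why the text lets the bus voltage fall when the battery voltage drops.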
still, sophisticated energy management is needed to coordinate the energy flow and enable high efficiency. further improvements in these directions are expected in the future.

3.5.3. traction electric motor

although a dc motor seems the logical choice for ev propulsion, as it is powered from dc batteries, today’s solutions are based on ac motors. induction or synchronous ac motors are used as traction motors, due to their lower weight and cost, higher reliability and lower maintenance needs. for high power propulsion, an induction motor is used. for example, the tesla roadster uses a 3-phase 4-pole induction motor of 185 kw with a maximum speed of 6,000 rpm. for four-wheel drive, four permanent magnet synchronous motors (pmsm) are mounted as a part of the wheel structure. the advantage of such a realization is the elimination of the mechanical gears and differential, which are used in a drive system with a single radial machine. this gives higher efficiency, less weight and improved reliability, but in-wheel motors are subject to the usual size and weight restrictions, so they are convenient for small vehicles.

3.6. future evs

nowadays electric vehicles are powered from electric batteries, which are charged from the public electric grid. however, such electric energy is partially (30%-70%) generated in fossil fuel power plants (coal, oil or similar), and therefore such evs do not contribute to the reduction of co2 emission and the improvement of the environment in the full sense. in fact, the emission is only moved from the big cities to the area where the coal or oil plant is located. it may be estimated that the coal plant co2 emission attributable to driving a 40 kw (52 hp) ev is around 200 gco2/km [16]. therefore, additional efforts and technical innovations are needed to achieve the full green effect of ev operation.
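the dependence of an ev's indirect co2 emission on the generation mix can be sketched with a one-line estimate. in the python example below, the vehicle consumption (0.2 kwh/km) and the emission factor of a coal plant (1000 gco2/kwh) are assumed round numbers, chosen only because together they reproduce the roughly 200 gco2/km quoted above for pure coal supply; the non-fossil share of the mix is treated as emission-free, and grid and charging losses are neglected.

```python
def ev_co2_g_per_km(consumption_kwh_per_km: float,
                    fossil_share: float,
                    fossil_emission_g_per_kwh: float) -> float:
    """Indirect CO2 emission per km; non-fossil generation is counted as zero."""
    return consumption_kwh_per_km * fossil_share * fossil_emission_g_per_kwh


CONSUMPTION_KWH_PER_KM = 0.2    # assumed vehicle consumption
COAL_EMISSION_G_PER_KWH = 1000.0  # assumed coal plant emission factor

# 100% coal, then the 30% and 70% fossil shares mentioned in the text.
for share in (1.0, 0.3, 0.7):
    grams = ev_co2_g_per_km(CONSUMPTION_KWH_PER_KM, share, COAL_EMISSION_G_PER_KWH)
    print(f"fossil share {share:.0%}: about {grams:.0f} gCO2/km")
```

with these assumptions the estimate scales linearly with the fossil share, from about 60 gco2/km at 30% to about 140 gco2/km at 70%, which makes concrete why the generation mix decides how green ev operation really is.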
there are several ideas, but for the moment only applications with solar energy and with hydrogen energy using fuel cells have been realized as prototypes in some ev models.

solar energy is converted into electrical energy using the photovoltaic effect in photo-cells. there are two ways of using this renewable energy source. one is charging the batteries while the vehicle is parked, and the other is powering the vehicle from photo-cells integrated in the body of the vehicle. the first solution is very popular, enabling different designs of parking shades. the second possibility is practical only for non-propulsion apparatus/loads in the vehicle, like air-conditioning, or in the case of very light vehicles [25].

a fuel cell (fc) uses h2 gas as the fuel and combines it with oxygen to produce electricity, with h2o as the output. therefore it is environmentally friendly and does not emit any ghg. the complete scheme of an fc ev is shown in fig. 14. the cell produces electricity and stores it in the batteries, from which it is consumed by the electric motor. the process has slow dynamics, so additional power from supercapacitors is considered [26].

fig. 14 fuel cell electric vehicle architecture

4. effects of vehicles electrification

electrification of the vehicle propulsion system and the development of the plug-in ev (pev) industry is the final step toward the achievement of low or even zero emission passenger cars. the process has progressed from the individual efforts of some innovators in the 1960s to today’s determination of all major car makers to include at least one hybrid or electric model in their portfolio. at the moment the number of hevs, phevs and evs sold is rapidly rising, spreading from a group of countries called the ev initiative (evi).
the market success and high public acceptance of hybrid models in the usa, especially the toyota prius (sold in more than 3,000,000 units) and the chevrolet volt, made a breakthrough in the application of electric energy in the propulsion of modern vehicles. the usa dominates the phev market with a 70% share, followed by japan with 12% and the netherlands with 8% in 2012 [16]. the data show that there were around 180,000 pevs on the road in 2012 [27]. as of december 2013, this number has risen to 380,000 pevs (passenger cars and utility vans) worldwide, and almost 2,000,000 evs (pevs+hevs).

fig. 15 (left) shows annual ev sales by drive train (hev, phev, bev) in 2013 and further prospects up to 2022 [27]. it can be seen that there is a huge ev market prospect and that annual sales of several million units are envisaged. other sources forecast that in 2020 the ev industry (hev+phev+bev) will produce between 5,000,000 units [27] and 7,500,000 units [28], [29], reaching 15% [7] to 20% [28] (fig. 15, right) of all vehicle sales.

fig. 15 left: annual evs worldwide sales forecast 2013–2022 [27]; right: cumulative evs sales 2009-2020 [28].

still, most of the sales will be in the range of mild and full hybrid evs, which are in the class of low emission vehicles (lev). therefore, the goal of decreasing the overall level of co2 by 20% in the eu will not be achievable with electrification alone. however, in the long run, up to 2050, the effects will be more significant. fig. 16 shows the cumulative ghg emission savings of a fleet of 11.2 million evs that will be sold between 2010 and 2020 under three scenarios. the maximum savings reach almost 45 million metric tons of ghg (for comparison, in 2009 the u.s. emitted 7000 million metric tons) [29].

fig.
16 cumulative ghg savings due to use of evs 2010-2030 [29]

another effect of the fast growth of evs may be an increase in electricity demand and an influence on the stability of the electrical grid. an estimation with three similar scenarios shows that an additional 7 twh will be needed in 2020, and 40 twh in 2030 (fig. 17). this is not a significant demand for the capacity of a large country like the usa, so it may be concluded that there will be no major problem in providing the electricity supply for evs. another study shows that in california no additional capacity will be needed to charge 10 million evs between 11 p.m. and 8 a.m., but 30% of new capacity will be required if these vehicles are charged between 5 p.m. and 12 a.m. [30].

fig. 17 number of evs and their electricity demand forecast (2010-2030).

5. conclusion

the electrification of vehicles is entering its final stage, in which the remaining ice propulsion is gradually being replaced with electric propulsion. different solutions are possible for the drive train: hybrid, plug-in hybrid or full electric. the ev power system is characterized by a dominant dc bus, an ac electric motor and multiple voltage levels. the main power source is the battery, but additional power may also be supplied from supercapacitors. to operate such a system, several power electronics converters with sophisticated control methods and energy management are needed. also, special on-board and/or off-board chargers are integrated in the system.

the market perspectives for the ev industry are very promising. a huge rise in production is expected in the coming years. this will decrease ghg emission, but only in the long run. on the other hand, no significant influence on the existing public power system is expected, especially if battery charging is performed during night hours.
still, to become competitive with ice cars, further improvements and innovative development are needed and expected in the future.

acknowledgement: the paper is a part of the research done within the project no. 114-4513508/2013-04 co-financed by the provincial secretariat for science and technological development of a.p. vojvodina.

references

[1] international energy agency, “transport, energy and co2 – moving toward sustainability”, report, paris, 2009, http://www.iea.org/publications/freepublications/publication/transport2009.pdf
[2] european commission – eurostat, “consumption of energy”, data from august 2012, on-line only: http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/consumption_of_energy
[3] international energy agency, “technology roadmap: electric and plug-in hybrid electric vehicles (ev/phev)”, report, paris, released 2009, updated june 2011, http://www.iea.org/publications/freepublications/publication/name,3851,en.html
[4] b.k. bose, “global energy scenario and impact of power electronics in 21st century”, ieee transactions on industrial electronics, vol.60, no.7, pp.2638-2651, july 2013, doi: http://dx.doi.org/10.1109/tie.2012.2203771
[5] ***, “the history of electric vehicles”, electric vehicles news, available on-line (feb. 2014), http://www.electricvehiclesnews.com/history/historyearly.htm
[6] r.h. staunton, c.w. ayers, l.d. marlino, j.n. chiasson, t.a. burress, “evaluation of 2004 toyota prius hybrid electric drive system”, report for the u.s. department of energy, may 2006, http://k0bg.com/images/pdf/890029.pdf
[7] a. emadi, “transportation 2.0”, ieee power & energy magazine, vol.9, no.4, pp.18-29, july/aug. 2011, doi: http://dx.doi.org/10.1109/mpe.2011.941320
[8] j.g. kassakian, h.c. wolf, j.m. miller, c.j. hurton, “automotive electrical systems circa 2005”, ieee spectrum, vol.33, no.8, pp.22-27, aug. 1996, doi: http://dx.doi.org/10.1109/6.511737
[9] u.s. department of energy, alternative fuels data center, “california laws and incentives for air quality / emissions”, 2013, http://www.afdc.energy.gov/laws/laws/ca/reg/3843
[10] a. emadi, s.s. williamson, a. khaligh, “power electronics intensive solutions for advanced electric, hybrid electric, and fuel cell vehicular power systems”, ieee transactions on power electronics, vol.21, no.3, pp.567-577, may 2006, doi: http://dx.doi.org/10.1109/tpel.2006.872378
[11] k. rajashekara, “present status and future trends in electric vehicle propulsion technologies”, ieee journal of emerging and selected topics in power electronics, vol.1, no.1, pp.3-10, march 2013.
[12] c. shen, p. shan, t. gao, “a comprehensive overview of hybrid electric vehicles”, international journal of vehicular technology, vol.2011, article id 571683, 7 pages, 2011, on line available: http://dx.doi.org/10.1155/2011/571683
[13] h. van hoek, m. boesing, d. van treek, t. schoenen, r.w. de doncker, “power electronics architectures for electric vehicles”, int. conf. on emobility electrical power train, 8-9 nov. 2010, leipzig, doi: http://dx.doi.org/10.1109/emobility.2010.5668048
[14] a. emadi, y.j. lee, k. rajashekara, “power electronics and motor drives in electric, hybrid electric, and plug-in hybrid electric vehicles”, ieee transactions on industrial electronics, vol.55, no.6, pp.2237-2245, june 2008, doi: http://dx.doi.org/10.1109/tie.2008.922768
[15] j.w. dixon, m. ortúza, e. wiechmann, “regenerative braking for an electric vehicle using ultracapacitors and a buck-boost converter”, 17th electric vehicle symposium (evs17), montreal (canada), oct.15-18, 2000, http://web.ing.puc.cl/~power/paperspdf/dixon/42a.pdf
[16] n.c. kar, k.l.v. iyer, a. labak, x. lu, ch. lai, a. balamurali, b. esteban, m. sid-ahmed, “courting and sparking: wooing consumers’ interest in the ev market”, ieee electrification magazine, vol.1, no.1, pp.21-31, sep. 2013, doi: http://dx.doi.org/10.1109/mele.2013.2272481
[17] e.m. adzic, m.s. adzic, v.a. katic, d.p. marcetic, n.l. celanovic, “development of high-reliability ev and hev ac propulsion drive with ultra-low latency hil environment”, ieee transactions on industrial informatics, vol.9, no.2, pp.630-639, may 2013, doi: http://dx.doi.org/10.1109/tii.2012.2222649
[18] n. janiaud, f.-x. vallet, m. petit, g. sandou, “electric vehicle powertrain simulation to optimize battery and vehicle performances”, ieee vehicle power and propulsion conference (vppc), lille, 1-3 sep. 2010, doi: http://dx.doi.org/10.1109/vppc.2010.5729141
[19] seref soylu (editor), “electric vehicles – modelling and simulations”, intech europe, rijeka, croatia, 2011, http://www.intechopen.com/books/electric-vehicles-modelling-and-simulations
[20] l. sun, c.c. chan, r. liang, q. wang, “state-of-art of energy system for new energy vehicles”, ieee vehicle power and propulsion conference (vppc), september 3-5, 2008, harbin, china, doi: http://dx.doi.org/10.1109/vppc.2008.4677574
[21] m. halper, j. ellenbogen, “supercapacitors: a brief overview”, mitre, mc lean, virginia, usa, march 2006, http://www.mitre.org/sites/default/files/pdf/06_0667.pdf
[22] c.c. chan, l. sun, r. liang, q. wang, “current status and future of energy storage system for ev”, 23rd int. battery, hybrid and fuel cell electric vehicle sym. & exh. (evs-23), anaheim, 2-5 dec. 2007, http://www.lifepo4.info/battery_study/batteries/current_status_and_future_of_energy_storage_system_for_ev.pdf
[23] iec 62196-1 standard: “plugs, socket-outlets, vehicle couplers and vehicle inlets – conductive charging of electric vehicles”, genève, 2003.
[24] t. nguyen, r. savinell, “flow batteries”, interface, the electrochemical society, vol.19, no.3, pp.54-56, fall 2010, http://www.electrochem.org/dl/interface/fal/fal10/fal10_p054-056.pdf
[25] r. sims, p. mercado, w. krewitt, et al., “integration of renewable energy into present and future energy systems”, chapter 8 of the ipcc special report on renewable energy sources and climate change mitigation, cambridge university press, cambridge, u.k. and new york, usa, 2011, http://srren.ipcc-wg3.de/report/ipcc_srren_ch08.pdf
[26] k. rajashekara, “propulsion system strategies for fuel cell vehicles”, sae 2000 world congress, detroit, usa, march 6-9, 2000, http://am.delphi.com/pdf/techpapers/2000-01-0369.pdf
[27] s. shepard, j. gartner, “electric vehicle market forecasts”, navigant research, report, 4q 2013, http://www.navigantresearch.com/research/electric-vehicle-market-forecasts
[28] r. lache, d. galves, p. nolan, “electric cars: plugged in 2”, deutsche bank, fitt research report, nov. 2009, http://gold-estate.com/content/lithium/electriccarspluggedin2.pdf
[29] l. schewel, d.m. kammen, “smart transportation: synergizing electrified vehicles and mobile information systems”, environment – science and policy for sustainable development, vol. 4, sep.-oct. 2010, on line available: http://www.environmentmagazine.org/archives/back%20issues/septemberoctober%202010/smart-transportation-full.html
[30] federal communications commission, “connecting america: national broadband plan”, march 2010, http://download.broadband.gov/plan/national-broadband-plan.pdf
facta universitatis
series: electronics and energetics vol. 30, no 1, march 2017, pp. 39-48
doi: 10.2298/fuee1701039v

super-sech soliton dynamics in optical metamaterials using collective variables

marija veljković 1, daniela milović 1, milivoj belić 2, qin zhou 3, seithuti p. moshokoa 4, anjan biswas 4,5

1 faculty of electronic engineering, department of telecommunications, university of niš, serbia
2 science program, texas a&m university at qatar, po box 23874, doha, qatar
3 school of electronics and information engineering, wuhan donghu university, wuhan-430212, people's republic of china
4 department of mathematics and statistics, tshwane university of technology, pretoria-0008, south africa
5 department of mathematics, faculty of science, king abdulaziz university, jeddah-21589, saudi arabia

abstract. this paper presents the collective variable approach to super-sech soliton dynamics in optical metamaterials. the soliton dynamics is governed by the generalized nonlinear schrödinger's equation. numerical simulations of the pulse width, amplitude, chirp and frequency are given.

key words: solitons, metamaterials, super-sech

1. introduction

optical metamaterials, as a novel type of microstructured material, have been extensively studied [1–15]. metamaterials (mms) are artificial composite structures with both negative permittivity and negative permeability, and with fascinating physical properties at terahertz and optical frequencies. different waveguide structures using metamaterials have already been demonstrated in the optical region [3].
an optical waveguide can be implemented as a slab structure with a core made of positive-index material and claddings of double-negative materials. these waveguides are engineered using advanced processing technology. however, the design of microstructured materials is limited by losses. nevertheless, the development of low-loss metamaterials could be the foundation of switches, modulators and other novel optical devices in all-optical integrated information processing systems. the transmission of ultrashort pulses through such a promising material exhibits unique features.

received september 26, 2016
corresponding author: daniela milović, faculty of electronic engineering, department of telecommunications, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: dachavuk@gmail.com)

it is well known that the soliton is one of the remarkable nonlinear excitations produced by the balance between nonlinearity and group velocity dispersion [9–11, 13–19]. recent research points out that ultrashort pulses propagating in mms can be described by a modified generalized nonlinear schrödinger equation (gnlse) in which the linear and nonlinear coefficients can be tailored to attain any combination of signs, including those unachievable in ordinary materials [1–13]. simply engineering the mms can tailor their linear and nonlinear effective properties. nonlinear mms exhibit rich spatiotemporal dynamics and promising applications which were unthinkable in the past [10–14]. metamaterials enhance nonlinearity by confining the electric field in a small region, so compensating losses and nonlinearity is a great challenge when metamaterials are used as waveguides. in metamaterials, the linear and nonlinear coefficients of the propagation equation can be set to achieve any combination of signs that is not possible in regular materials.
these properties of metamaterials allow the propagation of a wider variety of solitary waves, efficient phase-matching and modulational instability. earlier results disclose that, as in regular (positive-index) dielectric materials, dispersion plays a crucial role in supporting short-duration soliton pulses. the dynamics of soliton propagation through these optical metamaterials is governed by the nonlinear schrödinger's equation (nlse) with a few perturbation terms. the integrability aspect of this model was studied with various forms of nonlinearity [9-15]. different algorithms are used to yield solitons, shock waves and other solutions to the model that appear with several integrability conditions.

1.1. governing equation

the dynamics of solitons in optical metamaterials is governed by the model [4-7]

$$ i q_z + a q_{tt} + b |q|^2 q = i\alpha q_t + i\lambda \left(|q|^2 q\right)_t + i\nu \left(|q|^2\right)_t q + \theta_1 \left(|q|^2 q\right)_{tt} + \theta_2 |q|^2 q_{tt} + \theta_3 q^2 q^*_{tt} \tag{1} $$

equation (1) is the nonlinear schrödinger's equation (nlse) studied in the context of metamaterials. in (1), a and b are the coefficients of the group velocity dispersion and the self-phase modulation terms, respectively. this pair produces the delicate balance between dispersion and nonlinearity that accounts for the formation of stable solitons. on the right-hand side, λ is the coefficient of the self-steepening term, which is included in order to avoid the formation of shocks, ν is the coefficient of the nonlinear dispersion, and α that of the intermodal dispersion. finally, θj for j = 1, 2, 3 are the coefficients of the perturbation terms that appear in the context of metamaterials [1].

2. collective variable approach algorithm

the principle of collective variables (cv) implies that the solution of the nlse is split into two components [9, 11, 14]. the first one constitutes the soliton solution, while the second one represents the residual radiation.
the decomposition of the original soliton field q(z,t) is made at position z in the fiber and at time t, as follows:

$$ q(z,t) = f(z,t) + g(z,t) \tag{2} $$

the soliton field f is defined as a function that depends on N parameters, the collective variables, symbolically represented by $x_j$, $j = 1, \ldots, N$:

$$ q(z,t) = f(x_1, x_2, \ldots, x_N, t) + g(z,t), \tag{3} $$

where the collective variables represent the soliton amplitude, temporal position, pulse width, chirp, frequency and phase of the pulse. introducing the cvs in the function f increases the number of degrees of freedom, which expands the available phase space of the system. this is an undesirable effect, so constraints are imposed, and the residual free energy, given by

$$ E = \int_{-\infty}^{\infty} |g|^2 \, dt = \int_{-\infty}^{\infty} \left| q - f(x_1, x_2, \ldots, x_N) \right|^2 dt, \tag{4} $$

should be minimized. from this definition, let $C_j$ denote the rate of change of the residual free energy with respect to the j-th cv $x_j$:

$$ C_j = \frac{\partial E}{\partial x_j} = \int_{-\infty}^{\infty} \frac{\partial |g|^2}{\partial x_j} \, dt = \int_{-\infty}^{\infty} \left( g^* \frac{\partial g}{\partial x_j} + g \frac{\partial g^*}{\partial x_j} \right) dt \tag{5} $$

the second quantity that should be defined is $\dot{C}_j$, the rate of change of $C_j$ with the normalized distance. using $g(z,t) = q(z,t) - f(x_1(z), x_2(z), \ldots, x_N(z), t)$ in the above equation, $C_j$ can be rewritten as:

$$ C_j = -\int_{-\infty}^{\infty} \left( g^* \frac{\partial f}{\partial x_j} + g \frac{\partial f^*}{\partial x_j} \right) dt \tag{6} $$

now, $\dot{C}_j$ can be written as:

$$ \dot{C}_j = \frac{dC_j}{dz} = -2\Re \frac{d}{dz} \int_{-\infty}^{\infty} \frac{\partial f^*}{\partial x_j} g \, dt = -2\Re \left( \int_{-\infty}^{\infty} \frac{\partial f^*}{\partial x_j} \frac{\partial g}{\partial z} \, dt + \sum_{k=1}^{N} \int_{-\infty}^{\infty} \frac{\partial^2 f^*}{\partial x_j \partial x_k} g \, dt \; \dot{x}_k \right) \tag{7} $$

the overhead dot represents the derivative with respect to z and the subscripts $x_j$ denote partial derivatives. $\Re$ represents the real part and $\langle u, v \rangle$ means $\int_{-\infty}^{\infty} u \, v \, dt$.
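the definitions (4)-(6) can be checked numerically. the sketch below is illustrative only: the trial function `trial`, its three parameters and the sampled field `q` are assumptions chosen for the example, not the ansatz used in the paper. it evaluates the residual free energy on a uniform grid and compares $C_j$, computed from (6), with a finite-difference derivative of the energy.

```python
import numpy as np


def trial(x, t):
    """Hypothetical trial function f: amplitude x[0], width x[1], chirp x[2]."""
    return x[0] / np.cosh(t / x[1]) * np.exp(0.5j * x[2] * t**2)


def residual_energy(q, x, t):
    """E = integral of |q - f|^2 dt on a uniform grid, as in (4)."""
    g = q - trial(x, t)
    return np.sum(np.abs(g) ** 2) * (t[1] - t[0])


def constraint(q, x, t, j, h=1e-6):
    """C_j = -2 Re integral of g * conj(df/dx_j) dt, as in (6).

    df/dx_j is approximated by a central finite difference.
    """
    xp = np.array(x, dtype=float)
    xm = xp.copy()
    xp[j] += h
    xm[j] -= h
    df = (trial(xp, t) - trial(xm, t)) / (2.0 * h)
    g = q - trial(x, t)
    return -2.0 * np.real(np.sum(g * np.conj(df))) * (t[1] - t[0])


def energy_gradient(q, x, t, j, h=1e-5):
    """Finite-difference estimate of dE/dx_j, for comparison with C_j."""
    xp = np.array(x, dtype=float)
    xm = xp.copy()
    xp[j] += h
    xm[j] -= h
    return (residual_energy(q, xp, t) - residual_energy(q, xm, t)) / (2.0 * h)


t = np.linspace(-20.0, 20.0, 4001)
q = trial([1.1, 0.9, 0.2], t)   # sampled field (assumed for the example)
x = [1.0, 1.0, 0.0]             # current collective variables (assumed)

for j in range(3):
    print(j, constraint(q, x, t, j), energy_gradient(q, x, t, j))
```

at the parameter values that reproduce the sampled field exactly, g vanishes, so both the residual free energy and every $C_j$ are zero.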
thus,

$$ \dot{C}_j = -2\Re \left( \left\langle \frac{\partial f^*}{\partial x_j}, \frac{\partial g}{\partial z} \right\rangle + \sum_{k} \left\langle \frac{\partial^2 f^*}{\partial x_j \partial x_k}, g \right\rangle \dot{x}_k \right) \tag{8} $$

dirac's principle implies that if a function is approximately zero, it cannot be set equal to zero until its variations with respect to all its parameters are made. therefore, the $C_j$ are at a minimum and the equations of the constraints are obtained as:

$$ C_j \approx 0 \tag{9} $$

$$ \dot{C}_j \approx 0 \tag{10} $$

substituting (2) into (1), we obtain the equation of motion of the residual field g(z, t), which upon substitution into (7) gives

$$ \dot{C}_j = \sum_{k=1}^{N} 2\Re \left( \int_{-\infty}^{\infty} \frac{\partial f^*}{\partial x_j} \frac{\partial f}{\partial x_k} \, dt - \int_{-\infty}^{\infty} \frac{\partial^2 f^*}{\partial x_j \partial x_k} g \, dt \right) \frac{dx_k}{dz} + R_j \tag{11} $$

where, with $q = f + g$,

$$ R_j = -2\Re \int_{-\infty}^{\infty} f^*_{x_j} \Big[ i a q_{tt} + i b |q|^2 q + \alpha q_t + \lambda \left(|q|^2 q\right)_t + \nu \left(|q|^2\right)_t q - i\theta_1 \left(|q|^2 q\right)_{tt} - i\theta_2 |q|^2 q_{tt} - i\theta_3 q^2 q^*_{tt} \Big] dt $$

equation (11) is equivalent to the matrix equation

$$ \dot{\mathbf{C}} = \left[ \frac{\partial \mathbf{C}}{\partial \mathbf{X}} \right] \dot{\mathbf{X}} + \mathbf{R} \tag{12} $$

$$ \left[ \frac{\partial \mathbf{C}}{\partial \mathbf{X}} \right] = \begin{pmatrix} \frac{\partial C_1}{\partial x_1} & \frac{\partial C_1}{\partial x_2} & \cdots & \frac{\partial C_1}{\partial x_N} \\ \frac{\partial C_2}{\partial x_1} & \frac{\partial C_2}{\partial x_2} & \cdots & \frac{\partial C_2}{\partial x_N} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial C_N}{\partial x_1} & \frac{\partial C_N}{\partial x_2} & \cdots & \frac{\partial C_N}{\partial x_N} \end{pmatrix}, \quad \mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}, \quad \mathbf{R} = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_N \end{pmatrix} \tag{13} $$

with

$$ \frac{\partial C_j}{\partial x_k} = 2\Re \left( \int_{-\infty}^{\infty} \frac{\partial f^*}{\partial x_j} \frac{\partial f}{\partial x_k} \, dt - \int_{-\infty}^{\infty} \frac{\partial^2 f^*}{\partial x_j \partial x_k} g \, dt \right) \tag{14} $$

3.
super-sech parameter dynamics

in this section the soliton parameter dynamics in optical metamaterials is obtained by the cv approach. we assume that the function f has the form:

$$ f = x_1 \, \mathrm{sech}^m\!\left( \frac{t - x_2}{x_3} \right) \exp\left[ i \frac{x_4}{2} (t - x_2)^2 + i x_5 (t - x_2) + i x_6 \right] \tag{15} $$

where $x_1$ stands for the soliton amplitude, $x_2$ for the center position of the soliton, $x_3$ for the pulse width, $x_4$ for the soliton chirp parameter, $x_5$ for the soliton frequency and $x_6$ for the soliton phase that evolves along with propagation. m is the super-sech parameter, with m > 0. in this case N = 6 and the matrices have dimension 6 × 6. the equations for all the cvs are obtained under the lowest-order cv theory, the bare approximation. applying the bare approximation means that the residual field is set to zero, g(z,t) = 0. for m = 2 the elements of the matrix R are given by equations (16)-(21).

[equations (16)-(21): explicit expressions for $R_1, \ldots, R_6$ in terms of $x_1, \ldots, x_6$ and the coefficients of (1)]

finally, the nonlinear dynamical system (ds) reduces to equations (22)-(27).

[equations (22)-(27): evolution equations for $\dot{x}_1, \ldots, \dot{x}_6$, with the definitions of the constants p, q and r]

4. results and conclusion

the collective variable approach was applied to solve the evolution equation that governs the dynamics of a soliton and its propagation through optical metamaterials. numerical investigations of the evolution of the pulse parameters have been carried out in order to illustrate the results of the collective variable approach. the results were obtained using the standard fourth-order runge-kutta method to integrate the system of ordinary differential equations that resulted from the cv analysis. figure 1 presents the dynamics of the system for the following parameter values: [symbol illegible] = 0.25, a = 0.1, b = 20, [remaining values illegible]. as the pulse propagates, the amplitude ($x_1$), pulse width ($x_3$), frequency ($x_5$) and chirp ($x_4$) vary periodically. the control parameter of the soliton solution as it evolves is the total energy Q. the total energy can be expressed as a function of the super-sech parameters:

$$ Q = \frac{4}{3} x_1^2 x_3 \tag{28} $$

this expression shows that the total energy depends strongly on the amplitude ($x_1$) and the pulse width ($x_3$). the collective variables method enables a clear analysis of the equations and reveals the influence of the various parameters.

fig. 1 variation of pulse parameters ($x_1$: soliton amplitude, $x_2$: center position of the soliton, $x_3$: pulse width, $x_4$: soliton chirp, $x_5$: soliton frequency, $x_6$: soliton phase) with propagation distance.

in conclusion, we have investigated the dynamics of an ultrashort pulse in optical fibers using the cv approach. this paper could be used for further investigations of soliton dynamics and of the influence of nonlinear parameters on the soliton amplitude, temporal position, frequency, phase and chirp.

acknowledgement: this research is funded by qatar national research fund (qnrf) under the grant number nprp 6-021-1-005. the second, third and sixth authors (dm, mb & ab) thankfully acknowledge this support from qnrf. the second author (dm) thankfully acknowledges the support from the ministry of education, science, and technological development of the republic of serbia [iii 44006, tr-32051]. the fourth author (qz) was funded by the national science foundation of hubei province in china under the grant number 2015cfc891. the fifth author (spm) would like to thank the research support provided by the department of mathematics and statistics at tshwane university of technology and the support from the south african national foundation under grant number 92052 irf1202210126. the sixth author (ab) would like to thank tshwane university of technology for his academic visit in 2016. the authors also declare that there is no conflict of interest.

references
[1] v. g. veselago, "the electrodynamics of substances with simultaneously negative values of ε and μ", sov. phys. usp., vol. 10, no. 4, pp. 509-514, 1968.
[2] n. a. zharova, i. v. shadrivov, a. a. zharov, y. s.
kivshar, "nonlinear transmission and spatiotemporal solitons in metamaterials with negative refraction", optics express, vol. 13, no. 14, pp. 1291-1298, 2005.
[3] v. m. shalaev, nature photonics, vol. 1, p. 41, 2007.
[4] a. biswas, k. r. khan, m. f. mahmood & m. belic, "bright and dark solitons in optical metamaterials", optik, vol. 125, issue 13, pp. 3299-3302, 2014.
[5] a. biswas, m. mirzazadeh, m. savescu, d. milovic, k. r. khan, m. f. mahmood & m. belic, "singular solitons in optical metamaterials by ansatz method and simplest equation approach", journal of modern optics, vol. 61, issue 19, pp. 1550-1555, 2014.
[6] a. biswas, m. mirzazadeh, m. eslami, d. milovic & m. belic, "solitons in optical metamaterials by functional variable method and first integral approach", frequenz, vol. 68, issues 11-12, pp. 525-530, 2014.
[7] g. ebadi, a. mojavir, j. vega-guzman, k. r. khan, m. f. mahmood, l. moraru, a. biswas & m. belic, "solitons in optical metamaterials by f-expansion scheme", optoelectronics and advanced materials – rapid communications, vol. 8, no. 9-10, pp. 828-832, 2014.
[8] m. veljkovic, y. xu, d. milovic, m. f. mahmood, a. biswas & m. r. belic, "super-gaussian solitons in optical metamaterials using collective variables", journal of computational and theoretical nanoscience, vol. 12, no. 12, pp. 5119-5124, 2015.
[9] s. i. fewo & t. c. kofane, "a collective variable approach for optical solitons in cubic-quintic complex ginzburg-landau equation with third order dispersion", optics communications, vol. 281, issue 10, pp. 2893-2906, 2008.
[10] p. green, d. milovic, d. lott & a. biswas, "dynamics of gaussian optical solitons by collective variables method", applied mathematics and information sciences, vol. 2, issue 3, pp. 259-273, 2008.
[11] e. v. krishnan, m. al gabshi, q. zhou, k. r. khan, m. f. mahmood, y. xu, a. biswas & m. belic, "solitons in optical metamaterials by mapping method", journal of optoelectronics and advanced materials, vol. 17, no 3-4, pp.
511-516, 2015.
[12] a. b. moubissi, k. nakkeeran, p. t. dinda & t. c. kofane, "non-lagrangian collective variable approach for optical soliton in fibers", journal of physics a, vol. 34, pp. 129-136, 2001.
[13] m. savescu, k. r. khan, p. naruka, h. jafari, l. moraru & a. biswas, "optical solitons in photonic nano waveguides with an improved nonlinear schrödinger's equation", journal of computational and theoretical nanoscience, vol. 10, no. 5, pp. 1182-1191, 2013.
[14] s. shwetanshumala & a. biswas, "femtosecond pulse propagation in optical fibers under higher order effects: a collective variables approach", international journal of theoretical physics, vol. 47, issue 6, pp. 1699-1708, 2008.
[15] s. shwetanshumala, "temporal solitons in nonlinear media modeled by modified complex ginzburg-landau equation under collective variables approach", international journal of theoretical physics, vol. 48, issue 4, pp. 1122-1131, 2008.
[16] y. xu, q. zhou, a. h. bhrawy, k. r. khan, m. f. mahmood, k. r. khan & m. belic, "bright solitons in optical metamaterials by traveling wave hypothesis", optoelectronics and advanced materials – rapid communications, vol. 9, no. 3-4, pp. 384-387, 2015.
[17] q. zhou, q. zhu, y. liu, a. biswas, a. h. bhrawy, k. r. khan, m. f. mahmood & m. belic, "solitons in optical metamaterials with parabolic law nonlinearity and spatio-temporal dispersion", journal of optoelectronics and advanced materials, vol. 16, no. 11-12, pp. 1221-1225, 2014.
[18] z. jakšić, m. obradov, s. vuković, m. belić, "plasmonic enhancement of light trapping in photodetectors", facta universitatis, series: electronics and energetics, vol. 27, issue 2, pp. 183-203, 2014.
[19] e. suhir, "fiber optics engineering: physical design for reliability", facta universitatis, series: electronics and energetics, vol. 27, no. 2, pp. 153-182, 2014.

facta universitatis series: electronics and energetics vol.
35, no 4, december 2022, pp. 483-493
https://doi.org/10.2298/fuee2204483b
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

new approach to a ds-cdma-uwb system using a pseudo orthogonal code (poc)

kada biteur 1,2, belkacem benadda 1,2, ahmed nour el islam ayad 3
1 dept of telecommunications, university abou bekr belkaid of tlemcen, algeria
2 information processing and telecommunication laboratory (ltit), university tahri mohamed, bechar, algeria
3 dept of electrical engineering, university kasdi merbah ouargla, algeria

abstract. ultra-wideband direct sequence code division multiple access (ds-cdma) plays an important role in multi-terminal, multi-application communications between uwb devices. in uwb systems the pulse itself is injected directly into the antenna, hence the very wide bandwidth, and the generation of suitable ds-cdma codes poses a real challenge. in this paper we describe our novel uwb transmission scheme, which uses pseudo-orthogonal time codes (poc) as ds-cdma sequences. the suggested codes are unipolar sequences with chips that may be dynamically modified to target a certain number of users or applications. our approach bypasses the modulation schemes commonly used in uwb systems. moreover, as a perspective for our work, it would be very interesting to implement the new approach on an fpga circuit.

key words: uwb systems, pseudo-orthogonal code (poc), direct sequence-cdma

1. introduction

ultra-wideband (uwb) technology can be integrated into many applications, such as wireless personal area networks (wpan) [1-3] and mobile telecommunications (5g today) [4-6]. uwb is a rapidly developing short-range technology with very low power consumption; it transmits information over a large part of the radio spectrum, occupying a bandwidth greater than or equal to 25% of the center frequency or 1.5 ghz [7].
uwb transmitters use pulses that are very short in time instead of modulating a carrier signal. the most widely used pulse models are second derivatives of the gaussian, whose time-domain representation is given by (1):

$$ U(t) = \left( 1 - 4\pi \left( \frac{t}{\vartheta} \right)^2 \right) e^{-2\pi \left( \frac{t}{\vartheta} \right)^2} \tag{1} $$

where ϑ represents a time normalization factor.

received february 23, 2022; revised april 18, 2022 and july 30, 2022; accepted august 31, 2022
corresponding author: biteur kada, department of telecommunications, university abou bekr belkaid of tlemcen, algeria, e-mail: biteur.kada@univ-ghardaia.dz

fig. 1 second derivative of a gaussian pulse

for wireless communications in particular, the united states federal communications commission has limited the emitted power to a very low level (below -41.3 dbm/mhz) [8], allowing uwb technology to share spectrum with other users without interference. to obtain the required spreading, various techniques can be used, such as direct sequence (ds) and time hopping (th) [9]. in th-uwb systems, user data is allotted to time frames, and pulse position modulation (ppm) is employed to eliminate overlap in multiple access networks [10-11]. in ds-uwb techniques [12], on the other hand, time spreading codes are used in the same way as in the traditional direct sequence code division multiple access (ds-cdma) technique, so they have the same advantages as direct sequence spread spectrum (dsss) [13-14].

in this paper we propose a transceiver model suitable for a new approach to direct sequence digital transmission for ultra-wideband applications (ds-uwb), using a pseudo-orthogonal time code (poc). the proposed codes are composed of unipolar sequences characterized by a length l (the number of elements, called "chips"), a predefined number of users, and the weight of the code (the number of chips at level "1").
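as a minimal sketch, the pulse model of equation (1) can be evaluated numerically as follows. the function name, the sampling grid and the choice ϑ = 1 ns are assumptions made for illustration only; they are not values taken from the paper.

```python
import numpy as np


def gaussian_second_derivative(t, theta):
    """Evaluate U(t) = (1 - 4*pi*(t/theta)^2) * exp(-2*pi*(t/theta)^2).

    t and theta must be expressed in the same time unit (nanoseconds here).
    """
    x = (np.asarray(t, dtype=float) / theta) ** 2
    return (1.0 - 4.0 * np.pi * x) * np.exp(-2.0 * np.pi * x)


# Assumed values for illustration: theta = 1 ns, 0.01 ns sampling step.
theta_ns = 1.0
t_ns = np.linspace(-5.0, 5.0, 1001)
pulse = gaussian_second_derivative(t_ns, theta_ns)

# The pulse peaks at t = 0 and crosses zero at t = +/- theta / (2*sqrt(pi)).
zero_crossing_ns = theta_ns / (2.0 * np.sqrt(np.pi))
```

with this normalization the pulse equals 1 at t = 0 and its two negative lobes reach $-2e^{-3/2} \approx -0.45$, so a single parameter ϑ sets the pulse duration and hence the occupied bandwidth.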
moreover, to enhance the synchronization between transmitters and receivers, the proposed spreading scheme codes the high-level bits '1' and the low-level bits '0' of the data stream separately, with two different codes; the resulting pair of code sequences is unique to each user. the proposed study aims to transmit ds-cdma-uwb signals without using the classical modulations associated with uwb systems. our new model, based on pseudo-orthogonal codes, builds a ds-cdma-uwb system on both the emitter and the receiver side.

direct sequences for uwb systems are explained in section 2. section 3 introduces the classical modulation schemes used in uwb systems. section 4 details the poc mechanisms. the uwb ds-cdma emitter is detailed in section 5. sections 6 and 7 cover the generation of the emitter signals and their propagation; the simulation results for signal acquisition at the receiver are presented in section 8. section 9 concludes this paper.

2. direct sequence uwb (ds-uwb)

direct sequence spread spectrum systems appear easier to implement since all the pulses are spaced at the same period, which imposes fewer constraints on the components of the transmission chain. our ds-uwb transmitter scheme uses orthogonal pseudo-random (pn) codes [15] as spreading sequences to encode each bit of information, and the bandwidth of the transmitted pulse is much greater than that of the transmitted binary stream. figure 2 illustrates a block diagram of the ds-uwb signal generator.

fig. 2 block diagram of a ds-uwb signal generator

3. modulations associated with uwb systems

there are three main modulation methods for uwb communications: ppm (pulse position modulation), ook (on-off keying) and pam (pulse amplitude modulation) [16-17].
▪ ppm modulation: the information is encoded in the time position of the pulse; bit '0' is represented by a pulse that is time-shifted with respect to the reference pulse that represents bit '1'.
▪ ook modulation: the presence of a pulse represents bit "1" and the absence of a pulse represents bit "0".
▪ pam modulation: the information is encoded in the amplitude of the pulse.

fig. 3 modulations associated with uwb systems

4. pseudo-orthogonal codes (poc)

j. a. salehi developed the poc codes in 1989 [18]. these codes are composed of unipolar sequences $C = \{c_j\}$ defined by the following parameters:
▪ l is the length of the poc code.
▪ w is the weight of the code, i.e. the number of chips at "1".
▪ $\lambda_a$ and $\lambda_c$ are the auto-correlation and cross-correlation constraints, respectively.

4.1. number of users

for $\lambda_a = \lambda_c = 1$, various works [18-19] have shown that the number of possible users of a poc code sequence is bounded by relation (2):

$$ N(L, W, 1, 1) \le \left\lfloor \frac{L - 1}{W (W - 1)} \right\rfloor \tag{2} $$

▪ n is the number of users.
▪ l and w are the length and the weight of the poc code, respectively.

4.2. construction of codes

the bibd (balanced incomplete block design) method [20] allows us to generate oc (l, w) code sequences when the desired spreading length is a prime number. it is a mathematical method based on the properties of the primitive roots of a galois field; it is simple and fast. given a primitive root α of l, the positions of the chips at 1 of the i-th sequence $C_i = [P_{i,0}; P_{i,1}; \ldots; P_{i,W-1}]$ are obtained for each code according to the parity of w [21], with the powers of α taken modulo l:

− if w is even (W = 2m):

$$ P_{i,0} = 0, \qquad P_{i,j} = \alpha^{(m \times i) + (j \times k)} \tag{3} $$

with $i \in [0, N-1]$, $j \in [0, W-2]$ and $k = 2 \times m \times N$

− if w is odd (W = (2 × m) + 1):

$$ P_{i,j} = \alpha^{(m \times i) + (j \times k)} \tag{4} $$

with $i \in [0, N-1]$, $j \in [0, W-1]$ and $k = 2 \times m \times N$

▪ α is the primitive root of l.
▪ $P_{i,j}$ is the position of a chip at 1 of the i-th code sequence $C_i = [P_{i,0}; P_{i,1}; \ldots; P_{i,W-1}]$.

table 1 shows the code positions used in our study, obtained with the bibd method. figure 4 presents the positions of the chips at "1" of the poc code (73, 4) for n = 6 users and code length l = 73.

table 1 the different positions of the (73, 4, 1, 1) code (n = 6) according to the bibd method

i (code)    first chip    j = 0    j = 1    j = 2
0 (c1)      0             1        8        64
1 (c2)      0             25       54       67
2 (c3)      0             36       41       69
3 (c4)      0             3        24       46
4 (c5)      0             2        16       55
5 (c6)      0             35       50       61

fig. 4 positions of chips at "1" of the code poc (73,4)
[panels: user1, user2, user3, user4, user5, user6]

5. new model of ds-cdma-uwb emitter

in our proposed model, the bits equal to "1" are convolved with the chips of one user's poc code and the bits equal to "0" with another user's code. this increases the bandwidth of the signal, which is emitted as low-energy gaussian-shaped pulses that are coherent on reception, as shown in figure 5.

fig. 5 ds-cdma-uwb emitter
[block diagram: data source; the bits "0"; the bits "1"; poc code of a user; poc code of another user; spread spectrum; uwb pulse generator]

the ds-cdma-uwb signal transmitted to a user can be expressed as follows:

$$ S_{POC\,code}(t) = \left[ \sum_{k=-\infty}^{\infty} b_1^k \sum_{j=0}^{N_c - 1} C_j^{U} + \sum_{i=-\infty}^{\infty} b_0^k \sum_{j=0}^{N_c - 1} C_j^{\tilde{U}} \right] \oplus W(t - i T_s - j T_c) \tag{5} $$

▪ $b_0^k$, $b_1^k$ are the 0 and the 1 bits, respectively, of the binary data sent by the k-th source
▪ $W$ is the pulse waveform
▪ $T_c$, $T_s$ are the chip and symbol durations, respectively
▪ $N_c$ is the number of chips
▪ $C_j^{U}$, $C_j^{\tilde{U}}$ are the codes of two different users, whose chips only take the values 1 or 0, up to n, the number of users.

6.
emitter simulation

we first consider a random sequence of 8 bits modeling the useful information as a limited bit stream. we then use two selected poc codes, which are completely independent of the random data sequence, to spread the spectrum [20]; this transmission method uses more bandwidth than a conventional transfer requires. as an example, we selected the 4th and 6th poc sequences (73,4) for our user (all other codes follow the same principle), i.e. the bit flow equal to "1" is convolved with the 73 chips of code #4 and the bit flow equal to "0" is convolved with the 73 chips of code #6.

$$S_{(73,4)}(t) = \left[ \left( \sum_{k=-\infty}^{\infty} b_1^k \sum_{j=0}^{73-1} C_j^{4} \right) + \left( \sum_{i=-\infty}^{\infty} b_0^k \sum_{j=0}^{73-1} C_j^{6} \right) \right] \oplus W(t - iT_s - jT_c) \qquad (6)$$

the spectrum spreading shown in figure 6 modulates the data sequence “10011011” by means of the two chosen pseudo-random poc codes, at a rate much higher than the bit rate of the information signal to be transmitted. that is, the 73 code chips of user #4 are convolved with the bits equal to 1 and the 73 code chips of user #6 with the bits equal to 0.

fig. 6 spread spectrum phase for the data sequence “10011011” (panels: data; code of user #4; code of user #6; spread spectrum; zoom of the spread spectrum for the bits '1' and '0')

6.1. generation of uwb pulses

in this paper, we used the second derivative of the gaussian pulse generated by equation (1) because of its ease of implementation in uwb systems [19-20]. as shown in figure 7, the uwb pulse generator receives the spread data, creates a train of second-order gaussian derivative pulses and outputs the signal through the antenna [22].

fig.
7 uwb pulses (panels: uwb; uwb signal; zoom of the uwb signal for the bits '1' and '0')

to comply with the regulatory agency's recommendations, the frequency band allocated to uwb transmissions has been divided into two parts: a so-called "low band" between 3 and 5 ghz and a so-called "high band" between 6 and 10 ghz [23]. our transmitted ds-uwb signal lies in the low band, as shown in figure 8, which presents the spectrum and the power spectrum of the uwb signal.

fig. 8 the spectrum and the power spectral of uwb signal (panels: spectrum of uwb signal, amplitude (v) versus frequency (hz) × 10^7; power spectral of uwb signal, 20 log10 (db) versus frequency (hz) × 10^7)

7. transmission channel

in our work, we did not examine multi-user interference (mui) [24] or inter-symbol interference (isi) [25], since these phenomena are not predominant. the only impairment considered in our system is awgn noise. the received signal can be described by $r(t) = s(t) + n(t)$, where $s(t)$ is the signal generated by the transmitter and $n(t)$ denotes the additive gaussian noise [26-27-28]. figure 9 shows the noisy signal based on the awgn channel model.

fig. 9 awgn channel output, where eb/no = 2 db

8.
the correlation receiver

additive white gaussian noise (awgn) channel

the correlation receiver shown in figure 10 is the optimal receiver for a ds-cdma-uwb chain: it applies a filter matched to the received signal by means of a correlation device. it operates in three main steps [29]:
▪ multiplication of the received signal $r(t)$ by the poc codes of users #4 and #6 combined with the uwb pulse generator:

$$R_{corr}(t) = r(t) * \left[ \left( \sum_{k=-\infty}^{\infty} b_1^k \sum_{j=0}^{73-1} C_j^{4} \right) + \left( \sum_{i=-\infty}^{\infty} b_0^k \sum_{j=0}^{73-1} C_j^{6} \right) \right] \oplus W(t - jT_c) \qquad (7)$$

▪ integration of the correlated signal over the bit time:

$$Z_1(i) = \int_{0}^{T_b} R_{corr}(t)\, dt \qquad (8)$$

▪ decision by comparison with a threshold, knowing that the poc codes of users #4 and #6 indicate bits '1' and '0', respectively.

fig. 10 correlation receiver

at the reception, it suffices to compare the correlator signal with the possibly generated poc sequence to recover the transmitted signal. figure 11 illustrates the correlator output signal with its power spectrum, the spectrum of the correlator output signal and the recovered data.

fig. 11 the correlator output signal with its power spectral, the spectrum and the data recovered

the new ds-uwb system based on poc orthogonal unipolar codes without modulation was analyzed. we are interested only in the end-to-end ds-uwb transmission chain, and we removed the modulation part in our new approach. poc codes are preconfigured (calculated in advance). our perspective is to implement our new ds-uwb approach on components such as fpga [30-31-32] or soc [33], because nowadays it is easy to build a transceiver.

9. conclusion

in this work, we suggested a new approach to a multi-user ds-cdma-uwb system using a family of pseudo-orthogonal codes (poc) on an awgn channel with a correlation receiver. applying poc codes offers an approach to ultra-wideband systems that differs from those previously used in the literature.
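As a compact recap of the chain summarized here, the spreading step of section 6 and the correlation receiver of section 8 can be sketched at chip level. This is a simplified illustration under stated assumptions, not the authors' simulation: pulse shaping and awgn noise are omitted, the chip positions are copied from table 1, and the decision compares the two correlator outputs against each other instead of a fixed threshold. All function names are hypothetical.

```python
L = 73
CODE_1 = [0, 3, 24, 46]    # code #4 (c4) from table 1, sent for bit '1'
CODE_0 = [0, 35, 50, 61]   # code #6 (c6) from table 1, sent for bit '0'


def chips(positions, length=L):
    """Unipolar chip sequence with ones at the given positions."""
    seq = [0] * length
    for p in positions:
        seq[p] = 1
    return seq


def spread(bits):
    """Replace every data bit by the 73 chips of the matching code."""
    out = []
    for b in bits:
        out.extend(chips(CODE_1 if b == "1" else CODE_0))
    return out


def despread(signal):
    """Correlate each 73-chip frame with both codes and keep the larger sum."""
    c1, c0 = chips(CODE_1), chips(CODE_0)
    bits = []
    for start in range(0, len(signal), L):
        frame = signal[start:start + L]
        z1 = sum(r * c for r, c in zip(frame, c1))   # discrete form of (8), code #4
        z0 = sum(r * c for r, c in zip(frame, c0))   # discrete form of (8), code #6
        bits.append("1" if z1 > z0 else "0")
    return "".join(bits)


tx = spread("10011011")
print(len(tx))         # 8 bits x 73 chips
print(despread(tx))    # recovered bit string
```

In the noise-free case the matching code correlates to the code weight (4) while the other code contributes only the shared chip at position 0, which is why the comparison separates the two bits.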
we have given a complete description of the ds-cdma-uwb system, including the transmission and reception formalism. this work allowed us to present and analyze a new emission-reception approach based on the ds-cdma-uwb signal.

references

[1] k. h. liu, l. cai and x. s. shen, ''exclusive-region based scheduling algorithms for uwb wp'', ieee trans. wirel. commun., 2008, 7, 933–942.
[2] z. p. li and g. s. kuo, ''layered mac for high-rate uwb wpan system'', in proceedings of the ieee 64th vehicular technology conference, melbourne, australia, 7–10 may 2006, pp. 1-5.
[3] n. m. aripin and n. fisal, ''analysis of channel time allocations for mpeg-4 video transmission over uwb wpan'', in proceedings of the ieee symposium on industrial electronics & applications (isiea 2009), kuala lumpur, malaysia, 4–6 october 2009, vol. 2, pp. 705-710.
[4] j. clerk maxwell, a treatise on electricity and magnetism, 3rd ed., vol. 2. oxford: clarendon, 1892, pp. 68-73.
[5] b. yu, d. yang and b. wang, ''design of uwb antenna with double band-notched in 5g'', in proceedings of the ieee 5th advanced information technology, electronic and automation control conference (iaeac), 12-14 march 2021, pp. 480-483.
[6] a. m. islam, e. i. emon and a. ahmed, ''a metamaterial loaded microstrip patch antenna for lower 5g'', u-nii spectrum, math. model. eng. probl., vol. 7, no. 4, pp. 556-562, dec. 2020.
[7] p. tiwari and p. k. malik, ''design of uwb antenna for the 5g mobile communication applications: a review'', in proceedings of the ieee international conference on computation, automation and knowledge management (iccakm), 9-10 jan. 2020, pp. 24-30.
[8] d. g. leeper, ''a long-term view of short-range wireless'', ieee computer, vol. 34, no. 6, pp. 39-44, jun 2001.
[9] s. elajoumi, a. tajmouati, j. zbitou, a. errkik, a. m. sanchez and m.
latrachee, ''bandwidth enhancement of compact microstrip rectangular antennas for uwb applications'', telkomnika telecommunication computing electronics and control, vol. 17, no. 3, pp. 1559-1568, 2019.
[10] c. r. nassar, f. zhu and z. wu, ''direct sequence spreading uwb systems: frequency domain processing for enhanced performance and throughput in communications'', in proceedings of the ieee international conference on communications, 2003, vol. 3, pp. 2180-2186.
[11] b. hu and n. c. beaulieu, ''accurate performance evaluation of time hopping and direct-sequence uwb systems in multi-user interference'', ieee trans. commun., vol. 53, no. 6, pp. 1053-1062, 2005.
[12] w. wu, z. y. wu and w. ji. xie, ''uwb ppm-th and pam-ds system with time reversal and its improved solution'', in proceedings of the ieee 6th international conference on information and automation for sustainability, 27-29 sept. 2012, pp. 332-336.
[13] l. lu and v. k. dubey, ''performance of a complete complementary code-based spread-time cdma system in a fading channel'', ieee trans. veh. technol., vol. 57, no. 1, pp. 250-259, jan. 2008.
[14] b. r. vojcic and r. l. pickholtz, ''direct-sequence code division multiple access for ultra-wide bandwidth impulse radio'', in proceedings of the ieee military communications conference (milcom), 2003, vol. 2, pp. 898-902.
[15] a. gupta and l. bhaskar, ''performance analysis of different pn sequence and orthogonal spreading sequences in ds-ss'', in proceedings of the ieee 5th international conference confluence the next generation information technology summit, 25-26 sept. 2014, pp. 890-892.
[16] n. t. huyen and p. t. hiep, ''proposing adaptive pn sequence length scheme for testing nondestructive structure using ds-uwb'', in proceedings of the 3rd international ieee conference on recent advances in signal processing, telecommunications & computing (sigtelcom), 21-22 march 2019, pp. 10-14.
[17] i. opperman, j. iinatti and m.
hämäläinen, uwb theory and applications, the atrium, southern gate, chichester, west sussex po 19 8sq, england, wiley 2004.
[18] h. s. hamid, m. s. mohammed and m. i. mustafa, ''design low power detection qpsk-transceiver for uwb'', in proceedings of the 3rd international conference on sustainable engineering techniques, iop conf. series: materials science and engineering, vol. 881, 2020, p. 012134.
[19] j. a. salehi and c. a. brackett, ''code division multiple-access techniques in optical fiber networks-part i: fundamental principles'', ieee trans. on comm., vol. 8, no. 37, pp. 824-833, aug. 1989.
[20] k. biteur and m. kandouci, ''successive interference cancellation receiver (sic) in ds-ocdma system'', in proceedings of the 24th international conference on microelectronics (icm), 16-20 dec. 2012, pp. 1-4.
[21] h. chung and p. kumar, ''optical orthogonal codes new bounds and an optimal construction'', ieee trans. inf. theory, vol. 36, pp. 866-873.
[22] k. biteur and m. kandouci, ''conventional receiver with optical limiter in ds-ocdma system'', int. j. adv. eng. technol., vol. 6, no. 4, pp. 1494-1504, sept. 2013.
[23] t. sarkar, a. ghosh, s. chakraborty, l. l. kumar singh, ''a new insightful exploration into a low profile ultra-wide-band (uwb) microstrip antenna for ds-uwb applications'', j. electromagn. waves appl., vol. 35, no. 3, pp. 1-19, 2021.
[24] a. jassim, ''performances of multiuser interference using pulse amplitude modulation with time hoping for ultra wideband'', international journal of electronics, communication & instrumentation engineering research and development (ijecierd), vol. 6, no. 4, aug 2016.
[25] i. čuljak, ž. lučev vasić, h. mihaldinec and h. džapo, ''wireless body sensor communication systems based on uwb and ibc technologies: state-of-the-art and open challenges'', sensors, vol. 20, no. 12, p. 3587, jun 2020.
[26] a.
ramesh, a. naresh, n. v. seshagiri rao, ''technique for reduction of inters symbol interference in uwb'', in proceedings of the international conference on emerging trends in engineering, science and technology (icetest), 2015, pp. 812-819.
[27] s. im and e. j. powers, ''an algorithm for estimating signal to noise ratio of uwb signals'', ieee trans. veh. technol., vol. 54, no. 5, pp. 1905–1908, 2005.
[28] l. bo, q.-z. liu, z.-d. yin and z.-l. wu, ''a novel snr estimator for ds-uwb wireless sensor network'', destech trans. comput. sci. eng., 2017.
[29] f. ramirez-mireles, ''on the performance of ultra wideband signals in gaussian noise and dense multipath'', ieee trans. veh. technol., vol. 50, no. 1, pp. 244-249, jan. 2001.
[30] md. a. azim, h. mohammad, m. rahman and n. amin, ''direct sequence ultra wideband system design for wireless sensor network'', in proceedings of the international conference on computer and communication engineering, 13-15 may 2008, pp. 1131-1135.
[31] l. sneler, t. matic and i. galic, ''the fpga system for evaluation of uwb wireless sensor network based on transmitted reference integral pulse frequency modulator'', in proceedings of the ieee zooming innovation in consumer technologies conference (zinc), 2018, pp. 55-57.
[32] c. thomos and g. kalivas, ''fpga-based architecture of a ds-uwb channel estimator and rake receiver employing a hybrid selection scheme'', in proceedings of the ieee 17th international conference on telecommunications, 2010, pp. 903-909.
[33] m. cervetto, e. marchi and c. g. galarza, ''a fully configurable soc-based ir-uwb platform for data acquisition and algorithm testing'', ieee embed. syst. lett., vol. 13, no. 2, pp. 53-56, june 2021.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp.
541-555
https://doi.org/10.2298/fuee2204541t
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n

original scientific paper

green computing for iot – software approach*

haris turkmanović1, ivan popović1, dejan drajić2,3, zoran čiča2

1university of belgrade, school of electrical engineering, department of electronics,
2university of belgrade, school of electrical engineering, department of telecommunications
3innovation centre of school of electrical engineering

abstract. more efficient usage of limited energy resources on embedded platforms, found in various iot applications, is identified as a universal challenge in designing such devices and systems. although many power management techniques for control and optimization of device power consumption have been introduced at the hardware and software level, only a few of them address device operation at the application level. in this paper, a software engineering approach for managing the operation of iot edge devices is presented. this approach involves a set of application-level software parameters that affect the consumption of the iot device and its real-time behavior. to investigate and illustrate the impact of the introduced parameters on the device performance and its energy footprint, we utilize a custom-built simulation environment. the simulation results obtained from analyzing a simplified data producer-consumer configuration of the iot edge tier, under the push-based communication model, confirm that careful tuning of the identified set of parameters can lead to more energy-efficient iot end-device operation.

key words: green iot, energy saving, real-time iot, push communication technology, embedded systems

1. introduction

many technological achievements in recent years, especially in the field of information and communication technologies, have enabled the usage of a wide range of iot applications and devices.
healthcare systems, smart cities, home automation and security, wearable devices, and agriculture are just some of the applications whose rapid development is facilitated by the advancement of various iot communication technologies [2]. according to research [3], the global number of connected iot devices is expected to grow nearly 10% per year, and the number of ip connections by the year 2023 is expected to be three times higher than the total world population. this accelerated growth of network traffic and the increased number of connected iot devices lead to elevated global energy consumption and pollution related to co2 emissions [4]. it has been predicted that soon iot devices and systems will be the leading energy consumers in the domain of information and communication technologies [5]. the utilization of energy-efficient technologies to reduce energy consumption as well as co2 emissions has become mandatory in the design of green iot systems (giot).

received march 6, 2022; revised april 28, 2022; accepted may 8, 2022
corresponding author: haris turkmanović
university of belgrade, school of electrical engineering, department of electronics, bulevar kralja aleksandra 73, 11120 beograd, serbia
e-mail: haris@etf.bg.ac.rs
* an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]

there are various definitions of iot systems, which commonly include different edge iot devices distributed all over the iot network. these devices produce data and autonomously communicate with other parts of the iot system without direct human intervention [6]. the design of an iot system that relies on modern iot communication technologies is very complex because of different challenges such as security [7], scalability [8], data management, real-time performance [9], and others [10].
the scope of giot systems is even more complex because it involves a different set of green technologies across the product life cycle. green technologies target hardware and software design, green production, green utilization and green disposal of iot devices. iot edge devices are usually designed as battery-operated embedded devices that have constrained resources and capabilities, such as limited memory and cpu processing power. hardware and software design requirements of such devices are mostly related to efficient usage of the limited available energy. these requirements are important to prolong iot application operational runtime and enable utilization of green technologies. therefore, the limitation in terms of the available energy resources represents the main issue in designing energy-efficient iot applications [11-13] and giot systems. the basis for designing such systems is green computing, whose main goal is to reduce the energy consumption of iot devices without degrading their performance [14]. in this paper, we introduce a set of application-level parameters that shape the communication and operational behavior of iot edge devices but also affect their energy consumption and performance. the conducted study aims to explore how a software engineering approach to controlling this set of parameters can balance the energy consumption of iot edge devices, their operational runtime [15], and the required real-time performance of iot services and applications. to quantify the contribution of the selected software engineering approach, a simulation environment is developed as a flexible open-source framework for analyzing the behavior, performance, and energy footprint of arbitrary distributed iot systems and applications [16-17].
achievements and contributions of our proposed software engineering approach in designing giot edge devices are the following:
▪ introduction of application-level software parameters that enable fine-tuning of iot application performance vs. energy consumption of the iot devices located at the network edge.
▪ extension of the simulation framework with a set of parameters for modeling the consumption of iot system processing and communication elements, which enables comprehensive analysis of the energy requirements of an arbitrary iot system and/or its components.
▪ analysis of real-time performance and energy consumption, which confirmed the trade-off potential of the proposed software approach for driving the operation of giot edge devices.

the rest of the paper is organized as follows. section 2 presents related work regarding the existing energy optimization methods. section 3 presents a brief overview of the simulation framework with details related to the implementation of energy calculation algorithms. in section 4, we present the simulation model used in our analysis, while section 5 presents and discusses the obtained results. section 6 concludes the paper.

2. related work

energy consumption-related problems have been attracting a lot of attention from the scientific and research community. data involved in processing and communication within iot applications are usually generated by battery-powered edge iot devices that sense their environment. to date, various energy optimization methods addressing different energy-intensive aspects of iot systems have been developed to prolong iot edge device operational time and to provide enhanced real-time performance of giot applications.
the rest of the section provides an overview of existing optimization methods, their classification, and approaches related to the energy-aware design of resource-constrained embedded devices found in various iot applications and systems. although the standard methodologies in designing low-power embedded systems range from simple usage of low-power products to complex algorithms for scheduling system workload, there is no single universally accepted methodology that fits the needs of all applications. all optimization techniques can be classified into two major categories: hardware and software energy optimization techniques. within this research we explored only the software optimization techniques which, based on [18], can be further classified as data center-based, cloud computing-based, and virtualization-based techniques. a more detailed classification of software optimization techniques is given in [4], where they are classified into nine different groups. based on [19], all software optimization techniques can be classified into three groups: instruction level, compiler level, and operating system level. the taxonomy presented in this paper is an entry point for the analysis of energy optimization techniques applicable to iot edge devices. paper [20] presents an overview of such techniques and discusses the influence that certain techniques have on edge device energy consumption and on overall iot system consumption. iot system edge devices are, in a certain sense, similar to sensing node devices used within wireless sensor networks (wsn). therefore, in the case of energy optimization of iot edge devices, some optimization techniques already considered for wsn can be utilized. a review of energy optimization techniques in [21-22] gives a systematic classification of the solutions that can be used to preserve energy in wsn.
moreover, these papers introduce the division of battery-powered sensor devices into basic subsystem components which consume energy: sensing subsystem, processing subsystem and communication subsystem. this division makes sensing node energy analysis more systematic, and it can also be applied in the case of giot edge devices. it is pointed out that the consumption of each subsystem must be considered equally during the energy profiling analysis of iot edge nodes. in iot applications based on wireless communication technologies, communication subsystems in most cases consume significantly more energy than processing or sensing subsystems. different software methods focus on reducing the consumption of the communication subsystem. based on the research presented in [23], all these methods can be classified into two groups: duty-cycling-based methods and in-processing methods. duty-cycling-based methods reduce energy consumption by disabling communication components when they are not used. in-processing methods use various data compression and/or data aggregation techniques to reduce the amount of data involved in communication. the amount of data involved in communication increases with the number of edge nodes participating in the communication. different research has shown that there is a certain similarity between the data produced by sensor nodes [24-25]. data aggregation methods exploit this feature to reduce the overall amount of data in iot applications. instead of forwarding the data instantaneously, data are first collected and then aggregated using functions like sum, average, threshold. in [26] some of the data aggregation methods are presented. it is shown that utilization of these methods has a huge impact on the reduction of sensor node energy consumption, but also decreases real-time performance since the data delivery delay over iot applications is increased.
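The collect-then-aggregate idea described above can be illustrated with a small sketch. It is a generic example, not a method taken from [26]: the window size, the readings and the function names are hypothetical, and the delay figure only counts the sampling periods a reading waits in the buffer.

```python
def aggregate(samples, window, func):
    """Group consecutive samples into windows and reduce each window to one value."""
    return [func(samples[i:i + window]) for i in range(0, len(samples), window)]


def mean(values):
    return sum(values) / len(values)


# hypothetical sensor readings, one per sampling period
readings = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5]
window = 3

packets = aggregate(readings, window, mean)
print(len(readings), "samples ->", len(packets), "transmitted values")
print(packets)

# the first sample of each window waits (window - 1) sampling periods
# before it is sent, which is the price paid in data delivery delay
print("worst-case buffering delay (sampling periods):", window - 1)
```

Fewer transmitted values mean less work for the communication subsystem, while the buffering delay grows with the window size, which is the energy versus delay trade-off noted in the text.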
therefore, when methods based on data aggregation are applied, it is important to use two metrics, energy consumption and data delay, to describe overall iot application performance. in [27] an analytical model is presented which enables calculation of energy consumption and packet delivery time when aggregation optimization methods are used. this model allows determining parameter values such as buffering time and maximum number of buffered packets. additionally, this work shows that the utilization of aggregation methods may lead to a significant decrease in energy consumption. in [28], it is shown that the energy consumption of the processing subsystem may increase compared to the communication subsystem when complex memory-intensive compression algorithms are used. however, a study conducted in [28] modifies already developed techniques, such as discrete cosine transform (dct) and discrete wavelet transform (dwt), that can be applied on sensor nodes to compress data. these techniques are modified in such a way that utilization of memory and processing capability is not too high compared to the original algorithm. it has been shown that, among other things, utilization of these modified techniques leads to a reduction of energy consumption and a prolonged device operational runtime. when data aggregation methods are used, it is very important to decide when to send the aggregated data to the consumer node. paper [30] names this parameter the transmission period. it is shown that this parameter has a significant impact on sensor node performance in terms of energy, data accuracy, and data freshness. a specific approach is developed in this work which makes it possible to balance energy saving against data availability at the higher tiers of hierarchically organized iot systems. in some practical applications, the sensing subsystem can consume significantly more energy than other parts of the edge device.
work presented in [31] establishes an approach based on a smart sensing policy which achieves lower energy consumption of the sensing subsystem compared to a standard fixed sensing period policy. this policy uses a learning model based on a backpropagation neural network. it has been concluded that this policy may reduce consumption by up to 50%. in [32] an adaptive sampling algorithm is proposed which can dynamically estimate the optimal sampling frequency. the performance of this algorithm is estimated in a simulation of a snow monitoring application. the obtained simulation results show that this algorithm may reduce the energy consumption of the sensing subsystem by up to 97%. besides the techniques that are directly related to our research, there are other techniques that also affect power consumption. in [33] power management techniques are categorized as dynamic voltage and frequency scaling, subthreshold design, asynchronous circuit design and power-gating. in the case of edge devices that utilize real-time operating systems, there are different os-level techniques that impact task scheduling [33-34]. within iot multimedia applications, control of parameters such as frames per second (fps) is also a common approach to lower energy consumption [36]. in the domain of e-healthcare applications there are several solutions offering energy-efficient frameworks using the internet of medical things (iomt) protocol to optimize the communication overhead and overall energy consumption while transmitting the healthcare data [37-38]. although most reviewed solutions and approaches investigate the impact of individual parameters on the power or energy consumption, none of them analyzes the trade-off potential and more complex tuning of device operation through the control of a group of parameters.
on the other hand, our study presented in this paper provides a comprehensive analysis of the performance and energy consumption properties of iot edge devices during their operation under different setups of the selected application-level parameters.

3. materials and methods

the first part of this section presents general aspects of traffic engineering relevant for iot system energy consumption and performance analysis. a set of application-level software parameters that enable tuning of the performance and consumption properties of iot devices is also introduced. the metrics associated with the quantification of these properties are presented within the first part of this section, while utilization of the simulation framework for energy analysis is described in the rest of the section.

3.1. overview of the approach

there are many different possibilities of iot system realizations, but in most cases it is possible to identify four main elements: 1) the intelligent devices where data are produced – producer devices, 2) the gateways that extract data, aggregate data, and/or perform protocol translation, 3) the network used to establish communication between devices and 4) the device which receives data – consumer device. in the simplest representation of an iot system, it is possible to consider that the system consists only of a producer and a consumer device. the communication between producer nodes and the rest of the distributed iot system determines the energy consumption of the producer nodes and the real-time performance of the iot application. two traffic engineering strategies can be applied when designing an edge-tier iot system: pull and push [39]. the messaging patterns of these two strategies are illustrated in figure 1. in the case of the pull strategy, data generated on the producer node side are sent to the consumer node side only when the consumer node sends a pull request to the producer node.
the pull strategy is suitable for implementation in a case where the consumer node is interested in partial data from certain producer nodes (there is a correlation between data sent from different producer nodes to the same consumer nodes). in contrast, the push traffic strategy involves sending data or notifications from the producer node periodically or when a particular event occurs on the producer node. the push strategy forces real-time performance of iot applications [40], although in some iot applications it is also possible to combine both communication strategies.

fig 1. a) pull b) push traffic strategy

since we observe real-time iot applications, the push traffic strategy is considered as the reference for our research, since it supports a higher number of parameters on the edge-device node's side. table 1 gives an overview of the parameters that are available from an application point of view for tailoring the device operation under the push communication strategy.

table 1 overview of software available parameters for push strategy

parameter                  description
sampling time (st)         defines how often data are generated on the producer node.
aggregation rate (ar)      defines the level of data reduction on the producer node.
transmission period (tp)   defines the period for sending data from the producer to the consumer node.

the performed analysis explores how much these three parameters impact energy consumption and overall real-time performance of iot applications. to quantify this impact, two metrics are introduced:
▪ energy consumption (e) – expressed in milliampere-hours.
▪ average data delivery time (adt) – the time interval that elapses from the moment of data generation to the moment of data processing at the destination node.

3.2. tools and procedures

the simulation framework used within this work enables the creation of arbitrary iot system topologies and analysis of various iot application performance parameters.
results obtained by simulation provide a detailed overview of data availability across the entire iot system at any point in time. by analyzing the obtained results, it is possible to quantify various iot application performance parameters such as iot system consumption, real-time performance, and scalability of the iot system architecture. paper [17] already describes how to exploit the developed simulation framework to quantify the scalability of the iot architecture. in this section, we describe in more detail the main aspects of the simulation framework that are important for understanding how to quantify the influence of a certain set of parameters on iot system energy consumption. the created simulation framework is available as an open-source solution [16] and it can be further developed and adapted to satisfy any requirements which are not supported by the current framework version. the simulation framework comprises a simulation core and a graphical user interface. the simulation core implements all functionalities related to the simulation of iot system behavior on different levels of the iot system architecture. these functionalities rely on models of the components that exist within an iot system in the general case: node model – which represents a device that generates data or consumes data, link model – which represents a connection between iot system components, and protocol – which encloses all information related to data created and consumed within the iot system. the current version of the simulation framework used within this analysis supports only the simplest models of iot system components that exist in the general case, such as processing
the models included in the current simulation framework version do not support modeling of packet dropouts, connection losses and packet retransmissions, which can be significant for overall iot application quality of service analysis. within each model, it is possible to configure a certain set of parameters. to simplify the configuration of model parameters, a graphical user interface has been developed. communication between the framework core and the graphical user interface is established through the model's configuration file. at the end of the simulation, different log files are created. by examining and analyzing the content of these files, it is possible to understand how iot systems behave. a general overview of the simulation framework is presented in figure 2.

fig. 2 simulation framework architecture

the node, link, and protocol models support different parameters. the current version of the simulator supports three node models: producer, gateway, and consumer. node consumption, processing time, adjacent nodes, aggregation level, compression rate, and transmission period are parameters that can be configured for each node model. additionally, in the case of the producer node model, it is possible to define the amount of data produced on the node as well as the data sampling rate. the link model supports the configuration of the following parameters: link speed, link consumption (transmit and receive), and link speed deviation. data are exchanged between nodes using a protocol model, where it is possible to define the protocol overhead and, optionally, to enable a handshaking mechanism. a more detailed description of the parameters is given in [17]. energy consumption calculation within the simulation framework is based on the current overall node consumption (conc), expressed in ma.
to calculate the charge consumed by a node for a specific action, conc is multiplied by the time required for executing that action on the node. the calculation algorithms print the cumulative sum of consumed charge over time (csc), expressed in ma multiplied by the time resolution r, to the node log file. node energy consumption is directly proportional to the csc value and can be calculated directly if the node supply voltage is known. based on this information, it is easy to profile nodes by energy consumption. the conc value is determined by the current node operating mode as well as the type of the links used to communicate with adjacent nodes. two operating modes are supported by each node model: active and low power mode. a node is in active mode when data are being processed on the node or are being transmitted from or received on the node. if there is no activity on the node, it is in low power mode. for each of these modes, within the node's model configuration file, it is possible to configure the current node consumption (cncm) by setting the parameter value related to the specific node mode m (cnca – current node consumption when the node is in active mode, cnclp – current node consumption when it is in low power mode). link models enable configuration of the current link consumption clcs during different states s, such as transmitting (clct) and receiving (clcr) data.
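the charge bookkeeping described above can be sketched in a few lines of python. this is only an illustration and not the framework's code: the function names, the activity trace and the 3.3 v supply voltage are assumptions made for the example; only the rule that charge equals conc multiplied by the action duration, accumulated over time, comes from the text.

```python
from itertools import accumulate

def cumulative_charge(segments):
    """Return the running sum of consumed charge (mA*s) after each segment.

    segments: iterable of (conc_ma, duration_s) pairs, i.e. the current drawn
    during an action and how long that action lasted.
    """
    return list(accumulate(conc_ma * duration_s for conc_ma, duration_s in segments))

def charge_to_mah(charge_mas):
    """Convert charge from mA*s to mAh."""
    return charge_mas / 3600.0

def energy_mj(charge_mas, supply_v):
    """Energy in millijoules for a constant supply voltage (mA*s * V = mJ)."""
    return charge_mas * supply_v

# hypothetical activity trace: 2 s active at 90 mA, then 3 s idle at 10 mA
csc = cumulative_charge([(90.0, 2.0), (10.0, 3.0)])
total = csc[-1]
# 3.3 V is an assumed supply voltage, used only to show the conversion
print(csc, charge_to_mah(total), energy_mj(total, supply_v=3.3))
```

keeping the charge in ma·s and converting only at the end mirrors the way csc is logged per time resolution step; the voltage enters only in the final conversion to energy.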
the following equation is used for conc calculation:

$$\mathrm{CONC} = \mathrm{CNC}_m + \mathrm{CLC}_s \qquad (1)$$

where cncm and clcs take values depending on the current actions on the node, as presented in table 2.

table 2 value of conc depends on action on the node
action on the node | conc =
low power mode | cnclp
processing received data | cnca
processing received data and receiving new data from another node | cnca + clcr
receiving data | cnclp + clcr
transmitting data | clct

it should be mentioned that improving the available simulation models so that they correspond to practical mcu-based device implementations is seen as a part of future work. the goal of this future work will be to extend the simulation model to accurately represent both device and communication power and performance behaviors.

to illustrate the working principle of the developed algorithms and the potential of the developed simulation framework for profiling node energy consumption, we examine the behavior of a simple node n which is connected to the rest of the iot system over link l. information relevant to this example is presented in tables 3 and 4.

table 3 node n parameters values
parameter name | value | unit
processing speed | 50 | [b/s]
data production rate | 15 | [s]
data size | 50 | [b]
cnclp | 10 | [ma]
cnca | 90 | [ma]

table 4 link l parameters values
parameter name | value | unit
link speed | 12.5 | [b/s]
clcr | 400 | [ma]
clct | 400 | [ma]

figure 3 shows part of the node log file obtained after the completed simulation. the shown part of the log file includes actions on the nodes inside the time interval [45s, 72s].

fig. 3 part of node's log file

the node log file is given in the form of a csv file where each value in a single row represents information about a node parameter value at a specific point in time.
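table 2 can be read as a lookup from the action on the node to conc. the sketch below encodes that table with the example values of node n and link l from tables 3 and 4; the action labels and the function name are hypothetical, and the framework's actual implementation may differ.

```python
# example values from tables 3 and 4 (all in mA)
CNC_LP = 10.0   # node current in low power mode
CNC_A = 90.0    # node current in active mode
CLC_R = 400.0   # link current while receiving
CLC_T = 400.0   # link current while transmitting

# table 2: action on the node -> CONC (mA); the keys are illustrative labels
CONC_BY_ACTION = {
    "low_power": CNC_LP,
    "processing": CNC_A,
    "processing_and_receiving": CNC_A + CLC_R,
    "receiving": CNC_LP + CLC_R,
    "transmitting": CLC_T,
}

def charge_for_action(action, duration_s):
    """Charge in mA*s consumed while the node performs `action` for duration_s."""
    return CONC_BY_ACTION[action] * duration_s

for action, conc in CONC_BY_ACTION.items():
    print(f"{action}: conc = {conc} mA, charge over 1 s = {charge_for_action(action, 1.0)} mA*s")
```

note that the transmitting entry uses clct alone, exactly as listed in table 2, rather than adding a node term.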
more information about specific values is given in [17], while in this analysis we focus only on the values important for energy analysis: the timestamp (1st value), conc (next to last value), and csc (last value). the obtained values are extracted and visualized in figure 4. the analysis shown in figure 4 illustrates the charge and the consumption of the selected iot node for the selected time interval [45-73s]. time intervals 1 and 4 include all node actions which occur in the node log file within intervals [45-47.4s] and [60-62.4s], respectively. these actions are mostly based on processing of created data. time intervals 2 and 5 represent actions on the node within [47.4-57s] and [62.4-72s], where the processed data are transmitted. after the node sends data, there are no more actions on the node, and the node goes to low-power mode. this node state is observed in time interval 3 within [57-60s], as found from the log file.

fig. 4 node charge consumption and conc values within time interval [45s-73s]

4. case study

this section gives a description of the experiment setup, including the iot system topology and the node and link configuration, and the simulation results illustrating the impact of the introduced application-level parameters on node operation and consumption. the parametric analysis and the discussion of the associated trade-off properties are also given.

in our analysis, data communication at the edge tier of the iot system is modeled as an interaction between the data producer node (iot edge device) and the corresponding consumer (destination) node located higher in the hierarchy of the rest of the iot system. data from edge devices are pushed toward the data consumer device through a link used to establish communication between the producer and consumer devices. this iot system is illustrated in figure 5, while the parameters used in simulation are presented in tables 5 and 6.

fig. 5 illustration of iot system used in our analysis

table 5 producer node parameters value
parameter name | value | unit
processing speed | 1 | [mb/s]
data size | 100 | [b]
data overhead | 70 | [b]
cnclp | 20 | [ma]
cnca | 110 | [ma]

table 6 link parameters value
parameter name | value | unit
link speed | 18 | [kb/s]
maximum transmission unit (mtu) | 1500 | [b]
clct | 410 | [ma]
clcr | 410 | [ma]

edge devices can be considered as simple mcu-based embedded systems which gather data by sensing their environment, perform simple data processing, like data aggregation, and provide physical connectivity with the rest of the iot system. from the software perspective, it is only possible to control parameters such as data sampling rate, aggregation rate, and transmission rate. the range of values of these three parameters is shown in table 7.

table 7 range of parameters values
parameter name | range | unit
data sampling time | 0.1-10 | [s]
aggregation rate | 1-100 |
transmission period | 1-10 | [s]

the analysis of the results obtained by varying these three parameters' values within the presented range is given in the next section.

5. results analysis

results obtained by simulation are presented in figure 6. results are normalized to the operating point with the coordinates q0(ar0, tp0, st0) = (10, 10s, 1s). the normalized results are adopted to illustrate the potential of adjusting the parameter values on observed system properties given on different scales. each parameter value at q0 is selected as a midpoint of the parameter range given in the logarithmic scale. furthermore, the parameter range is chosen to avoid boundary conditions of system operation where the sampling interval is comparable with the data processing time and/or communication latency. based on the obtained results, it is possible to quantify the impact of these parameters on reducing energy consumption, but also on reducing the average time of data availability on consumer nodes.
normalized energy consumption (e) and optimization cost function (o) values are presented on the left axis of each graph, while normalized average data delivery time (adt) values are presented on the right axis. from figure 6-i it is noticeable that increasing the aggregation rate within the first half of the observed range [0.1-1] reduces the data payload size, which leads to a reduction of the total energy consumption (~0.26). within the second half of the observed range [1-10], increasing the aggregation rate contributes to a lesser extent to a further reduction of energy consumption (~constant), because the payload size becomes negligible compared to the protocol header size. from the same graph, it can also be noticed that the increase in aggregation rate does not cause a significant change in data delivery time (~0.03). this is expected, because changing the aggregation rate does not change the outcome in terms of the data availability time; it changes only the form of the exchanged data, since the original data are embedded within the aggregated data format.

the change in the transmission period has a significantly greater impact on the reduction of energy consumption than the aggregation rate parameter, because of the reduced activity of the iot communication subsystem. it can be seen in figure 6-ii that, as the transmission period increases, energy consumption decreases almost linearly along the entire observed range. on the other hand, there is a proportional degradation of data availability time and the corresponding real-time performance.

the effect of the sampling rate parameter is shown in figure 6-iii. by controlling this parameter, we can achieve certain energy savings in the first half of the observed range [0.1 – 1], as in the case of the aggregation rate parameter.
however, in contrast to the other two parameters, in the second half of the observed range [1 – 10] it is possible to achieve significantly better characteristics in the domain of data availability at the consumer node side.

to quantify the trade-off that can be achieved by tuning certain parameters, we introduce the optimization cost function defined as:

$$O = k \cdot E + q \cdot ADT \qquad (2)$$

where the weights k and q take values within the range [0, 1] and are related by the equality k = 1 – q. the purpose of this cost function is to establish the relation between the power consumption and performance domains in order to find the optimal operating point for the iot edge device. the cost function provides the background for tuning a certain parameter within the iot device at the edge tier to optimize power consumption and/or overall iot system real-time performance. operating at the best performance, without concern for consumption, means an operating point with maximal sampling rate and communication rate without data aggregation. thus, optimizing only performance implies that the value of q is set to 1 while k equals 0. if both k and q are higher than zero, then we can talk about a trade-off in the power-performance domain. the analysis conducted in this paper considers both requirements equally important, and consequently both parameters' values are set to 0.5.

fig. 6 influence of aggregation rate (i), transmission period (ii) and sampling time (iii) on energy consumption (left scale – blue) and average data delivery time (right scale – orange) vs trade-off optimization norm (left scale – black)

by analyzing the cost functions presented in figure 6, it is possible to find the optimal operating point by tuning only a single parameter.
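equation (2) is straightforward to evaluate numerically. the sketch below computes o for a few normalized (e, adt) pairs and selects the pair with the lowest cost; the sample values are made up for illustration and are not results read from figure 6, and the function names are hypothetical.

```python
def cost(e, adt, q=0.5):
    """Optimization cost O = k*E + q*ADT with k = 1 - q, for q in [0, 1]."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    k = 1.0 - q
    return k * e + q * adt

def best_operating_point(points, q=0.5):
    """Return the (label, e, adt) tuple with the smallest cost."""
    return min(points, key=lambda p: cost(p[1], p[2], q))

# made-up normalized samples: (label, E, ADT); 1.0 corresponds to the reference point
samples = [
    ("a", 1.50, 0.50),
    ("b", 1.00, 1.00),
    ("c", 0.75, 1.00),
    ("d", 0.50, 2.00),
]

print(best_operating_point(samples))          # equal weights, k = q = 0.5
print(best_operating_point(samples, q=1.0))   # performance only
print(best_operating_point(samples, q=0.0))   # energy only
```

changing q shifts the selected point between the lowest-adt and the lowest-energy sample, which is the trade-off the weights are meant to express.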
following the shape of the cost function o presented in figure 6-i, decreasing the value of ar below ar0 results in a significant increase in the optimization cost function's value. alternatively, increasing the value of ar above ar0 has a minor effect on the cost function's value. it is evident that the optimal ar value is located at the end of the observed range. as visible from figure 6-ii, varying the value of the transmission period (tp) parameter away from tp0 degrades the value of o, since the optimal value of the tp parameter is found around tp0. on the other hand, as observable from figure 6-iii, the optimal st value is located left of st0, where the cost function has its minimum value.

the optimal operating point in the 3d space of system parameters is found from the criterion of minimizing the cost function. if both relationships for quantifying performance and power consumption depend on the operating point parameters according to linear equations in opposite directions, then it is expected that the optimal parameters are found in the middle between the boundary values. as the dependences are not linear, as is evident from figure 6, a more complex relationship between the optimization criterion and the system parameters is expected.

6. conclusion

the energy requirements and the performance in the operation of the iot edge device are analyzed through the investigation of the typical data producer-consumer relationship. as the more generalized option, the iot edge device was considered a typical data producer which operates under a push-based communication model. iot edge device operation, under the influence of the identified set of parameters, was investigated utilizing the custom-built simulation environment.
the simulation results have shown that the control of parameters such as sampling rate, aggregation rate, and transmission period at the data producer side can lead to more optimal behavior of iot systems in the power-performance domain, where the optimization criteria can be tuned to fulfill the particular application requirements. simulation results confirmed the trade-off potential, where adjusting parameters often has opposite effects on the power requirement of the iot edge device node and the resulting real-time performance of the iot application. this trade-off potential was quantified by the introduced cost function, which defines the relationship between the power and performance domains in linear form. by introducing the cost function, it has been shown that it is possible to find the optimal operating point where iot system real-time performance and edge device energy consumption are optimized in the case where power consumption and performance are equally important. the exact position of the optimal operating point in the 3d space of system parameters is complex to estimate without comprehensive parametric analysis, because of the complex relationship between system parameters and system power consumption and performance. in general, lowering energy consumption, at the same time compromising real-time performance, presumes less frequent sampling with a higher aggregation rate and a lower communication rate. the utilization of this approach can result in the development of an algorithm that would control the introduced parameters to achieve an optimal compromise and enable the design of giot applications. the design and the implementation of an algorithm that controls the introduced set of parameters to achieve optimal operation of the edge devices, in the same way enabling the deployment of giot applications, is seen as a part of future work.

acknowledgment: this work has been supported by the ministry of education, science and technological development of the republic of serbia.

references

[1] h. turkmanović, i. popović, d. drajić and z. čiča, "launching real-time iot applications on energy-aware embedded platforms", in proceedings of the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks), pp. 279-282, 2021. [2] r. lu, x. li, x. liang, x. shen and x. lin, "grs: the green, reliability, and security of emerging machine to machine communications", ieee commun. mag., vol. 49, no. 4, pp. 28-35, april 2011. [3] cisco, "cisco annual internet report 2018-2023", march 2020. [4] a. s. h. abdul-qawy, n. m. s. almurisi and s. tadisetty, "classification of energy saving techniques for iot-based heterogeneous wireless nodes", procedia comput. sci., vol. 171, pp. 2590-2599, 2020. [5] x. liu and n. ansari, "toward green iot: energy solutions and key challenges", ieee commun. mag., vol. 57, no. 3, pp. 104-110, march 2019. [6] p. k. verma, r. verma, a. prakash, a. agrawal, k. naik, r. tripathi, m. alsabaan, t. khalifa, t. abdelkader and a. abogharaf, "machine-to-machine (m2m) communications: a survey", j. netw. comput. appl., vol. 66, pp. 83-105, 2016. [7] t. xu, j. b. wendt and m. potkonjak, "security of iot systems: design challenges and opportunities", in proceedings of the ieee/acm international conference on computer-aided design (iccad), pp. 417-423, 2014. [8] a. damian and l. kung-kiu, "evaluating iot service composition mechanisms for the scalability of iot systems", future gener. comput. syst., vol. 108, pp. 827-848, 2020. [9] b. diène, j. j. p. c. rodrigues, o. diallo, e. h. m. ndoye and v. v. korotaev, "data management techniques for internet of things", mech. syst. signal process., vol. 138, april 2020. [10] c. c. sobin, "a survey on architecture, protocols, and challenges in iot", wirel. pers. commun., vol. 112, pp. 1383-1429, 2020. [11] n.
kimura and s. latifi, "a survey on data compression in wireless sensor networks," in proceedings of the international conference on information technology: coding and computing (itcc'05) volume ii, vol. 2, april 2005, pp. 8-13. [12] a. ali, g. a. shah and j. arshad, "energy-efficient techniques for m2m communication: a survey", j. netw. comput. appl., vol. 68, pp. 42-55, june 2016. [13] a. azari and g. miao, "energy-efficient mac for cellular-based m2m communications", in proceedings of the ieee global conference on signal and information processing (globalsip), december 2014, pp. 128-132. [14] m. muniswamaiah, t. agerwala and c. c. tappert, "green computing for internet of things", in proceedings of the 7th ieee international conference on cyber security and cloud computing (cscloud)/2020 6th ieee international conference on edge computing and scalable cloud (edgecom), 2020, pp. 182-185. [15] h. turkmanović and i. popović, "a systematic approach for designing battery management system for embedded applications", in proceedings of the zooming innovation in consumer technologies conference (zinc), may 2021, pp. 85-90. [16] h. turkmanovic, https://github.com/turkmanovic/lsnsimulator.git, github/turkmanovic, lsnsimulator. [17] h. turkmanović, i. popović, z. čiča and d. drajić, "simulation framework for performance analysis in multi-tier iot systems", in proceedings of the 29th telecommunications forum (telfor), 2021, pp. 1-4, [18] u. b. k. ramesh, s. sentilles and i. crnkovic, "energy management in embedded systems: towards a taxonomy", in proceedings of the first international workshop on green and sustainable software (greens), 2012, , pp. 41-44. [19] r. arshad, s. zahoor, m. a. shah, a. wahid and h. yu, "green iot: an investigation on energy saving practices for 2020 and beyond", ieee access, vol. 5, pp. 15667-15681, 2017. [20] a. haider, t. umair, h. james, z. xiaojun, l. liu, z. yongjun, b. faycal, a. abbes, f. kaniz, a. 
niko, "a survey on system level energy optimisation for mpsocs in iot and consumer electronics", comput. sci. rev., vol. 41, p. 100416, aug. 2021. [21] g. anastasi, m. conti, m. francesco and a. passarella, "energy conservation in wireless sensor networks: a survey", ad hoc netw., vol. 7, no. 3, pp. 537-568, may 2009. [22] r. soua and p. minet, "a survey on energy efficient techniques in wireless sensor networks", in proceedings of the 4th joint ifip wireless and mobile networking conference, october 2011, pp. 1-9. [23] t. srisooksai, k. keamarungsi, p. lamsrichan and k. araki, "practical data compression in wireless sensor networks: a survey", j. netw. comput. appl., vol. 35, no. 1, pp. 37-59, january 2012. [24] d. parker, m. stojanovic and c. yu, "exploiting temporal and spatial correlation in wireless sensor networks", in proceedings of the asilomar conference on signals, systems and computers, november 2013, pp. 442-446. [25] y. zhou, l. yang, l. yang and m. ni, "novel energy-efficient data gathering scheme exploiting spatial-temporal correlation for wireless sensor networks", wirel. commun. mobile comput., vol. 2019, p. 4182563, 2019. [26] s. randhawa and s. jain, "data aggregation in wireless sensor networks: previous research, current status, and future directions", wireless pers commun., vol. 97, pp. 3355-3425, july 2017. [27] s.-y. tsai, s.-i. sou and m.-h. tsai, "reducing energy consumption by data aggregation in m2m networks", wireless pers commun., vol. 74, pp. 1231-1244, jan. 2014. [28] i. solis and k. obraczka, "the impact of timing in data aggregation for sensor networks", in proceedings of the ieee international conference on communications (ieee cat. no. 04ch37577), vol. 6, 2004, pp. 3640-3645. [29] t. sheltami, m. musaddiq and e. shakshuki, "data compression techniques in wireless sensor networks", future gener. comput. syst., vol. 64, pp. 151-162, nov.
2016. [30] i. solis and k. obraczka, "the impact of timing in data aggregation for sensor networks", in proceedings of the ieee international conference on communications (ieee cat. no. 04ch37577), vol. 6, 2004, pp. 3640-3645. [31] w. kim and i. jung, "smart sensing period for efficient energy consumption in iot network", sensors, vol. 19, no. 22, p. 4915, nov. 2019. [32] c. alippi, g. anastasi, c. galperti, f. mancini and m. roveri, "adaptive sampling for energy conservation in wireless sensor networks for snow monitoring applications", in proceedings of the ieee international conference on mobile adhoc and sensor systems, october 2007, pp. 1-6. [33] m. hempstead, m. j. lyons, d. brooks and g.y. wei, "survey of hardware systems for wireless sensor networks", j. low power electronics, vol. 4, pp. 1-10, april 2008. [34] w. h. cheng, w. isaac, y. cheng-wen, a. alagan and o. mohammad, "energy-efficient tasks scheduling algorithm for real-time multiprocessor embedded systems", j. supercomput., vol. 62, pp. 967-988, nov. 2012. [35] s. li and j. huang, "energy efficient resource management and task scheduling for iot services in edge computing paradigm", in proceedings of the ieee international symposium on parallel and distributed processing with applications and ieee international conference on ubiquitous computing and communications (ispa/iucc), december 2017, pp. 846-851. [36] c. h. lin, j. c. liu and c. w. liao, "energy analysis of multimedia video decoding on mobile handheld devices", comput. stand. interfaces, vol. 32, no. 1-2, pp. 10-17, jan. 2010. [37] s. a. alvi, g. a. shah, w. mahmood, "energy efficient green routing protocol for internet of multimedia things", in proceedings of the ieee tenth international conference on intelligent sensors, sensor networks and information processing (issnip), may 2015, pp. 1-6. [38] s. tanzila, h. khalid, a. imran and r. amjad, "secure and energy-efficient framework using internet of medical things for e-healthcare", j. 
infect. public health, vol. 13, no. 10, pp. 1567-1575, july 2020. [39] a. lindgren, f. b. abdesslem, b. ahlgren, o. schelén and a. m. malik, "design choices for the iot in information-centric networks", in proceedings of the 13th ieee annual consumer communications and networking conference (ccnc), january 2016, pp. 882-888. [40] r. c. sofia and p. m. mendes, "an overview on push-based communication models for information-centric networking", future internet, vol. 11, no. 3, p. 74, march 2019.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 159-170
https://doi.org/10.2298/fuee2302159s
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

investigation of dye-sensitized solar cell performance based on vertically aligned tio2 nanowire photoanode *

biraj shougaijam, salam surjit singh
department of electronics and communication engineering, manipur technical university, takyelpat-795004, manipur, india

abstract. in this work, we present our results related to the development of dye-sensitized solar cells (dsscs) based on vertically aligned tio2-nanowire (nw) and ag nanoparticle (np) assisted vertically aligned tio2-nw (tat) photoanodes fabricated by the glancing angle deposition (glad) technique on fluorine-doped tin oxide (fto) substrates. the scanning electron microscopy (sem) analysis reveals that the ag-np assisted vertically aligned tio2-nw photoanode was successfully deposited on fto substrates. the average length and diameter of the nw have been measured to be ~ 350 nm and ~ 90-100 nm, respectively. moreover, transmission electron microscopy (tem) and x-ray diffraction (xrd) manifest the presence of small crystals of tio2 and ag. further, the absorption spectrum analysis reveals that the incorporation of ag-np in tio2-nw increases absorption in the visible region, but the efficiency of the cell decreases after the incorporation of the nanoparticle.
the calculated bandgap of the annealed ag-np (30 nm) assisted tio2-nw (tat@30nm) sample from the photoluminescence (pl) graph is ~ 3.12 ev. finally, it is observed that the tio2-nw based dssc device shows better performance in terms of photo conversion efficiency (pce) compared to the tat@30nm photoanode based device, with an efficiency of ~ 0.61 % for the former and ~ 0.24 % for the latter. this reduction in the efficiency of tat@30nm based devices is due to the larger size of the ag-np, in which the nanoparticle acts as an electron sink and as a blocking layer.

key words: dsscs, e-beam, nanowire, nanoparticle, tio2

received july 09, 2022; revised august 13, 2022; accepted september 05, 2022
corresponding author: biraj shougaijam
department of electronics and communication engineering, manipur technical university, takyelpat-795004, manipur, india
e-mail: biraj.sh89@gmail.com
* an earlier version of this paper was presented at the international conference on micro/nanoelectronics devices, circuits and systems (mndcs 2022), january 29-31, 2022, national institute of technology silchar, india [1]

1. introduction

the number of natural disasters is rising due to the rapid climate change of the last few decades, directly or indirectly caused by carbon emissions from fossil fuels and the destruction of forest areas. therefore, it is important to develop sustainable energy conversion devices to meet the increasing energy demand. fossil fuels are one of the main sources of energy, but they will eventually run out, and their use affects the environment and the surrounding ecosystem. as a result, scientists are continually working to improve renewable energy sources like hydroelectricity, solar energy and wind energy. since a significant amount of sunlight reaches the earth's surface, solar cells, among other energy sources, play a significant part in the generation of electrical energy.
considerable design and development work has been done on photovoltaic technology to maximize the conversion efficiency of sunlight. the most commonly used solar cell materials are silicon, perovskites, graphene, iii-v nitrides and organic dyes [1-6]. among these, dye-sensitized solar cells (dsscs) became attractive after o'regan and grätzel reported the outstanding properties of dsscs, like the multicolor option, easy integration into building architecture, ease of fabrication, low cost and affordability [7]. similar to how plant chlorophyll performs photosynthesis, this solar cell's operation relies on a photo-electrochemical reaction, in which the dye molecule acts as a molecular electron pump by trapping the light. when light falls on the surface of the cell, the excited dye molecule is oxidized and transfers its excited electrons into the conduction band of a wide bandgap semiconductor, such as nanostructured tio2. the excited electron in the tio2 nanostructure is then transported by diffusion and reaches the counter electrode through the external circuit. the oxidized dye molecule inside the cell is regenerated by iodide present in the redox electrolyte medium, and the iodide is in turn regenerated by reduction of triiodide at the counter electrode [8]. the four different modules that make up dsscs are the photoanode, dye, electrolyte, and counter electrode. among these, the photoanode is crucial to the process of photon conversion, and the dye sensitizer influences how well the dsscs work. the sensitizer/dye should possess broad and strong absorption from the visible to the near-infrared region. ruthenium compounds are the most commonly used efficient and stable dyes. even though these dyes have some disadvantages compared to eosin-y and porphyrin, they have excellent electron injection, higher absorption in the visible range, good stability and efficient charge transfer, thereby giving the highest efficiency [9, 10].
moreover, tio2 has a high bandgap, resistance to photo corrosion, and nontoxicity compared to other metal oxides like zno, sno2, cu2o, wo3 and in2o3 [11-12]. naturally, tio2 crystal, which belongs to the transition metal oxides, can assume any of three forms, i.e., anatase, rutile or brookite [13-14]. anatase tio2 is mainly employed to create photoanodes for dsscs because of its higher charge transport and stability. the power conversion efficiency (pce) of the dsscs is significantly impacted by the form and size of the tio2 nanoparticle. tio2 is a chemically inert substance because it has a bandgap of ~ 3.2 ev or less and does not induce chemical reactions in the absence of light. due to scaling laws, the chemical and physical properties of nanomaterials change as their geometrical dimension decreases [15]. so, there has been recent progress in the synthesis of tio2 nanomaterials like nanorods (nrs), nanowires (nws) and nanotubes (nts), which possess different properties because of the different synthesis techniques, unique nanostructure, and high surface-to-volume ratio that enhance the delocalization of charge carriers, thereby increasing the charge transportation. this 1d nanostructure can be used in designing photoanodes for the dssc application, which can enhance the efficiency through rapid electron transport. nanomaterial synthesis can be done in different ways, like the sol-gel method, the hydrothermal method, chemical vapour deposition (cvd) and physical vapour deposition (pvd) techniques, etc. [16-17]. furthermore, tio2-nw enhances energy-sensing performance because of more reaction sites due to the high surface area and larger extension of the depletion region.
in addition, tio2-nw has confined conductive channels, which can reduce charge recombination and hence enhance charge transport compared to bulk structures. yang et al. reported porous tio2 thin films (tf) deposited by glancing angle deposition (glad) using e-beam evaporation. it was also reported that this tio2 film has a large internal surface area, which enhances the dye absorption of the dsscs [18]. wong et al. prepared tio2 photoanodes using e-beam evaporation and achieved an efficiency of 6.1% at a glad angle of 73°, which improves the light trapping of the tio2 photoanode owing to its columnar structure [19]. the highest reported pce of dsscs is ~ 14.2%, obtained with a chemical process on screen-printed tio2 film [20]. even though the efficiency of dsscs is much lower than that of si solar cells, they perform remarkably well under low light intensity, which makes them usable under indoor lighting. in this work, the glad method was used to grow vertically aligned tio2-nw as photoanodes on fluorine-doped tin oxide (fto) for dssc application without using any catalyst. glad can be used to precisely control the shape, size and thickness of the nanostructure [21]. the vertical tio2 nanostructure obtained by glad enhances the efficiency by shortening the electron pathway through the vertical tio2-nw. further, an attempt has been made to place ag metal nanoparticles (ag-np) in the middle of the tio2-nw to enhance the photon absorption through surface plasmon resonance (spr). the spr effect mainly depends on the type of metal used and on the shape and size of the metal nanoparticle. both ag and au exhibit a strong spr effect in the visible region; however, ag costs less than au. in addition, ag-np is highly stable, withstands corrosion and is less prone to oxidation [22].
it is noteworthy that ag nanoparticles can be used in various applications such as supercapacitors, biosensors and other optoelectronic devices [23-25]. therefore, an effort is made to develop dsscs based on tio2-nw and ag-np embedded vertical tio2-nw photoanodes deposited by the glad method on fto substrates. the samples were analyzed using scanning electron microscopy (sem) and transmission electron microscopy (tem) for morphology, and x-ray diffraction (xrd) (rigaku ultima iv, cu kα radiation, λ = 0.1540 nm) for structural analysis. finally, the performance of two types of dsscs, i.e., tio2-nw and ag-np assisted tio2-nw photoanode based devices, is analyzed.

2. experimental details

2.1. materials

both the tio2 and ag (both 99.999% pure) were procured from tecnisco advanced materials pte ltd, singapore. n719 ruthenium dye sensitizer (95% pure) was purchased from srl pvt. ltd, fto/glass (12-14 ohm/cm2) from mti corporation, usa, and iodolyte hi-30 electrolyte from solaronix, switzerland. for the deposition of tio2 and ag, the material is loaded into the crucible and placed in the evaporation chamber. before creating the vacuum, the chamber was cleaned properly with acetone. the vacuum is then created inside the chamber, and the samples are inclined at 81° during the nw and np deposition.

2.2. photoanode preparation

fto glass substrates having a resistivity of ~ 12-14 ohm/cm2 were cleaned sequentially by rinsing them in deionized (di) water (oxford lab fine chem llp (cas no. 7732-18-5)) for 1 minute each and drying them in the open air for 5 minutes before placing them inside the chamber. the vertically oriented tio2-nw and ag-np assisted vertically aligned tio2-nw (tat) are deposited on fto coated glass substrates by glad using an e-beam evaporator (model no. smart coat 3.0, hhv india).
this glad mechanism, which is installed inside the chamber, allows the deposition angle to be changed by moving its axis. the details of the fabrication process of the tio2-nw and tat samples were discussed in our previous work [21]; here, the process is explained in brief. the samples are kept at an inclined angle of 81° and rotated at 30 rpm to form a vertically aligned tio2 nanostructure. in the first round of deposition, tio2-nw (350 nm) samples were deposited on an fto-coated glass (1 cm x 1 cm) substrate by employing the glad method. for another group of samples, tio2-nw (175 nm) was initially deposited on glass (1 cm x 1 cm). then, ag-np (30 nm) was deposited above the tio2-nw (175 nm). finally, tio2-nw (175 nm) was again deposited above the ag-np (30 nm)/tio2-nw (175 nm), giving a stack of tio2-nw (175 nm)/ag-np (30 nm)/tio2-nw (175 nm) (tat@30nm). for every tio2 (30 nm) deposition, deposition was done for 12 minutes at a rate of ~ 0.6 å/sec, and the ag (30 nm) deposition rate was kept constant at 0.8 å/sec for 7 minutes. the deposition rate and thickness of the deposited film were monitored through a digital thickness monitoring system throughout the deposition processes to control the film thickness precisely. similarly, vertically aligned tio2-nw with ag 60 nm (tat@60nm) and ag 90 nm (tat@90nm) samples are prepared using the same process and parameters. all these processes are performed under high vacuum conditions of ~ 2 x 10^-5 mbar. the chamber pressure was maintained at ~ 6 x 10^-6 mbar before the start of deposition; during the deposition, the pressure rises to ~ 2 x 10^-5 mbar. the photoanode samples for dssc fabrication are then annealed at 500 °c for 3 hours. the samples were cooled down slowly and processed for dye loading.

2.3. dye preparation and counter electrode preparation

6 mg of n719 (95%, srl pvt. ltd.)
dye salt powder is mixed with ethanol using a vortex mixer (eppendorf mixmate) at 200 rpm for 20 minutes to make 10 ml of n719 dye solution with a concentration of 0.5 mM. the resulting dye solution is kept for 1 day for stabilization, as shown in fig. 1 (inset). fig. 1 shows the absorption spectrum of the n719 sample measured using a uv-vis spectrophotometer (an-uv-6500n antech), which reveals four bands, at ~ 504 nm, ~ 376 nm and ~ 308 nm together with a shoulder peak at ~ 252 nm. the two peaks in the visible band are attributed to metal-to-ligand charge transfer (mlct).

fig. 1 optical absorption spectrum of the n719 dye sample (absorption (a.u.) versus wavelength (nm), with bands marked at 252 nm, 308 nm, 376 nm and 504 nm)

the tio2-nw and tat coated fto glass samples are immersed in the n719 dye for 24 hours at room temperature in a dark room. to remove the excess dye, the tio2 photoanode sample is washed gently with ethanol after taking it out of the dye solution and dried for 3 minutes in the open air. for the counter electrode (ce), platisol t/sp paste from solaronix was coated on the fto substrate using the doctor-blade technique. here, 3m scotch tape covers all four edges of the sample, leaving a 2 x 2 cm2 space at the centre of the fto glass. this sample is placed in the furnace for annealing at 450 °c for 1 hour, which activates the pt particles.

2.4. fabrication of dsscs

the dye-sensitized tio2 photoanode and pt-coated counter electrode were preheated at 100 °c before being sandwiched together. this pretreatment removes the moisture present on the surface of the photoanode and counter electrode. the pt activated counter electrode is then sandwiched and sealed with the tio2 photoanode by using a paper clip to complete the dssc module.
lastly, the electrolyte solution was introduced in between the electrodes by capillary action.

3. results and discussion

3.1. sem analysis

the morphology of the as-deposited tat@30nm sample was analyzed using sem, as shown in fig. 2(a), which shows the successful deposition of tat@30nm nanowires. the magnified sem image of the tat@30nm sample is shown in fig. 2(b). the larger-diameter nanowires, indicated by the dotted circle in fig. 2(b), are built by cluster formation through shadowing effects during the deposition [26]. the average top diameter of the tat@30nm nanowires was calculated from the magnified sem image and found to be ~ 72 nm, as shown in fig. 2(c). fig. 2(d) shows the cross-sectional image of tat@30nm. this image confirms that vertical tat@30nm is successfully grown on the fto substrate by the glad technique. it also reveals the presence of ag-nps, which are indicated by blue dotted circles in the middle of the nws. the height of the tat nanowire is ~ 337 nm.

fig. 2 (a) sem image of the as-deposited tat@30nm, (b) magnified image showing the porous nature of the sample, (c) histogram of the top diameter (count versus diameter (nm), average ~ 72.07 nm) and (d) cross-sectional image of the tat@30nm sample (height ~ 337 nm, ag-np marked)

these vertical nanostructures enhance the efficiency of the dssc by increasing the surface area of the active layer compared to a thin film [27]. moreover, the vertical nanostructures are beneficial for dsscs because their antireflection properties trap more light. therefore, this method can be employed for developing high surface area photoanode nanostructures for dssc and other optoelectronic applications.

3.2.
tem analysis

for tem analysis, the tio2 nanostructure layer deposited on the glass substrate was scraped off using a doctor blade, the scraped-off powder was dispersed in acetone in a vial, and the sample was sonicated for a few minutes for good dispersion. finally, a drop of the sonicated solution was placed onto the tem grid. the tem analysis of the tio2 and tat samples is shown in fig. 3(a) and (b). the tio2-nws are successfully grown using the glad technique, as shown in fig. 3(a). the typical length measured from the nanowire image is ~ 259 nm, and the arrow mark indicates the growth direction. further, the tem image of the tat sample shows the presence of ag-np at the middle of the tio2-nw. the hr-tem images in insets (1) and (2) of fig. 3(b) show the presence of tio2 crystal and ag crystal, respectively. the lattice spacing measured from inset (1) is ~ 0.35 nm, which corresponds to the (101) crystal plane of anatase tio2 (jcpds no. 75-1753). from inset (2), the measured lattice spacing is ~ 0.24 nm, corresponding to the (111) crystal plane of ag (jcpds no. 04-0783).

fig. 3 (a) tem image of the as-deposited tio2-nw (length ~ 259 nm) and (b) tem image of annealed tat@30nm; the insets show the magnified hr-tem images of (1) anatase (101), ~ 0.35 nm, and (2) ag (111), ~ 0.24 nm

3.2.1 xrd analysis

the as-deposited tio2-nw and ag-np (30 nm, 60 nm and 90 nm) assisted vertically aligned tio2-nw samples are analyzed by x-ray diffraction (xrd). fig. 4(a) shows the xrd results of the as-deposited tio2-nw, tat@30nm, tat@60nm and tat@90nm samples. the weak peaks observed at 2θ = 25.76°, 37.58°, 48.4° and 63.4° are attributed to tio2 crystals with the corresponding orientations of (101), (103), (200) and (204), respectively (jcpds no. 75-1753).
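the hr-tem spacings and the xrd peak positions quoted above can be cross-checked with bragg's law, n·λ = 2·d·sin θ. the sketch below is only an illustration: the function names are hypothetical, first-order diffraction (n = 1) is assumed, and the wavelength is the cu kα value quoted for the diffractometer, taken here to be in nanometres.

```python
import math

CU_K_ALPHA_NM = 0.1540  # x-ray wavelength quoted in the text, assumed to be in nm


def two_theta_deg(d_nm, wavelength_nm=CU_K_ALPHA_NM, order=1):
    """Diffraction angle 2θ in degrees from Bragg's law n·λ = 2·d·sin θ."""
    s = order * wavelength_nm / (2.0 * d_nm)
    if not 0.0 < s <= 1.0:
        raise ValueError("no diffraction for this spacing and wavelength")
    return 2.0 * math.degrees(math.asin(s))


def d_spacing_nm(two_theta, wavelength_nm=CU_K_ALPHA_NM, order=1):
    """Lattice spacing d in nm from a measured 2θ in degrees."""
    theta = math.radians(two_theta / 2.0)
    return order * wavelength_nm / (2.0 * math.sin(theta))


# spacings read from the hr-tem insets
anatase_101 = two_theta_deg(0.35)   # roughly 25.4 degrees
ag_111 = two_theta_deg(0.24)        # roughly 37.4 degrees

# spacing implied by the reported (101) xrd peak
d_from_xrd = d_spacing_nm(25.76)    # roughly 0.345 nm

print(f"anatase (101): 2θ ≈ {anatase_101:.2f}°")
print(f"ag (111):      2θ ≈ {ag_111:.2f}°")
print(f"d from 25.76°: {d_from_xrd:.3f} nm")
```

with the rounded spacings of ~ 0.35 nm and ~ 0.24 nm, the computed angles fall within about one degree of the reported anatase (101) and ag (111) peaks, which is the level of agreement one would expect from two-digit hr-tem readings.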
the weak peaks may correspond to the small size of the tio2 crystal grains. again, the peaks at 38.27°, 34.72° and 77.33° are related to the (111), (220) and (310) planes of ag crystals (jcpds no. 04-0783).

fig. 4 (a) xrd results of tio2-nw, tat@30nm, tat@60nm and tat@90nm samples deposited at room temperature and (b) xrd results for as-deposited tio2-nw and annealed tio2-nw (intensity (a.u.) versus 2θ (degree); peaks labelled a, r and ag)

the annealing of tio2-nw improves the crystalline structure of tio2, as shown in fig. 4(b). titanium oxide layers obliquely deposited for dsscs by reactive e-beam deposition show the same weak peak; there, the as-deposited tio2 film was annealed for 3 h at 500 °c to produce crystalline tio2 [28].

3.3. uv-vis spectroscopy and photoluminescence spectroscopy

the optical properties of tio2-nw and tat specimens fabricated on an fto substrate were analyzed in the wavelength range of 340 nm to 800 nm using a uv-vis spectrophotometer. the recorded absorption of the samples is shown in fig. 5(a). the absorption spectrum of tio2-nw shows a higher absorption peak in the ultraviolet range. this peak may be attributed to electron excitation from the valence band (vb) to the conduction band (cb) of tio2 [29].
moreover, the absorption of the tio2-nw is significantly enhanced in the visible region after the incorporation of ag-np of different sizes, i.e., 30 nm, 60 nm and 90 nm. this significant improvement at around 400 to 600 nm in the absorption spectrum may be due to the spr effect of ag-np [30]. moreover, the band gaps of tio2-nw and tat@30nm calculated from the tauc plot are ~ 3.38 ev and ~ 3.27 ev, respectively.

fig. 5 (a) optical absorption spectra of as-deposited tio2-nw, tat@30nm, tat@60nm and tat@90nm specimens fabricated on the fto substrate, (b) band gap of as-deposited tio2-nw and tat@30nm from the tauc plot of (αhν)2 versus energy hν (ev), (c) pl spectra of as-deposited and annealed tat@30nm samples and (d) gaussian-fitted pl graph of the annealed tat@30nm sample

further, room temperature photoluminescence (pl) analysis of the ag-np assisted tio2-nw sample was done at an excitation wavelength of 340 nm by using a 370 nm stopband filter. the broad pl spectra of the as-deposited and annealed tat@30nm samples are plotted in fig. 5(c). a broad emission peak at ~ 397 nm was observed from as-deposited and annealed tat@30nm samples.
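to compare the tauc-plot band gaps with the pl emission on a common scale, photon energy and wavelength can be converted with e = h·c/λ. the snippet below is a small sketch under stated assumptions: the helper names are hypothetical and h·c is rounded to 1239.84 ev·nm.

```python
HC_EV_NM = 1239.84  # h·c in eV·nm (rounded constant, an assumption of this sketch)


def energy_to_wavelength_nm(energy_ev):
    """Photon wavelength in nm for a given photon energy in eV."""
    return HC_EV_NM / energy_ev


def wavelength_to_energy_ev(wavelength_nm):
    """Photon energy in eV for a given wavelength in nm."""
    return HC_EV_NM / wavelength_nm


gap_edge_tio2_nw = energy_to_wavelength_nm(3.38)   # about 367 nm
gap_edge_tat_30 = energy_to_wavelength_nm(3.27)    # about 379 nm
pl_peak_energy = wavelength_to_energy_ev(397.0)    # about 3.12 eV

print(f"tio2-nw edge:  {gap_edge_tio2_nw:.1f} nm")
print(f"tat@30nm edge: {gap_edge_tat_30:.1f} nm")
print(f"pl peak:       {pl_peak_energy:.2f} ev")
```

on this scale both absorption edges lie in the ultraviolet, while the ~ 397 nm pl peak corresponds to a photon energy somewhat below either band gap.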
it was also observed that the pl intensity of the annealed tat@30nm specimen increased compared to the as-deposited sample, which may be due to the increase in the crystallinity of tio2 through the reduction of oxygen vacancies. further, the gaussian-fitted curve shows peaks at ~ 385 nm, ~ 448 nm and ~ 519 nm, which correspond to the band-to-band transition of tio2 and to oxygen defect states within the band gap, as shown in fig. 5(d).

3.4. device characterization

the electrical performance of the fabricated dsscs is characterized at room temperature by using a source meter (keithley 2450) connected to a computer, and the photocurrent measurement was taken under light illumination at 100 mw/cm2 provided by a solar simulator (ss150, scientech, canada). fig. 6 shows the dye absorbed photoanode, counter electrode and dssc device. the schematic of the dssc device based on the tat photoanode is shown in fig. 6(d). fig. 7 shows the j-v graph of dsscs based on tio2-nw and tat@30nm photoanodes, and table 1 shows the cell performance of the dssc devices with the corresponding photovoltaic parameters.

fig. 6 (a) fabricated counter electrode, (b) dye absorbed photoanode, (c) fabricated dssc device based on tio2-nw photoanode and (d) schematic of dssc device based on tat photoanode

the pce of the tio2-nw device is ~ 0.61%, and the corresponding open-circuit voltage (voc) and short-circuit current density (jsc) of the cell are ~ 0.51 v and ~ 3.21 ma/cm2. the efficiency of the dssc is reduced to ~ 0.24% after the incorporation of ag nanoparticles, with the corresponding voc and jsc being ~ 0.34 v and ~ 2.11 ma/cm2, respectively. the difference in jsc depends on the light conversion activity and on the structure of the photoanode, which determines the electron diffusion and hence the pce of the solar cell.
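the fill factor and pce follow from the j-v parameters through the standard relations ff = (vm·jm)/(voc·jsc) and η = (vm·jm)/pin. the sketch below recomputes them from the maximum-power-point values listed in table 1; the function names are hypothetical, and treating the "im" column as a current density in ma/cm2 is an assumption of this example.

```python
P_IN_MW_CM2 = 100.0  # illumination intensity quoted in the text (mW/cm2)


def fill_factor(v_m, j_m, v_oc, j_sc):
    """Fill factor: maximum power divided by the product Voc * Jsc."""
    return (v_m * j_m) / (v_oc * j_sc)


def efficiency_percent(v_m, j_m, p_in=P_IN_MW_CM2):
    """PCE in percent; volts times mA/cm2 gives mW/cm2."""
    return 100.0 * (v_m * j_m) / p_in


# (voc, jsc, vm, jm) from table 1; reading "im" as mA/cm2 is an assumption
cells = {
    "tio2-nw": (0.51, 3.21, 0.32, 1.94),
    "tat@30nm": (0.34, 2.11, 0.21, 1.13),
}

results = {
    name: (fill_factor(vm, jm, voc, jsc), efficiency_percent(vm, jm))
    for name, (voc, jsc, vm, jm) in cells.items()
}

for name, (ff, eta) in results.items():
    print(f"{name}: ff = {ff:.2f}, pce = {eta:.2f} %")
```

under these assumptions the recomputed values come out close to the tabulated ff and η for both devices, with differences of about 0.01 that are consistent with rounding of the two-digit inputs.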
it is observed that the tio2-nw photoanode based dssc devices show better efficiency compared to tat@30nm photoanode based devices.

fig. 7 (a) j-v graphs of dsscs based on tio2-nw and tat@30nm photoanodes and (b) magnified j-v graphs of the dsscs (current density (ma/cm2) versus voltage (v))

table 1 photovoltaic performance of tio2-nw and tat@30nm photoanode based dsscs

photoanode      voc (v)   jsc (ma/cm2)   vm (v)   im (ma)   ff     η (%)   reference
mwcnt           0.28      1.76           -        -         0.30   0.15    [31]
zno film        0.49      2.15           -        -         0.54   0.56    [32]
tio2 film       0.56      1.17           -        -         0.85   0.56    [33]
tio2-nw         0.51      3.21           0.32     1.94      0.37   0.61    our result
tat@ag 30 nm    0.34      2.11           0.21     1.13      0.34   0.24    our result

the tio2-nw based device improves the accessibility of the entire surface to the dye and the corresponding electrolyte medium, leading to a direct and shorter path for the transportation of the electrons. marques et al. reported an efficiency of ~ 1.2% from a dssc fabricated using the tape casting method; the highest efficiency was achieved by using 4-tert-butyl pyridine electrolytes [35]. again, erande et al. reported a pce of 0.2% for a dssc in which the tio2 film was deposited using a chemical method [34]. the natural dye, acting as a sensitizer of the dsscs, was less efficient. even so, the efficiency of our dssc device based on tio2-nw was higher than that of dsscs using natural dye. furthermore, our device shows better performance in terms of efficiency compared to some of the recently reported devices, as shown in table 1. however, the efficiency of our device is still low, which may be due to the small thickness of the photoanode. the efficiency may therefore be further improved by increasing the tio2-nw photoanode thickness and also by reducing the size of the metal nanoparticles.

4.
conclusion

in conclusion, the glad method was used to develop tio2-nw and ag-np-assisted tio2-nw photoanodes on an fto substrate for the development of dsscs. the sem and tem analysis reveal the successful deposition of tio2-nw and tat nanowires. the xrd investigation reveals the presence of ag-np and tio2 crystals in the samples. the absorption enhancement from the ag-np assisted tio2-nw samples observed in the absorption spectrum may be due to the spr effect of ag-np present in the tio2-nw. the tio2-nw based dssc device shows better efficiency compared to the ag-np assisted tio2-nw photoanode based dssc device. it may be concluded that the size of the ag-np incorporated at the mid-point of the tio2-nw needs to be reduced to enhance the efficiency of the dssc. therefore, the presented technique may be employed for developing dsscs and other optoelectronic devices.

acknowledgement: the authors acknowledge the department of science and technology (dst), science and engineering research board (serb), govt. of india for funding this work under file no. ecr/2018/000834. also, the authors would like to thank nit, durgapur and nit nagaland for fe-sem and xrd analysis, respectively.

references

[1] b. shougaijam and s. s. singh, growth of vertically aligned tio2 nanowire photoanode for developing dye-sensitized solar cell. in: t. r. lenka, d. misra, a. biswas, micro and nanoelectronics devices, circuits and systems. lecture notes in electrical engineering, springer, singapore, 2023, vol. 904, pp. 119-129. [2] l. a. reichertz, i. gherasoiu, k. m. yu, v. m. kao, w. walukiewicz and j. w. ager iii, "demonstration of a iii-nitride/silicon tandem solar cell", appl. phys. express, vol. 2, no. 12, p. 122202, dec. 2009. [3] w. chen, y. wu, y. yue, j. liu, w. zhang, x. yang, h. chen, e. bi, i. ashraful, m. gratzel and l.
han, "efficient and stable large-area perovskite solar cells with inorganic charge extraction layers", science, vol. 350, no. 6263, pp. 944-948, oct. 2015. [4] s. i. cha, y. kim, k. h. hwang, y. j. shin, s. h. seo and d. y. lee, "dye-sensitized solar cells on glass paper: tco-free highly bendable dye-sensitized solar cells inspired by the traditional korean door structure", energy environ. sci., vol. 5, pp. 6071-6075, jan. 2012. [5] l. l. estrella, s. h. lee and d. h. kim, "new semi-rigid triphenylamine donor moiety for d-π-a sensitizer: theoretical and experimental investigations for dsscs", dyes and pigments, vol. 165, pp. 1-10, june 2019. [6] l. l. estrella and d. h. kim, "theoretical design and characterization of nir porphyrin-based sensitizers for applications in dye-sensitized solar cells", sol. energy, vol. 188, pp. 1031-1040, aug. 2019. [7] q. miao, m. wu, w. guo and ma. tingli, "studies of high-efficient and low-cost dye-sensitized solar cells", front. optoelectron. china, vol. 4, pp. 103-107, april 2011. [8] n. heo, y. jun and j. park, "dye molecules in electrolytes: new approach for suppression of dye-desorption in dye-sensitized solar cells", sci. rep., vol. 3, p. 1712, april 2013. [9] p. baviskar, a. ennaoui and b. sankapal, "influence of processing parameters on chemically grown zno films with low-cost eosin-y dye towards efficient dye sensitized solar cell", sol. energy, vol. 105, pp. 445-454, july 2014. [10] c. hora, f. santos, m. g. f. sales, d. ivanou and a. mendes, "dye-sensitized solar cells for efficient solar and artificial light conversion", acs sustainable chem. eng., vol. 7, pp. 13464-13470, 2019. [11] s. s. d. mir, l. e. liezel, m. a. a. ivy, l. anton, m. nikita, a. mikaee, n. massoma, w. mohebullah, z. hameedullah and s. tomonobu, "photocatalytic applications of metal oxides for sustainable environmental remediation", metals, vol. 11, pp. 1-25, jan. 2021. [12] r. k. pandey and v. k.
prajapati, "molecular and immunological toxic effects of nanoparticles", int j biol macromol, vol. 107, pp. 1278-1293, feb. 2017. [13] r. allen, "the cytotoxic and genotoxic potential of titanium dioxide (tio2) nanoparticles on human shsy5y neuronal cells in vitro", the plymouth student scientist, vol. 9, pp. 5-28, 2016. [14] b. shougaijam, r. swain, c. ngangbam and t. r. lenka, "analysis of morphological, structural and electrical properties of annealed tio2 nanowires deposited by glad technique", j. semicond., vol. 38, no. 5, p. 053001, may 2017. [15] r. s. dubey, k. v. krishnamurthy and s. singh, "experimental studies of tio2 nanoparticles synthesized by sol-gel and solvothermal routes for dsscs application", results in physics, vol. 14, p. 102390, sept. 2019. [16] h. lee, m. y. song, j. s. jurng and y. k. park, "the synthesis and coating process of tio2 nanoparticles using cvd process", powder technology, vol. 214, pp. 64-68, nov. 2011. [17] h. k. e. latha and h. s. lalithamba, "synthesis and characterization of titanium dioxide thin film for sensor applications", mater. res. express, vol. 5, p. 035059, march 2018. [18] h. y. yang, m. f. lee, c. h. huang, y. s. lo, y. j. chen and m. s. wong, "glancing angle deposited titania films for dye-sensitized solar cells", thin solid films, vol. 518, pp. 1590-1594, dec. 2009. [19] m. s. wong, m. f. lee, c. l. chen and c. h. huang, "vapor deposited sculptured nano-porous titania films by glancing angle deposition for efficiency enhancement in dye-sensitized solar cells", thin solid films, vol. 519, pp. 1717-1722, dec. 2010. [20] j. m. ji, h. zhou, y. k. eom, c. h. kim and h. k. kim, "14.2% efficiency dye-sensitized solar cells by co-sensitizing novel thieno [3, 2-b] indole-based organic dyes with a promising porphyrin sensitizer", adv. energy mater., vol. 10, no. 15, p. 2000124, feb. 2020. [21] b. shougaijam and s. s.
singh, "structural and optical analysis of ag nanoparticle-assisted and vertically aligned tio2 nanowires for potential dsscs application", j mater sci: mater electron, vol. 32, pp. 19052-19061, june 2021. [22] c. liu, t. li, y. zhang, t. kong, t. zhuang, y. cui, m. fang, w. zhu, z. wu and c. li, "silver nanoparticle modified tio2 nanotubes with enhanced the efficiency of dye-sensitized solar cells", micropor. mesopor. mat., vol. 287, pp. 228-233, oct. 2019. [23] b. pandit, v. s. devika and b. r. sankapal, "electroless-deposited ag nanoparticles for highly stable energy-efficient electrochemical supercapacitor", j. alloys compd., vol. 726, pp. 1295-1303, dec. 2017. [24] k. v. alex, p. t. pavai, r. rugmini, m. s. prasad, k. kamakshi and k. c. sekhar, "green synthesized ag nanoparticles for bio-sensing and photocatalytic applications", acs omega, vol. 5, no. 22, pp. 13123-13129, may 2020. [25] n. s. rohizat, a. h. a. ripain, c. s. lim, c. l. tan, r. zakaria, "plasmon-enhanced reduced graphene oxide photodetector with monometallic of au and ag nanoparticles at vis–nir region", sci. rep., vol. 11, p. 19688, oct. 2021. [26] p. wen, y. han, w. zhao, "influence of tio2 nanocrystals fabricating dye-sensitized solar cell on the absorption spectra of n719 sensitizer", nanotechnol. solar energy, p. 906198, july 2012. [27] a. barranco, a. borras, a. r. gonzález-elipe and a. palmero, "perspectives on oblique angle deposition of thin films: from fundamentals to devices", prog. mater. sci., vol. 76, pp. 59-153, march 2016. [28] s. r. bhattacharyya, z. mallick and r. n. gayen, "vertically aligned al-doped zno nanowire arrays as efficient photoanode for dye-sensitized solar cells", journal of elec mater., vol. 49, pp. 3860-3868, april 2020. [29] y. wang, j. cheng, m. shahid, m. zhang and w. pan, "a high performance tio2 nanowire uv detector assembled by electrospinning", rsc adv., vol. 7, p. 26220, may 2017. [30] m. a. k. l. dissanayake, j. m. k. w. kumari, g. k. r. 
senadeera and c. a. thotawatthage, "efficiency enhancement in plasmonic dye-sensitized solar cells with tio2 photoanodes incorporating gold and silver nanoparticles", j. appl. electrochem., vol. 46, pp. 47-58, sept. 2016. [31] p. a. mithari, a. c. mendhe, s. s. karade, b. r. sankapal and s. r. patrikar, "mos2 nanoflakes anchored mwcnts: counter electrode in dye-sensitized solar cell", inorganic chem. commun., vol. 132, p. 108827, july 2021. [32] a. n. ossai, s. c. ezike, p. timtere and a. d. ahmed, "enhanced photovoltaic performance of dye-sensitized solar cells-based carica papaya leaf and black cherry fruit co-sensitizers", chem. phys. impact, vol. 2, p. 100024, april 2021. [33] n. purushothamreddy, r. k. dileep, g. veerappan, m. kovendhan and d. p. joseph, "prickly pear fruit extract as photosensitizer for dye-sensitized solar cell", spectrochimica acta part a: molecular and biomolecular spectroscopy, vol. 228, p. 117686, oct. 2019. [34] k. b. erande, p. y. hawaldar, s. r. suryawanshi, b. m. babar, a. a. mohite, h. d. shelke, s. v. nipane and u. t. pawar, "extraction of natural dye (specifically anthocyanin) from pomegranate fruit source and their subsequent use in dssc", mater. today: proc., vol. 43, no. 4, pp. 2716-2720, july 2021. [35] a. s. marques, v. a. s. silva, e. s. ribeiro and l. f. b. malta, "dye-sensitized solar cells: components screening for glass substrate, counter-electrode, photoanode and electrolyte", mat. res., vol. 23, no. 5, p. e20200168, nov. 2020.

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 309-323 doi: 10.2298/fuee1602309s

in-channel misrouting suppression technique for deflection-routed networks on chip

igor z. stojanovic, goran lj. djordjevic
faculty of electronic engineering, university of niš, serbia

abstract.
deflection routing, where port contentions in routers are resolved by intentionally misrouting some packets along unwanted directions instead of storing them, has been proposed as a promising approach for improving power and area efficiency of large-scale networks on chip (nocs). however, at high network load, when packets are misrouted more frequently, the cost and energy benefits of this simple routing scheme are offset by the performance degradation. to address this problem, we propose a technique that uses small in-channel buffers to capture some of the deflected packets before they take a misrouting hop. the captured packets are then looped back to the routers where they suffered deflection and routed again. to improve the efficiency of this in-channel misrouting suppression scheme, we also slightly modify the routing function of the deflection router by restricting the choice of productive directions for misrouted packets. evaluations on synthetic traffic patterns show that the proposed misrouting suppression mechanism yields an improvement of 36.2% in network saturation throughput when implemented in the conventional deflection-routed network.

key words: network-on-chip, multi-core, deflection routing, misrouting suppression.

1. introduction

network-on-chip (noc) has been proposed as an efficient and scalable solution to the challenging on-chip interconnection problems in modern many-core systems on chip (socs). to accommodate the communication needs of tens or even hundreds of processing elements (pes) integrated on a single chip, this architecture employs dedicated routers interconnected by some form of network topology. nocs typically use wormhole routing with virtual channel (wormhole/vc) flow control to route data packets from the source to the destination pe. this flow control scheme enables deadlock avoidance, optimizes channel utilization, improves performance and provides quality of service [1, 2].
although wormhole/vc routing needs considerably less buffer storage than other traditional flow control schemes (e.g. virtual cut-through and store-and-forward), the in-router buffers are still a significant source of area and energy overhead. for a static random access memory (sram) buffer implementation, the input buffers can consume 46% of the total on-chip network power while occupying 17% of the total area [3].

received june 3, 2015; received in revised form august 3, 2015
corresponding author: igor.stojanovic@elfak.ni.ac.rs
faculty of electronic engineering, university of niš, a. medvedeva 14, 18000 niš, serbia (e-mail: mita@iritel.com)

to address the issue, several bufferless noc architectures have recently been proposed. in these architectures, in-router buffers are removed and contentions among packets are handled by employing deflection routing [4-13]. with deflection routing, data packets are divided into flits (flow control units), which are then routed independently through the network and reassembled at their destination. flits arrive synchronously on the router's input ports, and each flit is routed via the output port that offers the shortest path to its destination. when two incoming flits require the same output port, the router deflects one of the flits to an alternative output port (this is always possible as long as the router has as many outgoing as incoming ports). in this way, port contentions cause flits to be misrouted temporarily, in contrast with the wormhole/vc scheme, where such flits must be buffered.

deflection routing has several advantages over the wormhole/vc scheme. first, since the number of incoming ports is equal to the number of outgoing ports, and flits move between routers synchronously, deadlock cannot occur. the adaptive nature of deflection routing also enables hot spot avoidance and provides fault-tolerance in the network [4].
this approach also eliminates the need for backward status links to implement flow control, and thus the design of the router is greatly simplified. finally, deflection routing permits the use of as few as one flit-wide register per inter-router link, thereby realizing significant savings in hardware cost and power consumption over wormhole/vc nocs, which must provide ample buffers in each router. recent studies have shown that in deflection-routed nocs the power consumption is reduced by 20-40%, and the router area on die is reduced by 40-75% [6].

deflection routers target mainly low-latency operation at low network load [5]. under such load conditions, deflections are rare, so flits rapidly advance toward their destinations over shortest paths. on the other hand, under high load, frequent deflections might cause flits to deviate significantly from their shortest paths, leading to early saturation and poor energy efficiency.

the issue of limited maximum throughput of deflection-routed networks has been addressed by several prior works. one line of research aims at improving the design of the router’s port allocation and switching (pas) stage. within this stage, input flits are first permuted and then passed to the router’s output ports so that as many flits as possible are directed toward their desired directions. the bless router uses a pas stage composed of a 4×4 crossbar switch controlled by an allocator unit that assigns flits to output ports based on an oldest-first arbitration policy [6]. the full priority ordering of flits results in fewer deflections, but it incurs a long critical path delay, thus limiting router operation to low clock frequencies. the chipper router shortens the critical path of the router by replacing the crossbar with a two-stage permutation network composed of four independently controlled 2×2 switch modules [7].
however, the simplicity of this design results in an increased deflection rate, and consequently lowers the maximum network throughput.

another line of research deals with techniques for reducing the overhead of flit deflection. such misrouting suppression mechanisms try to prevent a deflected flit from taking a misrouting hop by temporarily holding the flit at its current route position. the minimally buffered deflection router (minbd) achieves misrouting suppression with a small side-buffer attached between the output and the input of the router’s pas stage [8]. at each clock cycle, the side-buffer can accept up to one deflected flit from the pas output, and resubmit that flit to the pas input at some later cycle. by preventing a fraction of deflected flits from leaving the router, this technique significantly improves the maximum network throughput. however, it also introduces contention between the buffered flits and the new flits waiting for injection, which can cause injection unfairness among routers in a highly loaded network. in our previous work, we proposed an in-channel misrouting suppression technique, referred to as the dual-mode channel, which uses a lightweight link-control mechanism to force deflected flits, when possible, to loop back to their current routers instead of being misrouted [9]. this simple and effective method improves performance without compromising injection fairness, but the obtained maximum network throughput is lower than that obtained with the side-buffering technique.

in this paper, we further improve the misrouting suppression efficiency of the dual-mode channel by adding small buffers at both ends of the channel. these buffers temporarily store deflected flits that cannot be looped back during the same clock cycle in which they enter the channel.
also, we slightly modify the routing function of the baseline deflection router to remove the tendency of misrouted flits to take immediate reverse hops. this modification is motivated by our observation that such hops have an adverse effect on how often the channel is able to loop back the deflected flits. when combined, the proposed mechanisms suppress more than 50% of misrouting hops, raising the maximum throughput by 36.2% with respect to the baseline deflection-routed network. the resulting throughput is 8.7% higher than with the side-buffering technique, and is achieved without compromising the injection fairness in the network.

the remainder of the paper is organized as follows. section 2 provides background on deflection routing, including an overview of two representative misrouting suppression techniques: the side-buffering and the dual-mode channel. section 3 presents the novel misrouting suppression scheme for deflection-routed nocs. in section 4, evaluation and results are presented. section 5 concludes this paper.

2. deflection-routed noc architecture overview

in this section, we first provide a generic model of the deflection-routed noc architecture, which includes only the essential features reported in several previous proposals [5-13]. in particular, we consider a network of 2d mesh topology composed of non-pipelined (i.e. combinational) deflection routers connected by synchronous bidirectional communication channels. we then discuss two existing techniques that improve the performance of the baseline deflection-routed network via misrouting suppression.

2.1. baseline 2d mesh deflection network

figure 1 illustrates the fundamental elements of a generic 2d mesh deflection-routed noc. the noc is constructed as a grid of routers where each router is connected by bidirectional communication channels only to its neighbors. each router is also connected to a local pe, which serves as a source and sink for data packets.
before being injected into the router, packets are split into smaller flow control units, so-called flits, and each flit is routed independently through the network. in its most basic form, the deflection router is a pure combinational logic module, which directs the incoming flits from the input ports to the proper output ports. the inter-router communication channel includes a pair of oppositely oriented flit-wide edge-triggered registers. since there are no in-router buffers, these so-called flit-registers are the only memory elements for storing flits in transit. therefore, while traveling towards their destinations, flits are always on the move, hopping between the flit-registers and propagating through the routers.

fig. 1 2d mesh deflection-routed noc architecture

routers attempt to route each flit along a shortest path to its destination. a router forwards a flit through a productive output port, i.e. in a productive direction, if the distance between the current flit position and its destination decreases. in a 2d mesh network, when a flit reaches a router, there are at most two productive directions (i.e. output ports) to its destination. if the router is not able to grant a productive output port, the flit is deflected to any free but non-productive output port. deflection occurs within the internal router structure when multiple incoming flits contend for the same output port. on the other hand, the term misrouting refers to an external manifestation of the flit deflection. it corresponds to a transfer of a deflected flit over the inter-router channel one hop further in a non-productive direction. the cost of misrouting is two clock cycles since each non-productive hop must be compensated by one productive hop in the opposite direction. note that in the baseline deflection-routed network, every flit deflection leads to a flit misrouting.
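the notion of a productive port can be illustrated with a minimal python sketch. it is not taken from the paper: the function names and the coordinate convention (east is +x, north is +y) are assumptions made here for illustration only.

```python
from typing import Set, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the mesh


def productive_ports(current: Coord, destination: Coord) -> Set[str]:
    """Return the output ports that reduce the distance to the destination.

    Assumed convention (not from the paper): E is +x, W is -x,
    N is +y, S is -y.
    """
    dx = destination[0] - current[0]
    dy = destination[1] - current[1]
    ports: Set[str] = set()
    if dx > 0:
        ports.add("E")
    elif dx < 0:
        ports.add("W")
    if dy > 0:
        ports.add("N")
    elif dy < 0:
        ports.add("S")
    return ports  # empty set: the flit is addressed to the local pe


def is_productive_hop(current: Coord, destination: Coord, out_port: str) -> bool:
    """A hop is productive if it leaves through a productive port."""
    return out_port in productive_ports(current, destination)
```

for example, a flit at (2, 2) destined for (0, 4) has the two productive ports w and n under this convention; leaving through e or s would be a misrouting hop that costs two clock cycles in total.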
fig. 2 architecture of baseline deflection router: a) internal structure, and b) pas based on permutation network

figure 2a shows the architecture of the deflection router with four pairs of input and output network ports (denoted n for north, s for south, w for west and e for east) and a pair of inject and eject ports (denoted pin and pout) which are connected to the local pe. the router is composed of four consecutive stages: the routing stage (r), the eject stage (e), the inject stage (i), and the port allocation and switching stage (pas). through these stages, four internal flit-channels, c1, ..., c4, are established to guide flits from the set of input ports to the set of output ports.

the routing stage associates a set of productive ports with each incoming flit. the routing function is based on the offsets in the x and y dimensions between the current router and the flit’s destination router. the number of productive ports assigned to a flit can be: 0 (the flit is addressed to the local pe, i.e. both the x- and y-offset are zero), 1 (the flit is already on one of the axes of its final destination, i.e. either the x- or the y-offset is zero) or 2 (both the x- and y-offset are non-zero). the eject stage randomly picks one of the locally addressed flits (if any), and directs that flit to the local pe. the inject stage detects the presence of a free flit-channel and directs the new flit (generated by the local pe) to that channel. if the new flit is not injected into the network because all flit-channels are occupied, then that flit remains in the pe’s transmission queue and is resubmitted in the next clock cycle. the pas stage permutes the flits and passes them from the flit-channels to the output network ports. here, we adopt the pas stage introduced in the chipper router [7], which consists of four two-input switch modules arranged into two stages (fig. 2b).
each switch module is controlled by arbitration logic which first decides the winner between two flits, and then sends the winning flit toward its productive output port. the losing flit is directed to the other output of the module. the winner between two input flits is determined according to the silver-flit arbitration policy [8]. in this arbitration scheme, a single randomly selected flit is designated as a silver flit, i.e. it is prioritized above the others. the silver flit always wins in arbitration. the winner between any two non-silver flits is decided randomly.

2.2. misrouting suppression techniques

the term misrouting suppression refers to any technique for reducing the delay overhead incurred by flit deflection in deflection-routed networks [9]. these mechanisms cannot cancel flit deflection, which occurs within the pas stage of the router, but they can recognize a deflected flit and force it to temporarily stay at its current route position instead of making a non-productive hop. misrouting suppression can be implemented either within the deflection router or within the inter-router communication channel.

fig. 3 misrouting suppression techniques: a) in-router misrouting suppression with side-buffer, and b) in-channel misrouting suppression with dual-mode channel

side-buffering. the side-buffering [8] is an in-router misrouting suppression technique which uses a small buffer memory (the so-called side buffer) attached to each router to buffer some deflected flits that otherwise would be misrouted. the side buffer can be implemented either as a single flit-register or as a small fifo (composed of several flit-registers). as shown in fig. 3a, the side buffer (sb) is attached to the deflection router via two additional stages: the buffer-eject stage (be) and the buffer-inject stage (bi).
the be stage recognizes deflected flits at the output of the pas stage, and puts one of them into the side buffer if the side buffer is not full. this flit is picked randomly among the deflected flits. the buffered flit will be re-injected through the bi stage in some later clock cycle, when there is a free flit-channel after flit ejection. previous studies have shown that even adding the smallest side-buffer (1 flit in size) can reduce the misrouting rate by 50%, and can improve the maximum network throughput by 26% [8]. however, the studies have also shown that the performance improvement of this technique does not scale with increasing side-buffer size, because increasing the buffer size over 2 flits leads to only a marginal performance gain. more importantly, as pointed out in [9], the presence of side buffers can cause an imbalance between the injection and ejection bandwidth available to pes in the areas of the network congested with in-transit traffic. this occurs because of the arrangement of stages within the side-buffered deflection router, which gives injection precedence to the flit residing in the side-buffer over the new flit waiting at the pe inject port. when the router is overloaded with in-transit flits, a free flit-channel appears rarely and is occupied by the buffered flit in most cases, leaving the new flit to wait for another chance.

dual-mode channel. the dual-mode channel is an in-channel misrouting suppression technique which prevents some non-productive network hops by forcing deflected flits, when possible, to loop back to their current routers instead of being misrouted [9]. the datapath for this design is shown in fig. 3b. the approach is based on enhancing the inter-router communication channel with the capability to dynamically (i.e. on a cycle-by-cycle basis) switch between two modes of operation.
if deflected flits are present on both ends of the channel, or one flit is deflected and the other one is absent, then the channel activates the loop-back mode (indicated by dotted lines in fig. 3b). in this mode, the flits are returned to the corresponding input ports of their current routers. otherwise, the channel is configured in the straight-through mode (indicated by dashed lines), allowing both flits to make one network hop. with this scheme, a deflected flit will be misrouted only if there is a productively-routed flit on the opposite side of the channel. in all other cases, the deflected flit will stay at its current route position. it is important to note that the loop-back mechanism is transparent for productively-routed flits, which flow as in a network with conventional inter-router channels.

our previous simulation results show that this simple in-channel misrouting suppression mechanism offers a 14.3% performance improvement in terms of maximum network throughput when implemented in the baseline deflection-routed noc [9]. the improvement is smaller than with the side-buffering technique, but is accomplished with lower implementation cost (i.e. there is no need for additional buffer memory) and without any modification to the underlying router microarchitecture. an important advantage of the dual-mode channel approach over the side-buffering is that it preserves the injection fairness in the network.

3. misrouting suppression with in-channel buffering

the limited misrouting suppression efficiency of the dual-mode channel is a consequence of the fact that the channel cannot save a deflected flit from misrouting if a productively-routed flit is present on the opposite end of the channel. under high traffic, when the inter-router channels are almost fully utilized, the loop-back mode can only be activated when both ends of the channel are occupied by deflected flits, which occurs rarely.
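the mode selection of the plain dual-mode channel, as described in section 2.2, can be summarized in a few lines of python. this is only a sketch of the rule as stated in the text; the status encoding ("P" for productively-routed, "N" for deflected, `None` for an absent flit) and the function name are assumptions introduced here.

```python
from typing import Optional

PRODUCTIVE = "P"  # flit was granted a productive output port
DEFLECTED = "N"   # flit was deflected inside the pas stage


def dual_mode_channel(fa: Optional[str], fb: Optional[str]) -> str:
    """Select the channel mode for one clock cycle.

    fa and fb are the routing statuses of the flits offered by the two
    ends of the channel (None means that no flit is present).
    """
    any_deflected = DEFLECTED in (fa, fb)
    any_productive = PRODUCTIVE in (fa, fb)
    # Loop-back is possible only when no productively-routed flit needs
    # the channel: both flits deflected, or one deflected and one absent.
    if any_deflected and not any_productive:
        return "loop-back"
    return "straight-through"
```

the sketch makes the limitation explicit: a single productively-routed flit on either end forces the straight-through mode, so a deflected flit on the other end is misrouted.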
in this section we propose two techniques to mitigate the performance limitation of the dual-mode channel. the first one modifies the routing function of the baseline deflection router with the goal of increasing the frequency with which deflected flits appear simultaneously at both sides of the channel. the second technique adds a small in-channel buffer memory for temporarily storing deflected flits that cannot be looped back immediately.

3.1. optimized routing function

according to the results of our simulation experiments with 2d mesh deflection networks under saturated load with a uniform random traffic pattern, a deflected flit appears at a router’s output port with a probability of $\delta = 0.3$. assuming that flit deflections occur independently in neighboring routers, the probability that both sides of an inter-router channel are fed with deflected flits should be $\delta^2 = 0.09$. however, the simulation results show that this probability is actually 0.05. that is, the loop-backs in dual-mode channels occur less frequently than would be expected.

a closer examination of the patterns of inter-router communication reveals that the discrepancy between the expected and the measured loop-back probability is caused by the tendency of misrouted flits to return to the routers in which they suffered deflection during the previous clock cycle. suppose that a flit f is deflected in a router a and then misrouted to a router b over channel c_ab. upon arriving at router b, the flit f is assigned at most two productive ports. because flit f is misrouted, one of its productive ports must be the port through which it has just entered router b. therefore, during the next clock cycle, there is a high chance that flit f will be returned to router a over the channel c_ab, but now as a productively-routed flit, thus forcing the straight-through configuration of the dual-mode channel.
if router a happens to send a deflected flit to channel c_ab during the next clock cycle, that flit will be misrouted, too. thus, the net effect of such behavior is that the likelihood of flit misrouting depends on whether a flit sent by the same router over the same channel during the previous clock cycle was misrouted or not.

in order to resolve this performance issue, we slightly modify the routing function of the baseline deflection router by restricting the choice of productive ports for misrouted flits. in particular, we extend the routing function of the deflection router with the following rule:

rule 1: let flit f enter a router a through the input port $t \in \{n,s,e,w\}$, and let $p \subseteq \{n,s,e,w\}$ be the set of productive output ports for flit f in router a. if the size of p is two, then remove t from p.

rule 1 only impacts the implementation of the routing stage of a deflection router (fig. 3a). it is applied after the incoming flits are assigned productive ports. if flit f has reached router a by a productive hop, then rule 1 has no effect on the routing decision regarding f because t cannot be in p. otherwise, if flit f has arrived at router a by a misrouting hop, then port t must be in p. in this case, port t will be preserved in p if t is the only productive option for f. otherwise, t is removed from p. without t in the set of its productive ports, flit f will not be intentionally returned to the previous router, unless it is deflected within the pas stage of router a. it should be noted that rule 1 does not preclude the possibility that a misrouted flit will be returned to the previous router; it only decreases the likelihood of such an event.

3.2. in-channel buffering

the main motivation for using the in-channel buffers is to decouple the operations of the two sides of the dual-mode channel by enabling each side to buffer incoming deflected flits which cannot be looped back immediately.
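as a side illustration, rule 1 from section 3.1 amounts to a small filter applied to the set of productive ports. the python sketch below is an assumption-labelled illustration, not the authors' implementation: the function name `apply_rule_1` and the representation of ports as single-letter strings are hypothetical.

```python
from typing import Set


def apply_rule_1(productive: Set[str], in_port: str) -> Set[str]:
    """Restrict the productive ports of a flit according to rule 1.

    productive: set p of productive output ports, a subset of {N, S, E, W}
    in_port:    port t through which the flit entered the router
    """
    # t is removed only when another productive option remains,
    # i.e. when p holds exactly two ports.
    if len(productive) == 2:
        return productive - {in_port}
    return set(productive)
```

if the flit arrived by a productive hop, t is not in p and the set difference leaves p unchanged, which matches the remark that rule 1 has no effect in that case.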
thus, instead of being misrouted to a neighboring router, the buffered deflected flit will be kept at its current route position until the condition for looping back is met. when eventually looped back to the router that caused its deflection, the flit will get a new chance to continue traveling along a productive direction toward its destination.

the datapath of the proposed inter-router channel with built-in flit-buffers is shown in fig. 4a. in comparison to the dual-mode channel (fig. 3b), the buffered channel contains two additional small fifo sections placed in parallel with the direct loop-back paths. with the fifos included, the dual-mode channel gains several new options for handling the incoming and buffered flits. as indicated by dotted lines in fig. 4a, the buffered channel can carry out one or more of ten different flit-transfer actions during each clock cycle. the choice of actions depends on the routing statuses of the incoming flits as well as the statuses of the two fifos. the first set of options transfers an incoming flit straight through to the flit-register on the opposite side of the channel. if the incoming flit is productively-routed, this action leads to a productive hop (actions labelled 1a/1b); otherwise, if the flit is deflected, the straight-through transfer causes a misrouting hop (2a/2b). the second set of options keeps an incoming flit on the same side of the channel. the flit loop-back action (3a/3b) allows an incoming flit to bypass the fifo and immediately reach the flit-register (fra/frb) on the same side of the channel. the incoming flit can also be buffered (4a/4b), and a buffered flit can be looped back (5a/5b).

a c-like pseudo code describing the operation of the dual-mode channel with built-in buffers is shown in fig. 4b. consider the operation of the a-side part of the channel in more detail. the b-side part operates analogously.
the a-side part of the channel can be configured in either the straight-through or the loop-back mode. the straight-through mode moves the opposite-side flit, fb, into the a-side flit-register, fra. in the loop-back mode, either the a-side incoming flit, fa, or the flit taken from fifoa is written into fra. the straight-through mode is prioritized over the loop-back mode, and occurs in two distinct situations: when the flit fb is productively-routed (1b), and when the flit fb is deflected and must be misrouted (2b). the deflected flit fb is misrouted if there are no other options for handling that flit, i.e. the loop-back path of the b-side is blocked by a productively-routed flit fa and fifob is full. even if the a-side part of the channel is configured in the straight-through mode, a deflected flit fa can still be saved from misrouting by storing it into fifoa if fifoa is not full (4a). if the a-side loop-back path is enabled, the flit-register fra receives either a flit from fifoa (5a), if fifoa is not empty, or the incoming flit fa, if that flit is deflected and fifoa is empty. in the case of the buffered loop-back action (5a), the incoming flit fa, if deflected, is written into fifoa (4a). it should be noted that a situation where both incoming flits are misrouted is not possible with this scheme. the critical case is one where both fifos are full, and both incoming flits, fa and fb, are deflected. according to the algorithm, in this case, both sides of the channel activate the buffered loop-back operation (5a/5b), which enables buffering of both flits (4a/4b) regardless of the current statuses of the fifos.
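the a-side and b-side rules described above can be exercised with a minimal python sketch. it is an illustration only, not the authors' systemc model: the `Flit` class, the helper names and the use of `collections.deque` for the fifos are assumptions introduced here. the fifo-full conditions are sampled at the start of the cycle, which is our reading of the register-transfer semantics.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Tuple


@dataclass(frozen=True)
class Flit:
    ident: str
    productive: bool  # True: productively-routed (f.p), False: deflected (f.n)


def _is_p(f: Optional[Flit]) -> bool:
    return f is not None and f.productive


def _is_n(f: Optional[Flit]) -> bool:
    return f is not None and not f.productive


def _side_step(f_own: Optional[Flit], f_opp: Optional[Flit],
               fifo_own: Deque[Flit], opp_fifo_full: bool,
               depth: int) -> Optional[Flit]:
    """Return the flit written into this side's flit-register (or None)."""
    if _is_p(f_opp) or (_is_n(f_opp) and _is_p(f_own) and opp_fifo_full):
        reg = f_opp                        # 1/2: straight-through transfer
        if _is_n(f_own) and len(fifo_own) < depth:
            fifo_own.append(f_own)         # 4: buffer own deflected flit
    elif fifo_own:
        reg = fifo_own.popleft()           # 5: buffered loop-back
        if _is_n(f_own):
            fifo_own.append(f_own)         # 4: slot was just freed
    elif _is_n(f_own):
        reg = f_own                        # 3: direct flit loop-back
    else:
        reg = None                         # nothing arrives on this side
    return reg


def channel_step(fa: Optional[Flit], fb: Optional[Flit],
                 fifo_a: Deque[Flit], fifo_b: Deque[Flit],
                 depth: int = 1) -> Tuple[Optional[Flit], Optional[Flit]]:
    """Advance the buffered channel by one clock cycle; returns (fra, frb)."""
    a_full = len(fifo_a) >= depth          # sampled before any update
    b_full = len(fifo_b) >= depth
    fra = _side_step(fa, fb, fifo_a, b_full, depth)
    frb = _side_step(fb, fa, fifo_b, a_full, depth)
    return fra, frb
```

with 1-flit fifos, a deflected flit facing a productively-routed flit is buffered on its first attempt; a second deflected flit arriving while the fifo is still full, again facing a productively-routed flit, is the only case in which a misrouting hop takes place.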
a) datapath of the buffered channel between router a (side a: fa, fifoa, fra) and router b (side b: fb, fifob, frb), with the flit-transfer actions: (1a/1b) productive hop, (2a/2b) misrouting hop, (3a/3b) flit loop-back, (4a/4b) buffering, (5a/5b) buffered loop-back

b) pseudo code

side a:
    if(fb.p || fb.n && fa.p && fifob.full){
        fra ← fb;                       // 1b/2b
        if(fa.n && !fifoa.full){
            fifoa ← fa;                 // 4a
        }
    } else if(!fifoa.empty){
        fra ← fifoa;                    // 5a
        if(fa.n){
            fifoa ← fa;                 // 4a
        }
    } else if(fa.n){
        fra ← fa;                       // 3a
    }

side b:
    if(fa.p || fa.n && fb.p && fifoa.full){
        frb ← fa;                       // 1a/2a
        if(fb.n && !fifob.full){
            fifob ← fb;                 // 4b
        }
    } else if(!fifob.empty){
        frb ← fifob;                    // 5b
        if(fb.n){
            fifob ← fb;                 // 4b
        }
    } else if(fb.n){
        frb ← fb;                       // 3b
    }

fig. 4 misrouting suppression with in-channel buffering: (a) datapath; (b) pseudo code. notice: f.p is true if flit f is productively-routed; f.n is true if flit f is deflected; sign “←” denotes a register transfer operation.

in-channel buffering vs. side-buffering. the rationale for using in-channel buffering is similar to that for using side-buffering: to buffer some deflected flits that otherwise would be misrouted. in contrast to the side-buffering, which picks and buffers deflected flits before they leave the router, the in-channel buffers store deflected flits that have entered the inter-router channel but cannot be looped back immediately. placing buffers within the channels instead of within the routers brings the following advantages. in contrast to the side-buffering, which can accept up to one deflected flit per router per clock cycle, the buffered dual-mode channel can loop back or store up to two deflected flits in each clock cycle. in a 2d mesh network of dimension $n \times n$, the number of routers is $n^2$ and the number of inter-router channels is $2n^2 - 2n$. because the number of inter-router channels is almost two times greater than the number of routers, the opportunities to capture deflected flits are more frequent with the in-channel buffering than with the side-buffering.
moreover, being stored outside the routers, the flits buffered in the in-channel fifos re-enter the routers via network ports, and consequently they do not block new flits generated by the pe from entering the router. in this way, the problem of injection unfairness is avoided. the minimum delay overhead of a deflected flit which is buffered in an in-channel fifo is two clock cycles: the first cycle is used for buffering, and the second one for looping back the buffered flit. although the delay overhead is the same as in the case of misrouting, the in-channel buffering is still beneficial since the buffered flit does not occupy the resources of the neighboring router.

4. performance evaluations

in order to evaluate the performance impact of the proposed misrouting suppression technique, we have developed a discrete-event, cycle-accurate simulator for modeling deflection-routed nocs using systemc [14]. it supports experiments with deflection nocs under various options, such as network topology and size, router/channel architecture, buffer parameters, and traffic modelling. the simulator provides output performance metrics, such as latency, throughput, transport delay, and deflection rate for a given set of choices. the main building blocks of the simulator are: 1) processing element, 2) deflection router, and 3) inter-router channel (irc). the processing element block generates and injects flits into the network according to the user-specified configuration, including the traffic pattern and injection rate. it is also responsible for ejecting flits at the destination endpoints and collecting appropriate statistics. the router block mimics the behavior of the generic non-pipelined deflection router described in section 2. it can be configured in the bufferless mode (i.e. without side-buffer) or the buffered mode (with a side-buffer of configurable size).
the configuration options for the irc block are the following: conventional channel (a pair of oppositely oriented flit-registers), dual-mode channel (see fig. 3b), and buffered channel (see fig. 4). the simulation results presented in this section are obtained for a 2d mesh network with a size of 8×8 nodes. the default buffer size in buffered architectures was set to 1 flit. each simulation run was started with a warm-up period of 1,000 cycles followed by a measurement period of 20,000 cycles.

4.1. performance under saturation load

the first set of evaluations was carried out in a saturation mode under a uniform random traffic pattern. in this mode, the transmission queue of each pe is assumed to be always nonempty. under such overloaded conditions, each pe injects a new flit into the network in every clock cycle in which a free flit-channel is available in the router inject stage. the injected flits are destined randomly to other pes with equal probability. a summary of the results is given in table 1.

table 1 comparison of saturation performance of baseline deflection-routed noc architecture and architectures with misrouting suppression support

architecture         | th    | td     | h      | r_d   | r_m   | e
baseline             | 0.265 | 13.216 | 13.216 | 0.298 | 0.298 | 0
side-buffering       | 0.332 | 11.016 | 8.696  | 0.295 | 0.143 | 51.5%
dual-mode channel    | 0.303 | 11.555 | 10.889 | 0.298 | 0.240 | 19.36%
in-channel buffering | 0.361 | 14.541 | 8.144  | 0.305 | 0.145 | 52.3%

(th: saturation throughput; td: transport delay; h: average hop count; r_d: deflection rate; r_m: misrouting rate; e: misrouting suppression efficiency)

the details of the performance measures reported in table 1 are as follows. the saturation throughput (th) is defined as the average number of flits received per pe per clock cycle. it is the single most important network-level performance indicator: measured under saturation load, it gives the absolute limit of the throughput of a deflection-routed network. the transport delay (td) is the time, measured in clock cycles, elapsed from the instant when the source pe injects a flit into the network to the instant when the destination pe receives it.
both the saturation throughput and the transport delay are correlated with the average hop count (h), which is defined as the average number of hops (i.e. channel traversals) a flit takes from source to destination. the average hop count accounts for both productive and non-productive (i.e. misrouting) inter-router hops. in networks where deflected flits are misrouted more often, the average hop count is larger, and consequently the transport delay is longer and the throughput is lower.

deflections occur within the routers due to the inability of the pas stage to grant productive ports to all incoming flits. the tendency of the pas stage to produce deflections is measured with the deflection rate, which is defined as $r_d = n_d / n_r$, where $n_d$ is the total number of deflected flits, and $n_r$ is the total number of flits that are processed by the pas stages of all routers during the simulation. similarly, the misrouting rate is defined as $r_m = n_m / n_r$, where $n_m$ is the total number of flits that are misrouted after deflection. the baseline deflection-routed network misroutes every deflected flit, hence $r_m = r_d$. with a misrouting suppression mechanism implemented, not all deflected flits are misrouted. the misrouting suppression efficiency is defined as $e = ((r_d - r_m) / r_d) \times 100\%$.

the results in table 1 show that the implementation of misrouting suppression techniques brings a significant improvement in saturation throughput over the baseline architecture. the dual-mode channel, as the simplest misrouting suppression technique, raises the throughput by 14.3% over the baseline, while the improvement reaches 25.3% for the side-buffering technique. the highest throughput of 0.361 flits/cycle is achieved with the in-channel buffering, which represents an improvement of 36.2% over the baseline. in the baseline architecture, a flit takes 13.2 inter-router hops on average to reach its destination.
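the relative figures quoted above can be re-derived from the values in table 1 with a short python sketch. the dictionary layout, the key names (`r_d` for the deflection rate, `r_m` for the misrouting rate) and the function names are labels chosen here for illustration; the numbers are copied from table 1.

```python
# Values copied from table 1 (th: saturation throughput,
# r_d: deflection rate, r_m: misrouting rate).
TABLE_1 = {
    "baseline":             {"th": 0.265, "r_d": 0.298, "r_m": 0.298},
    "side-buffering":       {"th": 0.332, "r_d": 0.295, "r_m": 0.143},
    "dual-mode channel":    {"th": 0.303, "r_d": 0.298, "r_m": 0.240},
    "in-channel buffering": {"th": 0.361, "r_d": 0.305, "r_m": 0.145},
}


def throughput_gain(arch: str) -> float:
    """Throughput improvement over the baseline, in percent."""
    base = TABLE_1["baseline"]["th"]
    return 100.0 * (TABLE_1[arch]["th"] / base - 1.0)


def suppression_efficiency(arch: str) -> float:
    """Share of deflections that do not turn into misrouting hops, in percent."""
    row = TABLE_1[arch]
    return 100.0 * (row["r_d"] - row["r_m"]) / row["r_d"]


if __name__ == "__main__":
    for name in TABLE_1:
        print(f"{name:22s} gain {throughput_gain(name):5.1f}%  "
              f"efficiency {suppression_efficiency(name):5.1f}%")
```

because table 1 lists rounded rates, the efficiencies recomputed this way can differ from the tabulated ones in the last digit; the throughput gains agree with the 14.3%, 25.3% and 36.2% quoted in the text.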
misrouting suppression techniques decrease the average hop count (and thus increase the throughput) by temporarily holding some of the deflected flits at their current route positions. in this way, in the network with dual-mode channels, the average hop count is reduced by 2.33 hops with respect to the baseline, while a reduction of 4.52 hops is achieved with the side-buffering. as expected, the lowest average hop count of 8.14 hops is achieved with the in-channel buffering, which represents a decrease of 5.07 inter-router hops per flit (or 38.4%) with respect to the baseline.

in the baseline architecture, the transport delay equals the average hop count because each hop (either productive or misrouting) takes one clock cycle. in the networks with misrouting suppression support, the transport delay incurred by a flit is the sum of two components: the hop count and the time the flit spends blocked by the misrouting suppression mechanism. for example, each time the dual-mode channel activates the loop-back mode, it adds one clock cycle to the transport delay of the looped-back flits. however, since the loop-back saves two hops, the total transport delay is lower than in the baseline noc. in contrast to the dual-mode channel, deflected flits captured by the side-buffering or in-channel buffering mechanism may stay buffered at their current route positions for several clock cycles before they get a chance to make the next inter-router hop. a closer examination of the simulation statistics revealed that flits, while traveling toward their destinations, spend 2.32 clock cycles in the side-buffers on average, which is low enough to provide a 16.6% lower total transport delay than in the baseline network. on the other hand, with the in-channel buffering the average buffer delay is 4.85 clock cycles.
the high buffer delay is the reason why the transport delay with in-channel buffering is larger than in the baseline architecture, despite a significant reduction in hop count. note that in-channel buffering achieves a high saturation throughput even with a high transport delay. this is because buffered flits waiting to be looped back do not block other flits that could otherwise make forward progress. note also that the transport delay can be reduced by limiting the time (i.e. the number of clock cycles) that flits are allowed to spend in in-channel buffers: when the time limit is reached, the buffered flit is forced to loop back, regardless of the routing status of the flit arriving from the opposite side of the channel. however, an inevitable consequence of such a buffering policy is a reduction of the network throughput due to lower utilization of the in-channel buffers. for this reason, we have excluded this design option from further consideration. the results in table 1 do not show a significant difference in deflection rates between the baseline and the noc architectures with misrouting suppression support. this is because the same pas stage (i.e. permutation network with silver flit arbitration policy) is used in all investigated noc configurations. on the other hand, the misrouting rate depends not only on how often flits deflect, but also on how efficiently the misrouting suppression mechanism prevents the deflected flits from making misrouting hops. the side-buffering technique reduces the misrouting rate by preventing some of the deflected flits from leaving the router. in this way, 51.5% of misrouting hops are prevented. the dual-mode channel uses the loop-back mode to return some of the deflected flits to their current routers. with this strategy, the dual-mode channel prevents about 19.36% of all deflected flits from making misrouting hops without adding extra buffers.
by adding buffers to the dual-mode channels and optimizing the routing function of the deflection router, the proposed in-channel buffering technique reaches a misrouting suppression efficiency that is slightly higher than that of the side-buffering technique.

4.2. injection fairness
as pointed out in section 3, the arrangement of stages within the side-buffered deflection router may create injection unfairness in the network, in the sense that some pes get to transmit more flits than others. this phenomenon can best be observed in fig. 5a, which shows the distribution of the injection rate (i.e. the number of flits injected by each pe per clock cycle) over all pes in the side-buffered deflection noc under saturated load with a uniform traffic pattern. as can be seen, the injection rate differences between pes are significant: while corner pes can inject their flits at almost every cycle, the pes in the middle of the mesh get a chance to inject a flit only on every tenth cycle. as shown in fig. 5b, in-channel buffering provides an almost uniform injection rate distribution under the same load conditions. this advantage arises because in-channel buffering is transparent to the deflection router, which treats each incoming flit equally, regardless of whether the flit was looped back by the in-channel misrouting suppression logic or comes from a neighboring router.

fig. 5 injection rate distribution under saturation load in deflection-routed 2d mesh nocs with misrouting suppression support: a) side-buffering; b) in-channel buffering

4.3. sensitivity to buffer size
the second set of simulations deals with the impact of buffer size on the effectiveness of the side-buffering and in-channel buffering techniques. as observed from table 2, although increasing the buffer size improves the throughput and the misrouting suppression efficiency under saturated traffic load, this improvement is relatively small and rapidly saturates.
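the diminishing returns noted here can be checked directly from the table 2 entries. the sketch below, whose helper names are chosen only for illustration, recomputes the relative change in saturation throughput (th) and transport delay (td) between consecutive buffer sizes.

```python
# (th, td) pairs copied from table 2, keyed by buffer size in flits.
TABLE2 = {
    "side-buffering": {
        1: (0.332, 11.016), 2: (0.341, 12.126),
        3: (0.344, 13.476), 4: (0.346, 14.915),
    },
    "in-channel buffering": {
        1: (0.361, 14.541), 2: (0.376, 18.613),
        3: (0.382, 22.899), 4: (0.386, 27.201),
    },
}


def relative_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


def step_changes(rows):
    """Relative th/td change for each increase of the buffer size by one flit."""
    sizes = sorted(rows)
    changes = []
    for small, large in zip(sizes, sizes[1:]):
        th0, td0 = rows[small]
        th1, td1 = rows[large]
        changes.append((small, large,
                        relative_change(th0, th1),
                        relative_change(td0, td1)))
    return changes


for name, rows in TABLE2.items():
    for small, large, d_th, d_td in step_changes(rows):
        print(f"{name}: {small}->{large} flits: th {d_th:+.2f}%, td {d_td:+.2f}%")
```

the first step (1 to 2 flits) yields roughly +2.7% throughput and +10% delay for side-buffering, and roughly +4.2% throughput and +28% delay for in-channel buffering; every later step adds less than 2% throughput while the delay keeps growing.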
doubling the buffer size from 1 flit to 2 flits increases the saturation throughput by only 2.71% for side-buffering and 4.15% for the in-channel buffering technique. in addition to the higher hardware cost, the price paid for this throughput improvement is a 10% longer transport delay for side-buffering, and a 28% longer delay for in-channel buffering. a further increase of the buffer size increases the saturation throughput only marginally, while the transport delay continues to increase steadily. these results suggest that buffers larger than 1 flit increase hardware complexity and waste power without a significant performance benefit.

table 2 comparison of saturation performance of baseline deflection-routed noc architecture and architectures with misrouting suppression support
(th: saturation throughput in flits/cycle; td: transport delay in clock cycles; e: misrouting suppression efficiency)

               side-buffering              in-channel buffering
buffer size    th      td       e          th      td       e
1 flit         0.332   11.016   51.5%      0.361   14.541   52.3%
2 flits        0.341   12.126   57.2%      0.376   18.613   58.6%
3 flits        0.344   13.476   59.2%      0.382   22.899   61.2%
4 flits        0.346   14.915   60.0%      0.386   27.201   62.4%

4.4. latency analysis
finally, we evaluate the impact of different misrouting suppression schemes on the latency performance of the deflection-routed network. latency is defined as the time (in clock cycles) from the moment the flit is generated at the source pe until it arrives at the destination pe, including the time the flit spends in the source pe's transmission queue. in these simulations, each pe generates flits following a poisson distribution with mean rate λ (λ is also called the average flit injection rate for the noc). generated flits remain in the transmission queue of their pe until they are successfully injected into the network. for each network configuration, the flit injection rate is varied from zero to the point at which the first transmission queue in the network becomes saturated.

fig. 6 latency comparison of baseline deflection noc architecture and architectures with misrouting suppression support under uniform traffic pattern

figure 6 shows the load-latency graph under the uniform traffic pattern. as observed, at low injection rates, deflection-routed networks with misrouting suppression support experience almost the same average flit latency as the baseline deflection network. this is because the network is free from congestion. however, as the load in the network increases, the effect of the adopted misrouting suppression technique becomes more visible. the graph in fig. 6 shows that the proposed in-channel buffering technique significantly improves the routing performance by providing low-latency communication at higher injection rates. as can be observed in fig. 6, for every deflection scheme except the side-buffering technique, the maximum injection rate achieved closely matches the saturation throughput reported in table 1. this is because these schemes provide injection fairness, so that all transmission queues in the network become saturated at approximately the same injection rate. on the other hand, in the side-buffered deflection network, the transmission queues of pes in the middle area of the network become saturated at a much lower injection rate than those of boundary pes, leading to early saturation.

5. conclusions
in this paper, a misrouting suppression technique for deflection-routed networks on chip was presented. the presented technique avoids misrouting hops by looping back deflected flits or capturing them in small in-channel buffers immediately after they appear at the router's output ports. the efficiency of the technique is further improved by modifying the routing function of the deflection router so as to prevent misrouted flits from taking immediate reverse hops.
the simulation results show that the proposed scheme improves the performance of the baseline deflection-routed noc by 36.2% in terms of saturation throughput. the results also show that misrouting suppression with in-channel buffering offers a higher saturation throughput than with in-router buffering (i.e. side-buffering), although with a penalty in terms of hardware cost. moreover, the performance improvement is achieved without incurring the injection unfairness among network nodes that characterizes the side-buffering approach.

acknowledgement: this work was partially supported by the serbian ministry of science and technological development project no. tr-33035.

references
[1] w. j. dally, "virtual-channel flow control", ieee trans. parallel distributed syst., 1992, vol. 3, no. 2, pp. 194-205.
[2] t. bjerregaard, s. mahadevan, "a survey of research and practices of network-on-chip", acm comput. surv., vol. 38, no. 1, 2006.
[3] a. kumar, p. kundu, a. singh, l. s. peh and n. jha, "a 4.6 tbits/s 3.6 ghz single-cycle noc router with a novel switch allocator in 65 nm cmos", in proc. of 25th international conference on computer design, iccd, 2007, pp. 63-70.
[4] a. kohler and m. radetzki, "fault-tolerant architecture and deflection routing for degradable noc switches", in proc. of the 3rd ieee international symposium on networks-on-chip, 2009, pp. 22-31.
[5] g. michelogiannakis, d. sanchez, w. j. dally, c. kozyrakis, "evaluating bufferless flow control for on-chip networks", in proc. of the 4th acm/ieee int. symposium on networks-on-chip, 2010, pp. 9-16.
[6] t. moscibroda and o. mutlu, "a case for bufferless routing in on-chip networks", in proc. of the 36th annual international symposium on computer architecture, acm, new york, 2009, pp. 196-207.
[7] c. fallin, c. craik and o. mutlu, "chipper: a low-complexity bufferless deflection router", in proc. of the 17th international symposium on high performance computer architecture (hpca), 2011, pp. 144-155.
[8] c. fallin, g. nazario, x. yu, k. chang, r. ausavarungnirun and o. mutlu, "minbd: minimally-buffered deflection routing for energy-efficient interconnect", in proc. of the 6th ieee/acm international symposium on networks on chip, 2012, pp. 1-10.
[9] i. z. stojanovic, m. d. jovanovic and g. lj. djordjevic, "dual-mode inter-router communication channel for deflection-routed networks-on-chip", the journal of supercomputing, springer us, published online: march 2015.
[10] y. li, k. mei, y. liu, n. zheng, yi xu, "ldbr: low-deflection bufferless router for cost-sensitive network-on-chip design", microprocessors and microsystems, 2014, vol. 38, no. 7, pp. 669-680.
[11] m. hayenga, "scarab: a single cycle adaptive routing and bufferless network", in proc. of the 42nd annual ieee/acm international symposium on microarchitecture (micro-42), 2009, pp. 244-254.
[12] j. jose, b. nayak, k. kumar and muyam m, "debar: deflection based adaptive router with minimal buffering", in proc. of the design, automation & test in europe conference & exhibition (date), 2013, pp. 1583-1588.
[13] c. feng, j. li, z. lu, a. jantsch, m. zhang, "evaluation of deflection routing on various noc topologies", in proc. of ieee 9th international conference on asic (asicon 2011), pp. 163-166.
[14] open systemc initiative. systemc v2.1 language reference manual, 2005. http://www.systemc.org/

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 53-75 https://doi.org/10.2298/fuee2301053m © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

optimal power management of dgs and dstatcom using improved ali baba and the forty thieves optimizer
belkacem mahdad
department of electrical engineering, university of biskra, algeria

abstract. in this study, an improved ali baba and the forty thieves optimizer (iaft) is proposed, adapted and applied to enhance the technical performance of the radial distribution network (rdn).
the standard aft is governed by two sensitive parameters that balance the exploration and exploitation stages. in the proposed variant, a modification based on sine and cosine functions is introduced to create a flexible balance between intensification and diversification during the search process. the proposed variant, named iaft, is applied to solve various single and combined objective functions, such as the reduction of total power losses (tpl), the minimization of total voltage deviation (tvd) and the maximization of the loading capacity (lc), under fixed load and considering the random aspect of loads. the exchange of active power is handled by integrating multiple distributed generation units based on photovoltaic (pv) systems, while the optimal management of reactive power is achieved by installing multiple dstatcom devices. the efficiency and robustness of the proposed variant are validated on two rdn, the 33-bus and the 69-bus. the quality of the objective function values achieved and the statistical analysis, compared with results obtained using several recent metaheuristic methods, demonstrate the competitiveness of the proposed iaft in accurately solving various practical problems related to optimal power management of rdn.

key words: ali baba and the forty thieves optimizer, integration of distributed generation, rdn, dstatcom, power losses, loading capacity

received june 12, 2022; revised july 06, 2022; accepted july 24, 2022
corresponding author: mahdad belkacem, department of electrical engineering, university of biskra, algeria
e-mail: belkacem.mahdad@univ-biskra.dz

list of abbreviations
iaft: improved ali baba and the forty thieves
rdn: radial distribution network
tpl: total power losses
tvd: total voltage deviation
lc: loading capacity
pv: photovoltaic systems
dstatcom: distributed static compensator
sc: shunt compensator
dg: distributed generation
facts: flexible ac transmission systems
cb: capacitor bank
gwo: grey wolf optimizer
abc: artificial bee colony
aca: ant colony algorithm
iwho: improved wild horse optimization algorithm
sd: standard deviation
bsoa: backtracking search optimization algorithm
simbo-q: swine influenza model-based optimization with quarantine
hho: harris hawks optimization algorithms
moihho: multi-objective improved harris hawks optimization algorithms
ieo: improved equilibrium optimizer
pm: power management
lms: loading margin stability
fwa: fireworks algorithm
bfoa: bacterial foraging optimization algorithm
hsa: harmony search algorithm
tm: taguchi method
ga/pso: genetic algorithm/particle swarm optimization
wca: water cycle algorithm
tsa: tabu search algorithm
itsa: improved tabu search algorithm
egwa: enhanced grey wolf algorithm
mrfa: manta ray foraging algorithm
jfsa: jellyfish search algorithm
mc: margin capacity
tlbo: teaching-learning based optimization
qosimbo-q: quasi-oppositional swine influenza model-based optimization with quarantine
ihho: improved harris hawks optimization algorithms
fuzzy-ias: fuzzy and artificial immune system

1. introduction
for economic reasons, the radial distribution network (rdn) is operated with a simple topology; as a result, the quality of the energy delivered to consumers is strongly affected, which requires urgent measures and additional costs to meet the desired objectives. currently, with the wide deployment of various types of renewable sources such as wind and photovoltaic (pv) energy, the rdn is becoming more flexible to operate in terms of improving energy quality and reducing investment cost and emissions.
however, the intermittent nature of this energy is the main drawback affecting the energy quality delivered to consumers. recently, many smart management strategies based on the adaptation of novel metaheuristic methods have been proposed for the integration of various types of renewable sources to improve the performance of modern rdn [1]. the power management strategies developed so far aim to find optimal solutions to the following technical and economic problems: what are the best locations and sizes of the various types of dg units; how to find the best locations of conventional capacitor banks (cb) and facts-based shunt compensators (sc) [1]; how to optimally coordinate the active power of the various dg units and the reactive power of the shunt compensators [2]; how to optimize several objective functions individually and simultaneously; and finally how to design the optimal reconfiguration of the rdn under normal and abnormal situations in the presence of multiple dgs and sc in order to reduce the total power loss (tpl), improve the total voltage deviation (tvd), reduce emissions, and lower the total investment cost of modern rdn. a detailed statistical review of the large number of metaheuristic methods introduced in the recent literature reveals that the success of the majority of these methods depends on the structure of the diversification and intensification mechanism. dynamic interaction between exploration and exploitation during the search process allows the algorithm to solve various complex optimization problems accurately [1, 2].
among the many strategies based on recent metaheuristic methods that have been applied successfully to improve the technical and economic performance of rdn, the authors in [3] proposed a hybrid technique combining an analytical method and metaheuristic optimization techniques to solve the optimal location of capacitor banks and improve the performance of various rdn. in [4], a water cycle algorithm is adapted and applied to solve the location and sizing of capacitor banks and dgs in rdn. in [5], an efficient jellyfish search algorithm is successfully applied to the power management of rdn, covering the location and coordination of facts-based shunt compensators and dgs as well as the reconfiguration operation, in order to improve the power quality delivered to consumers through a better voltage deviation and a reduced tpl. in [6], three novel metaheuristic methods, namely the grey wolf optimizer, the dragonfly algorithm and the moth-flame optimization algorithm, are applied to solve the optimal location and sizing of multiple dgs and cbs in rdn. in [7], a spring search algorithm is applied to solve the optimal integration of capacitor banks and various dgs; several objective functions are treated to raise the rdn performance. in [8], a hybrid method combining the ga and the pso algorithm is proposed for the optimal setting and sizing of multiple dg units; the multi-objective problems are transformed into a single objective function by employing fuzzy optimal theory. in [9], a combined technique based on a genetic algorithm and mathematical optimization is presented to improve the operating cost and reduce the tpl; the proposed hybrid method is validated on three test rdn (10-bus, 33-bus and 69-bus). in [10], the artificial bee colony (abc) method is investigated for the optimal location of dgs considering the operation cost and tpl in rdn.
in [11], a probabilistic technique based on pso is proposed for the optimal allocation of facts-based dstatcom devices in coordination with renewable sources such as wind turbines and solar photovoltaic (pv) units to enhance the rdn. in [12], an approach based on the ant colony algorithm (aca) is presented for the optimal location of dgs to reduce tpl and improve the voltage profile of loads. in [13], a novel quasi-oppositional chaotic harris hawks optimization (qochho) algorithm is adapted to solve the optimal siting and sizing of distributed generation (dg) installed in the 33-bus and the practical brazil 136-bus radial distribution networks (rdn), considering different types of load models at three load levels. in [14], an improved wild horse optimization algorithm (iwho) is proposed to improve the reliability of various rdn test systems: the 33-bus, the 69-bus and the 119-bus. in [15], a new branch-oriented approach based on circuit theory is presented for loss allocation in rdn considering different load models and dg units. in [16], an improved equilibrium optimizer (ieo) is designed for selecting the suitable location and the most effective size of pv-based dgs in a practical rdn. owing to the robust characteristic and fast response of the statcom device in regulating the voltage magnitude, in particular in critical situations such as severe faults, this device has also been investigated by researchers to improve the loadability of multi-machine systems based on the imperialist competitive algorithm [17] and the cuckoo search algorithm [18]. in [19], the ant lion algorithm is applied for the optimal allocation and sizing of various dgs based on renewable sources. in [20], an efficient reactive power management strategy based on a modern metaheuristic algorithm is proposed for the reduction of the tpl in rdn. in [21], a new optimization variant, namely a novel opposition-based tuned-chaotic differential evolution technique, is designed to improve the techno-economic aspect of the optimal placement of dgs in rdn.
in [22], an enhanced equilibrium optimizer (eeo) is applied for the optimal planning of pv-bes units in rdn considering time-varying demand. in [23], a parallel slime mould algorithm (psma) is proposed for the optimal reconfiguration of rdn in coordination with dg integration. in [24], a hybrid genetic dragonfly algorithm (hgada) is proposed and applied for the optimal allocation of dgs to improve the technical performance of rdn. in [25], a planning strategy based on an improved grey wolf optimizer (igwo) and loss sensitivity (ls) is proposed to improve the integration of dgs in rdn. in [26], an improved coyote optimization algorithm (icoa) is proposed for optimally installing solar photovoltaic sources in rdn. in [27], a single and multi-objective technique based on an improved harris hawks optimizer (ihho) is applied for the optimal location and sizing of multiple dgs. in [28], an improved metaheuristic method is proposed to maximize the penetration level of multiple dgs in rdn. in [29], a novel hybrid technique is proposed to solve the multi-objective problem related to the integration of multiple cbs and multiple dgs in rdn. recently, the authors in [30] developed a novel optimizer named ali baba and the forty thieves (aft). the efficiency of this technique was validated on many modal and multi benchmark functions [30]. the results confirmed the particularity of this technique and its ability to solve complex optimization problems. to the best of our knowledge, there is no application of this technique to practical problems related to power system operation and control. moreover, it is found that the two proposed critical values of the standard algorithm, which are responsible for balancing exploration and exploitation, are not general and depend on the problem to be solved. the main contributions of this paper compared to the existing literature are summarized as follows:
1. a novel variant based on aft is proposed and successfully applied to solve the power management of practical rdn.
2. the modification introduced in the standard aft algorithm allows the search mechanism to be more flexible and interactive in locating the global solution.
3. the active power of multiple dg units and the reactive power of multiple dstatcom devices are optimized in coordination to improve the performance of two standard rdn, the 33-bus and the 69-bus.
4. the results obtained, compared with those of many recent metaheuristic methods, demonstrate the efficiency of the proposed iaft in solving the optimal power management of various rdn.

2. formulation of the energy management problem
the strategy of power management (pm) consists in improving the performance of modern rdn by optimizing, individually or simultaneously, several objective functions formulated as follows:

2.1. tpl improvement
the objective function associated with the minimization of tpl is expressed as follows:
$$ obj\_1 = \min(tpl) = \min \left( \sum_{k=1}^{nbr} p_{loss}(k, k+1) \right) \qquad (1) $$
where the active and reactive power losses in the lines are given by the following expressions:
$$ p_{loss}(k, k+1) = r_{k,k+1} \left( \frac{p_{k,k+1}^2 + q_{k,k+1}^2}{u_k^2} \right) \qquad (2) $$
$$ q_{loss}(k, k+1) = x_{k,k+1} \left( \frac{p_{k,k+1}^2 + q_{k,k+1}^2}{u_k^2} \right) \qquad (3) $$

2.2. improvement of loading capacity
the margin capacity (mc), also known as the loading margin stability (lms), of the rdn reflects the capability of the network to deliver energy quality under severe situations such as faults and load growth. delivering power quality to consumers under such critical situations is a challenge for experts. in such situations, it is mandatory to dispatch optimally the reactive power delivered by the substation and the reactive power to be injected or absorbed by the distributed statcom devices.
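as an aside on eqs. (1)-(3) of section 2.1, the branch-loss terms can be evaluated with a few lines of python. this is a minimal sketch under stated assumptions: all quantities are taken in per unit, the function names are hypothetical, and the branch values in the usage note are made up for illustration and do not come from the 33-bus or 69-bus data.

```python
def branch_losses(p, q, u, r, x):
    """Active and reactive loss of one branch k -> k+1, following eqs. (2)-(3).

    p, q: active/reactive power flowing through the branch
    u:    voltage magnitude at the sending bus k
    r, x: branch resistance and reactance
    """
    squared_current = (p * p + q * q) / (u * u)
    return r * squared_current, x * squared_current


def total_power_loss(branches):
    """Objective (1): sum of the active losses over all branches."""
    return sum(branch_losses(*branch)[0] for branch in branches)


# Two illustrative branches given as (p, q, u, r, x) in per unit.
example_branches = [
    (0.5, 0.3, 1.00, 0.02, 0.01),
    (0.2, 0.1, 0.98, 0.03, 0.02),
]
tpl = total_power_loss(example_branches)
```

in an actual study, the branch flows and bus voltages would come from a power flow solution evaluated for each candidate set of dg and dstatcom set-points, and `total_power_loss` would be the value returned to the optimizer.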
the lower the reactive power delivered by the principal transformer, the better the mc of the rdn. the objective function related to the mc is expressed as follows:
$$ obj\_2 = \max(mc) \qquad (4) $$
where mc is the margin capacity of the rdn.

2.3. minimization of tvd
the mathematical expression associated with the minimization of the normalized tvd is formulated as follows:
$$ obj\_3 = \min(tvd) = \min \left( \sum_{i=1}^{npq} \left| v_{des} - v_i \right| \right) \qquad (5) $$
where $v_{des}$ is the permissible voltage magnitude, $v_i$ is the voltage magnitude at load bus $i$, and $npq$ is the number of load buses.

2.4. improvement of the tvd and the tpl
the tpl and the tvd may be two conflicting objective functions. for practical planning and operation of rdn, it is mandatory to find a balance between tpl and tvd to ensure efficient power quality. this multi-objective problem may be solved using the following mathematical expression:
$$ obj\_4 = \min(tvd, tpl) = \min \left( \alpha \, tpl + (1 - \alpha) \, tvd \right) \qquad (6) $$
where $\alpha$ is a balancing coefficient introduced to find the compromise solution between tpl and tvd. the two weighting coefficients, $\alpha$ and $(1 - \alpha)$, are selected in the range [0, 1].

2.5. operation constraints management

2.5.1. active and reactive power balance
to ensure reliable operation of the rdn under normal and abnormal conditions, the following equality constraints must be satisfied:
$$ p_{tr,slack} + \sum_{i=1}^{ndg} p_{dg,i} = \sum_{i=1}^{nl} p_{d,i} + \sum_{k=1}^{nbr} p_{loss,k} \qquad (7) $$
$$ q_{tr,slack} + \sum_{i=1}^{ndg} q_{dg,i} = \sum_{i=1}^{nl} q_{d,i} + \sum_{k=1}^{nbr} q_{loss,k} \qquad (8) $$

2.5.2. security constraints
the security constraints consist of inequality constraints associated with the secure operation of all elements of the rdn.
▪ voltage constraint: the voltage magnitude is an important index of power quality. to satisfy consumers, the voltage magnitude must remain within its security limits.
$$ v_i^{min} \le v_i \le v_i^{max}, \quad i = 1, 2, \ldots, nbus \qquad (9) $$
▪ dg constraints: the active power delivered by the dg units, which is considered as a control variable, must be kept within specified security limits.
$$ p_{dg,i}^{min} \le p_{dg,i} \le p_{dg,i}^{max}, \quad i = 1, 2, \ldots, ndg \qquad (10) $$
▪ level of dg integration: due to the stochastic and intermittent nature of the various types of dgs, the active power delivered by dgs such as pv and wind sources must be dispatched within its security range. the penetration level ($\mu$) to be satisfied is introduced through the following inequality constraint:
$$ \sum_{i=1}^{ndg} p_{dg,i} \le \mu \sum_{j=1}^{nl} p_{d,j} \qquad (11) $$
where $\mu$ is the level of active power penetration in the rdn.
▪ dstatcom constraints: the dstatcom device must be operated within its admissible reactive power limits.
$$ q_{stc,i}^{min} \le q_{stc,i} \le q_{stc,i}^{max} \qquad (12) $$
▪ line current transit in branches: the currents flowing in the lines must not exceed their permissible values.
$$ i_{li} \le i_{li}^{max}, \quad i = 1, \ldots, nl \qquad (13) $$

3. modeling of dstatcom device
the dstatcom device is a shunt compensator from the facts family designed principally to regulate the voltage magnitude at a specified bus.

fig. 1 model of dstatcom device

compared to the capacitor bank (cb) and to svc devices, the dstatcom controller has a robust characteristic capable of regulating the voltage magnitude in critical situations. the dstatcom devices can regulate the voltages by injecting reactive power into, or absorbing it from, the network. fig. 1 shows the basic structure of the dstatcom device.

3. distributed generation
for practical installation, as shown in fig. 2, the dg units are classified into three categories:
category 1: this category includes all types of dg units which can only exchange active power with the network, such as the pv sources which have been intensively integrated in many practical electrical networks worldwide.
category 2: this category includes dg units which can exchange active power with the network and absorb reactive power.
wind-based renewable sources are also integrated in various electrical networks.
category 3: this category includes all the dg units which can exchange active power with the network and absorb or inject reactive power. these dgs are efficient, since they allow the active and reactive powers exchanged with the network to be controlled simultaneously.
in this study, an alternative solution is proposed to relieve the main drawback of the pv sources by installing facts-based shunt compensators such as the dstatcom to ensure flexible control of reactive power in coordination with the active power.

fig. 2 categories of dgs units: a) dgs with only active power control, b) dgs with active power control and only reactive power absorption, c) dgs with active and total reactive power control

4. ali baba and the forty thieves optimizer
the aft mimics human intelligence and interactions in finding the best food sources, materials and treasures. the algorithm is inspired in particular by the famous tale of ali baba and the forty thieves. the following key points summarize the main strategy of the aft [30]:
▪ in the tale of ali baba, the thieves try to find the location of ali baba, so the thieves are the individuals in the search space (environment).
▪ the home of ali baba is the objective function to achieve.
▪ ali baba's location is considered as the global solution.
▪ the forty thieves search as an interactive group; they travel from an initial location and try to find the best location, which is the house of ali baba.
▪ marjaneh is considered as an intelligent operator designed to deliver astute ways to protect ali baba.

4.1. modeling of aft optimizer
▪ initial positions of n individuals are generated randomly in the search space characterized by d dimensions:
bus i p b) dg +q bus i p c) dg +q bus i p a) pv pmin pmax pmax, qmax pmin pmin, qmin pmax, qmax voltage control optimal power manegement of dgs and dstatcom using improved ali baba and the fourty ... 61                 = n d nn d d xxx xxx xxx x .. ..... ..... .. .. 21 22 2 2 1 11 2 1 1 (14) i jx denotes the jth dimension of the ith thief (individual), ( ) i j j j x lb rand ub lb= + − (15) x i, is the position of the ith individual in the search space, ubj, lbj denotes the upper and lower bunds in the jth dimension, ▪ initialize randomly the wit level of marjaneh as follow:                 = n d nn d d mmm mmm mmm m .. ..... ..... .. .. 21 22 2 2 1 11 2 1 1 (16) fitness evaluation: the values of control variables are evaluated during search process based on each thief’s position using the following matrix form. (   ) (   ) (   )                = n d nn n d d xxxf xxxf xxxf f .. ..... ..... .. .. 21 22 2 2 12 11 2 1 11 (17) update locations of thieves: the new locations of thieves can be updated using the following expression: ( ) 1 1 2 3 4 ( ) ( ) sgn ( 0.5) 0.5, a ii i i i it it it it it it it it it x gbest td best y r td y m r rand r r pp +  = + − + − −     (18) where; i itx 1+ denotes the position of the ith thieve at iteration (it+1), i ity is the position of the ali baba at iteration it, tdit is the tracking distance of the thieves at iteration it, ppit is the perception potential of the thieves at iteration it, and ( )a i it m denotes the marjaneh’s intelligence level, the parameter a is defined as: [( 1) ( 1)]a n rand n= − − (19) 62 b. mahdad the tracking distance and the perception potential are formulated as follow: ( ) max1 0 it it etd it − = (20) 0 max0 log ( ) bit itit pp b= 21) update marjaneh astute plane using the following expressions: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) i i i i it it it it i i i it it it x if f x f m m m if f x f m       =   (22) where, f (.) 
denotes the score of the fitness function.

the key steps of the standard aft algorithm [30]

input the setting variables of aft: pop_size, iter_max, trial_max, dim, ub_j, lb_j
randomly generate the initial position, x, of all individuals (thieves) in the search space
initialize the best position (best_it^i) and the global best position (gbest_it) for all individuals
initialize the intelligence degree of marjaneh with respect to all individuals
evaluate the position of all individuals using the appropriate fitness function f(x)
set it ← 1
while (it < it_max) do
    calculate td_it using eq. 20
    calculate pp_it using eq. 21
    for i = 1, 2, ..., n do
        if (rand ≥ 0.5) then
            if (rand > pp_it) then
                update x_{it+1}^i using eq. 18
            else
                update x_{it+1}^i using eq. 15
            end if
        else
            update x_{it+1}^i using eq. 18
        end if
    end for
    for i = 1, 2, ..., n do
        check the feasibility of the new position
        evaluate and update the new position of the individuals (thieves)
        update the solutions best_it^i and gbest_it
        update m_it^{a(i)} using eq. 22
    end for
    it = it + 1
end while

4.2. proposed variant

the main contribution of the proposed variant is its ability to ensure interaction between the exploration phase and the exploitation phase during the search process. the following modifications are introduced to improve the performance of the original algorithm:

the first modification: the standard aft is governed by various parameters that must be carefully identified and adjusted to reach the near global solution. among these parameters are the tracking distance (td) and the perception potential (pp). the following modeling expressions are suggested to create a flexible balance between diversification and intensification. fig. 3 shows the evolution of the proposed tracking distance (td) and perception potential (pp) during the search process.

fig.
3 evolution of the proposed tracking distance (td) and perception potential (pp) during search process

4.3. analysis methodology

the following steps summarize the iaft based analysis methodology designed to solve various single and multi objective functions:
1. read the technical data of the rdn, such as the line data and load data.
2. select and specify the objective function.
3. introduce the initial parameters of the iaft, such as the population size, the maximum number of generations and the number of trials.
4. run the power flow tool to determine the initial state of the rdn in terms of total power loss, lowest voltage magnitude and maximum current transit in the lines.
5. select preliminary buses to install dgs and dstatcom devices based on the sensitivity power index.
6. run iaft to minimize the objective function.
7. save the best solution.
8. check the convergence condition based on gmax and tmax.
9. return the optimized solutions, such as the best active powers of the dgs, the reactive powers of the dstatcom devices, and the voltage profiles.

5. statistical results analysis

5.1. test 1: rdn 33-bus

this first rdn consists of 32 lines and 33 buses; the active and reactive powers of the loads to satisfy are 3.715 mw and 2.300 mvar, respectively [2, 31]. fig. 4 shows the standard topology of the rdn 33-bus. the performance of the proposed optimizer, namely iaft, is demonstrated on the following test cases.

fig. 4 the topology of the rdn 33-bus

5.1.1. case 1: tpl improvement based dstatcom under normal condition

this test case shows the impact of integrating only three statcom devices on the performance of the rdn 33-bus. three efficient locations are considered, at buses 14, 24 and 30. the maximum size of each statcom device is 1 mvar. by considering the voltage limits of all pq buses in the range [0.95 1.05] p.u, the optimized tpl found using the proposed iaft is 126.5868 kw, and by considering voltage limits in the range [0.9 1] p.u, the optimal tpl achieved is 132.2102 kw.
detailed optimized results related to the decision variables of this case are shown in table 1. the results of this case are compared to various metaheuristic methods such as psga, gsa, sa, ip, fpa, mfo, gwo, dfo, pso, and the water cycle algorithm (wsa); the comparison shows that the proposed iaft achieves better solution quality. the lowest voltage magnitude is 0.95 p.u reported at bus 18. the convergence behaviour of tpl is shown in fig. 5; it is important to mention that only 5 trials are sufficient to locate the best solution.

fig. 5 convergence characteristic of tpl minimization: case 1

table 1 optimized decision variables based three statcom: case 1: rdn 33-bus
methods [3, 5] | limits of v (p.u) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw) | tdv (p.u)
psga | [0.95 1.05] | 6 28 29 | 1200 760 200 | 0.9463 | 135.4 | -
gsa | [0.95 1.05] | 13 15 26 | 450 800 350 | 0.9672 | 134.5 | -
sa | [0.95 1.05] | 10 14 30 | 450 900 350 | 0.9591 | 151.75 | -
ip | [0.95 1.05] | 9 29 30 | 450 800 900 | 0.9501 | 171.78 | -
fpa | [0.95 1.05] | 6 9 30 | 250 400 950 | 0.9365 | 171.78 | -
mfo | [0.95 1.05] | 8 13 30 | 450 300 900 | 0.9400 | 134.0725 | -
gwo | [0.95 1.05] | 8 13 30 | 450 300 900 | 0.9400 bus 18 | 134.0725 | -
dfo | [0.95 1.05] | 8 13 30 | 450 300 900 | 0.9400 bus 18 | 134.0725 | -
pso | [0.95 1.05] | 8 13 30 | 450 300 900 | 0.9400 bus 18 | 134.0725 | -
wsa | [0.95 1.05] | 14 24 30 | 397.3 451.1 1000 | 0.951 bus 18 | 130.912a | -
proposed iaft | [0.9 1] | 14 24 30 | 361.10 547.20 1043.7 | 0.9389 bus 18 | 132.2102 | -
proposed iaft | [0.95 1.05] | 14 24 30 | 358.70 541.90 1036.3 | 0.9601 bus 18 | 126.5868 | 0.5129

5.1.2. case 2: tpl improvement based three dgs units under normal condition

the main objective of this second test case is to show the impact of integrating only three dgs units without considering the reactive power support of shunt compensators based dstatcom devices. the maximum size of each dg unit is 2 mw.
it is found that by integrating three dgs at buses 14, 24 and 30, the tpl is reduced to a competitive value of 70.6725 kw when the voltage magnitude at all pq buses is kept in the range [0.95 1.05] p.u, and by considering the limits of voltages at pq buses in the range [0.9 1] p.u, the optimized tpl becomes 71.4572 kw. detailed optimized results of this case are shown in table 2. the obtained results are compared to various competitive metaheuristic methods such as fwa, bfoa, hsa, tm, ga/pso, pso, ga, and the water cycle algorithm (wca); the comparison shows that the proposed iaft achieves a better solution with a competitive number of iterations and trials. the lowest voltage magnitude is 0.95 reported at bus 18. the convergence behaviours of tpl are shown in figs 6-7.

fig. 6 convergence characteristic of tpl minimization: case 2, vϵ [0.9 1] p.u

fig. 7 convergence characteristic of tpl minimization: case 2, vϵ [0.9 1] p.u
table 2 optimized decision variables based three dgs: case 2: rdn 33-bus
methods [3, 5, 27] | location of dgs | dgs size (kw) | min voltage (p.u) | tpl (kw) | tdv (p.u)
fwa | 14 18 32 | 589.70 189.00 1014.6 | 0.968 | 88.68 | -
bfoa | 17 18 33 | 633.00 90.00 947.00 | 0.964 | 98.3 | -
hsa | 17 18 33 | 572.4 107.0 1046.2 | 0.967 bus 29 | 96.76 | -
tm | 15 25 33 | 587.6 195.9 783.0 | 0.958 bus 30 | 91.305 | -
ga/pso | 11 16 32 | 925.0 863.0 1200 | 0.980 bus 25 | 103.4 | -
pso | 8 13 32 | 1176.8 981.60 829.70 | 0.980 bus 30 | 105.35 | -
ga | 11 29 30 | 1500.0 422.8 1070.0 | 0.981 bus 25 | 106.3 | -
wsa | 14 24 30 | 854.60 1101.7 1181.0 | 0.973 bus 33 | 71.052 | -
lsf | 18 33 25 | 720 810 900 | - | 85.07 | -
fuzzy-ias | 32 30 31 | 2071 1113.8 150.3 | - | 117.36 | -
bsoa | 13 28 31 | 632 486 550 | - | 89.05 | -
bfoa | 14 25 30 | 779 880 1083 | - | 73.53 | -
tlbo | 10 24 31 | 824.6 1031.1 886.2 | - | 75.54 | -
qotlbo | 12 24 29 | 880.8 1059.2 1071.4 | - | 74.10 | -
simbo-q | 14 24 29 | 763.8 1041.5 1135.2 | - | 73.4 | -
qsimbo-q | 14 24 30 | 770.8 1096.5 1065.5 | - | 72.8 | -
hho | 14 24 30 | 745.69 1022.69 1135.78 | - | 72.98 | -
ihho | 14 24 30 | 775.54 1080.83 1066.69 | - | 72.79 | -
proposed iaft | 14 24 30 | 754.00 1099.7 1071.4 | 0.9687 bus 33 | 71.4572 | 0.5872
proposed iaft | 14 24 30 | 748.90 884.60 1072.3 | 0.9771 bus 33 | 70.6726 | 0.3224

5.1.3. case 3: tpl improvement based dgs units and dstatcom under normal condition

in this test case, three dgs and three dstatcom are integrated at buses 14, 24 and 30. the proposed iaft is designed to optimize the active powers of the dgs and the reactive powers of the dstatcom to be exchanged with the electric network. the optimal tpl achieved is 11.60 kw, which is significantly improved compared to the last two cases and also compared to the results achieved using many recent methods such as bfoa, wca, tsa, itsa, egwa, mrfa, and jfsa. details of the optimized control variables are given in table 3. the convergence characteristics for tpl minimization under two levels of penetration of dgs (76.72 % and 74.47 %) are shown in figs 8-9, respectively; the lowest voltage magnitude is reported at bus 8.

fig.
8 convergence behavior of tpl improvement considering 3 dgs and 3 dstatcom devices: penetration level=76.72 %: rdn 33-bus

fig. 9 convergence behavior of tpl minimization considering 3 dgs and 3 dstatcom devices, penetration level= 74.47 %: rdn 33-bus

table 3 optimized decision variables based three statcom and three dgs units: case 3: rdn 33-bus
methods [4] | dgs penetration level % | limits of v (p.u) | location of dgs | dgs size (kw) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw) | tdv (p.u)
bfoa | 42.98 | [0.95 1.05] | 17 18 33 | 542 160 895 | 18 33 30 | 163 338 541 | 0.9783 | 41.41 | -
wca | 68.61 | [0.95 1.05] | 25 29 11 | 973 1040 536 | 23 30 14 | 465 565 535 | 0.98 | 24.68 | -
tsa | 71.57 | [0.95 1.05] | 24 30 12 | 766 917 976 | 30 11 24 | 1060 246 566 | nr | 15.0 | -
itsa | 70.39 | [0.95 1.05] | 13 25 30 | 788 742 1085 | 7 15 30 | 603 269 834 | nr | 14.4 | -
egwa | 76.096 | [0.95 1.05] | 24 14 30 | 1094.96 767.74 964.22 | 25 14 30 | 388.75 334.77 1189.91 | 0.9924 | 12.7 | -
mrfa | 78.5 | [0.95 1.05] | 13 24 30 | 803 1073 1040 | 14 24 30 | 300 600 900 | 0.992 | 12.572 | -
jfsa | 77.6 | [0.95 1.05] | 14 24 30 | 748 1079 1056 | 14 24 30 | 300 600 900 | 0.992 | 12.40 | -
proposed iaft | 76.72 | [0.9 1] | 14 24 30 | 743.9 1066.7 1039.7 | 14 24 30 | 348.2 510.8 1014.6 | bus 8 | 11.60 | 0.1284
proposed iaft | 74.47 | [0.9 1] | 14 24 30 | 771.7 999.6 995.4 | 14 24 30 | 314.7 610.5 1076.5 | bus 8 | 12.01653 | 0.1287
mc: 1

5.1.4. case 4: improvement tpl and margin capacity based dgs and dstatcom devices

this test case is dedicated to improving the technical performance of the rdn under a critical situation at the loading margin stability. the tpl is optimized in coordination with the lc. for fair comparison with the third test case, three dstatcom and three dgs units are integrated at three optimal locations (14-24-30). the optimized tpl and the mc achieved are 71.311 kw and 2.2 p.u, respectively; the corresponding voltage deviation becomes 0.2889 p.u. the minimum voltage magnitude obtained is 0.9812 p.u reported at bus 8.
table 4 shows the values of optimized decision variables such as the active powers of dgs units and the reactive powers delivered by the statcom devices. the convergence behavior of tpl under loading margin stability is shown in fig. 10.

table 4 optimized decision variables obtained by optimizing the tpl and mc of the rdn 33-bus
methods | dgs penetration level % | limits of v (p.u) | location of dgs | dgs size (kw) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw) | tdv (p.u)
proposed iaft | - | [0.9 1] | 14 24 30 | 1786.10 1999.90 1999.90 | 14 24 30 | 607.30 1666.70 2661.20 | 0.9812 bus 8 | 71.3110 | 0.2889
mc: 2.2

fig. 10 convergence behaviour of tpl improvement under mc maximization considering 3 dgs and 3 dstatcom devices

5.2. test 2: rdn 69-bus

the proposed iaft is also validated on a medium rdn, the 69-bus. all data of this second test system are given in [2, 31]. this second test system consists of 69 buses and 68 branches, at 12.66 kv; the total apparent power to deliver to the loads is (3.8+j2.69) mva. the operating state of this test system at normal condition, without integration of compensators and without installation of dgs, is: total power loss 224.95 kw and lowest voltage magnitude 0.9092 p.u reported at bus 65. the one line representation of the rdn 69-bus is shown in fig 11.

fig. 11 topology of the rdn 69-bus

5.2.1. case 5

for fair comparison with other recent techniques, this test case is focused on minimizing the tpl at normal condition. three dgs and three statcom devices are optimally integrated at efficient locations (bus 11, bus 18 and bus 61). the sizes of dgs and statcom devices are 2 mw and 1.5 mvar, respectively. the obtained optimized variables, such as the active powers of the dgs and the reactive powers of the three statcom devices, are recapitulated in table 5.
the best tpl achieved using iaft is 4.2693 kw, which is better than several recent techniques such as tsa, sma, cso, itsa, and jfsa. the convergence behaviour of the iaft for tpl minimization is shown in fig 12, and the distribution of the voltage profile is shown in fig 13. the comparison in table 5 shows that the proposed variant, namely iaft, achieves the best solution quality, at a reduced time. for this test system, the population size is 10, and the maximum number of iterations is 40.

table 5 comparison of optimized decision variables obtained using iaft and other techniques: case 5: rdn 69-bus
methods [5] | dgs penetration level % | limits of v (p.u) | location of dgs | dgs size (kw) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw)
tsa | 65.97 | [0.95 1.05] | 9 16 61 | 452 555 1500 | 21 53 61 | 299 605 1148 | - | 6.9
sma | 58.78 | [0.95 1.05] | 16 30 61 | 497 112 1625 | 2 13 61 | 708 623 1091 | - | 9.0053
cso | 67.42 | [0.95 1.05] | 17 71 67 | 535 1728 299 | 61 67 68 | 1367 311 323 | - | 7.5488
itsa | 60.05 | [0.95 1.05] | 10 12 61 | 291 491 1500 | 9 23 61 | 288 292 1149 | 0.9944 | 6.8012
jfsa | 67.05 | [0.95 1.05] | 11 18 61 | 495 379 1674 | 18 51 61 | 300 300 1200 | 0.994 | 4.6826
proposed iaft | 67.15 | [0.95 1.05] | 11 18 61 | 498.8 379.3 1673.9 | 11 18 61 | 365.7 249.8 1196.3 | 0.9943 bus 50 | 4.2693

fig. 12 convergence behaviour of tpl improvement considering 3 dgs and 3 dstatcom devices: rdn 69-bus

fig. 13 voltage profile after integration of three dgs and three dstatcom: rdn 69-bus

5.3. statistical analysis

the performance of the proposed variant is demonstrated through a statistical analysis. the mean, the max and the standard deviation (sd) are three well known statistical indexes widely used to identify the advantages and the drawbacks of many metaheuristic optimizers. for the first analysis, five trials and ten trials are carried out.
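the three indexes named above can be computed from the best tpl of each trial as in the following minimal python sketch. the function name `summarize_trials` and the sample values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import mean, stdev


def summarize_trials(tpl_values):
    """return min, mean, max and sd of the best tpl found in each trial."""
    return {
        "min": min(tpl_values),
        "mean": mean(tpl_values),
        "max": max(tpl_values),
        # assumption: sample standard deviation (n - 1 in the denominator)
        "sd": stdev(tpl_values),
    }


# hypothetical tpl values (kw) for three trials, for illustration only
summary = summarize_trials([1.0, 2.0, 3.0])
print(summary)
```

a small sd relative to the mean indicates that independent trials converge to nearly the same tpl, which is the robustness property this analysis targets.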
for all accomplished test cases, the maximum number of iterations is fixed to 40 and the population size is taken as 10. the sd achieved for 10 trials is 4.3546e-6, which is remarkably better than the sd associated with a recent metaheuristic, namely jfsa (0.7146). fig. 14 shows the convergence characteristics of tpl achieved for 10 trials, while the evolution of the optimized tpl value for 10 trials and 5 trials is shown in figs. 15-16, respectively. the global solution is reached within a reduced number of trials. table 6 gives the statistical values achieved using the proposed variant, namely iaft.

fig. 14 convergence behavior of tpl minimization for 10 trials; pop_size=10

fig. 15 values of optimized tpl for 10 trials: pop_size=10

table 6 robustness evaluation of optimized results: case 1: rdn 33-bus scenario 1: 3 dstatcom and 3 dgs
methods | pop_size | max_it | limits of v (p.u) | min_tpl (kw) | mean_tpl (kw) | max_tpl (kw) | std | trials
jfsa | - | - | [0.95 1.05] | 12.4002 | 13.1092 | 15.1889 | 0.7146 | -
proposed iaft | 5 | 40 | [0.9 1] | 11.6366 | 12.0971 | 13.9252 | 0.001 | 5
proposed iaft | 10 | 40 | [0.9 1] | 11.63636 | 11.6386 | 11.64734 | 4.3546e-06 | 10
base case: 316.2
mc: 1

fig. 16 convergence behavior of tpl for 5 trials: pop_size=10

6. conclusion

in this study, a new variant, namely iaft, is successfully adapted and applied to solve with accuracy the optimal location and setting of multiple dgs units and multiple shunt compensators based dstatcom devices. to improve the technical performance of the rdn, two objective functions are optimized: the tpl and the loading margin capacity of the rdn. the tpl decreased to a competitive value of 11.6473 kw when considering both three dgs and three dstatcom devices. the loading capacity of the rdn 33-bus is optimized to 2.2 without affecting the operation constraints.
furthermore, the particularity of the proposed strategy is also demonstrated in optimizing the active and reactive powers of dgs and dstatcom devices considering the uncertainties in loads for 24 hours. the proposed iaft based power management strategy generally gives better results in terms of solution quality and convergence behavior compared to many recent optimization algorithms. a statistical analysis demonstrated that for the rdn 33-bus, only 5 trials are sufficient to locate the near global solution; as a result, the average execution time required is reduced to a competitive value. the proposed metaheuristic variant, namely iaft, may be considered as a competitive optimizer tool to solve various power management problems of large rdn. in future work, the proposed iaft based optimizer tool will be adapted to solve the stochastic multi objective power management problem considering various types of facts devices and renewable sources.

references
[1] b. mahdad, "a novel tree seed algorithm for optimal reactive power planning and reconfiguration based statcom devices and pv sources", sn applied sciences, vol. 3, id. 336, 2021.
[2] b. mahdad, "novel adaptive sine cosine arithmetic optimization algorithm for optimal automation control of dg units and statcom devices", smart science, 2022.
[3] a. selim, s. kamel, f. jurado, "capacitors allocation in distribution systems using a hybrid formulation based on analytical and two metaheuristic optimization techniques", computers and electrical engineering, vol. 85, 106675, 2020.
[4] a. a. a. el-ela, r. a. el-sehiemy, and a. s. abbas, "optimal placement and sizing of distributed generation and capacitor banks in distribution systems using water cycle algorithm", ieee systems journal, pp. 1-8, 2018.
[5] a. m. shaheen, a. m. elsayed, a. r. ginidi, e. e.
elattar, "effective automation of distribution systems with joint integration of dgs/ svcs considering reconfiguration capability by jellyfish search algorithm", ieee access, vol. 9, pp. 92053-92069, 2021.
[6] a. a. z. diab, h. rezk, "optimal sizing and placement of capacitors in radial distribution systems based on grey wolf, dragonfly and moth–flame optimization algorithms", iran j sci technol trans electr eng, vol. 43, pp. 77-96, 2019.
[7] m. dehghani, z. montazeri, o. p. malik, "optimal sizing and placement of capacitor banks and distributed generation in distribution systems using spring search algorithm", international journal of emerging electric power systems, id. 20190217, 2020.
[8] m. h. moradi, m. abedini, "a combination of genetic algorithm and particle swarm optimization for optimal distributed generation location and sizing in distribution systems with fuzzy optimal theory", international journal of green energy, vol. 9, pp. 641-660, 2012.
[9] f. e. riaño, j. f. cruz, o. d. montoya, h. r. chamorro and l. alvarado-barrios, "reduction of losses and operating costs in distribution networks using a genetic algorithm and mathematical optimization", electronics, vol. 10, no. 4, id. 419, p. 25, 2021.
[10] e. a. al-ammar, k. farzana, a. waqar, m. aamir, saifullah, a. u. haq, m. zahid, m. batool, "abc algorithm based optimal sizing and placement of dgs in distribution networks considering multiple objectives", ain shams engineering journal, vol. 12, pp. 697-708, 2021.
[11] s. rezaeian-marjani, s. galvani, v. talavat, m. farhadi-kangarlu, "optimal allocation of d-statcom in distribution networks including correlated renewable energy sources", electrical power and energy systems, vol. 122, id. 106178, 2020.
[12] a. a. ogunsina, m. o. petinrin, o. o. petinrin, e. n. offornedo, j. o. petinrin, g. o.
asaolu, "optimal distributed generation location and sizing for loss minimization and voltage profile optimization using ant colony algorithm", sn applied sciences, vol. 3, id. 248, 2021.
[13] k. balu, v. mukherjee, "a novel quasi-oppositional chaotic harris hawk’s optimization algorithm for optimal siting and sizing of distributed generation in radial distribution system", neural processing letters, vol. 54, pp. 4051-4121, 2022.
[14] m. h. ali, s. kamel, m. h. hassan, m. tostado-véliz, h. m. zawbaa, "an improved wild horse optimization algorithm for reliability based optimal dg planning of radial distribution networks", energy reports, vol. 8, pp. 582-604, 2022.
[15] a. p. hota, s. mishra, "active power loss allocation in radial distribution networks with different load models and dgs", electric power systems research, vol. 205, id. 107764, 2022.
[16] t. t. nguyen, t. t. nguyen, m. q. duong, "an improved equilibrium optimizer for optimal placement of photovoltaic systems in radial distribution power networks", neural computing and applications, vol. 34, pp. 6119-6148, 2022.
[17] s. m. abd-elazim, e. s. ali, "imperialist competitive algorithm for optimal statcom design in a multimachine power system", electrical power and energy systems, vol. 76, pp. 136-146, 2016.
[18] s. m. abd-elazim, e. s. ali, "optimal location of statcom in multimachine power system for increasing loadability by cuckoo search algorithm", electrical power and energy systems, vol. 80, pp. 240-251, 2016.
[19] e. s. ali, s. m. abd elazim, a. y. abdelaziz, "optimal allocation and sizing of renewable distributed generation using ant lion optimization algorithm", electr eng, vol. 16, no. 1, pp. 445-458, 2016.
[20] t. t. nguyen, k. h. le, t. m. phan, and m. q. duong, "an effective reactive power compensation method and a modern metaheuristic algorithm for loss reduction in distribution power networks", hindawi, complexity, vol. 2021, id. 8346738, p. 21, 2021.
[21] s. kumar, k. k. mandal and n.
chakraborty, "a novel opposition-based tuned-chaotic differential evolution technique for technoeconomic analysis by optimal placement of distributed generation", engineering optimization, vol. 52, no. 2, pp. 303-324, 2019.
[22] a. eid, s. kamel, e. h. houssein, "an enhanced equilibrium optimizer for strategic planning of pv-bes units in radial distribution systems considering time-varying demand", neural computing and applications, vol. 34, pp. 17145-17173, 2022.
[23] h.-j. wang, j.-s. pan, t.-t. nguyen, s. weng, "distribution network reconfiguration with distributed generation based on parallel slime mould algorithm", energy, vol. 244, part b, id. 123011, 2022.
[24] g. v. n. lakshmi, a. jayalaxmi, v. veeramsetty, "optimal placement of distribution generation in radial distribution system using hybrid genetic dragonfly algorithm", technology and economics of smart grids and sustainable energy, vol. 6, id. 9, 2021.
[25] m. sodani, h. h. aly, t. a. little, "optimal planning of distributed generation using improved grey wolf optimizer and combined power loss sensitivity", in proceedings of the 2021 ieee canadian conference on electrical and computer engineering (ccece), 2021, id. 21380949.
[26] t. t. nguyen, t. d. pham, l. c. kien, and l. v. dai, "improved coyote optimization algorithm for optimally installing solar photovoltaic distribution generation units in radial distribution power systems", complexity, vol. 2020, p. 34, id. 1603802, 2020.
[27] a. selim, s. kamel, a. s. alghamdi, and f. jurado, "optimal placement of dgs in distribution system using an improved harris hawks optimizer based on single- and multi-objective approaches", ieee access, vol. 8, pp. 52815-52829, 2020.
[28] k. h. truong, p. nallagownden, i. elamvazuthi, d. n. vo, "an improved meta-heuristic method to maximize the penetration of distributed generation in radial distribution networks", neural computing and applications, vol. 32, no. 1, 2019.
[29] c. venkatesan, r. kannadasan, m. h.
alsharif, m.-k. kim, and j. nebhen, "a novel multiobjective hybrid technique for siting and sizing of distributed generation and capacitor banks in radial distribution systems", sustainability, vol. 13, no. 6, id. 3308, 2021.
[30] m. braik, m. h. ryalat, h. al-zoubi, "a novel meta-heuristic algorithm for solving numerical optimization problems: ali baba and the forty thieves", neural computing and applications, vol. 34, pp. 409-455, 2022.
[31] r. d. zimmerman, c. e. murillo-sanchez, r. j. thomas, "matpower: steady-state operations, planning and analysis tools for power systems research and education", ieee trans power syst, vol. 26, pp. 12-19, 2011.
facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 253-266 https://doi.org/10.2298/fuee2302253m © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

the fredkin gate in reversible and quantum environments

claudio moraga1, fatima z.
hadjam2
1technical university of dortmund, dortmund, germany
2university of djillali liabes, sidi bel abbes, algeria

abstract. reversible computing circuits are characterized by low power consumption and their proximity to circuits for quantum computing. the fredkin gate was one of the earliest proposed controlled reversible circuits. it was, however, soon superseded by the toffoli, not and cnot gates, which constitute a flexible functionally complete set and can also realize the fredkin gate as a building block. in quantum computing circuits, the fredkin gate (under the name controlled swap) plays an important role regarding the superposition of states. the present paper studies extensions of the fredkin gate in terms of mixed polarity in the reversible domain and an application in quantum computing.

keywords: fredkin gate, reversible circuits, quantum computing circuits

1. introduction

the earliest contributions to the development of reversible computing circuits may be traced back to e. fredkin [6] and t. toffoli [26], who introduced the first controlled reversible gates. a reversible gate realizes a bijection. therefore, it does not lose information: if the outputs are known, then the inputs may be precisely recovered. the realization of reversible circuits as fanout-free and feedback-free cascades of reversible gates was stimulated by r. landauer's theorem [9], stating that erasing or deleting information in a circuit produces heat dissipation. moreover, c. bennett [2] showed that a computer could work with low power dissipation if all its circuits were reversible. since the early times, most work on the synthesis of reversible circuits has been based on the set of gates {not, cnot, toffoli}, known as nct, which is functionally complete. the symbols and functionality of these gates in a common environment of two (eventually) controlling lines and a target line are shown in fig. 1.
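the bijection property of the nct gates can be checked by exhaustive enumeration. the following python sketch is an illustration only: the function names are hypothetical, and it assumes three lines (c1, c2, c3) with c3 as the target line and c2 as the control of the cnot gate.

```python
from itertools import product


def not_gate(c1, c2, c3):
    # uncontrolled not acting on the target line c3
    return c1, c2, c3 ^ 1


def cnot(c1, c2, c3):
    # assumption: c2 is the control line, c3 the target line
    return c1, c2, c3 ^ c2


def toffoli(c1, c2, c3):
    # c3 is complemented only when both controls have the value 1
    return c1, c2, c3 ^ (c1 & c2)


def is_bijection(gate):
    # a gate is reversible if the 8 input patterns map to 8 distinct outputs
    inputs = list(product((0, 1), repeat=3))
    return len({gate(*bits) for bits in inputs}) == len(inputs)


def is_self_inverse(gate):
    # applying the gate twice must restore every input pattern
    return all(gate(*gate(*bits)) == bits for bits in product((0, 1), repeat=3))


for g in (not_gate, cnot, toffoli):
    print(g.__name__, is_bijection(g), is_self_inverse(g))
```

each of the three gates is its own inverse, so the inputs can be recovered from the outputs by applying the same gate again.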
received october 05, 2022; revised january 13, 2023; accepted january 18, 2023
corresponding author: claudio moraga
technical university of dortmund, 44221 dortmund, germany
e-mail: claudio.moraga@tu-dortmund.de
received september 29, 2022; revised december 11, 2022; accepted january 06, 2023

fig. 1 symbols and functionality for the reversible gates not, controlled not and toffoli

the not gate (represented with an exor symbol) is not controlled and acts directly on its target line. in the cnot and toffoli gates, control signals and target signals are distinguished. the not component acts on the target signal whenever the control signals have the value 1. black dots identify which signal or signals are controlling the not component.

the design of minimal boolean circuits is known to be np-complete, and the design of minimal reversible circuits is np-hard [20]. this has led to the development of different heuristics to synthesize reversible circuits [4], [17], [21], [24], [25], and also to postprocessing strategies to improve the minimization of circuits [19], [23].

in the last 25 years there have been important developments that have contributed to the improvement of the synthesis methods. among the most relevant from the hardware point of view are the increasing speed of computers and the increasing size of memories. this has opened the possibility of including search [12], evolutionary algorithms [7], [8], [10] or sat solvers [17] for the synthesis of reversible/quantum circuits. on the software side, the development of specialized efficient libraries may be mentioned. at the level of gates, both the use of the value 0 for control signals identified by “white dots” [23], [14], frequently referred to as “mixed polarity”, and the use of disjoint control signals [13], [15] may be mentioned.

in what follows, the “generalization” of fredkin gates in the reversible domain and a relevant application in the quantum domain will be analyzed.
it may be mentioned that in [5] the term “generalized fredkin gate” is used, referring to fredkin gates with multiple control lines.

2. the reversible domain

it is not known whether the fredkin gate ever had its own representation symbol (other than a box with three inputs and three outputs). possibly a first symbolic representation was introduced in [11], which was later replaced by the symbol used in circuits for quantum computing, as in e.g. [5]. in the literature this gate appears frequently as a building block of toffoli and cnot gates, as shown in fig. 2 (left). in what follows, this building block will frequently be called simply fredkin “gate” and will be used to illustrate the effects of mixed polarity. at the output side, variables will carry a prime. (complemented variables, on the other hand, will have a bar over their names, as frequently used in switching theory.) in the barenco et al. based quantum model [1], fig. 2 (right), a white box represents a v gate, whose functionality equals the square root of not, and the box with a diagonal represents the adjoint of v.

fig. 2 representation of the fredkin gate as an nct building block with positive control (left) and its barenco et al. based quantum model (right)

the functionality of the building block representing the fredkin gate is given by:

$$\begin{aligned} c_3' &= c_3 \oplus c_1(c_2 \oplus c_3) = c_3 \oplus c_1 c_2 \oplus c_1 c_3 = c_1 c_2 \oplus \bar{c}_1 c_3, \\ c_2' &= c_2 \oplus c_3 \oplus c_3' = c_2 \oplus c_3 \oplus c_1 c_2 \oplus \bar{c}_1 c_3 = \bar{c}_1 c_2 \oplus c_1 c_3, \\ c_1' &= c_1. \end{aligned} \qquad (1)$$

equations (1) may be expressed in the following summary:

      c1 = 0   c1 = 1
c2’   c2       c3
c3’   c3       c2

the tableau expressed in words means that whenever c1 has the value 0, the fredkin gate behaves as an identity, whereas when c1 has the value 1, the target signals c2 and c3 are exchanged. this means that the fredkin gate behaves as a “controlled swap”, although this name is hardly used in the community working on reversible circuits.
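equations (1) and the tableau can be checked by exhaustive enumeration. the python sketch below is an illustration with hypothetical function names; the gate order (cnot, toffoli, cnot) is read off the intermediate terms of equations (1), not from fig. 2 itself.

```python
from itertools import product


def fredkin_nct(c1, c2, c3):
    # cascade inferred from equations (1)
    c2 ^= c3        # cnot: control c3, target c2 (line now carries c2 xor c3)
    c3 ^= c1 & c2   # toffoli: controls c1 and c2 line, target c3
    c2 ^= c3        # cnot: control c3', target c2
    return c1, c2, c3


def fredkin_eq1(c1, c2, c3):
    # closed forms on the right-hand side of equations (1)
    n1 = c1 ^ 1
    c3p = (c1 & c2) ^ (n1 & c3)
    c2p = (n1 & c2) ^ (c1 & c3)
    return c1, c2p, c3p


def controlled_swap(c1, c2, c3):
    # tableau: identity for c1 = 0, exchange of c2 and c3 for c1 = 1
    return (c1, c3, c2) if c1 else (c1, c2, c3)


for bits in product((0, 1), repeat=3):
    assert fredkin_nct(*bits) == fredkin_eq1(*bits) == controlled_swap(*bits)
print("cascade, closed forms and tableau agree on all 8 inputs")
```

the loop passes for all eight input patterns, so the cascade, the closed forms and the controlled swap description are three views of the same bijection.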
an important property of the fredkin gate is its completeness. in (1), let c2 = 1. then:

\[
c_2' = \bar{c}_1 \oplus c_1 c_3 = 1 \oplus c_1 \oplus c_1 c_3 = 1 \oplus c_1 \bar{c}_3 = \bar{c}_1 \vee c_3 = c_1 \rightarrow c_3 .
\]

on the other hand, for any x,

\[
c_1 \rightarrow (x \rightarrow 0) = c_1 \rightarrow (\bar{x} \vee 0) = c_1 \rightarrow \bar{x} = \bar{c}_1 \vee \bar{x} = \overline{c_1 x} = \mathrm{nand}(c_1, x) .
\]

since nand is functionally complete, so is the fredkin gate.

a different formal representation of the fredkin gate, appropriate to determine e.g. the performance of (simulated) circuits [28], is that of a transfer matrix. in the following product, the rows of the second matrix list all input combinations c1 c2 c3 and the rows of the result list the corresponding outputs c1' c2' c3':

\[
\begin{bmatrix}
1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&1&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&0&0&0&1
\end{bmatrix}
\cdot
\begin{bmatrix}
0&0&0\\ 0&0&1\\ 0&1&0\\ 0&1&1\\ 1&0&0\\ 1&0&1\\ 1&1&0\\ 1&1&1
\end{bmatrix}
=
\begin{bmatrix}
0&0&0\\ 0&0&1\\ 0&1&0\\ 0&1&1\\ 1&0&0\\ 1&1&0\\ 1&0&1\\ 1&1&1
\end{bmatrix} .
\tag{2}
\]

equation (2) shows the fredkin matrix as blockdiag(i4, swap), where i4 denotes the 4⨯4 identity matrix. the product of the fredkin matrix with the matrix of all possible inputs preserves c1, c2 and c3 when c1 = 0, since the i4 submatrix is then active, and gives c1’c2’c3’ = c1 c3 c2 when c1 = 1, since then the swap submatrix is active.

if a white dot is placed on the c1 line of the original fredkin gate, the polarity of the control changes: the gate becomes active when c1 = 0 and remains inhibited, behaving as an identity, when c1 = 1. the circuit and the quantum model of a fredkin gate that is active when the control signal c1 has the value 0 are shown in fig. 3, where the quantum model uses the same two-qubit gates as in fig. 2, but in a different order.

fig. 3 a fredkin gate with negative control and its quantum model

since the quantum cost of a reversible gate is obtained as the gate count of the barenco et al. quantum model of the gate [22], [27], it becomes apparent that fredkin gates with positive or negative control have the same quantum cost of 7.
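both the transfer-matrix product (2) and the completeness argument lend themselves to a small numerical check. the sketch below is only an illustration under stated assumptions: states are indexed as 4·c1 + 2·c2 + c3, the matrix is built so that row i selects the output pattern of input i (which matches the table product in (2)), and the helper names (`transfer_matrix`, `matmul`, `implies`, `nand`) are hypothetical.

```python
from itertools import product

STATES = list(product((0, 1), repeat=3))  # (c1, c2, c3) in the order 000 ... 111


def fredkin(c1, c2, c3):
    """Controlled swap: exchange c2 and c3 when c1 = 1."""
    return (c1, c3, c2) if c1 else (c1, c2, c3)


def index(bits):
    """Position of a bit pattern in the list 000, 001, ..., 111."""
    c1, c2, c3 = bits
    return 4 * c1 + 2 * c2 + c3


def transfer_matrix(gate):
    """8x8 permutation matrix; row i has its 1 in the column of gate(state i)."""
    matrix = [[0] * 8 for _ in range(8)]
    for bits in STATES:
        matrix[index(bits)][index(gate(*bits))] = 1
    return matrix


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def implies(a, b):
    """a -> b, read from the c2' output of a Fredkin gate with c2 = 1."""
    return fredkin(a, 1, b)[1]


def nand(a, x):
    """nand(a, x) = a -> (x -> 0), using two Fredkin gates."""
    return implies(a, implies(x, 0))


FREDKIN = transfer_matrix(fredkin)
INPUTS = [list(bits) for bits in STATES]
OUTPUTS = matmul(FREDKIN, INPUTS)

for row_in, row_out in zip(INPUTS, OUTPUTS):
    print(row_in, "->", row_out)
```

because the fredkin gate is its own inverse, the matrix is symmetric, so the row/column convention chosen here does not affect the result.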
a set of new behaviors is obtained if white dots are introduced in the lines c2 or c3, indicating that a signal is effective if it has the value 0. a first pair of equivalent variations is shown in fig. 4.

fig. 4 fredkin gate equivalent variations

the functionality of the building block at the left of fig. 4 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus c_1(\bar{c}_3 \oplus c_2) = c_3 \oplus c_1\bar{c}_3 \oplus c_1 c_2 = 1 \oplus \bar{c}_3 \oplus c_1\bar{c}_3 \oplus c_1 c_2 = 1 \oplus \bar{c}_1\bar{c}_3 \oplus c_1 c_2, \\
c_2' &= (c_3 \oplus c_2) \oplus c_3' = 1 \oplus c_3 \oplus \bar{c}_1\bar{c}_3 \oplus c_2 \oplus c_1 c_2 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3, \\
c_1' &= c_1 .
\end{aligned}
\tag{3}
\]

equations (3) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & c_2 & \bar{c}_3 \\
c_3' & c_3 & \bar{c}_2
\end{array}
\]

in words: if c1 = 0 the building block behaves as an identity, and if c1 = 1 both targets will be exchanged and complemented. it is straightforward to show that the gate at the right of fig. 4 has the same functionality. further variations, and eventually their equivalent nct reversible circuits, are analyzed below.

fig. 5 second fredkin gate variation

the functionality of the building block of fig. 5 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus c_1(c_2 \oplus \bar{c}_3) = 1 \oplus \bar{c}_3 \oplus c_1 c_2 \oplus c_1\bar{c}_3 = c_1 c_2 \oplus \bar{c}_1\bar{c}_3 \oplus 1, \\
c_2' &= c_2 \oplus \bar{c}_3 \oplus c_3' = c_2 \oplus c_1 c_2 \oplus \bar{c}_3 \oplus \bar{c}_1\bar{c}_3 \oplus 1 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3 \oplus 1, \\
c_1' &= c_1 .
\end{aligned}
\tag{4}
\]

equations (4) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & \bar{c}_2 & c_3 \\
c_3' & c_3 & \bar{c}_2
\end{array}
\]

in words: if c1 = 0 then c2 will be complemented and c3 will be preserved. if c1 = 1 then the complement of c2 and c3 will be swapped. a third variation is shown in fig. 6.

fig. 6 third variation of the fredkin gate

the functionality of the building block of fig. 6 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus c_1(c_2 \oplus c_3) = c_1 c_2 \oplus c_1 c_3 \oplus c_3 = c_1 c_2 \oplus \bar{c}_1 c_3, \\
c_2' &= (c_2 \oplus c_3) \oplus (c_3' \oplus 1) = c_2 \oplus c_3 \oplus c_1 c_2 \oplus \bar{c}_1 c_3 \oplus 1 = \bar{c}_1 c_2 \oplus c_1 c_3 \oplus 1, \\
c_1' &= c_1 .
\end{aligned}
\tag{5}
\]

equations (5) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & \bar{c}_2 & \bar{c}_3 \\
c_3' & c_3 & c_2
\end{array}
\]

in words: if c1 = 0 then c3 will be preserved, but c2 will be complemented. if c1 = 1 then the signal c3 will be complemented and swapped with c2.
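the simplifications in equations (3), (4) and (5) can be confirmed by brute force over the eight input patterns. the sketch below evaluates the first (unsimplified) expression of each equation and compares it with the behaviour stated in words; it is an illustration only, and the function names are hypothetical. it also checks that each variation is still a reversible (bijective) mapping.

```python
from itertools import product


def inv(bit):
    """Complement of a single bit."""
    return 1 - bit


def variation_eq3(c1, c2, c3):
    """First expressions of eq. (3)."""
    c3_out = c3 ^ (c1 & (inv(c3) ^ c2))
    c2_out = (c3 ^ c2) ^ c3_out
    return c2_out, c3_out


def variation_eq4(c1, c2, c3):
    """First expressions of eq. (4)."""
    c3_out = c3 ^ (c1 & (c2 ^ inv(c3)))
    c2_out = c2 ^ inv(c3) ^ c3_out
    return c2_out, c3_out


def variation_eq5(c1, c2, c3):
    """First expressions of eq. (5)."""
    c3_out = c3 ^ (c1 & (c2 ^ c3))
    c2_out = (c2 ^ c3) ^ (c3_out ^ 1)
    return c2_out, c3_out


# Behaviour stated in words in the text, as (c2', c3') for c1 = 1 / c1 = 0.
EXPECTED = {
    variation_eq3: lambda c1, c2, c3: (inv(c3), inv(c2)) if c1 else (c2, c3),
    variation_eq4: lambda c1, c2, c3: (c3, inv(c2)) if c1 else (inv(c2), c3),
    variation_eq5: lambda c1, c2, c3: (inv(c3), c2) if c1 else (inv(c2), c3),
}


def matches_text(gate):
    """True when the gate agrees with the stated behaviour on all inputs."""
    return all(gate(*bits) == EXPECTED[gate](*bits)
               for bits in product((0, 1), repeat=3))


def is_reversible(gate):
    """True when the 3-bit mapping (c1 passes through) is a bijection."""
    images = {(c1,) + gate(c1, c2, c3)
              for c1, c2, c3 in product((0, 1), repeat=3)}
    return len(images) == 8


for gate in EXPECTED:
    print(gate.__name__, matches_text(gate), is_reversible(gate))
```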
from all former variations it follows that a variation comprising two white dots at the bottom and a white dot at the center of the middle line will have the same functionality as the original fredkin gate. this is illustrated in fig. 7.

fig. 7 equivalent fredkin gates

for the same reasons, the following variations on the fredkin gate are equivalent, as illustrated in fig. 8.

fig. 8 further equivalence of fredkin variations

an additional way of introducing variations on the fredkin gate consists of replacing the classical toffoli gate with a disjunct controlled toffoli gate [13], [15], which may be called “or-toffoli”. for this gate, upside-down triangles (black or white) are used instead of dots to identify the effectivity of driving control signals with value 1 and 0, respectively. upside-down triangles are used because of their similarity with “∨”, the disjunction symbol in mathematical logic. a different variation, based on the or-toffoli gate, is shown in fig. 9 and may be called “or-fredkin”. notice that the quantum model based on [1] does not use an adjoint v gate, but otherwise it uses the same gates as the quantum model of the classical fredkin gate. therefore, it has the same quantum cost, 7.

fig. 9 the or-fredkin gate and its quantum model

the functionality of the building block of fig. 9 (left) is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus c_3)) = c_3 \oplus c_1 \oplus c_2 \oplus c_3 \oplus c_1(c_2 \oplus c_3) \\
     &= c_1 \oplus c_2 \oplus c_1(c_2 \oplus c_3) = c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1 c_3 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3, \\
c_2' &= c_3 \oplus c_2 \oplus c_3' = c_3 \oplus c_2 \oplus \bar{c}_1 c_2 \oplus c_1\bar{c}_3 = 1 \oplus \bar{c}_1\bar{c}_3 \oplus c_1 c_2, \\
c_1' &= c_1 .
\end{aligned}
\tag{6}
\]

equations (6) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & c_3 & \bar{c}_2 \\
c_3' & c_2 & \bar{c}_3
\end{array}
\]

an equivalent nct circuit may be obtained, as shown in fig. 10, where the two gates in the middle have the functionality of the or-toffoli gate.

fig. 10 an nct equivalent circuit for the or-fredkin gate

another variation is illustrated in fig.
11, where a white dot is introduced at the input side of an or-toffoli gate.

fig. 11 a simple variation of the or-fredkin gate

the functionality of the building block of fig. 11 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus \bar{c}_3)) = c_3 \oplus (c_1 \oplus (c_2 \oplus \bar{c}_3) \oplus c_1(c_2 \oplus \bar{c}_3)) \\
     &= c_3 \oplus c_1 \oplus c_2 \oplus \bar{c}_3 \oplus c_1(c_2 \oplus \bar{c}_3) = 1 \oplus c_2 \oplus c_1(c_2 \oplus c_3) = 1 \oplus \bar{c}_1 c_2 \oplus c_1 c_3, \\
c_2' &= c_2 \oplus \bar{c}_3 \oplus c_3' = c_2 \oplus \bar{c}_3 \oplus 1 \oplus \bar{c}_1 c_2 \oplus c_1 c_3 = c_1 c_2 \oplus \bar{c}_1 c_3, \\
c_1' &= c_1 .
\end{aligned}
\tag{7}
\]

equations (7) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & c_3 & c_2 \\
c_3' & \bar{c}_2 & \bar{c}_3
\end{array}
\]

in words: if c1 = 0 then c2 will be complemented and swapped with c3, whereas if c1 = 1 there will be no swapping, but c3 will be complemented. fig. 12 shows an equivalent nct circuit for the modified fredkin gate of fig. 11.

fig. 12 equivalent (more complex) nct circuit for the gate of fig. 11

two equivalent variations are shown in fig. 13, with a different distribution of the white dots/triangle.

fig. 13 equivalent variations of the or-fredkin gate

the functionality of the building block at the left of fig. 13 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus c_3)) = c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1 c_3 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3, \\
c_2' &= c_2 \oplus c_3 \oplus c_3' \oplus 1 = c_2 \oplus \bar{c}_3 \oplus \bar{c}_1 c_2 \oplus c_1\bar{c}_3 = c_1 c_2 \oplus \bar{c}_1\bar{c}_3, \\
c_1' &= c_1 .
\end{aligned}
\tag{8}
\]

equations (8) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & \bar{c}_3 & c_2 \\
c_3' & c_2 & \bar{c}_3
\end{array}
\]

in words: c3 will be complemented, and if c1 = 0 it will be swapped with c2. if c1 = 1 then no swapping takes place. it is simple to show that the circuit shown at the right of fig. 13 has the same functionality. an nct circuit equivalent to the variation is shown in fig. 14. it may be seen that its quantum cost [22], [27] and depth are higher by 1 with respect to the variations in fig. 13.

fig. 14 equivalent nct circuit of the or-fredkin variation of fig. 13

another variation is possible, with both bottom dots white, as shown in fig. 15.

fig. 15 another or-fredkin variation

the functionality of the building block of fig.
15 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus \bar{c}_3)) = c_3 \oplus c_1 \oplus (c_2 \oplus \bar{c}_3) \oplus c_1(c_2 \oplus \bar{c}_3) \\
     &= 1 \oplus c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1\bar{c}_3 = \bar{c}_1 c_2 \oplus c_1 c_3 \oplus 1 = \bar{c}_1\bar{c}_2 \oplus c_1\bar{c}_3, \\
c_2' &= (c_2 \oplus \bar{c}_3) \oplus c_3' \oplus 1 = c_2 \oplus \bar{c}_3 \oplus \bar{c}_1 c_2 \oplus c_1 c_3 \\
     &= c_1 c_2 \oplus \bar{c}_1 c_3 \oplus 1 = c_1\bar{c}_2 \oplus \bar{c}_1\bar{c}_3, \\
c_1' &= c_1 .
\end{aligned}
\tag{9}
\]

equations (9) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & \bar{c}_3 & \bar{c}_2 \\
c_3' & \bar{c}_2 & \bar{c}_3
\end{array}
\]

in words: if c1 = 0 then c2 and c3 will be complemented and swapped, whereas if c1 = 1, c2 and c3 will just be complemented. equivalent circuits not using or-fredkin variations are shown in fig. 16. it is easy to see that these equivalent nct circuits are more complex than the or-fredkin variation.

fig. 16 equivalent nct circuits for the or-fredkin variation of fig. 15

another variation is shown in fig. 17, where, in analogy to the white dot, a white triangle is introduced, meaning that the corresponding control signal will be complemented before calculating the disjunction.

fig. 17 a mixed-polarity or-fredkin variation

the functionality of the building block of fig. 17 is given by:

\[
\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_3 \oplus c_2 \oplus 1)) = c_3 \oplus c_1 \oplus c_3 \oplus \bar{c}_2 \oplus c_1(\bar{c}_3 \oplus c_2) \\
     &= c_1(\bar{c}_3 \oplus \bar{c}_2) \oplus \bar{c}_2 = c_1\bar{c}_3 \oplus \bar{c}_1\bar{c}_2, \\
c_2' &= c_2 \oplus c_3 \oplus c_3' = c_2 \oplus c_3 \oplus c_1\bar{c}_3 \oplus \bar{c}_1\bar{c}_2 = \bar{c}_1\bar{c}_3 \oplus c_1\bar{c}_2, \\
c_1' &= c_1 .
\end{aligned}
\tag{10}
\]

equations (10) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & \bar{c}_3 & \bar{c}_2 \\
c_3' & \bar{c}_2 & \bar{c}_3
\end{array}
\]

it becomes apparent that equations (9) and (10) are equal. this means that the corresponding or-fredkin variations are equivalent. moreover, it may be noticed that the distribution of “white elements” in these variations is the same as the distribution of white dots shown in fig. 4 for variations of the classical fredkin gate. an additional or-fredkin variation is shown in fig. 18.

fig.
18 or-fredkin variation

the functionality of the circuit is:

\[
\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus \bar{c}_3)) = c_3 \oplus c_1 \oplus c_2 \oplus \bar{c}_3 \oplus c_1(c_2 \oplus \bar{c}_3) \\
     &= 1 \oplus c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1\bar{c}_3 = c_1 c_3 \oplus \bar{c}_1 c_2 \oplus 1, \\
c_2' &= c_2 \oplus c_3 \oplus c_3' \oplus 1 = c_2 \oplus c_3 \oplus c_1 c_3 \oplus \bar{c}_1 c_2 = c_1 c_2 \oplus \bar{c}_1 c_3, \\
c_1' &= c_1 .
\end{aligned}
\tag{11}
\]

equations (11) may be summarized as follows:

\[
\begin{array}{c|cc}
 & c_1 = 0 & c_1 = 1 \\ \hline
c_2' & c_3 & c_2 \\
c_3' & \bar{c}_2 & \bar{c}_3
\end{array}
\]

it may be seen that eqs. (7) and (11) are equal. therefore, the corresponding fredkin variations are equivalent. they have the same distribution of white elements as the variations shown in fig. 8. the comparison is shown in fig. 19.

fig. 19 pairs of equivalent fredkin variations

3. the quantum domain

in the domain of circuits for quantum computing, the fredkin gate is not known under this name, but as “controlled swap”, possibly because in the case of quantum circuits there is a very simple symbol for the swap of two “qubits” (= quantum bits). (see fig. 20.) since it is not known whether in some “quantum technology” the controlled swap is also realized as in the reversible domain, i.e. as cnot-toffoli-cnot, the variations presented in the former section will not be discussed here. however, some changes in the surroundings of the controlled swap may be considered.

a relevant example of an effective use of the controlled swap was introduced in [3] to efficiently determine whether two qubits are equal or have an inner product with absolute value ≥ a given threshold in [0, 1]. (see fig. 20.) a detailed analysis follows.

in the dirac notation [16], let |0〉 and |1〉 be the basis states of the working hilbert space [16], and let h denote the hadamard gate

\[
H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} .
\]

if the control qubit is in the state |0〉 = [1 0]^T (in the vector notation), then:

\[
H|0\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle) .
\tag{12}
\]

this represents a superposition of states, and a quantum circuit will work in both states simultaneously, which is one of the main characteristics of circuits for quantum computing.
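equation (12) can be reproduced with a few lines of plain python. this is only a sketch: states are represented as lists of amplitudes, and the helper names (`apply`, `probabilities`) are illustrative.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]          # Hadamard gate

KET0 = [1.0, 0.0]      # |0> in vector notation
KET1 = [0.0, 1.0]      # |1> in vector notation


def apply(matrix, state):
    """Matrix-vector product: apply a gate to a state vector."""
    return [sum(m * a for m, a in zip(row, state)) for row in matrix]


def probabilities(state):
    """Measurement probabilities of the basis states."""
    return [abs(amplitude) ** 2 for amplitude in state]


plus = apply(H, KET0)              # (|0> + |1>) / sqrt(2), as in eq. (12)
print(plus, probabilities(plus))   # both basis states are equally likely
print(apply(H, plus))              # a second Hadamard returns to |0>
```

applying the hadamard gate twice returns the original basis state, which is why the circuit of fig. 20 can place one hadamard gate before and one after the controlled swap.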
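fig. 20 circuit to compare two qubits; in the dotted box, the symbol for the controlled swap

at the output side, the circuit of fig. 20 produces

\[
(H \otimes I_4)(\mathrm{CSWAP})(H \otimes I_4)\,|0\rangle|\varphi\rangle|\psi\rangle ,
\tag{13}
\]

where I_4 represents the 4⨯4 identity matrix and |φ〉, |ψ〉 are the two states to be compared.

lemma 1: the output of the circuit of fig. 20 is given by:

\[
\tfrac{1}{2}\big[\, |0\rangle(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle) + |1\rangle(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle) \,\big] .
\tag{14}
\]

proof: considering the most general case, let |φ〉 = α1|0〉 + β1|1〉, with |α1|² + |β1|² = 1, and |ψ〉 = α2|0〉 + β2|1〉, with |α2|² + |β2|² = 1. recall that

\[
\mathrm{SWAP} = \begin{bmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&1&0&0\\ 0&0&0&1 \end{bmatrix}
\]

and let Q = (H ⊗ I_4)(CSWAP)(H ⊗ I_4). let S stand for SWAP. then

\[
Q = \frac{1}{\sqrt{2}} \begin{bmatrix} I_4 & I_4 \\ I_4 & -I_4 \end{bmatrix} \cdot \begin{bmatrix} I_4 & 0_4 \\ 0_4 & S \end{bmatrix} \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} I_4 & I_4 \\ I_4 & -I_4 \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} I_4 & S \\ I_4 & -S \end{bmatrix} \cdot \begin{bmatrix} I_4 & I_4 \\ I_4 & -I_4 \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} I_4 + S & I_4 - S \\ I_4 - S & I_4 + S \end{bmatrix} .
\tag{15}
\]

moreover,

\[
|\varphi\rangle|\psi\rangle = [\alpha_1\ \beta_1]^T \otimes [\alpha_2\ \beta_2]^T = [\alpha_1\alpha_2\ \ \alpha_1\beta_2\ \ \beta_1\alpha_2\ \ \beta_1\beta_2]^T ,
\tag{16}
\]

\[
|\psi\rangle|\varphi\rangle = [\alpha_2\ \beta_2]^T \otimes [\alpha_1\ \beta_1]^T = [\alpha_2\alpha_1\ \ \alpha_2\beta_1\ \ \beta_2\alpha_1\ \ \beta_2\beta_1]^T .
\tag{17}
\]

therefore,

\[
|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle = [(\alpha_1\alpha_2 + \alpha_2\alpha_1)\ \ (\alpha_1\beta_2 + \alpha_2\beta_1)\ \ (\beta_1\alpha_2 + \beta_2\alpha_1)\ \ (\beta_1\beta_2 + \beta_2\beta_1)]^T ,
\tag{18}
\]

\[
|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle = [(\alpha_1\alpha_2 - \alpha_2\alpha_1)\ \ (\alpha_1\beta_2 - \alpha_2\beta_1)\ \ (\beta_1\alpha_2 - \beta_2\alpha_1)\ \ (\beta_1\beta_2 - \beta_2\beta_1)]^T .
\tag{19}
\]

since the α and β coefficients are (possibly complex) scalars, their products are commutative. therefore, the first and last components of the vector in (18) equal 2α1α2 and 2β1β2, respectively, and the first and last components of the vector in (19) equal 0. therefore,

\[
|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle = [\,2\alpha_1\alpha_2\ \ (\alpha_1\beta_2 + \beta_1\alpha_2)\ \ (\alpha_1\beta_2 + \beta_1\alpha_2)\ \ 2\beta_1\beta_2\,]^T ,
\tag{20}
\]

\[
|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle = [\,0\ \ (\alpha_1\beta_2 - \beta_1\alpha_2)\ \ (-\alpha_1\beta_2 + \beta_1\alpha_2)\ \ 0\,]^T .
\tag{21}
\]

to calculate Q|0〉|φ〉|ψ〉 the explicit expression for (15) will be needed, where “T” will be used to represent –1 and preserve the format of the matrix.

\[
Q\,|0\rangle|\varphi\rangle|\psi\rangle = \frac{1}{2}
\begin{bmatrix}
2&0&0&0&0&0&0&0\\
0&1&1&0&0&1&\mathrm{T}&0\\
0&1&1&0&0&\mathrm{T}&1&0\\
0&0&0&2&0&0&0&0\\
0&0&0&0&2&0&0&0\\
0&1&\mathrm{T}&0&0&1&1&0\\
0&\mathrm{T}&1&0&0&1&1&0\\
0&0&0&0&0&0&0&2
\end{bmatrix}
\cdot
\begin{bmatrix}
\alpha_1\alpha_2\\ \alpha_1\beta_2\\ \beta_1\alpha_2\\ \beta_1\beta_2\\ 0\\ 0\\ 0\\ 0
\end{bmatrix}
= \frac{1}{2}
\begin{bmatrix}
2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\\ 0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0
\end{bmatrix} .
\]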
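(22)

the resulting vector in (22) may be additively divided into two vectors as follows:

\[
\frac{1}{2}
\begin{bmatrix}
2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\\ 0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0
\end{bmatrix}
= \frac{1}{2} \left(
\begin{bmatrix}
2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\\ 0\\ 0\\ 0\\ 0
\end{bmatrix}
+
\begin{bmatrix}
0\\ 0\\ 0\\ 0\\ 0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0
\end{bmatrix}
\right) .
\tag{23}
\]

from eq. (23), with (20) and (21), it follows that the first vector of (23) equals (1/2)|0〉(|φ〉|ψ〉 + |ψ〉|φ〉) and the second vector of (23) equals (1/2)|1〉(|φ〉|ψ〉 – |ψ〉|φ〉). this ends the proof that

\[
Q\,|0\rangle|\varphi\rangle|\psi\rangle = \tfrac{1}{2}\big[\, |0\rangle(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle) + |1\rangle(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle) \,\big] . \qquad \square
\]

notice that if |φ〉 = |ψ〉 then |φ〉|ψ〉 = |ψ〉|φ〉 and |φ〉|ψ〉 – |ψ〉|φ〉 = 0. this means that in this case |1〉 would be measured with probability 0, whereas |0〉 would be measured with certainty. the equality of two state vectors may be obtained in one step, whereas a classical algorithm would require two comparisons.

a possible “variation” may set the control qubit to |1〉 = [0 1]^T (in the vector notation); then

\[
H|1\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle) .
\tag{24}
\]

lemma 2: if in the circuit of fig. 20 the control qubit is set to |1〉, then

\[
Q\,|1\rangle|\varphi\rangle|\psi\rangle = \tfrac{1}{2}\big[\, |0\rangle(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle) + |1\rangle(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle) \,\big] .
\]

proof: at the output side the circuit with the control qubit set to |1〉 now gives (H ⊗ I_4)(CSWAP)(H ⊗ I_4)|1〉|φ〉|ψ〉 = Q|1〉|φ〉|ψ〉 = Q([0 1]^T ⊗ [α1α2 α1β2 β1α2 β1β2]^T), that is (with “T” again standing for –1),

\[
\frac{1}{2}
\begin{bmatrix}
2&0&0&0&0&0&0&0\\
0&1&1&0&0&1&\mathrm{T}&0\\
0&1&1&0&0&\mathrm{T}&1&0\\
0&0&0&2&0&0&0&0\\
0&0&0&0&2&0&0&0\\
0&1&\mathrm{T}&0&0&1&1&0\\
0&\mathrm{T}&1&0&0&1&1&0\\
0&0&0&0&0&0&0&2
\end{bmatrix}
\cdot
\begin{bmatrix}
0\\ 0\\ 0\\ 0\\ \alpha_1\alpha_2\\ \alpha_1\beta_2\\ \beta_1\alpha_2\\ \beta_1\beta_2
\end{bmatrix}
= \frac{1}{2}
\begin{bmatrix}
0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0\\ 2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2
\end{bmatrix} .
\tag{25}
\]

as in the former case (recall eq. (23)), the final vector of eq. (25) may be split into two components associated with |0〉 and |1〉, respectively, leading to:

\[
Q\,|1\rangle|\varphi\rangle|\psi\rangle = \tfrac{1}{2}\big[\, |0\rangle(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle) + |1\rangle(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle) \,\big] .
\tag{26}
\]

in this case, if |φ〉 = |ψ〉, |0〉 would be measured with probability 0.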
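lemmas 1 and 2 can be checked numerically without any linear-algebra library. the sketch below assumes the qubit ordering control, first state, second state (so that the state index is 4·control + 2·q1 + q2) and uses illustrative helper names; it applies the three stages of the circuit of fig. 20 directly to an 8-component state vector instead of building the matrix Q.

```python
import math


def kron(a, b):
    """Kronecker (tensor) product of two state vectors."""
    return [x * y for x in a for y in b]


def hadamard_on_control(state):
    """Apply H (x) I4: the control qubit is the leading qubit."""
    s = 1 / math.sqrt(2)
    top, bottom = state[:4], state[4:]
    return ([s * (t + b) for t, b in zip(top, bottom)]
            + [s * (t - b) for t, b in zip(top, bottom)])


def controlled_swap(state):
    """Exchange the amplitudes of |101> and |110> (control = 1)."""
    out = list(state)
    out[5], out[6] = state[6], state[5]
    return out


def swap_test(control, phi, psi):
    """Output state of the circuit of fig. 20 for the given input states."""
    state = kron(control, kron(phi, psi))
    state = hadamard_on_control(state)
    state = controlled_swap(state)
    return hadamard_on_control(state)


def control_probabilities(state):
    """Probabilities of measuring 0 and 1 on the control qubit."""
    p0 = sum(abs(a) ** 2 for a in state[:4])
    p1 = sum(abs(a) ** 2 for a in state[4:])
    return p0, p1


KET0, KET1 = [1.0, 0.0], [0.0, 1.0]
PHI = [0.6, 0.8j]                      # example state with a complex amplitude
PSI = [1 / math.sqrt(2), 1 / math.sqrt(2)]

print(control_probabilities(swap_test(KET0, PHI, PHI)))   # equal states
print(control_probabilities(swap_test(KET0, PHI, PSI)))   # different states
print(control_probabilities(swap_test(KET1, PHI, PHI)))   # control set to |1>
```

for equal inputs the |1〉 outcome (control |0〉) or the |0〉 outcome (control |1〉) has probability 0, in agreement with the lemmas; for different inputs the probabilities depend on the overlap of the two states.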
a (pseudo) variation may be introduced if a signal, not the controlled swap, is modified. recall that the pauli x matrix [18] equals

\[
X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]

and behaves as a quantum inverter. it is fairly obvious that if pauli x gates are included, as shown in fig. 21, then the complement of |φ〉 will be compared with |ψ〉, which is equivalent to comparing |φ〉 with the complement of |ψ〉.

fig. 21 modified swap test circuit to compare one state with the complement of another.

the modified swap test may be expressed as:

\[
(H \otimes X \otimes I_2)(\mathrm{CSWAP})(H \otimes X \otimes I_2)\,|0\rangle|\varphi\rangle|\psi\rangle .
\tag{27}
\]

the proof of effectiveness, i.e. measuring |1〉 with probability 0, follows the same steps as in the former first case.

4. conclusions

variations on the fredkin gate, based on mixed polarity, have been analyzed in the reversible domain. the or-fredkin gate was introduced, and in all variations shown, the circuits had a lower complexity (quantum cost [22], [27], i.e. number of elementary gates on two qubits in the quantum model, and depth) than equivalent classical nct circuits. a wide range of functionalities of the fredkin gate under mixed polarity was shown, thus adding flexibility to the design of reversible circuits. some equivalent variations were found, and associated patterns in the distribution of the white elements could be detected. in the quantum domain, a step-by-step calculation of the behaviour of an application of the controlled swap, which efficiently tests whether two states are equal, was given, and one possible extension of the test circuit was shown.

references [1] a. barenco, c. h. bennett, r. cleve, d. p. di vincenzo, n. margolus, p. shor, t. sleator, j. a. smolin, and h. weinfurter, "elementary gates for quantum computation", phys. rev. a, vol. 52, pp. 3457-3467, 1995. [2] c. bennett, "logical reversibility of computation", ibm j. res. develop., vol. 17, pp. 525-532, 1973. [3] h. buhrman, r. cleve, j. watrous and r. de wolf, "quantum fingerprinting", phys. rev.
lett., vol. 87, no. 16, p. 167902-1-4, 2001. [4] c. s. cheng and a. k. singh, "heuristic synthesis of reversible logic – a comparative study", theoretical appl. electr. eng., vol. 12, no. 3, pp. 210-225, 2014. [5] o. dovhamuk and v. deibuk, "cmos simulation of mixed-polarity generalized fredkin gates", in proceedings of the 12th international conference on advanced computer information technologies (acit), ieee press, 2022. [6] e. fredkin and t. toffoli, "conservative logic", int. j. theor. phys., vol. 21, no. 3/4, pp. 219-253, 1982. [7] f. z. hadjam and c. moraga, "rimep2. evolutionary design of reversible digital circuits", acm j. emerg. technol. comput. syst., vol. 11, no. 3, pp. 27:1-27:23, 2014. [8] f. z. hadjam and c. moraga, "a hierarchical distributed linear evolutionary system for the synthesis of 4-bit reversible circuits", in r. seising and h. allende-cid (eds.), studies in fuzziness and soft computing 349, pp. 233-249, springer, 2017. [9] r. landauer, "irreversibility and heat generation in the computing process", ibm j. res. develop., vol. 5, pp. 183-191, 1961. [10] m. lukac, m. a. perkowski, h. goi, m. pivtoraiko, ch. h. yu, k. chung, h. jeech, b.-g. kim and y. d. kim, "evolutionary approach to quantum and reversible circuits synthesis", artif. intell. rev., vol. 20, no. 3-4, pp. 361-417, 2003. [11] d. maslov, g. w. dueck and d. m. miller, "synthesis of fredkin-toffoli reversible networks", ieee trans. very large scale integ. (vlsi) syst., vol. 13, no. 6, pp. 765-769, 2005. [12] m. d. miller and g. w. dueck, "search-based transformation synthesis for 3-valued reversible circuits", in i. lanese and m. rawski (eds.), reversible computation, lncs 12227, pp. 218-236, springer, 2020. [13] c. moraga, "hybrid gf(2)-boolean expressions for quantum computing circuits", in a. de vos and r. wille (eds.), rc 2011, lncs 7165, pp. 54-63, springer, 2012. [14] c. moraga, "using negated control signals in quantum computing circuits", fu elec. energ., vol. 24, no. 3, pp.
423-435, 2011. [15] c. moraga, "or-toffoli and or-peres reversible gates", in s. yamashita and t. yokoyama (eds.) reversible computation, lncs 12805, pp. 266-273, springer, 2021. [16] m. nielsen and i. chuang, quantum computation and quantum information. cambridge univ. press, uk, 2000. [17] ph. niemann, l. müller and r. drechsler, "finding optimal implementations of non-native cnot gates using sat", in s. yamashita, t. yokoyama, (eds.), reversible computation, lncs 12805, pp. 242-255, springer, 2021. [18] w. pauli, handbuch der physik, chapter 24, springer, berlin, 1933. [19] m. rahman and g. w. dueck, "an algorithm to find quantum templates" in proceedings of the ieee congress on evolutionary computing, ieee press, 2012, pp. 623-629. [20] i. rahul, b. loff and i. c. oliveira, "np-hardness of circuit minimization for multi-output functions", in proceedings of the 35th computational complexity conference (ccc), 2020, pp. 22:1–22:36. [21] m. saeedi and i. l. markov, "synthesis and optimization of reversible circuits – a survey", acm comput. surveys, vol. 45, no. 2, pp. 1-34, 2013. [22] z. sasanian and d. m. miller, "ncv realization of mct gates with mixed control", in proceedings of the ieee pacific rim conference on communications, computers and signal processing (pacrim), 2011, pp. 567-571. [23] m. soeken and m. k. thomsen, "white dots do matter: rewriting reversible logic circuits", in g. w. dueck and d. m. miller (eds.), reversible computation, lncs 7948, pp. 196-208, springer, 2013. [24] m. soeken, g. w. dueck and m. d. miller, "a fast symbolic transformation-based algorithm for reversible logic synthesis", in s. devitt and i. lanese i. (eds.), reversible computation, lncs 9720, pp. 307-321, springer, 2016. [25] s. stojković, m. m. stanković and c. moraga, "complexity reduction of toffoli networks based on fdd", fu: elec. energ., vol. 28, no. 2, pp. 251-262, 2015. [26] t. toffoli, "reversible computing", in j. w. baker and j. 
van leeuwen (eds.), alp 1980, lncs 84, pp. 632-644, springer, 1980. [27] r. wille, m. saeedi and r. drechsler, "synthesis of reversible functions beyond gate count and quantum cost", 2010, pp. 1-7. [28] a. zulehner and r. wille, "simulation and design of quantum circuits", in i. ulidowski, i. lanese, u. p. schulz and c. ferreira (eds.), reversible computation: extending horizons of computing, lncs 12070, pp. 60-82, springer open, 2020.

facta universitatis series: electronics and energetics vol. 29, no. 4, december 2016, pp. 675-688 doi: 10.2298/fuee1604675s

nonrigorous symmetric second-order abc applied to large-domain finite element modeling of electromagnetic scatterers

slobodan v. savić (1), milan m. ilić (1,2)
(1) university of belgrade, school of electrical engineering, belgrade, serbia
(2) colorado state university, department of electrical and computer engineering, fort collins, co, usa

abstract. a nonrigorous symmetric second-order absorbing boundary condition (abc) is presented as a feasible local mesh truncation in the higher-order large-domain finite element method (fem) for electromagnetic analysis of scatterers in the frequency domain. the abc is implemented on large generalized curvilinear hexahedral finite elements without imposing normal field continuity and without introducing new variables. as an extension of our previous work, the method is comprehensively evaluated by analyzing several benchmark targets, i.e., a metallic sphere, a dielectric cube, and the nasa almond. numerical examples show that the radar cross section (rcs) of the analyzed scatterers can be accurately predicted when the divergence term is included in computations nonrigorously. the influence of specific terms in the second-order abc, which absorb transverse electric (te) and transverse magnetic (tm) spherical modes, is also investigated. examples show significant improvements in accuracy of the nonrigorous second-order abc over the first-order abc.
key words: absorbing boundary condition, electromagnetic scattering, finite element method, numerical methods

received september 29, 2015; received in revised form december 29, 2015
corresponding author: slobodan v. savić, university of belgrade, school of electrical engineering, belgrade, serbia (email: ssavic@etf.rs)

676 s. savić, m. ilić

1. introduction

the finite element method (fem) is a widely used computational tool in the frequency-domain analysis of electromagnetic (em) problems [1-4]. to preserve the sparsity of the fem system when analyzing open-region (radiating and scattering) problems, the necessary artificial truncation of the computational domain is often done by applying approximate local absorbing boundary conditions (abcs) [4]. the symmetric second-order vector abc is a very popular choice among abcs because it preserves the symmetry of the fem system while maintaining satisfactory accuracy of the solution [5, 6]. however, this formulation requires computation of the divergence term on the faces of finite elements (fes) belonging to the absorbing boundary surface (abs). this, in turn, is a problem of its own, because the required normal continuity of the fields is generally not enforced across the edges of adjacent elements in a standard weak-form fem discretization where edge-based curl-conforming vector basis functions are employed. in addition, the divergence of the nonconforming basis functions in such formulations cannot be calculated analytically for generalized curved fes, even across the faces of elements at the abs (excluding the troublesome edges), where these functions are continuous and differentiable.
this problem has been addressed before; however, all reported conclusions pertain to evaluation of the second-order abc in small-domain spatial discretization frameworks [7-9], where the fem volume elements are electrically small (e.g., their edges are on the order of λ/10, λ being the wavelength at the operating frequency of the implied time-harmonic excitation). this spatial discretization results in a rather fine mesh throughout the computational domain and at the abs as well. it appears that in such meshes omitting the divergence term in the second-order abc, or computing it nonrigorously without enforcing the normal continuity of the fields, yields approximately the same error [8]. on the other hand, a method which rigorously implements the second-order abc on small curved tetrahedra, while preserving the symmetry of the system, has recently been proposed in [9]. however, this method employs auxiliary variables, thus mandating significant changes in the existing fem code.

in the open literature there appear to be no analyses of the second-order abc performance in coarse large-domain fem meshes, although fine meshes and small elements are really not required at the abs, which is typically moved away from the analyzed structure and resides in a homogeneous free space. the em field is usually not changing rapidly at the abs, hence the advantages of large-domain modeling can be fully exploited. with the above in mind, we proposed that large-domain discretization utilizing curved elements whose edges are up to 2λ long, coupled with truly higher-order (e.g., up to the 10th order) polynomial field expansion, can be efficiently used in the abs tessellation.
the number of edges shared by faces of adjacent finite elements at the abs is thus reduced, which can, in turn, significantly reduce the error introduced by direct computation of the required derivatives, because these edges are the sole locations where discontinuities of the normal field components actually arise when the second-order abc is implemented nonrigorously. preliminary results of the proposed method applied to a simple metallic spherical scatterer can be found in [10]. in this work we present the implementation details of the nonrigorous symmetric second-order abc applied on large curvilinear hexahedra in higher-order fem and evaluate its performance on a comprehensive set of benchmark targets, which include a metallic sphere, a dielectric cube (as an example of a penetrable structure with sharp edges and vertices), and a metallic nasa almond as a standard nontrivial benchmark target of the electromagnetic code consortium (emcc).

nonrigorous symmetric second-order abc applied to large-domain finite element modeling ... 677

2. theory and implementation

2.1. higher-order large-domain fem formulation

when solving three-dimensional (3-d) linear steady-state em problems by the fem, we first geometrically discretize the domain of interest using lagrange-type generalized curved hexahedra of arbitrary orders $K_u$, $K_v$, and $K_w$ ($K_u, K_v, K_w \ge 1$). these hexahedra are geometrically flexible and can be used for large-domain modeling of arbitrary shapes [11].
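they are analytically described by the position vector [11]

\[
\mathbf{r}(u,v,w) = \sum_{i=0}^{K_u} \sum_{j=0}^{K_v} \sum_{k=0}^{K_w} \mathbf{r}_{ijk}\, L_i^{K_u}(u)\, L_j^{K_v}(v)\, L_k^{K_w}(w), \qquad
L_i^{K_u}(u) = \prod_{\substack{l=0 \\ l \ne i}}^{K_u} \frac{u - u_l}{u_i - u_l}, \qquad -1 \le u, v, w \le 1,
\tag{1}
\]

where $\mathbf{r}_{ijk} = \mathbf{r}(u_i, v_j, w_k)$ are position vectors of interpolation nodes and $L_i^{K_u}$ represent lagrange interpolation polynomials in the u coordinate of the local parametric u-v-w coordinate system, with $u_l$ being the uniformly spaced interpolating nodes defined as $u_l = (2l - K_u)/K_u$, $l = 0, 1, \ldots, K_u$, and similarly for $L_j^{K_v}(v)$ and $L_k^{K_w}(w)$. we then solve the electric field vector wave equation within each of the finite elements [1, 3]. in every hexahedron we expand the electric field vector as

\[
\mathbf{E} = \sum_{i=0}^{N_u-1} \sum_{j=0}^{N_v} \sum_{k=0}^{N_w} \alpha_{u,ijk}\, \mathbf{f}_{u,ijk}
+ \sum_{i=0}^{N_u} \sum_{j=0}^{N_v-1} \sum_{k=0}^{N_w} \alpha_{v,ijk}\, \mathbf{f}_{v,ijk}
+ \sum_{i=0}^{N_u} \sum_{j=0}^{N_v} \sum_{k=0}^{N_w-1} \alpha_{w,ijk}\, \mathbf{f}_{w,ijk},
\tag{2}
\]

where $\mathbf{f}$ are curl-conforming (and generally div-nonconforming) hierarchical polynomial vector basis functions defined as

\[
\begin{aligned}
\mathbf{f}_{u,ijk} &= u^i P_j(v) P_k(w)\, \mathbf{a}_u^{\mathrm{r}}, \\
\mathbf{f}_{v,ijk} &= P_i(u)\, v^j P_k(w)\, \mathbf{a}_v^{\mathrm{r}}, \\
\mathbf{f}_{w,ijk} &= P_i(u) P_j(v)\, w^k\, \mathbf{a}_w^{\mathrm{r}},
\end{aligned}
\qquad
P_i(u) =
\begin{cases}
1 - u, & i = 0, \\
u + 1, & i = 1, \\
u^i - 1, & i \ge 2,\ \text{even}, \\
u^i - u, & i \ge 3,\ \text{odd},
\end{cases}
\qquad -1 \le u, v, w \le 1,
\tag{3}
\]

$N_u$, $N_v$, and $N_w$ are the adopted degrees of the polynomial approximation, which are entirely independent of the element geometrical orders $K_u$, $K_v$, and $K_w$, and $\alpha_{u,ijk}$, $\alpha_{v,ijk}$, and $\alpha_{w,ijk}$ are unknown field-distribution coefficients (to be determined by the fem). the reciprocal unitary vectors $\mathbf{a}_u^{\mathrm{r}}$, $\mathbf{a}_v^{\mathrm{r}}$, and $\mathbf{a}_w^{\mathrm{r}}$ in (3) are defined as $\mathbf{a}_u^{\mathrm{r}} = (\mathbf{a}_v \times \mathbf{a}_w)/J$, $\mathbf{a}_v^{\mathrm{r}} = (\mathbf{a}_w \times \mathbf{a}_u)/J$, and $\mathbf{a}_w^{\mathrm{r}} = (\mathbf{a}_u \times \mathbf{a}_v)/J$, where $J = (\mathbf{a}_u \times \mathbf{a}_v) \cdot \mathbf{a}_w$ is the jacobian of the covariant transformation and $\mathbf{a}_u$, $\mathbf{a}_v$, and $\mathbf{a}_w$ are unitary vectors defined as $\mathbf{a}_u = \partial\mathbf{r}/\partial u$, $\mathbf{a}_v = \partial\mathbf{r}/\partial v$, and $\mathbf{a}_w = \partial\mathbf{r}/\partial w$. by adopting higher-order polynomial field expansion [$N_u$, $N_v$, and $N_w$ in (2) can be up to 10th order], through the process of p-refinement, fes could be up to 2λ long in each direction [11].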
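the one-dimensional ingredients of (1) and (3) are easy to evaluate directly. the following is a minimal sketch, restricted to a single parametric coordinate u and using hypothetical function names (`nodes`, `lagrange`, `interpolate`, `P`); it is not the authors' implementation, only an illustration of the uniformly spaced nodes, the lagrange polynomials and the functions P_i(u).

```python
def nodes(K):
    """Uniformly spaced interpolation nodes u_l = (2l - K)/K, l = 0..K."""
    return [(2 * l - K) / K for l in range(K + 1)]


def lagrange(i, K, u):
    """Lagrange interpolation polynomial L_i^K(u) on the nodes above."""
    u_nodes = nodes(K)
    value = 1.0
    for l, u_l in enumerate(u_nodes):
        if l != i:
            value *= (u - u_l) / (u_nodes[i] - u_l)
    return value


def interpolate(points, u):
    """1-D analogue of eq. (1): r(u) = sum_i r_i L_i^K(u), K = len(points) - 1."""
    K = len(points) - 1
    dimension = len(points[0])
    return tuple(sum(points[i][d] * lagrange(i, K, u) for i in range(K + 1))
                 for d in range(dimension))


def P(i, u):
    """Polynomial factor P_i(u) from eq. (3)."""
    if i == 0:
        return 1 - u
    if i == 1:
        return u + 1
    if i % 2 == 0:
        return u ** i - 1
    return u ** i - u


# A second-order (K = 2) curved edge through three example nodes.
edge = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
print(nodes(2))
print(interpolate(edge, 0.5))
print([P(i, 1.0) for i in range(6)])
```

the functions P_i(u) with i ≥ 2 vanish at both ends u = ±1 of the parametric interval, so only the two lowest-order functions contribute at the element boundaries; this is the property that makes the hierarchical expansion convenient when enforcing tangential continuity between elements.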
applying the standard galerkin-type discretization yields the disconnected system of linear equations for each of the finite elements [1]

\[
([A] - k_0^2 [B])\,\{\alpha\} = \{G_s\},
\tag{4}
\]

where $k_0$ represents the free-space wave number and $\{\alpha\}$ is the column vector of electric field distribution coefficients from (2). the disconnected system of linear equations does not take into account the boundary conditions which the fields must satisfy on the interfaces between two adjacent fes, but considers each finite element (fe) separately. in order to facilitate implementation (and coding), matrices [A] and [B] can be represented using submatrices as in [11]

\[
[A] = \begin{bmatrix} [UUA] & [UVA] & [UWA] \\ [VUA] & [VVA] & [VWA] \\ [WUA] & [WVA] & [WWA] \end{bmatrix}, \qquad
[B] = \begin{bmatrix} [UUB] & [UVB] & [UWB] \\ [VUB] & [VVB] & [VWB] \\ [WUB] & [WVB] & [WWB] \end{bmatrix}.
\tag{5}
\]

the entries in the submatrices [UVA] and [UVB] are given as

\[
\begin{aligned}
UVA_{\hat{i}\hat{j}\hat{k},\,ijk} &= \int_V (\nabla \times \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}) \cdot \mu_{\mathrm{r}}^{-1} (\nabla \times \mathbf{f}_{v,ijk})\, \mathrm{d}V, \\
UVB_{\hat{i}\hat{j}\hat{k},\,ijk} &= \int_V \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}} \cdot \varepsilon_{\mathrm{r}}\, \mathbf{f}_{v,ijk}\, \mathrm{d}V,
\end{aligned}
\qquad
\begin{aligned}
&\hat{i} = 0, 1, \ldots, N_u - 1, \quad i = 0, 1, \ldots, N_u, \\
&\hat{j} = 0, 1, \ldots, N_v, \quad j = 0, 1, \ldots, N_v - 1, \\
&\hat{k}, k = 0, 1, \ldots, N_w,
\end{aligned}
\tag{6}
\]

where V stands for the volume of the fe and $\varepsilon_{\mathrm{r}}$ and $\mu_{\mathrm{r}}$ are the relative permittivity and permeability tensors [12, 13], respectively. the electric field expansion orders $N_u$, $N_v$, and $N_w$ in (2) are selected in accordance with the reduced-gradient criterion [14, 15] and by following the recipes in [16], which facilitate optimal higher-order computation. the remaining entries of matrices [A] and [B] are calculated in a similar manner.
analogously, the column vector $\{G_s\}$ can be represented as

\[
\{G_s\} = \begin{Bmatrix} \{UG_s\} \\ \{VG_s\} \\ \{WG_s\} \end{Bmatrix},
\tag{7}
\]

and the entries in the column vector $\{UG_s\}$ are given as

\[
UG_{s,\hat{i}\hat{j}\hat{k}} = \oint_S \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}} \cdot \left[ (\mu_{\mathrm{r}}^{-1}\, \nabla \times \mathbf{E}) \times \mathbf{n} \right] \mathrm{d}S, \qquad
\hat{i} = 0, 1, \ldots, N_u - 1, \quad \hat{j} = 0, 1, \ldots, N_v, \quad \hat{k} = 0, 1, \ldots, N_w,
\tag{8}
\]

where S stands for the boundary surface of an element, $\mathbf{E}$ is the electric field vector at S (generally not known in advance) and $\mathbf{n}$ is the unit normal on S pointing outwards of the element. the remaining entries of the column vector $\{G_s\}$ are calculated in a similar manner.

the connected system of linear equations [1] is then assembled from (4), and the surface integrals in $\{G_s\}$ [as in (8)] are calculated only at the outer boundary of the fem domain, and not at the boundary of each element [3]. the connected system of linear equations takes into account natural boundary conditions, i.e., tangential continuity of electric fields (explicitly) and magnetic fields (implicitly), which must be satisfied at the interfaces between finite elements. consequently, $\{G_s\}$ is calculated only at the outer fem domain boundary; thus it represents a natural connection (interface) between the fem domain and the surrounding space.

finally, to obtain a well-defined numerical problem, appropriate em field boundary conditions must be imposed at the outer fem boundary. these boundary conditions can be (i) exact and nonlocal, as in the hybrid finite element method-method of moments (fem-mom) [17], (ii) exact and local, when the fem domain is surrounded by a perfect electric conductor (pec) or a perfect magnetic conductor (pmc), or (iii) approximate and local, e.g., when em field propagation through free space, far from em sources and media discontinuities, is approximated by an abc placed relatively close to the scatterer.
the local boundary conditions do not reduce sparsity in the final system of linear equations, which is a highly desirable property [18, 19] and one of the strongest benefits of the fem compared to the mom.

2.2. symmetric second-order absorbing boundary condition

consider an em scatterer (or, generally, em field sources) occupying a finite volume, surrounded by free space and illuminated by an incident em field ($\mathbf{E}^{\mathrm{inc}}$ and $\mathbf{H}^{\mathrm{inc}}$), as shown in fig. 1. in most cases the incident em field is a uniform plane wave, but the theory presented here applies to the general case as well. let $S_{\mathrm{ABC}}$ be a fictitious spherical surface of radius $r_{\mathrm{ABC}}$, centered at the origin and surrounding the scatterer. we truncate the fem computational domain by applying the abc at $S_{\mathrm{ABC}}$. the symmetric second-order abc (symmetric in the sense that it results in a symmetric system of linear equations), obtained by approximating the term $\mathbf{i}_r \times (\nabla \times \mathbf{E}^{\mathrm{sc}})$ using the wilcox expansion [20] and given as [6]

$$\mathbf{i}_r \times (\nabla \times \mathbf{E}^{\mathrm{sc}}) = -\mathrm{j}k_0\, \mathbf{i}_r \times (\mathbf{i}_r \times \mathbf{E}^{\mathrm{sc}}) + \frac{r_{\mathrm{ABC}}}{2(1 + \mathrm{j}k_0 r_{\mathrm{ABC}})} \nabla \times \left[\mathbf{i}_r\, \mathbf{i}_r \cdot (\nabla \times \mathbf{E}^{\mathrm{sc}})\right] + \frac{r_{\mathrm{ABC}}}{2(1 + \mathrm{j}k_0 r_{\mathrm{ABC}})} \nabla_t (\nabla_t \cdot \mathbf{E}_t^{\mathrm{sc}}), \tag{9}$$

will be applied at $S_{\mathrm{ABC}}$, where $\mathbf{E}^{\mathrm{sc}} = \mathbf{E} - \mathbf{E}^{\mathrm{inc}}$ represents the scattered electric field, $\mathbf{i}_r$ is the radial unit vector of the spherical coordinate system, $t$ in subscripts denotes the tangential (to $S_{\mathrm{ABC}}$) part of a vector or of the gradient operator, and j is the imaginary unit.

fig. 1 with the analysis of open em problems using abc.

note that for the connected system of linear equations, the surface integrals in $\{G_s\}$ are calculated (only) over the entire outer fem domain boundary $S_{\mathrm{ABC}}$, and that they vanish at the junction of two finite elements. on the other hand, the basis and testing functions appearing in the integrals are taken locally, from a specific element, as the integration progresses.
the terms in the surface integrals in $\{G_s\}$ [as in (8)] can be rearranged for easier implementation of the second-order abc (9) as

$$UG_{s,\hat{i}\hat{j}\hat{k}} = \oint_{S_{\mathrm{ABC}}} \left[\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}} \times \left(\bar{\bar{\mu}}_r^{-1}\, \nabla \times \mathbf{E}\right)\right] \cdot \mathbf{n}\, \mathrm{d}S = -\oint_{S_{\mathrm{ABC}}} \left[\mathbf{i}_r \times (\nabla \times \mathbf{E})\right] \cdot \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}\, \mathrm{d}S, \tag{10}$$

since $\mathbf{n} = \mathbf{i}_r$ and $\bar{\bar{\mu}}_r^{-1} = [I]$ at $S_{\mathrm{ABC}}$, with $[I]$ being the identity matrix. applying (10) and imposing the second-order abc (9), the system of linear equations (4) becomes

$$\left([A] - k_0^2 [B] + \mathrm{j}k_0 [S]\right)\{\alpha\} = \{G_s^{\mathrm{ABC}}\}. \tag{11}$$

matrix $[S]$ in (11) is the sum of three parts: the part corresponding to the first-order abc, the part corresponding to the second-order abc which absorbs transverse electric (te) spherical modes, and the part corresponding to the second-order abc which absorbs transverse magnetic (tm) spherical modes [6, 10]. in matrix notation this can be written as

$$[S] = [S^{\mathrm{1ABC}}] - \frac{r_{\mathrm{ABC}}}{2k_0(k_0 r_{\mathrm{ABC}} - \mathrm{j})} \left([S^{\mathrm{2ABC,TE}}] - [S^{\mathrm{2ABC,TM}}]\right), \tag{12}$$

where the superscripts identify the three parts listed above. analogously to (5), matrix $[S]$ can be represented using submatrices, namely

$$[S] = \begin{bmatrix} [UUS] & [UVS] & [UWS] \\ [VUS] & [VVS] & [VWS] \\ [WUS] & [WVS] & [WWS] \end{bmatrix}, \tag{13}$$

where the entries in the submatrix $[UVS]$, for example, are given [in accordance with (12)] as

$$UVS_{\hat{i}\hat{j}\hat{k},ijk} = UVS^{\mathrm{1ABC}}_{\hat{i}\hat{j}\hat{k},ijk} - \frac{r_{\mathrm{ABC}}}{2k_0(k_0 r_{\mathrm{ABC}} - \mathrm{j})} \left(UVS^{\mathrm{2ABC,TE}}_{\hat{i}\hat{j}\hat{k},ijk} - UVS^{\mathrm{2ABC,TM}}_{\hat{i}\hat{j}\hat{k},ijk}\right), \tag{14}$$

and analogously for all other submatrices in (13).
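the scalar factors in (9) and (12) are related by a simple identity, which can be checked numerically. the sketch below assumes that the factor in (12) has the magnitude $r_{\mathrm{ABC}} / [2k_0(k_0 r_{\mathrm{ABC}} - \mathrm{j})]$ and that the factor in (9) is $r_{\mathrm{ABC}} / [2(1 + \mathrm{j}k_0 r_{\mathrm{ABC}})]$; the function names are illustrative and the sample values ($\lambda_0$ = 1 m, $r_{\mathrm{ABC}}$ = 1.5 m) are taken from the first numerical example.

```python
import math


def abc_condition_factor(k0, r_abc):
    """Factor multiplying the second-order terms in the ABC (9)."""
    return r_abc / (2.0 * (1.0 + 1j * k0 * r_abc))


def abc_matrix_factor(k0, r_abc):
    """Magnitude-carrying factor of the second-order parts of [S], as assumed for (12)."""
    return r_abc / (2.0 * k0 * (k0 * r_abc - 1j))


if __name__ == "__main__":
    k0 = 2.0 * math.pi / 1.0   # free-space wave number for a 1 m wavelength
    r_abc = 1.5                # radius of the spherical ABC surface in metres
    lhs = 1j * k0 * abc_matrix_factor(k0, r_abc)   # [S] enters (11) multiplied by j*k0
    rhs = -abc_condition_factor(k0, r_abc)
    print(abs(lhs - rhs))      # difference is at round-off level
```

multiplying the factor of (12) by $\mathrm{j}k_0$, as happens in (11), therefore reproduces the factor of (9) with the opposite sign, so the two ways of writing the coefficient are interchangeable.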
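the entries corresponding to the first-order abc, the te part of the second-order abc, and the tm part of the second-order abc, respectively, are calculated as

$$UVS^{\mathrm{1ABC}}_{\hat{i}\hat{j}\hat{k},ijk} = \oint_{S_{\mathrm{ABC}}} (\mathbf{i}_r \times \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}) \cdot (\mathbf{i}_r \times \mathbf{f}_{v,ijk})\, \mathrm{d}S,$$

$$UVS^{\mathrm{2ABC,TE}}_{\hat{i}\hat{j}\hat{k},ijk} = \oint_{S_{\mathrm{ABC}}} \left[\mathbf{i}_r \cdot (\nabla \times \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\right] \left[\mathbf{i}_r \cdot (\nabla \times \mathbf{f}_{v,ijk})\right] \mathrm{d}S,$$

$$UVS^{\mathrm{2ABC,TM}}_{\hat{i}\hat{j}\hat{k},ijk} = \oint_{S_{\mathrm{ABC}}} (\nabla_t \cdot \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}) (\nabla_t \cdot \mathbf{f}_{v,ijk})\, \mathrm{d}S,$$

$$\hat{i} = 0,1,\ldots,N_u-1, \quad i = 0,1,\ldots,N_u, \quad \hat{j} = 0,1,\ldots,N_v, \quad j = 0,1,\ldots,N_v-1, \quad \hat{k},k = 0,1,\ldots,N_w. \tag{15}$$

the column vector $\{G_s^{\mathrm{ABC}}\}$ in (11) can be written in the form shown in (7), with the addition of the superscript “abc” to distinguish the column vectors in (4) and (11). hence, similarly to (12), the column vector $\{G_s^{\mathrm{ABC}}\}$ can be represented as the sum of the part corresponding to the first-order abc, the te part of the second-order abc, and the tm part of the second-order abc, respectively, as

$$\{G_s^{\mathrm{ABC}}\} = \{G_s^{\mathrm{1ABC}}\} + \{G_s^{\mathrm{2ABC,TE}}\} + \{G_s^{\mathrm{2ABC,TM}}\}. \tag{16}$$

the entries in the column vector $\{UG_s^{\mathrm{ABC}}\}$, for example, are given as

$$UG^{\mathrm{1ABC}}_{s,\hat{i}\hat{j}\hat{k}} = \oint_{S_{\mathrm{ABC}}} \left[(\mathbf{i}_r \times \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}) \cdot (\nabla \times \mathbf{E}^{\mathrm{inc}}) + \mathrm{j}k_0 (\mathbf{i}_r \times \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}) \cdot (\mathbf{i}_r \times \mathbf{E}^{\mathrm{inc}})\right] \mathrm{d}S,$$

$$UG^{\mathrm{2ABC,TE}}_{s,\hat{i}\hat{j}\hat{k}} = \frac{r_{\mathrm{ABC}}}{2(1 + \mathrm{j}k_0 r_{\mathrm{ABC}})} \oint_{S_{\mathrm{ABC}}} \left[\mathbf{i}_r \cdot (\nabla \times \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\right] \left[\mathbf{i}_r \cdot (\nabla \times \mathbf{E}^{\mathrm{inc}})\right] \mathrm{d}S,$$

$$UG^{\mathrm{2ABC,TM}}_{s,\hat{i}\hat{j}\hat{k}} = -\frac{r_{\mathrm{ABC}}}{2(1 + \mathrm{j}k_0 r_{\mathrm{ABC}})} \oint_{S_{\mathrm{ABC}}} (\nabla_t \cdot \mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}) (\nabla_t \cdot \mathbf{E}_t^{\mathrm{inc}})\, \mathrm{d}S,$$

$$\hat{i} = 0,1,\ldots,N_u-1, \quad \hat{j} = 0,1,\ldots,N_v, \quad \hat{k} = 0,1,\ldots,N_w, \tag{17}$$

and analogously for the remaining entries in $\{G_s^{\mathrm{ABC}}\}$.

2.3. computation of the surface integrals appearing in the symmetric second-order absorbing boundary condition applied to curvilinear elements

consider the surface integrals appearing in (11) when computing the entries in $[S]$ and $\{G_s^{\mathrm{ABC}}\}$.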
the utilized basis and testing functions are curl-conforming and generally div-nonconforming, hence the divergences in the tm parts of (15) and (17), and all similar terms, cannot be expressed in closed form. moreover, as already discussed, these surface integrals are calculated over the entire $S_{\mathrm{ABC}}$ surface; in other words, they are calculated not only over the finite element surfaces belonging to $S_{\mathrm{ABC}}$, but across the junctions (edges between the elements) as well. since the basis and testing functions possess only tangential continuity, this results in the appearance of squares of delta-functions ($\delta^2$) in the kernels of the surface-integral terms at all edges enveloping the surfaces of the finite elements belonging to $S_{\mathrm{ABC}}$ [9]. in order to rigorously treat the divergence of the basis and testing functions at the edges of elements over $S_{\mathrm{ABC}}$, the basis and testing functions must be adapted to enforce the normal continuity of the em field over $S_{\mathrm{ABC}}$ [8], or additional auxiliary (scalar) variables need to be introduced as in [9]. nevertheless, since the utilized higher-order polynomial basis and testing functions are continuous and differentiable over the fe faces, their divergence can be readily calculated numerically. for example, from (3) it follows that the divergence of $\mathbf{f}_{u,ijk}$ is given as

$$\begin{aligned} \nabla \cdot \mathbf{f}_{u,ijk} ={}& i\,u^{i-1} P_j(v) P_k(w)\, (\mathbf{a}_u^{r} \cdot \mathbf{a}_u^{r}) + \frac{1}{J}\, u^i P_j(v) P_k(w)\, \frac{\partial \left[J (\mathbf{a}_u^{r} \cdot \mathbf{a}_u^{r})\right]}{\partial u} \\ &+ u^i\, \frac{\partial P_j(v)}{\partial v}\, P_k(w)\, (\mathbf{a}_u^{r} \cdot \mathbf{a}_v^{r}) + \frac{1}{J}\, u^i P_j(v) P_k(w)\, \frac{\partial \left[J (\mathbf{a}_u^{r} \cdot \mathbf{a}_v^{r})\right]}{\partial v} \\ &+ u^i P_j(v)\, \frac{\partial P_k(w)}{\partial w}\, (\mathbf{a}_u^{r} \cdot \mathbf{a}_w^{r}) + \frac{1}{J}\, u^i P_j(v) P_k(w)\, \frac{\partial \left[J (\mathbf{a}_u^{r} \cdot \mathbf{a}_w^{r})\right]}{\partial w}. \end{aligned} \tag{18}$$

the partial derivatives in (18) are calculated numerically utilizing the symmetric finite difference. for example,

$$\frac{\partial \left[J (\mathbf{a}_u^{r} \cdot \mathbf{a}_v^{r})\right]}{\partial v} \approx \frac{\left[J (\mathbf{a}_u^{r} \cdot \mathbf{a}_v^{r})\right]_{v' = v + \mathrm{d}v} - \left[J (\mathbf{a}_u^{r} \cdot \mathbf{a}_v^{r})\right]_{v' = v - \mathrm{d}v}}{2\, \mathrm{d}v}, \tag{19}$$

where $\mathrm{d}v$ is a numerical-differentiation step.
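the symmetric finite difference in (19) can be sketched in a few lines. the function `g` below is only a stand-in for a geometry-dependent quantity such as $J(\mathbf{a}_u^{r} \cdot \mathbf{a}_v^{r})$ evaluated along the $v$ coordinate; its polynomial form and the step size are assumptions chosen for illustration, not values from the paper.

```python
def central_difference(g, v, dv=1e-4):
    """Symmetric (central) finite-difference estimate of dg/dv at v, as in (19)."""
    return (g(v + dv) - g(v - dv)) / (2.0 * dv)


def g(v):
    """Hypothetical smooth stand-in for a term such as J * (a_u . a_v)."""
    return (1.0 + 0.25 * v * v) * v


if __name__ == "__main__":
    v0 = 0.3
    exact = 1.0 + 0.75 * v0 * v0          # analytical derivative of the stand-in
    approx = central_difference(g, v0)
    print(abs(approx - exact))            # error scales with dv squared
```

the truncation error of the symmetric difference is proportional to the square of the step, which is why a moderately small step already gives derivatives accurate enough for smooth element geometries.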
since these divergences are computed only at the fem domain-truncation boundary $S_{\mathrm{ABC}}$, numerical differentiation adds minimally to the complexity of the overall algorithm, and the computation time for the surface integrals in $[S]$ and $\{G_s^{\mathrm{ABC}}\}$ is almost negligible compared to the computation time for the fem volume integrals appearing in matrices $[A]$ and $[B]$. the procedure is similar when the divergence is calculated for the functions $\mathbf{f}_{v,ijk}$ and $\mathbf{f}_{w,ijk}$.

3. numerical results and discussion

3.1. pec spherical scatterer

as the first numerical example, consider a pec spherical scatterer of radius $a$ = 1 m. the scatterer is situated in free space, with permittivity $\varepsilon_0$ and permeability $\mu_0$, and illuminated by a time-harmonic plane wave of free-space wavelength $\lambda_0$ = 1 m ($f$ = 299.792 mhz), as shown in fig. 2 (a). when constructing the numerical model, the infinite free space surrounding the scatterer is truncated at the artificial spherical boundary $S_{\mathrm{ABC}}$, of radius $b$ = 1.5 m, where the nonrigorous symmetric second-order abc is imposed. the normalized thickness of the free-space layer between the scatterer and $S_{\mathrm{ABC}}$ is $(b - a)/\lambda_0 = 0.5$, and it is meshed by only six cushion-like triquadratic curved hexahedral fes.

[fig. 2 (b) plot: normalized l² error norm, l² norm(Δbircs) / l² norm(miebircs), and number of unknowns versus expansion order n; curves: 1st ord. abc; 1st ord. abc with the te second-order terms; 1st ord. abc with the tm second-order terms; nonrigorous 2nd ord. abc; unknowns fem-abc]

fig. 2 (a) large-domain fem-abc model of a pec spherical scatterer. (b) normalized l² error norm of the computed bistatic rcs for the pec spherical scatterer and the number of unknowns.

first, we will consider far field results. a bistatic radar cross section (rcs) of the scatterer is computed by the proposed fem-abc technique.
the order of the polynomial expansion of the electric field for all fes and in all directions is $N_u = N_v = N_w = n$. numerical integration is performed by means of the 13th-order gauss-legendre quadrature. the bistatic rcs is computed in all directions uniformly (from $\theta_{\mathrm{start}} = 0^\circ$ to $\theta_{\mathrm{stop}} = 180^\circ$ with the resolution of $\Delta\theta = 5^\circ$, and from $\phi_{\mathrm{start}} = 0^\circ$ to $\phi_{\mathrm{stop}} = 360^\circ$ with the resolution of $\Delta\phi = 5^\circ$), and its error (with respect to the analytical mie’s series solution) is calculated as a normalized $L^2$ norm

$$\frac{L^2\,\mathrm{norm}(\Delta \mathrm{biRCS})}{L^2\,\mathrm{norm}(\mathrm{Mie/MoM\,biRCS})} = \sqrt{\frac{\displaystyle\sum_{\theta = 0^\circ}^{180^\circ} \sum_{\phi = 0^\circ}^{360^\circ} \left[\mathrm{FEMbiRCS}(\theta,\phi) - \mathrm{Mie/MoM\,biRCS}(\theta,\phi)\right]^2}{\displaystyle\sum_{\theta = 0^\circ}^{180^\circ} \sum_{\phi = 0^\circ}^{360^\circ} \left[\mathrm{Mie/MoM\,biRCS}(\theta,\phi)\right]^2}}, \tag{20}$$

where fembircs stands for the numerical solution for the bistatic rcs obtained by the proposed fem-abc technique and miebircs stands for the analytical (reference) results in the form of mie’s series. in the following subsection, when the analytical miebircs solution is not available, the results obtained by mom, denoted as mombircs, will be used as a reference, as indicated in (20).

in fig. 2 (b) numerical results are compared for the first-order and the nonrigorous second-order abc, along with results for the first-order abc with only one term included from the nonrigorous second-order abc [$\{G_s^{\mathrm{2ABC,TE}}\}$, $[S^{\mathrm{2ABC,TE}}]$ and $\{G_s^{\mathrm{2ABC,TM}}\}$, $[S^{\mathrm{2ABC,TM}}]$ from (12) and (16)]. to validate the convergence of the method with p-refinement, the solutions are obtained for various orders $n$, ranging from $n$ = 1 to $n$ = 9. from fig. 2 (b) it can be concluded that, although it is not implemented rigorously and does not contribute independently to the accuracy of the solution, the tm part of the symmetric second-order abc, together with the te part, synergistically contributes to the overall solution accuracy.
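the error measure (20) and the angular sampling described above can be sketched as follows. this is a minimal illustration, assuming the rcs samples are stored as nested lists indexed by θ and then φ; the function names and the toy data are hypothetical and do not come from the paper.

```python
import math


def angle_grid(step_deg=5):
    """Uniform sampling: theta from 0 to 180 degrees, phi from 0 to 360 degrees."""
    thetas = list(range(0, 181, step_deg))
    phis = list(range(0, 361, step_deg))
    return thetas, phis


def normalized_l2_error(computed, reference):
    """Normalized L2 error norm in the spirit of (20).

    computed, reference: nested lists rcs[theta_index][phi_index].
    """
    num = 0.0
    den = 0.0
    for row_c, row_r in zip(computed, reference):
        for c, r in zip(row_c, row_r):
            num += (c - r) ** 2
            den += r ** 2
    return math.sqrt(num / den)


if __name__ == "__main__":
    thetas, phis = angle_grid()
    # toy reference pattern and a computed pattern that is 1 % too large everywhere
    ref = [[1.0 + math.cos(math.radians(t)) ** 2 for _ in phis] for t in thetas]
    fem = [[1.01 * value for value in row] for row in ref]
    print(normalized_l2_error(fem, ref))  # close to 0.01
```

a uniform relative deviation of the computed pattern maps directly onto the value of the norm, which makes the measure easy to interpret when comparing abc variants.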
in addition, due to the very coarse mesh in this example, the fem solution becomes sufficiently accurate for $7 \le n \le 9$, with $n$ = 8 yielding the lowest error, which is consistent with the results reported in [16]. moreover, the lowest errors obtained with the proposed large-domain fem with the nonrigorous second-order abc are of the same order of magnitude as those reported in the first example in [9], where the same scatterer was analyzed utilizing the rigorously implemented second-order abc. in this example the nonrigorous second-order abc performs significantly better in the far field than the first-order abc, and for $n$ = 8 the solution error is 2.7 times lower than that obtained with the first-order abc. note that this error difference is even greater (8.8 times in favor of the nonrigorous second-order abc) when the abc is set closer to the scatterer, i.e., when $(b - a)/\lambda_0 = 0.1$, as reported in [10].

far fields, and related derived parameters, are less sensitive to computational errors than near fields. therefore, in order to obtain a more rigorous and complete validation of the proposed fem-abc technique, we next analyze the accuracy of the computed near field of the presented pec spherical scatterer. using the mesh from fig. 2 (a) and setting $n$ = 8 (for all elements in all directions), we compute the near electric field numerically and analytically and show the comparison of the obtained results in fig. 3. shown in fig. 3 is the magnitude of the x-component of the total electric field, in the $x = 0$ plane, obtained (a) analytically (mie’s series solution) and numerically using (b) the first-order abc and (c) the proposed second-order abc. the incident electric field is $\mathbf{E}^{\mathrm{inc}} = 1\,\mathbf{i}_x$ [v/m] ($\mathbf{i}_x$ being the cartesian unit vector in the x-direction), traveling in the z-direction, as shown in fig. 3 (d). in figs.
3 (e) and (f) the errors of the electric field computed by the fem (relative to the reference mie’s series solution) for the first-order and second-order abc models are plotted, respectively. the error is calculated as

$$\Delta E_x = \sqrt{\left(E_{x,\mathrm{FEM}}^{\mathrm{re}} - E_{x,\mathrm{Mie}}^{\mathrm{re}}\right)^2 + \left(E_{x,\mathrm{FEM}}^{\mathrm{im}} - E_{x,\mathrm{Mie}}^{\mathrm{im}}\right)^2},$$

where $E_{x,\mathrm{FEM}}$ and $E_{x,\mathrm{Mie}}$ are the x-components of the electric fields obtained numerically and analytically, respectively, and re and im stand for the real and imaginary parts of the complex quantities, respectively.

fig. 3 near field results for the pec spherical scatterer from fig. 2 obtained (a) analytically and numerically using (b) the first-order and (c) the proposed second-order abc. (d) large-domain fem-abc model of a pec spherical scatterer with illustrated incident field. electric field error (relative to the reference mie’s series solution) for (e) the first-order and (f) the proposed second-order abc.

from fig. 3, it can be concluded that the proposed second-order abc significantly outperforms the first-order abc. the results obtained using the nonrigorous second-order abc are more accurate than those using the first-order abc in the complete $x = 0$ plane, and especially for z > 0. note that, due to symmetry, the remaining two cartesian components of the electric field vanish in the $x = 0$ plane ($E_y = 0$, $E_z = 0$), hence they are not shown. also, note that other field components in different planes exhibit similar errors, hence they are not shown here for brevity. in addition, the errors in the near field can be further reduced by employing p-refinement.

3.2. dielectric cubical scatterer

as the second numerical example, consider a dielectric cubical scatterer with relative permittivity $\varepsilon_r = 2.25$ and relative permeability $\mu_r = 1$, of edge length $a$ = 2 m.
the scatterer is situated in free space and illuminated by a time-harmonic plane wave of free-space wavelength $\lambda_0$ = 2 m ($f$ = 149.896 mhz), as shown in fig. 4 (a). when constructing the numerical model, the infinite free space surrounding the scatterer is truncated at the artificial spherical boundary $S_{\mathrm{ABC}}$, of radius $b$ = 2 m, where the nonrigorous symmetric second-order abc is imposed. the free space between the scatterer and $S_{\mathrm{ABC}}$ is again meshed by only six cushion-like triquadratic curved hexahedral fes, and the dielectric scatterer is meshed by only one trilinear fe. the minimal normalized distance between the scatterer and $S_{\mathrm{ABC}}$ is $(b - 0.5\sqrt{3}\,a)/\lambda_0 \approx 0.13$ and the maximal normalized distance is $(b - 0.5a)/\lambda_0 = 0.5$.

[fig. 4 (b) plot: normalized l² error norm, l² norm(Δbircs) / l² norm(mombircs), and number of unknowns versus expansion order n; curves: 1st ord. abc; 1st ord. abc with the te second-order terms; 1st ord. abc with the tm second-order terms; nonrigorous 2nd ord. abc; unknowns fem-abc]

fig. 4 (a) large-domain fem-abc model of a dielectric cubical scatterer. (b) normalized l² error norm of the computed bistatic rcs for the dielectric cubical scatterer and the number of unknowns.

the normalized l² error norm of the computed bistatic rcs for the cubical scatterer is calculated as discussed in subsection 3.1 and shown in fig. 4 (b). the error is calculated with respect to the fully converged mom solutions obtained by the wipl-d software [21]. numerical parameters regarding the field expansion and integration in the fem model are kept the same as in the previous example. based on fig. 4 (b), it can be concluded that the nonrigorously implemented tm part of the second-order abc independently contributes to the quality of the solutions and that, together with the te part of the second-order abc, both parts synergistically contribute to the overall solution accuracy. in this example, the nonrigorous second-order abc performs significantly better than the first-order abc, and for $n$ = 7 the error obtained using the second-order abc is 5.6 times smaller than that for the first-order abc.

3.3. pec nasa almond scatterer

as the last example, consider a pec nasa almond scatterer, which is one of the standard benchmarks of the emcc. the nasa almond is geometrically described by the parametric equations given above fig. 2 in [22]. the almond of length $d$ = 252.37 mm (parameter $d$ from the equations in [22]), situated in free space and illuminated by a horizontally and vertically (in the $\theta = 90^\circ$ plane) polarized incident em field at the operating frequency $f$ = 1.19 ghz ($\lambda_0 \approx 252$ mm), will be considered, as shown in fig. 5.

fig. 5 pec nasa almond scatterer.

the higher-order fem-abc model of the pec nasa almond scatterer consists of 96 triquadratic large-domain lagrange-type fes. these fes model the free space between the almond and the spherical surface $S_{\mathrm{ABC}}$, where the nonrigorous symmetric second-order abc is applied. the radius of $S_{\mathrm{ABC}}$ is $b$ = 220 mm. the minimum and maximum distances from the almond to $S_{\mathrm{ABC}}$ are $0.373\lambda_0$ and $0.801\lambda_0$, respectively, and the field expansion orders are set to $n$ = 6 (for all finite elements and in all directions), which results in 62220 unknown field distribution coefficients. using the proposed nonrigorous second-order abc coupled with the large-domain higher-order fem technique, the monostatic rcs in the horizontal plane ($\theta = 90^\circ$, $0^\circ \le \phi \le 180^\circ$) is computed. the results are compared with results obtained by the mom technique [21, 23] for both horizontal and vertical incident field polarizations, and shown in fig. 6.

[fig. 6 plots (a), (b): monostatic rcs [dbm²] versus φ from 0° to 180°, at θ = 90°; curves: wipl-d; feko; higher order fem-abc with $N_u = N_v = N_w = 6$, 62220 unknowns, nonrigorous 2nd ord. abc]

fig. 6 computed monostatic rcs of the pec nasa almond from fig. 5 for the (a) horizontal and (b) vertical incident field polarization; comparison of the proposed fem-abc and two mom results obtained by wipl-d [21] and feko [23] software.

from fig. 6 it can be concluded that a very good agreement between the fem-abc and mom results is achieved in all directions, and that scatterers of relatively complex shapes can also be accurately analyzed by the proposed fem-abc method.

4. conclusions

we have presented, implemented, and validated by representative numerical experiments a nonrigorous symmetric second-order abc in combination with the large-domain higher-order fem technique for frequency-domain em scattering analysis. in the proposed method, the abc is implemented nonrigorously, without imposing the normal field continuity and without introducing additional variables. the required divergence of the nonconformal field components is computed numerically on the faces of the elements belonging to the abc surface, using simple finite differences. numerical experiments have shown that the nonrigorous second-order abc performs significantly better than the first-order abc and that the results of the proposed method agree very well with highly accurate reference numerical solutions. moreover, the examples have shown that the errors in the computation of the rcs can be significantly lower if the divergence term is included in the abc, as described, than if it is omitted. this conclusion is in contrast with results reported thus far in the literature, where examples with small-domain fem meshes have been utilized exclusively.
finally, the examples with a dielectric cubical scatterer and the nasa almond have shown that the proposed method can be successfully applied in the analysis of scatterers with sharp edges and tips.

acknowledgement: this work was supported by the serbian ministry of science and technological development under grant tr-32005.

references

[1] p. p. silvester and r. l. ferrari, finite elements for electrical engineers, 3rd ed. new york: cambridge university press, 1996.
[2] j. l. volakis, a. chatterjee, and l. c. kempel, finite element method for electromagnetics (antennas, microwave circuits, and scattering applications), 1st ed. new york: ieee press, 1998.
[3] j.-m. jin, the finite element method in electromagnetics. hoboken, new jersey: john wiley & sons, 2014.
[4] j.-m. jin and d. j. riley, finite element analysis of antennas and arrays, 1st ed. hoboken, new jersey: wiley-ieee press, 2009.
[5] j. p. webb and v. n. kanellopoulos, "absorbing boundary conditions for the finite element solution of the vector wave equation," microwave and optical technology letters, vol. 2, no. 10, pp. 370-372, october 1989.
[6] a. f. peterson, "accuracy of 3-d radiation boundary conditions for use with the vector helmholtz equation," ieee transactions on antennas and propagation, vol. 40, no. 3, pp. 351-355, march 1992.
[7] v. n. kanellopoulos and j. p. webb, "3d finite element analysis of a metallic sphere scatterer: comparison of first and second order vector absorbing boundary conditions," journal de physique iii, vol. 3, no. 3, pp. 563-572, march 1993.
[8] v. n. kanellopoulos and j. p. webb, "the importance of the surface divergence term in the finite element-vector absorbing boundary condition method," ieee transactions on microwave theory and techniques, vol. 43, no. 9, pp. 2168-2170, september 1995.
[9] m. m. botha and d. b.
davidson, "rigorous, auxiliary variable-based implementation of a second-order abc for the vector fem," ieee transactions on antennas and propagation, vol. 54, no. 11, pp. 3499-3504, november 2006.
[10] s. v. savić, b. m. notaroš, and m. m. ilić, "accuracy analysis of the nonrigorous second-order absorbing boundary condition applied to large curved finite elements," in 2015 international conference on electromagnetics in advanced applications (iceaa), turin, italy, 2015, pp. 58-61.
[11] m. m. ilić and b. m. notaroš, "higher order hierarchical curved hexahedral vector finite elements for electromagnetic modeling," ieee transactions on microwave theory and techniques, vol. 51, no. 3, pp. 1026-1033, march 2003.
[12] s. v. savić, a. b. manić, m. m. ilić, and b. m. notaroš, "efficient higher order full-wave numerical analysis of 3-d cloaking structures," plasmonics, vol. 8, no. 2, pp. 455-463, june 2013.
[13] s. v. savić, b. m. notaroš, and m. m. ilić, "conformal cubical 3d transformation-based metamaterial invisibility cloak," journal of the optical society of america a, vol. 30, no. 1, pp. 7-12, january 2013.
[14] j. c. nedelec, "mixed finite elements in r3," numerische mathematik, vol. 35, no. 3, pp. 315-341, september 1980.
[15] j. c. nedelec, "a new family of mixed finite elements in r3," numerische mathematik, vol. 50, no. 1, pp. 57-81, january 1986.
[16] e. m. klopf, n. j. šekeljić, m. m. ilić, and b. m. notaroš, "optimal modeling parameters for higher order mom-sie and fem-mom electromagnetic simulations," ieee transactions on antennas and propagation, vol. 60, no. 6, pp. 2790-2801, june 2012.
[17] m. m. ilić, m. djordjević, a. ž. ilić, and b. m. notaroš, "higher order hybrid fem-mom technique for analysis of antennas and scatterers," ieee transactions on antennas and propagation, vol. 57, no. 5, pp. 1452-1460, may 2009.
[18] g. strang, linear algebra and its applications, 4th ed. brooks cole, 2005.
[19] g. strang, introduction to linear algebra, 4th ed.
wellesley, ma: wellesley cambridge press, 2009.
[20] c. h. wilcox, "an expansion theorem for electromagnetic fields," communications on pure and applied mathematics, vol. 9, no. 2, pp. 115-134, may 1956.
[21] "wipl-d pro," v11.0, wipl-d d.o.o., 2013. available: http://www.wipld.com.
[22] a. c. woo, h. t. g. wang, m. j. schuh, and m. l. sanders, "benchmark radar targets for the validation of computational electromagnetics programs," ieee antennas and propagation magazine, vol. 35, no. 1, pp. 84-89, february 1993.
[23] "feko," altair development s.a. (pty) ltd, 2011. available: http://feko.info/applications/rcs.

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 367-381 doi: 10.2298/fuee1603367j

from intelligent web of things to social web of things

nafaâ jabeur 1, hedi haddad 2
1 dept. of computer science, german university of technology in oman, oman
2 dept. of computer science, dhofar university, oman

abstract. numerous challenges, including limited resources, random mobility, and the lack of standardized communication protocols, are currently preventing a myriad of heterogeneous devices from interacting and providing web services within the context of the web of things (wot). we argue in this paper that these devices should be augmented with artificial intelligence techniques for an enhanced management of their resources and an easier construction of web applications integrating real world things (rwt). to this end, we present a new classification of the wot challenges and highlight the opportunities of embedding smartness into rwt. we also present our vision of intelligent wot by proposing a multiagent system-based architecture for intelligent web service composition. in addition, we discuss the shift of the wot toward a social wot (swot) and debate our ideas within two important scenarios, namely the intelligent vanet-wot and smart logistics.
key words: internet of things, web of things, multiagent systems, web service composition, social web of things, smart logistics

received august 30, 2015; received in revised form november 15, 2015
corresponding author: nafaâ jabeur, dept. of computer science, german university of technology in oman gutech, p.o. box 1816, pc 130, muscat, oman (e-mail: nafaa.jabeur@gutech.edu.om)
*an earlier version of this paper was presented at the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, 2015 [1].

1. introduction

continuous technological advances are bringing communication and computing technologies from large to small and tiny scales. for instance, a new range of small devices, including wireless sensor networks (wsns), are capable of acquiring and reporting data about a variety of spatial objects and events of interest, anytime and anywhere [2]. these devices have since been profiting from incessant progress in the fields of networking capabilities, mobile and pervasive computing, and miniaturization. they are no longer considered simple data collecting devices. their capabilities are, indeed, being augmented with processing and intelligent mechanisms to assess on their own their current situations and make the right decision at the right time. a new era bridging the cyber and physical worlds has then emerged with the vision to insert smartness everywhere. this era is particularly marked by the recently emerged fields of cyber physical systems [3] and the internet of things.

the internet of things (iot) could be defined as a global networking infrastructure that uses data capturing devices and communication resources to link virtual and physical objects [4]. it can, therefore, be perceived as an amalgamation of a variety of sensing, communication, and networking devices and systems in order to connect people and things with common interests.
in this configuration, anybody can efficiently access the information of any object and any service, at any time and any place, regardless of the heterogeneity of communication protocols and devices [5]. the web of things (wot) is a subset of the iot where web standards are used to seamlessly integrate and connect physical objects and information resources [6]. the emerging development of the wot is expected to offer solutions in a wide variety of domains, including transportation management, energy monitoring, logistics and supply chain management, military and rescue scenarios, and healthcare applications. this is expected to be facilitated by the increasing abundance of smart devices with web-enabled capabilities.

the wot vision particularly aims to use web protocols and technologies to allow an easy building of web applications exploiting real world things (rwt). however, due to the heterogeneity of their hardware/software specifications and capabilities, the nonhomogeneity of their data representations and quality, as well as their commonly nondeterministic mobility, rwt are facing serious problems to interoperate. these problems are increasingly challenging because of the absence of widely accepted standards. with the continuous expansion of the cyber and physical worlds toward each other as well as toward a social world, additional challenges concerning trust, privacy, and security are arising. clearly, the challenges of the wot concern several levels and issues. we believe that autonomy, flexibility, and intelligence must be integrated into any approach addressing these challenges, and we argue that techniques from the artificial intelligence field would allow the creation of efficient candidate solutions. in this perspective, a few approaches have been proposed [7][8]. however, the integration of intelligence into rwt has not been clearly investigated.

furthermore, a major success factor for the wot is driven by the prevalence of web expertise.
the internet networking infrastructure and the existing standards for data storage, visualization, and sharing are, indeed, pillars of the wot vision. nevertheless, these standards and techniques must be extended, revised, and/or revolutionized in order to meet the operational requirements of the rwt and allow them to integrate the web and mutually exchange web services. these services should be easy to publish, discover, compose, and execute. the traditional web service paradigm should then be enriched by promoting the web from both cyber and physical worlds [6]. because of their hardware and software limitations, it would be beneficial for the rwt to collaboratively provide services going beyond their individual capabilities. we then argue that these rwt should organize themselves into clusters where web-enabled devices could act as proxies allowing other rwt to connect to the internet and share their services. as the issues of service composition and clustering within the context of the wot have not been specifically investigated, we propose in this paper to address them, as well as other challenges of the wot, using a multiagent-based approach.

in the remainder of this paper, section 2 highlights existing works that have addressed the issue of web service provision in the wot. section 3 presents our categorization of the wot challenges. section 4 addresses the issue of intelligent wot, where the need for intelligent techniques is emphasized and explained. section 5 brings hints about socializing the wot. section 6 focuses on the application of our ideas to two important scenarios, namely the web of vehicular ad-hoc networks and freight transportation.

2. related work

the main challenge of the iot, and therefore of the wot, is to allow a myriad of rwt to interoperate and mutually “understand” each other.
to facilitate this interoperability, several techniques, including universal plug and play (upnp), dlna, slp, and zeroconf, have been proposed [9]. each of these techniques has individually been successful in enabling devices to communicate with each other [7]. however, in addition to not being strictly standardized, some of them are unsuited to resource-constrained devices because of their heavy protocols. thanks to the increasing integration of web-enabled capabilities, a large number of rwt are currently benefiting from the existing networking infrastructure of the internet. the wot then provides these rwt with the service and application layer needed to interoperate over http [10]. other networking infrastructures like wi-fi and ethernet open new opportunities to build additional applications and services [7]. furthermore, with the falling size of embedded systems and their growing hardware and software capabilities, it has become possible to integrate lightweight web servers into many appliances [11]. consequently, academia and the business sector are giving increasing attention to using the web as a platform for the creation of new applications that integrate rwt [10][12]. this trend has resulted in the increasing use of web services for the interoperability of rwt, particularly because of their proprietary and heterogeneous technologies [7]. the possible integration of heterogeneous rwt into the web leads to a more advanced perspective, where these things are abstracted into reusable web services, and not only viewed as simple web pages [6]. for instance, soap-based web services (ws-*) and restful apis allow rwt to offer their functionalities. restful web services are based on representational state transfer (rest) [13], which is lightweight, simple, loosely coupled, flexible, and easy to integrate into the web using the http application protocol [7].
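as a rough illustration of how the functionality of a thing might be abstracted into rest-style resources, the sketch below maps a request method and path to a json representation. the `Thing` class, the `/sensors/...` paths, and the sample readings are hypothetical and are not taken from the cited works; on a real device, an embedded http server would call a handler of this kind.

```python
import json


class Thing:
    """Hypothetical real-world thing exposing its sensors as REST-style resources."""

    def __init__(self, name, sensors):
        self.name = name
        self.sensors = dict(sensors)  # sensor id -> latest reading

    def handle(self, method, path):
        """Return (status_code, json_body), as an embedded HTTP server might."""
        if method != "GET":
            return 405, json.dumps({"error": "method not allowed"})
        parts = [p for p in path.split("/") if p]
        if parts == ["sensors"]:
            # Collection resource: links to every sensor of this thing.
            links = [f"/sensors/{sid}" for sid in sorted(self.sensors)]
            return 200, json.dumps({"thing": self.name, "links": links})
        if len(parts) == 2 and parts[0] == "sensors" and parts[1] in self.sensors:
            # Item resource: current state of a single sensor.
            return 200, json.dumps({"id": parts[1], "value": self.sensors[parts[1]]})
        return 404, json.dumps({"error": "not found"})


# Assumed sample data, for illustration only.
thing = Thing("thermostat-1", {"temperature": 21.5, "humidity": 40})
status, body = thing.handle("GET", "/sensors/temperature")
```

each sensor is addressed by its own uri and read with a plain `GET`, which is the property that lets generic web clients and mashup tools consume the thing without device-specific drivers.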
although rest-based services are being incorporated into many wot applications, a more tightly coupled service paradigm like ws-* would be more suitable, particularly where quality of service (qos) levels are strictly enforced (e.g., stock market and banking) [14]. recent developments have succeeded in embedding tiny web servers into rwt (e.g., [15][16]), especially since these servers do not need to handle large numbers of concurrent connections and requests. however, considerable research and development effort remains necessary in order to properly manage the increasing volume of demands on these servers while efficiently using the limited resources of the corresponding rwt. in the current literature, the wot has not attracted research and development attention commensurate with its value. we believe that this is due to its numerous challenges as well as the immaturity of the related processing and communication capabilities of rwt. we also believe that artificial intelligence techniques, which have proven their effectiveness in dealing with problems of highly dynamic, uncertain, and heterogeneous environments, could bring solutions to the problems of the wot. some works have integrated such techniques within the context of the iot (e.g., [17][18][19]). however, to the best of our knowledge, this has not been the case for the wot. an interesting study was proposed by zhong et al. [8], who suggested a holistic intelligence methodology called wisdom wot (w2t) for realizing "the harmonious symbiosis of humans, computers, and things in the hyper world" [8]. the methodology principally aims to implement a closed cycle that starts from things and proceeds to data, information, knowledge, wisdom, services, humans, and then back to things. this macro-level cycle is not embedded in the rwt, which are mostly considered as data collectors with networking facilities to connect to server providers.

3.
challenges of the web of things

basically, building the wot concerns ways to design and implement scalable and industry-ready iot solutions on the web. as a subset of the iot, the wot shares many characteristics with wireless sensor networks (wsns), machine-to-machine (m2m), and ubiquitous computing technologies. furthermore, the wot integrates information and physical objects, necessitating new means to model and reason about a range of context types [20]. from a design perspective, and compared to the traditional client-server architecture, the wot has a flat architecture that raises two main challenges: a) integrating the rwt into the web; and b) making the rwt provide web services capable of mutually interoperating and fusing into complex services [6]. from a general perspective, we classify the challenges of the wot into five main categories: data preprocessing and storage, data analytics, service management, networking and communication, and security, privacy, and trust (figure 1).

data preprocessing and storage. the spatially distributed rwt are generally moving in space while collecting data, anytime, anywhere, and for a variety of purposes. in doing so, they usually face problems in making appropriate use of their data. in this regard, the rwt have to identify which data is important to collect for the current situation and at which sampling frequency. the data collected should then be filtered and evaluated according to its semantics, the current context, as well as current and expected requirements. once data is cleaned and filtered, it should be stored according to appropriate representations, granularities, and quality. the abovementioned process could be performed with a convenient form of the commonly used extract, transform, load (etl) process, which is capable of merging data from different sources and creating specialized datasets for a variety of purposes.

data analytics.
once data have been transformed and fed into local embedded databases, some analytics can start. to this end, lightweight algorithms could be applied in order to perform a variety of operations, including data mining, semantics extraction, and data correlation identification. these algorithms may derive from genetic algorithms, support vector machines, decision trees, neural networks, and/or cluster analysis. they will basically be applied to the small, focused data owned by each rwt. because of the limited storage and processing capabilities of rwt, some data analytics processing would go beyond their individual capabilities. a trusted, federating entity would therefore be necessary to carry out the necessary processing within appropriate timeframes. this entity could be a rwt endowed with extended capabilities or a remote server with which the participating rwt are registered. this entity has to collect data from individual rwt and aggregate them according to specific structures and requirements. because of the increasing number of sensing devices capable of acquiring huge amounts of data, anywhere, anytime, the resulting aggregated data tends to be huge. advanced data analytics algorithms could then be thoroughly performed, leading to the potential discovery of new relevant data correlations as well as hidden communication, behavioural, mobility, and processing patterns. furthermore, in addition to increasing context-awareness while processing data, the trusted rwt executing data analytics could infer actionable information through business intelligence mechanisms. this information could particularly allow the concerned rwt to take more informed actions.
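the role of the federating entity can be sketched as follows: it merges the small datasets reported by individual rwt and then looks for a correlation that no single rwt could observe alone. the report structure, the variable names (`speed`, `density`), and the sample values are assumptions made for illustration, and pearson correlation merely stands in for the more advanced analytics mentioned above.

```python
from math import sqrt


def aggregate(reports):
    """Merge per-thing reports {thing_id: {variable: [values]}} into {variable: [values]}.

    Assumes every thing reports equally long, index-aligned lists per variable.
    """
    merged = {}
    for readings in reports.values():
        for variable, values in readings.items():
            merged.setdefault(variable, []).extend(values)
    return merged


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical reports from two registered RWT.
reports = {
    "rwt_a": {"speed": [60, 50], "density": [10, 20]},
    "rwt_b": {"speed": [40, 30], "density": [30, 40]},
}
merged = aggregate(reports)
r = pearson(merged["speed"], merged["density"])
```

with only two samples each, neither rwt could draw a meaningful conclusion on its own; after aggregation, the strongly negative coefficient between speed and density becomes visible to the federating entity.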
fig. 1 a proposed classification of the wot challenges (categories shown: data preprocessing, data analytics, service management, networking and communication, and security, trust, privacy)

service management. the rwt can be directly integrated into the web (when they have ip addresses or are ip-enabled when connected to the internet) and consequently be able to understand each other through standardized web languages. they can also be integrated indirectly into the web (e.g., sensor nodes in a wsn) for cost, energy, and security considerations [6]. this is achieved through ip-enabled rwt proxies. in both cases, the rwt should allow other devices to interoperate with them and mutually benefit from their services, which requires the abstraction of the rwt into reusable web services [7]. one or both of the w3c web service paradigms (rest-compliant web services and arbitrary web services) can be adopted. the rwt services should be generated on-the-fly or at least within appropriate timeframes [7]. although some technologies (e.g., flyport: www.openpicus.com) and research initiatives (e.g., [15]) have successfully embedded tiny web servers on mobile devices, additional research and development efforts are still needed, particularly because of the physical constraints of rwt. furthermore, the services of rwt should be published in appropriate locations with convenient mechanisms for their discovery.
in this regard, existing search engines and algorithms must be re-examined in order to allow an efficient and effective discovery of rwt services. because of the limited capabilities of rwt, service composition could be a solution, albeit a challenging one, where a group of rwt collaboratively create complex services from their individual elementary services. furthermore, although the mobility of rwt offers new opportunities for service composition, it also brings new challenges, basically because it does not guarantee a durable availability of service providers. moreover, the increasing capability of smart things to connect to the web is enabling additional flexibility and customization possibilities for end-users. following the trend of web 2.0 participatory services, especially web mashups, users are currently capable of creating new applications where rwt (e.g., home appliances) are mixed with virtual services on the web [21]. this type of application is often referred to as a physical mashup [22]. a web mashup is a special application that integrates several web resources in order to generate a new service or application. this integration is mainly performed in an opportunistic manner for the end-user’s personal use and generally for non-critical applications [23]. in addition to serving short-term needs, mashups are usually created ad hoc with well-known, lightweight web technologies (e.g., html, javascript). an example of a mashup could be an application that displays on google maps the location of all the pictures posted to flickr [21]. within the context of the wot, rwt could be used by mashups in order to create new web services. to this end, these rwt must be easy to locate through the web. in addition, they must maintain the availability of their contributions to the new services.

networking and communication. during the last decades, several technologies and standards have been proposed for smart things' communication.
the sporadic mobility of rwt makes communication difficult, especially in the context of indoor applications. with the huge variety of types and manufacturers of rwt, interoperability is a growing concern. for instance, the rwt should be able to understand each other by using well-defined communication protocols. since existing protocols, including upnp and jxta, have been neither standardized nor widely accepted for embedded devices in industry, embedded tiny web servers could be an option [6]. the unpredictable mobility of rwt intensifies the problems of their communication and increases the need for new lightweight protocols that support the identities, capabilities, and requirements of things.

trust, privacy, and security. the issues of security, privacy, and trust continue to fuel intensive research, especially within the context of large-scale, open configurations in which specialized and non-specialized parties can participate anytime, anywhere. this is also the case for the wot, where rwt can exchange and share data/services without having a firm awareness of their mutual intentions and actions. the option of embedding tiny web servers on rwt adds further security challenges. the use of rest-based interfaces makes it possible to have secure interactions using https [24]. however, the erratic configuration of the wot and the lack of standards require new security mechanisms. the use of the social web as a platform to ensure the trust and privacy of things has been advocated [25] to control web-enabled things among trusted members on social web sites [7]. however, additional research and development work is still needed toward a successful, widespread use of the wot.

4. intelligent web of things

in this section, we propose a multiagent-based architecture in order to deal with the challenges of the wot. this architecture is expected to be embedded on rwt.
we particularly focus on the issue of service composition.

4.1. need for intelligence

because of their limited capabilities, non-standardized communication protocols, unplanned mobility, and potentially heterogeneous data formats, accuracy, and granularity, the spatially distributed rwt definitely need suitable mechanisms to take appropriate actions at the right time, depending on their current capabilities and context. in this paper, we argue that the multiagent system (mas) paradigm could be appropriate for the wot, thanks to its proven flexibility, autonomy, and intelligence in solving complex problems within highly dynamic, constrained, and uncertain environments [26]. we believe that several well-established agent-based techniques could bring solutions to the deficiencies and challenges of rwt highlighted in section 3.

4.2. multiagent-based architecture

we propose, in this paper, to embed a mas into rwt in order to handle the wot challenges at different levels.

fig. 2 an embedded multiagent architecture for rwt (components shown: data filtering agent, content generation agent, networking and communication agent, security, privacy, trust agent, service composition agent system, raw data, service repository, protocols and communication links, application)

our architecture (figure 2) contains four main modules: data filtering agent (dfa), content generation agent (cga), networking and communication agent (nca), and security, trust, and privacy agent (stpa). the dfa processes and analyzes the data collected by local sensing devices as well as data received from neighbouring devices. agent-based techniques for data filtering (e.g., [27]) and data mining (e.g., [28]) can be used. the cga will then be able to create elementary services which will be published later.
if a given service is requested by a third party, the rwt should use appropriate communication protocols (e.g., 6lowpan, zigbee, wi-fi) as well as appropriate communication pathways to respond and convey the requested service. this task is achieved by the nca. the operation of the rwt is carried out according to specific security, trust, and privacy rules handled by the stpa. these rules will be updated and improved based on the accumulated experience and the envisioned wot application. our architecture also includes a dedicated agent-based system which will be used for service composition on demand (requested by peers) or when the rwt is willing to create a new mashup, integrating local, neighbouring, and remote services from trusted peers.

4.3. service composition

when some required services cannot be provided individually, rwt should have the option to collaboratively generate new contents beyond their individual capabilities. this collaboration is particularly needed for energy and safety reasons as well as because of resource shortages due to rwt mobility. in order to enable rwt collaboration, we propose to allow them to create clusters of things that we call circles of friends (cof). each cof will be composed of a group of rwt that select each other based on their own preferences. although the creation of cofs is beyond the scope of this paper, we give a brief overview of how they are formed. initially, while publishing its services, any rwt also publishes its wish to belong to a cof with specific social and/or professional aims. interested rwt could then contact each other to form a new cof. one of the rwt is appointed as the head of the cof (hcof). the hcof is responsible for selecting the appropriate rwt to provide the currently requested services and for making the necessary plans to generate complex services from elementary ones.
in order to motivate rwt to join cofs so that complex services can be created more easily, smart things will be rewarded whenever they participate and provide services within these circles. this will consequently affect their reputation in the wot. a reward, and therefore a reputation, is also assigned to each cof in order to motivate rwt to be active and maintain their cofs.

fig. 3 (left) embedded multiagent system architecture for service composition (components shown: translator agent, service generator agent, evaluator agent, executor agent, request, service repository, specifications, service, cof members), (right) belief-desire-intention architecture of rwt (components shown: inputs (new communication), revision function, beliefs, option generation function, desires, action generation function, intentions, plan generation)

in order to carry out the tasks of a hcof, any given rwt with appropriate physical resources will include a service composition agent system with the following agents: translator agent (ta), service generator agent (sga), evaluator agent (ea), and executor agent (xa) (figure 3, left). the ta will receive the requests for services from its corresponding cof and make the necessary translations between the external languages and communication formats and those internal to the hcof. if the request cannot be understood, the hcof can request the help of a member of the circle to make the necessary translations. once the request is translated, specifications are sent to the sga, which will consult the repository of the services currently provided by the cof as well as the currently active rwt and their rewards, trust, and security levels. elementary services will be assigned to individual rwt. for complex services, however, the sga plans and generates options for the ea, which will make the necessary assessments and select an appropriate service composition plan with one backup plan.
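the sga and ea steps described above can be sketched as follows. the paper does not specify how plans are enumerated or assessed, so the exhaustive enumeration, the scoring rule (mean of trust times reward), and the member records are all assumptions introduced for illustration.

```python
from itertools import product


def generate_plans(required, members):
    """SGA step (assumed): enumerate every assignment of one provider per service."""
    options = [
        [(service, m) for m in members if service in m["services"]]
        for service in required
    ]
    # product() yields nothing if any required service has no provider.
    return [list(combo) for combo in product(*options)]


def score(plan):
    """EA step (assumed rule): mean of trust * reward over the assigned providers."""
    return sum(m["trust"] * m["reward"] for _, m in plan) / len(plan)


def select(plans):
    """Return the best-scoring plan and one backup plan (None when unavailable)."""
    ranked = sorted(plans, key=score, reverse=True)
    primary = ranked[0] if ranked else None
    backup = ranked[1] if len(ranked) > 1 else None
    return primary, backup


# Hypothetical CoF members known to the HCoF.
members = [
    {"id": "rwt1", "services": {"gps"}, "trust": 0.9, "reward": 10},
    {"id": "rwt2", "services": {"gps", "camera"}, "trust": 0.6, "reward": 5},
    {"id": "rwt3", "services": {"camera"}, "trust": 0.8, "reward": 8},
]
primary, backup = select(generate_plans(["gps", "camera"], members))
```

here the primary plan pairs rwt1 (gps) with rwt3 (camera), and the backup replaces rwt3 with rwt2. exhaustive enumeration is only workable for small circles; a constrained hcof would likely need a cheaper heuristic.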
the selected plan will then be executed by the concerned rwt and monitored by the xa.

4.4. mobile agents

a mobile agent is a software component capable of transporting itself from one location to another while performing delegated tasks. it is capable of interacting autonomously with foreign hosts while gathering information on behalf of its owner and delivering information and services based on its context-awareness knowledge [19]. because of the limited capabilities of rwt and the restrictions on their access to the web, mobile agents could play crucial roles in enabling the wot. for instance, any rwt can create a mobile agent, instruct it with specific tasks, and send it to neighbouring or distant rwt. the main goals of such an agent include reporting events, negotiating deals, delivering, promoting, or attracting services to cut operating costs, and discovering new partners or proxies. as the rwt contributing to the wot have heterogeneous capabilities, mobile agents should be lightweight to ensure an easy migration from one rwt to another. mobile agents should also abide by the requirements of hosts in terms of security (to avoid attacks), communication protocols, local resource use, and any local operating regulations. to this end, we need an efficient architecture for such agents. this architecture is explained in what follows.

4.5. belief-desire-intention (bdi) architecture

in order to allow the rwt to reason adequately about occurring events and the dynamic surrounding environment affecting their web access, we propose a belief-desire-intention (bdi) architecture [13] for every agent embedded in a rwt (figure 3, right). in this architecture, beliefs represent the local information that the agent has about itself and its rwt (e.g., its current operations, services, processing capabilities, battery lifetime when applicable, communication protocols, etc.)
and the environment (neighbouring trusted and untrusted peers as well as their communication protocols and the services they are providing). beliefs could be true or false and are subject to change. the desires reflect the objectives or the situations that the agent would like to accomplish, while the intentions refer to the actions that the agent has chosen to do. the agent will always be listening to communications from neighbouring and remote peers with whom it has connections (e.g., belonging to the same cof). for any new communication received, a revision function is executed in order to update the current beliefs. based on these beliefs, an option generation function updates the desires of the agent. an action generation function is then applied to deliberate the new intentions. a plan generation function is finally executed to schedule the actions of the agent and update the beliefs, desires, and intentions accordingly.

5. socializing the web of things

several studies have applied the idea of social networking to the iot, arguing that if the iot can be made to imitate the social behaviour of humans, then smart objects will be able to provide a better service than merely locally connected objects [29]. this results in a new idea called the social internet of things (siot) [30]. siot applications can be a valuable resource in several areas, including domestic use, business, automation and industrial manufacturing, logistics, and intelligent transportation of people and goods [31]. by analogy, we adopt in this paper the notion of social web of things (swot), where rwt use the social web as a platform to guarantee network navigability (effectively performing the discovery of objects and services), guarantee scalability as in human social networks, and establish appropriate levels of trustworthiness to improve the degree of interaction among things that are friends.
the swot is also an open structure where rwt can seek help in finding trusted peers for their web connection, particularly if they are not web-enabled. they can also find peers with similar objectives from which they can seek advice about the reputation and trustworthiness of other rwt, and with which they can share operating costs, jointly create services beyond their individual capabilities, mutually delegate tasks, etc. we therefore believe that it is important to adapt existing social theories to the wot context and prepare for an impending shift to an environment where social relations will exist between everything. this shift will also extend the swot to the social web of everything (swoe). in order to enable the swot, it is important to possess efficient tools that facilitate seamless connection and cooperation among devices and users. to this end, it is important to leverage modern paradigms like social networks and crowd-based applications, create a platform allowing the development of the swot while enabling its relevant business ecosystem, and create data analysis and recommendation techniques that fit the above paradigms and enable the creation of useful applications. re-examining the concept of mashups and adapting them to the context of the wot would be an asset.

6. applications

6.1. intelligent web of vehicles

advances in sensing and communication facilities are driving the evolution of conventional vehicular ad-hoc networking (vanet) activities to the cloud, thereby creating the emergent notion of the internet of vehicles (iov) [32]. in the iov paradigm, each vehicle is potentially involved with heterogeneous devices, communication and networking technologies, service kinds, data formats/contents, accuracy/efficiency requirements, etc.
in order to smoothly integrate and connect the rwt and information resources of the iov, along with a seamless integration with the social context, we coin the term web of vehicles (wov), which particularly aims to leverage web protocols and technologies for vanet-related devices/objects while facilitating rapid service generation and sharing. some of the devices on vehicles could be web-enabled and could therefore be endowed with embedded tiny web servers. these devices could play the role of proxies for other devices which cannot connect to the internet. to this end, they may provide them with restful apis for direct web-based access.

within the context of the wov, let us suppose that a commuter wants to reduce his travel time between two given locations. in order to avoid unexpected traffic jams and reduce stoppage time at road intersections, a speed sensor on the commuter's vehicle continuously reports information to an onboard decision unit (similar decision units could be embedded in any of the rwt in the wov scenario). this unit also receives data from distance and environmental sensors as well as information/services from the road infrastructure, vehicles, humans, and sensors in the vicinity. in addition to measuring the distance between the current vehicle and neighbouring objects (vehicles, road infrastructures, etc.), a distance sensor on the commuter’s vehicle could receive measurements from similar sensors on vehicles in the vicinity. these measurements should be cleaned and filtered by the distance sensor in order to assess the position of the vehicle with respect to its neighbouring objects on the side where the sensor is deployed. the sensor should also share useful information in a timely manner with other appropriate rwt on the road. for a better assessment of the situation, all distance sensors on the commuter’s vehicle will collect similar data and submit reports to the onboard decision unit.
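a minimal sketch of the cleaning and filtering step on a distance sensor is given below. it assumes a plausible measurement range of 0 to 200 metres and uses a sliding median to suppress isolated spikes; the range, the window size, and the sample readings are illustrative choices, not values from the scenario.

```python
from statistics import median


def clean(measurements, min_m=0.0, max_m=200.0):
    """Drop missing or physically implausible readings (the range is an assumption)."""
    return [m for m in measurements if m is not None and min_m <= m <= max_m]


def median_filter(values, window=3):
    """Suppress isolated spikes with a sliding median; edges use a shorter window."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical raw distances in metres: one dropout, one out-of-range value, one spike.
raw = [12.0, 11.8, None, 250.0, 11.5, 40.0, 11.1, 10.9]
cleaned = clean(raw)
smoothed = median_filter(cleaned)
```

the dropout and the out-of-range reading are removed by `clean`, and the 40.0 spike disappears after the median filter, so the report sent to the decision unit reflects a steadily closing gap rather than a sudden jump.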
agent-based techniques (e.g., [27][28]) could then be used for data filtering and mining purposes on any of the sensors/rwt. as road safety is a shared matter, on-road vehicles have to accommodate each other and mutually exchange contextual information and services on time. examples of services may include vehicle driving conditions (speeding, planned driving directions, alerts about critical situations on the vehicle, etc.), on-road events (traffic jams, accidents, etc.), and professional services (healthcare if the driver is a doctor or nurse, plumbing, etc.). the vehicles of the wov will create cofs. a cof does not necessarily consist of geographically collocated vehicles. for instance, some vehicles may share the same destination or the same social interests and would therefore like to maintain their cof, although they may be very far from each other because of traffic conditions. for each cof, one vehicle will be elected as hcof using an appropriate clustering technique [33]. this vehicle will maintain the list of services provided by each of the vehicles in the circle. it can also request services on their behalf and enable them to socially connect with similar vehicles from other circles. the hcof should always stay tuned to the needs of the members of the circle, update their rewards, plan the composition of complex services, etc. to this end, all requests received by the hcof will be translated, when needed, into the internal language and formats by an onboard intelligent agent. service composition will be planned by a special agent based on the current offerings, trust, and capabilities of the vehicles in the cof. since some vehicles would be competing to offer their services and increase their rewards, an evaluator agent will fairly and carefully check service composition plans before handing over the approved plan to an executor agent, which monitors the required actions. rewards and trust levels will then be updated accordingly once this plan is carried out.

6.2.
smart logistics

roughly, logistics is the part of the supply chain process where the forward and reverse flow and storage of goods, services, and related information are effectively and efficiently planned, implemented, and controlled between the point of origin and the point of consumption with the aim of meeting customers’ requirements [34]. the logistics industry is considered a key player currently benefiting from the iot revolution [35]. for instance, large numbers of machines, vehicles, and people are packing, moving, and tracking millions of freight items around the world every day within complex ecosystems known for their large operational scales and unpredictable spatio-temporal events. iot capabilities are helping to integrate a wide range of heterogeneous assets and allow them to interoperate in a timely fashion, while creating customized, dynamic, and automated services for customers. in order to increase the benefits within this context of falling prices of device components, devices should be allowed to smoothly connect to the web. the wot is a suitable platform that would allow devices to cooperate in smart logistics scenarios. among the existing applications of logistics, we will focus in what follows on the scenario of freight transportation. although it is already possible today to track and monitor a container in a freighter in the middle of the pacific and shipments in a cargo plane mid-flight, the iot and the wot are expected to provide the next generation of track and trace by making it faster, more accurate, more predictive, and more secure [35]. the spatially distributed sensing devices can be endowed with web-enabled capabilities to consult nearby and remote devices and request specific services of current interest. imagine that a given damaged container has been moved by some trucks before and another truck is going to move it this time.
this latter truck may connect to the wot and request details and recommendations about the best way to transport this container while avoiding problems caused by the existing damage. since the different trucks could be located in distant regions, efficient communication mechanisms are needed. to meet the above goals, clear and standardized approaches are needed to allow seamless interoperability for exchanging sensor information in heterogeneous environments. sensors should be able to establish trust relations with a circle of friends in order to overcome some privacy issues in the wot-powered supply chain. in order to clarify these ideas, let us consider the scenario of figure 4, where rwt are embedded or deployed on several facilities, including ships, planes, containers, cranes, etc.

fig. 4 freight transport scenario (legend: cocof (container circle of friends), tcof (truck cof), scof (ship cof), ccof (crane cof); elements shown: rwt_truck, rwt_container, rwt_ship, rwt_crane)

these rwt may have different processing, storage, and communication capabilities. because of the highly dynamic environment (e.g., facility movements) and sporadic spatio-temporal events (e.g., an accident in the container yard, heavy rain, etc.), rwt have an interest in coordinating their efforts and, in particular, in benefiting from the previous experiences of peers while performing similar tasks. to this end, rwt on trucks could create a truck circle of friends (tcof) and rwt on cranes could form a crane cof (ccof). similarly, we can talk about container cof (cocof), ship cof (scof), and plane cof (pcof). let us imagine that a container is being transported by a truck for shipment. an onboard master rwt (let’s call it mco_rwt: master container rwt) is assigned to the control of this container. the mco_rwt could request to join the tcof as a service consumer since its container is being transported by a truck.
the mco_rwt also has to communicate with any rwt onboard its container. relevant information could be conveyed in a timely manner within the tcof, for example when goods inside the container have undergone some damage and more careful transportation is required. once the container is deposited in the shipment area, the mco_rwt has to confirm its position, as well as its local conditions and parameters, to the master rwt assigned to the crane (mc_rwt). the mco_rwt will then unsubscribe from the tcof and subscribe to the ccof. although only one crane is generally responsible for shipping the container, other cranes could give recommendations based on the current conditions of the container as well as the ongoing environmental conditions. once on the ship, the mco_rwt may connect with other rwt during the marine transport (figure 5). our scenario could also be extended to the phase when the container is on the road. besides, as some goods in the container could travel by air, the same scenario could be extended to air transportation.

fig. 5 a multiagent system architecture for freight transport scenario. groups shown: cocof (mco_rwt of container_1, container_2, ..., container_n); tcof (mt_rwt of truck_1, truck_2, ..., truck_m) [during transport]; ccof (mc_rwt of crane_1, crane_2, ..., crane_k) [during shipment]; scof (ms_rwt of ship_1, ship_3, ..., ship_s) [on ship].

7. conclusion

real world things (rwt) are currently capable of establishing connections to the web, either directly via ip-enabled capabilities or via proxies. however, because of their spatial distribution, heterogeneity, limited resources, and sporadic mobility, maintaining efficient, secure, and durable connections is not straightforward. we therefore presented in this paper some conceptual steps towards enabling the vision of intelligent wot (iwot) and social wot through the use of mas techniques.
in order to show the potential of our ideas, we discussed two important application scenarios, namely the intelligent web of vehicles and smart logistics. several issues still need to be addressed in the future to fully implement our vision. in this regard, the mas-based architecture proposed for service composition needs to be refined, implemented, and evaluated experimentally. it then needs to be extended to address the other challenges presented in the paper, including data processing and storage, networking and communication, and trust, privacy, and security. we also believe that considerable research and development work is needed towards socializing the wot.

references

[1] n. jabeur, h. haddad, “towards an intelligent web of things”, in proceedings of the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, november 2015.
[2] n. jabeur, n. sahli, s. zeadally, “abama: an agent-based architecture for mapping natural ecosystems onto wireless sensor networks”, invited paper, in proceedings of 9th international conference on future networks and communications (fnc-2014), elsevier procedia computer science, volume 34, canada, august 2014.
[3] r. rajkumar, i. lee, l. sha, j. stankovic, “cyber-physical systems: the next computing revolution”, in proceedings of the 47th design automation conference, acm, new york, usa, 2010, pp. 731-736.
[4] casagras, casagras final report: rfid and the inclusive model for the internet of things, 2009, pp. 10-12.
[5] z. pang, “technologies and architectures of the internet-of-things (iot) for health and well being”, kth royal institute of technology, 2013.
[6] d. zeng, s. guo, z. cheng, “the web of things: a survey (invited paper)”, j. communications, vol. 6, no. 6, pp. 424-438, 2011.
[7] s.s. mathew, y. atif, q.z. sheng, z. maamar, internet of things and inter-cooperative computational technologies for collective intelligence, bessis, n., xhafa, f., varvarigou, d., hill, r., li, m. (ed./s), 2013, pp. 1-23.
[8] n. zhong, j. ma, r. huang, j. liu, y. yao, y. zhang, j. chen, research challenges and perspectives on wisdom web of things, journal of supercomputing, springer, 2010.
[9] s. cheshire, d.h. steinberg, zero configuration networking, the definitive guide, o’reilly, 2005.
[10] d. raggett, the web of things: extending the web into the real world, sofsem 2010: theory and practice of computer science, jan 2010.
[11] b. ostermaier, m. kovatsch, and s. santini, “connecting things to the web using programmable low-power wifi modules”, in proceedings of 2nd international workshop on the web of things, 2011.
[12] d. guinard and v. trifa, “towards the web of things: web mashups for embedded devices”, in proceedings of the workshop mashups, enterprise mashups and lightweight composition on the web (mem’09), 2009.
[13] r. t. fielding, architectural styles and the design of network-based software architectures, ph.d. dissertation, 2000.
[14] c. pautasso, o. zimmermann, and f. leymann, “restful web services vs. 'big' web services: making the right architectural decision”, in proceedings of the 17th international conference on world wide web, ser. www ’08, new york, ny, usa: acm, pp. 805–814, 2008.
[15] s. duquennoy, g. grimaud, and j.j. vandewalle, “the web of things: interconnecting devices with high usability and performance”, in proceedings of the international conference on embedded software and systems (icess’09), 2009.
[16] z. shelby, “embedded web services”, ieee wireless communication magazine, vol. 17, no. 6, pp. 52–57, 2010.
[17] g. kortuem, f. kawsar, v. sundramoorthy, d. fitton, “smart objects as building blocks for the internet of things”, in proceedings of the ieee internet computing, vol. 14, no. 1, pp. 44-51, 2010.
[18] a. m. mzahm, m. s. ahmad, y. alicia and c. tang, “agents of things (aot): an intelligent operational concept of the internet of things (iot)”, in proceedings of the 13th international conference on intelligent systems design and applications (isda 2013), pp. 159-164, 2013.
[19] a. m. mzahm, m. s. ahmad, a. y. c. tang, “enhancing the internet of things (iot) via the concept of agent of things (aot)”, journal of network and innovative computing, vol. 2, pp. 101-110, 2014.
[20] p. sawyer, a. pathak, n. bencomo, v. issarny, “how the web of things challenges requirements engineering”, in proceedings of the 3rd workshop on the web and requirements engineering at 12th international conference on web engineering icwe 2012, berlin, germany, july 2012.
[21] d. guinard, v. trifa, f. mattern, e. wilde, “from the internet of things to the web of things: resource-oriented architecture and best practices”, in d. uckelmann, m. harrison and f. michahelles, editors, architecting the internet of things, pp. 97-129, springer berlin heidelberg, berlin, heidelberg, 2011.
[22] d. guinard, v. trifa, e. wilde, “a resource oriented architecture for the web of things”, in proceedings of ieee international conference on the internet of things (iot) 2010, tokyo, japan.
[23] j. yu, b. benatallah, f. casati, f. daniel, “understanding mashup development”, ieee internet comput., vol. 12, pp. 44-52, 2008.
[24] e. wilde, putting things to rest, ucb ischool report 2007-015, school of information, uc berkeley, 2007.
[25] d. guinard, m. fischer, and v. trifa, “sharing using social networks in a composable web of things”, in proceedings of the 1st ieee international workshop on the web of things (wot), germany, 2010.
[26] s. bandyopadhyay and e.j. coyle, “an energy efficient hierarchical clustering algorithm for wireless sensor networks”, proc. of infocom 2003, ieee societies, 2003, vol. 3, pp. 1713-1723.
[27] p. skocir, h. maracic, m. kusek, g. jezic, “data filtering in context-aware multi-agent system for machine-to-machine communication”, g. jezic et al. (ed.), agent and multi-agent systems: technologies and applications, smart innovation, systems and technologies 38, 2015.
[28] k. a. albashiri, “an investigation into the issues of multi-agent data mining”, ph.d. dissertation, the university of liverpool, liverpool l69 3bx, united kingdom, 2010.
[29] x. hannan, n. sidhu, b. christianson, “guarantor and reputation based trust model for social internet of things”, in proceedings of the international wireless communications and mobile computing conference (iwcmc), 2015, pp. 600-605.
[30] l. atzori, a. iera, g. morabito and m. nitti, “the social internet of things (siot) when social networks meet the internet of things: concepts, architecture and network characterization”, computer networks, vol. 56, no. 16, pp. 3594-3608, 2012.
[31] l. atzori, a. iera and g. morabito, “the internet of things: a survey”, computer networks, vol. 54, no. 15, pp. 2787-2805, 2010.
[32] m. gerla, e-k. lee, g. pau, u. lee, “internet of vehicles: from intelligent grid to autonomous cars and vehicular clouds”, in ieee world forum on internet of things (wf-iot), 2014, pp. 241-246.
[33] s. vodopivec, j. bester, a. kos, “a survey on clustering algorithms for vehicular ad-hoc networks”, in proceedings of the 35th international conference on telecommunications and signal processing (tsp), 2012, pp. 52-56.
[34] b. tilanus, information systems in logistics and transportation, elsevier science ltd., uk, 1997.
[35] dhl and cisco (2015), internet of things in logistics: a collaborative report by dhl and cisco on implications and use cases for the logistics industry, available at: http://www.dpdhl.com/content/dam/dpdhl/presse/pdf/2015/dhltrendreport_internet_of_things.pdf

facta universitatis series: electronics and energetics vol.
30, no 3, september 2017, pp. 295-312 doi: 10.2298/fuee1703295p

control of functional electrical stimulation for restoration of motor function

dejan b. popović
institute of technical sciences of sasa, belgrade, serbia
emeritus professor, aalborg university, aalborg, denmark

abstract. an injury or disease of the central nervous system (cns) results in significant limitations in the communication with the environment (e.g., mobility, reaching and grasping). functional electrical stimulation (fes) externally activates the muscles and can thus restore several motor functions and reduce other health-related problems. this review discusses the major bottleneck in current fes which prevents wider use and better outcomes of the treatment. we present a control method that we have continually enhanced during more than 30 years of research and development of assistive systems. the presented control has a multi-level structure where the upper levels use finite state control and the lower level implements model-based control. we also discuss possible communication channels between the user and the controller of the fes. the artificial controller can be seen as a replica of the biological control. the principle of replication is used to minimize the problems which come from the interplay of biological and artificial control in fes. the biological control relies on an extensive network of neurons sending the output signals to the muscles. the network is trained through many trial-and-error processes in early childhood, but stays open to changes throughout life to satisfy particular needs. the network considers the nonlinear and time-variable properties of the motor system and provides adaptation in time and space. the presented artificial control method implements the same strategy but relies on machine classification, heuristics, and simulation of model-based control.
the motivation for writing this review comes from the fact that many control algorithms have been presented in the literature by authors who do not have much experience in rehabilitation engineering and have never tested the operation with patients. almost all of the available fes devices implement only open-loop, sensory-triggered, preprogrammed sequences of stimulation. the suggestion is that improvements in fes devices require better controllers which consider the overall status of the potential user, the various effects that stimulation has on afferent and efferent systems, reflexive responses to the fes and direct responses to the fes by non-stimulated sensory-motor systems, and a greater integration of the biological control.

key words: functional electrical stimulation, neurorehabilitation, optimal control, finite state control

received january 9, 2017
corresponding author: dejan b. popović, institute of technical sciences of sanu, kneza mihaila 35, 11000 belgrade, serbia (e-mail: dbp@etf.rs ru)

1. introduction

an injury or disease of the central nervous system (cns) (fig. 1) leads to disability expressed as decreased sensory-motor performance (e.g., tetraplegia, paraplegia, hemiplegia, multiple sclerosis, cerebral palsy, etc.).

fig. 1 the sketch of the central nervous system with the annotations of particular subsystems and types of disability (left panel) and the indication of possible sites of electrical stimulation (right panel).

the disability changes the lifestyle and results in other medical problems (e.g., muscle atrophy, contractures, frequent bladder infections, reduced cardiovascular capacity). electrical stimulation (es) is used for external activation of sensory-motor systems after an upper motor neuron lesion to decrease the disability (fig. 1). fig. 1 shows the stimulation sites at the head (including the neck), spinal cord, and periphery.
some of the sites are above the lesion and, in complete transections, the stimulation does not reach the parts that are paralyzed; conversely, when the fes is applied at the periphery and there is a complete spinal cord lesion, only parts of the body are directly activated. however, in both cases, in persons with an incomplete lesion (likely about 90% of the total population), the effects of the stimulation can be seen above and below the lesion, and the stimulation has multiple effects, as suggested in fig. 2. when the es generates a function, it is termed functional electrical stimulation (fes). a motor neural prosthesis (mnp) is a system which restores a motor function by employing the fes [1-25]. to regulate externally the activation of muscles, and consequently the movements of the body parts, one must understand the richness of the natural mechanisms involved in the operation. the reasons are the following: 1) muscles, tendons, ligaments, and joints differ from man-made active kinematic chains; 2) the musculoskeletal system is highly redundant; 3) the biological control system comprises a very large number of nerves that serve as sensors, transmission channels to the muscles, and decision-making circuits operating as a neural network that is physically defined, to an extent, by hard-wired interconnections [26], and partly as networks operating based on unsupervised and supervised training. the external control must implement a natural-like model because the two controls work jointly: only some parts of the body are paralyzed and require artificial external control, while the rest of the body is controlled by the central nervous system.

fig. 2 the schema of the mechanism of fes of peripheral electrical stimulation. the stimulation activates afferent and efferent nerves. the afferent activity results in the reflex response from lower and upper motor neurons, while the efferent stimulation generates muscle contraction.
the hierarchical, self-organized biological control system relies on the extreme redundancy of the sensory-motor systems. motion performance in a healthy person appears to be flexible and uncomplicated, although the neuronal operation which controls the system is still vaguely understood, even for movements which comprise a small number of body parts [27]. the question which comes to mind from the motor control studies is: are the activities of the system components chosen randomly from the variety of possibilities to accomplish a given task, or is there a consistently reproducible pattern of behavior that must be used? if the latter, would it be possible to understand the constraints imposed to reproduce the same motor behavior whenever the motor task needs to be executed? a human can acquire a spectrum of functional movements during life. most of the movements are mastered in early childhood; however, the repertoire increases and changes throughout life, if so required. a functional movement relies upon perceptuomotor coordination that involves three main elements: the sensory information, the appropriate internal coding, and the generation of motion. the author of this review assumes the model of control as a "black box" with an internal structure that is partly known [28-30]. the black-box approach follows many studies that the author proposed and carried out working with clinicians and motor control scientists. he started his research from model-based control to design exoskeletons and hybrid assistive systems [31]. the clinical applications of these early developments made clear that modeling is essential for studying the behavior and designing the components of the systems. fig. 3 summarizes the effects of the fes applied to a peripheral nervous system. the modeling approach limits in most cases the control to the sketch of the stimulator and the efferent activation of nerves (fig.
3, bottom left panel), while all other effects are not considered. the modeling approach carries the following limitations: 1) oversimplification of the real system; 2) no consideration of the time variability of the system; 3) model parameters that are not observable or measurable; 4) a target trajectory that is not known; 5) feedback which is not applicable due to the delays of the nerves, synapses, and muscles; and 6) the fact that only parts of the system are controlled externally, while the remainder of the body is controlled by biological structures. the black-box model of the fes must consider that the input to the user comes through a variety of sensory modalities which continuously update information about the state of the system and the interactions with the environment. the black-box model needs to include that the sensory input triggers perceptual elements which assist in selecting what task to execute and how to execute it. the information is processed at several structures of the brain in serial and parallel operations. the intention command (function selection) signal travels to lower centers of the cortex, midbrain, brainstem, and to particular segments of the spinal cord. the signals from the spinal cord activate muscles in synchrony via peripheral nerves and adjust the commands based on the afferent inflow [32, 33]. muscles receive signals and produce forces which generate a relative movement of bodily parts and, due to the interaction with the environment, the movement of the body in space. the engineering description of this model is the following: 1) the control has a hierarchical structure with many parallel channels; 2) feedback tunes control at the lower and upper levels; 3) time delays characterize the operation and require a combination of feedback and predictive control; and 4) the control needs to overcome the ambiguity of the redundant system. in this review, we concentrate on the stimulation of the peripheral nervous system (pns).
fes-generated bursts of short pulses of electrical charge trigger a series of action potentials in afferent and efferent neural pathways. the externally triggered efferent pathways directly activate the muscles that are innervated by the said neurons, yet not in the same manner as a volitional motor command coming from the upper motor neuron would (fig. 3). in parallel, the activity triggered in afferent pathways carries action potentials to the spinal cord, where various reflexes are generated (e.g., cross-extension reflex, flexion reflex) and interneurons are activated, transmitting signals which eventually reach the cortex [15]. the fes delivers bursts of electrical charge that are converted into ionic currents in the tissues via electrodes. a time-varying magnetic field can also induce ionic currents; therefore the electrodes could be replaced with magnetic coils. the stimulus waveform selected for the excitation process must take into consideration the physiological effect (action potential generation), potential damage to the tissue, and possible degradation of the electrode [34, 35]. amplitude modulation (am) or pulse width (duration) modulation (pwm) controls the level of recruitment [36-41]. pwm utilizes a slightly lower charge density compared with am to evoke a response of equal magnitude. since the timing circuits (i.e., those regulating pulse width) can be easily constructed and controlled with a resolution of 1 µs or less, many stimulators implement pwm. the typical duration of excitation pulses is in the range up to 250 µs with implanted electrodes, but longer with surface electrodes. the threshold for excitation of the fibers of a peripheral nerve is inversely related to the diameter of the fiber, so the largest fibers are excited first. since the nerve is composed of a mixture of afferent and efferent fibers with a spectrum of fiber diameters, short pulses of constant amplitude will excite large afferent and efferent fibers.
longer pulses may also stimulate smaller fibers, including afferents typically carrying information about noxious stimuli, and may therefore be painful to the subject. for this reason, and to minimize the electrical charge injection, short pulses are preferred. recent research by our group resulted in multipad electrodes and distributed stimulation, which contribute to better selectivity, comfort for the users, and substantially postponed muscle fatigue caused by the es [42-44]. the regulation of the strength of a motor response depends on the number of activated nerve fibers and the rate at which they are activated. these two mechanisms are called recruitment and temporal summation, respectively. the same terms are utilized to describe electrically elicited events. when the stimulus is sufficiently large, an action potential will be evoked in the nerve. in a healthy person the slow, fatigue-resistant motor units are activated at a lower effort compared with the larger, fast fatigable units. in the fes the recruitment order depends upon the variables of position and geometry as well as fiber size [36-41].

fig. 3 block diagram showing components for an fes stimulator. the acronyms used are: adc – analog to digital converter, dac – digital to analog converter, dc/dc – converter of the voltage from low to high level, i/p and o/p are digital input and output ports, emg – electromyography, eeg – electroencephalography.

the second mechanism affecting the force developed by the muscle is temporal summation. at low frequencies the response is unfused, and the variations of the muscle force are pronounced. the frequency at which the mechanical responses produced are sufficiently smooth is known as the fusion frequency. in most human upper extremity muscles fusion occurs at about 16 pulses per second. the unit hz is often used instead of the correct term pulses per second. a maximum force (tetanus) can be reached at higher frequencies.
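temporal summation can be illustrated with a small numerical sketch. it assumes, purely for illustration, that each stimulus pulse produces an identical twitch of the form (t/tc)·exp(1 − t/tc) and that twitches add linearly; the 50 ms contraction time tc is an assumed value, and real muscle is nonlinear, so the numbers indicate only the trend from an unfused to a fused response.

```python
import math


def twitch(t: float, contraction_time: float) -> float:
    """Normalized single-twitch force; equals 1.0 at t == contraction_time."""
    if t < 0.0:
        return 0.0
    x = t / contraction_time
    return x * math.exp(1.0 - x)


def force_ripple(pulse_rate: float, contraction_time: float = 0.05,
                 duration: float = 2.0, dt: float = 0.001) -> float:
    """Relative peak-to-peak ripple of the summed force in the last 0.5 s.

    pulse_rate is in pulses per second; all times are in seconds.
    """
    period = 1.0 / pulse_rate
    n_pulses = int(duration / period) + 1
    pulse_times = [k * period for k in range(n_pulses)]
    first = int((duration - 0.5) / dt)
    last = int(duration / dt)
    window = []
    for i in range(first, last):
        t = i * dt
        # Linear superposition of all twitches started so far (assumption).
        window.append(sum(twitch(t - tp, contraction_time) for tp in pulse_times))
    mean = sum(window) / len(window)
    return (max(window) - min(window)) / mean


if __name__ == "__main__":
    for rate in (5, 16, 40):
        print(f"{rate:>3} pulses/s -> relative ripple {force_ripple(rate):.3f}")
```

in this toy model the ripple shrinks steadily as the pulse rate rises, which is the qualitative behavior behind the notion of a fusion frequency; where the response counts as "sufficiently smooth" depends on the muscle and on the chosen criterion.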
the fast development of electronics, microcomputers, wireless communication, and sensors makes the design of stimulators relatively straightforward. the requirements are the following: the system needs to be safe, small (portable), efficient, battery operated, and adaptable for various applications. the stimulator needs to control the frequency of pulses, pulse duration, pulse amplitude, and the shape of the pulse, and to support several output channels. the stimulator needs to communicate with the user and to use sensor signals, including electrophysiological signals, for the selection of the pattern of bursts delivered to particular channels. battery supply and the charging circuitry are a must for an fes system. the block diagram of a stimulator is shown in fig. 3. the elements that need attention in implantable systems are biocompatibility, miniaturization, and the recharging circuitry.

2. control for fes-based motor neural prosthesis

this presentation uses modified material from an earlier publication by the author [30]. the multi-level control appropriate for the fes system which operates as a motor neural prosthesis (mnp) is presented in fig. 4.

fig. 4 the model of the hierarchical hybrid controller for fes aiming to restore the movement.

the intentions and decisions of the user are at the top of the hierarchical controller. the user communicates to the external controller what he/she is planning to do and how [45]. an appealing method for this communication is a brain-computer interface (bci). the bci is these days one of the top research topics. many techniques to extract signals are reported in the literature [46-56]. the interfaces for the bci are not perfected yet (fig. 5).
to support this statement we cite the conclusion of one of the most respected research groups in the world [46]: "despite a growing animal literature demonstrating on-line control of functional hand movements from spike patterns recorded with microelectrodes in the motor cortex, bci applications in neurological patients are rare and hampered by methodological difficulties. bcis using eeg measures allow verbal communication in paralyzed patients with als; bci-communication in totally locked-in patients, however, awaits experimental confirmation. movement restoration in patients with little residual movement capacity using noninvasive bci is possible, but a generalization of improvement to real life needs further experimentation". event-related recordings from the skull, correlated with a particular cortical activity, can be used as the trigger to start or stop an operation of the fes system [57].

fig. 5 three interface types for the bci: noninvasive (eeg, left panel), and invasive (ecog and lfp, middle and right panels).

the implantable bci systems provide much more detailed information compared with the skull recordings. the electrodes for reproducible cortical recordings need to be improved. the positional instability of the electrodes is a critical issue. the longevity of the electrodes and the reaction of the cortical tissues to the biocompatible materials need consideration. implanting electrodes and electronic circuitry are rather invasive techniques. the recordings from the cortical electrodes will eventually create an extensive communication channel between the user and the microcomputer driving the fes or some other computerized device (fig. 5, middle and right panels). the complexity of capturing motor commands from the brain can be easily seen from the reduced model of the flow of information within the brain, cerebellum, and brainstem (fig. 6). namely, the immediate question is where and what to record to allow the direct bci?
the electrophysiological signals from the periphery (nerve or muscle) are an option to replace the interface with the higher centers of the cns. myoelectric control is widely used in hand prosthetics with success. there are various sites in persons with a cns lesion that can be utilized. surface electrodes could be used for recordings of muscle activities. implantable electrodes are required for the interface with nerves. the plurality of electrophysiological signals can be used for multiple-task mnp controllers. some of the limitations listed for the bci are also valid for the implantable systems for peripheral recordings and can be even more pronounced.

fig. 6 the sketch of the flow of signals and major brain parts participating in the control of movement.

switches and other computer inputs which utilize artificial sensors are the most efficient interface. the sensors measuring the acceleration, angular rate, joint angle, and interface force/pressure, when manipulated by the user, provide a reliable and reproducible set of control signals. if the computer inputs are "mounted" on the body in a manner that allows subconscious sending of the appropriate command, then the multichannel control is facilitated. recently, artificial visual perception [58, 59] was introduced as an input for the fes systems. it has been tested with success in artificial hand prostheses [60]. the artificial visual perception allows the planning that is of ultimate interest for control (e.g., type of grasp, the position of the object with respect to the hand, size of the target object when grasping; curb, stairs, slope, obstacle, perturbation when walking/standing).

state control for coordination of movement. the two top levels of the control use the finite state control (fsc). the division into two levels is made to mimic the biological control, where the temporal synchrony and spatial synergies interplay. the fsc implements nonanalytical tools and non-parametric models of movement [61].
the fsc inherently deals with the following problems of movement control: 1) redundancy, nonlinearity, and time variability of the plant; 2) redundancy of plausible trajectories; and 3) the significance of the preference criteria based on the task. non-numerical tools are the identification techniques. in many cases, the identification is strictly based on heuristics. the heuristic procedures consist of choosing methods which seem promising, while allowing changes if the original choices do not lead quickly enough to a solution [61]. this procedure allows the fsc to learn from "mistakes" and to improve the performance based on the acquired skill. the fsc uses set theory to define the behavior based on the states and on the rules that set transitions between these states. the states are movement representations in the multidimensional phase space (e.g., joint locked, joint free to move, flexion, extension) expressed in terms of muscle forces or joint kinematics. the transition from state to state is defined by rules. the rules are logical relations (e.g., if-then, and, or) that connect state variables. a production rule is a state-action pair; i.e., whenever the particular state given on the left side of the rule is encountered, the action on the right side of the rule needs to be executed. there are no a priori constraints on the forms of the states or the actions. a system based on production rules has three components: 1) the rule base, consisting of the set of production rules; 2) one or more data structures, called fact bases, containing the known facts relevant to the domain of interest; and 3) the interpreter of the facts and rules, which is the mechanism that decides which rule to apply to initiate the target action. each rule is an independent item of knowledge, containing all the conditions required for the application.
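the three components listed above (rule base, fact base, interpreter) can be sketched in a few lines. the state names, sensor names, thresholds, and actions below are hypothetical placeholders chosen for illustration; they are not the rules used in the author's systems.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Facts = Dict[str, float]  # fact base: latest coded sensor readings


@dataclass
class Rule:
    """Production rule: in `state`, if `condition(facts)` holds, do `action`."""
    state: str
    condition: Callable[[Facts], bool]
    action: str
    next_state: str


def interpret(rules: List[Rule], state: str, facts: Facts) -> Optional[Rule]:
    """Interpreter: return the first rule matching the state and the facts."""
    for rule in rules:
        if rule.state == state and rule.condition(facts):
            return rule
    return None


# Hypothetical rule base for one leg; names and thresholds are assumptions.
RULES = [
    Rule("stance",
         lambda f: f["heel_force"] < 5.0 and f["hip_angle"] < -10.0,
         "stimulate_flexors", "swing"),
    Rule("swing",
         lambda f: f["heel_force"] > 20.0,
         "stimulate_extensors", "stance"),
]


def step(state: str, facts: Facts) -> Tuple[str, Optional[str]]:
    """Fire at most one rule; stay in the current state if none applies."""
    rule = interpret(RULES, state, facts)
    if rule is None:
        return state, None
    return rule.next_state, rule.action
```

because every rule carries its own state and condition, a rule can be added to or removed from the list without touching the interpreter, which is the modularity the text attributes to rule-based systems.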
due to this modularity, rule-based control (rbc) systems can be changed by the addition, deletion, or modification of a rule. an important feature of the rbc systems is the ability to look first at the established facts and use forward chaining, or to start from the task and implement backward chaining. the problem of knowledge representation (determination of rules) is fundamental to the operation of the rbc. the rules these days are automatically generated through a procedure known as machine learning and classification (e.g., inductive learning, artificial neural networks, adaptive logic networks, fuzzy-logic networks, wavelet networks, hebbian learning, stochastic classification techniques like principal component analysis, etc.). the data for the machine classification are sensory patterns acquired while observing the process and the plant. the sensory patterns are coded (e.g., single threshold, multi-threshold, timing, local vs. absolute minimum or maximum), and the rules define the relationship between sensory patterns and required motor activities. the set of sensors providing feedback signals has so far been determined arbitrarily (e.g., ground reaction force or pressure sensor, switch, goniometer, inclinometer, accelerometer, and proximity sensor); the choice is based on the availability of sensors, the reproducibility of the sensory recordings, and the overall practicality of plausible day-to-day usage. sensors that are functionally equivalent to those used in biological control systems are preferred. increasing the number of sensors produces very fast growth of the number of control rules, making the definition process time consuming and difficult, yet very functional. our group demonstrated and tested several systems that use the rbc for the control of grasping, reaching, and walking [62-66]. the fsc is not directly applicable to the fes systems because the on-off activation will result in jerky movements.
time delays also require a predictive rule base to allow adaptation of the stimulation to the task. the fsc is a result of machine mapping, and even a confidence level close to 100% is no guarantee that a correct classification will follow, since the hypothesis that the events are stochastic and independent is only partly correct.

model-based control for the fes. the fsc cannot be directly implemented for control because it ultimately creates jerky movements. to eliminate the jerks, the physical characteristics and properties of the system components must be incorporated. the most appropriate method is model-based control (mbc). the mbc considers the human body as a system of rigid bodies (skeleton) connected by rotational joints and driven by joint actuators (muscles). the mbc also considers the elastic properties of the tendons connecting the muscles and bones, and of the ligaments connecting the neighboring bones. the mbc for fes systems deals with two tasks: standing and walking, and goal-directed movement (manipulation and grasping). the differences are the consequences of the tasks: walking is a nearly cyclic operation in which the legs provide support for the trunk and propulsion, while manipulation requires complex temporal and spatial activation to allow goal-directed movement. the muscle forces for standing and walking are an order of magnitude larger than those required for arm movement and grasping. walking introduces one tough task: balance. at present, there is no solution for ensuring balance while standing on a small surface (the sole of a single foot). the mbc is a highly complex task. the human body comprises more than 200 bones driven by about 600 skeletal muscles. the skeleton is represented by a spatial or a planar model comprising segments and joints with externally driven muscles. the fes drives only a subset of the whole system, namely the part that is cut off from the biological control due to a cns lesion.
the only way to arrive at a solution is to use a reduced model. the reduction relates to the number of segments included in the analysis and the type of joints that connect the selected segments. fig. 7 shows the 12-segment model and the two-segment planar model of the leg. we present here the planar model shown in panel b (fig. 7) as an illustration of the complexity of even a significantly reduced mechanical model of the body for the analysis of walking [31].

fig. 7 the two-degree-of-freedom model of the leg. the middle panel shows the models of the thigh and the shank. the right panel shows the actuators (agonist and antagonist muscles) at the hip and knee joints.

the following notations are used: K and H are the knee and hip joints; $\varphi_S$ and $\varphi_T$ are the angles of the shank and thigh with respect to the x-axis direction; $m_S$ and $m_T$ are the masses of the shank and thigh; $J_{CS}$ and $J_{CT}$ are the axial moments of inertia of the shank and thigh for the axes perpendicular to the sagittal plane through the corresponding centers of mass; $L_S$ and $L_T$ are the lengths of the segments; $d_S$ and $d_T$ are the distances from the knee and hip, respectively, to the corresponding centers of mass. the torques $M_S$ and $M_T$ act at the shank and thigh, while $M_H$ and $M_K$ are the net joint torques at the hip and knee joints. gravity acts in the direction opposite to the y-axis. the double pendulum representing the leg (fig. 7) allows the knee and hip to flex and extend within the typical physiological range of movement. two pairs of monoarticular muscles acting around the hip and knee joints (right panel in fig. 7) drive the leg.
the second-order nonlinear differential equations describing the model are:

$$
\begin{aligned}
A_1\ddot{\varphi}_S + A_2\ddot{\varphi}_T\cos(\varphi_T-\varphi_S) + A_3\dot{\varphi}_T^{2}\sin(\varphi_T-\varphi_S) - A_4\ddot{x}_H\sin\varphi_S - A_5(\ddot{y}_H+g)\cos\varphi_S - X_G L_S\sin\varphi_S + Y_G L_S\cos\varphi_S &= M_S \\
B_1\ddot{\varphi}_T + B_2\ddot{\varphi}_S\cos(\varphi_T-\varphi_S) + B_3\dot{\varphi}_S^{2}\sin(\varphi_T-\varphi_S) - B_4\ddot{x}_H\sin\varphi_T - B_5(\ddot{y}_H+g)\cos\varphi_T - X_G L_T\sin\varphi_T + Y_G L_T\cos\varphi_T &= M_T
\end{aligned}
\tag{1}
$$

where the coefficients are

$$
\begin{aligned}
A_1 &= J_{CS} + m_S d_S^{2}, & B_1 &= J_{CT} + m_S L_T^{2} + m_T d_T^{2}, \\
A_2 &= m_S d_S L_T, & B_2 &= A_2, \\
A_3 &= -A_2, & B_3 &= B_2, \\
A_4 &= m_S d_S, & B_4 &= m_S L_T + m_T d_T, \\
A_5 &= -A_4, & B_5 &= -B_4,
\end{aligned}
$$

$\ddot{x}_H$ and $\ddot{y}_H$ are the components of the hip acceleration, $X_G$ and $Y_G$ are the components of the ground reaction force, and $g$ is the gravitational acceleration.

we used the notations for the angles to match the anatomical definitions of flexion and extension at the hip and knee. the net torques $M_S$ and $M_T$ generate the motion, and the torques $M_K$ and $M_H$ are linear combinations of $M_S$ and $M_T$. the simulation requires the following inputs: joint angles (trajectory), acceleration of the hip (body motion), and ground reaction force. the simulation also needs the geometrical and inertial parameters of the leg segments. the model presented is deterministic; hence there is a unique solution for the joint torques. the simulation result for one healthy person walking at 0.9 m/s is presented (fig. 8). the parameters for the model are from the literature [67, 68].

fig. 8 the inputs for the simulation (target trajectory) are the joint angles, ground reaction forces, and hip motion. geometrical and inertial parameters are for a healthy young person. the walking speed was 0.9 m/s. right panels show the output of the simulation: joint torques at the hip and the knee.

however, in the fes system, the joint torque is generated by flexor and extensor muscles at the joints, as shown in fig. 7, right panel. two complex elements are introduced: redundancy, and nonlinearity and time variability due to the characteristics of the actuators and their attachments to the skeleton. eq. 1 now includes a more complex term replacing the pure torques acting at the thigh and shank: each net torque is a signed sum of the active flexor (F), active extensor (E), and passive (P) torques at the hip (H) and knee (K) (eq. 2):

$$
M_S = M_S\left(T_{K,F},\, T_{K,E},\, T_{K,P}\right), \qquad
M_T = M_T\left(T_{H,F},\, T_{H,E},\, T_{H,P},\, T_{K,F},\, T_{K,E},\, T_{K,P}\right)
\tag{2}
$$

the active flexor and extensor torques $T_{H,F}$, $T_{H,E}$, $T_{K,F}$, and $T_{K,E}$ are given by eq. 3:

$$
T_{m,n} = f(\varphi)\, A(u)\, g(\dot{\varphi}), \qquad m \in \{H, K\},\quad n \in \{F, E\}
\tag{3}
$$

the muscle response to electrical stimulation can be approximated by a second-order, critically damped, low-pass filter with a delay, as shown in eq. 4:

$$
\frac{A(j\omega)}{U(j\omega)} = \frac{\omega_p^{2}}{-\omega^{2} + 2j\omega\omega_p + \omega_p^{2}}\; e^{-j\omega d}
\tag{4}
$$

the nonlinear joint torque resulting from the contraction of muscles depends on the length of the muscle and the rate of change of the length [69, 70]. the simplified model of this behavior uses a parabolic dependence of the torque on the joint angle and a hyperbolic fall of the torque with increasing joint angular rate (eq. 5):

$$
f(\varphi) =
\begin{cases}
a_0 + a_1\varphi + a_2\varphi^{2}, & f > 0 \\
0, & f \le 0
\end{cases}
\tag{5}
$$

with $g(\dot{\varphi})$ a piecewise function of the joint angular rate, parameterized by $b_1$, that falls with increasing angular rate and is bounded below by zero. the coefficients $a_0$, $a_1$, $a_2$, and $b_1$ are specific to each joint and vary between uses for the same joint. the passive torques at the hip and knee are given by eq. 6:

$$
T_{i,P} = c_{i,1}\, e^{c_{i,2}\varphi_i} + c_{i,3}\, e^{c_{i,4}\varphi_i} + c_{i,5} + c_{i,6}\,\varphi_i + k_i\,\dot{\varphi}_i
\tag{6}
$$

where $i$ denotes the joint and the $c_{i,j}$ and $k_i$ are signed, user-dependent constants.

the redundancy can be resolved only by applying optimization. the cost function needs to be formulated based on the task of the simulation. the cost function can include time, energy, force, torque, jerk, fatigue, muscle activation, non-physiological loading, the number of muscles used for the task, tracking error, and any combination thereof. based on heuristics, we suggest minimizing the tracking error and the muscle effort, with a preset level of co-contraction of agonists and antagonists (minimum stiffness) for the stabilization of joints and the minimization of jerks. to demonstrate the value of the model-based analysis, we show a result from one of our earlier studies [67].
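the second-order, critically damped low-pass filter with delay that approximates the muscle response (eq. 4) can be evaluated numerically. the python sketch below assumes the standard form of such a filter, with a double pole at the frequency `omega_p` and a pure delay `delay`; the parameter values in the usage lines are hypothetical and are not taken from the cited studies.

```python
import cmath
import math


def muscle_response(omega: float, omega_p: float, delay: float) -> complex:
    """Complex gain A(jw)/U(jw) of a critically damped second-order
    low-pass filter with a pure time delay.

    omega   : frequency of the stimulation modulation [rad/s]
    omega_p : pole frequency of the filter [rad/s] (double pole)
    delay   : time delay between stimulation and response [s]
    """
    s = 1j * omega
    low_pass = omega_p ** 2 / (s ** 2 + 2.0 * omega_p * s + omega_p ** 2)
    return low_pass * cmath.exp(-s * delay)


# hypothetical values: pole at 10 rad/s, 30 ms delay
gain_dc = muscle_response(0.0, omega_p=10.0, delay=0.03)
gain_at_pole = muscle_response(10.0, omega_p=10.0, delay=0.03)
magnitude = abs(gain_at_pole)
phase = cmath.phase(gain_at_pole)  # phase lag in radians (negative)
```

at the pole frequency the magnitude of a critically damped filter drops to one half, and the delay adds a phase lag proportional to frequency on top of the 90 degrees contributed by the filter; this extra lag is what makes purely error-driven feedback difficult.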
the cost function is given by eq. 7:

$$
R(u) = \int_{t_0}^{t_0+T} \left\{ \lambda_1\left[(\varphi_S-\varphi_{Sm})^{2} + (\varphi_T-\varphi_{Tm})^{2}\right] + \lambda_2\left[u_1^{2}(t)+u_2^{2}(t)+u_3^{2}(t)+u_4^{2}(t)\right] \right\} dt
\tag{7}
$$

the right panel (fig. 9) shows substantial tracking errors and a recruitment level that is at the maximum. the direct conclusion is that the selected target trajectory is not achievable for the patient.

fig. 9 the recruitment level for four consecutive steps. the recruitment reaches a maximum, and a substantial tracking error occurs for the paraplegic subject. the optimization does not consider the minimum stiffness.

we show how the difference between the cost functions (eq. 7 and eq. 8) affects the total cost. eq. 8 uses the same integrand as eq. 7, together with a constraint on each $u_i(t)$ that enforces the preset minimum co-contraction:

$$
R(u) = \int_{t_0}^{t_0+T} \left\{ \lambda_1\left[(\varphi_S-\varphi_{Sm})^{2} + (\varphi_T-\varphi_{Tm})^{2}\right] + \lambda_2\left[u_1^{2}(t)+u_2^{2}(t)+u_3^{2}(t)+u_4^{2}(t)\right] \right\} dt
\tag{8}
$$

the graphs suggest that the optimal conditions are reached when the effort is about 50% and the tracking error is small (only a few degrees), while lower values are obtained if the cost function does not consider the minimum co-contraction. the values on the axes of the graphs are normalized (fig. 10), and we intentionally do not discuss them further because this is beyond the scope of this presentation.

after discussing how a model is defined for the design of a controller, we go back to the review of the literature. the simplest mbc operates without feedback (fig. 11, open-loop). the controller is the inverse model of the plant [e.g., 71-74]. there are two major problems with this type of control: 1) the model is reduced in comparison with the real plant, and the parameters of the model do not adequately reflect the properties of the system; and 2) the disturbances are not part of the model. the operation of this open-loop controller might be sufficient for simple systems; yet, in most cases, the system is not robust enough [75-77]. an open-loop controller uses the trajectory as the input.
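a cost of the kind used in eq. 7 can be evaluated on sampled data with a simple rectangle rule. the python sketch below assumes, following the surrounding text, that the cost is a weighted sum of the squared tracking errors of the shank and thigh angles and of the squared activations of the four muscles; the function name, the weights `w_track` and `w_effort`, and the sample values are hypothetical.

```python
from typing import Sequence


def tracking_effort_cost(phi_s: Sequence[float], phi_s_ref: Sequence[float],
                         phi_t: Sequence[float], phi_t_ref: Sequence[float],
                         u: Sequence[Sequence[float]], dt: float,
                         w_track: float = 1.0, w_effort: float = 1.0) -> float:
    """Rectangle-rule approximation of a tracking-plus-effort cost.

    phi_s, phi_t         : achieved shank and thigh angles per sample [rad]
    phi_s_ref, phi_t_ref : target shank and thigh angles per sample [rad]
    u                    : muscle activations per sample (four values each)
    dt                   : sampling interval [s]
    """
    total = 0.0
    for k in range(len(phi_s)):
        tracking = (phi_s[k] - phi_s_ref[k]) ** 2 + (phi_t[k] - phi_t_ref[k]) ** 2
        effort = sum(ui ** 2 for ui in u[k])
        total += (w_track * tracking + w_effort * effort) * dt
    return total


# two samples: 0.1 rad shank error, perfect thigh tracking, one muscle at 50%
cost = tracking_effort_cost(
    phi_s=[0.1, 0.1], phi_s_ref=[0.0, 0.0],
    phi_t=[0.2, 0.2], phi_t_ref=[0.2, 0.2],
    u=[[0.5, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0]],
    dt=0.5,
)
```

the ratio of the two weights sets the trade-off: a larger tracking weight tolerates more muscle effort for the sake of accuracy, while a larger effort weight accepts tracking error to spare the muscles.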
the term trajectory is used in a broad sense (position, angle, velocity, acceleration, etc.).

fig. 10 the optimization surfaces for the two cost functions (eq. 7 and eq. 8) for the two-segment model during the walking cycle at a speed of 0.9 m/s.

a closed-loop mbc (fig. 11, closed-loop), often termed "error-driven control", uses feedback from sensors measuring the achieved trajectory to correct the command signal. the practical problem with the error-driven mbc is the biological delay; therefore, a predictive closed-loop control needs to be implemented. the term delay refers to the response of muscles (actuators) to the stimulation of the neural pathways. the response of the muscle to stimulation can be described by a low-pass filter with a delay. the closed-loop control also requires a model that reflects the complexity of the organism being controlled and the parameters that characterize the system. in addition, the control requires a precisely defined trajectory and the permitted errors that would not compromise the use of the np.

fig. 11 model-based control operating without feedback (open-loop) and with feedback (closed-loop).

the important aspects that need to be considered when using model-based control are the time variability of the muscle responses (e.g., muscle fatigue, habituation) and nonlinearities.
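the difference between open-loop and error-driven closed-loop control can be illustrated on a deliberately simple plant. the python sketch below is not the leg model: it assumes a hypothetical first-order discrete plant whose true gain differs from the gain assumed by the inverse model (as it would with muscle fatigue), it omits the biological delay, and the integral feedback gain `ki` is an arbitrary illustrative value.

```python
def simulate(true_gain: float, model_gain: float = 1.0, ki: float = 0.0,
             steps: int = 500, target: float = 1.0) -> float:
    """Return the final tracking error for a constant target.

    plant:      y[k+1] = a*y[k] + b*true_gain*u[k]
    controller: u = u_ff + z, where u_ff inverts the model's static gain
                and z integrates the error with gain ki (ki = 0 is open-loop).
    """
    a, b = 0.9, 0.1                                 # hypothetical plant constants
    y, z = 0.0, 0.0
    u_ff = (1.0 - a) * target / (b * model_gain)    # inverse-model feedforward
    for _ in range(steps):
        error = target - y
        z += ki * error                             # feedback correction
        y = a * y + b * true_gain * (u_ff + z)
    return target - y


open_loop_error = simulate(true_gain=0.7)             # model overestimates the gain
closed_loop_error = simulate(true_gain=0.7, ki=0.5)   # feedback corrects the command
matched_error = simulate(true_gain=1.0)               # perfect model, no feedback
```

with a perfect model the open-loop controller tracks the target, but a 30% loss of gain leaves a 30% steady-state error that only the feedback term removes. adding a realistic delay to this loop would limit the usable feedback gain, which is the motivation for predictive closed-loop control.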
the model-based control is not applicable to multichannel electrical stimulation systems for the following reasons: the model is oversimplified; there is no preferred trajectory; the interaction with biological control is not included; the parameters of the system change with time; the geometry of stimulation changes during the movement; reflexive behaviors (dystonia, hyperreflexia, spasticity) are not included; some muscles are atrophied or even denervated; overall changes have occurred in the central nervous system due to the lesion and the modified periphery; stimulation activates the afferent system, which generates reflex responses (e.g., contralateral extension or flexion); stimulation is not selective enough; and muscle fatigue could interfere due to the nonphysiological order of recruitment.

3. conclusion

control of fes for restoring movements is a highly complex task since it requires full integration of the external components into the remaining and modified biological motor control. the plant is nonlinear and time variable, and the system is not observable; therefore, the stability of the system is questionable.

fig. 12 the mimetic model of the hybrid hierarchical controller for control of movement using fes. the left panel shows the components that need to be integrated into the corkscrew to release the full potential of the fes for rehabilitation.

we suggest that the concept presented and summarized in fig. 12 has the best chance of contributing to the translation of the research into clinical and home use by persons with disability, as an orthosis or just a therapeutic device. other aspects of the implementation of the fes are mostly technological changes and cosmetic features.
the interface to the user is also touched upon in this review. the author is not convinced that invasive techniques that output a small number of commands are appreciated by the users, and these techniques greatly increase the costs of logistic support and original installation. tests of the efficacy of a system with noninvasive interfaces should always precede the consideration of an implant that improves the operation.

acknowledgement: the work on this project was partly supported by the project no rs35003, ministry of education, sciences and technological development of serbia and the project f 137 from the serbian academy of sciences and arts, belgrade.

references
[1] a. l. benabid, s. chabardes, j. mitrofanis and p. pollak, "deep brain stimulation of the subthalamic nucleus for the treatment of parkinson's disease", the lancet neurology, vol. 31, no. 8(1), pp. 67-81, 2009.
[2] j. h. burridge, m. haugland, b. larsen, r. m. pickering, n. svaneborg, h. k. iversen, p. b. christensen, j. haase, j. brennum and t. sinkjær, "phase ii trial to evaluate the actigait implanted drop-foot stimulator in established hemiplegia", j. rehabil. med., vol. 39, pp. 212-218, 2007.
[3] d. g. everaert, r. b. stein, g. m. abrams, a. w. dromerick, g. e. francisco, b. j. hafner, t. n. huskey, m. c. munin, k. j. nolan and c. v. kufta, "effect of a foot-drop stimulator and ankle-foot orthosis on walking performance after stroke: a multicenter randomized controlled trial", neurorehabil neural repair, vol. 27, no. 7, pp. 579-591, 2013.
[4] l. e. fisher, m. e. miller, s. n. bailey, h. a. jr. davis, j. s. anderson, l. r. murray, d. t. tyler and r. j. triolo, "standing after spinal cord injury with four-contact nerve-cuff electrodes for quadriceps stimulation", ieee trans. neural. syst. rehabil. eng., vol. 16, pp. 473-478, 2008.
[5] t. keller, m. r. popović, i. p. i. papas, p-y. muller, "transcutaneous functional electrical stimulator compex motion", artif. organs, vol. 26, no.
3, pp. 219-223, 2002.
[6] k. l. kilgore, h. e. hoyen, a. m. bryden, r. l. hart, m. w. keith and p. h. peckham, "an implanted upper-extremity neuroprosthesis using myoelectric control", j. hand surgery, vol. 33, pp. 539-550, 2008.
[7] r. kobetič, c. s. to, j. r. schnellenberger, m. l. audu, t. c. bulea, r. gaudio, g. pinault, s. tashman and r. j. triolo, "development of hybrid orthosis for standing, walking, and stair climbing after spinal cord injury", j. rehab. res. develop., vol. 46, pp. 447-462, 2009.
[8] a. kralj, t. bajd, "functional electrical stimulation: standing and walking after spinal cord injury", boca raton, florida, crc press, 1989.
[9] g. e. loeb, r. a. peck, w. h. moore and k. hood, "bion system for distributed neural prosthetic interfaces", med eng phys, vol. 23, pp. 9-18, 2001.
[10] n. m. malešević, l. popović-maneski, v. ilić, n. jorgovanović, g. bijelić, t. keller and d. b. popović, "a multi-pad electrode based functional electrical stimulation system for restoration of grasp", j. neuro eng. rehabil., vol. 9, no. 66, 2012.
[11] s. mangold, t. keller, a. curt and v. dietz, "transcutaneous functional electrical stimulation for grasping in subjects with cervical spinal cord injury", spinal cord, vol. 43, no. 1, pp. 1-3, 2005.
[12] p. h. peckham and j. s. knutson, "functional electrical stimulation for neuromuscular applications", annual review of biomed. eng., vol. 7, pp. 327-360, 2005.
[13] d. b. popović, m. b. popović, t. sinkjær, a. stefanović and l. schwirtlich, "therapy of paretic arm in hemiplegic subjects augmented with a neural prosthesis: a cross-over study", can. j. physio. pharmacol., vol. 82, pp. 749-756, 2004.
[14] d. b. popović, t. sinkjær and m. b. popović, "electrical stimulation as a means for achieving recovery of function in stroke patients", j. neurorehab., vol. 25, pp. 45-58, 2009.
[15] d. b. popović, "advances in functional electrical stimulation (fes)", journal of electromyography and kinesiology, vol. 24, no. 6, pp.
795-802, 2014.
[16] a. prochazka, m. gauthier, m. wieler and z. kenwell, "the bionic glove: an electrical stimulator garment that provides controlled grasp and hand opening in quadriplegia", arch. phys. med. rehabil., vol. 78, no. 6, pp. 608-14, 1997.
[17] r. van den brand, j. heutschi, q. barraud, j. digiovanna, k. bartholdi, m. huerlimann, l. friedli, i. vollenweider, e. m. moraud, s. duis, n. dominici, s. micera, p. musienko and g. courtine, "restoring voluntary control of locomotion after paralyzing spinal cord injury", science, vol. 336, no. 6085, pp. 1182-1185, 2012.
[18] v. visser-vandewalle, z. temel, ch. van der linden, l. ackermans and e. beuls, "deep brain stimulation in movement disorders: the applications reconsidered", acta neurologica belgica, vol. 104, pp. 33-36, 2004.
[19] http://www.bioness.com/ ness_l300_for_foot_drop.php [accessed, dec. 2016]
[20] http://www.bioness.com/ ness_h200_for_hand_rehab.php [accessed, dec. 2016]
[21] http://www.walkaide.com/en-us/ pages/default.aspx [accessed, dec. 2016]
[22] http://www.odstockmedical.com/ products/microstim-2v2-kit [accessed, dec. 2016]
[23] http://www.markfelling.com/id450.htm [accessed, dec. 2016]
[24] http://www.ottobock.com/cps/rde/ xchg/ob_com_en/hs.xsl/4762.html [accessed, 2016]
[25] http://musclepower.com/parastep.htm [accessed, dec. 2016]
[26] g. rizzolatti and g. luppino, "the cortical motor system", neuron, vol. 31, no. 6, pp. 889-901, 2001.
[27] m. jeannerod, the neural and behavioural organization of goal-directed movements, clarendon press/oxford university press, 1988.
[28] n. bernstein, the co-ordination and regulation of movements, pergamon press, oxford, 1967.
[29] m. l. latash, control of human movement, human kinetics, 1993.
[30] d. b. popović and t. sinkjær, control of movement for the physically disabled, london: springer, 2000.
[31] d. b. popović, "control of walking in disabled humans", journal of automatic control, vol.
13(suppl.), pp. 1-38, 2003.
[32] r. grasso, y. p. ivanenko, m. zago, m. molinari, g. scivoletto, v. castellano, v. macellari and f. lacquaniti, "distributed plasticity of locomotor pattern generators in spinal cord injured patients", brain, vol. 127, no. 5, pp. 1019-1034, 2004.
[33] y. p. ivanenko, r. p. poppele and f. lacquaniti, "five basic muscle activation patterns account for muscle activity during human locomotion", the journal of physiology, vol. 556, pp. 267-282, 2004.
[34] a. scheiner, j. t. mortimer and u. roessmann, "imbalanced biphasic electrical stimulation: muscle tissue damage", annals of biomed. eng., vol. 1, no. 18(4), pp. 407-25, 1990.
[35] j. t. mortimer, "motor prostheses", in handbook of physiology: the nervous system, motor control. published on line 2011.
[36] t. j. bajzek and r. j. jaeger, "characterization and control of muscle response to electrical stimulation", ann. biomed. eng., vol. 15, pp. 485-501, 1987.
[37] r. baratta and m. solomonow, "the dynamic response model of nine different skeletal muscles", ieee trans. biomed. eng., vol. 37, pp. 243-51, 1990.
[38] p. e. crago, p. h. peckham and j. t. mortimer, "the choice of pulse duration for chronic electrical stimulation via surface, nerve and intramuscular electrodes", ann. biomed. eng., vol. 2, pp. 252-264, 1974.
[39] p. e. crago, p. h. peckham and g. b. thrope, "modulation of muscle force by recruitment during intramuscular stimulation", ieee trans. biomed. eng., vol. 27, pp. 679-684, 1980.
[40] j. a. gruner and c. p. mason, "nonlinear muscle recruitment during intramuscular and nerve stimulation", j. rehabil. res. develop., vol. 26, no. 2, pp. 1-16, 1988.
[41] d. b. popović, t. gordon, v. f. rafuse and a. prochazka, "properties of implanted electrodes for functional electrical stimulation", annals of biomed. eng., vol. 19, no. 3, pp. 303-316, 1991.
[42] d. b. popović and m. b. popović, "automatic determination of the optimal shape of the surface electrode: selective stimulation", j.
neurosci. methods, vol. 178, pp. 174-181, 2009.
[43] l. popović-maneski, m. kostić, t. keller, s. mitrović, lj. konstantinović and d. b. popović, "multipad electrode for effective grasping: design", ieee trans. neur. sys. rehabil. eng., vol. 21, pp. 648-654, 2013.
[44] l. popović-maneski, n. malešević, a. savić and d. b. popović, "spatially distributed asynchronous stimulation delays muscle fatigue", muscle & nerve, vol. 48, pp. 930-937, 2013.
[45] t. sinkjær, m. haugland, a. inmann, m. hansen and d. k. nielsen, "biopotentials as command and feedback signals in functional electrical stimulation systems", medical engineering & physics, vol. 25, no. 1, pp. 29-40, 2003.
[46] n. birbaumer, a. r. murguialday and l. g. cohen, "brain-computer-interface (bci) in paralysis", the european image of god and man, pp. 483-492, lippincott williams & wilkins, 2010.
[47] j. j. daly, r. cheng, j. rogers, k. litinas, k. hrovat and m. dohring, "feasibility of a new application of noninvasive brain computer interface (bci): a case study of training for recovery of volitional motor control after stroke", journal of neurologic physical therapy, vol. 33, no. 4, pp. 203-11, 2009.
[48] a. h. do, p. t. wang, c. e. king, a. abiri, z. nenadic, "brain-computer interface controlled functional electrical stimulation system for ankle movement", journal of neuroengineering and rehabilitation, vol. 8, no. 49, 2011.
[49] j. d. millán, r. rupp, g. r. müller-putz, r. murray-smith, c. giugliemma, m. tangermann, c. vidaurre, f. cincotti, a. kübler, r. leeb and c. neuper, "combining brain–computer interfaces and assistive technologies: state-of-the-art and challenges", frontiers in neuroscience, vol. 4, 161, 2010.
[50] c. t. moritz, s. i. perlmutter, e. e. fetz, "direct control of paralysed muscles by cortical neurons", nature, vol. 456 (7222), pp. 639-642, 2008.
[51] g. pfurtscheller, g. r. müller-putz, j. pfurtscheller and r.
rupp, "eeg-based asynchronous bci controls functional electrical stimulation in a tetraplegic patient", eurasip journal on applied signal processing, p. 3152-5, 2005.
[52] g. pfurtscheller, g. r. müller, j. pfurtscheller, h. j. gerner and r. rupp, "thought–control of functional electrical stimulation to restore hand grasp in a patient with tetraplegia", neuroscience letters, vol. 351, no. 1, pp. 33-36, 2003.
[53] c. ethier, e. r. oby, m. j. bauman and l. e. miller, "restoration of grasp following paralysis through brain-controlled stimulation of muscles", nature, vol. 485, pp. 368-371, 2012.
[54] a. m. savić, n. m. malešević and m. b. popović, "feasibility of a hybrid brain-computer interface for advanced functional electrical therapy", hindawi publ corp, scientific world journal, article id 797128, 2014.
[55] a. b. schwartz, t. cui, d. j. weber and d. w. moran, "brain-controlled interfaces: movement restoration with neural prosthetics", neuron, vol. 52, no. 1, pp. 205-220, 2006.
[56] m. taylor, s. i. helms tillery and a. b. schwartz, "direct cortical control of 3d neuroprosthetic devices", science, vol. 296 (5574), pp. 1829-1832, 2002.
[57] j. r. wolpaw, n. birbaumer, w. j. heetderks, d. j. mcfarland, p. h. peckham, g. schalk, e. donchin, l. a. quatrano, c. j. robinson and t. m. vaughan, "brain-computer interface technology: a review of the first international meeting", ieee trans. rehabil. eng., vol. 8, no. 2, pp. 164-173, 2000.
[58] s. došen and d. b. popović, "transradial prosthesis: artificial vision for control of prehension", artificial organs, vol. 35, no. 1, pp. 37-48, 2011.
[59] s. došen, c. cipriani, m. kostić, m. c. carrozza and d. b. popović, "cognitive vision system for the control of a dexterous prosthetic hand: an evaluation study", journal of neuroengineering and rehabilitation, on line 7:42, 2010.
[60] m. marković, s. došen, c. cipriani, d. b. popović and d.
farina, "stereovision and augmented reality for closed-loop control of grasping in hand prostheses", j. neural engineering, vol. 11, no. 4, p. 046001, 2014.
[61] r. tomović, d. b. popović and r. b. stein, nonanalytical methods for motor control, world sci publ, singapore, 1995.
[62] j. kojović, m. djurić-jovičić, s. došen, m. b. popović and d. b. popović, "sensor-driven four-channel stimulation of paretic leg: functional electrical walking therapy", j. neurosci. methods, vol. 181, pp. 101-105, 2009.
[63] d. b. popović and m. b. popović, "tuning of a nonanalytic hierarchical control system for reaching with fes", ieee trans. on biomed. eng., vol. 45, pp. 203-212, 1998.
[64] m. b. popović and d. b. popović, "cloning biological synergies improved control of elbow neuroprostheses", ieee emb magazine, vol. 20, no. 1, pp. 74-81, 2001.
[65] m. b. popović, "control of neural prostheses for grasping and reaching", med. eng. phys., vol. 25, no. 1, pp. 41-50, 2003.
[66] d. b. popović, r. tomović, d. tepavac and l. schwirtlich, "control aspects of active above-knee prosthesis", intern. j man-machine studies, vol. 35, no. 6, pp. 751-767, 2001.
[67] d. b. popović, r. b. stein, m. n. oguztoreli, m. lebiedowska and s. jonić, "optimal control of walking with functional electrical stimulation: a computer simulation study", ieee trans. rehabil. eng., vol. 7, no. 1, pp. 69-79, 1999.
[68] s. jonić, t. janković, v. gajić and d. b. popović, "three machine learning techniques for automatic determination of rules to control locomotion", ieee trans. biomed. eng., vol. 46, no. 3, p. 10, 1999.
[69] g. shue, p. e. crago and h. j. chizeck, "muscle-joint models incorporating activation dynamics, moment-angle, and moment-velocity properties", ieee trans. biomed. eng., vol. 42, pp. 212-223, 1992.
[70] r. b. stein, e. p. zehr, m. k. lebiedowska, d. b. popović, a. scheiner and h. j.
chizeck, "estimating mechanical parameters of leg segments in individuals with and without physical disabilities", ieee trans. rehabil. eng., vol. 4, pp. 201-211, 1996.
[71] r. riener and t. fuhr, "patient-driven control of fes-supported standing up: a simulation study", ieee trans. rehabil. eng., vol. 6, no. 2, pp. 113-124, 1998.
[72] m. ferrarin, f. palazzo, r. riener and j. quintern, "model-based control of fes-induced single joint movements", ieee trans. neural syst. rehabil. eng., vol. 9, no. 3, pp. 245-257, 2001.
[73] z. matjacic, k. hunt, h. gollee and t. sinkjaer, "control of posture with fes systems", med. eng. phys., vol. 25, no. 1, pp. 51-62, 2003.
[74] d. b. popović, m. radulović, l. schwirtlich and n. jauković, "automatic vs. hand-controlled walking of paraplegics", med. eng. phys., vol. 25, pp. 63-74, 2003.
[75] s. jezernik, r. g. wassink and t. keller, "sliding mode closed-loop control of fes controlling the shank movement", ieee trans. biomed. eng., vol. 51, no. 2, pp. 163-172, 2004.
[76] d. graupe, "emg pattern analysis for patient-responsive control of fes in paraplegics for walker-supported walking", ieee trans. biomed. eng., vol. 36, no. 7, pp. 711-719, 1989.
[77] c. t. freeman, a. m. hughes, j. h. burridge, p. h. chappell, p. l. lewin and e. rogers, "iterative learning control of fes applied to the upper extremity for rehabilitation", control engineering practice, vol. 17, no. 3, pp. 368-381, 2009.

facta universitatis series: electronics and energetics vol. 29, no 1, march 2016, pp. 11-34 doi: 10.2298/fuee1601011s

mems design simplification with virtual prototyping
renate sitte
griffith university, griffith sciences – ict, gold coast, australia

abstract. mems design requires a good understanding of interactions in complex processes and highly specialized interdisciplinary skills.
traditional prototyping is neither easy nor cheap because it typically requires very expensive manufacturing facilities. progress towards faster, cheaper prototyping has been achieved, but it cannot be applied to mems fabrication in general. this paper analyzes the benefits of virtual prototyping as a simplification and aid in mems design and proposes the continuation of the mems animated graphic design aid (magda) project. its purpose is to simplify the preliminary design stages and make mems design accessible to a wider audience.

key words: mems, scientific visualization, vr-cad tools

1. introduction and motivation

the purpose of this paper is to motivate making the design of mems more broadly accessible and to give an overview of where to start for those who wish to venture into this area. the mems industry is now well established, and many of the papers cited here are pioneering works that were subsequently adopted and laid the foundations of this industry. nevertheless, the production technology options for mems remain vast; there is no "one size fits all". manufacturing challenges result more from the particular innovation of a specific mems than from the production process itself. this is also reflected in the research publications. one of the difficulties in mems design and innovation is that it requires highly specialized skills and a wide interdisciplinary background with experience in physics, advanced mathematical modelling (e.g. for microfluidics), chemistry, materials engineering and manufacturing technology, to name a few. such skills are required both for the technology and design of the mems itself and for the science and engineering understanding of the mems' application niche. these required specializations and skills limit the potential for a broader industrial development.
this is because development requires adequate tools with powerful modelling and simulation software to reduce the prototyping and optimization period. the introduction of cad packages was a critical step in the widespread development of vlsi devices and in the reduction of the design and prototyping phase [1]. despite the demand, there is a lack of cad tools to aid in the development of mems devices. several packages are available, and their benefit lies in supporting the mathematical modelling, but for a realistic and useful application they still require a strong interdisciplinary background.

received september 19, 2015
corresponding author: renate sitte, griffith university, griffith sciences – ict, gold coast, australia (e-mail: r.sitte@griffith.edu.au) https://www.griffith.edu.au/griffith-sciences

in computing, for example, the introduction of icons and the mouse in the early eighties had a huge impact as a shortcut for recurring tasks such as file handling, starting programs, drawing, and visual output. this allowed users to focus on using the computer rather than on typing commands for menial tasks. suddenly, a broader audience could use a computer. we need to be able to bring mems design to a less specialized audience. other engineering disciplines, such as mechanics or robotics, have found their way into early education and entertainment (edutainment), which undoubtedly has a favourable effect on a richer understanding of physical cause-and-effect and on shaping the mind in younger years. despite their rapidly growing ubiquity and their breakthroughs, for example in biomedical applications, mems are not yet ready for edutainment. it will be many years before mems design can be simplified to the point of pick and place on a virtual prototyping (vp) computer screen, with the device seen functioning in 3d and 4d vp. mems can nowadays be made of a range of materials, not just silicon.
those materials have different physical properties and behave differently in manufacture and use. therefore, virtual reality (vr) computer aided design (cad) software that can mimic functioning with physically correct results can be the meccano or lego toy for edutainment and discovery (acquiring an intuition) at earlier ages than postgraduate level. our aim should be to make the whole mems domain more popular. this could be done by bringing it to undergraduate or even final-year high school level with introductory courses and gradually adding more ambitious courses, in the same way that introductory mathematics courses are taught early on, shaping the mind. to achieve that, we need simulation tools that are easy to use and to understand. the lengthy training time needed to handle the software tools, and the time their calculations take, are a discouragement in themselves. novices have neither the patience nor the maturity to wait for something for which they have neither background nor meaning. the bottleneck is no longer computing power but the lack of usable and curiosity-stimulating simulation tools for the uninitiated. our research has developed techniques suitable for virtual prototyping that reduce the calculation time without sacrificing physical correctness. our methods are suitable for an initial design that can then be refined with conventional methods. they serve advanced researchers and novices alike. the paper is organized as follows: overall, we progress throughout the sections towards mems virtual prototyping. section 2 provides background and context on the wide range of applications, product ramifications and variety of problems as mems have evolved in just two decades. section 3 discusses existing tools for mems modelling and simulation and moves on to existing cad systems. it explains some of the difficulties and complexities affecting reliability in mems modelling. section 4 looks at the importance of prototyping and its strong potential for innovation.
section 5 discusses virtual prototyping as an important and flexible design tool that has not yet really found its way into mems design. section 6 gives a snapshot of our contribution, magda. it briefly explains our fast algorithms that make the difference for speedier virtual prototyping. the last section concludes the discussion with suggestions for future work.
mems design simplification with virtual prototyping
2. background and context
this section looks into the multidisciplinary aspect of mems. its main purpose is to motivate and provide context for an easier design phase and virtual prototyping (vp), and this is reflected in its literature review. owing to the diversity and amount of mems material published, this paper does not and cannot replace review papers. both mems and vp are extensive disciplines with their own specialization branches. the project of vp for mems is huge and ambitious, and requires specialized topics such as “physically based rendering” or “turbulent flow” and many more. such topics require in-depth study in their own right. the project also has to overcome the old dilemma that engineers are weak programmers and software developers are weak in science and engineering. mems are minute devices that are in widespread use, for example in airbag triggers, inkjet print heads, and optical, medical and many other applications. with ever more new applications in the r&d phase, the mems industry is strong and growing, particularly in medical and optical applications. by their very nature, mems devices are microscopic and therefore difficult to observe in action. in the macroscopic world of our daily experience, inertia and gravity dominate the motion of objects. in contrast, in the microscopic domain of mems, adhesion and friction are the dominant forces. therefore, mems designers cannot rely on their intuition about how things behave.
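the shift in dominant forces can be made concrete with a simple scaling argument. the sketch below is a minimal illustration in python and is not part of magda; it assumes an idealized cube of side length l, for which volume-proportional effects (weight, inertia) scale with l^3 and area-proportional effects (such as adhesion) scale with l^2. the function names are hypothetical and chosen for this example only.

```python
def surface_to_volume(side_m: float) -> float:
    """Surface-to-volume ratio (in 1/m) of an idealized cube of side `side_m`."""
    area = 6.0 * side_m ** 2      # six square faces
    volume = side_m ** 3
    return area / volume          # equals 6 / side_m


def surface_force_gain(side_m: float, reference_side_m: float = 1.0) -> float:
    """Factor by which area-proportional effects grow relative to
    volume-proportional effects when the cube shrinks from
    `reference_side_m` to `side_m` (assumption: pure geometric scaling)."""
    return surface_to_volume(side_m) / surface_to_volume(reference_side_m)


if __name__ == "__main__":
    # Compare a 1 m part with millimetre- and micrometre-sized counterparts.
    for side in (1.0, 1e-3, 1e-6):
        gain = surface_force_gain(side)
        print(f"side = {side:g} m -> relative weight of surface effects: {gain:g}")
```

under these assumptions, shrinking a part from 1 m to 1 μm raises the weight of surface effects relative to volume effects by a factor of about one million, which is one way to see why macroscopic intuition fails at the mems scale.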
because of the different dominant forces, mems cannot simply be downscaled counterparts of larger mechanical machines; they require innovative designs and arrangements of their components, whose effects are often not fully understood. for example, a fluid pump with macroscopic dimensions would not function if it were downscaled to a miniature version with microscopic dimensions.
2.1. evolution and rise of mems
mems emerged in the late eighties and nineties with the downscaling of transistor structures into the submicron scale and the perfecting of microlithography patterning. since these early days, mems sizes have not been confined to the micron range; devices can be several millimetres in size. the intention is to keep them as small as possible. small means less material and therefore less cost and more flexibility in their placement. mems materials are no longer limited to silicon; other materials, e.g. polymers or metals, are also used. mems appeared as a new opportunity in microelectronic manufacturing, in which many of the fabrication steps and factory facilities of the semiconductor industry could also be used for mems fabrication. liga technology, developed at the fzk, germany [2] for micropatterning precise aspect ratio microstructures with steep trenches or walls, played an important role in patterning microstructures [3]. examples of achievements and benefits in aspect ratio precision with liga are micro-optical devices using filters with submicron-sized structures, waveguides and photonic crystals, or gold gears (luxury watch components) that fit so perfectly that they do not need lubrication [4]. the use of polymers opened a new opportunity for mems. one often speaks of mems as complex devices. however, the structural complexity and the functional complexity of mems [5] can be very different.
they can be made of a few simple components that produce a sophisticated function (example: a movable mirror in an optical switch), or of several components in a complex arrangement that perform a simple function (example: a microfluidic pump). owing to their small size and electronic controllability, mems can be built into larger devices, often replacing hitherto large, heavy equipment (e.g. gyroscopes) or saving time and laboratory space in chemical analysis (e.g. lab on a chip).
2.2. impact in medicine
an increasingly important impact of mems is in the medical industry, where they have changed medical diagnostics and surgery in an evolution from microgrippers to endoscopy and robotic surgery. this in turn has brought in new capabilities, e.g. ultrasonic surgery, microsurgery (e.g. in eyes or on embryos), tactile feedback and, with it, keyhole surgery with all its associated benefits [6]. mems’ share in the medical industry alone has grown into a multi-billion dollar industry in less than twenty years. another successful niche for mems, with remarkable advances, is in biophysical applications. for example, margesin et al. designed a mems for measuring the electrical activity and metabolic activity (ion concentration) in a network of neurons using ion-sensitive field-effect transistor (isfet) arrays [7]. a word of caution: microsystems and nanotechnology are often erroneously thought of as being the same. they are not; they operate at different scales of resolution. nanotechnology deals with the molecular and particle level and therefore uses different models; it has different challenges and different industrial potential. however, in microtechnology it is possible to produce nano-sized structures whenever necessary. rieth has written a good introduction to nanotechnology in a nutshell (suitable for advanced readers) [8].
2.3.
training and specialization
when it comes to education, vr is nowadays a well-established option in undergraduate multimedia curricula in many tertiary institutions, with some institutes more specialized in vr than others. in contrast, the teaching of microsystems is usually deferred to at least masters level. this is due to the multidisciplinarity required in understanding mems. institutes that are known for their excellence in the field also regularly offer specialized short courses. such short courses and summer schools provide an introduction to a specific topic; they are a valuable step towards postgraduate research. programs for short courses can easily be found through international professional organizations. examples are fsrm fondation suisse pour la recherche en microtechnique (swiss foundation for research in microtechnology), neuchatel [9], imec interuniversity microelectronics centre, leuven [10], the ieee [11] and eurotraining [12]. while a researcher or student should stick to reputable, peer-reviewed literature, one must never forget that industry is where the results of research come to fruition. there is a wealth of real-life information in industry reports that researchers and students should take advantage of, albeit with some caution. such reports complement research findings by providing eye-opening context.
2.4. mems design
here we present a selection of issues arising in mems design, with the intention of setting the scene for virtual prototyping. mems design can be overwhelming because of the wide, almost infinite range of possible structures and of the ways these structures work together to provide a useful function. mastromatteo and murari have proposed an architecture to address the diversity of mems by grouping them into the traditional categories moems, mems, lab on chip, rf mems and data storage mems.
however, it is not always possible to allocate a mems to just one of these categories in a very strict sense, owing to their cross-category functionality [13]. grouping them helps to conceptualize, but it is not a strict definition or standard. spearing analyzes scaling the size of mems in the context of macro versus micro scaling. his work explains the relation between mass–volume scaling and volume-to-area scaling. this is summarized in a table as guidance for possible scaling in mems design. the work shows how some of the most important effects of scale on mems design or performance cannot be attributed to a single physical factor. it also shows the need for fabrication processes that allow for dimensional tolerance, although this can limit the shapes achieved. as a consequence, distinct, more expensive materials may be used in mems, whose cost would be prohibitive for larger devices [14]. materials play an important role, for example in flexing membranes in micropumps or in cantilever switches. senturia offers an introductory overview of this area, covering structures, processes and modelling [15]. another good source is pelesko and bernstein, who explain structures and device behaviours by motivating and developing understanding and intuition and then moving on to modelling and optimization [16].
2.5. mems evolving manufacturing alternatives
mems are 3d devices. the traditional functional distinction is between sensors and actuators [17]. they can be one single structure or the result of many components, but they all need some circuitry or interaction to control them. in silicon manufacturing, it would save huge fabrication costs if mems and the circuitry to control them could be manufactured together on the same wafer. unfortunately, this is rarely possible because processing steps that involve heat can damage previously completed structures.
depending on the materials used, the structures of mems components are either removed from a solid material or built up in a deposition process. this is the area of microlithography and micromachining [18, 19]. there is a good range of introductory and advanced literature available, but publications hardly keep up with the fast technological advances of mems. one reason for publication delay is industrial non-disclosure. an overview of the fabrication process, materials and micromachining is presented by g. fedder [20], and of design by subramanian et al. [21]. despite mature fabrication processes, new mems applications and innovations in their fabrication constantly present new hurdles that must be overcome. this is illustrated by the design and realization difficulties of a micromachined silicon nanopositioner with electrothermal positioning by zhu et al. [22]. another example is nanochannel fabrication using bond micromachining by j. haneveld [23]. normally, the fabrication of mems is not a simple process. in traditional silicon technology, it requires a semiconductor foundry capable of handling around 200 processing steps, with very expensive equipment. some processing steps that require very specialized equipment or processing facilities may be outsourced to other foundries. experimental implementations are fundamental for frontier research. because of the high costs, research centres are often equipped with whatever funding allows, and sometimes fabrication steps have to be carried out at industrial facilities. collaboration between research institutes and manufacturers providing prototyping and fabrication services is a convenient way to overcome shortages for mutual benefit, sometimes even sponsoring research [24]. a vast range of technologies has been developed and there is more to come.
nowadays mems are not only made of silicon; other materials are also used, for example glass, photoresists or other polymers that can be patterned by laser, reactive ion etching (rie) or other technologies. for example, desbiens et al. found that for prototyping, an excimer laser (uv laser) can be used for the removal of material (ablation) in mems micromachining of 3d structures in the range of approximately 1–5 μm. their systematic study examined the interaction of repetition rate and mask dragging speed as parameters and measured the etch rate on samples of different materials: si, pzt and pyrex [25]. delille et al. have shown how photopatternable uv-sensitive adhesives can be used for patterning up to 1 cm thickness. the benefit is that the process is low cost, requires no baking and does not even require a cleanroom. some of these polymers bond irreversibly to glass, and they can be compatible with living cells [26]. the ability to work in any room and under any light condition makes their findings suitable for teaching mems fabrication. material deposition by 3d printing is becoming popular, but all depends on the purpose of the mems; 3d printing is discussed further in section 4. powering a mems requires some source of energy. for medical mems implant applications, or in situations where the mems application requires independence from a clumsy battery, this means an additional difficulty. this leads to a niche for technologies and materials that harvest energy to power a mems. iniewski et al. present a good introduction to such materials and technologies [27]. bermejo and castañer have studied driving mems electrostatic actuators with a direct photovoltaic (pv) source. the benefits are that the number of solar cells can be customized for specific mems switches, and that performance is better, with increased reliability [28].
3.
tools for modelling and simulation
this section explains the evolution of, and need for, cad systems for mems. it also shows with examples the vast diversity of problems that appear and need to be addressed. mems design and fabrication require a range of modelling techniques at different stages. on the one hand, we have the mechanical and circuit modelling for the functioning of the mems. on the other hand, we have the fabrication design and experimentation with physical modelling to find out desired properties of the mems in question. for cad tools, we have to distinguish between the mathematical modelling and the visual images providing information about what we are modelling. mathematical modelling (mm) is essential in device design. mm is also the underlying simulation tool for mems cad software. without it, there can be no serious outcome in mems design. mm occurs at different levels. napieralski et al. have produced an interesting work on the evolution of mems and modelling. they demonstrate how the advances in mems technology and modelling methodologies not only depend on each other but even pull each other forward [29]. lyshevski provides a good introduction to the fundamentals of mathematical and physical modelling in the context of mems and nems structures [30]. these models are necessary to calculate the dynamic behaviour of those structures working together for a purpose or function. in this endeavour, further calculations are needed to solve the resulting differential equations. this is done numerically, using solvers such as matlab™ and, for mems in particular, finite element analysis (fea). comsol multiphysics™ is a popular and steadily growing environment for calculations using finite elements [31]. another one is ansys™. these are not the only ones; there are other multiphysics solvers.
one important step in modelling mems is the order reduction of systems of differential equations (in particular non-linear ones) and of differential algebraic equations. greiner et al. have developed a method to reduce the order (dimension) of finite element models of second-order systems, which appears to work well for linear conditions [32]. additional practice in the numerical and experimental evaluation of the mechanical properties of mems and nems is collated by frangi et al. in [33]. it also contains an investigation by ananthasuresh into continuous parameterization, the problems that arise and ways of optimization. bechthold et al. have developed a methodology for model order reduction for a range of mems [34]. their method has the potential for an automated implementation. mems involving fluids have a substantial impact in medical applications. fluids play a special role in many microsystems because they behave differently in a microchannel than in a macroscopic space. design considerations and microfluidic behaviour require special mathematical and modelling skills in a different physical domain. nguyen and wereley provide a good introduction to this domain and to microfluidics in mems [35].
3.1. reliability
reliability is defined as the time before failure. this quantity has been used for decades, but on closer observation it does not give any indication of why a device fails. this is aggravated in mems, which comprise not just the microelectrical circuitry but also the mechanical part that goes with it. microscopic structures function in different physical domains than macroscopic devices. it may not be easy to pinpoint the source of functional failures because the dominant physical forces change gradually as the geometric dimensions increase or decrease. because the change is gradual, it may not be evident how much is due to, say, adhesion, capillarity or any other force; it is therefore appropriate to consider a more structured approach.
to address this issue, we have developed a hierarchically structured reliability model that allows different failure weights to be given to different components. some components are more robust (or vulnerable) in their design than others. likewise, some materials are more robust (or vulnerable) than others; and again, some assembly or manufacturing processes are more difficult (vulnerable) than others. our model allows different combinations of options for design, materials and manufacture or assembly to be assessed and weighed a priori [36]. in a somewhat similar way, muratet et al. have focused on failure analysis, given that the vast variety of structures in mems represents different points of material weakness and/or design failure. to demonstrate this, they developed a time-before-failure prediction model and illustrated the procedure using a wobble electrostatic micromotor as an example. they use failure testing (including failure criteria and conditions) and combine these observations with fea simulations (by including failures in the simulations) to identify risk conditions (deformation, stress), from which they derive the time-before-failure model [37].
3.2. cad systems
in the early days, a relatively small number of mems design software environments were available on the market. their application potential was largely restricted to modifications of existing library designs. dewey et al. used an analog hardware description language (vhdl-ams) for their project visual integrated-microelectromechanical vhdl-ams interactive design (vivid) [38]. often the tools were by-products of code written for the design of another specific project [39,40] and were often difficult to use [41]. they appear as a collection of tools [42], sometimes limited to specific applications [43]. one of the first cad systems for mems was memcad, built at mit in the late nineties [44].
since then many more systems have been brought to the market; some of them have disappeared, while others have evolved with state-of-the-art facilities. few have the calculation of mems manufacturing parameters as their primary purpose, and those that do are, more often than not, beyond the reach of researchers because of their high cost [45, 46]. owing to the amount of calculation involved, the development of a cad tool is not an easy task, in particular for mems. this is caused by their multifaceted, multivariate nature. we are dealing with 3d mechanical devices with critical timings (4d) and acting forces, which sense or perform chemical or spectral analysis or pattern recognition, adding to the complexity; at design time, most of this is squeezed into multiple representations on a 2d screen display. to aid imagination and interpretation, schematic drawings have progressively grown into cad tools. these in turn have diversified for specific application niches, with the purpose of being fed as a run specification program into a microlithography or micromachining tool, e.g. a laser. the reality is that design is usually a back and forth between simulations and the development of prototypes. it appears that a single streamlined workflow that goes from the cad drawing board to the fabrication of a prototype does not yet exist. in an early endeavour for optimum design, gaddi et al. developed a framework for a top-down design approach based on ic design and electrical and mechanical parameters. their aim was a hierarchically mixed design environment, using fea for validation [47]. this is unusual, because fea is normally used for calculating optimal parameter combinations in a systematic set of simulations, not the other way round. this model appears to be limited to silicon technology. one very early cad tool was developed by dasigenis et al.
their cad tool reuses a previous mems converter design and allows it to be updated to a new design, producing its new processing parameters [48]. this approach is devoid of any modern user- or menu-driven software or architecture facilities. it equates to building up a library from scratch each time another device is designed using a new technology or new materials. a classic computing simplification approach was chosen by bardohl et al., who used graphs (that is, graph descriptions of images) and transformed them into sets of reduced graphs [49]. it is questionable whether these transformations for reducing information handling are efficient or even practical in a mems–cad application. in a biometric approach to manufacturing biomedical microdevices, hengsbach and díaz lantada have produced a multiscale biomedical microsystem for addressing the effect of surface texture on cell mobility. the purpose was to fabricate multiple length scale geometries that allow interactions of implants with living tissues. they used a laser writer for the device structure (several mm) and a direct laser writer for finer, submicron-sized details. one problem that arose with the fine textures and microstructures was the cad file size of several hundred mb up to a few gb, which in turn increased the fabrication time excessively. the solution was to switch from a descriptive geometry given as sets of layers to an algorithmic geometry, by mapping a grid of channels as fractal surface functions to a matrix. this reduced the fabrication time by more than one order of magnitude [50]. another problem that requires attention is that in the design of mems, different physical or chemical phenomena must be simulated. this means several suites of solvers, and because they require surface or volume meshing for their calculations, they are slow and not suitable for interactive design.
this is an inconvenience with all cad products for mems currently on the market, for example coventor™ [51], memspro™ [52], tanner™ mems design flow [53], intellisuite™ [54] and others. they vary in capabilities, interactive facilities, speed and popularity. some of them are sophisticated, but all require a good understanding of physics, mathematics and engineering, knowledge of mems design, and training time to use the cad package efficiently. computing power has not yet resolved the problem of speed, the very essence of rapid prototyping.
4. prototyping
for larger systems, industrial rapid prototyping has played a substantial role in the development of new articles. a prototype is a model of a device with emphasis either on replicating the functioning and scale (dimensions) of the intended device or on studying its production feasibility. if the aim is to study the functioning, then neither the materials nor the production process need to be the same as those intended for regular fabrication. however, the closer to reality, the better the prototype. if the aim is to study the feasibility of a certain production process, then the intended production process must be replicated. typically this is a “no frills” approach aimed at simplification, be it a reduction in fabrication time, in the cost of materials or in the need for expensive equipment. prototyping is aimed at answering the questions “can it be done?” and “does it do what it is meant to do?”. the answer must be fast, before large sums are invested in production. this is why rapid prototyping has evolved over the years. even rapid prototyping requires, to some extent, a worked-out design. a prototype serves to eliminate design flaws or unnecessary costs early on. “no frills” means that the focus is on the functioning of a device for its intended purpose, the famous “fit for purpose”.
as an analogy, there is no point in modelling comfortable seats for an airplane that is unable to take off and fly. it must fly! that is its purpose. it was the search for faster, cheaper prototyping that enabled the evolution of mems from silicon manufacturing to other materials and processes. this happens when a mems prototype turns out to be so satisfactory that the initial experiment goes almost instantly into production maturity. it is triggered by the insight that the originally intended materials or production process can be replaced with the cheaper ones used in the prototype. this has led to an explosion of alternative mems materials and technologies, pushing innovations and applications further, from initially expensive devices to cheap single-use medical mems products. in what follows, we illustrate this with selected examples in a brief journey through time. in rapid prototyping, specific tools are sometimes required to resolve the small structures in mems. one such tool was “small spot” stereolithography, but it was insufficient for small structures and was being replaced by microstereolithography, which was not yet fully developed. bertsch et al. [55] conducted a comparative analysis of these different types of stereolithography and their suitability for mems prototyping. the resolution of conventional stereolithography was too limited for small mems structures. however, the later integral microstereolithography, with a resolution at least an order of magnitude better than small spot stereolithography, turned out to be particularly suitable for manufacturing complete layers with small 3d structures of 0.05 to 0.2 mm, but without high aspect ratio. lin et al. used a thin layer of baked-on photoresist instead of a conventional mask on soda-lime glass substrates to produce microfluidic channels approximately 36 μm deep. a two-step baking process ensured good adhesion of the photoresist to the glass.
this was then etched in an iterative sequence of dipping and wet etching with ultrasonic agitation, which led to smooth etching results. the process was aimed at fast prototyping and mass production of microfluidic systems. after successful etching, the microfluidic channels were sealed with glass chips at 580 °c. the whole process took ten hours [56]. a similar technique using photoresist was developed by sampath et al. [57] to produce free-moving structures. the authors used a 20 μm layer of patterned photoresist (su-8) to form an insulating spacer layer on a silicon wafer. they then used wafer bonding to apply a 50 μm layer of crystal silicon on top of the insulating spacer. this was followed by patterning the crystal silicon layer with rie to produce the desired structures, in this case a spring and a piston. the difficulty was to achieve a tight bond between the photoresist and the crystal silicon layer, given that the photoresist must have precisely the desired thickness, yet it is thermally sensitive and can crack in the silicon patterning steps. in mems manufacturing and prototyping we often see additive (building up in layers) and subtractive (removal of material) processing steps to achieve the desired structures. these fabrication methods allow alternative materials, often polymers, and they do not need cleanrooms. they are faster, often cheaper alternatives to traditional silicon wafer processing. li et al. have adapted shape deposition manufacturing (sdm) to microfabrication by developing an ultrasonic-based micro powder-feeding mechanism for precise microdeposition of dry powder onto a substrate. the powder patterns were then sintered with a micro-sized laser beam to clad them onto the substrate [58]. khoury et al. used liquid-phase photopolymerization for ultra-rapid prototyping of structures that are suitable as masters for micromoulding microfluidic channels.
the process is suitable for lab-on-a-chip mems used in life science, where fluids in very small quantities are used, mixed, cultured, etc. and discarded. for the process, the authors used a multichannelled universal cartridge as master. the cartridge was filled with fluid photoresist, and unwanted parts were masked off before exposure to uv to harden the desired channel geometry. the remaining structure was then rinsed, leaving the desired channels open. the process is also suitable for the fast production of microfluidic devices without micromoulding [59]. high aspect ratio (structures with deep, narrow trenches with straight walls) is a specific niche in mems. the processes that can be used for prototyping and production of mems that require high aspect ratio depend on the materials used and consequently on the intended application and life span of the mems. sarajlic et al. used plasma processing with low-pressure chemical vapour deposition (lpcvd) for high aspect ratio trenches and, after a few more processing steps, the “black silicon method” (bsm) for patterning, passivation and release with isotropic plasma etching [60]. the benefit is that bsm drastically simplified the processing by keeping everything in the same run, that is, in the same vacuum chamber. this made it suitable for rapid prototyping. in an endeavour to find alternative micromachining methods for producing a polymer-based capacitive micro accelerometer, yung et al. [61] used direct-write laser ablation (removing material by laser sublimation). they showed that this is a more convenient and suitable technology than traditional lithography methods. this is because traditional lithography as used in semiconductor manufacturing requires expensive equipment and expensive masks, which is justified for mass production but not for the smaller production runs of some mems.
the cheaper laser ablation made it ideal for non-mass-produced mems, and it is also much simpler and allows other materials. abdelgawad et al. [62] have developed a cheap technology to produce actuators with 50–60 µm electrode separation that allow microfluidic droplets of 1–12 µl to move, merge or split. they use digital microfluidics, electrowetting and electrophoresis to measure enzymatic activity (enzymatic assays). what makes their work so different is that they did most of this with very cheap resources, i.e. recycled circuit boards and compact disks (cds) for gold and metals. for electrode patterning they used an ink pen and ink masking made with a razor blade instead of expensive photolithography with uv exposure. for dielectric coating, they used cling wrap (plastic film used to cover food). for protective hydrophobic treatment, they used cheap car windshield protector instead of expensive, licensed teflon. they have successfully realized their experimental work for prototyping. reading about this work is not just inspiring; the work is also highly commendable for education. in striving for rapid prototyping of precise submicron and nanogaps, villarroya et al. have experimented with combining a focused ion beam (fib) with subsequent reactive ion etching using aluminium masking. their process achieved trenches 80 nm wide and 11 nm deep. the goal was to produce nanodots [63]. microfluidics presents another challenge to rapid prototyping. the quantities of fluid used in the end product are minuscule, and in biomedical applications the devices are used briefly and discarded. this requires large quantities of mems or nems to be produced, which makes conventional manufacturing in silicon unattractive due to its complicated and slow production and the expensive equipment of a foundry with cleanroom. do et al.
have developed a process using a cutter plotter (a “printer” that removes material) to scratch or cut through a polymer substrate, patterning the structures layer by layer with holes and trenches. the polymer sheets are then stacked one upon the other (like pancakes) and bonded. the arrays of overlapping holes make the containers for the fluids. this process achieved 20 µm wide and 30 µm deep channels in less than 30 minutes. it is fast and does not need cleanroom conditions [64]. another interesting case is printing mems onto paper. in a feasibility study, meiss et al. have developed a method and a special ink to print resistive sensors onto paper substrates using inkjet equipment. the technique can be used in iterative development and complex model design of sensors for low cost applications, such as medical disposables or consumer goods packaging [65]. speed is paramount in prototyping. 3d printing has had a substantial impact on rapid prototyping because it is a fast way of building up structures. by adding layer after layer, fine structures can be produced using materials like plastic and metal. lifton et al. have researched and compared 3d printing with other technologies. they found that for some currently silicon-based mems, the production time can be drastically reduced by replacing long-cycle prototyping and packaging loops with 3d printing. there is the potential to use 3d printing for electronic packaging of mems devices at the wafer stage. 3d printing is suitable for features larger than 1 µm, such as lab on a chip. however, 3d printing is not suitable for structural elements such as cantilevers and springs because the polymers used are incompatible with their desired functions [66].

5. virtual reality prototyping

virtual prototyping has been around for over a decade. it became possible with increased computing power and faster vr algorithms.
its application in mems is rather limited, which is understandable, given the complexity of manufacturing. cecil et al. have presented comprehensive research in virtual prototyping [67]. their work was aimed at vp in general, not necessarily mems, but the explanations are equally important for mems vp. jiang et al. have developed a proposal for a service driven mems cad design tool. contrary to traditional bottom up approaches, the authors argue for a top down approach. this project was aimed at designers with little knowledge of mems manufacturing process technologies and at the requirement to detach the mems design from its fabrication. the authors have produced a partial software prototype; its output produced bond graphs [68]. schröpfer et al. went a step further and presented an overview of different modelling levels in mems and the cad tools that are relevant to these modelling levels. it serves both mems and ic designers. the authors also analyze the differences and benefits between their applied behavioural modelling and the two popular modelling approaches, fea and the boundary element method (bem). another important feature is the use of voxels (think of pixels in 3d) instead of pixels for their displays to facilitate 3d animations [69]. cecil et al. have proposed a virtual reality-based environment for micro assembly (vrem) that is linked with the physical manufacturing. the software for the vr environment mimics and displays on screen the tools and movements from the point of view of an operator. an automated assembly sequence generator uses genetic algorithms to optimize assembly sequences. the outcomes from the virtual environment aim to produce a validated schedule for the fabrication of a mems, and the assembly instructions for the physical part (tools and autonomous robots) to be assembled by the available work cell resources. some examples of vrem have been developed as prototypes [70].
despite the potential of current computing power, the availability of virtual reality with animations is still non-existent or very limited for mems. in 2001 we initiated our mems animated graphic design aid (magda) project [71]. this project is summarized below and specific examples are shown further down. magda aims to provide starting design templates for mems structures. these design templates give priority to those parameters that are most sensitive for the proper functioning of the mems. it aims to provide a library of typical designs that can be changed further, in a similar way as dewey et al. did in their project visual integrated-microelectromechanical vhdl-ams interactive design, vivid [38]. the difference is that our starting point is based on theoretical mems whose design templates summarize the features of typical classes of mems. this provides a more general starting point with more freedom, while in vivid only those designs can be used that are available from the foundry cell libraries provided by the designers of the software. our motivation for the more generic template is to start with a “feasible” mems. we use chua’s notion of “local activity” [72] to step into the design of a mems, whose internal complexity (hence its detailed mathematical modelling) can be deferred to a subsequent stage of fine-tuning. this is important to bring mems design closer to the less specialized and to novices, and still offer (albeit limited) understanding and learning of cause and effect in mems parameters.

6. the magda project

mems animated graphic design aid (magda) is our virtual prototyping project that aims at building a simulation environment to aid in the design of mems. the purpose of magda is to overcome the weaknesses of commercially available cad software.
specifically, it aims to overcome the weakness in interface usability, by simulating the functioning of mems interactively, and by producing animated vr visualizations. it aims to contribute to the mems industry in the way that the introduction of cad packages contributed to the widespread development of vlsi, for which it was a critical step. in its implementation, magda acts as a layer between the user and existing cad solvers currently used in mems design, with a capability for calculations of its own. figure 1 shows the basic organization of magda. magda is an ambitious project that was initiated in 2001. it has attracted postgraduate students and international exchange students for research and implementation. to illustrate why it is an ambitious project, consider collision detection: a suitable algorithm has to be found among the many existing collision detection algorithms. what adds appeal to it is the vr placement of different shapes in different conditions and arrangements, for example fitting a gear or a spring into a device being assembled. magda is about manufacturability, which impacts on the suitability of the models that can be used; it neither uses nor replaces existing commercial products or finite element calculations. the objective of magda is to calculate faster for interactive early design, narrowing down to a desired range of parameters and functionality towards a prototype. the result of magda can then be used for further fine tuning with finite elements or other mathematical models. this system must have several major components that are interrelated with each other. for the virtual prototyping project, this is a huge task that must be broken down into smaller, more manageable parts. we do this by following the naturally given classification of actuators and sensors according to their operational principles.
however, we cannot isolate the design of each class of sensor, because it would defeat the overall purpose of providing a design tool that offers flexibility and allows for innovation, perhaps across different technologies. this has been exemplified in the previous sections of this paper. therefore, the virtual prototyping facility must be able to combine different technologies and at the same time work in concert with the different components of the mems to be designed. magda is not intended for virtual manufacturing; this is a different niche altogether. an exception to this is virtual etching because the different types of etching affect the shape of material removal and consequently the shape of the object (straight or curved corners, edges and shapes). for the software development, matlab™ and c were used for the physical shape design drawing board and vrml for the visualizations. the benefit of matlab is that it can be used on windows and unix os. it is widely available, affordable, and it has good graphics facilities. for the control and interaction of the mems (systems modelling) simulink is suitable. an additional advantage of using matlab is that mems design engineers can link the virtual prototyping with their earlier calculations and results if they were done in matlab. however, one severe problem with matlab is that it is not a stable software. as we have regrettably experienced, it suffers from version changes and upgrades that are not backward compatible, sometimes rendering existing software useless.

fig. 1 magda organizational diagram.

6.1. visualizations and animations

in our magda vr visualizations, we use physically based rendering. most of the code is written in vrml. we also use transparency for flexibility and easier understanding of the devices in 3d visualizations. visualizations can be rotated for easier inspection from different aspects.
images on screens are two-dimensional arrays of pixels, sometimes representing 3d and moving structures. representing specific movements by a series of lights (pixels) flashing in alternation, so that the whole series appears to move in a specific direction and to change, is not trivial, because the outcome depends on a range of “by-effects” that affect the visual perception in either good or bad ways. a well known example is wheels (or gears) rotating “backwards” while the object to which they are attached moves forwards. one of the main purposes of magda is to show animations of a functioning mems in scaled observable “real time”. this can include components simultaneously moving at cycle times that can differ by orders of magnitude, for example, a gear rotating, a cantilever flipping and a membrane bending. therefore, animations cannot be a simple zoom in time, because this would cause too much distortion between moving parts with different motion rhythms. while observing the movement of one component in slow motion, another one could come to a standstill. we have to be aware that we are performing animated visualizations of simulations that must strictly map to their object’s physical behaviour without ever degrading to a cartoon. we have addressed this problem by simulated stroboscopic illumination with flexible fine tuning of its two virtual stroboscopic flash parameters: duration and interval. this is necessary to overcome specific undesirable visual side effects (jumpy or flickering images) and hardware influences such as pixel size and computing latency effects [73, 74], and to provide a smooth observable animation. if, for example, in a visual experiment the thickness of a micropump membrane as shown below in figure 2 is changed, the two stroboscopic parameters can be reset by moving a virtual slider on the screen, to bring the new conditions again into a smooth, non-flickering animation.
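the “backwards” rotating wheel and the role of the flash interval can be illustrated with a small sketch. it is not magda's implementation (magda was written in matlab, c and vrml); the function names and the example frequencies are assumptions chosen for illustration, and the flash duration parameter is not modelled. the idea is only that a part rotating at a fixed frequency is seen, at each flash, advanced by the true angle modulo one full turn, so the observer perceives the smallest equivalent step, which may be negative.

```python
import math


def apparent_step(rot_hz: float, flash_interval_s: float) -> float:
    """Perceived rotation per flash in radians, wrapped to [-pi, pi)."""
    true_step = 2.0 * math.pi * rot_hz * flash_interval_s
    # Whole turns between two flashes cannot be seen, so only the
    # remainder modulo one turn is perceived; wrapping to [-pi, pi)
    # picks the smallest equivalent step, which may be negative.
    return (true_step + math.pi) % (2.0 * math.pi) - math.pi


def apparent_hz(rot_hz: float, flash_interval_s: float) -> float:
    """Perceived rotation frequency in Hz (negative = apparent backward motion)."""
    return apparent_step(rot_hz, flash_interval_s) / (2.0 * math.pi * flash_interval_s)


if __name__ == "__main__":
    gear_hz = 1000.0  # assumed true rotation frequency of a micro gear
    for flash_hz in (1010.0, 1000.0, 990.0):
        seen = apparent_hz(gear_hz, 1.0 / flash_hz)
        print(f"flash {flash_hz:7.1f} Hz -> gear appears to turn at {seen:+.2f} Hz")
```

a flash rate slightly above the rotation rate makes the gear appear to creep backwards, a matching rate freezes it, and a slightly lower rate gives slow forward motion. with several components moving at very different rates, one interval cannot serve all of them equally well, which is one way to see why the interval has to be adjustable by the user.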
this adjustable stroboscopic illumination makes magda different from other simulators.

fig. 2 interactive vr environment showing a micropump with flexing membrane, flow and user controls [73].

in what follows, some of magda’s research results are briefly presented, together with the difficulties they overcome. for an interactive system, fast response is paramount [71]. much of mems physical modelling is done with finite elements. despite substantially increasing computing power, they are still too slow for interactive modelling. there is another issue: the physical domain. mems can be microscopic or macroscopic. the boundary for the separation of the dominant physical forces (e.g. inertia and gravity vs. adhesion, capillarity etc.) is hazy to say the least. this is crucial for the distinction of fluidic and microfluidic modelling, because the viscosity and channel materials affect the slip length, which in turn affects the reynolds number and depends on the pumping speed or flow rate and on physical characteristics of the channel such as size, hydrophilia or hydrophobia [75]. in addition, a novice mems designer would rarely be familiar with the rather specialised topic of navier-stokes equation systems for fluid modelling. in a systematic fea analysis, we have simulated microfluidic flow by varying stepwise a set of parameters to find the distinction between laminar and turbulent flow [76]. such subtle details affect the mathematical modelling, hence the outcome, and this is important in the vast area of chemical and medical analysis. the interactive vp environment must be equipped with recommended model guidance (e.g. a pop-up alerting the user to turn on a menu for specific parameter setting combinations).

6.2. fluid flow

in a microchannel, the fluid is flowing at very high velocity. this velocity is different throughout the channel: it flows at different rates in different regions.
for example in the centre of a square section microchannel with 152 µm sides, the flow has a velocity of 8.3e10 m/s, while towards the channel walls the velocity drops to 2/3 of the maximal velocity, and touching the walls it flows only at ¼ of that velocity. this velocity reduction is due to an electric friction with the walls of the microchannel, pulling in the opposite direction to the flow, and it is induced by the high velocity of the fluid flow in the channel. in our research, we have been looking for valid replacements for finite element calculations because they are too slow. we have investigated new models for laminar and turbulent flow of microfluidics in a channel, for example how to model an inversion layer in a channel. we use a layer model for the different velocities as if they were distinct strata. this is shown in figure 3. those layers next to the channel walls rub against them, producing friction, and to a lesser extent they slow down the adjacent layer, which in turn also exerts friction on the next layer and so on. in the centre, the particles move at high speed because there is little or no friction anymore. in the outer layers, the particles move much slower due to the friction with the channel wall.

fig. 3 vr simulation of fluid entering the channel and formation of the bullet nose as it moves at different velocities (coloured layers). the vertical stripes of the flow are to distinguish the movement. to the lower right are user visualization controls (blue/green) [73].

our aim was to model the different layers of fluid as an electrical network. to do this, we segmented the flow into layers and applied the pertinent models. we first used a continuum model (euler and navier-stokes) for incompressible flow (liquids). this was done by solving the navier-stokes equation, obtaining an analytical model for the circular and a numerical model for the rectangular channels.
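for the circular channel, the analytical solution of the navier-stokes equation for steady, laminar, incompressible flow is the parabolic hagen-poiseuille profile. the sketch below slices that profile into concentric layers and treats each layer as a parallel resistor (pressure drop playing the role of voltage, volume flow the role of current). this is a simplified illustration of the layer idea only, not the simulink network of [77, 78]; the channel radius, pressure drop, viscosity, length and layer count are assumed values.

```python
import math


def velocity(r: float, radius: float, dp: float, mu: float, length: float) -> float:
    """Hagen-Poiseuille axial velocity at radial position r (m/s)."""
    return dp * (radius ** 2 - r ** 2) / (4.0 * mu * length)


def layer_flows(radius, dp, mu, length, n_layers):
    """Volume flow (m^3/s) through n concentric layers, centre layer first.

    Each layer is treated as a stratum moving at the velocity found
    at its mid-radius.
    """
    dr = radius / n_layers
    flows = []
    for i in range(n_layers):
        r_in, r_out = i * dr, (i + 1) * dr
        area = math.pi * (r_out ** 2 - r_in ** 2)
        flows.append(velocity(0.5 * (r_in + r_out), radius, dp, mu, length) * area)
    return flows


def layer_resistances(radius, dp, mu, length, n_layers):
    """Hydraulic resistance dp / Q of each layer (pressure ~ voltage, flow ~ current)."""
    return [dp / q for q in layer_flows(radius, dp, mu, length, n_layers)]


def exact_flow(radius, dp, mu, length):
    """Closed-form Hagen-Poiseuille flow rate, used as the reference."""
    return math.pi * dp * radius ** 4 / (8.0 * mu * length)


if __name__ == "__main__":
    R, DP, MU, L = 76e-6, 1.0e3, 1.0e-3, 1.0e-2  # assumed values, SI units
    res = layer_resistances(R, DP, MU, L, n_layers=20)
    q_network = DP * sum(1.0 / r for r in res)   # layers act as parallel resistors
    q_exact = exact_flow(R, DP, MU, L)
    print(f"network: {q_network:.4e} m^3/s, exact: {q_exact:.4e} m^3/s, "
          f"error: {100 * abs(q_network / q_exact - 1):.3f} %")
```

no meshing or iteration is involved: once the profile is known, the layer resistances follow directly, and the summed layer flows can be checked against the closed-form flow rate.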
the analytical and numerical solutions were then used to model the layers as an electrical network in matlab simulink. the resistances of the layers are obtained from the velocity profile of the flow. compared with ansys, our electric network model for the circular microchannel gives percentage errors of up to 6.6%, and compared with the hagen-poiseuille equation, the error is below 5.22%. one must bear in mind that the error of ansys itself can reach up to 10%. this is a satisfactory result for a faster model that requires neither meshing nor lengthy iterative calculations [77, 78].

6.3. turbulences

turbulences are an important phenomenon in fluidic mems design; they may be desired (e.g. for mixing fluids) or undesired (e.g. for medical implant medication dispensers). turbulences have several phases in their existence: a beginning, a movement, and an end phase. they can move in rotational or undulating movements. initially the velocities of the fluid can be rendered with larger patches of colour, while as the turbulence sets in, the patches become increasingly smaller. this is because a turbulent diffusion process is ongoing, but the diffusion is slow, following the swirls and eddies that characterize turbulent flow [79]. the strictly layered flow as it occurs in non-turbulent fluid starts mixing, and some parts will move faster, some slower across the channel. we have developed the cluster splitting method for displaying turbulences in a microchannel. our method is suitable for fast calculations and virtual reality visualizations in an interactive cad tool with a 2d display. instead of calculating and recalculating all the nodes in a mesh as in finite elements, our method takes advantage of redundancies. for graphic visualizations, we do not necessarily have to go down to the level of atoms or molecules. our objects of interest can be composed of macroscopic particles or clusters that interact in similar ways as smaller particles.
however, by staying in the potential domain instead of the force domain, physical approximations can be made, simplifying complex and lengthy calculations. we use the lennard-jones potential model, but instead of individual particles, we use clusters of particles [80]. we start when the fluid is pumped with a given force into a nozzle and the microchannel at (t0) with larger clusters of particles (think of circular droplets) that are moving with equal speed and direction in the stream pumped through the channel. after a time (t1) we divide each cluster in half (t2), calculate, divide again (t3) and so on [81]. the total time is the sum of a well-known geometric progression. ideally, by just dividing each cluster into two we save 50% of calculations. in reality it takes slightly more because the calculation times have to be added in both cases, cluster splitting and fea [82] for comparison. in this method, calibration is required for different materials; this becomes part of the data library. for the example shown in figure 4, we used three layers and progressively reached finer cluster granularities that are well suited to show the bullet nosed fluid flow in the channel. our calculation of the channel used 6000 clusters in our worst-case dynamic simulation examples. the corresponding fea calculations used 90000 nodes for a static image.

fig. 4 cluster splitting [81]. upper: model diagram (1:2 split); lower: simulation (1:4 split).

6.4. flexing movements

another line of research aimed at developing faster models for magda concerned flexing membranes and cantilevers. normally these components are also calculated with finite elements. we derived faster models using splines. our parameters were material, thickness of membrane and size (diameter or length of cantilever). these were fed into ansys and the values obtained were then imported into matlab where splines and quadratic polynomials were fitted to them.
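the fitting step can be sketched as follows. this is an illustration in python rather than the matlab code of [83]: it fits one least-squares quadratic per segment of a deflection curve, using three segments as in this section. the sample data are a stand-in (the textbook deflection shape of a clamped circular plate under uniform load, in normalized units), not ansys output, and the helper names are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    s = [sum(x ** k for x in xs) for k in range(5)]                  # power sums
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented normal equations for the unknowns (c, b, a).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    for col in range(3):                                             # Gauss-Jordan
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [u - f * v for u, v in zip(m[row], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def fit_segments(xs, ys, n_segments=3):
    """Fit one quadratic per segment; neighbouring segments share their end point.

    Assumes len(xs) - 1 is a multiple of n_segments.
    """
    step = (len(xs) - 1) // n_segments
    fits = []
    for k in range(n_segments):
        lo, hi = k * step, (k + 1) * step + 1
        fits.append((xs[lo], xs[hi - 1], fit_quadratic(xs[lo:hi], ys[lo:hi])))
    return fits


def evaluate(fits, x):
    """Evaluate the piecewise model at x."""
    for x_lo, x_hi, (a, b, c) in fits:
        if x_lo <= x <= x_hi:
            return a * x * x + b * x + c
    raise ValueError("x outside fitted range")


if __name__ == "__main__":
    # Stand-in data: w(r) = (1 - r^2)^2, normalized clamped-plate deflection.
    rs = [i / 30 for i in range(31)]
    ws = [(1 - r * r) ** 2 for r in rs]
    fits = fit_segments(rs, ws)
    worst = max(abs(evaluate(fits, r) - w) for r, w in zip(rs, ws))
    print(f"largest deviation of the 3-segment model: {worst:.4f}")
```

once the coefficients are stored, evaluating the membrane shape costs a few multiplications per point, which is what makes such a model attractive for interactive use compared with re-running a finite element solver.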
the fitting yields the equations describing the curves, as well as the coefficients and errors of the structures. the process involves dividing the surface into three regions or segments of curvature. figure 5 shows the difference between the real flexed membrane and the calculated values at maximum deformation. the obtained errors are still within the errors of ansys. for the purpose of magda our models can be repeated by systematic stepped analysis and then bundled and simplified into a more generic model with simple parameter input [83].

fig. 5 membrane flexing modelled as three different segments and using spline approximations (red: actual, blue: simulated).

6.5. virtual etching

again, after an in-depth comparison of available software techniques, we found that the main problem is that they use finite elements to calculate material removal. again, this is not suitable for interactive vp because at the current state of hardware it is still too slow. etching performance is well known from integrated circuit processing, but it is not so predictable in mems because the shapes are more complex. underetching is not desired in ic technology, but it is crucial in shaping and releasing mems structures for free movement. the preparations for animated anisotropic etching, both for wet and dry etching, are relatively straightforward, but isotropic etching requires a more sophisticated approach.

fig. 6a etching square mask wiremesh obtained with marker string method, 2d view [84].
fig. 6b etching square mask wiremesh obtained with marker string method, 3d view [84].
fig. 6c etching square mask (marker string method) rendered, 2d view [84].
fig. 6d etching round mask (marker string method) rendered, 3d view [84].

for visual simulations of isotropic etching we use a marker/string method for the progressive mesh as a faster method suitable for interactive design [84]. the method is not well known for etching but has been proposed for modelling other ic processing [85].
the model never took off due to a problem with swallowtail conditions that appear on corners. we have found a way of overcoming swallowtail conditions and we are also able to simulate underetching. fig. 6a and 6b show the wire meshes obtained in the progress of etching using a square lithography mask calculated in 2d then rotated, and a square mask calculated in 3d, respectively. the method can be extended to larger material removal cad visualizations. this is a crucial step towards filling a long existing need in virtual prototyping. figures 6c and 6d show rendered images (using the wire meshes calculated earlier) for etching with a square and a round mask, respectively. transparency is part of magda visualizations, to allow better perception of ongoing processes. our marker string method can be adapted for direct laser writing (dlw). for the simpler anisotropic visual simulation, we use the etch rate together with data picked from a small database of materials, crystalline orientation, and etchant. this is the input for the visualization, which is displayed progressively at simulated time intervals (typically 2 min). image transparency is used to be able to observe the progress of the concave well formed by etching using basic geometric shape masks (square, round, rectangular). this process could be used for a round shaped mask only; other mask geometries will not produce a truthful visualization [86].

6.6. microassembly

in small mems, microassembly is integrated with their production by etching out structures and then underetching them for mobility. in mems sufficiently large to be handled under a magnifying device, microassembly is done with microgrippers, but there are other means, e.g. air, magnets, liquids, etc. in magda we simulate microassembly disregarding the nature of helping devices (i.e. microgrippers) or autonomous visual servoing. we do not simulate microgrippers or aiding devices.
we mimic assembly simply by mouse movements and clicks to test the feasibility of assembly in our virtual environment. simulating assembly is important. it allows testing for conflicts or impediments in the assembly of a device before prototyping or production. for interactive vrp these algorithms have to be fast and smooth. precision in collision detection is paramount for virtual microassembly. to this end, a comparison of efficiency and suitability of collision detection algorithms was performed and a new, more suitable and more efficient algorithm was derived [87]. this algorithm exploits the essentially 2d nature (flat shapes) of typical mems components (which are often etched into a silicon wafer and then underetched and released). in order to take advantage of a new point-based collision detection method, a convex hull is computed around the object, and using this convex hull, a series of concavities is derived. the shape itself and the derived concavities are then divided into a minimum number of convex shapes. a point-based collision check against convex shapes can then be applied in one of two ways: (a) by checking all the convex bodies that make up the solid portion of the object, or (b) by checking the convex hull and the concavities to rule out a possible collision. by using the method that requires the least number of checks, we can arrive at a result in the quickest manner possible. this modification gives the method a computational advantage over other popular existing methods for 2d (and 3d) collision detection.

6.7. design desk

a design drawing board was implemented in magda. a range of shapes and typical mems components can be picked and placed on the drawing board. this includes free hand drawing of a component. all components can be edited, e.g. the number and sizes of cogs in a gear or comb. fig. 7 shows some examples of the interface. this work was done by final year students from germany [88].

fig.
7 examples of the shapes available from magda drawing desk menu. all shapes can be extruded into 3d shapes that can be placed individually or merged (intersected) with other shapes. the shapes can be associated with materials from a small database [88].

in its current state, the user interface and drawing board of magda are implemented with a good range of mems components, facilities and 3d, including rotation and assembly. the moving parts (membranes, cantilevers, fluids) and consequently the functioning of mems as described earlier are researched and published but not implemented in code. this is the sad consequence of disrupted research continuity, as happens when postgraduate students graduate and other key players retire altogether. magda should be continued, but it needs a new owner, new postgraduate students and programmers. our team has done the groundwork and set the foundation, but this is just the tip of the iceberg. one option is to continue it as a wiki with global contribution, but this is dangerous and difficult to track for scientific correctness. interactive vr can do many miracles, not necessarily real, but a vp mems design simulator must stick to reality and manufacturability. the results must not become cartoons, but neither must they inhibit what could be done in the future, for example more research on a cheap mems technology with carbon nanotubes. a fast and easy virtual prototyping environment could help in finding manufacturable designs and cheaper technology. one must never discard a jules verne-like vision. to climb a mountain one has to take a first step. we have taken that first step. now it needs a next generation and the vision to keep on climbing further.

7. conclusions

mems design and fabrication are currently in the hands of a highly skilled, highly multidisciplinary privileged minority.
to sustain this fast expanding industry, we need to find ways to foster understanding and development of intuition for mems in younger generations, and to satisfy the increasing need for innovation and new mems technologies in the following decades. the aim of this paper is to motivate scholars to engage in this endeavour and contribute to researching fast algorithms suitable for interactive virtual reality design to ease mems understanding. this paper has also presented a progression from earlier research on mems towards alternative technologies, prototyping and the mems animated graphic design aid (magda). the contribution of our research is that we demonstrated that there are alternative methods and faster calculations for the visualizations that do not compromise physical validity. magda is far from complete. we have barely scratched the surface. it needs to be developed further by dedicated programmers to complement the research that we have initiated. this requires financing, implementation with extended user facilities, populating databases and beta testing by a commercial body in continuous cooperation with a dedicated research group.

references

[1] j. m. karam, b. courtois, h. boutamine, p. drake, a. poppe, v. szekely, m. rencz, k. hofmann, and m. glesner, “cad and foundries for microsystems”, in proceedings of the 34th conference on design automation (dac ’97), anaheim, ca, usa, 1997, pp. 674-679. [2] v. saile, u. wallrabe, o. tabata, j. g. korvink, eds., liga and its applications, advanced micro & nanosystems, vol. 7, wiley-vch, 2008. [3] h. fujita, “a decade of mems and its future”, in proceedings, ieee tenth annual international workshop on micro electro mechanical systems (mems '97), 26-30 jan. 1997, pp. 1–7. [4] w. bacher, v. saile, liga, “von der trenndüse zu zahnrädern für luxusuhren”, nachrichten – forschungszentrum karlsruhe, jahrg.
38, 1-2/2006, pp. 84-86. [5] r. sitte, “about the predictability and complexity of complex systems”, in from system complexity to emergent properties, m.a. aziz-alaoui & cyrille bertelle (eds), springer series understanding complex systems, 2009, part i, pp. 23-48, isbn 978-3-642-02198-5. [6] k.j. rebello, “applications of mems in surgery”, proceedings of the ieee, vol. 92, no. 1, january 2004, pp. 45-55. [7] b. margesin, l. lorenzelli, “silicon based physical and biophysical microsystems: two case studies”, sensors and microsystems, pp. 41-50, 2008. [8] m. rieth, nano engineering in science and technology – an introduction to the world of nano design, world scientific publishing, series on the foundations of natural science and technology, vol. 6, 2003, rep. 2006, isbn 981-238-073-6. [9] http://www.fsrm.ch/ (aug. 2015) [10] http://www2.imec.be/ (aug. 2015) [11] https://www.ieee.org/index.html (aug. 2015) [12] http://ecd.eurotraining.net/ (aug. 2015) [13] u. mastromatteo and b. murari, “new architecture in designing microsystems”, in proceedings of the 7th italian conference sensors and microsystems, bologna, italy, february 2002, pp. 94-98 4 – 6. [14] s. m. spearing, acta materialia, vol. 48, issue 1, pp. 179-196, 1 january 2000. [15] s. d. senturia, microsystem design, kluwer academic publishers, boston, 2001. [16] j.a. pelesko, d.h. bernstein, modelling mems and nems, chapman & hall/crc, 2003. [17] t. fukuda, w. menz, micro mechanical systems, principles and technology, elsevier, 1998. [18] p. rai-choudhury (ed.), handbook of microlithography, micromachining and microfabrication, vol. 1, microlithography, 1997, spie optical engineering press. [19] p. rai-choudhury (ed.), handbook of microlithography, micromachining and microfabrication, vol. 2, micromachining and microfabrication, 1997, spie optical engineering press. [20] g.k. fedder, mems fabrication, proceedings ieee international test conference, itc, 2003, pp.
691 698 [21] k.subramanian, micro electro mechanical systems a design approach, springer-verlag, 2010. [22 ] y. zhu, a. bazaei, s.o.r. moheimani, m.r. yuce, “design, prototyping, modelling and control of a mems nanopositioning stage”, in proceedings of the ieee american control conference, san francisco, ca, usa, 2011, pp 2278-2283. [23] j. haneveld, “nanochannel fabrication and characteristic using bond micromachining”, phd thesis, 2006, university of twente, enschede, the netherlands. [24] g. menozzi “nexus & eurimus: two major initiatives to support r&d and strengthen european mems industry”. sensors and microsystems, pp. 13-29, 2002. [25] j.-p. desbiens, p. masson, “arf excimer laser micromachining of pyrex, sic and pzt for rapid prototyping of mems components”, sensors and actuators a 136, 554–563, 2007. [26] r. delille, m.g. urdaneta, s.j. moseley, e.smela, “benchtop polymer mems”, journal of microelectromechanical systems, vol. 15, no. 5, pp. 1108-1120, october 2006. http://www.fsrm.ch/ http://www2.imec.be/ https://www.ieee.org/index.html http://ecd.eurotraining.net/ 32 r. sitte [27] k.iniewski, s. sriram, m. bhaskaran, energy harvesting with functional materials and microsystems, crc press, 2014, taylor & francis group. [28] s. bermejo, l. castañer, “dynamics of mems electrostatic driving using a photovoltaic source”, sensors and actuators a: physical, vol. 121, issue 1, pages 237–242, 31 may 2005. [29] a. napieralski, m. napieralska, m.szermer, c. maj, “the evolution of mems and modelling methodologies”, the international journal for computation and mathematics in electrical and electronic engineering publisher:emerald group publishing limited, vol. 31, issue 5, pp. 1458 – 1469. [30] mems and nems – systems, devices and structures, s.e. lyshevski, ed., crc press, 2002. [31] multiphysics modelling with finite elements methods, world scientific, series on stability, vibration and control of systems, w.b.j. zimmermann & a. guran, eds.., series a., vol. 
18, 2006 rep., 2007. [32] a. greiner, j. lienemann, e. rudnyi, j. g. korvink, l. ferrario, m. zen “automatic order reduction for finite element models”. sensors and microsystems: pp. 411-417, 2005. [33] advances in multiphysics simulation and experimental testing of mems, eds. a. frangi , c. cercignani, s. mukherjee, n. aluru, computational and experimental methods in structures, vol. 2, 2008, imperial college press. [34] t.bechtold, , e.b. rudnyi, , j.g.korvink, “automatic order reduction of thermo-electric model for micro-ignition unit” , international conference on simulation of semiconductor processes and devices. sispad (ieee cat. no. 02th8621) 2002, pp. 131 – 134. [35] n.t. nguyen and s.t. wereley, integrated microsystems: fundamentals and applications of microfluidics (2nd edition), 2006, artech house. [36] r. sitte, “visualizing reliability in mems vr-cad tool”, journal of wscg, vol. 11, no. 3, 2003, pp. 433-439. [37] s. muratet, jy. fourniols, g. soto-romero, a. endemaño, a. marty, m. desmulliez “mems reliability modelling methodolog: application to wobble micromotor failure analysis”, microelectronics reliability, vol. 43, pp. 1945-1949, 2003. [38] a. dewey, v. srinivasan, e. icoz, “visual modeling and design of microelectromechanical system transducers”, microelectronics journal, vol. 32, issue 4, pp. 373-381, april 2001. [39] s. p. levitan, t. p. kurzweg, p. j. marchand, m. a. rempel, d. m. chiarulli, j. a. martinez, j. m. bridgen, c. fan, f. b. mccormick, “chatoyant, a computer-aided design tool for free-space optoelectronic systems”, applied optics, vol. 37, no. 26, pp. 6078-6092, september 1998 [40] www.coventor.com (sept. 2015) [41] www.cfdrc.com (sept. 2015) [42] www.ansys.com (sept. 2015) [43] d. reznik, s. brown, j. canny, “dynamic simulation as a design tool for a microactuator array”, proceedings ieee conference of robotics and automation (icra), albuquerque, nm, april 1997, pp. 1675-1680. [44] j. 
gilbert, “integrating cad tools for mems design”, ieee computer, vol. 31, issue 4, pp. 98-101, 1998. [45] www.memscap.com (sept. 2015) [46] www.intellisense.com (sept. 2015) [47] r. gaddi and j. iannacci, “hierarchical multi-domain mems simulation within an ic-design framework”, sensors and microsystems, pp. 461-466, 2004. [48] m. m. dasigenis, d. j. soudris, s. k. vasilopoulou, and a. t. thanailakis, “a cad tool for automatic generation of rns & qrns converters, microelectronics, microsystems and nanotechnology, pp. 297300, 2001. [49] r. bardohl, g. taentzer, m. minas, a. schürr, “application of graph transformation to visual languages”, handbook of graph grammars and computing by graph transformation, pp.105-180, 1999. [50] s. hengsbach, a. díaz lantada, “rapid prototyping of multi-scale biomedical microdevices by combining additive manufacturing technologies” biomed microdevices 16, pp. 617–627, 2014. [51] http://www.coventor.com/mems-solutions/ (september 2015) [52] http://www.softmems.com/mems_pro.html (september 2015) [53] http://tannereda.com/mems (september 2015) [54] http://www.intellisense.com/ (september 2015) [55] a. bertsch, p. bernhard, c. vogt, p. renaud, ,"rapid prototyping of small size objects", rapid prototyping journal, vol. 6, issue 4, pp. 259 – 266, 2000. [56] c.h. lin, g.b. lee, y.h. lin, g.l. chang “a fast prototyping process for fabrication of microfluidic systems on soda-lime glass” j. micromech. microeng. 11 pp. 726–732, 2001. 
http://griffith.summon.serialssolutions.com/2.0.0/link/0/elvhcxmwy2awntiz0eure1iskw0sddiskk2szy1nleehoqazpcqzjiuaamfkispibg-ixhmtymbkzrnluhrzdxh20iunzmrdxzbik4b9cgcncexmitgwapvlqrimcsa2qpjrkjcxphiymkqlmyaap5iypxpzpfock7_kvdnjbinc5gaatzcxjg http://www.sciencedirect.com/science/article/pii/s0924424705000749 http://www.sciencedirect.com/science/article/pii/s0924424705000749 http://www.sciencedirect.com/science/article/pii/s0924424705000749 http://www.sciencedirect.com/science/journal/09244247/121/1 http://www.worldscientific.com/series/cems http://www.coventor.com/ http://www.cfdrc.com/ http://www.ansys.com/ http://www.memscap.com/ http://www.intellisense.com/ http://www.coventor.com/mems-solutions/ http://www.softmems.com/mems_pro.html http://tannereda.com/mems http://www.intellisense.com/ mems design simplification with virtual prototyping 33 [57] s. k. sampath, l. st.clair, xingtao wu, d. v. ivanov, q. wang, c. ghosh, k. r. farmer. “rapid mems prototyping using su-8, wafer bonding and deep reactive ion etching” ieee proceedings of the fourteenth biennial university/government/industry microelectronics symposium virginia commonwealth university richmond virginia , 2001 (cat. no.01ch37197) [58] x. li, h. choi, y.yang, “micro rapid prototyping system for micro components”, thin solid films 420 –421, 515–523, 2002. [59] c. khoury, g.a. mensing , d.j. beebe “ultra rapid prototyping of microfluidic systems using liquid phase photopolymerization” lab on a chip, vol. 2, issue 1, pp 50–55, 2002. [60] e. sarajli´c, m.j. de boer, h. v. jansen, n. arnal, m. puech, g. krijnen, m. elwenspoek “advanced plasma processing combined with trench isolation technology for fabrication and fast prototyping of high aspect ratio mems in standard silicon wafers”, institute of physics publishing journal of micromechanics and microengineering, j. micromech. microeng. 14 pp. 570–575, 2004. [61] k.c. yung, s.m. mei and t.m. 
“yue rapid prototyping of polymer-based mems devices using uv yag laser”, j. micromech. microeng. 14, pp. 1682–1686, 2004 [62] m. abdelgawad, a.r.wheeler microfluidics and nanofluidics, springer verlag, 2007, 101007/s10404007-0190-3, [63] m. villarroya, n. barniol, c. martin, f. perez-murano, j. esteve, l. bruchhaus, r. jede, e. bourhis, j. gierak, “fabrication of nanogaps for mems prototyping using focused ion beam as a lithographic tool and reactive ion etching pattern”, microelectronic engineering 84, pp. 1215–1218, 2007. [64] j. do, j.y. zhang, c.m. klapperich, “maskless writing of microfluidics:rapid prototyping of 3d microfluidics using scratch ona polymer substrate”, robotics and computer-integrated manufacturing, vol. 27, issue 2, pp. 245–248, april 2011. [65] t. meiss, r.wertschützky and b.stoeber, “rapid prototyping of resistive mems sensing devices on paper substrates”, ieee 27th international conference on micro electro mechanical systems (mems), 2014, pp 536 – 539. [66] v. a. lifton, g. lifton, s. simon, “options for additive rapid prototyping methods (3d printing) in mems technology”, rapid prototyping journal, vol. 20, issue 5, pp. 403-412, 2014. [67] j. cecil, a. kanchanapiboon, “virtual engineering approaches in product and process design”, int j adv manuf technol 31, pp 846–856, 2007. [68] p. jiang, x.yan, y. liu, ,"service in e-design", journal of manufacturing technology management, vol. 18, issue 1, pp. 90 – 105, 2007. [69] g. schröpfer, g. lorenz, s. rouvillois, s. breit, “novel 3d modeling methods for virtual fabrication and eda compatible design of mems via parametric libraries, j. micromech. microeng. 20, 064003 (15pp), 2010. [70] j. cecil, j. jones, “vrem: an advanced virtual environment for micro assembly”, int j adv manuf technol 72, pp. 47–56, 2014. [71] r. sitte, “modeling mems manufacturability with virtual prototyping cad tools”, electronics and structures for mems ii, neil bergman, editor, proceedings of spie vol 4591, pp. 
125-133, 2001. [72] leon.o. chua, cnn: a paradigm for complexity, ed. leon o. chua, world scientific series in nonlinear science, 1998, series a, vol. 31. [73] z. li, “analysis and design of virtual realityvisualization for a micro electro mechanical systems (mems) cad tool”, phd thesis, 2005, griffith university, australia. [74] z. li, r. sitte, “scaling for mems virtual prototyping: size and motion dynamics visualizations”, proceedings of the 13-th international conference in central europe on computer graphics, visualization and computer vision, plzen, czech rep. , 2005, pp. 37-40. [75] c.-w. choi, k. johan, a. westin, k.s. breuer: “to slip or not to slip – water flows in hydrophilic and hydrophobic microchannels”, in proceedings of imece 2002, new orleans, louisiana, usa, 2002, pp. 1-8. [76] r. sitte, j. westphal, “sensitivity to the onset of microfluidic slip length in a microchannel”, spie volume 6035: microelectronics: design, technology, and packaging ii, cds197, pp. 6035-60350y-1 6035-60350y-8, 2005 . [77] m. aumeerally, r. sitte “layered fluid model and flow simulation for microchannels using electrical networks”, journal of simulation modelling practice and theory (elsevier) 14, pp. 82-94, 2006. [78] m. aumeerally, "simulation and modelling of microfluidic mems devices for vr-cad" phd thesis, 2006, griffith university, australia. [79] m. lesieur, o. métais, p. compte, large-eddie simulations of turbulence, cambridge university press, 2005 34 r. sitte [80] r. sitte, “introductory physics based visual simulation”, proceedings of the c# and .net technologies, plzen (czech republic), 2003, v. skala (ed), pp. 63-69. [81] r. sitte, r. kovacs, “iterative cluster splitting for fast vr visualization s of turbulences in microfluids”, european simulation and modeling conference esm'2005, porto, october 2005, pp. 455-462. [82] w. b.j. zimmerman, multiphysics modelling with finite element analysis, 2006/2007 world scientific publishing co.pte.ltd, singapore. [83] k. 
tatur, r. sitte, “spline approximations of flexible deformations for fast dynamic vr visualizations”, proceedings of the european simulation and modeling conference esmc 2003, naples, 2003, pp. 309315. [84] r. sitte, j. cai, “visualizing dynamic etching in mems vr-cad tool” proc. of the 14-th international conference in central europe on computer graphics, visualization and computer vision, plzen, czech rep. 2006, pp. 343-350. [85] d. adalsteinsson, j.a. sethian, “a level set approach to a unified model for etching, deposition, and lithography i: algorithms and two-dimensional simulations” journal of computational physics, vol. 120, pp. 128-144, 1995. [86] a. singh-jhandi, r. sitte, “virtual etching and transparency aiding in mems design”, international conference on information technology and applications (icita2002), bathurst, nsw australia, november 2002. [87] david wilson, fast collision detection and orientation for virtual assembly of microsystems, b.hon. thesis, nov. 2006, griffith university, australia. [88] a. baer, m. kellermann, v. von hintzenstern, w. schoor (university otto v. guericke, germany) “the magda project a computer aided mems design tool”, research report, griffitth university, australia, march 2003. design of microwave waveguide filters with effects of fabrication imperfections facta universitatis series: electronics and energetics vol. 30, n o 4, december 2017, pp. 431 458 doi: 10.2298/fuee1704431m design of microwave waveguide filters with effects of fabrication imperfections  marija mrvić, snežana stefanovski pajović, milka potrebić, dejan tošić university of belgrade, school of electrical engineering, belgrade, serbia abstract. this paper presents results of a study on a bandpass and bandstop waveguide filter design using printed-circuit discontinuities, representing resonating elements. 
these inserts may be implemented using relatively simple types of resonators, and the amplitude response may be controlled by tuning the parameters of the resonators. a proper layout of the resonators on the insert may yield a single or multiple resonant frequencies using a single resonating insert. the inserts may be placed in the e-plane or the h-plane of the standard rectangular waveguide. various solutions using quarter-wave resonators and split-ring resonators for bandstop filters, and complementary split-ring resonators for bandpass filters, are proposed, including multi-band filters and compact filters. they are designed to operate in the x frequency band, and the standard rectangular waveguide (wr-90) is used. besides three-dimensional electromagnetic models and equivalent microwave circuits, experimental results are also provided to verify the proposed design. another aspect of the research is a study of imperfections, demonstrated on a bandpass waveguide filter. fabrication side effects and implementation imperfections are analyzed in detail, providing relevant results regarding the most critical parameters affecting filter performance. the analysis is primarily based on software simulations, to shorten and improve the design procedure. measurement results are additionally provided to validate the approach and to confirm the conclusions regarding the crucial phenomena affecting the filter response.

key words: bandpass filter, bandstop filter, multi-band filter, printed-circuit discontinuity, equivalent circuit, fabrication effects

1. introduction

a great diversity of microwave filters can be found in modern communication systems. continuous improvement of these systems calls for microwave filters with additional features such as low cost, compact size, low loss and operation in several frequency bands. therefore, this topic still attracts significant attention in the area of microwave engineering.
received february 18, 2017
corresponding author: marija mrvić, school of electrical engineering, university of belgrade, kralja aleksandra blvd. 73, 11120 belgrade, serbia (e-mail: marija.mrvic@gmail.com)

a filter design procedure consists of several steps: specification, approximation, synthesis, simulation model, implementation, study of imperfections and optimization [1, 2]. the purpose of each step can be briefly explained as follows [3]. design starts by setting the criteria (a filter specification) to be met for the intended application. the specification should be represented mathematically, so an approximation is needed, which is in fact a filter transfer function. at that point the filter simulation model and the filter prototype (a fabricated device) may be introduced and evaluated. the study of imperfections is then performed to investigate the various effects and phenomena caused by the real components used for the filter implementation. finally, optimization may be used for systematic numerical tuning of the filter parameters to meet the specification.

amongst the available filter manufacturing technologies, rectangular waveguides are attractive in communication systems, such as radar and satellite systems, due to their ability to handle high power and their low losses [4]. in this technology, bandstop and bandpass filters can be easily implemented with properly employed feeders [5]. filters are designed by inserting discontinuities into the e-plane or h-plane of the rectangular waveguide. various types of resonators, in forms that are relatively simple to design and fabricate, can be used on these discontinuities to obtain resonating inserts with a single or multiple resonant frequencies. for the e-plane filters, it is important to properly couple the resonators of the same frequency, and to decouple the resonators operating at different frequencies.
on the other hand, for the h-plane filters, it is important to decouple the resonators with different resonant frequencies on the same insert, and to properly implement the inverters between the resonators with the same resonant frequency [6].

in this paper, various types of bandstop and bandpass waveguide filters, with single or multiple frequency bands, are presented and their characteristics are analyzed in detail. the proposed filters are designed to operate in the x frequency band (8.2–12.4 GHz); therefore the standard rectangular waveguide wr-90 (inner cross-section dimensions: width a = 22.86 mm, height b = 10.16 mm) is used and the dominant mode of propagation, te10, is considered. both e-plane and h-plane filters are presented. split-ring resonators (srrs) and quarter-wave resonators (qwrs) are used for the bandstop filter design, and complementary split-ring resonators (csrrs) for the bandpass filter design. along with the three-dimensional electromagnetic (3d em) models, equivalent microwave circuits are generated and, for the chosen examples, the obtained results are also experimentally verified. bearing in mind the operational frequency band and the implementation technology, these filters can be used as components of radar and satellite systems for various purposes [3].

a study of imperfections, based on an investigation of the fabrication side effects, is also presented and exemplified. a waveguide resonator and a third-order bandpass waveguide filter are analyzed in detail in terms of implementation imperfections, including the implementation technology, the tolerance of the machine used for fabrication and the positioning of the inserts inside the waveguide. this investigation provides relevant results regarding the most critical parameters influencing the filter performance. it is based on software simulations, thus shortening and improving the design procedure, and is verified by measurements on a laboratory prototype.
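the wr-90 dimensions quoted above determine which modes can propagate. as a minimal sketch (not part of the original paper), the standard cutoff-frequency formula for an air-filled rectangular waveguide can be used to check that the x band lies between the cutoff of the dominant te10 mode and the cutoffs of the next modes. the function and variable names are illustrative.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum (m/s)
A = 22.86e-3         # WR-90 inner width a (m), from the text
B = 10.16e-3         # WR-90 inner height b (m), from the text


def cutoff_hz(m: int, n: int, a: float = A, b: float = B) -> float:
    """Cutoff frequency of the TE_mn mode of an air-filled rectangular waveguide."""
    return 0.5 * C0 * math.hypot(m / a, n / b)


fc_te10 = cutoff_hz(1, 0)   # dominant mode
fc_te20 = cutoff_hz(2, 0)   # next mode with field variation along the width
fc_te01 = cutoff_hz(0, 1)   # first mode with field variation along the height

# TE10-only operation requires fc_te10 < f < min(fc_te20, fc_te01).
x_band = (8.2e9, 12.4e9)
single_mode = fc_te10 < x_band[0] and x_band[1] < min(fc_te20, fc_te01)

print(f"TE10 cutoff: {fc_te10 / 1e9:.3f} GHz")
print(f"TE20 cutoff: {fc_te20 / 1e9:.3f} GHz")
print(f"TE01 cutoff: {fc_te01 / 1e9:.3f} GHz")
print("X band is single-mode:", single_mode)
```

with these dimensions the te10 cutoff is about 6.56 GHz and the next cutoff is about 13.1 GHz, so only the dominant mode propagates across 8.2–12.4 GHz.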
2. bandstop filter design

bandstop filters, as key components in rf/microwave communication systems, have the important task of rejecting unwanted signals [7]. they can be easily implemented by inserting discontinuities into the e-plane or h-plane of the rectangular waveguide. the authors in [8] present an h-plane filter using horizontal and vertical stepped thin wire conductors connecting the opposite waveguide walls. the usefulness of srrs is verified for compact waveguide h-plane filter design in [9-13] and for e-plane filter design in [14-17].

in this section, e-plane and h-plane bandstop waveguide filters are discussed. both types of filters use printed resonators in the form of qwrs and srrs. compact size and independent control of the designed stopbands are common features of the presented filters. since independently tunable stopbands are achieved in different ways for the two filter types, detailed design procedures and results are presented for each.

2.1. e-plane bandstop waveguide filters using qwrs

the e-plane single-band filter design using qwrs, presented in [18], is extended to the multi-band bandstop filter design [19]. first, we consider the waveguide qwr shown in fig. 1a, designed for the resonant frequency f0 = 11 GHz. the qwr is printed on the upper side of the substrate and connected to the lower waveguide wall.

fig. 1 a) waveguide qwr: 3d model and equivalent circuit, b) comparison of amplitude responses (s11 and s21 in dB versus f in GHz) for the 3d model and the equivalent electrical circuit of the qwr

fiberglass/ptfe resin laminate (tle-95) (www.taconic-add.com) is chosen as the substrate to implement the e-plane inserts. the parameters of the substrate are: εr = 3, h = 0.11176 mm, tanδ = 0.0028, t = 0.0175 mm.
the metal losses due to the skin effect and surface roughness are taken into account by setting the conductivity to σ = 20 MS/m. the equivalent-circuit model of the waveguide qwr is shown in fig. 1a. simulated results for the 3d em model of the waveguide qwr and its equivalent circuit are compared in fig. 1b. the values of the circuit elements are calculated using equation (1), as proposed in [6]:

$$ r = \frac{2 z_0\, s_{11}(j\omega_0)}{1 - s_{11}(j\omega_0)}, \qquad l = \frac{2 z_0\, b_{3\mathrm{dB}}\, s_{11}(j\omega_0)}{\omega_0^2}, \qquad c = \frac{1}{2 z_0\, b_{3\mathrm{dB}}\, s_{11}(j\omega_0)}, \qquad (1) $$

where ω0 denotes the angular resonant frequency (rad/s), b3dB is the 3 dB bandwidth (rad/s), and s11(jω0) is the value of the s11 parameter at the considered resonant frequency. the port impedances correspond to the value of the wave impedance of the waveguide at the resonant frequency f0 = 9 GHz (550 Ω). the quality factor (q-factor) is an important parameter that characterizes a microwave resonator. a detailed determination of the q-factor for the considered resonator is given in [19]. the obtained q-factors are ql = 22.5 for the loaded resonator and qu = 175.34 for the unloaded resonator.

fig. 2 e-plane waveguide bandstop filter: a) 3d model, b) equivalent microwave circuit

2.1.1. bandstop waveguide filter and equivalent circuit

the bandstop waveguide filter using the presented qwrs is shown in fig. 2a. a printed-circuit insert consisting of two identical qwrs is placed in the e-plane of the rectangular waveguide. the center frequency of the bandstop filter can be targeted by adjusting the length of the qwrs. the qwrs are grounded to the lower waveguide wall, and the spacing between them yields the desired bandwidth. for the considered filter, the center frequency is f0 = 9 GHz and the qwrs are spaced 8.5 mm apart to achieve a bandwidth of 570 MHz.

fig. 3 comparison of amplitude responses (s11 and s21 in dB versus f in GHz) for the bandstop filter (fig. 2a) and its equivalent circuit (fig. 2b)

the equivalent circuit of the bandstop waveguide filter using qwrs is shown in fig. 2b. to develop the equivalent circuit, the qwrs are represented by mutually coupled lc resonators. the coupling is composed of the following elements: the inductor (lm) provides the magnetic part of the coupling and the capacitor (cm) provides the electric part of the coupling. the values of l and c are found from equation (1). the waveguide section of length w1 spans the distance between the middle parts of the qwrs in the 3d em model; in the calculations it is replaced by an equivalent transmission line of characteristic impedance zc = 550 Ω and electrical length θ = 1.60 rad at 9 GHz. having determined all these parameters, we can find the values of the coupling elements (lm, cm) using equation (2):

[equation (2): expressions for the resonant frequencies f1 and f2 as functions of l, lm, c, cm, zc and θ]

this equation is derived for the resonant frequencies (f1, f2) of the coupled qwrs. the numerical values of these resonant frequencies are found for the unloaded coupled resonators in the 3d em model. the values of the circuit elements are as follows: l = 0.757 nH, lm = 0.00371 nH, c = 0.4136 pF, cm = 0.00038 pF, w1 = 12.35 mm and we1 = 5.255 mm. fig. 3 shows the comparison of the simulated amplitude responses for the 3d em model and the equivalent-circuit model of the waveguide bandstop filter.

2.1.2. multi-band bandstop waveguide filter design

to validate the design of e-plane waveguide filters with multiple stopbands, filters with two and three stopbands are designed.
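as a numerical aside on section 2.1.1, equation (1) can be exercised on a simple circuit model. the sketch below assumes that the resonator is a parallel rlc tank connected in series with the line, which is the topology consistent with the form of equation (1) but is not confirmed by the text; the bandwidth and s11 values passed to the function are hypothetical. it also checks that the quoted pair l = 0.757 nH, c = 0.4136 pF resonates close to 9 GHz.

```python
import math


def rlc_from_response(z0, f0_hz, b3db_hz, s11_at_f0):
    """Equation (1): element values from the port impedance z0, the resonant
    frequency, the 3 dB bandwidth and the value of s11 at resonance."""
    w0 = 2 * math.pi * f0_hz
    b3db = 2 * math.pi * b3db_hz          # bandwidth in rad/s
    r = 2 * z0 * s11_at_f0 / (1 - s11_at_f0)
    l = 2 * z0 * b3db * s11_at_f0 / w0**2
    c = 1 / (2 * z0 * b3db * s11_at_f0)
    return r, l, c


def s11_series_tank(z0, r, l, c, f_hz):
    """s11 of a parallel RLC tank inserted in series between two z0 ports
    (assumed topology)."""
    w = 2 * math.pi * f_hz
    z_tank = 1 / (1 / r + 1j * w * c + 1 / (1j * w * l))
    return z_tank / (z_tank + 2 * z0)


def resonant_frequency_hz(l, c):
    """Resonant frequency of an LC pair."""
    return 1 / (2 * math.pi * math.sqrt(l * c))


# Hypothetical inputs: 550 ohm ports (from the text), assumed bandwidth and s11.
Z0, F0, B3DB, S11 = 550.0, 9e9, 400e6, 0.9
r, l, c = rlc_from_response(Z0, F0, B3DB, S11)

# Round trip: the circuit built from (r, l, c) reproduces the inputs.
s11_peak = abs(s11_series_tank(Z0, r, l, c, F0))
s11_edge = abs(s11_series_tank(Z0, r, l, c, F0 + B3DB / 2))
print(f"|s11| at f0: {s11_peak:.3f}")
print(f"|s11|^2 relative to peak at f0 + B/2: {(s11_edge / s11_peak) ** 2:.3f}")

# Element values quoted in the text for the 9 GHz filter.
f_res = resonant_frequency_hz(0.757e-9, 0.4136e-12)
print(f"resonance of quoted l, c: {f_res / 1e9:.3f} GHz")
```

for this assumed topology, |s11| reaches the prescribed value at f0 and falls by about 3 dB at f0 ± b3dB/2, which is the sense in which the bandwidth enters equation (1) here.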
the presented filters exhibit independent control of the designed stopbands (icds). 3d models of the non-miniaturized icds (nmicds) and miniaturized icds (micds) dual-band bandstop waveguide filters are shown in fig. 4. the specified center frequencies of the dual-band bandstop filter are f01 = 9 GHz and f02 = 11 GHz. in the nmicds dual-band filter design, all of the printed qwrs are connected to the same waveguide wall.

fig. 4 3d models of the nmicds and micds e-plane dual-band waveguide filters

to eliminate the unwanted coupling between the qwrs for different stopbands, they are separated by a spacing of 12.5 mm. in that manner, each of the stopbands can be controlled individually, and the whole filter can be regarded as a cascade connection of bandstop filters, each intended for a particular stopband. the overall length of the nmicds filter is 0.876 λg, where λg denotes the guided wavelength at the center frequency of the lower stopband. to reduce the footprint of the nmicds filter, the qwrs for different stopbands are connected to different waveguide walls, which is a relatively simple way to implement the micds dual-band bandstop waveguide filter. the amplitude responses of the nmicds and micds filters match exactly. for the micds filter, the unwanted coupling is overcome by shifting the qwrs for the specified stopband along the upper waveguide wall. it was found that the minimal value of the shift is 12 mm. the overall length is thereby reduced to 0.512 λg. the equivalent microwave circuit of the nmicds dual-band bandstop filter is the cascade of the equivalent networks of the single-band filters (fig. 2b) with the specified center frequencies, and it is shown in fig. 5a. the port impedances are set to 500 Ω, which corresponds to the wave impedance at 10 GHz (the frequency midway between the considered center frequencies).
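the port impedances quoted in the text (550 Ω at 9 GHz, 500 Ω at 10 GHz) and the normalized filter lengths can be cross-checked with the standard te10 expressions for an air-filled waveguide. the following is a minimal sketch; the helper names are illustrative, and the free-space constants are the only inputs not taken from the text.

```python
import math

C0 = 299_792_458.0        # speed of light in vacuum (m/s)
ETA0 = 376.730313668      # free-space wave impedance (ohm)
A = 22.86e-3              # WR-90 inner width (m)
FC = C0 / (2 * A)         # TE10 cutoff frequency (Hz)


def _dispersion(f_hz):
    """sqrt(1 - (fc/f)^2), valid above cutoff."""
    if f_hz <= FC:
        raise ValueError("frequency must be above the TE10 cutoff")
    return math.sqrt(1 - (FC / f_hz) ** 2)


def wave_impedance_ohm(f_hz):
    """TE10 wave impedance of the air-filled waveguide."""
    return ETA0 / _dispersion(f_hz)


def guided_wavelength_m(f_hz):
    """TE10 guided wavelength of the air-filled waveguide."""
    return (C0 / f_hz) / _dispersion(f_hz)


z_9ghz = wave_impedance_ohm(9e9)
z_10ghz = wave_impedance_ohm(10e9)
lg_9ghz_mm = guided_wavelength_m(9e9) * 1e3

print(f"wave impedance at 9 GHz:  {z_9ghz:.1f} ohm")
print(f"wave impedance at 10 GHz: {z_10ghz:.1f} ohm")
print(f"guided wavelength at 9 GHz: {lg_9ghz_mm:.2f} mm")

# Normalized lengths from the text, converted to millimetres.
for name, length_in_lg in (("nmicds", 0.876), ("micds", 0.512)):
    print(f"{name} length: {length_in_lg * lg_9ghz_mm:.1f} mm")
```

both computed impedances agree with the quoted 550 Ω and 500 Ω to within about 1 Ω.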
the values of the equivalent-circuit elements of the filter at 9 GHz remain unchanged, while the values of the circuit elements for the filter at 11 GHz are: l2 = 0.518 nH, lm2 = 0.001122 nH, c2 = 0.4036 pF, cm2 = 0.0007 pF, w2 = 10.98 mm, wm = 15.92 mm and we = 2.705 mm. the amplitude responses of the 3d em model and its equivalent circuit are compared in fig. 5b.

fig. 5 a) equivalent microwave circuit of the nmicds filter from fig. 4, b) comparison of amplitude responses (s11 and s21 in dB versus f in GHz) for the 3d em model of the nmicds filter and its equivalent circuit

fig. 6 a) tbbwf, b) comparison of amplitude responses (s11 and s21 in dB versus f in GHz) for the icds tbbwf and the single-band filters for each specified center frequency (9 GHz, 10 GHz and 11 GHz)

following the proposed design guidelines, a triple-band bandstop waveguide filter (tbbwf) is designed for the specified center frequencies f01 = 9 GHz, f02 = 10 GHz and f03 = 11 GHz. the middle stopband is obtained by adding a pair of identical qwrs whose length is tuned to resonate at f0 = 10 GHz. the tbbwf thus consists of alternating pairs of qwrs for different stopbands, attached to the top and bottom waveguide walls. the 3d model of the tbbwf is shown in fig. 6a. the proposed design of the filter with three stopbands assumes that the qwrs for the second and third stopband are connected to the same waveguide wall, while the qwrs for the first stopband are grounded to the opposite waveguide wall. the distances between the qwrs are set to secure the independent control of the stopbands.
a comparison of amplitude responses for the tbbwf and the single-band filters for each specified center frequency is given in fig. 6b. the total length of the tbbwf is 0.86 λg, λg being the guided wavelength at the lowest center frequency.

fig. 7 a) 3d model of the uc dual-band bandstop waveguide filter, b) comparison of simulated amplitude responses (s11 and s21 in dB versus f in GHz) for the uc and icds dual-band waveguide filters

2.1.3. miniaturization

further miniaturization of the micds dual-band bandstop filter is achieved through several steps. the 3d model of the resulting ultra-compact (uc) dual-band bandstop filter is shown in fig. 7a. some of the geometric parameters are given symbolically in order to investigate their impact on the filter response. the qwrs for different stopbands are printed on different sides of the insert. the aim was to preserve the characteristics of the micds filter while reducing the length of the filter. the total length of the uc dual-band bandstop filter is 0.295 λg. the proximity of the participating qwrs restricts the independent control of the stopbands. a comparison of simulated amplitude responses for the uc and icds dual-band bandstop waveguide filters is shown in fig. 7b. the effect of varying the parameters on the center frequencies and the obtained bandwidths is summarized in table 1.

table 1 influence of the parameters on the response of the uc dual-band filter

parameter (mm)    f01 (GHz)    b3dB1 (MHz)    f02 (GHz)    b3dB2 (MHz)
c21 ↑             −            −              ↓            ↑
c11 ↓             ↓            ↓              ↓            ↑
r2 ↓              −            ↓              −            ↑
r1 ↓              ↑            ↑              ↓            ↑
d11 ↑             ↓            ↑              −            ↑
m ↑               ↑            ↑              −            ↓

further miniaturization options that were considered include qwrs in the straight form and an increase of the dielectric constant of the substrate used for the implementation of the qwrs.
the filter design with qwrs in the straight form features significantly wider bandwidths compared to the case when qwrs are implemented as folded elements. so, to preserve the characteristics of the icds filter, the space between the qwrs should be increased, resulting in longer filter than micds. the same effect is observed for substrates with higher permittivity (εr). since the higher εr makes the length of the printed qwrs shorter, the bandwidth became significantly wider. so, we had to increase the distance between the qwrs, which in turn increases the length of the filter. as a consequence, that filter is longer than our proposed realization. additional solution for miniaturization is proposed in [20], where connection of the qwrs for specified stopband to the opposite waveguide walls is suggested. 7.5 8 8.5 9 9.5 10 10.5 11 11.5 12 12.5 -40 -35 -30 -25 -20 -15 -10 -5 0 f (ghz) 3d em exp 3d em exp s 21 (db) s 11 (db) (a) (b) fig. 8 a) a photograph of the fabricated e-plane dual-band bandstop waveguide filter. b) comparison of the simulated and measured results 2.1.4. experimental verification in order to demonstrate the effectiveness of the proposed design, the e-plane dualband bandstop filter is verified on a fabricated prototype (fig. 8a). the amplitude response was measured using agilent n5227 network analyzer. fig. 8b shows comparison between the measured and simulated amplitude responses for the dual-band bandstop 440 m. mrvić, s. stefanovski pajović, m. potrebić, d. tošić filter. measured response is in good agreement with the 3d em simulation results. slight discrepancies are observed in terms of the passband insertion loss, which occurred as a consequence of the losses within the waveguide walls and transitions from waveguide wr-90 to sma connectors (waveguide-to-coaxial adapters). these losses have not been taken into account during the 3d em analysis of the considered filter. 2.2. 
h-plane bandstop waveguide filters using srrs

for the implementation of the h-plane filter, srrs in the form of the printed-circuit inserts are positioned in the transverse plane of the standard wr-90 waveguide [11, 12]. the printed-circuit inserts are implemented using copper clad ptfe/woven glass laminate (tlx-8) with the parameters: εr = 2.55, tanδ = 0.0019, h = 1.143 mm and t = 0.018 mm. the losses due to the skin effect and surface roughness are taken into account by setting the conductivity to σ = 20 MS/m.

fig. 9 srr: a) 3d and equivalent circuit model. b) comparison of amplitude responses

2.2.1. waveguide srr

3d model of the considered h-plane waveguide srr is presented in fig. 9a. it is designed for resonant frequency of 11 ghz, so appropriate dimensions are given. equivalent circuit model is also presented in fig. 9a, and the values of the circuit elements are obtained using equation (1). comparison of amplitude responses for the 3d em model and its equivalent circuit is shown in fig. 9b.

2.2.2. third-order bandstop waveguide filter using srrs

a third-order bandstop waveguide filter using srrs is designed for the center frequency f0 = 11 ghz [11, 12]. 3d model of the filter is shown in fig. 10a, and its response is given in fig. 10b. the h-plane inserts are separated by waveguide sections of length λg11ghz/4 = 8.494 mm, to implement the quarter-wave inverters for the center frequency.

fig. 10 h-plane bandstop filter using srrs: a) 3d model b) amplitude response

equivalent microwave circuit of the third-order bandstop filter is shown in fig. 11a, and fully corresponds to the 3d em model of the filter. in the presented circuit, losses are not taken into account.
values of the elements of the circuit are calculated using equation (1). comparison of the amplitude responses for the 3d em model and the equivalent microwave circuit is presented in fig. 11b. a good agreement between the results is observed in terms of the center frequency and the obtained bandwidth.

fig. 11 a) equivalent microwave circuit of the h-plane bandstop filter. b) comparison of the amplitude responses for the 3d em model and the equivalent microwave circuit

2.2.3. third-order dual-band bandstop filter using srrs

to verify the usefulness of the design, a third-order h-plane dual-band bandstop filter is proposed for the center frequencies f01 = 9 ghz and f02 = 11 ghz [11, 12]. 3d model of the filter is shown in fig. 12a. srrs for different stopbands are separated by the quarter-wavelength waveguide sections to realize the immittance inverters for the corresponding center frequency. so, designed stopbands can be controlled independently. srrs for the different stopbands are distanced by (λg9ghz − λg11ghz)/4 = 3.678 mm.

fig. 12 h-plane dual-band bandstop filter: a) 3d model b) amplitude response

3. bandpass filter design

bandpass waveguide filters can be designed using inserts with different types of resonators. the inserts may be placed in the e-plane or h-plane of the rectangular waveguide. herein, bandpass waveguide filters using h-plane inserts with csrrs as resonating elements are considered. in fact, as relatively simple resonators to model and fabricate, providing bandpass frequency response, csrrs are widely used for bandpass waveguide filter design.
they allow us to control the frequency response by modifying their parameters, thus providing flexible design. some of the previously reported solutions can be found in the open literature. in [21], the use of csrr for the h-plane bandpass design is demonstrated. a third order bandpass filter using csrrs is presented in [22], while compact solution can be found in [23].

3.1. resonating inserts with csrrs

resonating insert with csrr, placed in the h-plane of the standard rectangular waveguide (wr-90), is assumed to be a basic element of the higher-order filters. therefore, various implementations of the waveguide resonators with such inserts are possible.

first, waveguide resonator using multi-layer planar insert with csrr is shown in fig. 13a. substrate used for the printed-circuit insert is copper-clad polytetrafluoroethylene (ptfe)/woven glass laminate (tlx-8) (http://www.taconic-add.com). the parameters of this substrate are as follows: εr = 2.55, tan δ = 0.0019, h = 1.143 mm and t = 18 μm. the specification of this resonator requires a resonant frequency of f0 = 11.1 ghz and a 3-db bandwidth of b3db = 520 mhz. the equivalent microwave circuit of the waveguide resonator is also given in fig. 13a. the following equations [6, 24] are used for calculation of the circuit parameters:

$R = \dfrac{|S_{21}(j\omega_0)|}{2\,(1 - |S_{21}(j\omega_0)|)}\,Z_0$, $L = \dfrac{B_{3\mathrm{dB}}\,Z_0\,|S_{21}(j\omega_0)|}{2\,\omega_0^{2}}$, $C = \dfrac{2}{B_{3\mathrm{dB}}\,Z_0\,|S_{21}(j\omega_0)|}$, (3)

where $\omega_0$ denotes the angular frequency (rad/s), $B_{3\mathrm{dB}}$ is the 3 db bandwidth (rad/s), and $S_{21}(j\omega_0)$ is the value of the s21 parameter at the considered resonant frequency. the port impedances $Z_0$ correspond to the value of the wave impedance of the waveguide for the resonant frequency of f0 = 11.1 ghz (468 Ω).

as shown in fig. 13b, the amplitude response meets given specification, for the chosen csrr dimensions.
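as a minimal sketch of how eq. (3) can be evaluated, the snippet below computes the te10 wave impedance of an air-filled wr-90 waveguide and the lumped r, l, c values for the stated specification (f0 = 11.1 ghz, b3db = 520 mhz). the function names, the rounded constants and the value `S21_AT_F0 = 0.95` are illustrative assumptions, not data from the paper; the formulas follow our reading of eq. (3).

```python
import math

def te10_wave_impedance(f_hz, a_m=22.86e-3, c=3.0e8, eta0=376.73):
    """Wave impedance (ohm) of the TE10 mode in an air-filled rectangular waveguide."""
    fc = c / (2.0 * a_m)  # TE10 cutoff frequency of a guide with broad wall a_m
    return eta0 / math.sqrt(1.0 - (fc / f_hz) ** 2)

def rlc_from_response(s21_mag, f0_hz, b3db_hz, z0):
    """Lumped R, L, C from |S21| at resonance, f0 and the 3 dB bandwidth (our reading of eq. (3))."""
    w0 = 2.0 * math.pi * f0_hz      # angular resonant frequency, rad/s
    bw = 2.0 * math.pi * b3db_hz    # 3 dB bandwidth, rad/s
    res = s21_mag * z0 / (2.0 * (1.0 - s21_mag))
    ind = bw * z0 * s21_mag / (2.0 * w0 ** 2)
    cap = 2.0 / (bw * z0 * s21_mag)
    return res, ind, cap

F0 = 11.1e9        # resonant frequency from the specification
B3DB = 520e6       # 3 dB bandwidth from the specification
S21_AT_F0 = 0.95   # hypothetical |S21| at resonance (not given in the text)

Z0 = te10_wave_impedance(F0)
R, L, C = rlc_from_response(S21_AT_F0, F0, B3DB, Z0)
print(f"Z0 = {Z0:.1f} ohm, R = {R:.0f} ohm, L = {L * 1e12:.2f} pH, C = {C * 1e12:.3f} pF")
```

with these expressions the product l·c equals 1/ω0², so the circuit resonates at f0 regardless of the assumed |s21|; only r, and the split between l and c, depend on it.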
also, there is a good agreement of the obtained amplitude responses of the 3d em model and equivalent circuit. the printed-circuit insert presented here used basic csrr form; however, csrr may have additional elements for the amplitude response fine-tuning, as exemplified in [24, 25].

fig. 13 waveguide resonator using multi-layer planar insert with csrr: a) 3d model and equivalent microwave circuit, b) comparison of amplitude responses

besides multi-layer planar structures, the resonating insert can be a pure metallic structure, which is even easier to implement (fig. 14a). the thickness of the metal insert is 100 μm. the conductivity of the metal plates is set to σ = 20 MS/m to include the losses (the surface roughness and the skin effect). this resonator achieves resonant frequency of f0 = 11.06 ghz (fig. 14b).

fig. 14 waveguide resonator using metal insert with csrr: a) 3d model, b) amplitude response

in the previously considered models, csrr was centrally positioned on the insert. however, this is not mandatory; in fact, by changing the position of the resonator (besides modifying its parameters) one can influence the frequency response.

relatively simple design of the waveguide resonator using metal insert with csrr-like resonator attached to the top waveguide wall [26] is depicted in fig. 15a. the obtained amplitude response, having bandpass characteristic, is shown in fig. 15b (f0 = 11.06 ghz, b3db = 680 mhz). similarly, resonator can be attached to the bottom waveguide wall.

a common property of all presented types of inserts is that more than one resonator can be accommodated on the insert, thus allowing for multiple resonant frequencies.
in fact, by properly positioning the resonators, each frequency band can be independently tuned, by modifying parameters of a single resonator. this is an important property for the multi-band filter design. some of the previously reported printed-circuit discontinuities with multiple resonant frequencies can be found in [3, 6, 24, 26-28].

fig. 15 waveguide resonator using metal insert with csrr: a) 3d model, b) amplitude response

3.2. third-order filter using csrrs

starting from the printed-circuit insert with csrr, a higher-order filter can be designed. since the resonating circuits are connected in parallel, inverters are needed between them [4, 29]. for the waveguide filter design, an inverter can be deployed as a quarter-wave waveguide section at the center frequency of interest, as explained in [3, 6].

a third-order bandpass filter, with a single pass band, is considered as an example of the higher-order filter design. it uses multi-layer planar inserts with csrrs of the same substrate as the insert shown in fig. 13a. the 3d model of the filter is shown in fig. 16a. the filter is designed to meet the following specification: f0 = 11 ghz, b3db = 300 mhz. therefore, the parameters of the csrrs are set to achieve that. also, the waveguide sections of length equal to λg 11ghz/4 = 8.49 mm represent inverters between the resonating elements. fig. 16b shows the obtained amplitude response.

3.3. multi-band bandpass filter design

as previously stated, resonating inserts with multiple resonant frequencies can be used for the higher-order multi-band filter design. however, it is necessary to properly design the inserts and the inverters, as well. this means that each waveguide section representing inverter has to be of the proper length equal to λg/4 (λg is guided wavelength in the waveguide), for each center frequency.
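the quarter-wave inverter lengths can be estimated from the te10 guided wavelength. the sketch below assumes an air-filled wr-90 waveguide (broad wall 22.86 mm) and a rounded speed of light of 3·10⁸ m/s; the function names are illustrative. under these assumptions it approximately reproduces the 8.49 mm quoted for 11 ghz.

```python
import math

C0 = 3.0e8          # speed of light in m/s (rounded; an assumption of this sketch)
A_WR90 = 22.86e-3   # broad-wall width of the WR-90 waveguide in m

def guided_wavelength(f_hz, a_m=A_WR90, c=C0):
    """Guided wavelength (m) of the TE10 mode in an air-filled rectangular waveguide."""
    fc = c / (2.0 * a_m)  # TE10 cutoff frequency
    if f_hz <= fc:
        raise ValueError("frequency is at or below the TE10 cutoff")
    return (c / f_hz) / math.sqrt(1.0 - (fc / f_hz) ** 2)

def quarter_wave_inverter_mm(f_ghz):
    """Length (mm) of a quarter-wave waveguide section acting as an inverter at f_ghz."""
    return guided_wavelength(f_ghz * 1e9) / 4.0 * 1e3

for f_ghz in (9.0, 11.0):
    print(f"{f_ghz:4.1f} GHz: lambda_g / 4 = {quarter_wave_inverter_mm(f_ghz):.2f} mm")
```

because λg depends on frequency, a single waveguide section cannot be a quarter-wave inverter for two center frequencies at once, which is the constraint a multi-band design has to work around.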
therefore, the folded inserts have been introduced as an adequate solution [3, 6, 25, 27, 28], being a novel solution at the same time, compared to the available open literature.

to exemplify the use of the folded inserts for the filter design, a second-order dual-band (f01 = 9 ghz, f02 = 11 ghz) filter model with two multi-layer planar inserts is shown in fig. 17a. as can be seen, the parts of the inserts with csrrs are mutually separated by the proper distance to meet the inverter requirement, and the fold is achieved by adding a metal plate to connect these parts. the substrate used for the inserts is rt/duroid 5880 (εr = 2.2, h = 0.8 mm) (http://www.rogerscorp.com). according to fig. 17a, the lengths of the inverters are λg 9ghz/4 = 12.17 mm and λg 11ghz/4 = 8.49 mm and the metal plate length is lpl = (λg 9ghz − λg 11ghz)/8 = 1.84 mm. for the insert designated as i1, the width of the metal plate corresponds to the waveguide width. the other possibility is to have a narrow plate connecting the resonating inserts (insert i2). in the considered example, the width of the metal plate is set to wpl = 3 mm. the obtained amplitude responses for the filters having both inserts implemented as i1 or i2 (with the same dimensions and positions of the csrrs) are compared in fig. 17b. as can be seen, for the model with i2 inserts, a transmission zero occurs above the upper band. since dimensions of the csrrs have not been tuned for the i2 insert, the discrepancy between the parameters of the frequency bands is notable; however, the idea is to present the design possibilities and to point at their influence.

fig. 16 third-order bandpass filter using multi-layer planar inserts with csrrs: a) 3d model, b) amplitude response

the previous model may be simplified by using metal inserts [27, 28], instead of the multi-layer planar ones (fig. 18a).
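as a worked check of the folded-insert dimensions quoted above, and assuming (our reading of fig. 17a) that the difference between the two inverter lengths is shared equally by the metal plates of the two folded inserts:

```latex
\[
l_{pl}
  = \frac{1}{2}\left(\frac{\lambda_{g,\,9\,\mathrm{GHz}}}{4}
                   - \frac{\lambda_{g,\,11\,\mathrm{GHz}}}{4}\right)
  = \frac{\lambda_{g,\,9\,\mathrm{GHz}} - \lambda_{g,\,11\,\mathrm{GHz}}}{8}
  \approx \frac{12.17\ \mathrm{mm} - 8.49\ \mathrm{mm}}{2}
  = 1.84\ \mathrm{mm}
\]
```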
the filter is designed to meet the following specification: f01 = 9 ghz, b3db-1 = 450 mhz, f02 = 11 ghz, b3db-2 = 650 mhz. regarding the inverter implementation, the same holds as for the filter in fig. 17a. therefore, the distances between the resonators are λg 9ghz/4 = 12.17 mm and λg 11ghz/4 = 8.49 mm. the length of the metal plate of the folded insert is lpl = (λg 9ghz − λg 11ghz)/8 = 1.84 mm.

an equivalent microwave circuit has been generated for this filter in ni awr microwave office (http://www.awrcorp.com) (fig. 18b). each resonating insert is represented by a network consisting of rlc circuits (for each csrr) and an inductor connected between them. the inverter is represented by a waveguide section of length equal to λg 9ghz/4, inserted between these networks. the details regarding equivalent microwave circuit and the equations used for calculation of the lumped elements parameters can be found in [3, 6, 25, 27, 28]. comparison of the amplitude responses obtained by a 3d em simulation and an equivalent circuit is given in fig. 18c.

fig. 17 second-order dual-band bandpass filter using multi-layer planar inserts with csrrs: a) 3d model, b) amplitude responses

fig.
18 second-order dual-band bandpass filter using metal inserts with csrrs: a) 3d model, b) equivalent microwave circuit, c) comparison of amplitude responses

in order to develop a compact filter, the waveguide sections representing inverters may be shortened; however, an additional properly designed insert between the resonating inserts is needed to preserve the original filter response. this way of miniaturization assumes that the normalized lengths of the inverters are the same for both center frequencies, but the resonating inserts are still folded. for the sake of easier fabrication, a solution with flat inserts has been proposed as a more suitable one [3, 6, 25, 28] (fig. 19a). as can be seen, the additional insert is still needed; however, the inverters for the center frequencies are not miniaturized in the same manner (the normalized length of the inverter for the csrrs with f01 = 9 ghz is λg 9ghz/8, while the normalized length of the inverter for the csrrs with f02 = 11 ghz is equal to 0.18λg 11ghz). fig. 19b shows comparison of amplitude responses before and after applying inverter miniaturization, with the same and different normalized lengths of the inverters, for the considered second-order dual-band filter.

fabrication of the flat metal inserts is relatively simple; however, supporting plates and fixtures are needed in order to have stable inserts inside the waveguide [6, 30]. in [3, 28] a detailed explanation regarding filter fabrication can be found, including the implementation of the structure for precise positioning of inserts. the proposed solution has been successfully deployed for the experimental verification and the measured results have shown good agreement with the simulated ones.

fig.
19 compact second-order dual-band bandpass filter using flat metal inserts: a) 3d model, b) comparison of amplitude responses of the filter without inverter miniaturization (blue), with equal (red) and unequal (green) inverter miniaturization

fig. 20 second-order dual-band bandpass filter using metal inserts with csrrs attached to the waveguide walls: a) 3d model, b) amplitude responses

multi-band filters may also be designed using inserts with resonators attached to the waveguide walls, as in the single-resonator example shown in fig. 15a. a second order dual-band filter, with two folded metal inserts, is shown in fig. 20a [26]. dimensions of the resonators are tuned to provide center frequencies of f01 = 9 ghz and f02 = 11 ghz. therefore, the lengths of the inverters are λg 9ghz/4 = 12.17 mm and λg 11ghz/4 = 8.49 mm and the metal plate length is lpl = (λg 9ghz − λg 11ghz)/8 = 1.84 mm. proposed resonators occupy less space on the insert, compared to the centrally positioned csrrs, for the same resonant frequencies (the occupied area may be reduced up to 20 %). for the insert designated as i1, the width of the metal plate corresponds to the waveguide width. the other possibility is to have a narrow plate connecting the resonating inserts (insert i2). in the considered example, the width of the metal plate is set to wpl = 2 mm. the obtained amplitude responses for the filters having both inserts implemented as i1 or i2 (with the same dimensions and positions of the resonators) are compared in fig. 20b. as can be seen, for the model with i2 inserts, a transmission zero occurs above the upper band and better matching is obtained for that band, as well; however, at the expense of the wider band.

4.
bandpass waveguide filter fabrication side effects

an important step of the filter design procedure is certainly experimental verification, i.e. the measurement of the filter response on a fabricated prototype. at that point, obtained simulation models may be optimized and corrected and another control fabrication may be performed [31].

fabrication process itself may affect the obtained filter responses; thus a study of imperfections should be carried out in order to estimate the influence of the fabrication side effects on the amplitude response. this topic has already gained attention, since some previously published papers considered the influence of the substrate parameters on the frequency response of the microwave structures (e.g. [32]). regarding waveguide filters fabrication and possible deviations of the frequency response, some of the available solutions can be found in [33-38].

in our study, we have considered various implementation imperfections and fabrication side effects influencing the frequency response of the bandpass waveguide filters [31]. since these filters use printed-circuit inserts as discontinuities, we have taken into account the parameters of the substrate (dielectric permittivity, thickness, losses, including the tolerances) used for the multi-layer planar inserts. furthermore, a machine used for fabrication may introduce some inaccuracy and imperfections during the fabrication of the inserts. finally, it is not always possible to have stable and perfectly positioned inserts in the waveguide during the measurement and regular operation, so this should be also taken into account when investigating filter response deviation.
our goal was to investigate the influence of the aforementioned imperfections on the bandpass waveguide filter amplitude response by making precise 3d em models, which included considered effects, and by performing software simulations. in this manner, we were able to estimate the influence of various effects and phenomena on the filter response and make conclusions regarding the most relevant ones. also, the advantage of this method of investigation is the fact that majority of settings can be made in software, without unnecessary fabrications, thus shortening filter design procedure. the experimental verification of the chosen models has confirmed simulated results, showing good mutual agreement, thus confirming the proposed method for investigation, as well.

we have considered a waveguide resonator using single csrr (fig. 13a) and a third-order filter, as a more complex structure using three multi-layer planar inserts with csrrs (fig. 16a). in both cases, substrate used for the inserts is copper-clad polytetrafluoroethylene (ptfe)/woven glass laminate (tlx-8), with the following nominal values of the substrate parameters and the tolerances: εr = 2.55 ± 0.04, tan δ = 0.0019 ± 0.001, h = 1.143 ± 0.05715 mm, t = 18 μm (http://www.taconic-add.com/). the conductivity of the metal plates was set to σ = 20 MS/m to include the losses (the surface roughness and the skin effect).

for the modeling of the waveguide structures, wipl-d software has been used (http://www.wipl-d.com/), to make precise models with various effects included and to perform full-wave simulations of metallic and dielectric structures [39]. for the printed-circuit inserts fabrication, a mits electronics fp21-tp machine (http://www.mitspcb.com/) has been used. according to the manufacturer’s specification, precision of the machine can be specified as follows: a minimum achievable microstrip line width is 50 μm and a minimum gap between microstrip lines is 50 μm.
csrrs have been made using milling process. all filter response measurements have been performed on the agilent n5227a network analyzer.

in order to be able to investigate the influence of the considered effects and phenomena, we have analyzed the filter response deviation. in fact, this deviation could be qualified as a difference between the nominal value of the observed parameter of the amplitude response (center frequency, bandwidth, insertion loss) and the value obtained when some of the fabrication side effects are taken into account. furthermore, the deviation could be quantified by a relative change of the parameters of the amplitude response [31],

xrel [%] = 100(x – xref)/xref, (4)

where xrel is the relative change in percent, x represents the obtained value and xref is the reference (nominal) value, without introducing any inaccuracy. accordingly, an absolute change could be calculated as xabs = x – xref.

we have adopted a set of criteria to evaluate performance degradation. therefore, we have assumed that the filter response is not significantly degraded if the following conditions are met:
1) the relative change of the center frequency (f0rel) is less than 1 %,
2) the relative change of the bandwidth (b3dbrel) is less than 2 %,
3) the absolute change of the passband attenuation (s21abs(f0)) is less than 0.3 db.

the filter response degradation was analyzed and evaluated using simulation results of the 3d em models and measurement results on the laboratory prototype.

4.1. influence of the design parameters

in order to investigate the influence of the implementation technology, the substrate parameters have been varied according to the manufacturer’s specification provided earlier in this section. the same procedure has been carried out for the waveguide resonator and the third-order filter.
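the relative change (4) and the three acceptance criteria listed earlier in this section can be sketched as follows. the function names and the sample numbers are hypothetical, and comparing the magnitudes of the changes against the thresholds is our assumption, since the text does not say how negative deviations are treated.

```python
def relative_change_percent(x, x_ref):
    """Relative change in percent, eq. (4): 100 (x - x_ref) / x_ref."""
    return 100.0 * (x - x_ref) / x_ref

def response_is_acceptable(f0, f0_ref, bw, bw_ref, s21_db, s21_ref_db):
    """Apply the three criteria; magnitudes are compared (assumption of this sketch)."""
    f0_rel = relative_change_percent(f0, f0_ref)
    bw_rel = relative_change_percent(bw, bw_ref)
    s21_abs = s21_db - s21_ref_db  # absolute change of passband attenuation, dB
    ok = abs(f0_rel) < 1.0 and abs(bw_rel) < 2.0 and abs(s21_abs) < 0.3
    return ok, {"f0_rel_%": f0_rel, "b3db_rel_%": bw_rel, "s21_abs_dB": s21_abs}

# Hypothetical example: nominal filter vs. a perturbed model (GHz, MHz, dB).
ok, report = response_is_acceptable(11.055, 11.0, 305.0, 300.0, -1.1, -1.0)
print(ok, report)
```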
in the latter case, it has been assumed that each printed-circuit insert was made using the same substrate board, thus the same type of imperfection was applied to all inserts.

the substrate parameters εr, tan δ and h have been varied discretely, within the provided boundaries, and the frequency response parameters (f0rel, b3dbrel, s21abs(f0)) have been observed. a complete set of the obtained numerical results can be found in [31]. while the change of tan δ and h practically had no influence, the most significant degradation of the amplitude response has been introduced by varying εr (f0rel was nearly 0.5 %, b3dbrel was below 2 % and s21abs(f0) was significantly lower than 0.3 db, related to the reference values), for both the waveguide resonator and the filter. since the given criteria have been met, one can conclude that the variation of the substrate parameters within the tolerances provided by the manufacturer does not introduce significant degradation of the amplitude response. fig. 21 shows comparison of amplitude responses for various values of εr.

fig. 21 comparison of amplitude responses for various values of εr (εr = 2.51, 2.55 and 2.59): a) waveguide resonator, b) waveguide filter

since the relative dielectric permittivity had the most significant influence on the amplitude response, the next step in our study was to find analytical expression of the resonant frequency (f0) in terms of εr. therefore, we have analyzed the amplitude response for various values of εr in case only one printed-circuit insert was placed in the waveguide (the first/third insert or the second insert of the filter) and in case of the third-order filter.
the obtained results have shown that there is a linear dependency between f0 and εr, in the following form [31]:

f0 = k εr + m, (5)

where k = −1.43 and m varies (with f0 in ghz, the fitted lines in fig. 22 have m = 14.590, 14.650 and 14.694). this expression represents the best linear fit to each set of the obtained results (fig. 22). in practice, for the desired resonant frequency, one should perform a measurement using single insert, and based on that and the given family of curves, the exact permittivity can be determined and used for the filter design.

fig. 22 design curve: resonant frequency as a linear function of permittivity (simulated and fitted curves for the filter, the 1st/3rd resonator and the 2nd resonator)

4.2. inaccuracy of the machine used for fabrication

the machine used for fabrication of the printed-circuit inserts may also introduce some inaccuracy, thus the obtained amplitude response may be degraded to some extent. we have considered a few possible issues related to the machine tolerance.

as previously mentioned, the milling process was used to remove the metallization. therefore, it was possible to obtain traces, i.e. csrrs, with larger or smaller dimensions than those given in the design specification. the details of the analysis and the obtained simulation results are given in [31]. it has been shown that the amplitude response does not get degraded in case the deviation of the trace width is within the limits of ± 5 μm.

the next considered issue is also a consequence of using the milling process. namely, while removing the metallization, the tool may dig into the substrate to a certain depth [40]. in our study, a trace of cylindrical tool was used and the 3d em model of such insert was successfully made in wipl-d software [31]. fig.
23 shows compared amplitude responses for various values of the digging depth d, for the waveguide resonator and the third-order filter. as can be seen, by increasing the depth, the center frequency increases, as well, and the bandwidth gets wider, for both the waveguide resonator and the filter. for the waveguide resonator, there is a good agreement of the simulated and measured results for d = 50 μm (fig. 23a), thus confirming the proposed method for modeling the influence of this type of inaccuracy in the software. in addition, the following conclusions can be made: 1) for a single insert, the digging depth of 10 μm can be declared as critical; 2) for the filter using three inserts with the same digging depth, critical value is even lower than 10 μm (which is around 50 % of the metallization thickness).

finally, we have considered the possibility to fabricate inserts with dimensions not exactly the same as those of the waveguide cross-section. in our example, the insert was narrowed by the same amount on both sides. the detailed analysis and the simulated and measured results can be found in [31]. it has been confirmed that this effect practically does not have influence on the amplitude response (for both waveguide resonator and filter), despite the fact that the insert was not physically short-circuited to each waveguide wall. precisely, in case the inserts were equally narrowed, by the same amount, on both sides, this amount should be kept below 500 μm (i.e., 1000 μm in total), so the filter response does not get degraded.
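returning to the design curve of section 4.1, the linear relation between f0 and εr can be inverted to estimate the permittivity from a measured resonant frequency. the sketch below takes the slope of −1.43 ghz and the three intercepts as read from the fitted lines of fig. 22; which intercept belongs to which structure is not asserted here, and the function names are illustrative.

```python
K_GHZ = -1.43                          # slope of the fitted lines, GHz per unit permittivity
M_GHZ = (14.590, 14.650, 14.694)       # intercepts of the three fitted lines, GHz

def resonant_frequency_ghz(eps_r, m_ghz, k_ghz=K_GHZ):
    """Design curve f0 = k * eps_r + m (f0 in GHz)."""
    return k_ghz * eps_r + m_ghz

def permittivity_from_f0(f0_ghz, m_ghz, k_ghz=K_GHZ):
    """Invert the design curve: permittivity that yields the measured f0."""
    return (f0_ghz - m_ghz) / k_ghz

# Shift of f0 across the manufacturer's tolerance eps_r = 2.55 +/- 0.04.
for m in M_GHZ:
    nominal = resonant_frequency_ghz(2.55, m)
    low, high = resonant_frequency_ghz(2.59, m), resonant_frequency_ghz(2.51, m)
    print(f"m = {m:.3f}: f0 = {nominal:.4f} GHz, range {low:.4f} .. {high:.4f} GHz")
```

the slope implies a shift of about 57 mhz for a permittivity change of 0.04, i.e. roughly 0.5 % at 11 ghz, in line with the relative change of the center frequency reported in section 4.1.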
fig. 23 comparison of amplitude responses for various values of digging depth d: a) waveguide resonator (including measurement results for d = 50 μm), b) waveguide filter

4.3. precise positioning of inserts

the inaccuracy in positioning of printed-circuit inserts might introduce filter response degradation. therefore, we have considered two possible issues, inclined and rotated inserts, for both waveguide resonator and filter. the detailed analysis has been carried out for a single insert, and those results have been further taken into account when considering positioning of inserts for the third-order filter.

fig. 24a shows an inclined insert in the waveguide and two possible situations from the practice were considered. in case the dimensions of the fabricated insert perfectly match the dimensions of the waveguide cross-section (b1 = b), the following equation can be used to calculate the inclination angle [31]:

$\cos(\alpha) = \dfrac{b - x}{b}$, $x = \dfrac{2 b w^{2}}{b^{2} + w^{2}}$. (6)

it has been shown that the critical angle which still allows the insert to remain more or less stable, i.e. to have contact with the top and bottom waveguide walls, is α ≈ 13º. the other possible situation is to have the insert fabricated to be shorter than needed (b1 ≠ b). the inclination angle that still provides stable insert, for known value of b1, can be calculated using following equation [31]:

$\cos(\alpha) = \dfrac{b - x}{b_1}$, $x = \dfrac{b w^{2} + w b_1 \sqrt{w^{2} + b_1^{2} - b^{2}}}{b_1^{2} + w^{2}}$.
(7)

in case of shorter insert, the critical inclination angle may have lower values (e.g. α = 4º), compared to the case with b1 = b.

furthermore, fig. 24b shows a rotated insert in the waveguide and the minimum rotation angle can be found using following equation [31]:

$\cos(\theta) = \dfrac{a/2 - x}{a/2}$, $x = \dfrac{a w^{2}}{a^{2} + w^{2}}$. (8)

the minimum rotation angle for the insert with dimensions perfectly matching the waveguide cross-section is θ ≈ 6º. the maximum rotation angle (in positive or negative direction) which does not introduce response degradation is θ = 15º. it has been shown that in this case the insert has physical contact with the waveguide walls over its top and bottom sides, so it should remain stable although it is not perfectly short-circuited to the side walls [31].

fig. 24 printed-circuit insert in the waveguide: a) inclined by α (side view), b) rotated by θ (top view)

in case a single insert is inclined by α = 13.038º or rotated by θ = 15º, it has been shown that there is no significant influence on the amplitude response of the waveguide resonator [31].

the next step was to investigate the influence of the inaccurately positioned inserts on the third-order filter response. in this case, the function of the inverters may be disrupted, since their lengths may be inadequate. therefore, we have thoroughly investigated the filter response in case one or multiple inserts were rotated or inclined.

we have considered the filter with the central insert rotated by θ = 15º (fig. 25a) and it has been shown that this type of inaccuracy does not introduce significant amplitude response degradation, particularly in the passband (fig. 25b). fabricated filter is shown in fig. 25c. the detailed explanation regarding filter fabrication along with the structures designed to hold the inserts can be found in [30, 31].
a comparison of the simulated and measured amplitude responses shows their good agreement, as can be seen in fig. 25d.

finally, the amplitude response has been analyzed when two or three inserts were inclined or rotated, since these are also possible situations in practice. it has been shown that cases with all three inclined or rotated inserts exhibit the most significant response degradation, so these models were considered in detail in [31], and herein the most important observations will be pointed out.

in case of three inclined inserts, fig. 26a shows models with the most noticeable performance degradation. namely, model 1 results in the most significant response deviation, even for small inclination angles. however, model 2 is the most probable one in practice: in case the fixtures holding the inserts, attached to the top and bottom waveguide walls, are mutually shifted, all three inserts are inclined by the same angle, in the same direction. for the model 2 with perfectly fabricated inserts and inclination angle α ≈ 13º, b3dbrel is around 5 %, compared with the reference bandwidth of the original filter. for the same model with slightly shorter inserts (b1 ≈ 10.1 mm) and inclination angle α = 8º for all three inserts, practically there is no response degradation, i.e. the parameters of the amplitude characteristic met the criteria provided earlier in this section.

in case of three rotated inserts, model 1 in fig. 26b exhibited the most significant response deviation. it has been found that the maximum rotation angle, still providing acceptable amplitude response in terms of required criteria for an arbitrary position of the inserts, was θ = 8º.

finally, in case the inserts were simultaneously inclined and rotated, the aforementioned criteria would be met for the inclination angle α ≤ 5º and the rotation angle θ ≤ 7º.
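the critical angles quoted in this section can be checked numerically. the sketch below is our reading of the geometry, not the authors' code: a and b are the standard wr-90 inner dimensions, and the insert thickness w = 1.161 mm (substrate plus one metallization layer, h + t) is an assumption chosen because it reproduces the quoted 13.038º. the helper `vertical_extent_mm` expresses the contact condition b1·cos α + w·sin α = b, which is likewise our interpretation.

```python
import math

A_MM, B_MM = 22.86, 10.16   # WR-90 inner width and height, mm
W_MM = 1.143 + 0.018        # assumed insert thickness h + t, mm

def critical_inclination_deg(b=B_MM, w=W_MM):
    """Inclination angle for an insert whose height equals the waveguide height (b1 = b)."""
    x = 2.0 * b * w ** 2 / (b ** 2 + w ** 2)
    return math.degrees(math.acos((b - x) / b))

def minimum_rotation_deg(a=A_MM, w=W_MM):
    """Rotation angle for an insert whose width equals the waveguide width."""
    x = a * w ** 2 / (a ** 2 + w ** 2)
    return math.degrees(math.acos((a / 2.0 - x) / (a / 2.0)))

def vertical_extent_mm(alpha_deg, b1, w=W_MM):
    """Height occupied by an insert of height b1 and thickness w inclined by alpha."""
    alpha = math.radians(alpha_deg)
    return b1 * math.cos(alpha) + w * math.sin(alpha)

print(f"critical inclination: {critical_inclination_deg():.3f} deg")
print(f"minimum rotation:     {minimum_rotation_deg():.2f} deg")
print(f"extent of a 10.1 mm insert at 8 deg: {vertical_extent_mm(8.0, 10.1):.3f} mm")
```

under these assumptions a 10.1 mm insert inclined by 8º spans very nearly the full waveguide height of 10.16 mm, which is consistent with the shorter-insert case discussed above.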
fig. 25 filter with central insert rotated by θ = 15º: a) top view and wipl-d model, b) comparison of amplitude responses for the original model and filter with rotated insert, c) fabricated filter, d) comparison of simulated and measured results for the filter with rotated insert (plot axes: s11 [db] and s21 [db] versus f [ghz]; curves: original, rotated)

fig. 26 a) inclined inserts, b) rotated inserts

5. conclusions

in this paper, various solutions for the bandstop and bandpass waveguide filter design have been presented. the goal was to exemplify the method for relatively simple waveguide filter design procedure, using printed-circuit discontinuities and different types of resonators, easy to design and implement. first, bandstop filters were designed using printed-circuit inserts within the rectangular waveguide. inserts with srrs were placed in the h-plane, while the insert with qwrs was positioned in the e-plane of the standard wr-90 waveguide. designed filters using these inserts have been thoroughly analyzed and the results have been presented. both types of the considered filters allow independent control of the designed stopbands and are compact in size. as for the e-plane filters, miniaturized icds multi-band bandstop waveguide filter design using qwrs has been discussed. as a proof of concept, e-plane icds dual-band and triple-band bandstop waveguide filters have been designed. center frequencies can be flexibly adjusted by the length of the corresponding qwrs. as for the icds dual-band bandstop filter, connection of the qwrs for different stopbands to the opposite waveguide walls has resulted in about 41 % of the size reduction, compared to the case where they are connected to the same waveguide wall. miniaturized icds dual-band bandstop filter has been fabricated and measured.
the filter is 0.512 λg in length. further miniaturization of the dual-band bandstop filter has been achieved when qwrs of different size were printed one below another. in this arrangement, the unwanted mutual coupling has been particularly strong and restricted the control of the center frequencies and bandwidth. the impact of the physical dimensions alteration on the filter response has been thoroughly investigated and exposed. obtained ultra-compact e-plane dual-band bandstop waveguide filter has length of 0.295 λg, which is about 66 % and 42 % shorter compared to the non-miniaturized and miniaturized icds dual-band bandstop filter, respectively. additionally, equivalent microwave circuit of the multi-band bandstop filter with independently tunable stopbands is presented in the form of a cascade of the equivalent microwave networks of the single-band bandstop filters. equivalent circuit corresponds to the decomposed 3d filter structure, and it is suitable for faster filter design and optimization, as well. for the design of the h-plane filters, inserts with printed srrs have been used. the third-order bandstop filter has been designed using srrs distanced by the quarter-wave waveguide sections acting as immittance inverters for the center frequency. accordingly, dual-band bandstop filter has been implemented with srrs separated by the inverters for the specified center frequencies. the filter is 0.5 λg in length, which is attributed to the length of the quarter-wave waveguide section used as inverter for lower stopband design. regarding bandpass waveguide filters, various types of resonating inserts, having bandpass characteristic, have been introduced. they have been used for the higher-order h-plane bandpass filters with a single or multiple pass bands. a novel solution for dualband filter using folded inserts has been presented, in order to properly implement the inverters, i.e. the quarter-wave waveguide sections, for each center frequency. 
the inserts may be implemented either as multi-layer planar inserts or metal inserts, as a simpler solution. dual-band filter with folded metal inserts has been further modified to obtain compact solution with flat inserts and miniaturized inverters, optimized for fabrication. it has been also demonstrated that csrrs do not necessarily need to be centrally positioned on the inserts, but they may be attached to the top and bottom waveguide walls. finally, the side effects of bandpass waveguide filter fabrication have been investigated in detail. the amplitude responses of the waveguide resonator and the third order filter have been analyzed in terms of the implementation technology, the tolerance of the machine used for fabrication and positioning of the inserts inside the waveguide. the obtained results are relevant for identifying critical parameters affecting the performance of the considered structures. various effects and phenomena have been modeled in software and for the chosen examples the results were also experimentally verified, showing good agreement with the simulation results. the obtained results can be summarized as follows: 1) regarding substrate parameters, the dielectric permittivity of the printed-circuit insert had the major impact on the amplitude response (a closed-form expression based on a linear dependency between the permittivity and center frequency was proposed as a design curve); 2) in terms of machine tolerance, the digging depth into the substrate during the milling process introduced the most significant response degradation; 3) the inaccuracy in positioning of the inserts in the waveguide did not introduce deviation of the filter response in the passband, for the critical angles which were determined, for both the waveguide resonator and the filter with three arbitrarily inclined or rotated inserts.
the findings of our study may be applicable for the other types of waveguide filters using similar resonating inserts and also for the filters operating in different frequency bands, since the presented results pointed out the most significant phenomena and side effects of the fabrication process. the advantage of the proposed method is the possibility for improving and shortening the design procedure, by performing majority of setting and analyses in the software, thus avoiding unnecessary fabrications. acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under grant tr32005. references [1] m. d. lutovac, d. v. tošić, b. v. evans, filter design for signal processing using matlab and mathematica, upper saddle river, nj: prentice hall; translated in chinese. beijing, p. r. china: publishing house of electronics industry, phei; 2004. [2] d. m. pozar, microwave engineering. new york: john wiley & sons, 2012. [3] s. stefanovski pajović, m. potrebić, d. v. tošić, "advanced filtering waveguide components for microwave systems", microwave systems and applications, dr. sotirios goudos (ed.), intech, january 2017. [4] r. j. cameron, c. m. kudsia, and r. r. mansour, microwave filters for communication systems: fundamentals, design and applications, new jersey: john wiley & sons, 2007. [5] b. milovanović, j. joković, t. dimitrijević, "analysis of feed waveguide length influence on em field in microwave applicator using tlm method", facta universitatis, series: electronics and energetics, vol. 21, no.1, pp. 65-72, april 2008. [6] s. lj. stefanovski, "microwave waveguide filters using printed-circuit discontinuities", ph.d. dissertation, school of electrical engineering, university of belgrade, belgrade, serbia, 2015. [7] s. c. dutta roy, "a new lumped element bridged-t absorptive band-stop filter", facta universitatis, series: electronics and energetics, vol. 30, no. 2, pp. 179-185, june 2017. 
[8] s. prikolotin, a. kirilenko, "a novel notch waveguide filter", microw. opt. techn. let., vol. 52, pp. 416-420, february 2010. [9] s. fallahzadeh, h. bahrami, m. tayarani, "a novel dual-band bandstop waveguide filter using split ring resonators", prog. electromagn. res. lett., vol. 12, pp. 133-139, 2009. [10] s. fallahzadeh, h. bahrami, m. tayarani, "very compact bandstop waveguide filters using split ring resonators and perturbed quarter-wave transformers", electromagnetics, vol. 30, no. 5, pp. 482-490, june 2010. [11] s. stefanovski, m. potrebić, d. tošić, "novel realization of bandstop waveguide filters", technics, special edition, pp. 69-76, 2013. [12] s. stefanovski, m. potrebić, d. tošić, "a novel design of dual-band bandstop waveguide filter using split ring resonators", j. optoelectron. adv. mat., vol. 16, no. 3-4, pp. 486-493, march-april 2014. [13] s. stefanovski, m. potrebić, d. tošić, z. cvetković, "design and analysis of bandstop waveguide filters using split ring resonators", in proceedings of the 11th international conference on applied electromagnetics (pes 2013), niš, serbia, 2013, pp. 135-136. [14] m. mrvić, m. potrebić, d. tošić, z. cvetković, "e-plane microwave resonator for realisation of waveguide filters", in proceedings of xii international saum conference on systems, automatic control and measurements (saum 2014), niš, serbia, 2014, pp. 205–208. [15] s. stefanovski pajović, m. potrebić, d. tošić, z. stamenković, "e-plane waveguide bandstop filter with double-sided printed-circuit insert", facta universitatis, series: electronics and energetics, vol. 30, no. 2, pp. 223-234, june 2017. [16] h. sun, c. feng, y. huang, r. wen, j. li, w. chen, g. wen, "dual-band notch filter based on twist split ring resonators", int. j. antennas propag., vol. 2014, article id 541264, 6 pages, april 2014. [17] p. castro, j. barosso, j. leite neto, a. tomaz, u.
hasar, "experimental study of transmission and reflection characteristics of a gradient array of metamaterial split-ring resonators", j. microw., optoelectron. electromagn. appl., vol. 15, no. 4, pp. 380-389, october/december 2016. [18] s. stefanovski, m. potrebić, d. tošić, "a novel design of e-plane bandstop waveguide filter using quarter-wave resonators", optoelectron. adv. mat., vol. 9, no. 1-2, pp. 87-93, january-february 2015. [19] m. mrvić, m. potrebić, d. tošić, "compact e-plane waveguide filter with multiple stopbands", radio sci., vol. 51, no. 12, pp. 1895-1904. [20] m. mrvić, m. potrebić, d. tošić, z. cvetković, "miniaturization of waveguide bandstop filter", in proceedings of the 12th international conference on applied electromagnetics (pes 2015), niš, serbia, 2015, pp. 79–80. [21] n. ortiz, j. d. baena, m. beruete, f. falcone, m. a. g. laso, t. lopetegi, r. marques, f. martin, j. garcia-garcia, m. sorolla, "complementary split-ring resonator for compact waveguide filter design", microw. opt. techn. let., vol. 46, no. 1, pp. 88-92, july 2005. [22] m. m. potrebić, d. v. tošić, z. ž. cvetković, n. radosavljević, "wipl-d modeling and results for waveguide filters with printed-circuit inserts", in proceedings of the 28th international conference on microelectronics (miel 2012), niš, serbia, 2012, pp. 309-312. [23] h. bahrami, m. hakkak, a. pirhadi, "analysis and design of highly compact bandpass waveguide filter utilizing complementary split ring resonators (csrr)", prog. electromagn. res., vol. 80, pp. 107-122, 2008. [24] s. stefanovski, m. potrebić, d. tošić, "design and analysis of bandpass waveguide filters using novel complementary split ring resonators", in proceedings of the 11th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2013), niš, serbia, 2013, pp. 257-260. [25] s. stefanovski pajović, m. potrebić, d.
tošić, "microwave bandpass and bandstop waveguide filters using printed-circuit discontinuities", in proceedings of the 23rd telecommunications forum (telfor 2015), belgrade, serbia, 2015, pp. 520-527. [26] s. stefanovski, đ. mirković, m. potrebić, d. tošić, "novel design of h-plane bandpass waveguide filters using complementary split ring resonators", in proceedings of progress in electromagnetics research symposium (piers 2014), guangzhou, china, 2014, pp. 1963-1968. [27] s. stefanovski, m. potrebić, d. tošić, z. stamenković, "a novel compact dual-band bandpass waveguide filter", in proceedings of ieee 18th international symposium on design and diagnostics of electronic circuits and systems (ddecs 2015), belgrade, serbia, 2015, pp. 51-56. [28] s. stefanovski, m. potrebić, d. tošić, z. stamenković, "compact dual-band bandpass waveguide filter with h-plane inserts", j. circuit syst. comp., vol. 25, no. 3, 1640015 (18 pages), 2016. [29] j.-s. hong, microstrip filters for rf/microwave applications, nj: john wiley & sons, 2011. [30] s. stefanovski, m. potrebić, d. tošić, "structure for precise positioning of inserts in waveguide filters", in proceedings of the 21st telecommunications forum (telfor 2013), belgrade, serbia, 2013, pp. 689-692. [31] s. lj. stefanovski pajović, m. m. potrebić, d. v. tošić, z. ž. cvetković, "fabrication parameters affecting implementation of waveguide bandpass filter with complementary split-ring resonators", j. comput. electron., vol. 15, no. 4, pp. 1462-1472, 2016. [32] s. c. gao, l. w. li, t. s. yeo, m. s. leong, "a dual-frequency compact microstrip patch antenna", radio sci., vol. 36, no. 6, pp. 1669–1682, november-december 2011. [33] m. albooyeh, a. a. lotfi neyestanak, b. mirzapour, "wideband dual posts waveguide band pass filter", int. j. microw. opt. techn., vol. 2, no. 3, pp. 203-209, 2007. [34] n. s. choi, d. h. kim, g. jeung, j. g. park, j. k.
byun, "design optimization of waveguide filters using continuum design sensitivity analysis", ieee t. magn., vol. 46, no. 8, pp. 2771-2774, 2010. [35] r. l. villaroya, "e-plane parallel coupled resonators for waveguide bandpass filter applications", ph.d. dissertation, heriot-watt university, edinburgh, 2012. [36] [online] http://www.ros.hw.ac.uk/bitstream/handle/10399/2604/lopez-villarroyar_1012_eps.pdf?sequence=1&isallowed=y [37] p. soto, d. de llanos, v. e. boria, e. tarin, b. gimeno, a. onoro, l. hidalgo, m. j. padilla, "performance analysis and comparison of symmetrical and asymmetrical configurations of evanescent mode ridge waveguide filters", radio sci., vol. 44, no. 6, rs6010, december 2009. [38] j. bornemann, j. uher, "design of waveguide filters without tuning elements for production-efficient fabrication by milling", in proceedings of asia-pacific microwave conference (apmc), pp. 759-762, taipei, taiwan, 2001. [39] c. zhao, t. kaufmann, y. zhu, c. c. lim, "efficient approaches to eliminate influence caused by micromachining in fabricating h-plane iris band-pass filters", in proceedings of asia-pacific microwave conference (apmc), pp. 1306-1308, sendai, japan, 2014. [40] b. m. kolundžija and a. r. djordjević, electromagnetic modeling of composite metallic and dielectric structures, 1st ed. norwood, ma: artech house, 2002. [41] a. r. djordjević, d. i. olćan, a. g. zajić, "modeling and design of milled microwave printed circuit boards", microw. opt. technol. let., vol. 53, no. 2, pp. 264–270, 2011.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 405-420 https://doi.org/10.2298/fuee2203405r © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

comb jamming as a strategy for rcied activation prevention

jovan radivojević, mladen mileusnić, aleksandar lebl, verica marinković-nedelicki

iritel a.d., belgrade, batajnički put 23, serbia

abstract.
the main objective of this paper is the analysis of comb jamming as a technique for rcied activation prevention. a comprehensive survey of various jamming techniques in the introduction is followed by a presentation of three strategies for comb signal generation. the paper makes two original contributions. the first one is a quantitative comparison of the emission power of the three signal generation techniques in relation to barrage jamming, under the condition of equal ber value. the second contribution is the determination of the exact ber value as a function of emission power in the case of barrage jamming. until now we have made different analyses and comparisons starting from estimated emission power. the analysis procedure is performed for a qpsk modulated rcied activation signal. power saving is evident for all three methods of jamming signal generation. it is proved that an additional 2.5 db of power saving is achieved by equalization of the frequency components level in the comb signal. the analysis in this paper shows that comb jamming allows the same effects as barrage jamming, but with lower emission power.

key words: remote controlled improvised explosive devices jamming, comb jamming, emission power, qpsk modulation, bit error rate

1. introduction

procedures for the fight against remote controlled improvised explosive devices (rcied) are becoming more and more important. this method of activation allows a significant degree of comfort for an attacker to realize his intentions from a safe distance, where his activities are difficult to detect. besides, there are a few other reasons why remote control is very attractive to the attacker for explosive devices activation: more effective and precise bombing, absence of wires gives autonomy to the attacker and the possibility that an attacker is arrested or killed is decreased [1]. different wireless communication techniques are available to the attacker.
these techniques are not implemented only in highly specialized, hardly available equipment, but may be found in low-cost, commercial devices, such as long range cordless telephones, cell phones, satellite phones, radio controlled toys, car alarms, keyless automobile door openers, wireless doorbell buzzers, and so on [1]. it is this variety of attacking techniques that set high requirements in the development of the jammer of rcied activation. it is necessary to implement a wide variety of jamming strategies and generate a significant number of jamming signal types, and to change signal parameters within wide limits for each signal type. not only are various signal types necessary, but it is also important to develop new jamming technology, or signal type in a very short time interval, measured in weeks, not in months or years. that is why it is important to have well organized development and production of rcied jamming equipment, as the one presented in [2]. very important element in the organization of such development and production is consolidating data about performed rcied attacks in a database. event logs, implemented at the systems from one of the suppliers, presented in [3], may be implemented for such a purpose.

received february 12, 2022; revised march 25, 2022; accepted april 7, 2022
corresponding author: aleksandar lebl, iritel a.d., 11080 belgrade, batajnički put 23, serbia, e-mail: lebl@iritel.com

after this introduction, a survey of applied rcied jamming systems is given in the section 2. section 3 of the paper presents the three most important techniques for comb signal generation. section 4 deals with the characteristics of frequency spectrum of these techniques. the exact bit error rate (ber) characteristics of sweep and barrage jamming are compared in the section 5. the procedure to define parameters of comb jamming is described in the section 6.
the emission power relation between comb and barrage jamming is investigated in the section 7. paper conclusions are in the section 8.

2. a survey of applied rcied jamming systems

frequencies implemented in commercial devices used for rcied activation are, a priori, known and these frequencies should be dominantly jammed to achieve successful jamming. a survey of commercial devices frequencies, usually used for rcied activation, may be found in [4]. these frequencies include those implemented for mobile communication systems (gsm, umts), dect telephones, remote control toys, wireless doorbells and gate drivers, car alarms, and so on. a survey of frequencies shows the part of wireless device spectra which may be adapted for rcied activation. the applied signal power in these devices is variable in the range from several tens of milliwatts to several watts [1]. a very detailed presentation of jamming techniques with mathematical analysis may be found in [5]. the main analyzed or just explained jamming techniques in [5] are noise jamming (separately broadband, partial-band, narrowband depending on the number of jammed channels), tone jamming (single tone and multiple tones), sweep jamming and pulse jamming (in fact comb jamming according to this paper). contribution [6] emphasizes two specific jamming techniques: following (or follower) jamming and smart jamming. following jamming is applied against frequency hopping: here a jammer follows carrier frequency changes on the transmitted signal and then performs jamming on each hopped frequency. the jamming probability when follower jamming is applied is calculated in [7]. it is proved in [7] that channels scanning speed increases linearly as the function of the hopping rate for the lower values of jamming probability, but this dependence is hyperbolical for the higher jamming probability values.
in smart jamming the knowledge of transmission protocol is the key issue, because jamming is based on the attack towards the places of protocol vulnerabilities, such as error correction checksum, acknowledgement messages, transmitting overloading (false messages), and so on. the special threat for successful jamming in the group of smart jamming strategies is the case when timing channels normally intended for regular function of the protected device are maliciously used as covert channels to send activation signal [8]. contributions [9] and [10] present an idea that there is a specific, optimum technique for jamming each kind of modulations. in these contributions jamming of digital amplitude-phase modulated signals is analyzed and it is proved that the same kind of jamming signal modulation as the activation signal modulation is not always the optimum choice. such an analysis is important only in the case that we a priori know the type of implemented modulation in activation message coding, but this is very rarely fulfilled. reactive (responsive) jamming technique is lately more and more implemented [4], [11]-[16]. this technique may be treated, in fact, as a kind of smart jamming because jamming is based on successful detection of frequency band implemented for rcied activation signal transmission. in the existing solutions fast fourier transform (fft) is usually implemented, as a fast and reliable detection algorithm [4], [11]. in [17] it is proved that rcied activation signal detection on the basis of fft analysis may be faster and in this way more reliable than frequency sweep in active jammer. in [18] this analysis is further expanded to other reactive detector types. a survey of problems, arising in the realization of reactive jammers, is presented in [11]. among them, the greatest attention in [11] is devoted to time synchronization in the case of simultaneous function of multiple jammers.
in [12], [13] the characteristics of some other detector types (energy detector, matched filter detector, feature detector and detector based on the calculation of eigenvalues of the covariance matrix) are theoretically compared one to the other. contribution [14] is devoted to activation signals jamming in one specific network (ieee 802.15.4), where message packet duration is very short (only about 350 μs), thus causing necessity for a very short detection time. in the case of universal jamming (not for specific activation signal type), the achieved detection time is less than 1 ms in [15], and even about 200 μs for the frequency range of 6 ghz in [16]. a survey of implemented techniques for remote activation of improvised explosive devices and the frequency band intended for each technique implementation may be found in [19]. besides these techniques, sms message sending is very attractive and in some world regions dominant technique of rcied activation, because of its realization simplicity [20], [21]. rcied activation signal sending by sms messages may be prevented or delayed by various detection algorithms implemented in base stations [21]. modern solutions of rcied activation signal jammers should follow development in communication procedures and techniques. one such direction which aims at reliable and hardly detectable communication is implementation of frequency hopping signals. today hopping speed in realized systems may be significantly higher than it is presented in [7]. responsive jammers realized on the basis of rcied activation signal detection in some cases have the possibility to follow frequency changes when frequency hopped signal is applied [22], [23]. according to the achieved detection rate, the solution [22] may block the signal with 300 hops/s while the solution [23] is effective even when the hop rate is 10000 hops/s. the systems [22], [23] are available now and may be purchased on the market.
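The hop rates and detection times quoted above can be related by simple arithmetic: a hop rate of R hops/s leaves a dwell time of 1/R on each frequency, and a responsive jammer is only effective for the part of the dwell that remains after it has detected the signal and reacted. The sketch below is illustrative only. The function names and the simple "remaining fraction of the dwell" model are assumptions introduced here, not taken from [22], [23], and the 200 μs figure is borrowed from the detection time reported in [16] purely as an example reaction time.

```python
def dwell_time_s(hops_per_second):
    """Time the transmitter stays on one frequency, in seconds."""
    return 1.0 / hops_per_second


def jammed_fraction(hops_per_second, reaction_time_s):
    """Fraction of each dwell left for jamming after the jammer has reacted.

    Assumed model: jamming starts reaction_time_s after the hop and lasts
    until the next hop; a reaction slower than the dwell gives 0.
    """
    dwell = dwell_time_s(hops_per_second)
    return max(0.0, 1.0 - reaction_time_s / dwell)


# Hop rates mentioned in the text, with an illustrative 200 us reaction time.
for rate in (300, 10000):
    print(rate, "hops/s:",
          "dwell =", dwell_time_s(rate), "s,",
          "jammed fraction =", jammed_fraction(rate, 200e-6))
```

Under this simplified model a 200 μs reaction leaves most of a 300 hops/s dwell available for jamming, while at 10000 hops/s the dwell (100 μs) is already shorter than the reaction time, which is why much faster detection is needed at high hop rates.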
our idea is to implement active jamming in a broad frequency range with not too high jamming power and thus to avoid the risk of, perhaps, unsuccessful rcied activation signal detection. one possible solution with these desired characteristics is comb jamming signal implementation according to the principles presented in this paper. there are two different approaches to jamming signal generation. the first one is to generate the desired shape of jamming signal at low/lower frequency band and then to shift it by the modulator to the necessary frequency band [24]. it is easier to model the signal at lower frequencies, but modulation is an additional complication in the practical implementation of the solution. the other possibility is to directly generate the signal in the jammed frequency band. our intention is to consider the first possibility since we want to cover the broad frequency range in one moment and the generated signal may be shifted by several modulators adjusted at different frequency bands at the same time. a completely new approach to jamming signal generation is presented in [25], [26]. there is no need to take care about the shape of jamming signal or even to have such a generator. the solutions belong to the group of responsive jammers. when this approach is applied, the detected signal which has to be jammed is first delayed by the implementation of optical lines with adjustable, precise delay and then transmitted as the generated jamming signal. the selected value of delay determines the level of rcied activation signal attenuation. instead of this approach, we apply a specific jamming signal generation again to avoid the possibility that rcied activation signal is not detected. the complexity of the fight against the rcied activation and development perspectives of remote control of these devices were already noticed in [27].
measurements of the bit error rate in the transmission were made there for several types of jamming procedures, thus presenting the possibilities for the fight against the then existing devices, but also against devices which would appear in the future. the obtained measurement results led to the development of practical devices for fight against rcied activation [28]-[30]. in these devices generation of very heterogeneous jamming signal types is applied: continuous wave (cw), amplitude shift keying (ask), phase shift keying (psk), frequency shift keying (fsk), comb signal (barrage jamming), sweep signal (with different sweep strategies, as, for example, single sweep, multiple sweep, sweep with frequency gap, where there is no sweep signal and where jamming device management may be realized, etc.), white gaussian noise (wgn), and so on. among all these techniques jamming by sweep signal and jamming by wgn are most often applied. the characteristics of sweep jamming are analyzed in detail in [31]-[33]. in [34] the power necessary to jam an mpsk (m-ary psk) modulated rcied activation message by sweep signal and by wgn is compared, but without considering simultaneous influence of sweep signal and noise which is normally present in the system environment (environmental noise). sweep signal and wgn are in some cases combined in one unique signal, as demonstrated in [35]. a method for wgn signal generation is analyzed in [36]. the results presented in these last six papers are based on iritel great experience in developing jamming devices of various applications: against rcied activation [37], for jamming mobile telephony systems [38] and for radio surveillance and jamming [39]. comb jamming is a special technique for generating a signal for rcied activation prevention, similar to, but more energy efficient than barrage jamming. iritel is one of the pioneers for such jamming implementation [40], [41].
regarding recent times, the main characteristics of comb jamming are presented in [42].

3. techniques for comb jamming realization

the main purpose of comb signal definition for jamming is to achieve similar implementation characteristics as if barrage jamming is applied, but with reduced emission power. comb signal consists of a number of discrete, usually equidistant components when considering frequency spectrum. in this way continual part of frequency spectrum is replaced by only one frequency component, but with the same jamming effect. there are three main methods for comb signal generation [43]: rectangular pulse train, filtered pulse train and pseudorandom sequence. the signal with the desired frequency characteristics (number of discrete frequency components, components distance in frequency domain) is usually first generated in a low frequency band. after that such a signal modulates a carrier in order to be shifted to the pre-defined frequency band.

fig. 1 principle block-scheme of rectangular pulse train jamming signal generation (blocks: rwg, lpf, mod, posc, tx; pulse parameters: A, τ, T)

figure 1 presents the principle block-scheme for generating the rectangular pulse train signal. the generation process is initiated in the rectangular waveform generator (rwg), where the pulses of duration τ and period T are formed. the amplitude of pulses is A. the frequency spectrum of the generated pulses is band-limited in the low-pass filter (lpf). the frequency characteristic of this lpf is flat in the pass-band, meaning that only the undesired frequency components are truncated. amplitudes of frequency components in the pass band are not changed and they remain as generated. such modified impulses have the frequency spectrum at low frequency band and this spectrum is shifted to the required higher frequency band in the modulator (mod).
here the generated pulse train signal is multiplied by the signal from the programmable oscillator (posc). it is possible to produce variable signal frequency band changing the frequency of posc, i.e. to additionally sweep the generated comb signal in the case that it is necessary to jam wider frequency band (one such application example for jamming mobile communication in gsm systems may be found in [44]). at the end the generated jamming signal is transmitted by a transmit antenna (tx). the generation of filtered pulse train signal is a slight modification of the previous method. its principle block-scheme is equal to the one presented in figure 1. difference is in the function of lpf. besides limiting the pass-band width, this filter also modifies the amplitudes of the generated comb frequency components with the aim to achieve approximately flat frequency characteristic in the pass band. in lpf the higher frequency components are more amplified (or, in other sense, less attenuated) than the lower frequency components. modifications are also noticeable in the pulse train signal shape in the time domain [43]. figure 2 presents the principle block-scheme for comb signal generation according to the third method based on pseudorandom sequence implementation. the initial signal is generated in the linear feedback shift register (lfsr). the period of a sequence is T and it consists of N pulses whose duration is τ (i.e. it is T = N·τ). the amplitude of each pulse is +A or –A. the remaining algorithm realization phases are the same as for the previous algorithms: the spectrum of the generated comb signal is filtered in lpf and transferred to higher frequencies after signals modulation (implementation of blocks mod and posc).

fig. 2 principle block-scheme of pseudorandom sequence based jamming signal generation (blocks: lfsr, lpf, mod, posc, tx; pulse parameters: ±A, τ, T)

4.
frequency spectrum characteristics of three methods for comb signal generation frequency spectrum of rectangular pulse train signals is well-studied and presented in many references [43]. this spectrum is discrete with equidistant components and may be expressed by the equation 2 2 2 2 2 sin ( ) ( ) ( ) ( )k k a ktp f f k tt t  =−    =   −         (1) where p(f) presents signal power spectral density, δ(f-k/t) is designation for places where discrete frequency components are situated and the remaining part in the equation presents frequency components power envelope. the meaning of variables a, τ and t is already illustrated in the fig. 1. frequency spectrum of the signal shaped as the rectangular pulse train is presented in the fig. 3. such signal is obtained implementing the comb signal generator from the fig. 1. spectral components envelope is the function in the form (sin(x)/x)2 and the number of frequency components in the main lobe is selected by the ratio k=t/τ. the function of the lpf is to pass certain number of components from the main lobe leaving them with unchanged amplitudes. in the example from the fig. 3 it is k=6 and the lpf passes total 2·i+1 frequency components where i=3 is the number of non-attenuated frequency components on both sides related to the central component. comb jamming as a strategy for rcied activation prevention 411 f p(f) k=0 k=1k=-1k=-2k=-3k=-4k=-5k=-6 k=2 k=3 k=4 k=5 k=6 k=t/τ=6 i=3 fig. 3 frequency spectrum of rectangular pulse train signal frequency spectrum of the signal shaped as the filtered rectangular pulse train is presented in fig. 4. its initial shape is equal to the one presented in fig. 3 with the addition that the lpf characteristic (the curve designated by lpf in figure 4) has to approximate reciprocal function of (sin(x)/x)2 in the filter pass band. in this way 2·i+1 transferred frequency components at the generator output have approximately the same level. 
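The envelope in (1) and the equalizing role of the LPF in the filtered pulse train method can be checked with a few lines of code. The following is a minimal sketch, assuming A=1 and T/τ=6 as in Fig. 3; the function names and the normalization of the filter gains (edge components k=±i kept at gain 1) are illustrative assumptions, not taken from the paper.

```python
import math

def line_power(k, amplitude, tau, period):
    """Power of the k-th spectral line of a rectangular pulse train, per (1)."""
    duty = tau / period
    if k == 0:
        return (amplitude * duty) ** 2          # limit of the (sin x / x)^2 term
    arg = math.pi * k * duty
    return (amplitude * duty) ** 2 * (math.sin(arg) / arg) ** 2

def equalizer_gains(i, amplitude, tau, period):
    """Hypothetical LPF power gains that flatten the 2*i+1 passed lines.

    Each gain is the reciprocal of the envelope, normalized so that the
    edge lines k = +/-i keep their level (gain 1); lines closer to k = 0
    are attenuated more.
    """
    edge = line_power(i, amplitude, tau, period)
    return {k: edge / line_power(k, amplitude, tau, period)
            for k in range(-i, i + 1)}

A, TAU, T, I = 1.0, 1.0, 6.0, 3                 # assumed values: T/tau = 6, i = 3
lines = {k: line_power(k, A, TAU, T) for k in range(-6, 7)}
gains = equalizer_gains(I, A, TAU, T)
flattened = {k: lines[k] * gains[k] for k in gains}

for k in range(-6, 7):
    print(f"k={k:+d}  P_k={lines[k]:.6f}")
for k in sorted(flattened):
    print(f"k={k:+d}  gain={gains[k]:.3f}  equalized P_k={flattened[k]:.6f}")
```

The lines at k=±6 fall on the first nulls of the envelope, and after multiplication by the gains all seven passed lines sit at the level of the edge line, which is the behaviour described for the filtered pulse train.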
[Fig. 4 Frequency spectrum of filtered rectangular pulse train signal (P(f) versus f; components k=−6…6, K=T/τ=6, i=3, with the LPF characteristic overlaid)]

Similarly to the case of the rectangular pulse train, the frequency spectrum of the pseudorandom sequence signal may be presented by the equation

$$P(f)=\sum_{k=-\infty}^{\infty}P_{k}\,\delta\!\left(f-\frac{k}{N\tau}\right) \qquad (2)$$

where the coefficients P_k, which model the frequency spectrum envelope, are

$$P_{k}=\frac{1}{N^{2}}\quad\text{for } k=0 \qquad (3)$$

$$P_{k}=\frac{N+1}{N^{2}}\,\frac{\sin^{2}(\pi k/N)}{(\pi k/N)^{2}}\quad\text{for } k\neq 0 \qquad (4)$$

The variables A, N and τ are already defined in Fig. 2 and in the explanation accompanying that figure.

[Fig. 5 Frequency spectrum of the pseudorandom sequence signal (P(f) versus f; components k=−6…6, N=T/τ=6, i=3)]

Fig. 5 presents the frequency spectrum of the generated pseudorandom sequence signal [43], [45]. Compared to the frequency spectrum of the rectangular pulse train (Fig. 3), the difference is in the component for k=0. This component has a very low level (it is nearly eliminated) compared to the other components in the main lobe, because typical values of N are greater than 10. As the frequency spectrum of the pseudorandom sequence signal is otherwise similar to the spectrum of the rectangular pulse train, the analysis in the remainder of the paper is performed only for the rectangular pulse train.

Now that the main characteristics of the comb signal in the time and frequency domains have been explained, the logical question is: what are the possibilities for generating this signal and implementing it in practice? If we want a wide main frequency lobe, the rectangular pulse duration τ has to be very short. In hardware it is difficult to generate such a pulse with a significant amplitude level. On the other hand, if we adopt a longer τ, there are fewer frequency components in the main lobe and more additional hardware processing is needed to expand the frequency spectrum.
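Returning to the pseudorandom sequence spectrum of (2)–(4), the suppressed k=0 component can be reproduced with a small shift register. The sketch below is only an illustration under stated assumptions: a 4-stage LFSR with the recurrence a[n] = a[n−1] XOR a[n−4] (period N=15), A=1, and a chip-rate DFT, so the (sin(x)/x)² factor of (4), which comes from the rectangular chip shape, is not included. The register length and feedback taps are chosen for the example and are not taken from the paper.

```python
import cmath

def lfsr_sequence(length=15):
    """One period of a 4-stage LFSR with recurrence a[n] = a[n-1] XOR a[n-4]."""
    bits = [1, 0, 0, 0]                          # assumed non-zero initial state
    while len(bits) < length:
        bits.append(bits[-1] ^ bits[-4])
    return bits

def line_powers(chips):
    """Normalized DFT line powers |X_k|^2 / N^2 of a +/-A chip sequence."""
    n = len(chips)
    powers = []
    for k in range(n):
        x_k = sum(c * cmath.exp(-2j * cmath.pi * k * m / n)
                  for m, c in enumerate(chips))
        powers.append(abs(x_k) ** 2 / n ** 2)
    return powers

A = 1.0
bits = lfsr_sequence()
chips = [A if b else -A for b in bits]           # map 1 -> +A, 0 -> -A
powers = line_powers(chips)
N = len(chips)

print("sequence:", bits)
print("k = 0 line:", powers[0], " 1/N^2     =", 1 / N ** 2)
print("k = 1 line:", powers[1], " (N+1)/N^2 =", (N + 1) / N ** 2)
```

For this sequence the k=0 line is N+1=16 times weaker than every other line before pulse shaping, which agrees with the prefactors 1/N² in (3) and (N+1)/N² in (4).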
A longer τ therefore requires more modulators and programmable oscillators, connected as in Fig. 1 or Fig. 2, to realize the complete solution. In general, shifting and shaping the frequency spectrum of a comb signal generated in a lower frequency band is also challenging. These are the reasons why comb jamming is not often applied in practice.

5. Performance comparison of sweep and barrage jamming

The main purpose of comb jamming is to achieve the benefits of barrage jamming, but with lower emission power. Our first step in such an analysis is to compare the performance of pure sweep and pure barrage jamming. Such an analysis has already been performed approximately in [34] for MPSK modulated signals. The deviation from the accurate result is mainly caused by the assumption that only one bit error per symbol is possible regardless of the jamming signal level. In other words, the situation when both bits in a QPSK symbol are faulty is replaced by only one faulty bit.

[Fig. 6 BER (Pb) as a function of the ratio S/N in the case of barrage jamming and as a function of S/I in the case of sweep jamming (horizontal axis: S/N (dB), S/I (dB), from −20 to 8; vertical axis: Pb, logarithmic, from 0.01 to 1; curves: S/N for S/I=60dB and S/I for S/N=60dB)]

In this paper we implement a more accurate comparison for the case of QPSK signal jamming. The exact number of faulty bits in a symbol is taken into account in the estimation. The estimation is based on our originally developed simulation program, which is already presented in [35].
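The effect of counting the exact number of faulty bits, instead of one faulty bit per erroneous symbol, can be illustrated for noise (barrage) jamming with a textbook calculation. This is a minimal sketch and not the simulation program of [35]: it assumes Gray-coded QPSK in Gaussian noise, reads S/N as symbol energy over noise density so that each of the two bits fails independently with probability p=Q(√(S/N)), and models the simplified counting as "every erroneous symbol contributes exactly one faulty bit". The function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def qpsk_ber(snr_db):
    """Return (exact, one_error_approx) BER for Gray-coded QPSK in Gaussian noise.

    Assumption: S/N is symbol energy over noise density, so each of the two
    bits of a symbol is decided wrongly with probability p = Q(sqrt(S/N)).
    """
    snr = 10.0 ** (snr_db / 10.0)
    p = q_function(math.sqrt(snr))
    symbol_error = 1.0 - (1.0 - p) ** 2          # at least one of the two bits wrong
    exact = p                                    # counts 0, 1 or 2 faulty bits per symbol
    approx = symbol_error / 2.0                  # every bad symbol counted as 1 faulty bit
    return exact, approx

for snr_db in (-20, -10, 0, 8):
    exact, approx = qpsk_ber(snr_db)
    print(f"S/N={snr_db:+3d} dB  exact={exact:.4f}  one-error approx={approx:.4f}")
```

Under these assumptions the simplified count equals p − p²/2, so it underestimates the BER, and the gap grows exactly in the high-BER region (BER above 0.1) that matters for jamming.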
The purpose of the simulation program is to determine the bit error rate (BER) when an MPSK modulated signal is jammed by the simultaneous influence of sweep and barrage jamming. For this paper we set one of the two jamming signals to a very low level. To simulate barrage jamming we set the sweep signal level by S/I=60dB, and to simulate sweep jamming we set the noise level by S/N=60dB, where S is the level of the QPSK modulated RCIED activation signal and I and N are the levels of the sinusoidal interference and of the noise signal, respectively.

Fig. 6 presents the BER values as a function of the ratio S/N when barrage jamming is implemented and as a function of S/I when sweep jamming is implemented. For BER values greater than 0.1 (which are of interest in jamming applications), barrage jamming requires a higher interference signal level than sweep jamming to achieve the same BER. The difference in interference level is about 3dB for BER=0.2, 4dB for BER=0.3 and 4.8dB for BER=0.4.

6. Comb jammer parameters definition

The jammed bandwidth is usually the initial condition which has to be defined in each jammer realization. This bandwidth is then translated into the bandwidth relevant for comb jammer design. Let us suppose that 2·i+1 is the number of discrete frequency components which are expected to cause effective jamming. This number is odd, but generality is not lost because we may always select one component more than necessary. The second parameter which has to be specified at the beginning is the desired BER value. The first problem in jammer design is to determine the optimum number of frequency components in the main lobe of a comb signal before the LPF when the number of generated jamming frequencies is known.
The optimum number of frequency components is selected so that the jamming signal emission power is minimized for the pre-defined BER. When a rectangular pulse train or a filtered rectangular pulse train is designed, the problem reduces to the selection of the ratio τ/T. The comb jamming signal is the sum of a number of frequency components. According to the shape of the frequency spectrum in Figure 3 for the rectangular pulse train, the lowest level belongs to the highest frequency component in the main lobe which is passed through the LPF (i.e. the component of order i). The comb jammer has to be designed so that this component satisfies the desired BER value. As a consequence, all other frequency components after the LPF have a higher level than the component of order i and thus cause a higher BER value.

The requirement that the comb signal has minimum power means that its amplitude A is minimum. Two opposite effects influence the value of A. First, if we select a lower value of the ratio τ/T, there are more frequency components in the main lobe and the levels of the frequency components after the LPF tend to be equal. The effect of this modification is a lower value of A. However, according to equation (1), a lower value of τ/T means that the multiplication factor in front of the (sin(x)/x)² term is decreased, and this has to be compensated by a higher value of A. That is why there is a ratio τ/T for which the signal amplitude A is minimum. This is illustrated in Figure 7, which presents three characteristics. Each of them is for the same width of the filter pass band, i.e. the same signal period T, but for a different pulse width τ.

[Fig. 7 Frequency spectrum of rectangular pulse for i=3 and three different values of τ (P(f) versus f; components k=−6…6; curves 1, 2 and 3)]

Curve 1 in Figure 7 corresponds to the case when the number of frequencies in the main lobe is significantly higher than the number of frequencies which have to cause jamming. The signal energy in the main lobe is distributed over a relatively high number of frequency components, each of a relatively low level. Curve 2 is the opposite case, when a low number of frequencies are in the pass band. These frequencies have a higher level than in the previous case. Curve 3 lies in the middle when considering the signal level at f=0, but its level at the component of order i is maximal.

The optimum ratio τ/T is determined from the first and the second derivative of expression (1) at the point i, because it is necessary to find where the power in this point is maximal. Considering only the spectrum envelope in (1), the first derivative is

$$\frac{dP}{dx}=A^{2}\sum_{k=-i}^{i}\frac{\sin(2\pi kx)}{\pi k} \qquad (5)$$

while the second derivative is

$$\frac{d^{2}P}{dx^{2}}=2A^{2}\sum_{k=-i}^{i}\cos(2\pi kx) \qquad (6)$$

where x=τ/T and the components between k=−i and k=i are passed through the LPF (for k=0 the term in (5) is taken as its limit, 2x). According to the real conditions from Figure 3, it must hold that i<(1/x). In the point i the expressions (5) and (6) become

$$\frac{dP_{i}}{dx}=A^{2}\,\frac{\sin(2\pi ix)}{\pi i} \qquad (7)$$

and

$$\frac{d^{2}P_{i}}{dx^{2}}=2A^{2}\cos(2\pi ix) \qquad (8)$$

Equation (7) is equal to 0 if the condition

$$x=\frac{1}{2i} \qquad (9)$$

is satisfied, meaning that the function has an extreme there. For this x the value of the second derivative according to (8) is less than 0, which proves that the emission power in the point defined by (9) is indeed the maximum. This further means that the system gain needed to reach the desired power level is minimal, and so is the emission power.

7. Emission power relation of comb and barrage jamming

We have already emphasized that the intention of comb jamming is to produce the same effect as barrage jamming, but with reduced jammer emission power. That is why we now compare the necessary jamming power for these two jamming strategies.

Let us suppose that we wish to jam a total of 2·i+1 channels. The classical solution is to use a noise jamming signal which continuously covers the frequency band of these channels. The improved possibility is to use only one jamming frequency in each channel. The characteristics presented in Figure 6 apply to each of the 2·i+1 considered channels, i.e. frequency components. As a consequence, the benefits of the filtered rectangular pulse train are directly obvious from Figure 6. Namely, the power of each frequency component in the filtered pulse train signal is equal, and it is lower by the same amount than the uniform noise jamming power needed to cause the same BER. In this way the total effect of jamming in all channels is also equal to the one presented in Figure 6. The decrease in the necessary emission power when comb jamming is implemented is ΔP1FP=3dB for BER=0.2, ΔP2FP=4dB for BER=0.3 and ΔP3FP=4.8dB for BER=0.4.

The benefits are smaller when the rectangular pulse train or the pseudorandom sequence signal is implemented. To determine the improvement in emission power in this case, we start from the calculation of the total emission power relative to the case of uniform spectrum emission power. Our estimation is illustrated by the example with i=3, meaning that a total of 7 frequency components are passed through the LPF. According to the design problem defined earlier, the components of order i=3 need to have the same power in both cases.
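The optimum ratio from (9) and the resulting power penalty of the unequalized rectangular pulse train can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the envelope of (1) written with x=τ/T and A=1, finds the optimum by a simple grid search (a hypothetical helper, not the authors' procedure), and takes the barrage or filtered pulse train level in every channel to be equal to the level of the edge line k=±i.

```python
import math

def envelope(k, x, amplitude=1.0):
    """Line power of (1) with x = tau/T: A^2 * sin^2(pi*k*x) / (pi*k)^2."""
    if k == 0:
        return (amplitude * x) ** 2
    return (amplitude * math.sin(math.pi * k * x) / (math.pi * k)) ** 2

def best_ratio(i, steps=100000):
    """Grid search for the x in (0, 1/i) that maximizes the edge line k = i."""
    candidates = (n / (steps * i) for n in range(1, steps))
    return max(candidates, key=lambda x: envelope(i, x))

def comb_to_barrage(i, x):
    """Per-line power relative to the edge line, and the mean over 2*i+1 lines."""
    edge = envelope(i, x)
    ratios = {k: envelope(k, x) / edge for k in range(-i, i + 1)}
    mean = sum(ratios.values()) / len(ratios)
    return ratios, mean

I = 3                                            # 7 passed components, as in the text
x_opt = best_ratio(I)
ratios, mean = comb_to_barrage(I, x_opt)

print(f"optimum tau/T = {x_opt:.4f}  (1/(2*i) = {1 / (2 * I):.4f})")
for k in sorted(ratios):
    print(f"k={k:+d}  Pcomb/Pbarrage = {ratios[k]:.3f}")
print(f"mean = {mean:.3f}  ->  {10 * math.log10(mean):.2f} dB")
```

With the exact value τ/T=1/6 the mean ratio comes out at roughly 1.76, i.e. about 2.5 dB; Table 1 uses the rounded value τ/T=0.167, so its digits differ slightly (for example, the k=0 ratio evaluates to about 2.47 here).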
Table 1 illustrates the procedure for determining the ratio of comb signal power to barrage signal power when the level of the sinusoidal component of order i=3 is equal to the power at the same frequency in the case of the filtered pulse train, and therefore also to the level of the barrage (noise) signal. The column Prel presents the power of the sinusoidal component of order k relative to unit power. The last two rows of the table present the ratio of the total power of the 7 frequency components after the LPF to the uniform power in the same frequency band. The data in the last column of Table 1 is presented graphically in Fig. 8. It illustrates the power level ratio of the frequency components of the rectangular pulse train signal to the barrage signal, where the rectangular pulse train signal has (at least) the same jamming effect as the barrage signal. The calculated power difference of 2.5dB has to be subtracted from the power saving of the filtered rectangular pulse train signal to obtain the equivalent power saving of the rectangular pulse train. Therefore, for the rectangular pulse train the power saving is ΔP1P=0.5dB for BER=0.2, ΔP2P=1.5dB for BER=0.3 and ΔP3P=2.3dB for BER=0.4. These values are significantly lower than the values for the filtered pulse train, thus confirming the benefits of power spectrum equalization.

Table 1 Power ratio of comb signal for rectangular pulse train to barrage jamming

k            τ/T      Prel        Pcomb/Pbarrage
-3           0.167    0.011258    1
-2           0.167    0.019044    1.692
-1           0.167    0.025422    2.258
0            0.167    0.027889    2.398
1            0.167    0.025422    2.258
2            0.167    0.019044    1.692
3            0.167    0.011258    1
Total                             1.768
Total (dB)                        ≈2.5

[Fig. 8 Power spectrum ratio graphical presentation for rectangular pulse train to barrage signal (Pcomb/Pbarrage for components k=−3…3, with the barrage level as reference)]

8. Conclusions

This paper starts with a comprehensive presentation of IRITEL contributions in the area of RCIED activation jamming. After that, the analysis is directed towards comb jamming. Comb jamming is a wide-band jamming strategy. It can efficiently replace the more commonly implemented barrage jamming strategy. The available literature only emphasizes the fact that comb jamming signal power is lower than barrage jamming power, but without any attempt to support this statement quantitatively [5], [46]. The main contribution of the paper is a quantitative estimation of the emission power difference between comb and barrage jamming under the criterion of the same achieved BER value in both cases. The analysis considers all three most often implemented strategies for comb jamming signal generation: rectangular pulse train, filtered pulse train and pseudorandom sequence. It is shown that equalizing the power of all generated frequency components, as in the filtered pulse train signal, achieves an additional 2.5dB improvement in power saving. In this way the power saving is more than doubled compared to the pulse train signal.

The second contribution of the paper is the determination of the exact BER value when barrage jamming of the RCIED activation message is applied. In our previous contributions we used only an approximate calculation of this value [34]. The exact value is obtained with our original simulation program.

Our other direction of jammer development is related to the prevention of malicious drone missions. Modern drone communication channels are often realized using broadband techniques [47]: frequency hopping spread spectrum (FHSS) [48] or direct sequence spread spectrum (DSSS) [49]. Comb jamming is highly suitable for jamming these two signal types due to its ability to cover a wide bandwidth with moderate emission power.
The solutions presented in this paper are the first step in the future development towards broadband jamming of drone communication signals.

References

[1] g. kumaraswamy rao and k. v. ranga rao, "intelligent jamming solution to defeat the growing menance of remotely controlled improvised devices (rcieds) using electronic counter measures", int. j. electron. commun. comput. eng., vol. 4, no. 5, pp. 1479–1488, 2013. [2] m. e. pesci, "systems engineering in counter radio-controlled improvised explosive device electronic warfare", johns hopkins apl technical digest, vol. 31, no. 1, pp. 58–65, 2012. [3] j. haystead, "defeat ied mission expands to defensive electronic attack (dea)", the j. electron. defense, pp. 28–40, 2015. [4] k. wilgucki, r. urban, g. baranowski, p. grądzki and p. skarźyński, automated protection system against rcied, military communications and information technology. chapter 7: cognitive radio and spectrum management techniques, 2012, pp. 593–601. [5] r. poisel, modern communications jamming principles and techniques. boston/london, second edition, artech house, 2011. [6] k. wilgucki, r. urban, g. baranowski, p. grądzki and p. skarźyński, "selected aspects of effective rcied jamming", in proceedings of the military communications and information systems conference, warsaw, 2012, pp. 1–5. [7] k. burda, "the performance of follower jammer with a wideband scanning receiver", j. electr. eng., vol. 55, no. 1–2, pp. 36–38, 2004. [8] s. d’oro, l. gallucio, g. morabito and s. palazzo, "efficiency analysis of jamming-based countermeasures against malicious timing channel in tactical communications", in proceedings of the ieee international conference on communications icc, budapest, 2013, pp. 4020–4024. [9] s. amuru and r. m.
buehrer, "optimal jamming strategies in digital communications / impact of modulation", in proceedings of the ieee global communications conference (globecom), 2014, pp. 1619–1624. [10] s. amuru and r. m. buehrer, "optimal jamming against digital modulation", ieee trans. inf. forensics secur., vol. 10, no. 10, pp. 2212–2224, 2015. [11] j. mietzner, p. nickel, a. meusling, p. loos and g. bauch, "responsive communications jamming against radio-controlled improvised explosive devices", ieee commun. mag., vol. 50, no. 10, pp. 38–46, 2012. [12] m. tanatwy, "responsive communication jamming detector with noise power fluctuation using cognitive radio", int. j. innovative res. comput. commun. eng., vol. 2, no. 10, pp. 5967–5973, 2014. [13] t. trump and i. müürsepp, "detection speed of responsive communication jamming detectors, recent advances in telecommunications and circuits", in proceedings of the 2nd international conference on circuits, systems, communications, computers and applications, dubrovnik, 2013, pp. 149–154. [14] m. wilhelm, i. martinović, j. schmitt and v. lenders, "reactive jamming in wireless networks: how realistic is the threat?", in proceedings of the 4th acm conference on wireless network security (wisec '11), acm, hamburg, 2011, pp. 47–52. [15] g. evans, "a new weapon in the fight against rcieds, army technology", august 2015, https://www.army-technology.com/features/featurea-new-weapon-in-the-fight-against-rcieds-4647155/. [16] selena electronics, rss intelligent reactive stationary jammer and rsv vehicle reactive jammer. in electronics warfare systems: jamming solution, 2015. [17] m. mileusnić, p. petrović, a. lebl and b. pavić, "comparison of rcied activation responsive and active jamming reliability", in proceedings of the 6th international conference icetran 2019. srebrno jezero, 2019, pp. 988–993, awarded as the best paper in the section of telecommunications. [18] m. mileusnić, p. petrović, v. kosjer, a. lebl and b. 
pavić, "reliability analysis of different rcied activation signal responsive jamming techniques and their comparison to active jamming", fu electr. energ., vol. 33, no. 3, pp. 459–476, 2020. [19] a. gulyás, "the radio controlled improvised explosive device (rcied) threat in afghanistan", aarms, vol. 12, no. 1, pp. 1–11, 2013. [20] oss net, survey of rcieds southeast asia – feb 2003-oct 2005. oss southeast asia division, 2005. [21] f. e. idachaba, "algorithm for source mobile identification and deactivation in sms triggered improvised explosive devices", procedia eng., vol. 78, pp. 96-101, 2014. [22] stratign, "radio jammers", https://www.stratign.com/radio-jammers/. [23] security & counterintelligence group llc, "lightning: rcied jamming system – vehicle installed", https://scgroup-ltd.com/lightning/. [24] j. magiera, "wideband signal generation for jamming radio-controlled improvised explosive devices", in proceedings of the 41st international conference on telecommunications and signal processing (tsp). athens, 2018, pp. 1–4. [25] m. e. belkin, a. alyoshin, d. fofanov and a. s. sigov, "studying microwave-photonics design principle of a responsive jammer for radio-controlled explosive devices", tech. phys. lett., vol. 46, no. 11, pp. 1132–1135, 2020. [26] m. e. belkin, l. zhukov and n. smirnov, "devising an optimal time-delay circuit configuration for a microwave-photonics-based radio communication jammer", in proceedings of the 29th telecommunications forum (telfor), belgrade, 2021, pp. 440–443. [27] p. petrović and m. šunjevarić, "radio surveillance and jamming systems and techniques", trends in telecommunications, pp. 17.1.-17.22., belgrade, november 1988, (p. petrović, m.
šunjevarić, “savremeni sistemi i tehnike za radio-izviđanje i ometanje”, pravci razvoja telekomunikacija, str. 17.117.22, beograd, novembar 1988). [28] iritel high frequency (hf) radio surveillance and jamming system, chapter in the book m. streetly, jane’s radar and electronic warfare systems. ihs global limited, 2011. [29] iritel very/ultra high frequency (v/uhf) radio surveillance and jamming system, chapter in the book m. streetly, jane’s radar and electronic warfare systems. ihs global limited, 2011. [30] m. mileusnić, p. petrović, b. pavić, v. marinković-nedelicki, j. glišović, a. lebl and i. marjanović, "the radio jammer against remote controlled improvised explosive devices", in proceedings of the 25th telecommunications forum (telfor), belgrade, 2017, pp. 151–154. [31] m. mileusnić, b. pavić, v. marinković-nedelicki, p. petrović, d. mitić and a. lebl, "analysis of jamming successfulness against rcied activation", in proceedings of the 5th international conference icetran 2018. palić, 2018, pp. 1206–1211, paper awarded as the best one in the section of telecommunications. [32] m. mileusnić, b. pavić, v. marinković-nedelicki, p. petrović, d. mitić and a. lebl, "analysis of jamming successfulness against rcied activation with the emphasis on sweep jamming", fu electron. energ., vol. 32, no. 2, pp. 211–229, 2019. [33] v. marinković-nedelicki, a. lebl, m. mileusnić, p. petrović and b. pavić, "ber calculation for sweep jamming of mpsk modulated rcied activation message signals", in proceedings of the 18th international symposium "infoteh jahorina 2019". jahorina, 2019, pp. 1–6. [34] m. mileusnić, p. petrović, b. pavić, v. marinković-nedelicki, v. matić and a. lebl, "jamming of mpsk modulated messages for rcied activation", in proceedings of the 8th international scientific conference on defensive technologies oteh, belgrade, 2018. [35] v. marinković-nedelicki, a. lebl, m. mileusnić and p. 
petrović, "combined jamming in rcied activation prevention", in proceedings of the 19th international symposium “infoteh jahorina 2020”. jahorina, 2020, pp. 1–6. [36] a. lebl, m. mileusnić, b. pavić, v. marinković-nedelicki and p. petrović, "programmable generator of pseudo-white noise for jamming applications", in proceedings of the 27th telecommunications forum (telfor). belgrade, 2019, pp. 1–4. [37] p. petrović, n. remenski, p. jovanović, v. tadić, b. pavić, m. mileusnić and b. mišković, wrj 2004 wideband radio jammer against rcieds. tehničko rešenje – novi proizvod na projektu tehnološkog razvoja tr32051 pod nazivom razvoj i realizacija naredne generacije sistema, uređaja i softvera na bazi softverskog radija za radio i radarske mreže, http://www.iritel.com/images/pdf/wrj2004-e.pdf, 2011. [38] n. remenski, b. pavić, p. petrović, m. mileusnić and v. marinković-nedelicki, integrisana radiooprema za zaštitu prostora od mobilnih veza (treća generacija radio-opreme). tehničko rešenje – novi proizvod s oznakom cj-1p na projektu tehnološkog razvoja tr-11030 razvoj i realizacija nove generacije softvera, hardvera i usluga na bazi softverskog radija za namenske aplikacije, http://www.iritel.com/images/pdf/cj-1p-e.pdf,, 2010 (also published in the book m. streetly, jane’s radar and electronic warfare systems.. ihs global limited, 2011). prva generacija radio-opreme s oznakom cj-1 je realizovana na projektu tehnološkog razvoja tr6149b, 2006. [39] p. petrović, m. mileusnić, b. pavić, v. tadić and v. marinković-nedelicki, razvoj nove generacije sistema za radio-izviđanje i ometanje u vf i vvf/uvf opsegu. tehničko rešenje u okviru projekta 10 m 06, ministarstvo za nauku i tehnologiju srbije, fond za naučni razvoj, 1997-2000. [40] p. petrović, generator of jamming signals gemos. technical solution, 1990. [41] p. petrović, development of new generation of gemos devices and signal classifier based on dsp technology. technical solution, 1999. [42] a. lebl, m. mileusnić and j. 
radivojević, "combined and comb rcied activation messages jamming – two different strategies with similar names", sci. tech. rev., vol. 70, no. 1, pp. 21–28, 2020. [43] b. a. black, on the generation of waveforms having comb-shaped spectra. nrl memorandum report 619, naval research laboratory, may 1988. [44] r. e. stoddard, multi-band jammer. patent no. us7697885 b2, 2010, pp. 1–7. [45] x. song, x. wang, z. dong, x. zhao and x. feng, "pseudo-random sequence correlation identification parameters and anti-noise performance", energies, vol. 2018, no. 11, pp. 1–18, 2018. [46] m. r. frater and m. ryan, electronic warfare for the digitized battlefield. artech house inc., 2001. [47] v. chamola, p. kotesh, a. agarwal, naren, n. gupta and m. guizani, "a comprehensive review of unmanned aerial vehicle attacks and neutralization techniques", ad hoc networks, vol. 111, p. 102324, 2021, [48] h.-b. kil, j.-s. lee and e.-r. jeong, "analysis of frequency hopping signals in commercial drones", int. j. pure appl. math., vol. 118, no. 19, pp. 2015–2024, 2018. [49] b. m. todorović and v. d. orlić, "direct sequence spread spectrum scheme for an unmanned aerial vehicle ppm control signal protection”, ieee commun. lett., vol 13, no. 10, pp. 727–729, 2009. facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp. 701 720 doi: 10.2298/fuee1604701a anas n. al-rabadi1,2 received november 28, 2015; received in revised form april 13, 2016 corresponding author: anas n. al-rabadi electrical engineering department, philadelphia university, jordan & computer engineering department, the university of jordan, amman-jordan (email: alrabadi@yahoo.com) facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp.
507 525 doi: 10.2298/fuee1504507s horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2 1university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia 2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and ftbvceo product of 173 ghzv that are among the highest-performance implanted-base, silicon bipolar transistors. hbct is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks achieving the high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional costs. double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at mixer current of 9.2 ma and conversion gain of -5 db are achieved. key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors. 1. introduction in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in the technologies that are very cost-sensitive. 
in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might received march 9, 2015 corresponding author: tomislav suligoj university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr) facta universitatis, seri es: electroni cs and energeti cs vol. 22, no. 4, december 2016, 100-118 udk: doi: multi-valued galois shannon davio trees and their complexity anas n. al-rabadi abstract: the idea of shannon-davio (s/d) trees for binary logic is a general concept that found applications in the sum-of-product (sop) minimization and the generation of new diagrams and canonical forms. extended s/d trees are used to generate forms that include a minimum galois field sum-of-products (gfsop) forms. since there exist many applications of galois field of quaternary radix especially that gf(4) is considered as an important extension of gf(2), the extension of the s/d trees to gf(4) is presented here. a general formula to calculate the number of inclusive forms (ifs) per variable order for an arbitrary galois field radix and arbitrary number of variables is derived and introduced. a new fast method to count the number of ifs for an arbitrary galois radix and functions of two variables is also introduced; the ifn,2 triangles. the results introduced in this work can be useful for the creation of an efficient gfsop minimizer for galois logic and in other applications such as in reversible logic synthesis. keywords: complexity, galois field sum-of-product (gfsop), galois forms, inclusive forms, multi-valued logic, quaternary logic, shannon-davio (s/d) trees. 1 introduction spectral transforms play an important role in the synthesis, analysis, testing, classification, formal verification and simulation of logic circuits and systems. 
dyadic families of discrete transforms (reed-muller and green-sasao hierarchy, walsh, arithmetic, adding and haar transforms) and their generalizations to p-adic (multi-valued) transforms have found a fruitful use in digital system design [1, 2, 6-35]. reed-muller-like spectral transforms [2-6, 12-14, 16-18, 21, 25, 27, 29, 33, 35] have found a variety of useful applications in minimizing exclusive sum-of-products (esop) and galois field sop (gfsop) expressions, creation of new forms, binary decision diagrams, spectral decision diagrams and regular structures, besides their well-known uses in digital communications, digital signal processing, digital image processing and fault detection and testing [1-7, 12, 13, 15-19, 21-25, 27, 29, 32, 35]. the method of generating the new families of multi-valued shannon and davio spectral transforms is based on the fundamental multi-valued shannon and davio expansions, respectively.

the remainder of this paper is organized as follows: basic definitions of the fundamental binary expansions and their multi-valued extensions are given in section 2. section 3 presents the quaternary galois shannon-davio (s/d) trees. the number of s/d inclusive forms and the new $\mathrm{IF}_{n,2}$ triangles are introduced in section 4. conclusions and future work are presented in section 5.

manuscript received july 15, 2015; revised
electrical engineering department, philadelphia university, jordan & computer engineering department, the university of jordan, amman-jordan
e-mail: alrabadi@yahoo.com
this research was performed during sabbatical leave in 2015-2016 granted from the university of jordan and spent at philadelphia university

2 basic shannon and davio decompositions

this section presents necessary mathematical background and the fundamental formalisms of the work that will be introduced and further developed in the following sections. normal canonical forms play an important role in the design of logic circuits, which includes synthesis, testing and optimization [2, 8, 13, 15, 17, 18, 21, 23, 25, 27, 29, 32, 35-37]. the main algebraic structure which is used in this work for developing the canonical normal forms is the galois field (gf) algebraic structure, which is a fundamental algebraic structure in the theory of algebras [2, 8, 17, 18, 21, 32].
galois field has proven high efficiency in various applications of logic synthesis, such as in the design for test, error correction codes, and even in the proof of the well-known fermat's last theorem. the importance of gf for logic synthesis results from the fact that every finite field is isomorphic to a galois field. in general, the attractive properties of gf-based circuits, such as high testability of such circuits, are due to the fact that the gf operators exhibit the cyclic group (also called latin square) property, which can be explained, for example, using the gf(4) (quaternary) operators shown in figures 1(a) and 1(b): note that in any row and column of the addition table in figure 1(a) the elements are all different, and that the elements have a different order in each row and column. another cyclic group can be observed in the multiplication table: if the zero elements are removed from the multiplication table in figure 1(b), then the remaining elements form a cyclic group. in binary, for example, the gf(2) addition operator exor has the cyclic group property.

(a) addition
+ | 0 1 2 3
--+--------
0 | 0 1 2 3
1 | 1 0 3 2
2 | 2 3 0 1
3 | 3 2 1 0

(b) multiplication
* | 0 1 2 3
--+--------
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 3 1
3 | 0 3 1 2

fig. 1: gf(4) addition and multiplication tables.

reed-muller based normal forms have been classified using the green-sasao hierarchy. the green-sasao hierarchy of families of canonical forms and corresponding decision diagrams is based on three generic expansions: shannon, positive davio and negative davio expansions.
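the tables of figure 1 can be regenerated and their latin square property checked mechanically. the following is a minimal python sketch, not part of the original paper; the 2-bit polynomial coding (2 = a, 3 = a + 1, with a^2 = a + 1) and the helper names `gf4_add`, `gf4_mul` and `is_latin_square` are assumptions introduced here for illustration.

```python
# gf(4) elements are coded as 0, 1, 2, 3 as in figure 1.
# assumption: 2-bit polynomial coding, 2 = a and 3 = a + 1, with a^2 = a + 1.

def gf4_add(a, b):
    # addition of polynomials over gf(2) is a bitwise xor
    return a ^ b

def gf4_mul(a, b):
    # carry-less multiplication of two 2-bit polynomials
    p = 0
    for i in range(2):
        if (b >> i) & 1:
            p ^= a << i
    # reduce the degree-2 term modulo a^2 + a + 1 (binary 111)
    if p & 0b100:
        p ^= 0b111
    return p

ADD = [[gf4_add(a, b) for b in range(4)] for a in range(4)]
MUL = [[gf4_mul(a, b) for b in range(4)] for a in range(4)]

def is_latin_square(table, symbols):
    # every row and every column must contain each symbol exactly once
    rows_ok = all(set(row) == symbols and len(row) == len(symbols) for row in table)
    cols_ok = all(set(col) == symbols and len(col) == len(symbols) for col in zip(*table))
    return rows_ok and cols_ok

# multiplication table with the zero row and zero column removed
NONZERO_MUL = [row[1:] for row in MUL[1:]]

# successive powers of the element 2: 2, 2*2, 2*2*2
POWERS_OF_2 = [2, gf4_mul(2, 2), gf4_mul(gf4_mul(2, 2), 2)]

if __name__ == "__main__":
    for row in ADD:
        print(row)
    for row in MUL:
        print(row)
    print(is_latin_square(ADD, {0, 1, 2, 3}))
    print(is_latin_square(NONZERO_MUL, {1, 2, 3}))
    print(POWERS_OF_2)
```

under this coding the element 2 generates all nonzero elements by repeated multiplication, which is the cyclic structure of the reduced multiplication table referred to in the text.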
the corresponding shannon, positive davio and negative davio expansions are given as follows [2, 32]:

$$f(x_1,x_2,\ldots,x_n) = \bar{x}_1 \cdot f_0(x_1,x_2,\ldots,x_n) \oplus x_1 \cdot f_1(x_1,x_2,\ldots,x_n) = \begin{bmatrix} \bar{x}_1 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{1}$$

$$f(x_1,x_2,\ldots,x_n) = 1 \cdot f_0(x_1,x_2,\ldots,x_n) \oplus x_1 \cdot f_2(x_1,x_2,\ldots,x_n) = \begin{bmatrix} 1 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{2}$$

$$f(x_1,x_2,\ldots,x_n) = 1 \cdot f_1(x_1,x_2,\ldots,x_n) \oplus \bar{x}_1 \cdot f_2(x_1,x_2,\ldots,x_n) = \begin{bmatrix} 1 & \bar{x}_1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{3}$$

where $f_0(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) = f_0$ is the negative cofactor of variable $x_1$, $f_1(x_1,x_2,\ldots,x_n) = f(1,x_2,\ldots,x_n) = f_1$ is the positive cofactor of variable $x_1$, and $f_2(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) \oplus f(1,x_2,\ldots,x_n) = f_0 \oplus f_1$.

an arbitrary n-variable function $f(x_1,x_2,\ldots,x_n)$ can be represented using the positive polarity reed-muller (pprm) expansion as follows:

$$f(x_1,x_2,\ldots,x_n) = a_0 \oplus a_1 x_1 \oplus a_2 x_2 \oplus \cdots \oplus a_n x_n \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus \cdots \oplus a_{n-1,n} x_{n-1} x_n \oplus \cdots \oplus a_{12 \ldots n} x_1 x_2 \cdots x_n. \tag{4}$$

for each function $f$, the coefficients $a_i$ in equation (4) are determined uniquely, so pprm is a canonical form. if we use either only the positive literal or only the negative literal for each variable in equation (4) we obtain the fixed polarity reed-muller (fprm) form. there are $2^n$ possible combinations of polarities and as many fprms for any given logic function [2, 32]. if we freely choose the polarity of each literal in equation (4), we obtain the generalized reed-muller (grm) form. in grms, contrary to fprms, the same variable can appear in both positive and negative polarities. there are $n 2^{(n-1)}$ literals in equation (4), so there are $2^{n 2^{(n-1)}}$ polarities for an n-variable function and as many grms [32]. each of the polarities determines a unique set of coefficients, and thus each grm is a canonical representation of a function. two other types of expansions result from the flattening of certain binary trees that will produce kronecker (kro) forms and pseudo kronecker (pkro) forms for shannon, positive davio and negative davio expansions.
there are $3^n$ and at most $3^{2^n-1}$ different kros and pkros, respectively [32]. a good selection of the various permutations using the shannon and davio expansions as internal nodes in decision trees (dts) and diagrams (dds) will result in dts and dds that represent the corresponding logic functions with smaller sizes, in terms of both the total number of hierarchical levels used and the total number of internal nodes needed. minimizing the size of the dd that represents a logic function speeds up the manipulations of logic functions using the dd as a data structure, and reduces the memory space used during the execution of such manipulations. one can observe that by going from pprm to grm forms, fewer constraints are imposed on the canonical forms due to the enlarged set of polarities that one can choose from. the gain of more freedom (fewer constraints) on the polarity of the canonical expansions provides the advantage of obtaining exclusive-sum-of-product (esop) expressions with fewer terms and literals; consequently, expressing boolean functions using esop forms will produce on average expressions of smaller size compared to sum-of-product (sop) expressions.

in general, a literal can be defined as any function of a single variable [2, 18, 32]. basis functions in the general case of multi-valued expansions are constructed using literals. galois field sop expansions can be performed on a variety of literals. for example, one can use among others: k-reduced post literal (k-rpl) to produce k-rpl gfsop, post literal to produce pl gfsop, window literal to produce wl gfsop, generalized (post) literal to produce gl gfsop, or universal literal to produce ul gfsop.
figure 2 demonstrates set-theoretic relationships between the various literals, where the shaded reduced post literal is the type of literal that will be used throughout this paper. one may note that the rpl in the discrete domain is analogous to the delta function in the continuous domain.

fig. 2: inclusion relationship of various types of literals (reduced post literal, post literal, window literal, generalized (post) literal, universal literal).

example 1. figure 3 demonstrates several literal types, where one proceeds from the simplest rpl literal in figure 3(a) to the more complex wl literal in figure 3(c). for rpl in figure 3(a), a value k is produced by the literal when the value of the variable is equal to a specific state, and in this particular example a value k = 1 is generated by the 1-rpl when the value of variable x is equal to a certain state (here this state is equal to one). figure 3(b) shows pl where the value generated by the literal at a specific state is equal to the maximum value (i.e., radix) of that logic, and wl in figure 3(c) generates a value equal to the radix for a "window" of specific states.

fig. 3: an example of different types of literals over an arbitrary five-radix logic: (a) 1-reduced post literal (1-rpl), (b) post literal (pl), and (c) window literal (wl).

since k-rpl gfsop is as simple as pl and is simpler from an implementation point of view than other kinds of literals, we will perform all of the gfsop expansions utilizing the corresponding 1-reduced post literal gfsop. consequently, let us define the 1-rpl as [2, 32]:

$${}^{i}x = 1 \ \text{iff} \ x = i, \ \text{else} \ {}^{i}x = 0. \tag{5}$$

for example, $\{{}^{0}x, {}^{1}x, {}^{2}x\}$ are the zero, first and second polarities of the 1-reduced post literal, respectively. also, let us define the ternary shifts over the variable x, $\{x, x', x''\}$, as the zero, first and second shifts of the variable x respectively (i.e., $x = x + 0$, $x' = x + 1$ and $x'' = x + 2$, respectively), where x can take any value in the set {0,1,2}. we chose to represent the 1-reduced post literals in terms of shifts and powers, among others, because of the ease of the implementation of powers of shifted variables in hardware such as in universal logic modules (ulms) [2]. analogously to the binary and ternary cases, the quaternary shannon expansion over gf(4) for a function with a single variable is:

$$f = {}^{0}x\, f_0 + {}^{1}x\, f_1 + {}^{2}x\, f_2 + {}^{3}x\, f_3, \tag{6}$$

where $f_0$ is the cofactor of f with respect to variable x of value 0, $f_1$ is the cofactor of f with respect to variable x of value 1, $f_2$ is the cofactor of f with respect to variable x of value 2, and $f_3$ is the cofactor of f with respect to variable x of value 3.

example 2. let $f(x_1,x_2) = x''_1 x_2 + x'''_2 x_1$. by using figure 1, the quaternary truth vector in the variable order $\{x_1,x_2\}$ is $\vec{F} = [0,3,1,2,2,1,3,0,3,0,2,1,1,2,0,3]^T$. utilizing equation (6), one obtains the following gf(4) shannon expansion for f:

$$\begin{aligned} f = {} & 2 \cdot {}^{0}x_1\,{}^{1}x_2 + 3 \cdot {}^{0}x_1\,{}^{2}x_2 + {}^{0}x_1\,{}^{3}x_2 + 3 \cdot {}^{1}x_1\,{}^{0}x_2 + {}^{1}x_1\,{}^{1}x_2 + 2 \cdot {}^{1}x_1\,{}^{3}x_2 \\ & + {}^{2}x_1\,{}^{0}x_2 + 3 \cdot {}^{2}x_1\,{}^{1}x_2 + 2 \cdot {}^{2}x_1\,{}^{2}x_2 + 2 \cdot {}^{3}x_1\,{}^{0}x_2 + {}^{3}x_1\,{}^{2}x_2 + 3 \cdot {}^{3}x_1\,{}^{3}x_2. \end{aligned}$$

using the axioms of gf(4) that are manifested in the operators shown in figure 1, the 1-rpl defined in equation (5) is related to the shifts of variables over gf(4) in terms of powers [2-5] as follows:
\begin{align}
{}^{0}x &= x^3 + 1, \tag{7} \\
{}^{0}x &= x' + (x')^2 + (x')^3, \tag{8} \\
{}^{0}x &= 3(x'') + 2(x'')^2 + (x'')^3, \tag{9} \\
{}^{0}x &= 2(x''') + 3(x''')^2 + (x''')^3, \tag{10} \\
{}^{1}x &= x + (x)^2 + (x)^3, \tag{11} \\
{}^{1}x &= (x')^3 + 1, \tag{12} \\
{}^{1}x &= 2(x'') + 3(x'')^2 + (x'')^3, \tag{13} \\
{}^{1}x &= 3(x''') + 2(x''')^2 + (x''')^3, \tag{14} \\
{}^{2}x &= 3(x) + 2(x)^2 + (x)^3, \tag{15} \\
{}^{2}x &= 2(x') + 3(x')^2 + (x')^3, \tag{16} \\
{}^{2}x &= (x'')^3 + 1, \tag{17} \\
{}^{2}x &= x''' + (x''')^2 + (x''')^3, \tag{18} \\
{}^{3}x &= 2(x) + 3(x)^2 + (x)^3, \tag{19} \\
{}^{3}x &= 3(x') + 2(x')^2 + (x')^3, \tag{20} \\
{}^{3}x &= x'' + (x'')^2 + (x'')^3, \tag{21} \\
{}^{3}x &= (x''')^3 + 1, \tag{22}
\end{align}

where $\{{}^{0}x, {}^{1}x, {}^{2}x, {}^{3}x\}$ are the zero, first, second and third polarities of the 1-rpl, respectively. also, $\{x, x', x'', x'''\}$ are the zero, first, second and third shifts (inversions) of the variable x respectively, and variable x can take any value of the set {0,1,2,3}. analogous to the ternary case, we chose to represent the 1-rpl in terms of shifts and powers, among others, because of the ease of the implementation of powers of shifted variables in hardware. after the substitution of equations (7)-(22) in equation (6), and after the rearrangement and reduction of terms according to the gf(4) operations in figure 1, one obtains:

\begin{align}
f &= 1 \cdot f_0 + x(f_1 + 3f_2 + 2f_3) + (x)^2(f_1 + 2f_2 + 3f_3) + (x)^3(f_0 + f_1 + f_2 + f_3), \tag{23} \\
f &= 1 \cdot f_1 + (x')(f_0 + 2f_2 + 3f_3) + (x')^2(f_0 + 3f_2 + 2f_3) + (x')^3(f_0 + f_1 + f_2 + f_3), \tag{24} \\
f &= 1 \cdot f_2 + (x'')(3f_0 + 2f_1 + f_3) + (x'')^2(2f_0 + 3f_1 + f_3) + (x'')^3(f_0 + f_1 + f_2 + f_3), \tag{25} \\
f &= 1 \cdot f_3 + (x''')(f_2 + 3f_1 + 2f_0) + (x''')^2(f_2 + 2f_1 + 3f_0) + (x''')^3(f_0 + f_1 + f_2 + f_3). \tag{26}
\end{align}

equations (6) and (23)-(26) are the 1-rpl quaternary shannon (s) and davio $\{d_0, d_1, d_2, d_3\}$ expansions for a single variable, respectively. these equations can be re-written in the following matrix-based convolution-like forms, respectively:

\begin{align}
f &= \begin{bmatrix} {}^{0}x & {}^{1}x & {}^{2}x & {}^{3}x \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{27} \\
f &= \begin{bmatrix} 1 & x & x^2 & x^3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 3 & 2 \\ 0 & 1 & 2 & 3 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{28} \\
f &= \begin{bmatrix} 1 & x' & (x')^2 & (x')^3 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 2 & 3 \\ 1 & 0 & 3 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{29} \\
f &= \begin{bmatrix} 1 & x'' & (x'')^2 & (x'')^3 \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 & 0 \\ 3 & 2 & 0 & 1 \\ 2 & 3 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{30} \\
f &= \begin{bmatrix} 1 & x''' & (x''')^2 & (x''')^3 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 & 1 \\ 2 & 3 & 1 & 0 \\ 3 & 2 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}. \tag{31}
\end{align}

one can observe that equations (27)-(31) are expansions for a single variable. yet, these canonical expressions can be generated for an arbitrary number of variables n using the kronecker (tensor) product. this can be expressed formally as in the following discrete convolution-like forms for shannon (s), and davio ($d_0$, $d_1$, $d_2$ and $d_3$) expressions, respectively [2, 32]:

\begin{align}
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} {}^{0}x_i & {}^{1}x_i & {}^{2}x_i & {}^{3}x_i \end{bmatrix} \bigotimes_{i=1}^{n} [S]\,[\vec{F}], \tag{32} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i & x_i^2 & x_i^3 \end{bmatrix} \bigotimes_{i=1}^{n} [D_0]\,[\vec{F}], \tag{33} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x'_i & (x'_i)^2 & (x'_i)^3 \end{bmatrix} \bigotimes_{i=1}^{n} [D_1]\,[\vec{F}], \tag{34} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x''_i & (x''_i)^2 & (x''_i)^3 \end{bmatrix} \bigotimes_{i=1}^{n} [D_2]\,[\vec{F}], \tag{35} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x'''_i & (x'''_i)^2 & (x'''_i)^3 \end{bmatrix} \bigotimes_{i=1}^{n} [D_3]\,[\vec{F}]. \tag{36}
\end{align}

the following section utilizes the presented gf(4) spectral-based functional decompositions for the synthesis of decision trees. in this spectral interpretation of decision trees, different decision trees can be defined by using different decomposition forms which are specified by the corresponding transform matrices and multi-valued literals.
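the davio transform matrices of equations (28)-(31) and the truth vector of example 2 can be checked exhaustively over gf(4). the python sketch below is an illustration and not part of the original paper; the helper names (`add`, `mul`, `power`, `davio_eval`, `example2`) are hypothetical, and the truth vector ordering (x1 taken as the fastest-varying index) is an assumption chosen because it reproduces the vector printed in example 2.

```python
from itertools import product

# gf(4) operator tables copied from figure 1
ADD = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
MUL = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]

def add(*terms):
    total = 0
    for t in terms:
        total = ADD[total][t]
    return total

def mul(*factors):
    result = 1
    for f in factors:
        result = MUL[result][f]
    return result

def power(a, e):
    # a^e over gf(4); the empty product gives a^0 = 1
    return mul(*([a] * e))

# transform matrices of equations (28)-(31); row r multiplies (shifted x)^r
DAVIO = {
    0: [[1, 0, 0, 0], [0, 1, 3, 2], [0, 1, 2, 3], [1, 1, 1, 1]],
    1: [[0, 1, 0, 0], [1, 0, 2, 3], [1, 0, 3, 2], [1, 1, 1, 1]],
    2: [[0, 0, 1, 0], [3, 2, 0, 1], [2, 3, 0, 1], [1, 1, 1, 1]],
    3: [[0, 0, 0, 1], [2, 3, 1, 0], [3, 2, 1, 0], [1, 1, 1, 1]],
}

def davio_eval(s, cof, x):
    """evaluate the davio-s expansion with cofactors cof = (f0, f1, f2, f3) at x."""
    xs = add(x, s)  # s-th shift of x, i.e. x + s over gf(4)
    value = 0
    for r, row in enumerate(DAVIO[s]):
        coeff = add(*[mul(row[c], cof[c]) for c in range(4)])
        value = add(value, mul(power(xs, r), coeff))
    return value

def check_davio():
    # a single-variable function is fully described by its four cofactors,
    # so all 4^4 functions are tested for each of the four davio types
    return all(davio_eval(s, cof, x) == cof[x]
               for s in range(4)
               for cof in product(range(4), repeat=4)
               for x in range(4))

def example2(x1, x2):
    # f(x1, x2) = x1'' * x2 + x2''' * x1, with x'' = x + 2 and x''' = x + 3
    return add(mul(add(x1, 2), x2), mul(add(x2, 3), x1))

# assumption: x1 is the fastest-varying index of the truth vector
F = [example2(x1, x2) for x2 in range(4) for x1 in range(4)]

if __name__ == "__main__":
    print(check_davio())
    print(F)
```

each davio expansion reproduces the cofactor selected by x for every single-variable quaternary function, which is the property the matrix forms are meant to encode.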
the utilized decision trees can therefore be viewed as graphical representations of functional expressions where different trees produce different functional expressions, and by counting the number of possible different trees that are derived by assigning different decomposition rules to their nodes, we can count the number of possible functional expressions.

3 quaternary shannon-davio (s/d) trees

the basic s, $d_0$, $d_1$, $d_2$ and $d_3$ quaternary expansions (i.e., flattened forms) introduced previously in equations (32)-(36) can be represented in quaternary dts (qudts) and the corresponding varieties of reduced quaternary dds (rqudds) according to the corresponding reduction rules. for one variable (i.e., one level of the dt), figures 4(a)-4(e) represent the expansion nodes for $\{s, d_0, d_1, d_2, d_3\}$, respectively, and the notation in figure 4(f) means that the generalized variable (written here in bold, $\mathbf{x}$, to distinguish it from the variable x itself) corresponds to the four possible shifts of the variable x as:

$$\mathbf{x} \in \{x, x', x'', x'''\}, \ \text{over gf(4)}. \tag{37}$$

fig. 4: quaternary decision nodes: (a) shannon in eq. (32), (b) davio0 in eq. (33), (c) davio1 in eq. (34), (d) davio2 in eq. (35), (e) davio3 in eq. (36), and (f) generalized quaternary davio defined in eq. (37).

utilizing the two nodes defined for quaternary shannon in figure 4(a) and quaternary generalized davio in figure 4(f), and analogously to the binary and ternary cases, one can obtain the quaternary shannon-davio (s/d) trees for two variables (cf. figure 5), where a general family called inclusive forms (ifs) is obtained as flattened expressions generated by these s/d trees. for example, the corresponding s/d trees for ifs of two variables can be generated for variable order {a,b} and for variable order {b,a} as well.
the number of these s/d trees per variable order is $2^{(4+1)} = 32$, where the number of qifs per s/d tree will be derived later in section 4 in two different ways: the first method uses the general formula for an arbitrary number of variables over gf(4), and the second method uses the general formula for any radix. the number of all possible forms is important because it can be used as an upper-bound parameter in a search heuristic that searches for a minimum gfsop expression using the corresponding s/d trees. example 3 illustrates some of the quaternary s/d trees and some of the quaternary trees they produce. the numbers on top of the s/d trees in figures 5(a) and 6(a) are the numbers of total qifs (i.e., total number of quaternary trees) that are generated.

example 3. utilizing the notation in equation (37), we obtain, for the s/d trees in figures 5(a) and 6(a), the corresponding s/d trees in figures 5(b)-5(c) and figures 6(b)-6(c), respectively. from the quaternary s/d trees shown in figures 5 and 6, by taking any s/d tree, multiplying the two-level cofactors (which are in the qudt leaves) each by the corresponding path in that qudt, and next summing all the resulting cubes (terms or products) over gf(4), one obtains the flattened if form for the function f as a certain gfsop expression (expansion). for each qudt in figures 5(a) and 6(a), there are as many if forms obtained for the function f as the number of all possible permutations of the polarities of the variables in the second level branches of each qudt.
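the count of 32 s/d trees and the numbers printed on top of the trees in figures 5(a) and 6(a) can be reproduced by enumerating the shannon/davio assignment of the five nodes of a two-level quaternary tree. the python sketch below is illustrative only; it assumes the per-node davio counts used in theorem 2 of section 4 (4^3 forms for a second-level davio node, 4^12 for a davio root), and the names `D_FORMS`, `qifs_for_tree` and `TREES` are hypothetical.

```python
from itertools import product

RADIX = 4

# forms contributed by one generalized davio node in a two-level quaternary tree
# (assumed from the factors in theorem 2): level 1 is the root, level 2 its children
D_FORMS = {1: RADIX ** (3 * RADIX ** 1), 2: RADIX ** (3 * RADIX ** 0)}

def qifs_for_tree(root_is_davio, children_are_davio):
    """number of inclusive forms generated by one s/d tree.

    a shannon node contributes a single form; a davio node contributes
    the number of forms of its level."""
    count = D_FORMS[1] if root_is_davio else 1
    for is_davio in children_are_davio:
        count *= D_FORMS[2] if is_davio else 1
    return count

# every node (one root plus four children) is either shannon (False) or davio (True)
TREES = [(root, kids)
         for root in (False, True)
         for kids in product((False, True), repeat=RADIX)]

# tree shaped like figure 5(a): three shannon nodes, two davio nodes in the second level
FIG5A = qifs_for_tree(False, (True, True, False, False))
# tree shaped like figure 6(a): two shannon nodes, three davio nodes in the second level
FIG6A = qifs_for_tree(False, (True, True, True, False))

TOTAL = sum(qifs_for_tree(root, kids) for root, kids in TREES)

if __name__ == "__main__":
    print(len(TREES), FIG5A, FIG6A, TOTAL)
```

summing the per-tree counts over all 32 trees gives the total number of qifs per variable order, so a search heuristic can bound its work either per tree or over the whole family.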
4 count of the number of s/d inclusive forms over gf($p^k$) and the new $\mathrm{IF}_{n,2}$ triangles

this section provides the count for the numbers of inclusive forms, which are flattened expressions generated by the corresponding s/d trees, where these counts can be used as numerical parameters for upper-bounds in search heuristics that search for minimum gfsop expressions.

theorem 1. for gf(3) and $N$ variables, the total number of ternary ifs (tifs) per variable order is:

$$\#\mathrm{TIFs} = \sum_{k_1=0}^{3^{N-1}} \sum_{k_2=0}^{3^{N-2}} \sum_{k_3=0}^{3^{N-3}} \cdots \sum_{k_N=0}^{3^{0}} \left\{ \left[ \frac{(3^{N-1})!}{(3^{N-1}-k_1)!\,k_1!} \cdot \frac{(3^{N-2})!}{(3^{N-2}-k_2)!\,k_2!} \cdot \frac{(3^{N-3})!}{(3^{N-3}-k_3)!\,k_3!} \cdots \frac{(3^{0})!}{(3^{0}-k_N)!\,k_N!} \right] \left[ \left(3^{2(3)^{0}}\right)^{k_1} \left(3^{2(3)^{1}}\right)^{k_2} \left(3^{2(3)^{2}}\right)^{k_3} \cdots \left(3^{2(3)^{N-1}}\right)^{k_N} \right] \right\}. \tag{38}$$

proof. the following is the derivation of equation (38) to calculate #tifs per variable order. the total number of nodes for any gf(3) tree with $N$ levels ($N$ variables) equals:

$$\sum_{k=0}^{N-1} (3)^k. \tag{39}$$

for any s-type node there is only one type of node, as the branches have the possibility of a single value each. yet, for a d-type node there are $N$ possible types of nodes, where $N$ is the number of variables, which is equal to the number of levels.

fig. 5: examples of s/d trees: (a) quaternary s/d tree for two variables of order {a,b} with three shannon nodes and two generalized davio nodes (number on top of the tree: 4,096), and (b)-(c) some of the quaternary trees that it generates.
the highest possible number of forms for the d-type node is when the d-type node exists in the first (highest) level, and the lowest possible number of forms for the d-type node is when the d-type node exists in the $N$-th level (lowest level). therefore, for a certain number m of s-type nodes the following equation describes the number of the d-type nodes for $N$ variables:

$$\#S = m \ \Rightarrow \ \#D = \left[ \sum_{k=0}^{N-1} (3)^k - m \right]. \tag{40}$$

it can be shown that for gf(3) (i.e., ternary decision tree (tdt)) and $N$ levels ($N$ variables), the general formulas that count the number of d-type nodes, and the number of all possible forms for the d-type node in the k level of the $N$-level tdt are:

\begin{align}
\#D_k &= (3)^{(k-1)}, \tag{41} \\
|D_k|_{\text{per node}} &= (3)^{2(3)^{(N-k)}}, \tag{42}
\end{align}

where $\#D_k$ is the number of d-type nodes in the k level, and $|D_k|$ is the number of all possible forms for the d-type node in the k level.

fig. 6: examples of s/d trees: (a) quaternary s/d tree for two variables of order {b,a} with two shannon nodes and three generalized davio nodes (number on top of the tree: 262,144), and (b)-(c) some of the quaternary trees that it generates.

let us define an s/d tree category to be the s/d trees that have in common the same number of s-type nodes and the same number of d-type nodes within the same variable order.
also, define:

\begin{align}
\Psi &\equiv \text{number of variable orders}, \tag{43} \\
\Omega &\equiv \text{number of s/d tree categories per variable order}, \tag{44} \\
\phi &\equiv \text{number of s/d trees per category}, \tag{45} \\
\Phi &\equiv \text{number of tifs per variable order}. \tag{46}
\end{align}

from equations (39)-(42), and using some elementary count rules, we can derive by mathematical induction the following general formulas for $N$ being the number of variables:

\begin{align}
\Psi &= N!, \tag{47} \\
\Omega &= \sum_{k=0}^{N-1} (3)^k + 1, \tag{48} \\
\phi &= \frac{\left[\sum_{k=0}^{N-1} (3)^k\right]!}{\left[\sum_{k=0}^{N-1} (3)^k - k\right]!\,k!}, \quad \text{where } k = 0,1,2,3,\ldots,\sum_{k=0}^{N-1} (3)^k, \tag{49} \\
\Phi &= \sum_{k_1=0}^{3^{N-1}} \sum_{k_2=0}^{3^{N-2}} \sum_{k_3=0}^{3^{N-3}} \cdots \sum_{k_N=0}^{3^{0}} \left\{ \left[ \frac{(3^{N-1})!}{(3^{N-1}-k_1)!\,k_1!} \cdot \frac{(3^{N-2})!}{(3^{N-2}-k_2)!\,k_2!} \cdot \frac{(3^{N-3})!}{(3^{N-3}-k_3)!\,k_3!} \cdots \frac{(3^{0})!}{(3^{0}-k_N)!\,k_N!} \right] \left[ \left(3^{2(3)^{0}}\right)^{k_1} \left(3^{2(3)^{1}}\right)^{k_2} \left(3^{2(3)^{2}}\right)^{k_3} \cdots \left(3^{2(3)^{(N-1)}}\right)^{k_N} \right] \right\}. \tag{50}
\end{align}

from equations (47)-(50), it can be noticed that the total number of tifs for all variable orders is equal to $[N!]\,[\#\mathrm{TIFs}\ \text{per order}]$.

example 4. for number of variables equal to two ($N = 2$), $\Phi$ reduces to:

$$\begin{aligned}
\Phi &= \sum_{k_1=0}^{3^{1}} \sum_{k_2=0}^{1} \left\{ \frac{(3^{1})!}{(3^{1}-k_1)!\,k_1!} \cdot \frac{(3^{0})!}{(3^{0}-k_2)!\,k_2!} \left(3^{2(3)^{0}}\right)^{k_1} \left(3^{2(3)^{1}}\right)^{k_2} \right\} \\
&= \Phi|_{k_1=0,k_2=0} + \Phi|_{k_1=1,k_2=0} + \Phi|_{k_1=2,k_2=0} + \Phi|_{k_1=3,k_2=0} + \Phi|_{k_1=0,k_2=1} + \Phi|_{k_1=1,k_2=1} + \Phi|_{k_1=2,k_2=1} + \Phi|_{k_1=3,k_2=1} \\
&= \Phi_{00} + \Phi_{10} + \Phi_{20} + \Phi_{30} + \Phi_{01} + \Phi_{11} + \Phi_{21} + \Phi_{31} \\
&= 1 + 27 + 243 + 729 + 729 + 19683 + 177147 + 531441 = 730{,}000.
\end{aligned}$$

utilizing multi-valued map representation, there are $n^{\#\text{minterms}}$ different functions for n-valued input-output logic. therefore, for ternary logic, there are $3^9 = 19{,}683$ different ternary functions of two variables, and 730,000 ternary inclusive forms generated by the s/d trees. thus, on the average every function of two variables can be realized in approximately 37 ways.
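the eight terms of example 4 can be recomputed directly from the two-variable ternary sum. the python sketch below is a plain transcription of that sum and is not taken from the paper; the function name `tifs_two_variables` is hypothetical, and `math.comb` is used for the factorial ratios.

```python
from itertools import product
from math import comb

def tifs_two_variables():
    """terms of the two-variable ternary sum, keyed by (k1, k2).

    k1 counts davio nodes among the three second-level nodes,
    k2 counts davio nodes at the single root node."""
    terms = {}
    for k2, k1 in product(range(2), range(4)):
        placements = comb(3, k1) * comb(1, k2)           # factorial ratios in the sum
        forms = (3 ** (2 * 3 ** 0)) ** k1 * (3 ** (2 * 3 ** 1)) ** k2
        terms[(k1, k2)] = placements * forms
    return terms

TERMS = tifs_two_variables()
TOTAL_TIFS = sum(TERMS.values())

# average number of inclusive forms per ternary function of two variables
AVERAGE = TOTAL_TIFS / 3 ** 9

if __name__ == "__main__":
    print(TERMS)
    print(TOTAL_TIFS, round(AVERAGE, 2))
```

the individual terms match the values listed in example 4, and the ratio to the 19,683 two-variable ternary functions is slightly above 37.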
theorem 2. for gf(4) and $N$ variables, the total number of quaternary ifs (qifs) per variable order is:

$$\#\mathrm{QIFs} = \Phi = \sum_{k_1=0}^{4^{N-1}} \sum_{k_2=0}^{4^{N-2}} \cdots \sum_{k_N=0}^{4^{0}} \left\{ \frac{(4^{N-1})!}{(4^{N-1}-k_1)!\,k_1!} \cdot \frac{(4^{N-2})!}{(4^{N-2}-k_2)!\,k_2!} \cdots \frac{(4^{0})!}{(4^{0}-k_N)!\,k_N!} \left(4^{3(4)^{0}}\right)^{k_1} \left(4^{3(4)^{1}}\right)^{k_2} \left(4^{3(4)^{2}}\right)^{k_3} \cdots \left(4^{3(4)^{(N-1)}}\right)^{k_N} \right\}. \tag{51}$$

proof. a general proof that includes gf(4) as a special case will be provided later in this section.

the extension of the concept of s/d trees to higher radices of galois fields (i.e., higher than four) is a systematic and direct process that follows the same method developed for the ternary case and the quaternary case. the following example demonstrates the counts of qifs using theorem 2.

example 5. for number of variables equal to two ($N = 2$), equation (51) reduces to:

$$\begin{aligned}
\Phi &= \sum_{k_1=0}^{4^{1}} \sum_{k_2=0}^{4^{0}} \left\{ \frac{(4^{1})!}{(4^{1}-k_1)!\,k_1!} \cdot \frac{(4^{0})!}{(4^{0}-k_2)!\,k_2!} \left(4^{3(4)^{0}}\right)^{k_1} \left(4^{3(4)^{1}}\right)^{k_2} \right\} \\
&= \Phi|_{k_1=0,k_2=0} + \Phi|_{k_1=1,k_2=0} + \Phi|_{k_1=2,k_2=0} + \Phi|_{k_1=3,k_2=0} + \Phi|_{k_1=4,k_2=0} + \Phi|_{k_1=0,k_2=1} + \Phi|_{k_1=1,k_2=1} + \Phi|_{k_1=2,k_2=1} + \Phi|_{k_1=3,k_2=1} + \Phi|_{k_1=4,k_2=1} \\
&= \Phi_{00} + \Phi_{10} + \Phi_{20} + \Phi_{30} + \Phi_{40} + \Phi_{01} + \Phi_{11} + \Phi_{21} + \Phi_{31} + \Phi_{41} \\
&= 1 + 256 + 24{,}576 + 1{,}048{,}576 + 16{,}777{,}216 + 16{,}777{,}216 + 4{,}294{,}967{,}296 + 412{,}316{,}860{,}416 + 1.75921860444 \times 10^{13} + 2.81477976711 \times 10^{14} \\
&= 2.99483809211 \times 10^{14}.
\end{aligned}$$

utilizing multi-valued map representation, we can easily prove that there are $4^{16} = 4{,}294{,}967{,}296$ quaternary functions of two variables, and $2.99483809211 \times 10^{14}$ quaternary inclusive forms generated by the s/d trees. thus, on the average, every function of two variables can be synthesized (realized) in approximately 69,729 ways. this high number of realizations means that most functions of two variables are realized with less than five expansions, and all functions with at most five expansions.

4.1 general formula to compute the number of ifs for an arbitrary variable number and arbitrary galois radix gf($p^k$)

although the s/d trees and inclusive forms that were developed are for gf(4), the same concept can be directly and systematically extended to the case of radix-$n$ galois fields and $N$ variables. theorem 3 provides the total number of ifs per variable order for $N$ variables (i.e., $N$ decision tree levels) and radix $n$ of any arbitrary algebraic field, including gf($p^k$) where p is a prime number and k is a natural number ≥ 1.
the generality of theorem 3 comes from the fact that algebraic structures specify the type of operations (e.g., addition and multiplication operations) in the functional expansions but do not specify the counts, which are an intrinsic property of the tree structure and are independent of the algebraic operations performed. thus, theorem 3 is valid, among others, for galois fields of arbitrary radix.

theorem 3. the total number of inclusive forms for $N$ variables and $n$-radix galois field logic is equal to:

$$\#n\mathrm{IFs} = \Phi_{n,N} = \sum_{k_1=0}^{n^{N-1}} \sum_{k_2=0}^{n^{N-2}} \cdots \sum_{k_N=0}^{n^{0}} \left\{ \frac{(n^{N-1})!}{(n^{N-1}-k_1)!\,k_1!} \cdot \frac{(n^{N-2})!}{(n^{N-2}-k_2)!\,k_2!} \cdots \frac{(n^{0})!}{(n^{0}-k_N)!\,k_N!} \left(n^{(n-1)(n)^{0}}\right)^{k_1} \left(n^{(n-1)(n)^{1}}\right)^{k_2} \left(n^{(n-1)(n)^{2}}\right)^{k_3} \cdots \left(n^{(n-1)(n)^{N-1}}\right)^{k_N} \right\}. \tag{52}$$

proof. the following is the derivation of the general equation (52) to calculate the number of ifs per variable order. the total number of nodes for any gf($n$) tree with $N$ levels (i.e., $N$ variables) equals:

$$\sum_{k=0}^{N-1} (n)^k. \tag{53}$$

for any s-type (i.e., shannon type) node there is only one type of node, as the branches of the shannon node have the possibility of a single value each. yet, for a d-type (i.e., davio type) node there are $N$ possible types of nodes, where $N$ is the number of variables, which is equal to the number of levels. the highest possible number of forms for the d-type node occurs when the davio node exists in the first (highest) level, and the lowest possible number of forms for the d-type node occurs when the davio node exists in the $N$-th level (lowest level). therefore, for a certain number m of s-type nodes the following formula describes the number of the d-type nodes for $N$ variables:

$$\#S = m \ \Rightarrow \ \#D = \sum_{k=0}^{N-1} (n)^k - m. \tag{54}$$

it can be shown that for gf($n$) ($n$-ary decision tree with $N$ levels, i.e., $N$ variables), the general formulas that count the number of d-type nodes, and the number of all possible forms for the d-type node in the k level (where k is less than or equal to the total number of levels $N$) are, respectively:

\begin{align}
\#D_k &= (n)^{k-1}, \tag{55} \\
|D_k| &= (n)^{(n-1)(n)^{(N-k)}}, \tag{56}
\end{align}

where $\#D_k$ is the number of d-type nodes in the k level and $|D_k|$ is the number of all possible forms (per node) for the d-type node in the k level. let us define the s/d tree category to be the s/d trees that have in common the same number of s-type nodes and the same number of d-type nodes within the same variable order. let us define the following entities for an $n$-radix galois field and $N$ variables (i.e., $N$ decision tree levels):

\begin{align}
\Psi_{n,N} &\equiv \text{number of variable orders}, \tag{57} \\
\Omega_{n,N} &\equiv \text{number of s/d tree categories per variable order}, \tag{58} \\
\phi_{n,N} &\equiv \text{number of s/d trees per category}, \tag{59} \\
\Phi_{n,N} &\equiv \text{number of ifs per variable order}. \tag{60}
\end{align}

from the previous equations, and using elementary count rules, one can derive by mathematical induction the following general formulas for $N$ being the number of variables and $n$ being the field radix:

\begin{align}
\Psi_{n,N} &= N!, \tag{61} \\
\Omega_{n,N} &= \sum_{k=0}^{N-1} (n)^k + 1, \tag{62} \\
\phi_{n,N} &= \frac{\left[\sum_{k=0}^{N-1} (n)^k\right]!}{\left[\sum_{k=0}^{N-1} (n)^k - k\right]!\,k!}, \quad \text{where } k = 0,1,2,3,\ldots,\sum_{k=0}^{N-1} (n)^k, \tag{63} \\
\Phi_{n,N} &= \sum_{k_1=0}^{n^{N-1}} \sum_{k_2=0}^{n^{N-2}} \cdots \sum_{k_N=0}^{n^{0}} \left\{ \frac{(n^{N-1})!}{(n^{N-1}-k_1)!\,k_1!} \cdot \frac{(n^{N-2})!}{(n^{N-2}-k_2)!\,k_2!} \cdots \frac{(n^{0})!}{(n^{0}-k_N)!\,k_N!} \left(n^{(n-1)(n)^{0}}\right)^{k_1} \left(n^{(n-1)(n)^{1}}\right)^{k_2} \cdots \left(n^{(n-1)(n)^{(N-1)}}\right)^{k_N} \right\}. \tag{64}
\end{align}

one can note that the formula in equation (52), used to obtain the total number of inclusive forms for $N$ variables and radix $n$ of the galois field, is a very general formula that includes the ternary case in equation (38) and the quaternary case in equation (51) as special cases.
Numerical counting results obtained from equation (52) can be used as numerical bounds in search heuristics over S/D trees, in order to obtain minimal GFSOP forms for specific multi-valued logic functions. Since such a search for minimal forms is already a difficult problem in two-valued logic (for example, using binary S/D trees), especially when the number of variables is large, the search for minimal GFSOP forms in multi-valued Galois logic will be very difficult. Thus, further numerical evaluations have to be conducted in order to estimate the usefulness of the numerical bounds obtained from equation (52) in such extended multi-valued search heuristics.

Example 6. The number of QIFs over GF(4) for two variables (i.e., $N = 2$ and $n = 4$) is:

$$
\begin{aligned}
\Phi_{4,2} &= \sum_{k_1=0}^{4^{2-1}} \sum_{k_2=0}^{4^{2-2}} \left\{ \frac{\left(4^{2-1}\right)!}{\left(4^{2-1}-k_1\right)!\,k_1!} \cdot \frac{\left(4^{2-2}\right)!}{\left(4^{2-2}-k_2\right)!\,k_2!} \cdot \left(4^{(4-1)4^{0}}\right)^{k_1} \left(4^{(4-1)4^{1}}\right)^{k_2} \right\} \\
&= \Phi_{00}|_{4,2} + \Phi_{10}|_{4,2} + \Phi_{20}|_{4,2} + \Phi_{30}|_{4,2} + \Phi_{40}|_{4,2} + \Phi_{01}|_{4,2} + \Phi_{11}|_{4,2} + \Phi_{21}|_{4,2} + \Phi_{31}|_{4,2} + \Phi_{41}|_{4,2} \\
&= 1 + 256 + 24{,}576 + 1{,}048{,}576 + 16{,}777{,}216 + 16{,}777{,}216 + 4{,}294{,}967{,}296 + 412{,}316{,}860{,}416 + 1.75921860444 \times 10^{13} + 2.81477976711 \times 10^{14} \\
&= 2.99483809211 \times 10^{14}.
\end{aligned}
$$

Corollary 1. From equation (52), the count of IFs for $N$ variables and radix two is:

$$
\prod_{k=0}^{N-1} \left(1 + 2^{2^{N-k-1}}\right)^{2^{k}} = \sum_{k_1=0}^{2^{N-1}} \sum_{k_2=0}^{2^{N-2}} \cdots \sum_{k_N=0}^{2^{0}} \left\{ \frac{\left(2^{N-1}\right)!}{\left(2^{N-1}-k_1\right)!\,k_1!} \cdot \frac{\left(2^{N-2}\right)!}{\left(2^{N-2}-k_2\right)!\,k_2!} \cdots \frac{\left(2^{0}\right)!}{\left(2^{0}-k_N\right)!\,k_N!} \cdot \left(2^{(2-1)2^{0}}\right)^{k_1} \left(2^{(2-1)2^{1}}\right)^{k_2} \cdots \left(2^{(2-1)2^{N-1}}\right)^{k_N} \right\}. \tag{65}
$$

As previously mentioned, this enumeration can be useful as a terminating point of minimization algorithms for multi-valued functions. Yet, as shown, the number of combinations is so large that restriction to some particular cases of functional expressions can be more feasible.
The following section introduces a fast method to calculate the number of IFs for an arbitrary Galois field logic for functions with two variables.

4.2 The IF$_{n,2}$ triangles: fast count calculations of IFs for GF($p^k$) and two-variable functions

The count of the number of IFs can be important in many applications, especially in providing upper numerical bounds for efficient search of a minimum GFSOP. Calculating the number of inclusive forms can be very time consuming due to the time required to perform the mathematical operations in the general equation (52). This is why a fast method to generate the number of IFs is needed. Functions with two variables find important applications, such as in universal logic modules (ULMs) for pairs of control variables that generalize Shannon and Davio expansion modules [2]. Two-variable functions are also attractive in logic synthesis, since many functional decomposition methods produce two control inputs for primitive cells in a library of standard cells, such as a multiplexer with two address lines. For these reasons, Theorem 4 provides a fast computational method to calculate the number of IFs over an arbitrary radix of Galois field GF($p^k$) for two-variable functions (i.e., $N = 2$).

Theorem 4.
The following IF$_{n,2}$ triangles provide a fast computational method to calculate the number of IFs over an arbitrary radix $n$ of Galois field GF($p^k$) for two-variable functions ($N = 2$).

(a) Triangle of coefficients:

1 2 1 1 2 1
1 3 3 1 1 3 3 1
1 4 6 4 1 1 4 6 4 1
1 5 10 10 5 1 1 5 10 10 5 1
1 6 15 20 15 6 1 1 6 15 20 15 6 1
1 7 21 35 35 21 7 1 1 7 21 35 35 21 7 1

(b) Triangle of values:

$2^{0}\ \ 2^{1}\ \ 2^{2}\ \ 2^{2}\ \ 2^{3}\ \ 2^{4}$
$3^{0}\ \ 3^{2}\ \ 3^{4}\ \ 3^{6}\ \ 3^{6}\ \ 3^{8}\ \ 3^{10}\ \ 3^{12}$
$4^{0}\ \ 4^{3}\ \ 4^{6}\ \ 4^{9}\ \ 4^{12}\ \ 4^{12}\ \ 4^{15}\ \ 4^{18}\ \ 4^{21}\ \ 4^{24}$
$5^{0}\ \ 5^{4}\ \ 5^{8}\ \ 5^{12}\ \ 5^{16}\ \ 5^{20}\ \ 5^{20}\ \ 5^{24}\ \ 5^{28}\ \ 5^{32}\ \ 5^{36}\ \ 5^{40}$
$n^{0(n-1)}\ \ n^{1(n-1)}\ \ n^{2(n-1)}\ \ n^{3(n-1)}\ \ \cdots\ \ n^{(n-1)(n-1)}\ \ n^{n(n-1)}\ \ n^{n(n-1)}\ \ n^{(n+1)(n-1)}\ \ n^{(n+2)(n-1)}\ \ \cdots\ \ n^{2n(n-1)}$

Fig. 7: The IF$_{n,2}$ triangles: (a) triangle of coefficients, and (b) triangle of values for fast calculation of the number of inclusive forms for arbitrary radix Galois field and functions of two-input variables.

Proof. The proof follows directly by mathematical induction on the number of IFs over GF($p^k$) for two-variable functions. This is deduced from the general equation (52): if the IF$_{n,2}$ triangles are valid for $n = q$, then they are also valid for $n = q + 1$, for $n = p^k$ where $p$ is a prime number and $k \geq 1$.

These triangles are important because the count complexity using equation (52) for high dimensions is very high, and thus computing the counts for more than five variables in a reasonable amount of time becomes difficult. Consequently, the IF$_{n,2}$ triangles provide an alternative numerical and geometrical pattern of computing.

It can be observed that the IF$_{n,2}$ triangle of coefficients possesses a close similarity to the well-known Pascal triangle: if one omits the first two rows of the Pascal triangle and duplicates each row into another horizontally adjacent row, the IF$_{n,2}$ triangle of coefficients will be obtained.
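The relationship to the Pascal triangle can be sketched in code. The following is a minimal illustration, assuming Python 3.8 or later for `math.comb`; the function names are hypothetical, and the value row is taken to be $n^{j(n-1)}$ for $j = 0, \ldots, n$ followed by $j = n, \ldots, 2n$, as read from the triangle of values in Figure 7(b).

```python
from math import comb
from typing import List


def coefficient_row(n: int) -> List[int]:
    """Row of the coefficient triangle: Pascal row n, written twice."""
    pascal_row = [comb(n, j) for j in range(n + 1)]
    return pascal_row + pascal_row


def value_row(n: int) -> List[int]:
    """Row of the value triangle: powers of n in exponent steps of (n - 1)."""
    # The middle exponent n(n-1) appears twice, once at the end of each half.
    exponents = list(range(0, n + 1)) + list(range(n, 2 * n + 1))
    return [n ** (j * (n - 1)) for j in exponents]


def count_ifs_two_variables(n: int) -> int:
    """Sum of coefficient * value over one row of the two triangles."""
    return sum(c * v for c, v in zip(coefficient_row(n), value_row(n)))


if __name__ == "__main__":
    print(coefficient_row(3))
    for radix in (2, 3, 4):
        print(radix, count_ifs_two_variables(radix))
```

Each row costs only a handful of multiplications, in contrast to enumerating all index tuples of equation (52).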
This observation helps in creating algorithms that generate the IF$_{n,2}$ triangle of coefficients, since many efficient algorithms exist to generate the Pascal triangle.

Example 7. Utilizing the IF$_{n,2}$ triangles from Figure 7, one calculates the following numbers of inclusive forms for GF(2), GF(3) and GF(4) for two variables, where the results are identical to those obtained previously:

$$
\Phi_{2,2} = 1 \cdot 2^{0} + 2 \cdot 2^{1} + 1 \cdot 2^{2} + 1 \cdot 2^{2} + 2 \cdot 2^{3} + 1 \cdot 2^{4} = 1 + 4 + 4 + 4 + 16 + 16 = 45.
$$

$$
\Phi_{3,2} = 1 \cdot 3^{0} + 3 \cdot 3^{2} + 3 \cdot 3^{4} + 1 \cdot 3^{6} + 1 \cdot 3^{6} + 3 \cdot 3^{8} + 3 \cdot 3^{10} + 1 \cdot 3^{12} = 730{,}000.
$$

$$
\Phi_{4,2} = 1 \cdot 4^{0} + 4 \cdot 4^{3} + 6 \cdot 4^{6} + 4 \cdot 4^{9} + 1 \cdot 4^{12} + 1 \cdot 4^{12} + 4 \cdot 4^{15} + 6 \cdot 4^{18} + 4 \cdot 4^{21} + 1 \cdot 4^{24} = 2.99483809211 \times 10^{14}.
$$

The IF$_{n,2}$ triangles, where $n$ is the field radix, possess the following interesting properties:

1. The number of positions (elements) in each row of the triangles in Figure 7 is even, starting from six.
2. The sum of the elements in each row in Figure 7(a) equals the number of S/D trees per variable order.
3. The triangle in Figure 7(a) possesses even symmetry around an imaginary vertical axis in the middle.
4. The minimum number of columns required to generate the whole triangle in Figure 7(a) is three, due to even symmetry: one wing, one column neighboring the middle column, and one middle column.
5. The triangle in Figure 7(a) can be generated by the process of "shift diagonally and add diagonally" (SDAAD): shift the left wing diagonally from west to southeast and add two numbers diagonally from east to southwest, and shift the right wing diagonally from east to southwest and add two numbers diagonally from west to southeast.
6. The difference in powers in the triangle in Figure 7(b) per row element is $(n - 1)$.
7. The first number in each row of the triangle in Figure 7(b) is $n^{0}$ and the last number per row is $n^{2n(n-1)}$.
8. The middle two numbers in each row of the triangle in Figure 7(b) are always equal to $n^{n(n-1)}$.

5 Conclusions and future work

Trees for generalized Shannon-Davio (S/D) expansions over quaternary Galois radix are presented, and the corresponding count of the number of inclusive forms (IFs) per variable order for arbitrary Galois radix and arbitrary number of variables is introduced. Also, the IF$_{n,2}$ triangles are introduced as a new fast computational method to count the number of IFs for an arbitrary Galois radix and functions of two variables. Since the Galois field of quaternary radix has some interesting properties, including its implementation utilizing the well-established two-valued logic synthesis methods, the extension of the S/D trees to GF(4) is presented. In addition, the form of S/D trees is a general concept that can be used in applications for the generation of new diagrams and canonical forms, and in sum-of-product (SOP) minimization, where S/D trees can be utilized for generating forms that include minimum Galois field sum-of-products (GFSOP) circuits for binary and m-ary radices.

Future work will investigate using other complex types of literals, such as the presented Post literal (PL) and window literal (WL), to expand upon and consequently construct the corresponding new S/D trees. The utilization of the results from this research to create an efficient GFSOP minimizer for synthesis applications within the spaces of classical and reversible logic will also be conducted.

References

[1] S. B. Akers, "Binary decision diagrams," IEEE Trans. Comp., vol. C-27, no. 6, pp. 509-516, June 1978.
[2] A. N. Al-Rabadi, Reversible Logic Synthesis: From Fundamentals to Quantum Computing, Springer-Verlag, 2004.
[3] A. N. Al-Rabadi, "Reversible fast permutation transforms for quantum circuit synthesis," Proc. IEEE Int. Symposium on Multiple-Valued Logic (ISMVL), Toronto, 2004, pp. 81-86.
[4] A. N. Al-Rabadi, "Quantum circuit synthesis using classes of GF(3) reversible fast spectral transforms," Proc. IEEE Int. Symposium on Multiple-Valued Logic (ISMVL), Toronto, 2004, pp. 87-93.
[5] A. N. Al-Rabadi, "Quantum logic circuit design of many-valued Galois reversible expansions and fast transforms," J. Circuits, Systems, and Computers, World Scientific, Singapore, vol. 16, no. 5, pp. 641-671, 2007.
[6] A. N. Al-Rabadi, "Representations, operations, and applications of switching circuits in the reversible and quantum spaces," Facta Universitatis (FU) Electronics and Energetics, vol. 20, no. 3, pp. 507-539, 2007.
[7] R. E. Bryant, "Graph-based algorithms for Boolean functions manipulation," IEEE Trans. on Comp., vol. C-35, no. 8, pp. 667-691, 1986.
[8] M. Cohn, Switching Function Canonical Form over Integer Fields, Ph.D. dissertation, Harvard University, 1960.
[9] R. Drechsler, A. Sarabi, M. Theobald, B. Becker, and M. A. Perkowski, "Efficient representation and manipulation of switching functions based on ordered Kronecker functional decision diagrams," Proc. DAC, 1994, pp. 415-419.
[10] M. Escobar and F. Somenzi, "Synthesis of AND/EXOR expressions via satisfiability," Proc. Reed-Muller, 1995, pp. 80-87.
[11] B. J. Falkowski and S. Rahardja, "Classification and properties of fast linearly independent logic transformations," IEEE Trans. on Circuits and Systems-II, vol. 44, no. 8, pp. 646-655, August 1997.
[12] B. Falkowski and L.-S. Lim, "Gray scale image compression based on multiple-valued input binary functions, Walsh and Reed-Muller spectra," Proc. ISMVL, 2000, pp. 279-284.
[13] H. Fujiwara, Logic Testing and Design for Testability, MIT Press, 1985.
[14] D. H. Green, "Families of Reed-Muller canonical forms," Int. J. of Electronics, no. 2, pp. 259-280, 1991.
[15] S. Hassoun, T. Sasao, and R. Brayton (editors), Logic Synthesis and Verification, Kluwer Acad. Publishers, 2001.
[16] M. Helliwell and M. A. Perkowski, "A fast algorithm to minimize multi-output mixed-polarity generalized Reed-Muller forms," Proc. Design Automation Conference, 1988, pp. 427-432.
[17] S. L. Hurst, D. M. Miller, and J. C. Muzio, Spectral Techniques in Digital Logic, Academic Press Inc., 1985.
[18] M. G. Karpovski, Finite Orthogonal Series in the Design of Digital Devices, Wiley, 1976.
[19] C. Y. Lee, "Representation of switching circuits by binary decision diagrams," Bell Syst. Tech. J., vol. 38, pp. 985-999, 1959.
[20] C. Moraga, "Ternary spectral logic," Proc. ISMVL, pp. 7-12, 1977.
[21] J. C. Muzio and T. Wesselkamper, Multiple-Valued Switching Theory, Adam-Hilger, 1985.
[22] D. K. Pradhan, "Universal test sets for multiple fault detection in AND-EXOR arrays," IEEE Trans. Comp., vol. 27, pp. 181-187, 1978.
[23] D. K. Pradhan, Fault-Tolerant Computing: Theory and Techniques, vol. I, Prentice-Hall, 1987.
[24] S. M. Reddy, "Easily testable realizations of logic functions," IEEE Trans. Comp., C-21, pp. 1183-1188, 1972.
[25] T. Sasao (editor), Logic Synthesis and Optimization, Kluwer Academic Publishers, 1993.
[26] T. Sasao, "EXMIN2: a simplified algorithm for exclusive-OR-sum-of-products expressions for multiple-valued input two-valued output functions," IEEE Trans. Computer Aided Design, vol. 12, no. 5, pp. 621-632, 1993.
[27] T. Sasao and M. Fujita (editors), Representations of Discrete Functions, Kluwer Academic Publishers, 1996.
[28] T. Sasao, "Easily testable realizations for generalized Reed-Muller expressions," IEEE Trans. Comp., vol. 46, pp. 709-716, 1997.
[29] T. Sasao, Switching Theory for Logic Synthesis, Kluwer Academic Publishers, 1999.
[30] N. Song and M. Perkowski, "Minimization of exclusive sum of products expressions for multi-output multiple-valued input incompletely specified functions," IEEE Trans. Computer Aided Design, vol. 15, no. 4, pp. 385-395, 1996.
[31] R. S. Stanković, "Functional decision diagrams for multiple-valued functions," Proc. ISMVL, 1995, pp. 284-289.
[32] R. S. Stanković, Spectral Transform Decision Diagrams in Simple Questions and Simple Answers, Nauka, 1998.
[33] R. S. Stanković, C. Moraga, and J. T. Astola, "Reed-Muller expressions in the previous decade," Proc. Reed-Muller, Starkville, 2001, pp. 7-26.
[34] B. Steinbach and A. Mishchenko, "A new approach to exact ESOP minimization," Proc. Reed-Muller, Starkville, 2001, pp. 66-81.
[35] S. N. Yanushkevich, Logic Differential Calculus in Multi-Valued Logic Design, Technical Univ. Szczecin, 1998.
[36] I. Zhegalkin, "On the techniques of calculating sentences in symbolic logic," Math. Sb., vol. 34, pp. 9-28, 1927.
[37] I. Zhegalkin, "Arithmetic representations for symbolic logic," Math. Sb., vol. 35, pp. 311-377, 1928.

Acknowledgement: This research was performed during sabbatical leave in 2015-2016 granted from the University of Jordan and spent at Philadelphia University.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 35, No 2, June 2022, pp. 155-186
https://doi.org/10.2298/fuee2202155n
© 2022 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

Review paper

Fifty years of microprocessor evolution: from single CPU to multicore and manycore systems

Goran Nikolić, Bojan Dimitrijević, Tatjana Nikolić, Mile Stojčev
University of Niš, Faculty of Electronic Engineering, Niš, Serbia

Abstract. Nowadays microprocessors are among the most complex electronic systems that man has ever designed.
One small silicon chip can contain the complete processor, a large memory and the logic needed to connect it to the input-output devices. The performance of today's processors implemented on a single chip surpasses the performance of a room-sized supercomputer from just 50 years ago, which cost over $10 million [1]. Even the embedded processors found in everyday devices such as mobile phones are far more powerful than computer developers once imagined. The main components of a modern microprocessor are a number of general-purpose cores, a graphics processing unit, a shared cache, a memory and input-output interface, and a network on a chip to interconnect all these components [2]. The speed of the microprocessor is determined by its clock frequency and cannot exceed a certain limit. Namely, as the frequency increases, the power dissipation increases too, and consequently the amount of heating becomes critical. So, silicon manufacturers decided to design a new processor architecture, called multicore processors [3]. With the aim of increasing performance and efficiency, these multiple cores execute multiple instructions simultaneously. In this way, the amount of parallel computing, or parallelism, is increased [4]. In spite of the mentioned advantages, numerous challenges must be addressed carefully when more cores and parallelism are used. This paper presents a review of microprocessor microarchitectures, discussing their generations over the past 50 years. Then, it describes the currently used implementations of the microarchitecture of modern microprocessors, pointing out the specifics of parallel computing in heterogeneous microprocessor systems. To use the capabilities of multi-core technology efficiently, software applications must be multithreaded. The program execution must be distributed among the multi-core processors so they can operate simultaneously.
To use multi-threading, it is imperative for the programmer to understand the basic principles of parallel computing and parallel hardware. Finally, the paper provides details on how to implement hardware parallelism in multicore systems.

Key words: microprocessor, pipelining, superscalar, multicore, multithreading

Received April 13, 2022
Corresponding author: Goran Nikolić
University of Niš, Faculty of Electronic Engineering, 18106 Niš, Aleksandra Medvedeva 14, Serbia
E-mail: goran.nikolic@elfak.ni.ac.rs

1. Introduction

A microprocessor (a processor implemented on a single chip) is one of the most inventive technological innovations in electronics since the discovery of the transistor in 1948. This amazing device has involved many innovations in the field of digital electronics and has become a part of people's everyday life. The microprocessor is the central processing unit (CPU) and it is an essential component of the computer [5]. Nowadays, it is a silicon chip composed of millions up to billions of transistors and other electronic components. The CPU can execute several hundred million to billions of instructions per second. A microprocessor is preprogrammed to execute software in conjunction with memory and special-purpose chips. It accepts digital data as input and processes it according to the instructions stored in the memory [6]. The microprocessor performs numerous functions, including data storage, interaction with input-output devices, time-critical execution and others. Applications of microprocessors range from very complex process controllers to simple devices and even toys. Therefore, it is necessary for every electronics engineer to have a solid knowledge of microprocessors. This article discusses the types of microprocessors and their evolution over a period of 50 years. The evolution of microprocessors throughout history has been turbulent.
The first microprocessor, the Intel 4004, was designed by Intel in 1971. It was composed of about 2,300 transistors, was clocked at 740 kHz and delivered 92,000 instructions per second while dissipating around 0.5 watts. After that, almost every year a new microprocessor with significant performance improvements over the previous ones was launched. The growth in performance was exponential, of the order of 50% per year, resulting in a cumulative growth of over three orders of magnitude over a two-decade period [7], since $1.5^{20} \approx 3.3 \times 10^{3}$. These improvements have been driven by advances in the semiconductor manufacturing process and innovations in processor architecture [8].

Multicore processing has posed new challenges for both hardware designers and application developers. Parallel applications place new demands on the processing system. Although a multicore architecture designed for a specific target problem gives excellent results, it should be borne in mind that the main goal in computer system design should be to provide the ability to efficiently handle different types of problems. However, a single "one size fits all" architecture that is able to effectively solve all challenges has not been found so far, and many are convinced that it never will be [9].

This article presents a review of the microarchitecture of contemporary microprocessors. The discussion starts with 50 years of microprocessor history and its generations. Then, it describes the currently used microarchitecture implementations of modern microprocessors. At the end, it points to the specifics of parallel computing in heterogeneous microprocessor systems. This article is intended for an advanced course on computer architecture, suitable for graduate students or senior undergraduates in computer and electrical engineering. It can also be useful for industry practitioners in the area of microprocessor design.

2. Definition of microprocessor

The central processing unit, also known as a processor or microprocessor, is the controlling unit of a micro-computer inside a small chip. The CPU is often referred to as the brain and heart of all computer (digital) systems and is responsible for doing all the work. It performs every single action a computer does and executes programs. In essence, the CPU performs arithmetic logic unit (ALU) operations and communicates with the input/output devices and auxiliary storage units connected to it. In modern computers, the CPU is contained on an integrated circuit chip in which several functions are combined [10]. In general, all CPUs, whether single-chip microprocessors or multichip implementations, run programs by performing the following steps:

1. Read an instruction and decode it
2. Find any associated data that is needed to process the instruction
3. Process the instruction
4. Write the results out

The instruction cycle is repeated continuously until the power is turned off.

A microprocessor is built using the following three basic circuit blocks [11]:

1. Registers,
2. ALU, and
3. Control unit (CU).

Registers can exist in two forms, either as an array of static memory elements such as flip-flops, or as a portion of a random access memory (RAM), which may be of the dynamic or static type. The ALU usually provides, at a minimum, facilities for addition, subtraction, OR, AND, complementation, and shift operations. The CU of the CPU regulates and integrates computer operations. It selects and retrieves instructions from main memory in the appropriate order and interprets them to activate the other functional building blocks of the system at the appropriate time so that they perform their proper operations.

3. Generation and microprocessor history

On December 23rd, 1947, the transistor was invented at Bell Laboratories, whereas the integrated circuit was invented in 1958 at Texas Instruments. In 1971, Intel (Integrated Electronics) introduced the first microprocessor. The evolution of the CPU can be divided into five generations [12], and the characteristics of these generations are discussed below.

1st generation: The first-generation microprocessors were introduced in 1971-1972, when Intel launched the first microprocessor, the 4004, running at a clock speed of 740 kHz. Other microprocessors that belong to this generation are the Rockwell International PPS-4, Intel 8008, and National Semiconductors IMP-16. Instruction processing in these CPUs was serial. Namely, the instruction phases (fetch, decode and execute) were performed sequentially. When the current instruction was finished, the CPU updated the instruction pointer and fetched the next instruction in the program sequence, and so on for each instruction in turn.

2nd generation: This was the period from 1973 to 1978, in which very efficient 8-bit microprocessors were implemented, such as the Motorola 6800 and 6801, Intel 8085, and Zilog's Z80, which were among the most popular ones. The second generation of microprocessors is characterized by overlapped fetch, decode, and execute phases. While the first instruction is processed in the execution unit, the second instruction is decoded and the third instruction is fetched. Compared to the first generation, the use of new semiconductor technologies for chip manufacture was a novelty in the second generation. The gains from this innovation were a significant increase in instruction execution speed and chip density.

3rd generation: The third-generation microprocessors were introduced in 1978, as denoted by Intel's 8086 and the Zilog Z8000. From 1979 to 1980, the Intel 8086/80186/80286 and the Motorola 68000 and 68010 were developed. Processors of this generation were 16-bit, four times faster than the previous generation, and with performance comparable to minicomputers [13], [14], [15]. The development of a proprietary microprocessor architecture based on its own instruction set computer (ISC) was a novelty of this generation.

4th generation: The development of 32-bit microprocessors, during the period from 1981 to 1995, characterizes the fourth generation. Typical products were the Intel 80386 and Motorola's 68020/68030. Microprocessors of this generation are characterized by higher chip density, even up to a million transistors. High-end microprocessors of the time, such as Motorola's 88100 and Intel's 80960CA, could issue and retire more than one instruction per clock cycle [16], [17].

5th generation: From 1995 until now, this generation has been characterized by 64-bit processors that have high performance and run at high speeds. Typical representatives are Pentium, Celeron, dual-core and quad-core processors that use superscalar processing, and whose chip designs exceed 10 million transistors. The 64-bit processors became mainstream in the 2000s. Microprocessor speeds were limited by power dissipation. In order to avoid the implementation of expensive cooling systems, manufacturers were forced to use parallel computing in the form of multi-core and many-core processors.

Thus, the microprocessor has evolved through all these generations, and the fifth-generation microprocessors represent an advancement in specifications. Some of the processors from the fifth generation, with their specifications, will be briefly discussed in the text that follows.

4. Classification of processors

Processors can be classified along several orthogonal dimensions. Here we will point briefly to some of the most commonly used.
the first classification is based on microarchitecture specifics, the second on the market segment, the third on the type of processing, etc. in this article, we will focus on the first classification scheme. for more details on this topic the reader can consult reference [10].
4.1. classification of microarchitecture specifics
in general, we distinguish the following classifications:
4.1.1. pipelined vs non-pipelined processors
a non-pipelined processor executes only a single instruction at a given time. the start of the next instruction is delayed until the current one ends, not because of hazards but unconditionally. when the processor is free, the cpu scheduler chooses the next instruction from the pool of waiting instructions.
pipelining is a technique in which multiple instructions are overlapped during execution. the pipelined processor is divided into several processing stages (segments), which are connected to one another in the form of a pipe structure. each stage consists of an input register and a combinational circuit. the register holds the data and the combinational circuit processes it, passing the processed data to the input register of the next segment.
the pipeline technique is divided into two categories: a) arithmetic pipelines, which are mainly used for floating-point operations, multiplication of fixed-point numbers, etc.; b) instruction pipelines, in which instructions are executed by overlapping the fetch, decode and execute phases. pipelining increases instruction-level parallelism (ilp) and is used by all processors nowadays [18].
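the overlap of fetch, decode and execute phases described above can be illustrated with a small timing model. the following is a minimal sketch, assuming an ideal pipeline in which every stage takes one cycle and no stalls occur; the function names and the three-stage example are illustrative and do not come from the cited references.

```python
def non_pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Ideal pipeline: n_stages cycles to fill the pipe, then one instruction
    completes in every following cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def pipeline_diagram(n_instructions: int, stages: list) -> list:
    """Return one text row per instruction showing the stage occupied in each cycle."""
    total = pipelined_cycles(n_instructions, len(stages))
    rows = []
    for i in range(n_instructions):
        cells = ["--"] * total
        for s, name in enumerate(stages):
            cells[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(f"i{i + 1}: " + " ".join(cells))
    return rows


if __name__ == "__main__":
    for row in pipeline_diagram(4, ["IF", "ID", "EX"]):
        print(row)
    print("non-pipelined:", non_pipelined_cycles(4, 3), "cycles")
    print("pipelined:    ", pipelined_cycles(4, 3), "cycles")
```

for four instructions and three stages the model gives 12 cycles without pipelining and 6 cycles with it. real pipelines fall short of this ideal whenever stalls occur.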
table 1 difference between pipelining and non-pipelining systems
pipelining system | non-pipelining system
multiple instructions are overlapped during execution | the phases of fetching, decoding, execution and writing to memory are merged into a single unit (step)
several instructions are executed at the same time | only one instruction is executed at a time
the cpu scheduler design determines efficiency | the efficiency does not depend on the cpu scheduler
execution time is shorter (fewer cycles) | execution takes more time (a greater number of cycles)
although pipelining increases the overall system performance, several factors cause conflicts and degrade performance. the most important are the following:
1. timing variations: the processing time of instructions is not the same, because different instructions may require different operands (constants, registers, memory). accordingly, pipeline stages do not always consume the same amount of time.
2. data hazards: the problem arises when several instructions are partially executed in the pipeline and two or more of them refer to the same data. in that case, it must be ensured that the next instruction stalls until the current instruction has finished processing that data, because otherwise an incorrect result will occur.
3. branching: the next instruction is fetched during the execution of the current one. however, if the current instruction is a conditional branch, the next instruction is not known until the current one completes its data processing and determines the branch outcome.
4. interrupts: interrupts affect the execution of instructions by inserting unwanted instructions into the current instruction stream.
5. data dependency: this problem occurs when the result of the previous instruction is not yet available but is already needed as data for the current instruction.
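the data hazard and data dependency cases listed above can be made concrete with a short check for read-after-write conflicts. this is a minimal sketch under stated assumptions: the instruction format, the register names and the `window` parameter (how many preceding instructions are still in flight) are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    name: str
    dest: str              # register written by the instruction
    srcs: Tuple[str, ...]  # registers read by the instruction


def raw_hazards(program: List[Instr], window: int = 2) -> List[Tuple[str, str, str]]:
    """Return (producer, consumer, register) triples in which the consumer reads
    a register written by one of the `window` instructions just before it."""
    found = []
    for j, consumer in enumerate(program):
        for i in range(max(0, j - window), j):
            producer = program[i]
            if producer.dest in consumer.srcs:
                found.append((producer.name, consumer.name, producer.dest))
    return found


PROGRAM = [
    Instr("i1", "r1", ("r2", "r3")),  # r1 = r2 + r3
    Instr("i2", "r4", ("r1", "r5")),  # r4 = r1 - r5, needs the result of i1
    Instr("i3", "r6", ("r7", "r8")),  # r6 = r7 * r8, independent
    Instr("i4", "r9", ("r4", "r6")),  # r9 = r4 + r6, needs i2 and i3
]

if __name__ == "__main__":
    for producer, consumer, reg in raw_hazards(PROGRAM):
        print(f"{consumer} must wait for {producer} (register {reg})")
```

a pipeline would have to stall (or forward the result) for every pair reported here. instruction i3 is reported in no pair as a consumer, so it could proceed without waiting.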
the main advantages of pipelining are a higher clock frequency and increased system throughput. its disadvantages are primarily the greater complexity of the design and the increased latency of individual instructions.
4.1.2. in-order vs out-of-order processors
a processor that executes instructions sequentially usually uses resources inefficiently, resulting in poor performance. two approaches can be used to improve processor performance. the first is to execute different sub-steps of consecutive instructions simultaneously, or even to execute whole instructions simultaneously. the second is out-of-order instruction execution, in which instructions are executed in an order different from the original one [1], [19]. the order of instructions is determined by the compiler, but it is not necessary to execute them in that order. they may be: a) issued in order and completed in order; b) issued in order, but completed out of order; c) issued out of order, but completed in order; and d) issued out of order and completed out of order [1].
first- and second-generation microprocessors process instructions in order. an in-order processor performs the following steps:
1. it retrieves instructions from the program memory.
2. if the input operands are available in the register file, it sends a command to the execution unit to execute the instruction.
3. if the input operands are not available during the current clock cycle, the processor waits for them. this case occurs when the processor retrieves data from slow memory. this implies that instructions are statically scheduled.
4. the instruction is then executed by the appropriate execution unit.
5. after that, the result is written back into the destination register.
out-of-order execution is an approach used in third-, fourth-, and fifth-generation microprocessors.
this approach significantly reduces latency when executing instructions. its distinguishing feature is that the processor executes instructions in the order in which their data or operands become available, rather than in the original order generated by the compiler. in this way the processor avoids wait states, because while the current instruction is executing it can already obtain the operands for the next one. for example, let i1 and i2 be two instructions, where i1 comes first in program order. in out-of-order execution, the processor may execute i2 before i1 has completed. this improves cpu performance because it allows execution with less latency.
the steps performed by an out-of-order processor are as follows:
1. it retrieves instructions from the program memory.
2. instructions are sent to an instruction queue (also called an instruction buffer).
3. an instruction waits in the queue until its input operands are available, and leaves the queue as soon as they are. this implies that instructions are dynamically scheduled.
4. the instruction is sent to the appropriate execution unit for execution.
5. the results are then queued.
6. once all previous instructions have written their results back to the register file, the current result is written to the destination register.
the main goal of out-of-order instruction execution is to increase the amount of ilp. note, however, that the hardware complexity of out-of-order processors is significantly higher than that of in-order ones.
5. scalar vs superscalar processors
a scalar processor is one where instructions are executed in a pipeline, as presented in fig. 1a), but only a single instruction can be fetched or decoded in a single cycle. a superscalar processor, on the other hand, can have multiple parallel instruction pipelines [20], [21]. a 2-way superscalar processor (see fig. 1b)) can fetch two instructions per cycle and supports two parallel pipelines.
fig. 1 scalar processor (a), superscalar processor (b)
the terms "scalar" and "superscalar" are not to be confused with "single-core" and "multi-core". scalars are single-core processors, while superscalars may be either single- or multi-core [22]. the key point is that scalars cannot perform more than one operation (i.e., carry out more than one instruction) per clock cycle, whereas the 2-way superscalar of fig. 1b) can in some cases perform up to two instructions per cycle. this means that if you have a cpu with three cores, one of them being an old scalar processor, and you run an application that utilizes all three cores, the old third core will be no more than half as fast as it would be if it were superscalar. the main thing to remember is that certain instruction sets are better suited to certain optimizations. superscalars can execute basic operations such as add and load on separate registers simultaneously, whereas a scalar processor has to complete one operation before moving on to the next. for example, a scalar processor may be able to run multiple threads, but they will all share the same core and therefore only run as fast as the slowest thread. superscalars can provide much higher performance because each thread gets its own core/execution unit.
the main challenge in superscalar processing is how many instructions can be issued per cycle. if a processor can issue k instructions per cycle, it is called a k-degree superscalar processor. for a superscalar processor to take full advantage of parallelism, k instructions must be executable in parallel. so, the key idea of a superscalar processor is to exploit more instruction-level parallelism (ilp) [18].
the implementation of superscalar processing requires special hardware (see fig. 2 for more details). the width of the data path grows with the degree of superscalar processing. for instance, if a 2-degree superscalar processor is used and the instruction size is 32 bits, then 64 bits are fetched from the instruction memory at a time and 2 instruction registers are required.
fig. 2 comparison between a scalar and a superscalar processor. notice: the superscalar processor implements one pipeline dedicated to memory access and one pipeline for arithmetic operations.
the main feature of superscalar processors is that they issue more than one instruction in each cycle (usually up to 8 instructions).
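the requirement that k instructions be executable in parallel can be illustrated with a small issue model. the sketch below is only an assumption-laden illustration: it models in-order issue, single-cycle execution, and a rule that an instruction cannot issue in the same cycle as an instruction whose result it reads or overwrites. the program and the function name `issue_groups` are hypothetical.

```python
def issue_groups(program, k):
    """Group instructions into per-cycle issue packets for a k-degree,
    in-order superscalar processor.

    program: list of (dest, srcs) tuples in program order.
    Returns a list of groups; each group lists the instruction indices
    issued together in one cycle.
    """
    groups = []
    i = 0
    while i < len(program):
        group = [i]
        written = {program[i][0]}
        j = i + 1
        while j < len(program) and len(group) < k:
            dest, srcs = program[j]
            # stop at the first instruction that depends on this cycle's group
            if dest in written or any(s in written for s in srcs):
                break
            group.append(j)
            written.add(dest)
            j += 1
        groups.append(group)
        i = j
    return groups


PROGRAM = [
    ("r1", ("r2", "r3")),
    ("r4", ("r5", "r6")),
    ("r7", ("r1", "r4")),   # needs instructions 0 and 1
    ("r8", ("r7", "r2")),   # needs instruction 2
    ("r9", ("r2", "r3")),
    ("r10", ("r5", "r6")),
]

if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(f"k={k}: {len(issue_groups(PROGRAM, k))} cycles")
```

for this program the model needs 6, 4, 3 and 3 issue cycles for k = 1, 2, 4 and 8. widening the issue logic beyond the parallelism available in the instruction stream brings no further gain.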
note that instructions can be reordered to make better use of the processor architecture. in order to reduce data dependency in superscalar processing, more complex parallel hardware is necessary. hardware parallelism ensures the availability of more resources and is one way to exploit parallelism. an alternative is to expose ilp by transforming the source code with an optimizing compiler. typical commercial superscalar processors are the ibm rs/6000, dec 21064, mips r4000, powerpc, pentium, etc.
very-long-instruction-word (vliw) processors are a variant of superscalar processors because they can process multiple instructions in all pipeline stages [23]. the vliw processor has the following features: (a) it is an in-order processor; (b) the binary code defines which instructions will be executed in parallel.
the size of the vliw instruction word can be hundreds of bits. the compiler forms the layout of the vliw instruction by compacting the instruction words of the source program. the processor must have sufficient hardware resources to execute all the operations specified in a vliw word simultaneously. for instance, as shown in fig. 3, one vliw instruction word is compacted to contain a load/store (l/s) operation, an fp addition, an fp multiplication, a branch, and an integer alu operation.
fig. 3 a) vliw instruction word; b) vliw processor
all functional units (shown in fig. 3b)) are implemented according to the vliw instruction word (given in fig. 3a)). a large register file is shared by all functional units in the processor. the parallelism in instructions and data flow is specified at compile time. trace scheduling is used for handling branch instructions. it is based on predicting branch decisions at compile time, with the prediction relying on heuristic methods.
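the compaction of operations into a vliw word can be sketched as a simple packing problem. the following is a rough illustration only, assuming one functional unit per slot type (the five slots named for fig. 3) and a greedy, order-preserving packer. real vliw compilers use far more elaborate scheduling, such as the trace scheduling mentioned above. the operation names, registers and helper functions are hypothetical.

```python
SLOTS = ("ls", "fp_add", "fp_mul", "branch", "int_alu")  # one unit per slot

# (name, unit, destination register or None, source registers): hypothetical program
OPS = [
    ("load f1", "ls", "f1", ("r1",)),
    ("load f2", "ls", "f2", ("r2",)),
    ("addi r1", "int_alu", "r1", ("r1",)),
    ("fmul f3", "fp_mul", "f3", ("f1", "f2")),
    ("fadd f4", "fp_add", "f4", ("f3", "f1")),
    ("bnez r1", "branch", None, ("r1",)),
]


def pack_bundles(ops):
    """Greedily pack operations, in program order, into VLIW bundles.

    A new bundle is started when the required slot is already taken or when
    the operation depends on a result produced inside the current bundle.
    """
    bundles, current, written = [], {}, set()
    for name, unit, dest, srcs in ops:
        slot_taken = unit in current
        depends = dest in written or any(s in written for s in srcs)
        if slot_taken or depends:
            bundles.append(current)
            current, written = {}, set()
        current[unit] = name
        if dest is not None:
            written.add(dest)
    if current:
        bundles.append(current)
    return bundles


def format_bundle(bundle):
    """Render a bundle as one long instruction word; empty slots become nops."""
    return " | ".join(bundle.get(slot, "nop") for slot in SLOTS)


if __name__ == "__main__":
    for bundle in pack_bundles(OPS):
        print(format_bundle(bundle))
```

in this example six operations occupy four bundles of five slots each, so most slots hold nops. this is one way to see how unfilled slots make vliw code less dense.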
table 2 gives a comparison between vliw and superscalar processors from the aspect of ilp implementation. in conclusion, when we compare vliw and superscalar processors, we can say that vliw differs from a superscalar machine in the following: a) the instruction decoding process is simpler; b) ilp is higher but code density is lower; and c) object-code compatibility with a larger family of nonparallel machines is lower.
table 2 instruction-level parallelism: vliw vs superscalar
superscalar vliw instruction scheduling mechanism is implemented with complex hardware more functional units are needed instruction code word is larger complex compiler is needed out-of-order execution ▪ there is a logic that checks the dependencies between parallel instructions and checks the hazards when working functional units if a compiler that performs efficient code optimization is not implemented, then more effort is needed to create executable code longer execution time and higher power consumption are potential consequences hardware is simpler due to the use of predicted execution to avoid branching more efficient execution of pipeline-dependent code
a simple hardware structure and instruction set are the crucial advantages of the vliw architecture. the vliw processor is suitable for scientific applications, where the program behavior is more predictable.
super-pipelining is an alternative to the superscalar approach for improving performance. in this approach, pipeline stages are segmented into n distinct non-overlapping parts, each of which can execute in 1/n of a clock cycle. in other words, super-pipelining divides the stages of a pipeline into sub-stages and thus increases the number of instructions that are active in the pipeline at a given moment. by dividing each stage into two, the cycle period τ is halved to τ/2, so that at maximum capacity a result is produced every τ/2 seconds (see fig. 4).
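the effect of splitting stages, or of duplicating them, on total execution time can be estimated with a simple cycle-count model. this is a minimal sketch, assuming an ideal pipeline without stalls, a 5-stage base pipeline and a workload of 6 independent instructions. these two values and the function names are assumptions chosen for illustration.

```python
import math


def base_cycles(n: int, stages: int) -> float:
    """Base pipeline: one instruction issued per clock cycle."""
    return stages + n - 1


def superpipelined_cycles(n: int, stages: int, degree: int) -> float:
    """Each stage is split into `degree` sub-stages of 1/degree cycle each;
    one instruction is issued per sub-stage time."""
    return (stages * degree + n - 1) / degree


def superscalar_cycles(n: int, stages: int, width: int) -> float:
    """`width` instructions are issued together in every clock cycle."""
    return stages + math.ceil(n / width) - 1


if __name__ == "__main__":
    n, stages = 6, 5
    base = base_cycles(n, stages)
    for label, cycles in (
        ("base pipeline", base),
        ("super-pipelined (degree 2)", superpipelined_cycles(n, stages, 2)),
        ("superscalar (2-way)", superscalar_cycles(n, stages, 2)),
    ):
        print(f"{label}: {cycles} cycles, speedup {base / cycles:.2f}")
```

under these assumptions the model yields 10, 7.5 and 7 cycles respectively. the gap between the two improved variants comes from the half-cycle granularity of the super-pipelined start-up.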
for a given architecture and the corresponding instruction set there is an optimal number of pipeline stages; increasing the number of stages beyond this limit reduces the overall performance [24].
by analyzing fig. 4 we can observe the following:
1. base pipeline: i) issues one instruction per clock cycle; ii) can perform one pipeline stage per clock cycle; iii) several instructions are executing concurrently; iv) only one instruction is in its execution stage at any one time; and v) the total time to execute 6 instructions is 10 cycles.
2. super-pipelined implementation: i) capable of performing two pipeline stages per clock cycle; ii) each stage is split into two non-overlapping parts; iii) each part executes in half a clock cycle; iv) the total time to execute 6 instructions is 7.5 cycles; v) the theoretical reduction in execution time is 1 − 7.5/10 = 25% (a speedup of 10/7.5 ≈ 1.33).
3. superscalar implementation: i) capable of executing two instances of each stage in parallel; ii) the total time to execute 6 instructions is 7 cycles; and iii) the theoretical reduction in execution time is 1 − 7/10 = 30% (a speedup of 10/7 ≈ 1.43).
fig. 4 comparison of superscalar and super-pipeline approaches
from fig. 4 we can notice that: a) the super-pipelined and the superscalar implementations have the same number of instructions executing at the same time; b) nevertheless, the super-pipelined processor falls behind the superscalar processor; and c) parallelism enables greater performance. so, the superscalar architecture is the better solution for further improving speed.
6. vector processor
a vector is an ordered set of scalar data items of the same type, which can be floating-point numbers, integers, or logical values. vector processing is arithmetic or logical computation applied to vectors, whereas in scalar processing only one data item or one pair of data items is processed at a time. therefore, vector processing is faster compared to scalar processing.
the conversion of scalar code to vector form is called vectorization. a vector processor is a special accelerator building block designed to handle vector computations [25], [26].
there are the following types of vector instructions:
a) vector-vector instructions: vector operands are fetched from vector registers and, after processing, the generated results are stored in another vector register. these instructions are described by the following function mappings:
$p_1 : v_1 \to v_2$
$p_2 : v_1 \times v_2 \to v_3$
for example, the p1 type denotes the vector square root, and p2 the addition (or multiplication) of two vectors.
b) vector-scalar instructions: scalar and vector operands are fetched and the result is stored in a vector register. these instructions are described by the following function mapping:
$p_3 : s \times v_1 \to v_2$, where $s$ is the scalar item.
for example, the p3 type denotes vector-scalar subtraction or division.
c) vector-reduction instructions: this type of instruction is used when an operation on vectors is reduced to a scalar item as the result. these instructions are described by the following function mappings:
$p_4 : v_1 \to s_1$
$p_5 : v_1 \times v_2 \to s_2$
for example, the p4 type corresponds to finding the maximum, minimum or sum of all the elements of a vector, while p5 is used for the dot product of two vectors.
d) vector-memory instructions: this type of instruction is used when vector operations with memory m are performed. these instructions are described by the following function mappings:
$p_6 : m_1 \to v_1$
$p_7 : v_1 \to m_2$
for example, the p6 type corresponds to a vector load and p7 to a vector store operation.
typical examples of vector operations are the following, where $\circ$ stands for the element-wise operation named in each case:
1. $v_2 \leftarrow \overline{v_1}$; complement all elements
2. $s \leftarrow v_1$; min, max, sum
3. $v_3 \leftarrow v_2 \circ v_1$; vector addition, multiplication, division
4. $v_2 \leftarrow v_1 \circ s$; multiply or add a scalar to a vector
5. $s \leftarrow v_2 \cdot v_1$; calculate an element of a matrix
vector processing with pipelining: because the same computation is repeated on different operands, vector processing is very suitable for pipelining. a vector processor performs better when the vectors are longer, but long vectors cause problems in storing and manipulating them.
efficiency of vector processing over scalar processing: as already mentioned, a sequential computer processes a vector item by item. therefore, to process a vector of length n on a sequential computer, the computation must be divided into n scalar steps that are executed one by one. for example, consider the addition of two vectors of length 1000:
$a + b \to c$
the sequential computer implements this operation with 1000 add instructions in the following way:
$c[1] = a[1] + b[1]$
$c[2] = a[2] + b[2]$
. . .
$c[1000] = a[1000] + b[1000]$
a vector processor does not split the computation into 1000 add statements to perform the identical operation, because it has a set of vector instructions that allow the operation to be specified in a single vector instruction:
$a(1{:}1000) + b(1{:}1000) \to c(1{:}1000)$
the comparative execution of the addition by a scalar and a vector processor is presented in fig. 5.
fig. 5 scalar vs vector operations execution
thus, the main advantage of vector over scalar processing is the elimination of the overhead caused by loop control.
properties of vector instructions:
a) a single instruction implies many operations, which reduces the number of instruction fetches and decodes.
b) the operations are independent of each other: i) simple design; ii) multiple operations can be run in parallel.
c) data hazards have to be checked once per vector instruction and not for each element operation.
d) control hazards are reduced because there are fewer branches.
e) the memory access pattern is known.
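the difference between item-by-item processing and a single vector instruction can be mimicked in software. the sketch below is only an emulation under stated assumptions: python lists stand in for vector registers, and the `issued` counter is a hypothetical stand-in for the number of instructions that would be fetched and decoded. it does not model real vector hardware.

```python
def scalar_add(a, b):
    """Element-by-element addition: one add instruction per element."""
    c, issued = [], 0
    for x, y in zip(a, b):
        c.append(x + y)
        issued += 1          # each iteration fetches and decodes its own add
    return c, issued


def vector_add(a, b):
    """Vector-vector instruction: the whole addition is one instruction."""
    return [x + y for x, y in zip(a, b)], 1


def vector_scalar_mul(s, v):
    """Vector-scalar instruction: multiply every element by the scalar s."""
    return [s * x for x in v]


def vector_dot(v1, v2):
    """Vector-reduction instruction: two vectors are reduced to one scalar."""
    return sum(x * y for x, y in zip(v1, v2))


if __name__ == "__main__":
    a = list(range(1000))
    b = list(range(1000, 2000))
    c_scalar, n_scalar = scalar_add(a, b)
    c_vector, n_vector = vector_add(a, b)
    print("same result:", c_scalar == c_vector)
    print("instructions issued:", n_scalar, "vs", n_vector)
    print("dot product:", vector_dot([1, 2, 3], [4, 5, 6]))
```

both routines produce the same vector, but the scalar version accounts for 1000 add instructions (plus the loop control that is not counted here), while the vector version accounts for one.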
nowadays, a large number of microprocessors contain a set of instructions that operate on relatively small vectors (e.g., up to 8 single-precision fp elements in the intel avx extensions [27]). these instructions are often referred to as simd (single instruction, multiple data) instructions.
table 3 shows the comparative properties (advantages vs disadvantages) of vector processors.
table 3 comparative properties of vector processors
advantages of vector processors:
▪ instruction bandwidth is lower: fetch and decode phases are reduced
▪ main memory addressing is easier: load/store units use known patterns for memory access
▪ memory wastage is eliminated: no cache misses, latency only occurs during vector loading
▪ control hazard logic is simple: loop-related control hazards are eliminated
▪ scalable platform: a larger number of hardware resources increases performance
▪ code size is reduced: n operations are described by a single instruction
disadvantages of vector processors:
▪ works (only) if parallelism is regular (data/simd parallelism)
▪ very inefficient if parallelism is irregular
▪ memory (bandwidth) can easily become a bottleneck, especially if: a) the compute/memory operation balance is not maintained; b) data is not mapped appropriately to memory banks
vector processing applications include problems that can be efficiently formulated in terms of vectors, such as: a) long-range weather forecasting; b) petroleum exploration; c) seismic data analysis; d) medical diagnosis; e) aerodynamics and space flight simulations; f) artificial intelligence and expert systems; g) mapping the human genome; and h) image processing.
7. multicore processors
nowadays, large uniprocessors no longer scale in performance, because conventional superscalar techniques for instruction issue allow only a limited amount of parallelism to be extracted from the instruction flow.
in addition, it is not possible to further increase the clock speed, because the power dissipation would become prohibitive. for more than thirty years (the period 1972-2003, often called the era of intensive microarchitecture processor design), a variety of modifications were made in pursuit of one of two goals: 1) increasing the number of instructions that can be issued per cycle; and 2) increasing the clock frequency faster than moore's law and dennard's rule would normally allow [28]. pipelining and super-pipelining of individual instruction execution into a sequence of stages allowed designers to increase clock rates. superscalar processors were designed to execute multiple instructions from an instruction stream in each cycle. they work by dynamically examining sets of instructions from the stream to find those that are candidates for parallel execution in each cycle. these instructions can often be executed out of order with respect to the original sequence. this concept is referred to as instruction-level parallelism (ilp). typical instruction streams have only a limited amount of usable parallelism among instructions [1], [29], so superscalar processors that can issue more than about four instructions per cycle achieve very little additional benefit on most applications.
today, advances in processor core development have slowed dramatically because of a simple physical limit: power dissipation. in modern pipelined and superscalar processors, typical high-end power exceeds 100 w. in order to bypass these design constraints, processor manufacturers are now switching to a new microprocessor design paradigm: multicore (also called chip multiprocessor, or cmp for short) and many-core.
a multi-core processor is a single computing component with two or more independent processing units, called "cores", each made up of computation units and caches [30]. the cores are the functional units that read and execute program instructions. multiple cores can run multiple instructions (ordinary cpu instructions) at the same time, increasing overall speed for programs suitable for parallel computing. coupling multiple cores on a single chip is intended to achieve the performance of a single faster processor. the individual cores of a multi-core processor do not need to run as fast as the highest-performing single-core processors, but in general they improve overall performance by executing more tasks in parallel [31]. the increase in performance can be seen by considering the way single-core and multi-core processors execute programs. a single-core processor that runs multiple programs assigns a time slice to each program and runs them one after another. if one of the processes takes longer, all the other processes start to lag behind. with multi-core processors, however, if there are multiple tasks that can run in parallel, each of them is executed by a separate core at the same time. this improves performance.
depending on the application requirements, multi-core processors can be implemented in different ways: as a group of heterogeneous cores, a group of homogeneous cores, or a combination of both. in a homogeneous core architecture, all cores in the processor are identical [32], and in order to improve overall performance they break down a computationally intensive application into less intensive applications and run them in parallel [4]. significant advantages of a homogeneous multi-core processor are reduced design complexity, reusability, and reduced verification effort [33]. heterogeneous cores, on the other hand, consist of dedicated application-specific processor cores that run various applications [34].
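how much a program gains from additional cores depends on how much of it is suitable for parallel execution. a common way to estimate this is amdahl's law, which is brought in here as a standard model and is not named in the text above. the following is a minimal sketch, assuming the parallel part scales perfectly with the number of cores and that communication overhead is ignored; the function name and the sample fractions are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Estimated speedup of a program on `cores` identical cores when only
    `parallel_fraction` of its execution time can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for fraction in (0.5, 0.9, 0.99):
        row = ", ".join(
            f"{cores} cores: {amdahl_speedup(fraction, cores):.2f}x"
            for cores in (2, 4, 16)
        )
        print(f"parallel fraction {fraction}: {row}")
```

in this model a program that is only half parallel cannot exceed a speedup of 2 no matter how many cores are added, while a fully parallel program scales with the core count.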
cores in multi-core systems, as well as single-processor systems, can implement architectures such as vliw, superscalar, vector, or multithreading. multicore processors are used in many application domains, such as general purpose, embedded, multimedia, network, digital signal processing (dsp) and graphics (gpu). they can be harnessed as complex cores that address computationally intensive applications, or a remedial core that deals with less computationally intensive applications [24]. software algorithms and their implementations greatly influence the performance improvement obtained by using multi-core processors. in particular, possible gains are limited by the fraction of the software that can run in parallel simultaneously on multiple fifty years of microprocessor evolution: from single cpu to multicore and manycore systems 169 cores. at best, parallel problems can achieve acceleration factors close to the number of cores, or even more if the problem is sufficiently split to fit in the local core cache(s). in this way the number of accesses to much slower main system memory are reduced. however, most applications are not so fast without the effort of programmers to reshape the whole problem. currently, software parallelization is a significant ongoing research topic [4]. a comparison between single and multiple-core processor is given in table 4. table 4 comparation of single-core processor and multi-core processor parameter single-core processor multi-core processor number of cores on a die single multiple instruction execution single instruction is executed at a time multiple instructions are executed by using multiple cores gain speed up every program speed up the programs intended for multi-core processor performance depend on the clock frequency depend on the clock frequency, number of cores and program examples 80386, 80486, amd 29000, amd k6, pentium i, ii, iii etc. core-2-duo, athlon 64 x2, i3, i5, i7 etc. 7.1. 
multicore topologies in the sequel we will point out to four types of multicore topologies: symmetric (or homogeneous), asymmetric, dynamic, and composed (alternatively referred as "fused" or "heterogeneous") [20], [35]. the symmetric multicore topology is composed of multiple copies of the same core that functioning at the same frequency and voltage. in this topology, the resources such as the power and the area budget, are evenly distributed on all cores. in figure 6a) symmetric multicore processor is presented where each block is a basic core equivalent (bce) and contains l1 and l2 caches as constituents. l3 cache and on-chip network are not presented. the asymmetric multicore topology is composed of one large monolithic core and a number of identical small cores. this topology uses a large high-performance core that performs the serial part of the code and uses a number of small cores as well as the large core to take advantage of the parallel part of the code. in figures 6b) and 6c) asymmetric multicore processors are presented with: b) one complex core and 12 bces; c) two complex cores and 8 bces. the dynamic multicore topology is a modification of the asymmetric topology. parallel parts of the code are executed by small cores while the large core is off, and the serial part of the code is executed only on the large core, while small cores are inoperative. in figures 6d) and 6e) dynamic multicore processors are presented with: d) 16 bces or one large core; e) four cores and frequency scaling using power budget of 8 bces (currently one core is at full core thermal design point (tdp), two cores are at 0.5 core tdp, and one core is switched off). 170 g. nikolić, b. dimitrijević, t. nikolić, m. stojčev fig. 
6 a) symmetric multicore processor; b) and c) asymmetric multicore processors; d) and e) dynamic multicore processors; f) heterogeneous multicore

the heterogeneous (composed) multicore topology consists of a set of small cores that are logically combined to form a large high-performance core for serial code execution. depending on whether the code is serial or parallel, exclusively the large core or the small cores are used. figure 6f) presents a heterogeneous multicore with a large core, four bces, and two accelerators or co-processors of each of the types a, b, d and e.

some of the limits of multicore processors are the following [36]:
1. present-day cmps (chip multiprocessors) are designed to exploit both instruction-level parallelism (ilp) and thread-level parallelism (tlp). in such solutions, the number of processors and the complexity of each processor are fixed at design time.
2. performance improvement achieved mainly by increasing the number of cores does not always lead to an effective design solution, due to: a) the dark silicon problem (not all cores can be powered at the same time); and b) diminishing returns from tlp.

nowadays, multicore processors are everywhere, and single-threaded programs are no longer an option. in essence, we moved from single core to multicore not because the software community was ready for concurrency but because the hardware community could not afford to neglect the power issue.

today, multi-core technology is commonly used in most personal electronic devices. therefore, creating parallel programs is crucial for taking advantage of the multiple cores on such machines, achieving high performance and enabling large-scale data processing. in addition to multicore technology (mainly realized as shared memory systems) [37], parallel computing can take the form of distributed systems.
unlike multicore shared memory systems, distributed systems can solve problems that do not fit in the memory of a single machine. however, communication and data replication in distributed systems cause high additional overheads. for programs that fit in memory, multicores with shared memory are therefore more efficient than distributed memory systems. this efficiency is reflected in reduced hardware, cost, and power consumption [3].

today’s multicore cpus use most of their transistors for processing logic and cache memory. during operation most of the power is consumed by non-computational units. an alternative strategy is heterogeneous architectures, i.e. multi-core architectures combined with accelerator cores. accelerators are specialized hardware cores designed with fewer transistors and operating at lower frequencies than traditional cpus, which nevertheless increase system performance.

8. multithreaded processors

as the hardware complexity and capabilities of modern processors have increased, so have the demands for higher performance. this has led to a corresponding need to use cpu resources more efficiently. the main idea is that the time during which the processor would otherwise be idle, waiting to perform certain tasks, is used to perform other activities. to achieve this goal, operating system designers introduced support for running pieces of programs called threads. threads are small tasks that can run independently [38]. during execution, each thread gets its own time slice. as a consequence, the processor time is efficiently utilized. fig. 7 shows multithreaded execution on a single processor and on a two-way superscalar processor.

fig. 7 multithreading in a cpu: (a) single processor running a single thread. (b) single processor running several threads. (c) two-way superscalar processor running a single thread.
(d) two-way superscalar processor running multiple (two) threads.

8.1. difference between multitasking, multiprocessing and multithreading

several threads make up one process (task) and share access to processor resources. this operating system concept, known as multithreading, allows one thread to run while another is waiting for an event. contemporary commercially available pcs and servers, mainly based on intel or amd processors, that run microsoft windows support multithreading [28].

each process provides the resources needed to execute a program. a process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. a thread is the entity within a process that can be scheduled for execution. all threads of a process share the previously mentioned resources. in addition, each thread maintains exception handlers, a scheduling priority, thread-local storage, a unique thread identifier, and a set of structures that the system uses to save the thread context. the thread context consists of a set of machine registers, a kernel stack, a thread environment block, and a user stack.

each thread is characterized by: 1) thread id; 2) register state, including pc and sp; 3) stack; 4) signal mask; 5) priority; and 6) thread-private memory. threads share the instructions and data of the process to which they belong. changes that any thread makes to shared data are visible to all threads in the process. threads in the same process can interact with each other without involving the operating system.

multitasking is a mode of operation where the cpu performs multiple tasks at the same time (see fig. 8).
it is characterized by the cpu switching between multiple tasks so that users can interact with each program while it is running. unlike in multithreading, in multitasking the processes have separate memory and resources. in multitasking, cpu switching between tasks is relatively fast.

fig. 8 multitasking operating system for single processor

multithreading is an operating mode in which many threads are active during the execution of a process. in this manner, higher computing power is achieved. in multithreading (see fig. 9), the cpu executes many threads that are part of one process at a time. these threads share the same memory and resources. a property of multithreading is that two or more threads can run concurrently. therefore, multithreading is also referred to as concurrency [39].

fig. 9 multithreading system for single processor

the difference between multitasking and multithreading [40] is presented in table 5.

table 5 difference between multitasking and multithreading

no. | multitasking | multithreading
1. | cpu performs many user tasks | many threads are created within a process
2. | cpu switches among the tasks | cpu switches among the threads
3. | processes use separate memory | threads of a process share the same memory
4. | multiprocessing can be involved | multiprocessing cannot be involved
5. | cpu executes many tasks at a time | cpu executes many threads at a time
6. | each process has separate resources | threads of a process share the same resources
7. | multitasking is slower | multithreading is faster
8. | termination of a process takes longer | termination of a thread is shorter

a computer system composed of two or more processors is called a multiprocessing system (see fig. 10). in this way, the computing speed of the system is increased. in such systems, each processor has its own registers and main memory. the division of processes and resources among processors is done dynamically.
the main characteristics of multiprocessing are the following: i) the organization of memory determines the type of multiprocessing; ii) system reliability is improved; and iii) decomposing programs into tasks that can be executed in parallel leads to a performance increase. the advantages of multiprocessing are the following: a) more work can be performed in a shorter time; b) code is simple; c) the system uses multiple cpus and cores; d) synchronization is simplified; e) child processes are interruptible/killable; and f) it is cost-efficient because processors share resources. the disadvantages of multiprocessing are: a) inter-process communication involves time overhead; and b) more memory is needed.

the main characteristics of multithreading are the following: j) each thread is executed in parallel with the others; and jj) program performance is increased since threads share the same memory area.

fig. 10 multiprocessing system

the advantages of multithreading are the following: a) the address space is shared by all threads; b) less memory is needed; c) communication between threads is cost-efficient and fast; d) context switching is fast; e) it is suitable for input/output-oriented applications; and f) the switching time between two threads is short. the disadvantages of multithreading are the following: a) threads are not interruptible/killable; b) manual synchronization is often necessary; and c) program code is harder to understand, and testing and debugging are harder due to race conditions.

both multiprocessing and multithreading, as operating modes, increase computing power [41]. a multiprocessing system is composed of multiple processors, whereas a multithreaded process comprises multiple threads.
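the need for manual synchronization listed among the disadvantages of multithreading can be illustrated with a minimal python sketch. it assumes only the standard `threading` module; the function name `run_counter` and its parameters are hypothetical and chosen for illustration. several threads of one process update the same variable, and a lock turns each read-modify-write step into an atomic one; without the lock, updates could be lost to a race condition.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # "counter += 1" is a read-modify-write sequence, not an atomic
            # step; the lock lets only one thread perform it at a time.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every thread has finished
    return counter


if __name__ == "__main__":
    print(run_counter())
```

all threads see the same `counter` because they live in one address space, which is the shared-memory property discussed above; separate processes would each work on their own copy.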
table 6 multiprocessing vs multithreading

multiprocessing | multithreading
multiple cpus increase computing power | multiple threads of a single process increase computing power
multiple processes are executed concurrently | multiple threads of a single process are executed concurrently

9. multithreading: execution model

about one decade after software architects introduced threads, hardware architects designed multithreaded processors, which can run more than one thread on some of their cores at the same time. a multithreaded architecture is one in which a single processor has the ability to follow multiple streams of execution without the aid of software context switches. for a conventional processor to stop executing one thread and start executing instructions from another thread, special software is required. the role of this software is to save the state of the running thread to memory (usually to the stack) and then load the state of the selected other thread into the processor. this process usually requires hundreds (or thousands) of cycles, especially if the operating system is involved. a multithreaded architecture, on the other hand, can access the state of multiple threads in, or near, the processor core. this allows it to switch quickly between threads, and potentially to use processor resources more efficiently and effectively [42] (see fig. 11 for an illustration).

fig. 11 multithreaded pipeline example

multicore and multithreading can be used simultaneously because they are two orthogonal concepts. for instance, the intel core i7 processor has multiple cores, and each core is two-way multithreaded [43]. when multiple threads are executed simultaneously, they use mostly different hardware resources in a multicore processor, while they share most of the hardware resources in a multithreaded processor.
in order to achieve this, a multithreaded architecture must be able to store the state of multiple threads in hardware. this storage is referred to as hardware contexts, where the number of supported hardware contexts defines the level of multithreading (the number of threads that can share the processor without software intervention). the state of a thread is primarily composed of the program counter (pc), the contents of general-purpose registers, and special-purpose and program status registers. it does not include memory (because that remains in place), or dynamic state that can be rebuilt or retained between thread invocations (branch predictor, cache, or tlb contents).

9.1. instructions issue

multithreaded processors are divided into two groups depending on how many threads can issue instructions in a given cycle. when instructions can be issued only from a single thread in a given cycle, explicit multithreading is used. in that case, the following two main techniques can be applied [44] (see fig. 12): i) coarse-grain multithreading (cgmt) or blocked multithreading (bmt); and ii) fine-grain multithreading (fgmt) or interleaved multithreading (imt). when instructions can be issued from multiple threads in a given cycle, simultaneous multithreading (smt) is used.

fig. 12 explicit multithreading

coarse-grain multithreading, also called blocked multithreading or switch-on-event multithreading, has multiple hardware contexts associated with each processor core [45]. the instructions of a thread are executed successively, but when an event occurs that may cause latency, a context switch is performed. instructions of one thread continue to be executed until a long-latency event occurs, such as a branch or a cache miss (see fig. 13). when such a delay occurs, the processor switches to another thread, which is likewise executed until a long delay occurs. this process is constantly repeated.
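the switch-on-event behaviour of coarse-grain multithreading can be sketched as a small python simulation. this is a toy model under stated assumptions, not a description of any real processor: each thread is a string in which 'c' stands for a single-cycle instruction and 'M' for a long-latency event such as a cache miss, and the names `cgmt_trace`, `miss_latency` and `switch_cost` are hypothetical. the function returns, for every cycle, the index of the thread that issued an instruction, or `None` for a cycle in which nothing was issued.

```python
def cgmt_trace(threads, miss_latency=6, switch_cost=1):
    """Toy coarse-grain multithreading model: run one thread until it stalls."""
    n = len(threads)
    pc = [0] * n          # per-thread program counter (part of the hardware context)
    ready_at = [0] * n    # first cycle in which each thread may issue again
    trace = []
    current, cycle = 0, 0

    def can_issue(t):
        return pc[t] < len(threads[t]) and ready_at[t] <= cycle

    while any(pc[t] < len(threads[t]) for t in range(n)):
        if not can_issue(current):
            # The running thread is stalled or finished: look for another
            # ready thread, in round-robin order.
            others = [(current + k) % n for k in range(1, n)]
            nxt = next((t for t in others if can_issue(t)), None)
            if nxt is None:
                trace.append(None)      # nobody is ready: idle cycle
                cycle += 1
            else:
                current = nxt           # context switch costs a few empty cycles
                trace.extend([None] * switch_cost)
                cycle += switch_cost
            continue
        op = threads[current][pc[current]]
        pc[current] += 1
        trace.append(current)
        if op == "M":
            # Long-latency event: the thread cannot issue until it resolves.
            ready_at[current] = cycle + 1 + miss_latency
        cycle += 1
    return trace


if __name__ == "__main__":
    print(cgmt_trace(["cMc", "cc"], miss_latency=3, switch_cost=1))
```

in this model a switch happens only when the running thread cannot continue, so a latency shorter than `switch_cost` would not be worth switching for.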
the strategy of coarse-grain multithreading makes it possible to hide long delays, but it does not hide shorter delays, for which the cost of switching is higher than the cost of tolerating the delay. a hardware context is the program counter, register file, and other data required to enable a software thread to execute on a core. a coarse-grain multithreaded processor operates similarly to a software time-shared system, but with hardware support for fast context switching, allowing it to switch within a small number of cycles (e.g., less than 10) rather than thousands or tens of thousands.

fig. 13 coarse-grain multithreading

fine-grain multithreading, also called interleaved multithreading, also has multiple hardware contexts associated with each core, but can switch between them with no additional delay. an instruction of another thread is fetched and entered into the execution pipeline at each cycle, and therefore the processor can execute an instruction or instructions from a different thread in each cycle. unlike a coarse-grain multithreaded processor, a fine-grain multithreaded processor thus has instructions from different threads active in the processor at once, within different pipeline stages [46]. but within a single pipeline stage (or, given our particular definitions, within the issue stage) only one thread is represented. in this approach, the cpu executes one instruction of each thread in succession (one after the other) before going back (in a circular way) to execute the next instruction of the first thread. during execution, the cpu skips any thread that is stalled, i.e. waiting for an event with a long delay. in this manner, the processor is kept busy because the pipeline is almost always full. such a processor has a considerably more complex hardware structure because it needs a separate copy of the register file and program counter for each thread.
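the circular, skip-if-stalled issue policy of fine-grain multithreading can be sketched in python as follows. this is a toy model under stated assumptions: each thread is a string in which 'c' is a single-cycle instruction and 'M' is a long-latency event, pipeline stages are not modelled, and the names `fgmt_trace` and `miss_latency` are hypothetical. the result lists, cycle by cycle, which thread issued an instruction (`None` marks an idle cycle).

```python
def fgmt_trace(threads, miss_latency=4):
    """Toy fine-grain multithreading model: pick a different ready thread each cycle."""
    n = len(threads)
    pc = [0] * n          # one program counter per hardware context
    ready_at = [0] * n    # first cycle in which each thread may issue again
    trace = []
    cycle = 0
    last = -1             # thread that issued most recently

    while any(pc[t] < len(threads[t]) for t in range(n)):
        # Round-robin order starting just after the last issuing thread.
        order = [(last + k) % n for k in range(1, n + 1)]
        chosen = next(
            (t for t in order if pc[t] < len(threads[t]) and ready_at[t] <= cycle),
            None,
        )
        if chosen is None:
            trace.append(None)          # every remaining thread is stalled
        else:
            op = threads[chosen][pc[chosen]]
            pc[chosen] += 1
            if op == "M":
                # The stalled thread is skipped until its event resolves.
                ready_at[chosen] = cycle + 1 + miss_latency
            trace.append(chosen)
            last = chosen
        cycle += 1
    return trace


if __name__ == "__main__":
    print(fgmt_trace(["cMc", "ccc"], miss_latency=4))
```

because the choice is made anew in every cycle and costs nothing in this model, a stalled thread simply drops out of the rotation while the others keep the issue slot busy.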
since the next instruction of a thread is fed into the pipeline only after the previous instruction of this thread has retired, control and data dependencies between instructions in the pipeline do not occur in fgmt. the pipeline is simple and potentially very fast because there is no need for complex hardware hazard detection. in addition, the context switching time between threads is zero cycles. memory latency is tolerated by not scheduling a thread until its memory access is completed. in this model, the number of cpu pipeline stages determines the number of threads that can be executed. the processing power available to one thread is limited by the instruction interleaving from other threads.

in simultaneous multithreading (smt), instructions from multiple threads are issued simultaneously to the execution units of a superscalar cpu. smt thus combines superscalar multiple-instruction issue with the hardware resources of the multiple-context approach. the cpu can issue multiple instructions from multiple threads each cycle. in this way, both unused cycles in the case of latencies and unused issue slots within one cycle can be filled by instructions of alternative threads. on the one hand, tlp exists as a consequence of multithreaded parallel programs or multiple independent programs in a multiprogramming workload. on the other hand, ilp is exploited within the execution of individual threads. an smt processor achieves better throughput and speedup than a single-threaded superscalar processor for multithreaded workloads because it efficiently uses both coarse- and fine-grain parallelism, at the cost of a more complex hardware architecture.

smt, cgmt and fgmt are approaches which are often used in risc or vliw processors [39], [47]. the intel pentium 4 has implemented smt since 2002, starting with the 3.06 ghz model. intel calls its smt technique hyper-threading.
other processors that use smt are the alpha axp 21464, ibm power5, and intel nehalem i7 [48]. one simple smt architecture is presented in fig. 14.

fig. 14 smt processor architecture

much like pipelining, superscalar architecture (presented in fig. 14) also extends very naturally to support multiple threads of instructions. a multithreaded superscalar processor executes instructions from multiple threads. each thread executes its logical instruction stream and uses separate registers, etc., but shares most of the available physical resources. the additional hardware required to support multiple thread execution is minor, but performance is significantly improved.

short remarks related to explicit multithreading: coarse-grain multithreaded processors directly execute one thread at a time, but can switch contexts relatively quickly, in a matter of a few cycles. this allows them to switch to the execution of a new thread to hide long latencies (such as memory accesses), but they are less effective at hiding short latencies. fine-grain multithreaded processors can context switch every cycle with no delay. this allows them to hide even short latencies by interleaving instructions from different threads while one thread is stalled. however, such a processor cannot hide single-cycle latencies. a simultaneous multithreaded processor can issue instructions from multiple threads in the same cycle, allowing it to fill the full issue width of the processor even when one thread does not have sufficient ilp to use the entire issue bandwidth.

for illustration purposes only, the different approaches that are possible with single-issue (scalar) processors and multiple-issue processors are given in fig. 15 and fig. 16, respectively [37]. fig. 17 presents two cases of issuing from multiple threads in a cycle [38].

fig.
15 different approaches possible with single-issue (scalar) processors: a) single-threaded scalar, b) interleaved multithreading scalar, c) blocked multithreading scalar

fig. 16 different approaches possible with multiple-issue processors: (a) single-threaded four-wide superscalar, (b) interleaved multithreading four-wide superscalar, (c) blocked multithreading four-wide superscalar

notice: vertical waste corresponds to the darker marked boxes, while horizontal waste corresponds to the lighter marked boxes.

fig. 17 issuing from multiple threads in a cycle: a) simultaneous multithreading, b) chip multiprocessor

10. parallel vs serial computing

what is serial computing: computer software is conventionally written for serial execution, where the algorithm divides the problem into smaller parts, i.e. instructions. these instructions are then executed on the cpu of the computer one by one [18]; after the current instruction completes, the next one begins. in short, serial (sequential) computing has the following characteristics (see fig. 18):
▪ a problem is broken into a discrete series of instructions
▪ instructions are executed sequentially one after another
▪ instructions are executed on a single processor
▪ only one instruction may execute at any moment in time.

fig. 18 serial computing generic example

what is parallel computing: contrary to the serial approach, parallelism can be defined as an approach of dividing big problems into smaller ones, which are then solved simultaneously by multiple processors. the terms parallelism and concurrency are often confused. parallelism means that two or more program sequences are executed independently of each other by a number of processors, while in concurrent execution there are dependencies between program sequences, so that the execution of one program sequence must wait for the execution of another to continue. not all parallel processing needs to be considered concurrent. for example, bit-level parallelism is not concurrent.

as can be seen from fig. 19, parallel computing involves the simultaneous use of multiple computing resources to solve a computational problem [49]:
▪ a problem is decomposed into several parts that can be solved concurrently
▪ each part is further decomposed into a series of instructions
▪ instructions from each part execute simultaneously on different processors
▪ a general control/coordination mechanism is implemented.

concurrency vs parallelism: the following example shows how concurrency and parallelism work. as shown in fig. 20, there are two cores and two tasks. in the concurrent approach, each core executes both tasks by switching between them over time. in contrast, the parallel approach does not switch between tasks, but instead executes them in parallel over time [50]. a simple example of concurrent processing is any user-interactive program, such as a text editor. in such a program, there can be i/o operations that waste cpu cycles. while a file is being saved or printed, the user can concurrently type. the main thread launches several threads for typing, saving, and similar activities. they may run in the same time period; however, they are not actually running in parallel.

types of parallelism: in essence, parallelism can be implemented at two levels, hardware and software [51].

fig. 19 parallel computing generic example

parallelism at hardware level is built into the machine's architecture and hardware multiplicity, so it is also known as machine parallelism. this type of parallelism is a function of the cost and performance trade-off.
it also displays the resource utilization patterns of simultaneously executable operations and indicates the peak performance of processor resources. it is characterized by the number of instructions issued per machine cycle [1].

fig. 20 concurrent vs parallel execution

in summary, we distinguish the following types of hardware parallelism:
1. parallelism at the uniprocessor level, implemented as: i) pipelining, super-pipelining; ii) superscalar, vliw etc.
2. parallelism implemented with simd instructions, vector processors, gpus
3. parallelism at the multiprocessor level: j) symmetric shared-memory multiprocessors; jj) distributed-memory multiprocessors; jjj) chip multiprocessors, a.k.a. multi-cores; jv) multicomputers, a.k.a. clusters.

parallelism at software level is exploited by the concurrent execution of machine language instructions in a program. this type of parallelism is a function of the algorithm, programming style and compiler optimization. it also displays patterns of simultaneously executable operations using the program flow graph [52]. we distinguish the following two types of software parallelism:
1. control parallelism – allows two or more operations to be performed simultaneously;
2. data parallelism – the same operation is performed over many data elements by many processors simultaneously.

data level parallelism (dlp) arises from executing essentially the same code on a large number of objects [53], while control level parallelism (clp) arises from executing different threads of control concurrently [54]. parallelism at software level can be implemented at the instruction, task, data or transaction level [55].

10.1. implementations of the most common types of parallelism

bit-level parallelism: this type of parallelism (implemented at hardware level) is based on doubling the processor word size. it provides faster execution of arithmetic operations for large numbers.
for instance, an 8-bit cpu needs two cycles to execute a 16-bit addition, whereas a 16-bit processor needs just one cycle for the same operation. this level of parallelism is also exploited in 64-bit processors.

instruction-level parallelism (ilp): this type of parallelism (implemented at hardware level) exploits the potential overlap between instructions in a program. in most cases, ilp is implemented in each processor’s hardware as: i) instruction pipelining; ii) superscalar processing; iii) out-of-order execution; and iv) speculative execution/branch prediction. most processors use a combination of the aforementioned ilp techniques to achieve higher performance. very long instruction word (vliw) processors rely on specialized compilers to achieve static ilp at the software level. compilers prepare parallel instruction streams for vliw processors so that they take full advantage of a number of execution units organized in multiple pipelines.

task/thread-level parallelism (tlp): this type of high-level parallel computing (implemented at software level) is based on partitioning the application into distinct tasks or threads that can then be executed simultaneously. threads are executed on different computing units and can work on independent data or share data. until recently, programming was done sequentially, with a single thread representing the entire application. today, it is necessary to use the paradigm of multi-threaded programming in order to take full advantage of the available multicore processors. modern operating systems (oss) schedule different processes on different cores. however, in the case of complex applications such as those in bioinformatics, the os cannot efficiently distribute the computational load of each process to the available cores. in order to improve performance, these applications need to be redeveloped to achieve thread-level parallelism.
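the partitioning of one application into tasks that work on independent data can be sketched in python as follows. this is a minimal illustration, assuming the standard `concurrent.futures` module; the g/c base counting task and the names `gc_count` and `parallel_gc_count` are hypothetical, chosen only to echo the bioinformatics example. note that in the standard cpython interpreter threads usually do not execute python bytecode truly in parallel, so the sketch shows the decomposition rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_count(fragment):
    """Per-task work: count G and C bases in one fragment of a sequence."""
    return sum(1 for base in fragment if base in "GC")


def parallel_gc_count(sequence, num_tasks=4):
    """Split a sequence into fragments and process them with a pool of threads."""
    size = max(1, -(-len(sequence) // num_tasks))  # ceiling division
    fragments = [sequence[i:i + size] for i in range(0, len(sequence), size)]
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        # Each fragment becomes an independent task handed to a worker thread.
        partial_counts = list(pool.map(gc_count, fragments))
    # Combine the partial results of all tasks.
    return sum(partial_counts)


if __name__ == "__main__":
    print(parallel_gc_count("ATGCGCGTTA" * 1000))
```

the fragments are independent, so no synchronization is needed until the partial results are summed at the end.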
data parallelism: this form of high-level parallelism partitions data across the available computing units. data is assigned to cores that independently execute the same task code on each fragment of data. this type of parallelism requires advanced code development skills and can only be applied to specific problems. computer graphics is an important area of application of high-level data parallelism. the design of graphics processing units (gpus) enables efficient execution of every graphics processing task. first, each frame is divided into regions, and then, based on the command, hundreds of processing units perform the task independently on each data region. the many incarnations of dlp architectures over the decades include the following [49]: a) old vector processors (cray processors: cray-1, cray-2, …, cray x1); b) simd extensions (intel sse and avx units, and alpha tarantula, which was never released); c) old massively parallel computers (connection machines, maspar machines); and d) modern gpus (nvidia, amd, qualcomm, ...). in general, dlp focuses on throughput rather than latency.

in fig. 21 a classification scheme of parallel computer architectures based on the type of instruction processing is given.

fig. 21 classification of parallel architectures based on type of instruction processing

the difference among the three major categories ilp, tlp and dlp that are nowadays mainly used in computer systems to exploit parallelism is sketched in fig. 22, and is as follows [56]:
▪ instruction-level parallelism (ilp): multiple instructions from one instruction stream are executed simultaneously;
▪ thread-level parallelism (tlp): multiple instruction streams are executed simultaneously;
▪ vector data parallelism (vdp): the same operation is performed simultaneously on arrays of elements.

there are many reasons to use parallel computing [4].
first of all, the real world is dynamic in nature, i.e. many things happen at the same time but in different places, so the resulting data is very large to manage and requires more dynamic simulation and modeling. parallel computing provides the concurrency needed to organize and manage complex, large datasets while saving money and time. it also provides efficient use of hardware resources and enables real-time system implementation.

fig. 22 differences in execution among ilp, tlp and vdp

parallel computing finds application in many areas of science and engineering, in databases and data mining, in real-time system simulations, as well as in advanced graphics, augmented reality and virtual reality.

in addition to a number of advantages, there are some limitations of parallel computing. the main problem is the difficulty of achieving communication and synchronization between multiple subtasks and processes. also, algorithms or programs must exhibit low coupling and high cohesion and must be suitable for parallel execution. developers must be technically skilled experts in order to code a parallel program effectively.

11. conclusion

nowadays, the microprocessor represents one of the most complex applications of the transistor, with well over 10 billion transistors in the most powerful microprocessors. in fact, throughout its 50 years of evolution, the microprocessor has always used the technology of the day. the drive to continually increase performance has led to rapid technological improvements that have made it possible to build more complex microprocessors. advances in semiconductor fabrication processes, computer architecture and organization, as well as cmos ic vlsi design methodologies, were all needed to create today’s microprocessor.
the development of microprocessors since 1971 has been aimed at (a) improving architecture, (b) improving the instruction set, (c) increasing speed, (d) simplifying power requirements [57], [58] and (e) embedding more and more memory space and i/o facilities on the same chip (as in single-chip computers).

this paper first discusses fifty years of microprocessor history and its generations. then it describes the benefits of switching from a non-pipelined processor to a single-core pipelined processor, and from a single-core pipelined and superscalar processor to a multicore pipelined and superscalar processor. finally, it presents the design of a multicore processor. the transition from single-core to multi-core is inevitable because past techniques for accelerating processor architectures that do not modify the basic von neumann computer model, such as pipelining and superscalar execution, encounter strong limits.

the question is, why have multi-core machines become so widespread in the last decade? according to moore's law, the density of transistors doubles approximately every 18 months [59], and according to dennard scaling, the power density of transistors stays constant [60]. this has historically corresponded to an increase in the clock speed of single-core machines of approximately 30% per year since the mid-1970s [61]. however, since the mid-2000s, dennard scaling has no longer held due to physical hardware limitations, and therefore there has been a need for new mechanisms. to improve performance, hardware vendors have focused on developing processors with multiple cores [62]. as a result, the microprocessor industry is moving towards multicore architectures. however, the full potential of these architectures will not be exploited until the software industry fully accepts parallel programming.
multiprocessor programming is much more complex than programming single-processor machines and requires an understanding of new algorithms, computational principles, and software tools. only a small number of developers currently master these skills. many techniques can ease the transition to multicore processors, but taking full advantage of such systems will always require some form of parallel programming [4].

multicore technology is now ubiquitous, found in most personal computers and even mobile phones [63], so writing parallel programs is crucial to achieving scalable performance and enabling large-scale data processing. in addition, to take full advantage of multicore technology, software applications must be multithreaded: the total work must be distributable among the execution units of a multicore processor so that they can execute at the same time. to consider multithreading in more detail, it is first necessary to understand parallel hardware and parallel computing [44]. finally, this paper provides some details on how to implement hardware parallelism in multicore systems.

acknowledgement: this work was supported by the serbian ministry of education and science, project no tr-32009 – "low power reconfigurable fault-tolerant platforms".

references

[1] d. patterson and j. hennessy, computer architecture: a quantitative approach, 6th ed., morgan kaufmann, 2017. [2] j.-l. baer, microprocessor architecture: from simple pipelines to chip multiprocessors, cambridge university press, 2009. [3] y. solihin, fundamentals of parallel multicore architecture, chapman & hall/crc, 2015. [4] r. kuhn and d. padua, parallel processing, 1980 to 2020, morgan & claypool, 2021. [5] m. stojčev, microprocessor architectures i part, in serbian, elektronski fakultet niš, 2004. [6] b.
parhami, computer architecture: from microprocessors to supercomputers, oxford university press, 2005. [7] semiconductor industry association, international technology roadmap for semiconductors (itrs), 2013 edition, 2013. [8] k. olukotun, l. hammond, and j. laudon, chip multiprocessor architecture: techniques to improve throughput and latency, morgan & claypool, 2007. [9] r. v. mehta, k. r. bhatt and v. v. dwivedi, "multicore processor challenges – design aspects", j. emerg. technol. innov. res. (jetir), vol. 8, no. 5, pp. c171-c174, may 2021. [10] a. gonzalez, f. latorre and g. magklis, processor microarchitecture: an implementation perspective, morgan & claypool, 2011. [11] m. stojčev and p. krtolica, computer systems: principle of digital systems, in serbian, elektronski fakultet niš i prirodno-matematički fakultet niš, 2005. [12] "microprocessor chronology", av. at. https://en.wikipedia.org/wiki/microprocessor_chronology, last access 28.03.2022. [13] m. stojčev, contemporary 16-bit microprocessors, vol. i, in serbian, naučna knjiga, beograd, 1988. [14] m. stojčev, contemporary 16-bit microprocessors, vol. ii, in serbian, naučna knjiga, beograd, 1988. [15] m. stojčev, contemporary 16-bit microprocessors, vol. iii, in serbian, naučna knjiga, beograd, 1988. [16] m. stojčev, risc, cisc and dsp processors, in serbian, elektronski fakultet niš, 1997. [17] m. stojčev, branislav petrović, architectures and programming microcomputer systems based on processor family 80x86, in serbian, elektronski fakultet niš, 1999. [18] d. patterson and j. hennessy, computer organization and design: the hardware/software interface, 5th ed., morgan kaufmann, 2014. [19] y. etsion, "computer architecture out-of-order execution", av. at. https://iis-people.ee.ethz.ch/~gmichi/asocd/addinfo/out-of-order_execution.pdf, last access 28.03.2022. [20] "superscalar processors", av. at.
https://www.cambridge.org/core/terms, last access. 28.03.2022. [21] m. stojčev and t. nikolić, pipeline processing and scalar risc processor, in serbian, elektronski fakultet niš, 2012. [22] m. stojčev and t. nikolić, superscalar and vliw processors, in serbian, elektronski fakultet niš, 2012 . [23] philips semiconductors, introduction to vliw computer architecture, av. at. https://www.isi.edu/ ~youngcho/cse560m/vliw.pdf. last access 28.03.2022 [24] n. p. jouppi and d. w. wall, "available instruction level parallelism for superscalar and superpipelined machines", wrl research report 89/7, av. at. https://www.hpl.hp.com/techreports/ compaq-dec/wrl-89-7.pdf, last access 28.03.2022 [25] c. e. kozyrakis and d.a. patterson, "scalable vector processors for embedded system", ieee micro, vol. 23, no. 6, pp. 36– 45, nov.-dec. 2003. [26] e. aldakheel, g. chandrasekaran and a. kshemkalyani, "vector procesors", av. at. https://www.cs.uic.edu/~ajayk/c566/vectorprocessors.pdf, last access 29.03.2022 [27] c. lomont, "introduction to intel® advanced vector extensions", av. at. https://hpc.llnl.gov/sites/ default/files/intelavxintro.pdf, last access 29.03.2022. [28] m. stojčev, e. milovanović and t. nikolić, multiprocessor systems on chip, in serbian, elektronski fakultet niš, 2012. [29] j. l. lo and s. j. eggers, "improving balanced scheduling with compiler optimizations that increase instruction-level parallelism", av. at. https://homes.cs.washington.edu/~eggers/research/bsopt.pdf, last access 29.03.2022. [30] s. akhter and j. roberts, multi-core programming, intel press, 2006. [31] g. koch, "intel’s road to multi-core chip architecture", av. at. http://www.intel.com/cd/ids/ developer/asmo-na/eng/220997.htm [32] g. koch, "transitioning to multi-core architecture", av.at. www.intel.com/cd/ids/developer/asmona/eng/recent /221170.htm, last access 29.03.2022. [33] m. brorsson, "multi-core and many-core processor architectures", chapter 2 in programming manycore chips, ed. a. 
vajda, springer, 2011. [34] m. zahran, heterogeneous computing: hardware and software perspectives, acm books #26, 2019. [35] m. mitić, m. stojčev and z. stamenković, "an overview of soc buses", in embedded systems handbook, digital systems and aplications, ed. v. oklobdzija, chapter 7, 7.17.16, crc press, boca raton, 2008. [36] j. rehman, "advantages and disadvantages of multi-core processors", av. at https://www.itrelease.com/ 2020/07/advantages-and-disadvantages-of-multi-core-processors/, last access 29.03.2022 [37] j. shun, shared-memory parallelism can be simple, fast, and scalable, morgan & claypool pub., 2017 [38] t. ungerer, b. rogic and j. silc, "multithreaded processors", comput j., vol. 45, no. 3, pp. 320–348, 2002. [39] a. silberschatz, g. gagne and p. b. galvin, "multithreaded programming", chapter 4 in operating system concepts, 8th ed., john wiley, 2009. [40] differencebetween.com, "difference between multithreading and multitasking", av.at. https://www.differencebetween.com/difference-between-multithreading-and-vs-multitasking/, last access 29.03.2022. 
[41] techdifferences, "difference between multitasking and multithreading in os", av. at. https://techdifferences.com/difference-between-multitasking-and-multithreading-in-os.html, last access 29.03.2022 [42] tutorialspoint, "multi-threading models", av. at https://www.tutorialspoint.com/multi-threading-models, last access 22.03.2022 [43] wikipedia, "list of intel core i7 processors", av. at https://en.wikipedia.org/wiki/list_of_intel_core_i7_processors, last access 29.03.2022 [44] m. nemirovsky and d. m. tullsen, multithreading architecture, morgan & claypool, 2013. [45] o. mutlu, "computer architecture: multithreading", av. at. https://rmd.ac.in/dept/ece/supporting_online_%20materials/5/cao/unit5.pdf, last access 22.03.2022 [46] n. manjikian, "implementation of hardware multithreading in a pipelined processor", in proceedings of the ieee north-east workshop on circuits and systems, 2006, pp. 145–148. [47] p. manadhata, and v. sekar, "simultaneous multithreading", av.
at https://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/www/lectures/smt.pdf, last access 29.03.2022 [48] intel, "products formerly nehalem ep", av. at https://ark.intel.com/content/www/us/en/ark/products/codename/54499/products-formerly-nehalem-ep.html, last access 29.03.2022 [49] k. hwang and z. xu, scalable parallel computing: technology, architecture, programming, mcgraw-hill, 1998. [50] d. malkhi, concurrency: the works of leslie lamport, acm books #29, 2019. [51] cs4/msc parallel architectures, "lect. 2: types of parallelism", av. at https://www.inf.ed.ac.uk/teaching/courses/pa/notes/lecture02-types.pdf, last access 29.03.2022 [52] chapter 3: "understanding parallelism", av. at https://courses.cs.washington.edu/courses/cse590o/06au/lnlch-3-4.pdf, last access 29.03.2022 [53] j. owens, "data level parallelism", av. at https://www.ece.ucdavis.edu/~jowens/171/lectures/dlp3.pdf, last access 29.03.2022 [54] a. a. freitas, s. h. lavington, "data parallelism, control parallelism, and related issues", in mining very large databases with parallel processing, springer, 2000. [55] e. i. milovanović, t. r. nikolić, m. k. stojčev and i. ž. milovanović, "multi-functional systolic array with reconfigurable micro-power processing elements", microelectron. reliab., vol. 49, no. 7, pp. 813–820, july 2009. [56] c. severance and k. dowd, high performance computing, connexions, rice university, houston, texas, 2012. [57] g. nikolić, m. stojčev, z. stamenković, g. panić and b. petrović, "wireless sensor node with low-power sensing", facta univ. ser.: elec. energ., vol. 27, no 3, pp. 435–453, sept. 2014. [58] t. nikolić, m. stojčev, g. nikolić and g. jovanović, "energy harvesting techniques in wireless sensor networks", facta univ. ser.: aut. cont. rob., vol. 17, no. 2, pp. 117–142, dec. 2018. [59] g. e. moore, "cramming more components onto integrated circuits", electronics, vol. 38, no. 8, pp. 114–117, april 1965. [60] r. h. dennard, f. h. gaensslen and k. mai, "design of ion-implanted mosfet’s with very small physical dimensions", ieee j. solid-state circuits, vol. 9, no. 5, pp. 256–268, oct. 1974. [61] s. naffziger, j. warnock and h. knapp, "when processors hit the power wall", in proceedings of the ieee international solid-state circuits conference (isscc), 2005, pp. 16–17. [62] s. borkar and a. a. chien, "the future of microprocessors", commun. acm, vol. 54, no. 5, pp. 67–77, may 2011. [63] m. d. hill and m. r. marty, "amdahl’s law in the multicore era", ieee comput. mag., vol. 41, no. 7, pp. 33–38, july 2008.

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp.
187-197 doi: 10.2298/fuee1702187n

sparse localization of breast tumors using quasi-te polarized antennas

marija nikolić stevanović^1, jelena dinkić^1, antonije đorđević^1,2, jasmin musić^3, lorenzo crocco^4

^1 university of belgrade – school of electrical engineering, belgrade, serbia
^2 serbian academy of sciences and arts, belgrade, serbia
^3 wipl-d d.o.o., belgrade, serbia
^4 institute for the electromagnetic sensing of the environment – national research council, irea-cnr, naples, italy

abstract. we develop a three-dimensional (3d) sparse algorithm for localization of breast tumors, using an antenna array and signal processing. assuming that prior knowledge of the breast tissue distribution is available, we develop a model in which the trans-polarization is fully taken into account. by considering various array configurations, we also investigate the robustness of the algorithm to the inaccuracies in the assumed electromagnetic parameters of the breast.

key words: breast imaging, compressive sensing, inverse scattering, microwave imaging

1. introduction

in recent years, there has been a growing interest in microwave medical imaging [1], [2]. compared to conventional technologies, the main advantages of microwave imaging systems are their portability, low cost, and non-ionizing radiation. the majority of clinical applications have focused on breast imaging, e.g., [2]–[5], but lately the efforts have been extended to other modalities such as bone [6] and brain imaging [7]–[9]. numerous techniques have been proposed for this purpose. some examples are time-domain beamforming [5], the conjugate gradient approach [10]–[12], gauss-newton optimization [4], [13], etc. lately, compressive sensing techniques [14], [15] have been used for solving a number of microwave imaging problems [16]–[21]. compressive sensing (sparse) imaging is known to yield clean and focused images with suppressed artifacts.
sparse imaging is particularly suitable for situations in which targets occupy only a small part of the observed domain. typically, this is the case in differential microwave imaging, where the goal is to locate small changes between consecutive measurements, rather than retrieving the permittivity of the whole investigated domain. examples of differential microwave imaging apparatuses are the wearable breast-cancer detection system [22], [23] and the stroke-finder system [24].

received may 15, 2016; received in revised form september 15, 2016
corresponding author: marija nikolić stevanović, school of electrical engineering, bulevar kralja aleksandra 73, 11120 belgrade, serbia (e-mail: mnikolic@etf.rs)

here, we consider the application of compressive sensing to three-dimensional (3d) breast-cancer localization. we assume that dipole-like antennas are placed parallel to circles encompassing the breast surface^1, which is analogous to the transverse electric (te) polarization in the two-dimensional (2d) geometry. this is in contrast to the usual approach in which the antennas are parallel to each other, as in the case of transverse magnetic (tm) polarization. however, one must consider a full 3d model in which all field components are taken into account, unlike in the quasi-tm measurement configuration. assuming that variations of the tissue parameters (due to the possible tumor presence) between two measurements are small, it is possible to linearize the scattering equations. however, it is still necessary to compute 3d (dyadic) green's functions, as well as the approximate field inside the breast. for this purpose, we assume prior knowledge of the healthy breast tissue parameters. we also investigate the robustness of the algorithm against the errors in the tissue permittivity.
by combining the obtained results in a particular way, we suppress false targets caused by the parameter ambiguity.

fig. 1 (a) measurement model and (b) sketch of 3d grid used in sparse processing

the organization of the paper is as follows. in section 2, we describe the electromagnetic model. in section 3, we develop the sparse algorithm. in section 4, we detail the inhomogeneous breast phantom that was used in simulations. finally, in section 5, we provide some numerical results.

^1 in clinical examinations, the patient typically lies in prone position, with breasts pointing downwards, inside the imaging system. hence, the field radiated by the array is horizontally polarized.

2. measurement model

we consider the measurement scenario depicted in fig. 1(a). an unknown target or lesion (illustrated as an elliptic inclusion) is located inside the non-magnetic inhomogeneous breast tissue. to determine the location of the target, we use an antenna array placed around the breast in the vicinity of the skin. according to the coordinate system given in fig. 1, the antennas are parallel to the yz plane, i.e., parallel to the chest wall. for simplicity, we show only two antennas: a transmitter, located at $\mathbf{r}_i$, and a receiver, located at $\mathbf{r}_j$. we define the scattered field as

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}) = \mathbf{E}(\mathbf{r}) - \mathbf{E}_{\mathrm{b}}(\mathbf{r}), \tag{1}$$

where $\mathbf{E}(\mathbf{r})$ and $\mathbf{E}_{\mathrm{b}}(\mathbf{r})$ are the electric field vectors, measured at the field point $\mathbf{r}$, when the target is inside the breast and when there is no target (healthy breast), respectively.
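using the volume equivalence principle [24], the scattered field may be expressed as

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}) = \int_V \bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r},\mathbf{r}')\,\mathbf{J}_{\mathrm{eq}}(\mathbf{r}')\,\mathrm{d}v' = \int_V \bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r},\mathbf{r}')\,\mathrm{j}\omega\big(\varepsilon(\mathbf{r}') - \varepsilon_{\mathrm{b}}(\mathbf{r}')\big)\,\mathbf{E}(\mathbf{r}')\,\mathrm{d}v', \tag{2}$$

where $\mathbf{r}'$ is the source position vector, $\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r},\mathbf{r}')$ is the dyadic background green's function, $\mathbf{J}_{\mathrm{eq}}(\mathbf{r}')$ is the equivalent current density vector, $\mathbf{E}(\mathbf{r}')$ is the total field inside the breast, $\varepsilon_{\mathrm{b}}$ is the permittivity of the healthy breast, $\varepsilon$ is the permittivity of the target ($\varepsilon \neq \varepsilon_{\mathrm{b}}$), and $V$ is the breast volume. if the target is electrically small, (2) becomes

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}) \approx \mathrm{j}\omega\big(\varepsilon - \varepsilon_{\mathrm{b}}(\mathbf{t})\big)\,\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r},\mathbf{t})\,\mathbf{E}(\mathbf{t})\,\Delta v, \tag{3}$$

where $\mathbf{t}$ is the target position vector and $\Delta v$ is its volume. supposing that the target is a weak scatterer, we have

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}) \approx \mathrm{j}\omega(\varepsilon - \varepsilon_{\mathrm{b}})\,\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r},\mathbf{t})\,\mathbf{E}_{\mathrm{b}}(\mathbf{t})\,\Delta v, \tag{4}$$

where $\mathbf{E}_{\mathrm{b}}(\mathbf{t}) \approx \mathbf{E}(\mathbf{t})$ is the background electric field. we express the background field in terms of the 3d green's function as

$$\mathbf{E}_{\mathrm{b}}(\mathbf{t}) = \int_{l_{\mathrm{a}}} \bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{t},\mathbf{l})\,I(\mathbf{l})\,\mathrm{d}\mathbf{l}, \tag{5}$$

where $\mathbf{l}$ is the source vector, $I$ is the current of the transmitting antenna, and $l_{\mathrm{a}}$ is the antenna length. assuming that the antennas are electrically short dipoles (without top loadings), the current distribution is approximately triangular, i.e., $I(l) = I_0(1 - |l|/h)$, $|l| \le h$, where $I_0$ is the current at the port of the dipole, $h$ is the length of the dipole arm, and $l$ is the local coordinate. using this approximation, and noting that the triangular current integrates to $\int_{-h}^{h} I(l)\,\mathrm{d}l = I_0 h$, (5) becomes

$$\mathbf{E}_{\mathrm{b}}(\mathbf{t}) \approx \bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{t},\mathbf{r}_i)\int_{l_{\mathrm{a}}} I(l)\,\mathrm{d}\mathbf{l} = I_0\,\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{t},\mathbf{r}_i)\,\mathbf{h}_i, \qquad |\mathbf{h}_i| = h, \tag{6}$$

where $\mathbf{r}_i$ is the location of the transmitter and $\mathbf{h}_i$ is the vector in the direction of the current, parallel to the dipole axis. hence, the scattered field at the receiver is

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}_j,\mathbf{r}_i) = \mathrm{j}\omega(\varepsilon - \varepsilon_{\mathrm{b}})\,I_0\,\Delta v\,\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r}_j,\mathbf{t})\,\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{t},\mathbf{r}_i)\,\mathbf{h}_i, \tag{7}$$

where $\mathbf{r}_j$ is the location of the jth receiver. due to the reciprocity, i.e., $\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{r},\mathbf{t}) = \bar{\mathbf{G}}_{\mathrm{b}}^{\mathrm{T}}(\mathbf{t},\mathbf{r})$,

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}_j,\mathbf{r}_i) = \mathrm{j}\omega(\varepsilon - \varepsilon_{\mathrm{b}})\,I_0\,\Delta v\,\big(\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{t},\mathbf{r}_j)\big)^{\mathrm{T}}\,\bar{\mathbf{G}}_{\mathrm{b}}(\mathbf{t},\mathbf{r}_i)\,\mathbf{h}_i. \tag{8}$$

in our case, both the transmitting and receiving antennas are parallel to the yz plane. hence, the scattered field at the location of the jth antenna, when the ith antenna is transmitting, is

$$\mathbf{E}_{\mathrm{s}}(\mathbf{r}_j,\mathbf{r}_i) = k \begin{bmatrix} G_{xx}(\mathbf{t},\mathbf{r}_j) & G_{yx}(\mathbf{t},\mathbf{r}_j) & G_{zx}(\mathbf{t},\mathbf{r}_j)\\ G_{xy}(\mathbf{t},\mathbf{r}_j) & G_{yy}(\mathbf{t},\mathbf{r}_j) & G_{zy}(\mathbf{t},\mathbf{r}_j)\\ G_{xz}(\mathbf{t},\mathbf{r}_j) & G_{yz}(\mathbf{t},\mathbf{r}_j) & G_{zz}(\mathbf{t},\mathbf{r}_j) \end{bmatrix} \begin{bmatrix} G_{xx}(\mathbf{t},\mathbf{r}_i) & G_{xy}(\mathbf{t},\mathbf{r}_i) & G_{xz}(\mathbf{t},\mathbf{r}_i)\\ G_{yx}(\mathbf{t},\mathbf{r}_i) & G_{yy}(\mathbf{t},\mathbf{r}_i) & G_{yz}(\mathbf{t},\mathbf{r}_i)\\ G_{zx}(\mathbf{t},\mathbf{r}_i) & G_{zy}(\mathbf{t},\mathbf{r}_i) & G_{zz}(\mathbf{t},\mathbf{r}_i) \end{bmatrix} \begin{bmatrix} 0\\ \cos\theta_i\\ \sin\theta_i \end{bmatrix}, \tag{9}$$

where $k = \mathrm{j}\omega(\varepsilon - \varepsilon_{\mathrm{b}})\,I_0\,h\,\Delta v$, $\mathbf{h}_i = h(\cos\theta_i\,\mathbf{i}_y + \sin\theta_i\,\mathbf{i}_z)$, and $\theta_i$ is the angle defined in fig. 1(a). in the expanded form, (9) is

$$\begin{bmatrix} E_{\mathrm{s},y}(\mathbf{r}_j,\mathbf{r}_i)\\ E_{\mathrm{s},z}(\mathbf{r}_j,\mathbf{r}_i) \end{bmatrix} = k \begin{bmatrix} g_{11} & g_{12}\\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} \cos\theta_i\\ \sin\theta_i \end{bmatrix}, \tag{10}$$

$$g_{11} = G_{xy}(\mathbf{t},\mathbf{r}_j)G_{xy}(\mathbf{t},\mathbf{r}_i) + G_{yy}(\mathbf{t},\mathbf{r}_j)G_{yy}(\mathbf{t},\mathbf{r}_i) + G_{zy}(\mathbf{t},\mathbf{r}_j)G_{zy}(\mathbf{t},\mathbf{r}_i), \tag{11}$$

$$g_{12} = G_{xy}(\mathbf{t},\mathbf{r}_j)G_{xz}(\mathbf{t},\mathbf{r}_i) + G_{yy}(\mathbf{t},\mathbf{r}_j)G_{yz}(\mathbf{t},\mathbf{r}_i) + G_{zy}(\mathbf{t},\mathbf{r}_j)G_{zz}(\mathbf{t},\mathbf{r}_i), \tag{12}$$

$$g_{21} = G_{xz}(\mathbf{t},\mathbf{r}_j)G_{xy}(\mathbf{t},\mathbf{r}_i) + G_{yz}(\mathbf{t},\mathbf{r}_j)G_{yy}(\mathbf{t},\mathbf{r}_i) + G_{zz}(\mathbf{t},\mathbf{r}_j)G_{zy}(\mathbf{t},\mathbf{r}_i), \tag{13}$$

$$g_{22} = G_{xz}(\mathbf{t},\mathbf{r}_j)G_{xz}(\mathbf{t},\mathbf{r}_i) + G_{yz}(\mathbf{t},\mathbf{r}_j)G_{yz}(\mathbf{t},\mathbf{r}_i) + G_{zz}(\mathbf{t},\mathbf{r}_j)G_{zz}(\mathbf{t},\mathbf{r}_i). \tag{14}$$

finally, the induced voltage at the jth antenna, due to the scattering from the target, is

$$V(\mathbf{r}_j,\mathbf{r}_i) = \mathbf{E}_{\mathrm{s}}(\mathbf{r}_j,\mathbf{r}_i)\cdot\mathbf{h}_j, \qquad \mathbf{h}_j = h(\cos\theta_j\,\mathbf{i}_y + \sin\theta_j\,\mathbf{i}_z), \tag{15}$$

where $\theta_j$ is the angle defined in fig. 1(a).

3. sparse model

we search for the target on a uniform 3d grid inside the breast, as shown in fig. 1(b).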
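as an illustration of the measurement model in (8)–(15), the sketch below evaluates the induced voltage for a single point target from two 3×3 dyadics. it is a minimal python/numpy sketch, not the authors' implementation: the function names `g_block` and `induced_voltage` are hypothetical, the dyadics are assumed to be supplied by an external solver with rows and columns ordered (x, y, z), and the sign of the voltage is assumed to be that of the plain dot product of the scattered field with the receiver vector.

```python
import numpy as np


def g_block(G_tj, G_ti):
    """Return the 2x2 block [[g11, g12], [g21, g22]] of (11)-(14).

    G_tj, G_ti: 3x3 complex arrays holding the background dyadic Green's
    functions G_b(t, r_j) and G_b(t, r_i), indexed [row, col] over (x, y, z).
    """
    product = G_tj.T @ G_ti   # (G_b(t, r_j))^T G_b(t, r_i), as in (8)
    return product[1:, 1:]    # keep only the y and z rows and columns


def induced_voltage(G_tj, G_ti, theta_i, theta_j, k=1.0, h=1.0):
    """Voltage at receiver j when antenna i transmits, following (10) and (15).

    k stands for the constant j*omega*(eps - eps_b)*I0*h*dv of (9) and h is
    the dipole arm length; both default to 1 purely for illustration.
    """
    u_i = np.array([np.cos(theta_i), np.sin(theta_i)])  # transmitter direction (y, z)
    u_j = np.array([np.cos(theta_j), np.sin(theta_j)])  # receiver direction (y, z)
    e_s = k * g_block(G_tj, G_ti) @ u_i                 # (E_sy, E_sz) of (10)
    return h * (e_s @ u_j)                              # assumed sign: V = E_s . h_j
```

with `k = h = 1`, the returned value is the angular combination of $g_{11}, \dots, g_{22}$ that forms one element of the system matrix for a grid node placed at the target position.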
assuming that there is a target at each node, we derive an approximate linear model

$$\mathbf{e}_i = \mathbf{G}_i\,\mathbf{c}, \tag{16}$$

$$\mathbf{e}_i = \big[\,V(\mathbf{r}_1,\mathbf{r}_i)\ \cdots\ V(\mathbf{r}_M,\mathbf{r}_i)\,\big]^{\mathrm{T}}, \tag{17}$$

$$\mathbf{G}_i = \begin{bmatrix} g_{11}(\mathbf{r}_1,\mathbf{t}_1;\mathbf{r}_i) & \cdots & g_{1N}(\mathbf{r}_1,\mathbf{t}_N;\mathbf{r}_i)\\ \vdots & \ddots & \vdots\\ g_{M1}(\mathbf{r}_M,\mathbf{t}_1;\mathbf{r}_i) & \cdots & g_{MN}(\mathbf{r}_M,\mathbf{t}_N;\mathbf{r}_i) \end{bmatrix}, \tag{18}$$

$$\mathbf{c} = \big[\,c_1\ \cdots\ c_N\,\big]^{\mathrm{T}}, \tag{19}$$

where $\mathbf{e}_i$ is the vector of the received signals when the ith antenna is transmitting, $\mathbf{G}_i$ is the corresponding system matrix, and $\mathbf{c}$ is the unknown vector whose elements are proportional to the permittivity difference, as defined in (7), for each grid node. an element of the system matrix is

$$g_{jk}(\mathbf{r}_j,\mathbf{t}_k;\mathbf{r}_i) = \big(g_{11}(\mathbf{r}_i,\mathbf{r}_j,\mathbf{t}_k)\cos\theta_i + g_{12}(\mathbf{r}_i,\mathbf{r}_j,\mathbf{t}_k)\sin\theta_i\big)\cos\theta_j + \big(g_{21}(\mathbf{r}_i,\mathbf{r}_j,\mathbf{t}_k)\cos\theta_i + g_{22}(\mathbf{r}_i,\mathbf{r}_j,\mathbf{t}_k)\sin\theta_i\big)\sin\theta_j, \qquad 1 \le k \le N,\ 1 \le j \le M, \tag{20}$$

where $g_{11}, \dots, g_{22}$ on the right-hand side are given by (11)–(14) evaluated at $\mathbf{t} = \mathbf{t}_k$, $\mathbf{t}_k$ is the position of the kth grid node, $N$ is the size of the grid, and $M$ is the total number of receiving antennas (only the ith antenna is transmitting). we combine the measurements related to different transmissions into one set of equations as

$$\mathbf{e} = \mathbf{G}\,\mathbf{c}, \tag{21}$$

where the stacked measurement vector is

$$\mathbf{e} = \begin{bmatrix} \mathbf{e}_1\\ \vdots\\ \mathbf{e}_M \end{bmatrix}, \tag{22}$$

and the aggregated system matrix is

$$\mathbf{G} = \begin{bmatrix} \mathbf{G}_1\\ \vdots\\ \mathbf{G}_M \end{bmatrix}. \tag{23}$$

under the assumption that the target occupies only a few grid nodes, we apply the $\ell_1$ regularization to emphasize the sparsity of the solution vector $\mathbf{c}$,

$$\hat{\mathbf{c}} = \arg\min_{\mathbf{c}} \big\{ \|\mathbf{e} - \mathbf{G}\mathbf{c}\|_2^2 + \lambda\|\mathbf{c}\|_1 \big\}. \tag{24}$$

here, $\hat{\mathbf{c}}$ is the estimated coefficient vector and $\lambda$ is the regularization parameter. to solve (24), we use the cvx package [26], [27]. we compute the regularization parameter, which balances data fidelity against solution sparsity, using the l-curve method [28]. we also investigate a different sparse scheme in which the system matrix and the measurement vector are associated with a subset of the $M$ transmissions.
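the $\ell_1$-regularized least-squares problem of this section can be illustrated with a few lines of python/numpy. note that this is not the cvx solver used in the paper: the sketch swaps in a plain proximal-gradient (ista) iteration with complex soft-thresholding, and the names `soft_threshold` and `ista_l1`, as well as the fixed iteration count, are assumptions made only for illustration.

```python
import numpy as np


def soft_threshold(z, tau):
    """Complex soft-thresholding: shrink each modulus by tau, keep the phase."""
    mag = np.abs(z)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-30), 0.0)
    return scale * z


def ista_l1(G, e, lam, n_iter=500):
    """Approximately minimize ||e - G c||_2^2 + lam * ||c||_1 by ISTA.

    G: (rows, N) complex system matrix, e: (rows,) complex measurements.
    The step size is 1 / (2 * sigma_max(G)^2), the inverse Lipschitz
    constant of the gradient of the quadratic term.
    """
    sigma2 = np.linalg.norm(G, 2) ** 2            # squared largest singular value
    c = np.zeros(G.shape[1], dtype=complex)
    for _ in range(n_iter):
        residual = e - G @ c
        # gradient step on the data term, then proximal step on the l1 term
        c = soft_threshold(c + (G.conj().T @ residual) / sigma2,
                           lam / (2.0 * sigma2))
    return c
```

in such a sketch, the nonzero entries of the returned vector mark the grid nodes at which a target is estimated; a larger `lam` gives a sparser image at the cost of data fidelity, which is the trade-off the l-curve method is meant to balance.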
namely, we jointly process the data corresponding to a few transmitting antennas. as before, we assume that one transmitter is active at a time. in this case, the measurement vector and the system matrix are obtained from (22) and (23) by keeping only those $\mathbf{e}_i$ and $\mathbf{G}_i$, $i \in \{1, \dots, M\}$, that are related to the desired transmitters. the final image is obtained by superimposing partial images (i.e., estimated coefficient vectors) associated with different groups of transmitters.

4. breast phantom

in our investigations, we used an inhomogeneous breast model (breast id: 012204) provided by the uwcem numerical breast phantom repository [29], [30]. this repository contains a number of anatomically realistic breast phantoms derived from magnetic resonance imaging (mri). to make the model suitable for the electromagnetic analysis, we decreased its resolution by averaging the electromagnetic parameters over groups of 10 × 10 × 10 voxels of the original distribution. the resolution of the resulting model was 5 mm and the operating frequency was f = 1 ghz. in addition, following [31], we divided the obtained continuous range of permittivity ($\varepsilon_r$) and conductivity ($\sigma$) into 8 domains with the constant parameters defined in table 1. the relative complex permittivity was defined as $\varepsilon = \varepsilon_r - \mathrm{j}\sigma/(\omega\varepsilon_0)$, where the imaginary part of the complex permittivity, $\sigma/(\omega\varepsilon_0)$, takes into account all dielectric losses (polarization and conductive), as in [32]. besides the true values of the permittivities, table 1 also shows these values altered by 10%. approximately, domain #1 corresponds to the fatty region, domains #2–4 belong to the transitional tissue, domains #5–7 correspond to the fibro-glandular tissue, and domain #8 is skin. we included the tumor by changing the parameters of one voxel. its parameters are given in the 9th column of table 1. fig. 2 shows the boundaries of the domains, in the order of appearance given in table 1.
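the two preprocessing steps described above, block averaging of the voxel model and forming the relative complex permittivity, can be sketched as follows. this is a minimal python/numpy illustration under stated assumptions: the function names `block_average` and `complex_permittivity` are hypothetical, the phantom is assumed to be available as a plain 3d array, and voxels that do not fill a complete block at the edges are simply cropped, which the paper does not specify.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def block_average(volume, factor=10):
    """Average non-overlapping factor x factor x factor voxel blocks.

    Trailing voxels that do not fill a complete block are cropped
    (an assumption of this sketch).
    """
    nx, ny, nz = (s // factor for s in volume.shape)
    v = volume[:nx * factor, :ny * factor, :nz * factor]
    v = v.reshape(nx, factor, ny, factor, nz, factor)
    return v.mean(axis=(1, 3, 5))


def complex_permittivity(eps_r, sigma, f=1e9):
    """Relative complex permittivity eps_r - j*sigma/(omega*eps0) at frequency f."""
    omega = 2.0 * np.pi * f
    return eps_r - 1j * sigma / (omega * EPS0)
```

for example, `complex_permittivity(5.5, 0.06)` evaluates the fatty-tissue parameters of domain 1 at 1 ghz, and `block_average` applied separately to the permittivity and conductivity maps yields the coarse model.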
table 1 permittivities of homogeneous domains in breast phantom

| domain | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| $\varepsilon_r$ | 5.5 | 15 | 24 | 32 | 42 | 51 | 60 | 39 | 56 |
| $\varepsilon_r$ (±10%) | 6.05 | 16.5 | 21.60 | 28.8 | 46.2 | 56.1 | 66.0 | 39 | 56 |
| $\sigma$ [s/m] | 0.06 | 0.21 | 0.36 | 0.49 | 0.68 | 0.93 | 1.28 | 0.9 | 1 |

fig. 2 different tissues (domains) defined in table 1 (panels: domain 1 to domain 8, tumor)

5. numerical results

in our numerical simulations, we used an array of $M = 60$ horizontal (quasi-te polarized) dipoles placed around the breast surface. as illustrated in fig. 3(a), the dipoles were uniformly distributed along three circular contours. the radii of the contours were 7.8 cm, 8 cm, and 8.3 cm. the corresponding distances of the centers of the contours from the nipple region, along the x-axis, were 5.8 cm, 8.3 cm, and 11.3 cm, respectively. the operating frequency was f = 1 ghz. the length of the dipoles was 2h = 2 cm. to compute the response of the array and the 3d green's functions, we used the software wipl-d pro [33]. we supposed that the induced signal in the jth antenna, when the ith antenna is transmitting, is proportional to the mutual impedance between those two antennas, i.e., $V(\mathbf{r}_i,\mathbf{r}_j) \propto Z_{ij}$. we corrupted the measurement vector by adding white gaussian noise. in fig. 3(b), red lines denote the search space consisting of $N_x = 6$ horizontal cuts. in each cut, the number of grid nodes was $N_y \times N_z$, with $N_y = 32$ and $N_z = 32$. the corresponding steps along the coordinate axes were $\Delta x \approx 9$ mm and $\Delta y = \Delta z \approx 4.5$ mm. the blue lines in fig. 3(b) indicate the antenna positions.

fig. 3 (a) antenna array and (b) search space represented by red lines (numbers indicate different cuts)

5.1. ideal case

first, we considered the case in which we have perfect knowledge of the breast tissue. we assumed that all $M = 60$ antennas were receiving and every fourth antenna was transmitting (one at a time). fig.
4 shows the localization result in cut #3, which was the closest to the target. we simultaneously processed all 6 cuts, as defined in fig. 3(b). the adopted signal-to-noise ratio was snr = 10 db. the true position of the scatterer is denoted by a red square marker. the elements of the solution vector in all other planes were zero. the regularization coefficient corresponded to the knee of the l-curve.

fig. 4 ideal case: target image computed for snr = 10 db (panel: cut #3)

5.2. incomplete knowledge of dielectric properties

we investigated the robustness of the algorithm against the ambiguity in the breast tissue parameters. in table 1, we show the permittivity values altered by about 10% with respect to their adopted values. the experimental setup was the same as in the ideal case. the adopted snr was 10 db. instead of considering all available data simultaneously, we jointly processed the measured signals associated with groups of adjacent transmitters. again, we assumed that one transmitter was active at a time. in fig. 5(a), we give an example of such a group consisting of 4 × 2 transmitters. the receiving array comprised all antennas. hence, the corresponding system matrix, as defined by (23), consisted of $M N_h N_v \times N$ elements, where $N_h$ refers to the number of the transmitters in the horizontal direction and $N_v$ refers to the number of the transmitters in the vertical direction. we shifted the position of the transmitting array by about $k N_h/2$ elements in the horizontal direction and $l N_v/2$ elements in the vertical direction, where $k, l = 0, 1, \dots$. fig. 5(b) illustrates the position of the shifted array for $k = l = 1$. we obtained the final image by superimposing the partial results obtained using different positions of the transmitting array.

fig. 5 example of the transmitting array in (a) its first position and (b) shifted position

fig. 6 and fig.
7 show the imaging results for $N_h = 4$ and $N_v = 2$, and for $N_h = 5$ and $N_v = 2$, respectively. in both cases, the location of the tumor was correct in the yz plane (horizontal). in the direction of the x-axis, there was an error of about 1 cm. the results in other cuts are several orders of magnitude smaller and they are caused by tissue ambiguity (i.e., false targets). numerical investigations showed that the positions of these artifacts varied for different values of $N_h$. in contrast, the location of the tumor did not change. this dissimilar behavior may be explained by the point-target nature of the tumor as opposed to the distributed nature of the tissue errors.

fig. 6 sparse imaging results obtained for $N_h = 4$ and $N_v = 2$ (panels: cut #1 to cut #6)

fig. 7 sparse imaging results obtained for $N_h = 5$ and $N_v = 2$ (panels: cut #1 to cut #6)

6. conclusion

we have proposed a 3d sparsity-based algorithm for differential microwave imaging of tumors inside a known inhomogeneous breast tissue. in contrast to the usual approach available in the literature, we have developed a model in which the trans-polarization due to the inhomogeneous breast tissue was fully taken into account. to check the robustness of the algorithm, we have considered the cases in which the breast tissue was only partially known. to reveal the true position of the tumor and suppress false targets, we applied the sparse processing scheme on different subarrays.

acknowledgement: this work was supported by the serbian ministry of science and education under the grant tr32005 and by the cost action td1301, mimed.

references

[1] s. semenov, "microwave tomography: review of the progress towards clinical applications", phil. trans. r. soc. a, vol. 367, pp. 3021–3042, 2009. [2] a. m. hassan, m.
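el-shenawee, "review of electromagnetic techniques for breast cancer detection", ieee rev. biomed. eng., vol. 4, pp. 103–118, 2011. [3] p. m. meaney, m. w. fanning, t. raynolds, c. j. fox, q. q. fang, c. a. kogel, s. p. poplack, and k. d. paulsen, "initial clinical experience with microwave breast imaging in women with normal mammography", acad. radiol., vol. 14, no. 2, pp. 207–218, february 2007. [4] s. p. poplack, k. d. paulsen, a. hartov, p. m. meaney, b. w. pogue, t. tosteson, m. grove, s. soho, and w. wells, "electromagnetic breast imaging-pilot results in women with abnormal mammography", radiology, vol. 243, pp. 350–359, 2007. [5] m. klemm, j. leendertz, a. w. preece, m. shere, i. j. craddock, and r. benjamin, "clinical experience of breast cancer imaging using ultrawideband microwave radar system at bristol", in proceedings of the ieee ap-s int. symp., toronto, on, canada, 2010, vol. 501.10. [6] p. m. meaney, d. goodwin, a. h. golnabi, t. zhou, m. pallone, s. d. geimer, g. burke, and k. d. paulsen, "clinical microwave tomographic imaging of the calcaneus: a first-in-human case study of two subjects", ieee trans. biomed. eng., vol. 59, no. 12, pp. 3304–3313, december 2012. [7] i. s. karanasiou, n. k. uzunoglu, and c. c. papageorgiou, "towards functional noninvasive imaging of excitable tissues inside the human body using focused microwave radiometry", ieee trans. microw. theory techn., vol. 52, no. 8, pp. 1898–1908, august 2004. [8] a. fhager and m. persson, "a microwave measurement system for stroke detection", in proceedings of the antennas and propagation conference (lapc), loughborough, uk, 2011, pp. 14–15. [9] r. scapaticci, l. di donato, i. catapano, and l. crocco, "a feasibility study on microwave imaging for brain stroke monitoring", progress in electromagnetics research b, vol. 40, pp. 305–324, 2012. [10] c. gilmore, a. abubakar, w. hu, t. m. habashy, and p. m. van den berg, "microwave biomedical data inversion using the finite-difference contrast source inversion method", ieee trans. antennas propag., vol. 57, no. 5, pp. 1528–1538, may 2009. [11] t. u. gürbüz, b. aslanyürek, a. yapar, h. şahintürk, i. akduman, "a nonlinear microwave breast cancer imaging approach through realistic body–breast modeling", ieee trans. antennas propag., vol. 62, no. 5, pp. 2596–2605, may 2014. [12] r. scapaticci, i. catapano, and l. crocco, "wavelet-based adaptive multiresolution inversion for quantitative microwave imaging of breast tissues", ieee trans. antennas propag., vol. 60, no. 8, pp. 3717–3726, august 2012. [13] j. d. shea, p. kosmas, s. c. hagness, and b. d. van veen, "three-dimensional microwave imaging of realistic numerical breast phantoms via a multiple-frequency inverse scattering technique", med. phys., vol. 37, no. 8, pp. 4210–4226, august 2010. [14] f. gao, b. van veen, and s. c. hagness, "contrast enhanced microwave imaging of breast tumors using sparsity regularization", in proceedings of the ieee antennas propag. soc. int. symp. (aps-ursi), chicago, il, 2012, pp. 8–14. [15] d. winters, b. van veen, and s. c. hagness, "a sparsity regularization approach to the electromagnetic inverse scattering problem", ieee trans. antennas propag., vol. 58, no. 1, pp. 145–154, january 2010. [16] m. nikolic stevanovic, l. crocco, a. djordjevic, and a. nehorai, "higher order sparse microwave imaging of pec scatterers", ieee trans. antennas propag., vol. 64, no. 3, march 2016. [17] m. azghani, p. kosmas, f. marvasti, "microwave medical imaging based on sparsity and an iterative method with adaptive thresholding", ieee trans. med. imag., vol. 34, no. 2, pp. 357–365, february 2015. [18] m. bevacqua, r. scapaticci, "a compressive sensing approach for 3d breast cancer microwave imaging with magnetic nanoparticles as contrast agent", ieee trans. med. imag., vol. 35, no. 2, pp. 665–673, february 2016.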
van den berg, "microwave biomedical data inversion using the finite-difference contrast source inversion method", ieee trans. antennas propag., vol. 57, no. 5, pp. 1528–1538, may 2009.
[11] t. u. gürbüz, b. aslanyürek, a. yapar, h. şahintürk, i. akduman, "a nonlinear microwave breast cancer imaging approach through realistic body–breast modeling", ieee trans. antennas propag., vol. 62, no. 5, pp. 2596–2605, may 2014.
[12] r. scapaticci, i. catapano, and l. crocco, "wavelet-based adaptive multiresolution inversion for quantitative microwave imaging of breast tissues", ieee trans. antennas propag., vol. 60, no. 8, pp. 3717–3726, august 2012.
[13] j. d. shea, p. kosmas, s. c. hagness, and b. d. van veen, "three-dimensional microwave imaging of realistic numerical breast phantoms via a multiple-frequency inverse scattering technique", med. phys., vol. 37, no. 8, pp. 4210–4226, august 2010.
[14] f. gao, b. van veen, and s. c. hagness, "contrast enhanced microwave imaging of breast tumors using sparsity regularization", in proceedings of the ieee antennas propag. soc. int. symp. (aps-ursi), chicago, il, 2012, pp. 8–14.
[15] d. winters, b. van veen, and s. c. hagness, "a sparsity regularization approach to the electromagnetic inverse scattering problem", ieee trans. antennas propag., vol. 58, no. 1, pp. 145–154, january 2010.
[16] m. nikolic stevanovic, l. crocco, a. djordjevic, and a. nehorai, "higher order sparse microwave imaging of pec scatterers", ieee trans. antennas propag., vol. 64, no. 3, march 2016.
[17] m. azghani, p. kosmas, f. marvasti, "microwave medical imaging based on sparsity and an iterative method with adaptive thresholding", ieee trans. med. imag., vol. 34, no. 2, pp. 357–365, february 2015.
[18] m. bevacqua, r. scapaticci, "a compressive sensing approach for 3d breast cancer microwave imaging with magnetic nanoparticles as contrast agent", ieee trans. med. imag., vol. 35, no. 2, pp. 665–673, february 2016.
[19] m. nikolic, j. dinkic, n. milosevic, and b. kolundzija, "sparse localization of tumors inside an inhomogeneous breast", in proceedings of the international conference on electromagnetics in advanced applications (iceaa), torino, it, 2015, pp. 1056–1059.
[20] d. m. malioutov, m. cetin, and a. s. willsky, "sparse signal reconstruction perspective for source localization with sensor arrays", ieee trans. signal process., vol. 53, no. 8, pp. 3010–3022, 2005.
[21] l. c. potter, e. ertin, j. t. parker, m. cetin, "sparsity and compressed sensing in radar imaging", proceedings of the ieee, vol. 98, no. 6, pp. 1006–1020, june 2010.
[22] e. porter, g. walls, y. zhou, m. popović, and j. d. schwartz, "a flexible broadband antenna and transmission line network for a wearable microwave breast cancer detection system", prog. electromagn. res. lett., vol. 49, pp. 111–118, 2014.
[23] e. porter, m. coates, and m. popović, "an early clinical study of time-domain microwave radar for breast health monitoring", ieee trans. biomed. eng., vol. 63, no. 3, pp. 530–539, march 2016.
[24] r. scapaticci, o. m. bucci, i. catapano, and l. crocco, "differential microwave imaging for brain stroke followup", int. j. antennas propag., vol. 2014, article id 312528, 11 pages, 2014.
[25] w. c. chew, waves and fields in inhomogeneous media, wiley-ieee press, february 1999.
[26] m. grant and s. boyd, cvx: matlab software for disciplined convex programming, http://stanford.edu/~boyd/cvx, june 2009.
[27] m. grant and s. boyd, graph implementations for nonsmooth convex programs, recent advances in learning and control (a tribute to m. vidyasagar), v. blondel, s. boyd, and h. kimura, editors, lecture notes in control and information sciences, springer, 2008, pp. 95–110.
[28] p. c. hansen and d. p. o'leary, "the use of the l-curve in the regularization of discrete ill-posed problems", siam j. sci. comput., vol. 14, no. 6, pp. 1487–1503, 1993.
[29] e. zastrow, s. k. davis, m. lazebnik, f. kelcz, b. d. van veen, s. c. hagness, database of 3d grid-based numerical breast phantoms for use in computational electromagnetics simulations.
[30] m. lazebnik, l. mccartney, d. popovic, c. b. watkins, m. j. lindstrom, j. harter, s. sewall, a. magliocco, j. h. booske, m. okoniewski, and s. c. hagness, "a large-scale study of the ultrawideband microwave dielectric properties of normal breast tissue obtained from reduction surgeries", phys. med. biol., vol. 52, pp. 2637–2656, april 2007.
[31] n. milosevic, m. nikolic, b. kolundzija, j. music, "numerical heterogeneous breast phantoms with different resolutions", in proceedings of the eucap, lisbon, pt, 2015.
[32] a. djordjević, d. olćan, m. stojilović, m. pavlović, b. kolundžija, d. tošić, "causal models of electrically large and lossy dielectric bodies", facta universitatis, series: electronics and energetics, vol. 27, no. 2, pp. 221–234, june 2014.
[33] http://www.wipl-d.com/

facta universitatis
series: electronics and energetics vol. 29, no 3, september 2016, pp. 339–355
doi: 10.2298/fuee1603339e

swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network

fatma h. elfouly 1, rabie a. ramadan 2, mohamed i. mahmoud 3, moawad i. dessouky 4

1 department of electronics and electrical communications, higher institute of engineering, el-shorouk academy, el-shorouk city, egypt
2 computer engineering department, cairo university, egypt
3 department of control engineering and industrial electronics, faculty of electronic engineering, menoufia university, menouf, egypt
4 department of electronics and electrical communications, faculty of electronic engineering, menoufia university, menouf, egypt

abstract. energy is an extremely crucial resource for wireless sensor networks (wsns).
many routing techniques have been proposed for finding the minimum energy routing paths with a view to extending the network lifetime. however, this might lead to an unbalanced distribution of energy among sensor nodes, resulting in the energy hole problem. therefore, designing an energy-balanced routing technique is a challenging area of research in wsns. moreover, dynamic and harsh environments pose great challenges to the reliability of wsns. to achieve reliable wireless communication within a wsn, it is essential to have a reliable routing protocol. furthermore, due to the limited memory resources of sensor nodes, full utilization of such resources with less buffer overflow remains one of the main considerations when designing a routing protocol for wsns. consequently, this paper proposes a routing scheme that uses swarm intelligence to achieve both minimum energy consumption and balanced energy consumption among sensor nodes for wsn lifetime extension. in addition, data reliability is considered in our model, so that the sensed data can reach the sink node in a more reliable way. finally, buffer space is considered to reduce the packet loss and the energy consumption due to the retransmission of the same packets. through simulation, the performance of the proposed algorithm is compared with previous work such as ebrp, aco, tadr, seb, and clb-routing.

key words: wsns; swarm intelligence; ant colony system (acs); energy balancing; reliability

received november 12, 2015
corresponding author: rabie a. ramadan, computer engineering department, cairo university, egypt (e-mail: rabie@rabieramadan.org)
*an earlier version of this paper was presented at the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, 2015 [1].

1. introduction

a wireless sensor network (wsn) is a wireless network consisting of a large number of small, inexpensive, battery-operated sensor nodes.
such nodes monitor physical or environmental conditions such as temperature and humidity, perform simple computation, and communicate via wireless multi-hop transmission to report the collected data to the sink node [2]. however, the nodes in a wsn have severe resource limitations in terms of energy, bandwidth, and storage. energy is an extremely crucial resource because it determines not only the lifetime of the sensor nodes, but the network lifetime as well [3]. in wsns, communication has been recognized as the major source of energy consumption and costs significantly more than computation [3][4]. consequently, most of the existing routing techniques in wsns attempt to find the shortest path to the sink to minimize energy consumption. as a result, energy consumption becomes highly unbalanced, which causes energy holes around the sink and a significant reduction in network lifetime. therefore, designing an energy-balanced routing technique plays a crucial role in wsns [5][6].

reliable data transmission is one of the most essential issues in wsns [7][8][9]. the loss of important information due to unexpected node failure or the dynamic nature of the wireless communication link [10] prevents the sensor network from achieving its primary purpose, which is data transfer. hence, routing techniques should give priority to reliable transmission. at the same time, it is critical to reduce packet loss in wsns, which will improve the network throughput and energy efficiency. due to memory constraints on sensor nodes, buffering a large number of packets is impossible. the resulting buffer overflow may cause information loss and more energy consumption due to the retransmission of the same packets, and such retransmission limits the network's lifetime and efficiency. consequently, it is highly necessary to consider buffer space when designing routing protocols for wsns [11].
in the last two decades, optimization techniques inspired by swarm intelligence have gained much popularity [12]. they mimic the swarm behaviour of social insects such as ants and bees, as well as the behaviour of other animal societies such as bird flocks and fish schools [12]. swarm intelligent systems are robust, scalable, and adaptable, and can efficiently solve complex problems, such as shortest path finding, through simple behaviour [13]. the ant colony system (acs) is considered one of the most important swarm intelligence techniques that can provide approximate solutions to optimization problems in a reasonable amount of computation time [12]. acs [14] was inspired by the food searching behaviour of real ants, which can be utilized to find the shortest path in wsns. unlike other routing approaches [15], the ant colony optimization meta-heuristic proposed in the literature for wsns is based only on local information of sensor nodes [16].

the problems of balancing energy consumption among sensor nodes and of reliable communication have received significant attention in recent years [17][18][19][20][21][22]. our contributions in this paper focus on: 1) reducing energy consumption for wsn lifetime extension, 2) balancing energy consumption among sensor nodes so as to maintain and balance their residual energy, 3) enhancing data reliability so that the sensed data can reach the sink node in a more reliable way, 4) taking into consideration the buffer space on sensor nodes to reduce dropped packets, which in turn conserves energy, and 5) introducing a swarm intelligence heuristic that jointly addresses energy reduction, reliability, load balancing, and minimizing the probability of buffer overflow.

the rest of this paper is organized as follows: section 2 introduces a brief summary of the related work.
section 3 introduces the problem description. then, section 4 describes the swarm based approach. section 5 provides the simulation results. finally, section 6 concludes the paper.

2. related work

this section focuses only on the work most closely related to the proposal of this paper. it starts by explaining the work presented in [5][23][24][25][26], which is the most closely related to our proposed approach, followed by the differences from our proposal.

[5] proposed an energy-balanced routing protocol (ebrp). the ebrp algorithm borrows the concept of potential in physics to construct a mixed virtual potential field in terms of depth, energy density, and residual energy. the depth field is used to route packets toward the sink node. the energy density field is essential to balance energy consumption, since the packets are driven through the dense energy area. finally, the residual energy field protects nodes with relatively low residual energy from dying.

[23] proposed an improved ant colony optimization (aco) routing for wsns. in this algorithm, an enhanced ant colony is used to optimize the node power consumption and prolong the network lifetime. the improved aco approach in [23] enhances an earlier aco-based approach in which the probability of selecting the next hop neighbour is determined by two heuristic functions. the first one is related to the quantity of pheromone, which is inversely proportional to the hop count, and the second depends on the residual energy of the neighbour nodes. the improvement in [23] adds accuracy to this choice, especially when the probabilities are equal, in which case the node chooses the next hop randomly. such a random choice might be wrong and lead to data loss in an uncovered area, or to packets travelling a long path to the sink. consequently, many nodes lose power due to bad choices, delivery is delayed, and the network lifetime may be reduced.
the improved aco approach adds new heuristic information to distinguish the best neighbour and avoid the use of wrong nodes. the new heuristic information is related to the energy of the neighbour node that has the sink in its collection field. such a neighbour node will have more chance to be chosen, because the packets will then definitely reach the sink node. however, only energy and pheromone are considered in the probabilistic rule when the sink is not in the neighbour node's field.

meanwhile, the analysis of the improved aco algorithm [23] and ebrp [5] shows that some issues are not considered, which are reflected as drawbacks. the first is network reliability: as discussed above, ignoring it might increase the packet loss and packet retransmissions, which affects the network efficiency. the second is the queue buffer size, which has a direct impact on network throughput and lifetime. the last is node load: nodes with heavy load and low residual energy should be prevented from being selected as a next hop in order to achieve energy balance over the whole network and relieve the energy hole problem. consequently, taking only residual energy into consideration, as in [5][23], is not sufficient to achieve balanced energy usage in the network.

[24] proposed a traffic aware dynamic routing (tadr) algorithm to route the packets around the congestion areas and scatter excessive packets along multiple paths consisting of idle or unloaded nodes. in this algorithm, a hybrid potential field is constructed in terms of depth and the normalized queue length. the depth field creates a backbone to forward packets toward the sink. the queue length field is used to prevent the packets from going to a possible congestion area. however, the tadr algorithm does not consider two critical issues, which is considered a drawback.
the first is energy balancing: as described above, ignoring it might lead to unbalanced energy consumption in the network, which causes energy holes around the sink and a significant reduction in network lifetime. the second issue is network reliability, which is one of the key issues in wsns due to the high dynamics, limited resources, and unstable channel conditions. ignoring it might deteriorate the network performance, as mentioned above.

[25] proposed a simple cross-layer balancing routing (clb-routing) that enhances the wsn lifetime by balancing the energy consumption in the forwarding task. the clb-routing protocol is a bottom-up approach, where the network layer uses information given by the mac layer in the choice of the next hop. the algorithm proposed in [25] operates in two phases. the first is initialization, where the sink node broadcasts a route request message containing a cost variable initialized to zero. each node receiving this message updates the cost field according to its residual energy and the energy required for communication between that node and the sender of the route request, and then broadcasts it. the second phase is data transmission, where the mac layer informs the network layer about all the overheard communications of the neighbouring nodes. with this information, a node can know how many times each forwarding node has routed data. according to this information, and to effectively balance the energy consumption of sensor nodes, a node chooses its next hop among the less-used ones. this choice is not random; it is made according to a probability that accounts for the residual energy, the energy of communication, and the number of times that each forwarding node has routed data. although clb-routing takes important issues into account, it lacks others such as network reliability and buffer size. this eventually affects the network throughput and lifetime, as described above.

[26] proposed a swarm intelligence based energy balance routing scheme (seb).
it utilizes swarm intelligence to maintain and balance residual energy on sensor nodes for wsn lifetime extension. the seb algorithm balances residual energy on sensor nodes as evenly as possible according to their weights. the node weight is related to the number of its neighbour nodes that may select it to relay their messages. the probability of selecting the next hop neighbour node is calculated according to residual energy, distance to the sink, weight of nodes, and the environment pheromone, which is related to path quality. nevertheless, the study of seb shows that it has some drawbacks, since some issues are not considered. the first is the packet buffer capacity of sensor nodes. as described above, ignoring it might increase the packet loss and packet retransmission, which inevitably affects the network efficiency. the second is the dynamic behaviour of the wireless link quality over time and space: in seb, the path quality is determined as a function of hop count, which can easily lead to the use of low-quality links and result in unreliable routes [27]. finally, the weight of nodes in that algorithm is calculated under the assumption that the environment events are distributed uniformly. this might be inefficient when the environment events are distributed non-uniformly.

the swarm algorithm proposed in this paper considers the end-to-end reliability of a multi-hop route based on the packet reception rate (prr), which is one of the most commonly used reliability metrics [28]. in this model, the reliability of the whole path from the next hop node to the sink is analyzed, and the relay node with the best prr is then chosen, which improves the end-to-end reliability of a multi-hop route.
moreover, the proposed algorithm can balance energy consumption among sensor nodes as evenly as possible through a new effective function of the nodes' residual energy and weight. in addition, a new weight definition is proposed in this algorithm to achieve balanced energy consumption for both uniform and non-uniform event distributions in the environment. it can also effectively alleviate buffer overflow by integrating the normalized buffer space into the routing choice. consequently, the local information in the proposed swarm solution refers to each neighbour's residual energy, weight, normalized buffer space, transmission distance, and pheromone. finally, a new pheromone update operator is designed to integrate energy, path length, and path quality into the routing choice.

3. problem description

consider a static multi-hop wsn deployed in the sensing field. in this model, we aim to achieve a reliable routing algorithm taking into consideration the nodes' energy consumption, energy balancing among sensor nodes, and the nodes' buffer space. the wireless sensor network can be modelled as a random geometric graph, g(v,l), where v denotes the set of sensor nodes, which are distributed randomly in the square monitoring field, and l represents the set of all communication links (i, j), where i, j ∈ v. link (i, j) exists if and only if nodes i and j are within radio range of each other. the events in the environment will be detected by some sensor nodes, which are called source nodes. we assume that the mac layer provides a link quality estimation service, e.g., the prr information on each link [29], so that each node is aware of the prr values to its one-hop neighbours. the information regarding the presence of the detected events at each source node should be reported to the sink node. since wsns are usually based on multi-hop transmission, the source nodes send their data to the sink through intermediate sensor nodes, which act as relay nodes.
the chosen path from each source node to the sink should be the best path that satisfies the following constraints: 1) its communication cost is low, 2) its reliability is greater than or equal to a target value, 3) the sensor nodes on that path have the maximum value, compared with their neighbours, of a newly proposed function of the residual energy and weight, in order to balance energy consumption among sensor nodes, and 4) the sensor nodes have a buffer space greater than or equal to the message size, in order to reduce the packet loss and the energy consumption due to retransmission of the same packets as a result of buffer overflow. to simplify the description of the problem and its formulation, the notations used to model the problem are given in table 1.

table 1 our model notations (given parameters)

notation    description
s           the set of all sensor nodes that are in the sensing or sensing-relaying state.
r           the set of all sensor nodes that are in the relaying state, except the sink node.
prr         the set of packet reception ratios prr(i,j) associated with link (i, j).
wq          constant value less than or equal to 1.
re_j        the residual energy of each sensor node j, j ∈ neb_i, neb_i ⊆ s ∪ r
se(i,j)     the energy required for a single hop transmission from i to j, (i, j) ∈ l
mes_i       the number of messages at node i, i ∈ s ∪ r
w_j         the weight of a neighbor j, j ∈ neb_i, neb_i ⊆ s ∪ r
ewr_j       the residual energy to weight ratio for each neighbor node j, j ∈ neb_i, neb_i ⊆ s ∪ r
enc_j(t)    the ratio of residual energy to initial energy for each neighbor node j at time t, j ∈ neb_i, neb_i ⊆ s ∪ r ∪ {sink}
pz          the packet size.
bs_j(t)     buffer space in node j at time t, j ∈ neb_i, neb_i ⊆ s ∪ r ∪ {sink}
bm_j(t)     the normalized buffer space of node j at time t, j ∈ neb_i, neb_i ⊆ s ∪ r ∪ {sink}
nre_j       the ratio of re_j to se(i,j) for each neighbor node j, j ∈ neb_i, neb_i ⊆ s ∪ r
neb_i       the set of neighbors of node i, neb_i ⊆ s ∪ r ∪ {sink}

4.
swarm based approach

this section describes the details of the proposed swarm technique for energy-balanced and reliable routing in wsns. it states the different parts of the proposed scheme, including the routing scheme, local heuristic information computation, pheromone computation, and neighbour node selection.

the proposed swarm solution is composed of two phases. the first phase starts with a set of forward ants placed in the source nodes, which move through neighbour relay nodes until they reach the sink node. in this algorithm, residual energy, weight, normalized buffer space, transmission distance, and pheromone are considered when calculating the probability of transferring a packet to the next hop neighbour. at each node i, a forward ant k selects the next hop node j, $j \in neb_i$, randomly with a probability $p_r^k(i,j)$ determined as follows:

$$p_r^k(i,j) = \frac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}\,[\psi_{ij}(t)]^{\gamma}\,[\varepsilon_{ij}(t)]^{\lambda}\,[\delta_{ij}(t)]^{\phi}}{\sum_{l \in neb_i} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}(t)]^{\beta}\,[\psi_{il}(t)]^{\gamma}\,[\varepsilon_{il}(t)]^{\lambda}\,[\delta_{il}(t)]^{\phi}} \qquad (1)$$

where $\tau_{ij}(t)$ is the pheromone value on the link (i,j) at time t; $\eta_{ij}(t)$, $\psi_{ij}(t)$, $\varepsilon_{ij}(t)$, and $\delta_{ij}(t)$ are the heuristic information of link (i,j) for node j; and α, β, γ, λ, and ϕ are the weight factors that control the pheromone value and the heuristic information parameters, respectively. when forward ant k reaches the sink node, it is transformed into a backward ant and the second phase starts. the backward ant starts from the sink node and moves towards its source node along the same path in the opposite direction, depositing an increment of pheromone on that path.

4.1. problem formulation

due to the use of the multi-hop routing technique, the information about the detected events at each source node should be transmitted as messages to the sink node through intermediate nodes or relay nodes.
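the forward-ant selection rule of equation (1) can be illustrated with a short python sketch. it is only an illustration under stated assumptions, not the authors' simulator: the function names, the dictionary layout (one tuple per neighbour holding the pheromone value followed by four heuristic values), and the uniform fallback used when every score is zero are all hypothetical choices made here.

```python
import random


def next_hop_probabilities(neighbours, exponents):
    """Normalised product rule in the style of equation (1).

    neighbours: {node_id: (pheromone, h1, h2, h3, h4)}, all values >= 0.
    exponents:  five weight factors, applied in the same order as the values.
    """
    scores = {}
    for node, values in neighbours.items():
        score = 1.0
        for value, exponent in zip(values, exponents):
            score *= value ** exponent
        scores[node] = score
    total = sum(scores.values())
    if total == 0.0:
        # Assumption (not from the paper): pick uniformly if no neighbour scores.
        return {node: 1.0 / len(scores) for node in scores}
    return {node: score / total for node, score in scores.items()}


def choose_next_hop(neighbours, exponents, rng=random):
    """Draw one neighbour at random according to the probabilities above."""
    probabilities = next_hop_probabilities(neighbours, exponents)
    nodes = list(probabilities)
    weights = [probabilities[node] for node in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]
```

for instance, with two neighbours whose heuristic values are all 1, pheromone values of 2 and 1, and a pheromone exponent of 2, the scores are 4 and 1, so the neighbours are selected with probabilities 0.8 and 0.2.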
in order to achieve energy-balanced routing, a node with heavy weight and low residual energy should be prevented from being selected as a next hop. the proposed algorithm therefore considers a model in which the residual energy and weight of a sensor node are combined through a new function when choosing the relay node. we start with the computation of the weight of a neighbour j at time t by equation (2).

$$we_j(t) = \sum_{i \in neb_j} c_i(t), \qquad c_i(t) = \begin{cases} mes_i(t) & \text{if } hc_i \ge hc_j \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

because the events detected in the monitored environment are distributed non-uniformly, the node weight is defined as the total number of messages at its neighbour nodes that may choose it to relay their messages. equation (2) means that packets are not allowed to be transmitted backward to neighbours with a higher hop count $hc$. this strategy ensures that the packets are forwarded closer toward the sink and prevents the formation of loops. in addition, the new function that combines residual energy and weight for each node j at time t is defined by equation (3):

[equation (3): piecewise definition of $ewr_j(t)$ in terms of $\exp(enc_j(t))$, $nre_j(t)$, and $we_j(t)$, with a separate case for $we_j(t) = 0$; the equation is not legible in the source] (3)

because each relay node forwards the messages of source nodes toward the sink, it needs to hold the incoming data packets in a buffer during the processing time required for the previous ones. since the sensor nodes have limited memory, it is impossible to buffer a large number of packets. consequently, the buffer of the relay node may start overflowing, resulting in the loss of important packets and more energy consumption due to the retransmission of the same packets [30].
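the weight definition of equation (2) can be sketched in a few lines of python. this is a minimal sketch, assuming that the comparison lost in the printed equation is "hop count of the neighbour greater than or equal to the hop count of node j", which is how the surrounding text (no backward transmission to neighbours with a higher hop count) is read here. the function name and the three dictionaries are hypothetical.

```python
def node_weight(j, neighbours, messages, hop_count):
    """Weight of node j: messages waiting at neighbours that may relay via j.

    neighbours: {node: iterable of neighbouring nodes}
    messages:   {node: number of messages currently held at that node}
    hop_count:  {node: hop count from that node to the sink}
    """
    return sum(
        messages[i]
        for i in neighbours[j]
        # Assumed condition: i may forward to j only if j is not farther
        # from the sink than i itself.
        if hop_count[i] >= hop_count[j]
    )
```

with neighbours a, b, and c of node j holding 3, 2, and 5 messages, hop counts of 3, 2, and 1, and node j at hop count 2, only a and b may relay through j, so its weight is 5.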
for efficient use of the available buffer, we consider a model in which the probability of buffer overflow is minimized as much as possible by integrating the normalized buffer space into the routing choice. the normalized buffer space is defined as the ratio between the buffer space and the packet size. it expresses the number of packets that a sensor node can receive at a certain time without its buffer overflowing. the normalized buffer space of node j at time t is defined as follows:

$$bm_j(t) = \begin{cases} \dfrac{bs_j(t)}{pz} & \text{if } bs_j(t) \ge pz \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

4.2. calculation of local heuristic information

in order to maintain higher and balanced residual energy on sensor nodes, the proposed relation between residual energy and weight is used as heuristic information when selecting the next hop neighbour node, denoted by $\eta_{ij}(t)$.

$$\eta_{ij}(t) = \frac{ewr_j(t)}{\sum_{l \in neb_i} ewr_l(t)} \qquad (5)$$

according to this rule, a node with a greater value of $\eta_{ij}$ has a higher residual energy compared to its weight and a much better opportunity to be chosen as a next hop.

since energy conservation is an essential issue in wsns, selecting the nodes with minimum hop count is required to minimize energy consumption and conserve as much energy as possible. therefore, the hop count from neighbour node j to the sink node is used as heuristic information, denoted by $\psi_{ij}(t)$.

$$\psi_{ij}(t) = \frac{(hc_i - hc_j) + 1}{\sum_{l \in neb_i} \left[(hc_i - hc_l) + 1\right]} \qquad (6)$$

a neighbour node that has a greater value of $\psi_{ij}(t)$ is closer to the sink than the others and is more likely to be chosen as the next hop.

in order to avoid or reduce packet loss due to buffer overflow, which in turn improves the overall network performance, it is critical to send packets to the sensor node with more buffer space or less traffic load.
therefore, $bm_j(t)$ can be used as heuristic information, denoted by $\varepsilon_{ij}(t)$.

$$\varepsilon_{ij}(t) = \frac{bm_j(t)}{1 + \sum_{l \in neb_i} bm_l(t)} \qquad (7)$$

this rule enables decision making according to the buffer space on the neighbour nodes: a node with a greater value of $\varepsilon_{ij}(t)$ has a much better opportunity to be chosen as the next hop.

due to the dynamic behaviour of the wireless link quality over time and space, it is essential to use the current packet reception ratio of link (i,j), $prr_{ij}$, as heuristic information to improve the network throughput. it is denoted by $\delta_{ij}(t)$.

$$\delta_{ij}(t) = \frac{prr_{ij}}{\sum_{l \in neb_i} prr_{il}} \qquad (8)$$

a greater value of $\delta_{ij}(t)$ indicates that the link (i,j) is more reliable than the others, so the neighbour node j has more chance to be chosen as the next hop.

4.3. pheromone calculation

in this algorithm, the pheromone concentration is affected by a combination of energy, path length, and path quality in a new effective form. this may improve network reliability, reduce energy consumption, and achieve more balanced transmission among the nodes. we begin with the calculation of the path quality, $q_p$, which is related to the prr by equation (9).

$$q_p = prr_p \qquad (9)$$

where $prr_p$ represents the packet reception ratio of the path p. due to the use of multi-hop routing, $prr_p$ can be computed from the prr of each hop on the path p as follows:

$$prr_p = \prod_{(i,j) \in n_p} prr_{ij} \qquad (10)$$

where $n_p$ is the set of edges on the path p (its size is the hop count). in this model, all nodes have the same fixed transmission range, so the number of hops in the path p is considered as the path length, $l_p$:

$$l_p = n_p \qquad (11)$$

by estimating the length of each possible path for the same source node, the best path length $l_{pbest}$ is recorded at the sink.
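as a small worked example of equation (10), the end-to-end packet reception ratio of a path is the product of its per-hop values. the sketch below assumes that the per-hop prr values are available as a list of numbers between 0 and 1; the function name is hypothetical.

```python
import math


def path_prr(hop_prrs):
    """End-to-end packet reception ratio: product of the per-hop PRR values."""
    return math.prod(hop_prrs)
```

three hops with a prr of 0.9 each give an end-to-end value of about 0.729, while two hops with a prr of 0.7 each give only about 0.49. a longer path over good links can therefore be more reliable than a shorter path over poor links, which is why hop count alone is not used as the path quality.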
then, the relative length of path p can be determined as follows:

$$rl_p = \frac{l_{pbest}}{l_p} = \frac{n_{pbest}}{n_p} \qquad (12)$$

the increase in pheromone density on the path p is defined as follows:

$$\Delta\tau_{ij} = (rl_p \times prr_p^{\,w_1})^{w_2} \times (e^{p}_{\min})^{w_3} / n_p^{2} \qquad (13)$$

where $e^{p}_{\min}$ is the minimum residual energy of the nodes visited by ant k, and the parameters $w_1$, $w_2$, and $w_3$ determine the relative influence of the energy, path length, and path quality. the sink node constructs the value of the pheromone update operator, $\Delta\tau_{ij}$, and sends it back as a backward ant to its source node along the reverse path. whenever a node i receives a backward ant k coming from a neighbouring node j, it updates its pheromone concentration according to the following rule:

$$\tau_{ij}(t) = (1 - \rho)\,\tau_{ij}(t-1) + \Delta\tau_{ij} \qquad (14)$$

where $\rho \in (0,1)$ is the evaporation constant that determines the evaporation rate of the pheromone [26].

5. performance evaluation

in this section, different experiments are conducted to evaluate the performance and validate the effectiveness of our proposal. the section starts by describing the performance metrics, followed by the simulation environment and finally the simulation results.

5.1. performance metric

for a comprehensive performance evaluation, the quantitative metrics considered are defined below.

1. network lifetime [5]. it is defined as the time duration from the beginning of the network operation until the first node exhausts its battery.

2. energy imbalance factor (eif) [5]. it quantifies the energy balance characteristic of the routing protocol and is defined formally as the variance of the residual energy of all nodes:

$$eif = \frac{1}{n} \sum_{i=1}^{n} (re_i - re_{avg})^2$$

where n is the total number of sensor nodes, $re_i$ is the residual energy of node i, and $re_{avg}$ is the average residual energy of all nodes.

3. throughput ratio (tr) [25]. this metric is defined as:

$$tr = \frac{\text{number of packets received by the sink}}{\text{number of packets sent by source nodes}}$$

4.
average end-to-end delay (seconds) [30]: it is defined as the average time a packet takes to travel from the source node to the sink node. this includes propagation, transmission, queuing, and processing delay. the processing delay can be ignored as a result of fast processing speed [31].

5.2. simulation environment

in this paper, the simulation environment consists of 80 sensor nodes deployed randomly in a field of 1000 m x 1000 m. the sink node and the sensor nodes are stationary after being deployed in the field. furthermore, the sink node is located at (1000, 500) m. all the later experiments are done for both homogeneous and heterogeneous node energy distributions on a custom matlab simulator. data traffic is generated according to a poisson process with mean parameter ζ. in addition, we choose a harsh wireless channel model, which includes shadowing and deep fading effects, as well as noise [32]. in this simulation, the case of the chipcon cc2420 radio transceiver is taken into consideration [1]. the simulation parameters are listed in table 2. in the later experiments, we use the combination (α = 2, β = 2, γ = 1, λ = 1, and ϕ = 12); the evaluation results show that this combination is the best for all experiments.

table 2 simulation environment parameters

parameter                      value
network size                   1000×1000
number of nodes                80
number of sink nodes           1
node placement                 random uniform
packet size                    64 bytes
frequency                      2400 mhz
transmission power             -5 dbm
maximum transmission range     223 m
channel model                  log-normal shadow
path loss exponent             6
shadow fading variance         6
noise power                    -145 dbm
reference distance             3 m

5.3. simulation results

to verify the feasibility and effectiveness of our proposal, its performance is compared in terms of network lifetime, energy imbalance factor, and throughput ratio with the protocols proposed in [5][23][24][25][26] for homogeneous and heterogeneous networks.
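the energy imbalance factor and the throughput ratio defined in section 5.1 can be sketched in a few lines. this is only an illustration of the two definitions, assuming the eif is the population variance of the residual energies; the function names and the sample numbers are hypothetical and do not come from the authors' matlab simulator.

```python
def energy_imbalance_factor(residual_energy):
    """EIF: variance of the residual energy over all sensor nodes."""
    n = len(residual_energy)
    avg = sum(residual_energy) / n
    return sum((re - avg) ** 2 for re in residual_energy) / n


def throughput_ratio(received_by_sink, sent_by_sources):
    """TR: packets received by the sink over packets sent by source nodes."""
    return received_by_sink / sent_by_sources


# Hypothetical snapshot: residual energies in mJ and packet counters.
print(energy_imbalance_factor([100.0, 110.0, 120.0]))
print(throughput_ratio(45, 50))
```

a lower eif means the residual energies are closer to their average, which is the sense in which the later experiments read a small eif as good energy balance.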
we implemented all of the algorithms in [5][23][24][25][26].

5.3.1. network lifetime evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of network lifetime for both homogeneous and heterogeneous networks and compared to ebrp [5], the aco proposed in [23], tadr [24], clr-routing [25], and seb [26] under different traffic rates σ. the initial energy of each sensor node is 125 mj for the homogeneous network, while it is chosen randomly between 100 and 125 mj for the heterogeneous network. fig. 1 and fig. 2 show the variation of network lifetime with respect to the traffic rate σ for homogeneous and heterogeneous networks respectively. from the figures it can be seen that as the value of σ increases, the network lifetime decreases. since the network traffic increases with σ, the relay load of the nodes increases linearly, which is the main reason behind the decrease in lifetime. however, the figures show clearly that our swarm algorithm significantly extends the network lifetime compared with the others for both homogeneous and heterogeneous networks. this means that our swarm algorithm balances the network energy consumption more effectively than the others.

fig. 1 network lifetime vs. traffic rate σ for homogeneous network
fig. 2 network lifetime vs. traffic rate σ for heterogeneous network

5.3.2. network reliability evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of tr for both homogeneous and heterogeneous networks and compared to ebrp [5], the aco proposed in [23], tadr [24], clr-routing [25], and seb [26] under different traffic rates σ. the initial energy of each sensor node is 125 mj for the homogeneous network, while it is chosen randomly between 100 and 125 mj for the heterogeneous network.
the tr against the traffic rate σ for homogeneous and heterogeneous networks is depicted in fig. 3 and fig. 4 respectively. clearly, our swarm algorithm achieves the highest tr compared to the others. this is because it forwards the data packets toward the sink in a more reliable way and alleviates possible buffer overflow.

fig. 3 network throughput vs. traffic rate σ for homogeneous network
fig. 4 network throughput vs. traffic rate σ for heterogeneous network

5.3.3. energy balancing evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of energy balance for both homogeneous and heterogeneous networks and compared to ebrp [5], the aco proposed in [23], tadr [24], clr-routing [25], and seb [26]. the initial energy of each sensor node is 125 mj for the homogeneous network, while it is chosen randomly between 100 and 125 mj for the heterogeneous network. in this set of experiments, it is assumed that the traffic rate σ equals 5. the eif was calculated during the running time to assess how well the network is balanced. fig. 5 and fig. 6 present the variation of the eif over simulation time for homogeneous and heterogeneous networks respectively. as shown in the figures, the eif increases with running time. the growth of the eif is due to the heavier use of the sink node's neighbours compared to the other nodes, which reduces the average residual energy. however, according to the results in fig. 5 and fig. 6, the eif of our swarm algorithm is the lowest among all the compared protocols. this means that in our swarm algorithm the energy of all nodes in the network stays close to the average energy, in contrast to the others. that is to say, our swarm algorithm can balance residual energy among sensor nodes efficiently.

fig. 5 the eif vs.
simulation time for homogeneous network
fig. 6 the eif vs. simulation time for heterogeneous network

5.3.4. average end-to-end delay evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of end-to-end delay for both homogeneous and heterogeneous networks and compared to ebrp [5], the aco proposed in [23], tadr [24], clr-routing [25], and seb [26] under different traffic rates σ. the initial energy of each sensor node is 125 mj for the homogeneous network, while it is chosen randomly between 100 and 125 mj for the heterogeneous network. fig. 7 and fig. 8 show the average end-to-end delay under different traffic rates σ for homogeneous and heterogeneous networks respectively. from the results, it is observed that the end-to-end delay increases as the traffic rate increases. a higher traffic rate causes more queuing delay, which raises the end-to-end delay. however, it is clear that our swarm approach gives the lowest end-to-end delay compared with the others. this is because our swarm approach forwards the data packets toward the sink in a more reliable way and alleviates possible buffer overflow, which decreases packet loss and retransmissions and hence the end-to-end delay.

fig. 7 average end-to-end delay vs. traffic rate σ for homogeneous network
fig. 8 average end-to-end delay vs. traffic rate σ for heterogeneous network

6. conclusions

in this work we presented an efficient routing algorithm that uses swarm intelligence for wsns. the proposed approach not only reduces the energy consumption but also balances it among sensor nodes to extend the wsn lifetime. at the same time, the sensed data is delivered to the sink with the highest possible reliability and minimum buffer overflow. the performance of the proposed method was evaluated and analyzed through simulation and compared with previous works related to our topic, namely ebrp, aco, tadr, seb, and clr-routing. simulation results showed that our approach is robust, achieves a longer lifetime, and gives a lower end-to-end delay than the previous works for both homogeneous and heterogeneous networks.

references

[1] f. elfouly, r. ramadan, m. mahmoud, m. dessouky, “swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network”, in proceedings of the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, november 2015.
[2] “micaz wireless module.” [online]. available: http://www.cmt-gmbh.de/micaz.pdf.
[3] h. m. ammari, “challenges and opportunities of connected k covered wireless sensor networks-from sensor deployment to data gathering,” springer, 2009.
[4] g.j. pottie and w.j. kaiser, “wireless integrated network sensors,” communications of acm, vol. 43, no. 5, pp. 51-58, 2000.
[5] f. ren, j. zhang, t. he, c. lin, and s. k. das, “ebrp: energy-balanced routing protocol for data gathering in wireless sensor networks,” ieee trans. on parallel and distributed systems, vol. 22, no. 12, december 2011.
[6] x. liu, “a transmission scheme for wireless sensor networks using ant colony optimization with unconventional characteristics,” ieee communications letters, vol. 18, no. 7, pp. 1214-1217, 2014.
[7] g. campobello, a. leonardi, and s. palazzo, “improving energy saving and reliability in wireless sensor networks using a simple crt-based packet-forwarding solution,” ieee/acm transactions on networking, vol. 20, no. 1, pp. 191–205, 2012.
[8] a. zonouz, l. xing, v. vokkarane, and y. sun, “reliability-oriented single-path routing protocols in wireless sensor networks,” ieee sensors journal, vol. 14, no. 11, pp. 4059-4068, june 2014.
[9] j. niu, l. cheng, y. gu, l. shu, and s.
das, “r3e: reliable reactive routing enhancement for wireless sensor networks,” ieee transactions on industrial informatics, vol. 10, no. 1, pp. 784–794, 2014.
[10] a. m. kamal, c. j. bleakley, and s. dobson, “failure detection in wireless sensor networks: a sequence-based dynamic approach,” acm transaction on sensor networks (tosn), vol. 10, 2014.
[11] f. viani, p. rocca, m. benedetti, g. oliveri, and a. massa, “electromagnetic passive localization and tracking of moving targets in a wsn-infrastructured environment,” inverse problems, vol. 26, no. 074003, pp. 1-15, 2010.
[12] ch. blum, d. merkle, “swarm intelligence introduction and applications,” natural computing series, springer, berlin, 2008.
[13] r. r. mccune and g. r. madey, “control of artificial swarms with dddas,” in proceedings of the 14th international conference on computational science (iccs), elsevier, vol. 29, pp. 1171-1181, 2014.
[14] a. r. sardar, m. singh, r. r. sahoo, k. majumder, j. k. sing, and s. k. sarkar, “an efficient ant colony based routing algorithm for better quality of services in manet,” ict and critical infrastructure: in proceedings of the 48th annual convention of computer society of india-vol i, advances in intelligent systems and computing, springer lncs, vol. 248, pp. 233-240, 2014.
[15] p. rocca, m. benedetti, m. donelli, d. franceschini, and a. massa, “evolutionary optimization as applied to inverse problems,” inverse problems, 25th year special issue of inverse problems, invited topical review, vol. 25, pp. 1-41, dec. 2009.
[16] m. gunes, u. sorges, i. bouazzi, “ara-the ant-colony based routing algorithm for manets,” international workshop on ad hoc networking, pp. 79-85, 2002.
[17] d. zhang, g. li, and k. zheng, “an energy-balanced routing method based on forward-aware factor for wireless sensor network”, ieee trans. on industrial informatics, vol. pp, no. 99, 2013, pp. 1.
[18] w. jianguo, w. zhongsheng, s. fei, and s.
guohua, “research on routing algorithm for wireless sensor network based on energy balance”, in proceedings of the industrial control and electronics engineering (icicee '12), 2012, pp. 295-298.
[19] a. m. s. almshreqi, b. f. a. rasid, a. ismail, and p. varahram, “an improved routing mechanism using bio-inspired for energy balancing in wireless sensor networks”, in proceedings of the information network (icoin '12), 2012, pp. 150-153.
[20] k. yu, m. gidlund, j. akerberg, and m. bjorkman, “reliable rss-based routing protocol for industrial wireless sensor networks”, in proceedings of the 38th annual conference of the ieee industrial electronics society (iecon), canada, october, 2012.
[21] j. niu, l. cheng, y. gu, l. shu, s.k.
das, “r3e: reliable reactive routing enhancement for wireless sensor networks”, ieee trans. on industrial informatics, vol. pp, no. 99, 2013, pp. 1.
[22] d. sahin, s. bulbul, v.c. gungor, t. kocak, “reliable routing in wireless sensor networks for smart grid environments”, in proceedings of the 20th ieee conf. on signal processing and communications applications (siu), 2012, pp. 1-4.
[23] a. el ghazi, b. ahiod, and a. ouaarab, “improved ant colony optimization routing protocol for wireless sensor networks,” in p. g. noubir and m. raynal (eds.): netys 2014, pp. 246-256, springer, heidelberg, 2014.
[24] f. ren, s. k. das, and c. lin, “traffic-aware dynamic routing to alleviate congestion in wireless sensor networks,” ieee transactions on parallel and distributed systems, vol. 22, no. 9, september 2011.
[25] s. yaessad, l. bouallouche, and d. aissani, “a cross-layer routing protocol for balancing energy consumption in wireless sensor networks,” wireless pers. commun., springer, 2014.
[26] d. qian, h. chen, w. wu, and l. cheng, “swarm intelligence based energy balance routing for wireless sensor networks”, in proceedings of the 2nd international symposium on intelligent information technology application, vol. 2, pp. 811-815, 2008.
[27] x. baoshu, and w. hui, “a reliability transmission routing metric algorithm for wireless sensor network”, in proceedings of the ieee international conference e-health networking, digital ecosystems and technologies (edt), vol. 1, pp. 454-457, 2010.
[28] s. b. kootkar, “reliable sensor networks”, m.s. thesis, dept. comp. eng., tu delft univ., delft, netherlands, 2008.
[29] l. cheng, j. niu, j. cao, s. k. das, and y. gu, “qos aware geographic opportunistic routing in wireless sensor networks”, ieee trans. on parallel and distributed systems, 2014.
[30] g. s. sharvani, n. k. cauvery, t. m.
rangaswamy, “different types of swarm intelligence algorithm for routing,” in proceedings of the ieee international conference on recent technologies in communication and computing (artcom), kottayam, kerala, india, pp. 604-609, 2009.
[31] v. k. verma, s. singh, and n. p. pathak, “analysis of scalability for aodv routing protocol in wireless sensor networks,” optik—international journal for light and electron optics, vol. 125, no. 2, pp. 748–750, 2014.
[32] d. jian, “cloud model and ant colony optimization based qos routing algorithm for wireless sensor networks,” y. wu (ed.): international conference on wtcs 2009, aisc 116, pp. 179–187, springer, heidelberg, 2012.

facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 25-39
doi: 10.2298/fuee1401025p

determination of actual reduction factor of hv and mv cable lines passing through urban and suburban areas

ljubivoje m. popović
school of electrical engineering, beograd, serbia

abstract: the paper presents a method for determining the ground fault current distribution in cases when feeding cable lines pass through urban and/or suburban areas, or when many relevant data are uncertain or completely unknown. the problem appears as a consequence of the fact that many of the surrounding urban installations are situated under the surface of the ground and cannot be visually determined or verified. on the basis of on-site measurements, the developed method compensates for all deficiencies in the relevant data about metal installations exposed to the fluctuating magnetic field that appears along and around a power line during an unbalanced fault.
the presented analytical procedure is based on the fact that certain measurable quantities cumulatively involve the inductive effects of all surrounding metal installations, known and unknown. the performed quantitative analysis points to the significance of taking into account the existence of surrounding metal installations.

key words: substation, grounding system, ground fault current, inductive influence

1. introduction

the fault current during an earth fault in a power network has at least two alternative paths for returning to the source which feeds the fault. because of that, each ground fault current has at least two fractions. one of them is injected into the surrounding earth through the grounding system of a supplied substation, while the other returns to the source through the neutral conductor(s) of the feeding line [1]. the first one produces all potentials and potential differences (touch and step voltages) relevant for the safety conditions on the grounding system of a supplied substation, while the other causes the thermal stress on the neutral conductor(s) of the feeding line. thus, the correct estimation of the ground fault current distribution is of prime importance for the correct design of the grounding system of a supplied substation and the correct selection of the feeding line neutral conductor(s).

received december 24, 2013
corresponding author: ljubivoje m. popović, school of electrical engineering, beograd, serbia (e-mail: ljubivoje@beotel.net)

with the aim of defining the influence of available return paths on the ground fault current distribution, a special parameter of the feeding line is introduced in the professional literature, including technical standards [2]. this parameter is called the reduction factor of the feeding line and is defined as the ratio of the part of the ground fault current returning through earth to the total ground fault current.
by this, it is assumed that the grounding impedances at both line ends are negligible (e.g. [2]). under such an assumption the fault current in the line neutral conductor(s) is a consequence solely of inductive coupling between this/these conductor(s) and the phase conductor through which the total ground fault current passes. with the aim of solving the problem of determining the ground fault current distribution, extensive and continuous research work has been done over at least the last five decades. the first methods developed for solving this problem relate to the case when the feeding line is constructed as an overhead line, e.g. [3]-[5]. somewhat later, papers considering this problem in special cases, i.e. when a feeding line appears as a longitudinal combination of one cable and one overhead section, were published, e.g. [6], [7]. also, several methods have been developed for solving the problem in cases where, because of high local soil resistivity and/or a high short-circuit level, special measures (e.g. a bare copper conductor laid in the same trench as the cable returning line, counterpoises, etc.) aimed at reducing the part of the ground fault current returning exclusively through the earth are considered indispensable [8]-[10]. then, the papers [11], [12] present the method developed for determining the ground fault current distribution when a feeding cable line is constructed of three single-core cables. the method takes into account the participation of all three metal sheaths in the ground fault current distribution for a fault at any place along the line. the common characteristic of the mentioned methods is that the value of the reduction factor depends only on the design/constructive characteristics of a feeding line and on the characteristics of the surrounding earth as a conductive medium.
thus, none of the mentioned methods enables obtaining the solution of this problem when it appears in urban and suburban conditions, when many other metal installations participate in returning the ground fault current to the power system. in such surroundings each power cable line represents a very complex electrical circuit with many conductively and inductively coupled elements having uncertain or completely unknown parameters. the problem of determining the influence of metal installations surrounding a feeding cable line on the ground fault current distribution through the grounding system of a supplied substation is considered and solved for the first time in [14]. the solution is achieved by substituting all surrounding metal installations by one conductor of cylindrical form, equivalent from the standpoint of the ground fault current distribution, placed around and along each considered cable line. somewhat later, the achieved solution was extended to include overhead distribution lines [15]. the investigation results presented in [14], [15] show that the part of the ground fault current flowing through the earth, in typically urban environments, is three to five times smaller than had been considered earlier. the developed method enables a new insight into the whole grounding problem of urban hv/mv substations and dramatically changes our perception of the magnitude of this problem. it can be seen in a realistic framework and solved in each concrete case without any redundant expenditure. the developed method simultaneously gives the possibility of solving another problem of current engineering practice that has also not been solved in the past. this is the problem of determining the feeding line series impedance without ignoring the fact that the surrounding metal installations are present.
namely, on the basis of the imagined physical appearance of the introduced equivalent conductor, it is not difficult to see that this conductor acts as an additional neutral conductor of each distribution line passing through urban and suburban areas and, in accordance with this, improves its transfer characteristics [16]. this paper presents the theoretical foundations of the mentioned method in a more transparent and complete manner and introduces a certain improvement in the development procedure of the computational part of the method. this improvement enables a more direct and easier determination of the actual part of the ground fault current that is dissipated through the grounding system of the supplied substation into the surrounding earth and returns to the power system only through the earth.

2. problem description

the increasing sizes of modern distribution networks, as well as the higher operating and short-circuit currents of these networks, have been matched by spreading networks of earth return circuits (different pipelines, different cable and overhead line neutrals, etc.) close to hv and mv distribution lines. the spatial dispositions of all of these installations, determined mainly by the dispositions of city streets, and their small mutual distances result in an inductive and, in the vicinity of substations, conductive coupling of different network types. the usage of common routes (mainly street pavements) for positioning various supply networks (electricity, water, gas, oil, telecommunications, etc.) unavoidably leads to their mutual interaction, which should be determined in many concrete cases. the whole problem has at least three different aspects important for current engineering practice.
they are:
• influence of the surrounding metal installations on the fault current dissipated into the surrounding earth through the grounding system of a supplied substation [14], [15],
• influence of the surrounding metal installations on the transfer characteristics of power lines passing through urban and suburban areas [16], and
• inductive influence of an hv power line on each of the surrounding metal installations considered separately.

one of the main parameters for estimating these mutual interactions, the soil resistivity of the surrounding area, cannot be determined exactly. although there are several methods to measure the soil resistivity [1], none of them can be practically applied in urban conditions. the reason stems from the fact that the surface of urban areas is already covered/occupied by buildings, streets, pavements and many other permanently constructed objects, while under the ground surface many known and unknown metal installations already exist. many urban metal installations of different basic functions are usually situated in a relatively small space, such as sheaths of different types of cable lines, neutral conductors of the low voltage network, steel water pipes, building foundations, etc. some of them are not in direct contact with the earth, while the others are in an effective and continuous contact with the earth. they are interconnected, and their spatial dispositions are different in each concrete case and vary along any of the distribution lines. also, most of them are laid under the street pavements, and many relevant data about them cannot be recorded, visually determined or verified. the grounding system of a distribution hv/mv substation consists of the substation grounding electrode and many outgoing mv cable lines acting as external grounding electrodes, and/or conductive connections with the grounding systems of the supplied mv/lv substations [17].
the spontaneously formed grounding system involves a large urban area around an hv/mv substation. such a grounding system includes, through the terra-neutral (tn) grounding system in the low-voltage (lv) network and consumer installations, many metallic installations, known and unknown, typical for an urban area. as a consequence, the outgoing cable lines simultaneously become the conductive connections with the metal installations laid along the same street(s) as the feeding line. thus, it is not difficult to notice that in the case of unbalanced operating conditions two essentially different currents flow out of the power system. one of them is dissipated into the surrounding earth through the grounding system of the supplied substation, while the other is induced in the metal installations surrounding the feeding line. as an illustration, these two mutually separated fault currents, when a ground fault occurs in the substation supplied by a three-phase line, are presented in fig. 1.

fig. 1 current fractions passing through the elements of the grounding systems

the used notation has the following meaning: a – supply substation, f – ground fault place (supplied substation), if – ground fault current, ii – ground fault current component induced in metal installations surrounding the feeding line, and ie – ground fault current component injected into the earth. both of the presented ground fault current fractions leave the power system through the elements of the grounding system of the supplied substation, f. the current ie is injected into the earth, while the other, ii, circulates only through the surrounding metal installations that are intended for other purposes. because of that, all potentially dangerous and harmful influences of power cable lines and supplied substations on their environment emanate from these two ground fault current components. accordingly,
determination of these two currents is of prime importance for estimating the safety conditions within, and in the vicinity of, a supplied substation and the inductive influence of a feeding line on the neighboring parallel circuits (pipeline, telecommunication line, etc.). since the process of splitting into these two fractions occurs along many external grounding electrodes and under the surface of the ground, none of these components can be separated and determined by direct or indirect measurements [14], [15]. each of the metal return paths, together with earth as the common return path, forms an electrical circuit, while all together these paths form a large number of conductively and inductively coupled circuits. since these circuits can be represented by the corresponding system of equations and since the analytical expressions necessary for the self and mutual impedances of metal conductors are known [2], it can be said that the considered problem has in principle been solvable for a long time. however, this is not possible in practice because of many practical difficulties and limitations in collecting the data necessary for the calculations. thus, the problem can be defined as follows: how to find a method that compensates for the lack of a huge number of relevant data?

3. experimental investigation

the experimental investigations of the influence of metal installations, typical of urban areas, on the value of the feeding line reduction factor are performed on a cable line that supplies, in series, two substations of the 110 kv distribution network in belgrade, serbia [14]. the length of the line to the closer of the two supplied substations, measured from the supply substation, is 2320 m, while the feeding line length to the more distant substation is 6590 m. the line is realized by xhlp cables having mutually identical design parameters, laid in a triangular formation over the entire line length.
the cross-bonding technique necessary for the reduction of circulating currents was not applied. in the areas through which the line passes, the specific soil resistivity is estimated on the basis of the main geological characteristics of the involved area, the only possible way of doing this in urban areas. the roughly estimated equivalent soil resistivity of the entire area is within the range from 30 to 50 Ωm. the line section between the supply and the transit (nearer) substation passes through an area with a lower degree of urbanization compared with the rest of the line. the phase conductors are made of aluminum with a cross-section of 1000 mm², while the metallic sheaths are made of copper strings with a total cross-section of 95 mm² and a mean diameter of 91 mm. the main elements of the grounding system in both supplied substations are the station building foundation and the 44 outgoing mv cable lines realized with cables having uncovered metal sheaths. the grounding impedances of the supply substation and of both of the supplied substations, determined by standard measurements, were found to be between 0.02 and 0.03 Ω. since these impedances are very small compared to the other parameters influencing the measured values of the reduction factor, they are completely disregarded. the described line is used to obtain the experimental results for two different feeding cable lines, one 2320 m and the other 6590 m long. this was achieved in the following manner: the longer line is obtained as a continuous cable line along its entire length by disconnecting its metal sheaths from the grounding electrode of the transit substation. the necessary measurements are performed by using the test circuit schematically represented in fig. 2.

fig.
2 measurement circuit the used notation has the following meaning: a and b – substations connected through the tested cable line, ua – auxiliary voltage source, it – test current, is – current induced in the cable sheath, ic – total current induced in the surrounding metal installations, ie – current injected into the earth through the grounding system of substation b, za (zb ) – impedance of the grounding system of substation a(b), and g – remote ground. the influence of the surrounding metal installations has been observed by measuring value of the self impedance of one of the line phase conductors and by using the known analytical expressions for this impedance that is, according to e.g. [2], given by ω/km,;ln 428 '' 00          ph r phph r jrz       (1) where r'ph – phase conductor resistance per unit length, /km, rph – outer radius of the phase conductor, m,  – angular frequency = 2f, 0 – magnetic permeability of vacuum, 4∙10 –7 vs/am, and r – relative magnetic permeability. (prime (') denotes values per unit length). the equivalent earth penetration depth  is determined by the following expression ,m;658 f    (2) where  – equivalent soil resistivity along the cable line in ωm, and f – test circuit frequency. determination of actual reduction factor of hv and mv cable lines passing through urban and... 31 here, it should be mentioned that these expressions are based on carson's theory of the current return path through the earth. they have been derived under the assumptions that the power line is laid in a homogeneous soil of a resistivity equal to the equivalent resistivity of the normally heterogeneous (multilayer, with each layer having different resistivity) soil and that there are no other metal installations in the vicinity of the line. however, these assumptions do not correspond to the described actual situation. 
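as a rough illustration of expressions (1) and (2), the following python sketch evaluates the penetration depth and the phase conductor self impedance. the frequency of 50 Hz and the conductor data passed in the example call are assumptions chosen only for illustration (the text does not state them at this point), and the function names are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic permeability of vacuum, Vs/Am


def penetration_depth(rho, f):
    """Expression (2): equivalent earth penetration depth in m."""
    return 658.0 * math.sqrt(rho / f)


def phase_self_impedance(r_ph_per_km, radius, rho, f, mu_r=1.0):
    """Expression (1): phase conductor self impedance with earth return, ohm/km."""
    omega = 2 * math.pi * f
    k = omega * MU0 * 1000.0  # omega*mu0 converted from per metre to per kilometre
    delta = penetration_depth(rho, f)
    real = r_ph_per_km + k / 8
    imag = k / (2 * math.pi) * (math.log(delta / radius) + mu_r / 4)
    return complex(real, imag)


# Estimated soil resistivity range from the text; f = 50 Hz is an assumption.
for rho in (30.0, 50.0):
    print(rho, penetration_depth(rho, 50.0))

# Hypothetical conductor data (0.03 ohm/km, 0.02 m radius), for illustration only.
print(phase_self_impedance(0.03, 0.02, 30.0, 50.0))
```

for the estimated soil resistivity of 30 to 50 Ωm the depth comes out in the range of several hundred metres, which is useful to keep in mind when it is compared with the apparent values obtained from the measurements.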
since the considered cable lines pass through areas covered by a spontaneously formed network of different underground metal installations, the measured quantities are affected by the conductive and inductive couplings of all surrounding metal installations, known and unknown. thus, measurements can give only an apparent value of the self impedance of the line phase conductor. by simulating a single-phase ground fault in the supplied substation and by disconnecting all three cable sheaths from the grounding electrode at one of the line ends (fig. 2), the following values of the apparent phase conductor self impedance were obtained:
– $z_{pha}$ = (0.1819 + j1.0243) Ω, for the 2320 m line, and
– $z_{pha}$ = (0.5167 + j2.6260) Ω, for the 6590 m line.

the values of the line reduction factor obtained only on the basis of the measured currents in the cable sheaths were the following:
– r = 0.0473 – j0.1565, for the shorter line, and
– r = 0.0637 – j0.1724, for the longer line.

once the apparent values of the impedance $z'_{pha}$ are determined, the only unknown parameters in expressions (1) and (2), $\rho$ and $\delta$, can be obtained from these equations. however, in the considered cases these parameters involve, besides the local earth characteristics, the constructive characteristics and mutual spatial disposition of all other available return paths. accordingly, they should be defined in a somewhat different way. the parameter $\rho$ now represents the apparent equivalent soil resistivity, and the parameter $\delta$ the equivalent penetration depth of all return paths, because both involve the conductive and inductive influences of all return paths. the new notation which expresses this new meaning is $\rho_a$ and $\delta_a$.
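one way to read this step is as an inversion of the imaginary part of (1) for the apparent depth, followed by an inversion of (2) for the apparent resistivity. the sketch below follows that reading; it is not taken from the paper. the conductor radius of 0.024 m and the frequency of 50 Hz are assumptions (neither is given in the text), and the function names are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7  # Vs/Am


def reactance_per_km(delta, radius, f, mu_r=1.0):
    """Imaginary part of expression (1), ohm/km."""
    k = 2 * math.pi * f * MU0 * 1000.0
    return k / (2 * math.pi) * (math.log(delta / radius) + mu_r / 4)


def apparent_depth(x_per_km, radius, f, mu_r=1.0):
    """Invert the imaginary part of (1) for the apparent depth delta_a, m."""
    k = 2 * math.pi * f * MU0 * 1000.0
    return radius * math.exp(2 * math.pi * x_per_km / k - mu_r / 4)


def apparent_resistivity(delta_a, f):
    """Invert expression (2) for the apparent resistivity rho_a, ohm*m."""
    return f * (delta_a / 658.0) ** 2


# Measured reactance of the shorter line (1.0243 ohm over 2.320 km) from the text;
# the radius of 0.024 m is an assumed value, not a datum from the paper.
x_short = 1.0243 / 2.320
delta_a = apparent_depth(x_short, 0.024, 50.0)
print(delta_a, apparent_resistivity(delta_a, 50.0))
```

because the depth enters (1) only through a logarithm, the recovered value is sensitive to the assumed conductor radius, so the printed numbers should be taken as indicative only.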
the corresponding values of the newly defined parameters are:
– $\delta_a$ = 20.9 m, or $\rho_a$ = 0.052 Ωm, for the shorter line, and
– $\delta_a$ = 10.88 m, or $\rho_a$ = 0.0137 Ωm, for the longer line.

the obvious difference between the two values can be explained by the fact that the shorter line passes through an area with a lower degree of urbanization, i.e. along a smaller number of surrounding metal installations. since the design data of the tested lines, as well as the newly defined parameters $\rho_a$ or $\delta_a$, are known, the analytical expression for the reduction factor of single-core cables laid in a triangular formation can be tested. according to [13], this expression has the following mathematical form

$$r = \frac{R'_s}{R'_s + 3\,\dfrac{\omega\mu_0}{8} + j\,3\,\dfrac{\omega\mu_0}{2\pi}\ln\dfrac{\delta_a}{\sqrt[3]{r_s d^2}}} \qquad (3)$$

where $R'_s$ – metal sheath resistance per unit length, Ω/km, $d$ – distance between two adjacent cables (or the diameter of the single-core cable) in the case of triangular formation, and $r_s$ – mean radius of the cable sheath.

by applying the above expression to the considered cable lines one obtains:
– r = 0.0471 – j0.1537, for the shorter line, and
– r = 0.0589 – j0.1693, for the longer line.

it can be seen that the results obtained by the applied expression and the results obtained by the measurements are in good agreement, practically identical. in this way experimental proof has been obtained for the accuracy of the given analytical expression, but under the unrealistic assumption that the surrounding metal installations do not exist.

based on the experimental results, the following facts can also be noted. the apparent equivalent soil resistivity obtained in this way is drastically lower than the realistic soil resistivity (between 30 and 50 Ωm). the corresponding magnitudes of the reduction factor are 52.1% and 69.6% higher than the value obtained with the approximately estimated equivalent specific soil resistivity, r($\rho$ = 30 Ωm) = 0.0204 – j0.1037.
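the quoted percentages can be checked with a few lines of arithmetic. the sketch below assumes that the comparison is made between the magnitudes of the complex reduction factors computed from (3) and the magnitude of the value quoted for 30 Ωm; the variable and function names are hypothetical.

```python
# Complex reduction factors quoted in the text.
r_short = complex(0.0471, -0.1537)   # expression (3), shorter line
r_long = complex(0.0589, -0.1693)    # expression (3), longer line
r_rho30 = complex(0.0204, -0.1037)   # value quoted for rho = 30 ohm*m


def percent_higher(r, r_ref):
    """Relative increase of |r| over |r_ref|, in percent."""
    return (abs(r) / abs(r_ref) - 1.0) * 100.0


print(percent_higher(r_short, r_rho30))
print(percent_higher(r_long, r_rho30))
```

under this reading the two printed values agree with the quoted 52.1% and 69.6% to within rounding.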
this can be explained by the fact that the presence of the nearby underground metal installations reduces not only the fault current flowing through the earth, but also the currents flowing through the cable line sheaths.

to obtain a more complete insight into the influence of metal installations surrounding a feeding line, the currents through the cable sheaths were also experimentally investigated. at first, the ground fault current distribution was observed when only the sheath of the cable carrying the ground fault current was connected to both grounding electrodes at the line ends. then another situation was considered, in which one more cable sheath was connected to both grounding electrodes at the line ends. finally, normal operating conditions were observed, in which all three metal sheaths were grounded at the line ends. the measurement results show that the successive increase of the number of grounded metal sheaths (fig. 2) reduces not only the current flowing through the earth, but also the relative participation of each of the already connected sheaths in reducing the fault current through the earth. for the sheath of the cable with the simulated ground fault these relative reductions are from 60.66% to 46.31% and from 46.31% to 37.49%, respectively. these results are helpful for understanding the influence of the surrounding metal installations. since the reduction factor is defined as the ratio $I_e/I_f$, the presence of the surrounding metal installations has to be taken into account in order to obtain its actual value.

4. method development

on the basis of the former analysis, it is clear that each power line passing through urban and/or suburban areas represents, during a ground fault, a very complex electrical circuit.
this circuit consists of a large number of mutually conductively and inductively coupled electrical circuits with a common return path through the earth. the number of these circuits, if the line phase conductors are excluded, is equal to the number of the line neutral conductors increased by the number of surrounding metal installations. if the cable line presented in fig. 2 is considered again, but now with all metal sheaths grounded at both line ends, and if it is assumed that the total number of surrounding metal installations, including the neutral conductor, is equal to an arbitrary number $N$, then this line can be represented by the equivalent circuit shown in fig. 3. an arbitrary current $I_n$ in fig. 3 induces in another arbitrary ($m$-th) circuit a voltage $U_{mn}$ determined by

$$U_{mn} = Z_{mn} I_n \qquad (4)$$

where $Z_{mn}$ – mutual impedance between two arbitrary ($n$-th and $m$-th) surrounding metal installations (circuits), fig. 3.

it is well known that distribution substations are located in areas occupied by many underground metal installations, acting as perfect grounding electrodes [17]. thus, the grounding impedances $Z_a$ and $Z_b$ can be neglected ($Z_a \approx 0$ and $Z_b \approx 0$). because of that, the fault currents appearing in the cable line metal sheaths are a consequence solely of the inductive influence of the ground fault current in the faulted phase conductor [15]. this is in accordance with the previously mentioned reduction factor definition. for the further considerations it should also be mentioned that the current directions shown in fig. 3 are taken arbitrarily. on the basis of the equivalent circuit presented in fig. 3 it is possible to write a system of ($N$ + 1) equations and, for known values of $U_a$ and of all self and mutual impedances, to determine each of the currents in the given equivalent circuit.
unfortunately, because of the previously mentioned practical difficulties and limitations, the parameters of the surrounding metal installations necessary for determining these impedances must be treated as unknown quantities. thus, for determining the current circulating through the earth, $I_e$, and the reduction factor of the considered line, a completely new approach is necessary. the problem is solved in [14] by measuring the currents $I_t$ and $I_1$ (fig. 2) and by substituting all surrounding metal installations with one equivalent conductor, imagined as a cylinder placed around and along the entire feeding line. here, to simplify the necessary calculation procedure, it will be assumed that this conductor also includes the two sheaths of the remaining single-core cables, through which the ground fault current does not circulate. under this assumption the considered cable line is transformed into a physical model whose cross-section can be represented as shown in fig. 4.

fig. 3 complete line equivalent circuit

the notation used has the following meaning: $U_a$ – auxiliary voltage source, $U_{n0}, U_{n1}, U_{n2}, \ldots, U_{nN}$ – voltages induced in an arbitrary ($n$-th) circuit by the current in each of the surrounding circuits (metal installations), $I_t$ – test current through the phase conductor of the considered line, $I_1$ – current through the sheath of the cable carrying the current $I_t$, $I_2, I_3$ – currents through the metal sheaths of the other two single-core cables, $I_4, I_5, I_6, \ldots, I_n, \ldots, I_N$ – currents induced in the individual surrounding metal installations, $Z_1, Z_2, Z_3$ – self impedances of the cable metal sheaths, and $Z_4, Z_5, Z_6, \ldots, Z_N$ – self impedances of the individual surrounding metal installations.

fig. 4 cross section of the introduced line model

for the equivalent conductor imagined in this manner, the corresponding equivalent circuit of the entire cable line has the appearance shown in fig. 5.

fig. 5 equivalent circuit of the introduced line model

the notation used has the following meaning: $U_{0c}, U_{1c}$ – voltages that the current $I_c$ induces in the phase conductor and its metal sheath, $U_{c0}, U_{c1}$ – voltages that the currents $I_t$ and $I_1$ induce in the equivalent conductor.

the relevant parameters of the assumed equivalent conductor will be determined under the condition that the currents $I_t$ and $I_1$ in fig. 3 remain unchanged after introducing the equivalent conductor, figs. 4 and 5. by using the equivalent circuit presented in fig. 5 this condition can be expressed by the following system of equations

$$\begin{aligned} Z_1 I_1 + Z_{1c} I_c - Z_{01} I_t &= 0 \\ Z_c I_c + Z_{1c} I_1 - Z_{0c} I_t &= 0 \end{aligned} \qquad (5)$$

where $Z_c$ – self impedance of the equivalent conductor, and $Z_{1c}$ – mutual impedance between the equivalent conductor and the cable sheath.

the impedances $Z_1$ and $Z_{01}$ are determined, according to e.g. [14], by the following expressions:

$$Z_1 = R'_s + \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_s};\ \Omega/\mathrm{km} \qquad (6)$$

$$Z_{01} = \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_s};\ \Omega/\mathrm{km} \qquad (7)$$

for the adopted physical appearance of the equivalent conductor, the analytical expressions for the impedances $Z_c$, $Z_{0c}$ and $Z_{1c}$ are

$$Z_c = R'_c + \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_c};\ \Omega/\mathrm{km} \qquad (8)$$

$$Z_{0c} = Z_{1c} = \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_c};\ \Omega/\mathrm{km} \qquad (9)$$

where $r_c$ – mean radius of the cylinder representing the equivalent conductor, and $R'_c$ – equivalent conductor resistance per unit length, Ω/km.

since the currents $I_t$ and $I_1$ are known quantities, obtained by measurements (fig. 2), the condition given by (5) can be reduced, by eliminating $I_c$, to the following simpler form

$$\frac{Z_{01} Z_c - Z_{0c} Z_{1c}}{Z_1 Z_c - Z_{1c}^2} = \frac{I_1}{I_t} \qquad (10)$$

in the given expression the only unknown quantities, according to (6), (7), (8), and (9), are $R'_c$ and $r_c$.
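the determination of the two unknowns from the measured current ratio can be sketched numerically. the python example below is an illustration under stated assumptions, not the author's procedure: it takes (10) in the form $(Z_{01}Z_c - Z_{0c}Z_{1c})/(Z_1Z_c - Z_{1c}^2) = I_1/I_t$ with $Z_{0c} = Z_{1c}$, and uses a closed-form reduction (requiring the resistance $R'_c$ to be real yields a quadratic in $\ln(\delta/r_c)$ with exactly one positive root) that is introduced here for convenience. all numerical values and function names are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7  # Vs/Am


def earth_return_terms(f):
    """Return omega*mu0/8 and omega*mu0/(2*pi), both in ohm/km."""
    k = 2 * math.pi * f * MU0 * 1000.0
    return k / 8, k / (2 * math.pi)


def sheath_impedances(rs_per_km, r_s, delta, f):
    """Expressions (6) and (7): Z1 and Z01 in ohm/km."""
    a, b = earth_return_terms(f)
    z01 = complex(a, b * math.log(delta / r_s))
    return rs_per_km + z01, z01


def current_ratio(rc_per_km, r_c, z1, z01, delta, f):
    """Expression (10): I1/It for a given equivalent conductor."""
    a, b = earth_return_terms(f)
    zm = complex(a, b * math.log(delta / r_c))  # Z0c = Z1c, expression (9)
    zc = rc_per_km + zm                         # expression (8)
    return (z01 * zc - zm * zm) / (z1 * zc - zm * zm)


def fit_equivalent_conductor(k, z1, z01, delta, f):
    """Solve (10) for (R'c, rc) from the measured ratio k = I1/It.

    Assumes the imaginary part of c below is non-zero.
    """
    a, b = earth_return_terms(f)
    c = (1 - k) / (z01 - k * z1)
    # R'c = Zm^2 * c - Zm must be real: quadratic in x = ln(delta / rc).
    qa = -b * b * c.imag
    qb = 2 * a * b * c.real - b
    qc = a * a * c.imag
    disc = math.sqrt(qb * qb - 4 * qa * qc)
    # The two roots have opposite signs; the positive one gives rc < delta.
    x = max((-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa))
    zm = complex(a, b * x)
    return (zm * zm * c - zm).real, delta * math.exp(-x)


# Round trip with hypothetical data: synthesize a "measured" ratio, then recover it.
delta = 658.0 * math.sqrt(30.0 / 50.0)
z1, z01 = sheath_impedances(0.19, 0.0455, delta, 50.0)
k_meas = current_ratio(0.05, 2.0, z1, z01, delta, 50.0)
print(fit_equivalent_conductor(k_meas, z1, z01, delta, 50.0))
```

a general-purpose root finder applied to the real and imaginary parts of (10) would serve equally well; the closed form is used here only to keep the example free of external dependencies.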
since (10) gives a relationship between complex quantities, it can be presented in the form of the following system of two equations

$$\begin{aligned} \mathrm{Re}\{(Z_{01} Z_c - Z_{0c} Z_{1c})\, I_t\} &= \mathrm{Re}\{(Z_1 Z_c - Z_{1c}^2)\, I_1\} \\ \mathrm{Im}\{(Z_{01} Z_c - Z_{0c} Z_{1c})\, I_t\} &= \mathrm{Im}\{(Z_1 Z_c - Z_{1c}^2)\, I_1\} \end{aligned} \qquad (11)$$

after determining the relevant parameters of the equivalent conductor ($R'_c$ and $r_c$), relations (8) and (9) can be used for obtaining $Z_c$, $Z_{0c}$ and $Z_{1c}$, as well as $I_c$ and $I_e$. then, in accordance with the equivalent circuit in fig. 5 and the reduction factor definition, the feeding line reduction factor is determined by

$$r = \frac{I_e}{I_t} = 1 - \frac{Z_{01}(Z_c - Z_{1c}) + Z_{0c}(Z_1 - Z_{1c})}{Z_1 Z_c - Z_{1c}^2}, \qquad (12)$$

or, according to (8) and (9), in a somewhat more compact form

$$r = 1 - \frac{R'_c Z_{01} + Z_{0c}(Z_1 - Z_{1c})}{Z_1 Z_c - Z_{1c}^2} \qquad (13)$$

the condition defined by (5) means at the same time:

$$U_{0c} = Z_{0c} I_c = \sum_{n=2}^{N} Z_{0n} I_n, \quad \text{and} \qquad (14)$$

$$U_{1c} = Z_{1c} I_c = \sum_{n=2}^{N} Z_{1n} I_n \qquad (15)$$

on the basis of (14) and (15) it is clear that the introduced equivalent conductor substitutes many known and unknown relevant parameters.

data about the actual reduction factor of a distribution line, or about the actual ground fault current distribution in a supplied substation, are usually needed before the line has been constructed. in these cases, the previously defined measurements can be performed on a provisional cable line laid on the surface of the soil along the foreseen path of the future line. this can be done by using any single-core cable suitable for this purpose, under different practically possible conditions, and by using the calculation procedure described here. since the metal installations in urban areas are mainly under the soil surface, the considered inductive influence will be slightly smaller [16].
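once the parameters of the equivalent conductor are known, the reduction factor follows by direct evaluation. the sketch below assumes that (12) expresses $r = 1 - (I_1 + I_c)/I_t$ with both current ratios obtained from system (5), and that $Z_{0c} = Z_{1c}$ as in (9); it evaluates this form, the more compact form using $Z_c - Z_{1c} = R'_c$, and, as a cross-check, the product form $R'_s R'_c/(Z_1 Z_c - Z_{1c}^2)$ that follows algebraically from the same assumptions. the input values and the function name are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7  # Vs/Am


def reduction_factor(rs_per_km, r_s, rc_per_km, r_c, delta, f):
    """Return the reduction factor evaluated in three algebraically equal ways."""
    k = 2 * math.pi * f * MU0 * 1000.0
    a, b = k / 8, k / (2 * math.pi)
    z01 = complex(a, b * math.log(delta / r_s))  # expression (7)
    z1 = rs_per_km + z01                         # expression (6)
    zm = complex(a, b * math.log(delta / r_c))   # Z0c = Z1c, expression (9)
    zc = rc_per_km + zm                          # expression (8)
    det = z1 * zc - zm * zm
    r_full = 1 - (z01 * (zc - zm) + zm * (z1 - zm)) / det     # form of (12)
    r_compact = 1 - (rc_per_km * z01 + zm * (z1 - zm)) / det  # form of (13)
    r_product = rs_per_km * rc_per_km / det                   # cross-check
    return r_full, r_compact, r_product


# Hypothetical sheath and equivalent-conductor data, for illustration only.
print(reduction_factor(0.19, 0.0455, 0.05, 2.0, 509.68, 50.0))
```

the agreement of the three forms is a convenient consistency check when implementing the procedure.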
according to the developed analytical procedure, the presented method enables determination of the reduction factor of any type of cable line and takes into account all relevant factors and parameters, including even those whose contribution is negligibly small. this means that the method enables a correct solution of the problem for every practical situation, including extremely complex ones. some inaccuracy can appear only as a consequence of the inductive influence of nearby power distribution lines. this influence can be efficiently avoided by using a test current of somewhat higher frequency that can easily be discriminated from the omnipresent power frequency (e.g. [15]). the introduced error is small and gives final results that are slightly on the safe side.

although there are several methods of measuring soil resistivity (e.g. [1]), none of them is applicable in urban conditions (e.g. [15]). the reason is that the surfaces of urban areas are already covered by buildings, streets, pavements, and many other permanently constructed objects, while many known and unknown metal installations already exist under the ground surface. thus, one is forced to adopt an approximate value of the equivalent soil resistivity, based on the main geological characteristics of the relevant area, and to use it in the calculation part of the developed method. the favorable circumstance here is that the self and mutual impedances are, according to the given equations, only slightly dependent on the equivalent soil resistivity, so more accurate data about this parameter do not have any practical significance. it is sufficient to know that, within the range of actually possible values, the lowest one should be preferred because it gives final results that are slightly on the safe side.

5. quantitative analysis

by using the previously described method we obtain that the reduction factor taking into account the influence of the surrounding metal installations is, for the considered lines:
– r = – 0.0267 – j0.0245, or | r | ≈ 0.036, for the shorter line, and
– r = – 0.0225 – j0.0170, or | r | ≈ 0.029, for the longer line.

it can be seen that its effective values are 65.9% and 72.6% lower than the value obtained without taking into account the surrounding metal installations (| r | ≈ 0.105). if it is assumed that the cables in the considered cases are laid in a flat formation at a distance of d = 0.5 m, one obtains r = – 0.0275 – j0.0297, or | r | = 0.0405, for the shorter line and r = – 0.0232 – j0.0202, or | r | = 0.0308, for the longer line. the differences are still greater in comparison with the previously given results of measurements; the obtained values are lower in this case by 78.0% and 84.2%, respectively. obviously, disregarding the influence of metal installations surrounding a feeding line, as well as determining the reduction factor only by measuring the currents through the cable sheaths, gives results that are excessively conservative. bearing in mind that this also means a reduction, by the same ratio, of all potentials appearing on the grounding systems of the supplied substations, one can conclude that the results of this analysis throw a completely new light on the grounding problem of the supplied substations. also, having in mind the similarity of urban conditions all over the world, this conclusion can be treated as generally valid for the safety conditions of distribution substations supplied by cable lines. certainly, greater economic effects can be expected in cases where, because of a high soil resistivity and/or a high short-circuit current level, special measures (e.g. a bare copper conductor laid in the same trench as the cable feeding line, counterpoises, etc.) were considered necessary.
besides, one can expect elimination of the strict requirement for the application of expensive mv cables acting as grounding conductors (cables with an uncovered sheath), as was the case with the mv network of belgrade. the only difficulty arises from the fact that the actual ground fault current distribution depends on the metal installations laid in the area through which the feeding line passes. in practice this means that, for obtaining the actual ground fault current distribution, each specific distribution line should be considered separately.

6. conclusions

the presented method enables taking into account the favorable influence of urban metal installations surrounding hv and mv distribution cable lines on the ground fault current distribution in the supplied substation(s). since cable lines are applied almost without exception in urban areas, and since the effect of the surrounding metal installations is considerable, the presented method can be used as a foundation for the revision of the current version of the international technical standard [2].

references

[1] ieee guide for safety in ac substation grounding, ieee std. 80, 2000.
[2] short-circuit currents in three-phase a.c. systems – part 3: currents during two separate simultaneous line-to-earth short circuits and partial short-circuit currents flowing through earth, iec int. std. 60909-3, ed. ii, 2003.
[3] j. endrenyi, "analysis of transmission tower potentials during ground faults", ieee trans. power app. syst., vol. pas-86, no. 10, pp. 1274-1283, october 1967.
[4] f. dawalibi and g. niles, "measurements and computations of fault current distribution on overhead transmission lines", ieee trans. power app. syst., vol. pas-103, no. 3, pp. 553-560, march 1984.
[5] lj. m. popović, "practical method for evaluating ground fault current distribution in station, towers and ground wire", ieee transactions on power delivery, vol. 13, no. 1, january 1998, pp. 123-129.
[6] s. t. sobral, j. o. barbosa and v. s. costa, "grounding potential rise characteristics of urban step-down substations fed by power cables – a practical example", ieee transactions on power delivery, vol. 3, no. 2, pp. 1564-1572, apr. 1988.
[7] s. mangione, "a simple method for evaluating ground-fault current transfer at the transition station of a combined overhead-cable line", ieee transactions on power delivery, vol. 23, no. 3, pp. 1413-1418, july 2008.
[8] j. fortin, h. g. sarmiento, d. mukhedkar, "field measurements of ground fault distribution at lg-2, quebec", ieee transactions on power delivery, pwrg-1, vol. 3, pp. 48-60, 1986.
[9] lj. m. popović, "efficient reduction of fault current through the grounding grid of substation supplied by cable line", ieee transactions on power delivery, vol. 15, no. 2, april 2000, pp. 556-561.
[10] a. campoccia and g. zizzo, "a study of the use of bare buried conductors in an extended interconnected earthing systems inside a mv network", 18th international conference on electricity distribution, cired, turin, 6-9 june 2005.
[11] lj. m. popović, "practical method for evaluating ground fault current distribution in station supplied by an unhomogeneous line", ieee transactions on power delivery, vol. 12, no. 2, april 1997, pp. 722-727.
[12] lj. m. popović, "determination of the reduction factor for feeding cable lines consisting of three single-core cables", ieee transactions on power delivery, vol. 18, no. 3, july 2003, pp. 736-744.
[13] lj. m. popović, "improved analytical expressions for the determination of the reduction factor of the feeding line consisting of three single-core cables", european transactions on electrical power, 2008, 18, pp. 190-203.
[14] lj. m. popović, "influence of the metal installations surrounding the feeding cable line on the ground fault current distribution in supplied substations", ieee trans. on power delivery, vol. 23, no. 4, october 2008, pp. 2583-2590.
[15] lj. m. popović, "testing and evaluating grounding systems for substations located in urban areas", iet generation, transmission & distribution, vol. 5, no. 2, february 2011, pp. 231-238.
[16] lj. m. popović, "transfer characteristics of electric-power lines passing through urban and suburban areas", international journal of electrical power & energy systems, ijepes, vol. 56, march 2014, pp. 151-158.
[17] lj. popović, "comparative analysis of grounding systems formed by mv cable lines with either uninsulated or insulated metal sheath(s)", electric power system research, vol. 81, no. 2, february 2011, pp. 393 – 39.

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp. 63-74 https://doi.org/10.2298/fuee1801063f

prediction of annual energy production from pv string under mismatch condition due to long-term degradation *

miodrag forcan
university of east sarajevo, faculty of electrical engineering, east sarajevo, bosnia and herzegovina
university of belgrade, faculty of electrical engineering, belgrade, serbia

abstract. reduction of long-term degradation effects represents a long-standing challenge in the photovoltaic (pv) manufacturing industry. modelling of long-term degradation types and their impact on the maximum power of pv systems is analysed in this article. brief guidelines for pv cell-based modelling of pv systems are illustrated. a special study case, a pv string consisting of 12 pv modules, has been modelled in order to determine degradation and mismatch power losses.
a modified methodology for prediction of annual energy production from a pv string, based on experimental measurements of horizontal irradiation and ambient temperature at the location of belgrade, has been developed. a coefficient named "degradation factor" has been introduced to include and validate degradation power losses. economic considerations have indicated an evident reduction of income as a consequence of the lower annual energy production related to long-term degradation.

key words: pv string, energy production, long-term degradation, degradation factor, mismatch losses

1. introduction

precise determination of annual energy production from pv systems is very difficult to achieve, mostly due to variable operating conditions (irradiation and ambient temperature) [1]. electricity production is closely related to conversion efficiency, which is one of the most important parameters when discussing pv systems [2]. new materials are constantly being developed with the purpose of increasing conversion efficiency and mitigating degradation effects. according to the research presented in [3], organic materials with pv properties have proved to be one of the most promising solutions. meanwhile, conventional silicon materials remain the most widely used in field applications.

received january 31, 2017; received in revised form september 18, 2017
corresponding author: miodrag forcan, university of east sarajevo, faculty of electrical engineering, vuka karadzica 30, 71126 lukavica, 71123 east sarajevo, republic of srpska, bosnia and herzegovina (e-mail: miodrag.forcan@live.com, miodrag.forcan@etf.unssa.rs.ba)
* an earlier version of this paper was presented at the 2nd virtual international conference on science, technology and management in energy (energetics 2016), 22-23 september, 2016, in niš, serbia [1].

conventional methods for prediction of energy production from pv systems usually use hourly-averaged horizontal irradiation and ambient temperature measurements for specific locations [4-6]. one of their main shortcomings is that they neglect long-term degradation related to the encapsulating material, e.g. delamination, discoloration and corrosion. according to various research results [7-12], long-term degradation effects can often reduce a pv system's power by up to 15-20% during its lifetime.

this article is organized as follows. the second chapter covers basic facts related to the most common types of long-term degradation. in the third chapter, modelling guidelines for pv systems and degradation types are presented. pv module degradation effects under variable irradiation and temperature conditions are analysed in the fourth chapter. the fifth chapter presents a study case dedicated to pv string power reduction due to long-term degradation. a modified methodology for prediction of annual energy production from a pv string and of the related financial income, based on the introduction of a degradation factor, is investigated in the sixth chapter. conclusions are given in the final chapter.

2. long-term degradation of pv systems

degradation represents a gradual deterioration of pv system components caused by real operating conditions in the field. affected pv modules can continue to generate electricity, although the produced energy can be significantly reduced. according to manufacturers, it is common practice to identify a pv module as degraded when its maximum power falls below 80% of the initial value. long-term degradation of pv systems is related to deterioration of the encapsulation material, and its effects can be observed on the surfaces of pv cells during the exploitation period. ethylene vinyl acetate (eva) has been recognized, over the decades, as one of the best encapsulation materials for pv cells.
as a consequence, nearly 80% of pv modules produced around the world are encapsulated by eva [7]. typical long-term degradation types of pv cells related to eva are delamination, discoloration and corrosion. characteristic field examples of pv cells affected by these degradation types are shown in fig. 1 [8].

fig. 1 typical long-term degradations of pv cells [8]: (a) delamination; (b) discoloration; (c) corrosion

2.1. delamination

glass, eva and pv material are tightly affixed (laminated) in normal pv cells. damage to any of these layers can lead to the development of delamination. delamination represents separation between the different layers within the pv cells and is usually followed by the penetration of moisture and by corrosion. the most commonly affected surface area of a pv cell is located around the busbars, as can be seen in fig. 1(a). according to research [9], this type of degradation is observed in more than 50% of installed pv modules.

2.2. discoloration

ultra-violet radiation, accompanied by a high degree of humidity and high ambient temperature, is recognized as the main cause of discoloration. discoloration is the most common type of long-term degradation and is an electro-chemical process in which the pv material changes colour, usually from light yellow to brown (fig. 1(b)). according to [10] and [11], discoloration can reduce a pv cell's short-circuit current by up to 15%.

2.3. corrosion

the main reason for the occurrence of corrosion in pv cells is moisture penetration. corrosion damages the metal parts and contacts of pv cells (fig. 1(c)), which leads to an increase of the pv cell's series resistance. based on the results of accelerated corrosion tests, it has been found that the probability of corrosion occurrence is related to the presence of oxygen in the silicon layers of the pv cell [12].

3. pv system modelling

in order to precisely determine degradation and mismatch power losses in pv systems, it is essential to use pv cell-based modelling [1]. for proper calculation of degradation effects on pv systems it is necessary to model the functional dependence of generated power on specific ambient conditions. it is common practice to model the i-v curve with irradiation and temperature as controllable primary input variables. a one-diode matlab-based model of the pv cell has been created by using recommendations from [13]. the corresponding pv module and cell models have been used throughout previous research and a series of related publications [14-16]. future pv modelling research will include the cooling effect of wind on pv cell temperature [17]. regarding the mismatch effect due to long-term degradation, it can be assumed that wind conditions are uniform over the relatively small surface of a pv string. low irradiation effects have been included in the modelling process by treating the pv cell's series resistance, parallel resistance and diode ideality factor as functions of irradiation and operating temperature, with the corresponding analytical expressions recommended in the literature [18-20]. pv module modelling has been realized by using matlab/simulink software [21].

the chosen pv module zdny-250p60 (250 Wp) [22] consists of 60 polycrystalline suntellite 156m pv cells, with electrical data for standard test conditions (stc) presented in table 1. the pv string model consists of 12 pv modules with a maximum installed power of 2.995 kW. similar types of pv systems are often used on the roofs of households in urban environments. the pv system modelling procedure is presented in fig. 2. the pv modules within the pv string are enumerated with numbers 1-12.

table 1 stc electrical data of suntellite 156m pv cell and pv module

parameter         pv cell        pv module
efficiency [%]    17.00-17.19    17.00
Pmpp [W]          4.16           249.61
Vmpp [V]          0.531          31.84
Impp [A]          7.834          7.84
Voc [V]           0.63           37.78
Isc [A]           8.35           8.35
FF [%]            79.08          79.12

fig. 2 pv system modelling procedure: (a) one-diode pv cell model; (b) 60-cell pv module model; (c) 12-module pv string model

3.1. long-term degradation modelling

in order to analyse the effects of long-term degradation on the reduction of pv string power, it is necessary to establish a relation between degradation mechanisms and the pv cell's parameters. as delamination, discoloration and corrosion are impossible to predict precisely, their modelling is limited to approximate relations resulting from field observations and statistical analysis of experimentally obtained data.

according to experimental research results [8], delamination reduces the pv module's short-circuit current $I_{sc}$, while its effects on the open-circuit voltage $V_{oc}$ can be neglected. based on experimentally obtained data for a characteristic pv module, several modelling cases are defined:
1. case 0 (del 0): no delamination; $I_{sc} = I_{sc(stc)}$.
2. case 1 (del 1): limited area around the pv cells' busbars affected; $I_{sc} = 0.95 \times I_{sc(stc)}$ (5% decrease).
3. case 2 (del 2): limited area around small cracks in the pv module's surface; $I_{sc} = 0.92 \times I_{sc(stc)}$ (8% decrease).

based on statistical analyses and experimental field data obtained in a temperate climate zone [23], it has been found that discoloration can also be modelled as a reduction of $I_{sc}$. the following cases are defined:
1. case 0 (dis 0): no discoloration; $I_{sc} = I_{sc(stc)}$.
2. case 1 (dis 1): bright colours present on less than 50% of the pv module's surface; $I_{sc} = 0.9473 \times I_{sc(stc)}$ (5.27% decrease).
3. case 2 (dis 2): bright colours present on more than 50% of the pv module's surface; $I_{sc} = 0.9137 \times I_{sc(stc)}$ (8.63% decrease).
4. case 3 (dis 3): dark colours present on less than 50% of the pv module's surface; $I_{sc} = 0.9088 \times I_{sc(stc)}$ (9.12% decrease).
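the delamination and discoloration cases above amount to a lookup of scaling factors applied to the stc short-circuit current. a minimal python sketch of that bookkeeping is given below; it is not the matlab/simulink model referred to in the text, and the dictionary and function names are hypothetical. the stc values are those listed in table 1.

```python
ISC_STC = 8.35            # A, short-circuit current at STC (table 1)
PMPP_MODULE_STC = 249.61  # W, module maximum power at STC (table 1)
MODULES_PER_STRING = 12

# Short-circuit current scaling factors for the modelling cases defined above.
ISC_FACTORS = {
    "DEL0": 1.0, "DEL1": 0.95, "DEL2": 0.92,
    "DIS0": 1.0, "DIS1": 0.9473, "DIS2": 0.9137, "DIS3": 0.9088,
}


def degraded_isc(case, isc_stc=ISC_STC):
    """Short-circuit current of a module for a given degradation case, A."""
    return ISC_FACTORS[case] * isc_stc


def percent_decrease(case):
    """Decrease of the short-circuit current relative to STC, in percent."""
    return (1.0 - ISC_FACTORS[case]) * 100.0


for case in ISC_FACTORS:
    print(case, degraded_isc(case), percent_decrease(case))

# Installed string power at STC, in kW (12 identical modules).
print(MODULES_PER_STRING * PMPP_MODULE_STC / 1000.0)
```

in a cell-based model such as the one described in the text, the scaled current would replace the stc value in the one-diode equations of the affected module.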
regarding corrosion modelling of pv modules, according to the same experimental results as in the case of discoloration [23], the pv module’s series resistance rs has been identified as the key parameter. corrosion manifests as an increase of rs. the following modelling cases are defined:
1. case 0 (cor 0): no corrosion.
2. case 1 (cor 1): bright colour corrosion on metal parts of pv module, rs = 1.65 × rs-(stc) (65% increase).
3. case 2 (cor 2): bright colour corrosion on metal parts and terminals of pv module, rs = 2.2 × rs-(stc) (120% increase).
4. case 3 (cor 3): dark colour corrosion on metal parts and terminals of pv module, rs = 4.3 × rs-(stc) (330% increase).

4. pv module degradation effects under variable irradiation and temperature condition
in real field conditions, pv systems operate under hourly irradiation and temperature variations. it is therefore essential to determine long-term degradation effects under variable irradiation and temperature conditions. by using the previously defined long-term degradation modelling cases, the pv module maximum power is observed for ambient temperature and irradiation ranges of -5°c to 35°c and 200 w/m² to 1000 w/m², respectively. ambient temperature values have been varied at a constant irradiation of 800 w/m². similarly, irradiation values have been varied at a constant ambient temperature of 8.75°c (pv cells’ operating temperature 25°c). the corresponding results are presented in fig. 3. by analysing the graphs in fig. 3, several observations can be made:
• delamination and discoloration preserve an approximately linear correlation between the pv module’s maximum power (pm) and both ambient temperature (t) and irradiation (i), while corrosion introduces slightly nonlinear components.
• in the case of t variations, pm curve slopes remain approximately constant in the delamination and discoloration analysis.
as a consequence, differences between pm for all modelling cases (del0, del1, del2 and dis0, dis1, dis2, dis3) remain approximately constant.
• in the case of i variations, pm curve slopes change slightly in the delamination and discoloration analysis, which leads to an important conclusion: for higher irradiation values, differences between pm for all considered modelling cases (del0, del1, del2 and dis0, dis1, dis2, dis3) are also higher.
• regarding the corrosion effects, it can be seen from fig. 3(c) that modelling case cor3 differs significantly from the other cases in terms of pm curve slope for the variable t condition. for the variable i condition, pm value differences between the modelling cases of corrosion are lower than for the corresponding cases of delamination and discoloration.
• degradation losses related to delamination and discoloration maintain approximately equal values over the whole analysed t and i ranges, while corrosion losses increase nonlinearly with increasing t and i values (fig. 3(d)). it can be concluded that delamination and discoloration losses are approximately unaffected by variation of t and i.

fig. 3 pv module maximum power and degradation losses under variable temperature and irradiation condition: (a) delamination; (b) discoloration; (c) corrosion; (d) power losses due to long-term degradation

5. pv string power reduction due to long-term degradation – study case
determination of pv string power losses due to degradation is a complex task because of the occurrence of the mismatch condition. the term “mismatch condition” refers to differences in the current-voltage (i-v) curves of individual pv modules in a pv string due to different degradation rates. in field conditions, during long exploitation periods, it is very common that pv modules degrade differently.
in order to investigate pv string’s degradation and degradation mismatch power losses, the special study case, consisting of adopted pv module’s degradation modelling cases, is defined in table 2.

table 2 pv string under long-term degradation study case (type of long-term degradation per exploitation period: del. / disc. / corr.)
| pv module | 10 years | 15 years | 20 years | 25 years |
|---|---|---|---|---|
| pv1 | del 0 / dis 1 / cor 1 | del 1 / dis 2 / cor 2 | del 2 / dis 2 / cor 2 | del 2 / dis 2 / cor 3 |
| pv2 | del 0 / dis 0 / cor 1 | del 1 / dis 0 / cor 2 | del 1 / dis 1 / cor 2 | del 2 / dis 1 / cor 3 |
| pv3 | del 0 / dis 1 / cor 0 | del 0 / dis 3 / cor 1 | del 1 / dis 3 / cor 2 | del 2 / dis 3 / cor 2 |
| pv4 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 1 | del 1 / dis 2 / cor 2 | del 1 / dis 2 / cor 2 |
| pv5 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 1 | del 0 / dis 2 / cor 2 | del 1 / dis 2 / cor 2 |
| pv6 | del 0 / dis 0 / cor 0 | del 0 / dis 0 / cor 1 | del 0 / dis 1 / cor 2 | del 1 / dis 2 / cor 2 |
| pv7 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 0 | del 0 / dis 3 / cor 1 | del 1 / dis 3 / cor 2 |
| pv8 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 0 | del 0 / dis 3 / cor 1 | del 1 / dis 3 / cor 2 |
| pv9 | del 0 / dis 0 / cor 0 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 1 | del 0 / dis 3 / cor 2 |
| pv10 | del 0 / dis 0 / cor 0 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 1 | del 0 / dis 1 / cor 2 |
| pv11 | del 0 / dis 0 / cor 0 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 0 | del 0 / dis 1 / cor 1 |
| pv12 | del 0 / dis 0 / cor 0 | del 0 / dis 0 / cor 0 | del 0 / dis 0 / cor 0 | del 0 / dis 1 / cor 1 |

according to data in table 2 it can be observed that several key time points are defined during 25 years long exploitation period of pv string. long-term degradation modelling cases are assumed to take place after 10, 15, 20 and 25 years of exploitation period. the highest combined degradation rate is set for pv modules with starting indexes (1, 2, 3 …). it is assumed that degradation rate is negligible in the first 10 years of exploitation.
in order to determine pv string degradation losses and mismatch losses separately, it is necessary to identify the total maximum power of the individual pv modules (12 pv modules operating separately), besides the maximum power of the entire pv string (12 pv modules operating in series connection). for the defined study case (table 2), under constant ambient temperature t = 20°c and irradiation i = 600 w/m², the maximum power points of the individual pv modules pmpp-im and of the pv string pmpp-string have been determined and are presented in table 3.

table 3 maximum power points of individual pv modules and pv string for the study case defined in table 2 under constant ambient temperature and irradiation values (t = 20°c and i = 600 w/m²)
| maximum power point pv modules / string (pmpp-im / pmpp-string) | 10 years | 15 years | 20 years | 25 years |
|---|---|---|---|---|
| ppv1-mpp [w] | 126.773 | 115.478 | 111.637 | 109.500 |
| ppv2-mpp [w] | 133.563 | 126.533 | 119.793 | 113.653 |
| ppv3-mpp [w] | 127.631 | 121.935 | 114.847 | 111.005 |
| ppv4-mpp [w] | 134.518 | 126.918 | 115.478 | 115.478 |
| ppv5-mpp [w] | 134.518 | 126.918 | 121.895 | 115.478 |
| ppv6-mpp [w] | 134.518 | 133.702 | 126.189 | 115.478 |
| ppv7-mpp [w] | 134.518 | 127.773 | 121.935 | 114.847 |
| ppv8-mpp [w] | 134.518 | 127.773 | 121.935 | 114.847 |
| ppv9-mpp [w] | 134.518 | 134.518 | 126.914 | 121.268 |
| ppv10-mpp [w] | 134.518 | 134.518 | 126.914 | 126.189 |
| ppv11-mpp [w] | 134.518 | 134.518 | 127.773 | 126.194 |
| ppv12-mpp [w] | 134.518 | 134.518 | 134.518 | 126.194 |
| pmpp-im = Σ ppvi-mpp (i = 1…12) | 1598.6 w | 1561.1 w | 1475.2 w | 1418.5 w |
| pmpp-string | 1591.7 w | 1511 w | 1440.4 w | 1392.7 w |
| power losses due to long-term degradation (pmpp-new string* − pmpp-string) | 22.52 w (1.4 %) | 103.2 w (6.4 %) | 173.82 w (10.8 %) | 221.52 w (13.7 %) |
| mismatch losses due to long-term degradation (pmpp-im − pmpp-string) | 6.9 w (0.43 %) | 50.1 w (3.1 %) | 34.8 w (2.16 %) | 25.8 w (1.6 %) |
*new pv string maximum power: 1614.22 w

according to the results presented in table 3, it can be concluded that power losses due to long-term degradation increase from 1.4% to 13.7% over the 25-year exploitation period.
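the two loss rows of table 3 follow from simple differences of the tabulated powers. the python sketch below recomputes them. it assumes that both percentage rows are expressed relative to the new-string maximum power of 1614.22 w; this is an inference from the tabulated numbers, not something the text states, and the function names are hypothetical.

```python
# Totals copied from table 3 (all values in watts).
P_NEW_STRING = 1614.22
P_SUM_INDIVIDUAL = {10: 1598.6, 15: 1561.1, 20: 1475.2, 25: 1418.5}  # pmpp-im
P_STRING = {10: 1591.7, 15: 1511.0, 20: 1440.4, 25: 1392.7}          # pmpp-string


def degradation_loss(years):
    """Total power loss of the aged string relative to a new string [W]."""
    return P_NEW_STRING - P_STRING[years]


def mismatch_loss(years):
    """Loss caused only by unequal module I-V curves in series connection [W]."""
    return P_SUM_INDIVIDUAL[years] - P_STRING[years]


def percent_of_new_string(loss_w):
    """Express a loss as a percentage of the new-string maximum power (assumed base)."""
    return 100.0 * loss_w / P_NEW_STRING


if __name__ == "__main__":
    for years in sorted(P_STRING):
        d, m = degradation_loss(years), mismatch_loss(years)
        print(f"{years} years: degradation {d:.2f} W "
              f"({percent_of_new_string(d):.1f} %), "
              f"mismatch {m:.1f} W ({percent_of_new_string(m):.2f} %)")
```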
on the other hand, mismatch losses have the highest value after just 15 years of exploitation (3.1%), because the degradation rates of individual pv modules differ the most in that time period. it is important to notice that mismatch losses are very difficult to predict and that they certainly depend on the particular study case. their values could reach up to 50% of the power losses due to long-term degradation itself.

6. pv string annual energy production
statistical prediction of energy production from the pv string is based on horizontal irradiation and ambient temperature measurements. the acquisition system provided measurements of horizontal irradiation and ambient temperature every 10 minutes between july 15th, 2013 and july 15th, 2014, at the location of belgrade, serbia, with wgs coordinates: 44.8°; 20.47°; 120 m. the obtained irradiation and ambient temperature values have been averaged over every three hours and, in the next step, monthly-averaged. based on the procedure given in [24], horizontal irradiation can be divided into direct and diffuse components. in addition, the reflected component can be determined by using the corresponding reflection coefficient.
in order to determine the irradiation components on the pv string surface, the position angles need to be defined. the assumed tilt and azimuth angles are σ = 30° and s = 0°, respectively. in the process of determining the ambient reflection coefficient, it is assumed that the household with the pv string on its roof is located on a grassy surface. the adopted reflection coefficient value is ρ = 0.15. total irradiation on the surface of the pv string has been calculated by using the following relation:
$$i_{pv} = i_{dir} + i_{dif} + i_{ref}, \tag{1}$$
where: idir, idif and iref are the direct, diffuse and reflected irradiation components, respectively.
by using the calculated irradiation and measured ambient temperature data, it is possible to determine the operating temperature of the pv string according to the following relation:
$$t_{pv} = t_{amb} + \frac{\mathrm{noct} - 20}{0.8}\, i_{pv}, \tag{2}$$
where: tpv is operating temperature of pv string; tamb is ambient temperature; noct is nominal operating temperature of pv cell (47 °c for considered pv cells); ipv is irradiation value on the surface of pv string.
based on the calculated and averaged ipv and tpv values, the pv string dc power values (pdc) are obtained. the conventional relation for calculation of a pv system’s dc power in field conditions, expanded with the insertion of a degradation factor, is defined as follows:
$$p_{dc(field)} = p_{dc}(i_{pv}, t_{pv}) \cdot (1 - \mu_n) \cdot (1 - \mu_z) \cdot (1 - \mu_d), \tag{3}$$
where: μn is the efficiency reduction factor due to unequal i-v curves resulting from the manufacturing process of pv modules; μz is the efficiency reduction factor resulting from soiling of pv modules in the field; μd (df) is the newly defined efficiency reduction factor that results from long-term degradation (degradation factor). the assumed values of the efficiency reduction factors in the analysed study case are μn = 0.03 (3%) and μz = 0.04 (4%). the degradation factor has been calculated on an hourly basis according to relation (4), initially averaged over every three hours and, in the next step, also monthly-averaged.
$$\mu_d = df = \frac{p_{new\_string} - p_{long\_term\_d\_string}}{p_{new\_string}}. \tag{4}$$
monthly averaged pv string maximum power and degradation factor are presented in fig. 4 for the analysed exploitation period of one year and the considered study case. daily time intervals with irradiation values below 30 w/m² have been neglected in the analysis.

fig. 4 monthly averaged pv string maximum power and degradation factor during the considered year of exploitation in different hourly-based time intervals: (a) 8:30h–11:30h; (b) 11:30h–14:30h; (c) 14:30h–17:30h

according to the results from fig. 4, the following observations can be made:
• the degradation factor has higher values in the summer time (up to 14%), with the exception of january in the time interval 11:30h–14:30h.
• the highest values of the degradation factor are present in the period with maximum irradiation (11:30h–14:30h).
• monthly differences of the degradation factor are most pronounced after 25 years of pv string exploitation.
prediction of annual energy production requires calculation of the pv string ac power by using the following relation:
$$p_{ac(field)} = p_{dc(field)} \cdot \mu_i, \tag{5}$$
where μi is inverter efficiency. after determination of the pac-(field) values, annual energy production can be easily calculated. the purchase price of electricity produced from small capacity pv systems installed on households in serbia can be calculated by using relation (6):
$$0.01 \cdot (20.941 - 9.383 \cdot p)\ [\mathrm{eur/kwh}], \tag{6}$$
where p is the installed power of the pv system in mw units.
with an assumed inverter efficiency of μi = 97%, the pv string annual energy production has been predicted, together with annual money income and losses due to long-term degradation. corresponding results are presented in table 4.
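relations (2), (3), (5) and (6) chain together into a short calculation. the python sketch below shows that chain. it assumes the irradiation in relation (2) is expressed in kw/m² (the usual convention for the noct formula; the text does not state the unit), and the function names are hypothetical. the dc power at given irradiation and temperature is taken as an input here, since in the article it comes from the matlab/simulink pv string model.

```python
def operating_temperature(t_amb_c, i_pv_kw_m2, noct_c=47.0):
    """Relation (2): PV operating temperature [degC]; irradiation assumed in kW/m^2."""
    return t_amb_c + (noct_c - 20.0) / 0.8 * i_pv_kw_m2


def dc_power_field(p_dc_w, mu_n=0.03, mu_z=0.04, mu_d=0.0):
    """Relation (3): derate model DC power by mismatch, soiling and degradation factors."""
    return p_dc_w * (1.0 - mu_n) * (1.0 - mu_z) * (1.0 - mu_d)


def ac_power_field(p_dc_field_w, mu_i=0.97):
    """Relation (5): AC power after the inverter."""
    return p_dc_field_w * mu_i


def purchase_price_eur_per_kwh(p_installed_mw):
    """Relation (6): purchase price as a function of installed power in MW."""
    return 0.01 * (20.941 - 9.383 * p_installed_mw)


def annual_income_eur(energy_kwh, p_installed_mw=0.002995):
    """Annual income for the 2.995 kW string considered in the article."""
    return energy_kwh * purchase_price_eur_per_kwh(p_installed_mw)


if __name__ == "__main__":
    print(f"price: {purchase_price_eur_per_kwh(0.002995):.5f} eur/kwh")
    print(f"income for 3422 kwh: {annual_income_eur(3422):.1f} eur")
```

with the installed power of 2.995 kw (0.002995 mw), the price evaluates to roughly 0.209 eur/kwh, which reproduces the income of about 715.6 eur for the 3422 kwh of the new string.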
table 4 pv string annual energy production, money income and losses due to long-term degradation
| pv string annual energy production and income | new string | 10 years | 15 years | 20 years | 25 years |
|---|---|---|---|---|---|
| annual energy production [kwh] | 3422 | 3374 | 3204 | 3058 | 2962 |
| annual money income [eur] | 715.6 | 705.6 | 670.1 | 639.5 | 619.4 |
| loss of money due to degradation [eur] | 0 | 10 | 45.5 | 76.1 | 96.2 |

with the assumption that the loss of money due to long-term degradation is approximately equal in consecutive time periods of 5 years (e.g. the loss of money in the time period 7.5–12.5 years is equal to 5 × the loss of money in the 10th year of exploitation), it is possible to roughly estimate the total loss of money in the time period 0–27.5 years (very close to the lifetime of the pv string): 5 × (10 + 45.5 + 76.1 + 96.2) = 1139 eur. it can be concluded that the predicted amount of money lost due to long-term degradation is enough to buy several new pv modules during the considered exploitation period. even a rough estimation of the degradation factor could be of significant interest for economic predictions, especially for larger installed pv capacities (> 20 kw), where money income could be reduced by more than 10 000 eur over the lifetime exploitation period due to long-term degradation.

7. conclusions
modelling of pv system degradation in terms of statistical prediction of annual energy production proved to be a very complex task, mainly because of the many uncertainties related to the long exploitation period and field conditions. several useful guidelines and study case results have been presented in this article. the most common long-term degradation types have been modelled by using approximate relations, adopted on the basis of experimental observations. it has been shown that power losses of individual pv modules due to delamination and discoloration remain approximately constant over a wide range of irradiation and ambient temperature values, while power losses due to corrosion proved to be temperature-dependent.
mismatch power losses, caused by different degradation rates of individual pv modules in the pv string, have been identified as a potentially significant part of total degradation losses. the methodology for prediction of annual energy production from a pv string, based on horizontal irradiation and ambient temperature field measurements, has been modified in order to include long-term degradation effects. the degradation factor has been introduced as a useful tool for validating power losses due to long-term degradation. analysis of the study case, a pv string consisting of 12 pv modules located in belgrade, showed that money losses during the lifetime exploitation period caused by long-term degradation could exceed the price of several new pv modules.

acknowledgement: the author would like to thank professors jovan mikulović and željko đurišić for their advice and support during the research period. special acknowledgement belongs to my best friend slobodan elez, who contributed useful results related to his master thesis.

references
[1] m. forcan, “prediction of energy production from string pv system under mismatch condition”, in proceedings of the 2nd virtual international conference on science, technology and management in energy – energetics, 2016, pp. 3-9.
[2] m. jošt and m. topič, “efficiency limits in photovoltaics – case of single junction solar cells”, facta universitatis, series: electronics and energetics, vol. 27, no 4, pp. 631-638, december 2014.
[3] y. georgiev, g. angelov, t. takov, i. zhivkov and m. hristov, “the photovoltaic behavior of vacuum deposited diphenyl-diketo-pyrrolopyrrole polymer”, facta universitatis, series: electronics and energetics, vol. 27, no 4, pp. 639-648, december 2014.
[4] o. perpinan, e. lorenzo and m.a. castro, “on the calculation of energy produced by pv grid-connected system”, progress in photovoltaics: research and applications, vol. 15, issue 3, pp. 265-274, 2007.
[5] m. brabec, e. pelikán, p. krč, k. eben and p. musilek, “statistical modeling of energy production by photovoltaic farms”, in proceedings of the ieee elect. power energy conf. (epec), aug. 2010, pp. 1-6.
[6] o. perpinan, “statistical analysis of performance and simulation of two axis tracking pv system”, solar energy, vol. 83, issue 11, pp. 2074-2085, nov. 2009.
[7] s. jiang, k. wang, h. zhang, y. ding and q. yu, “encapsulation of pv modules using ethylene vinyl acetate copolymer as the encapsulant”, macromol. react. eng., 9, pp. 522-529, 2015.
[8] t. shioda, “delamination failures in long-term field-aged pv modules from point of view of encapsulant”, lecture presented at 2013 nrel pv module reliability workshop, denver.
[9] d. c. jordan, j. h. wohlgemuth and s. r. kurtz, “technology and climate trends in pv module degradation”, in proceedings of the 27th european photovoltaic solar energy conference and exhibition, 2012, pp. 3118-3124.
[10] m. kempe, “modelling of rates of moisture ingress into photovoltaic modules”, solar energy materials & solar cells, vol. 90, issue 16, pp. 2720-2738, 2006.
[11] m. kempe, “ultraviolet test and evaluation methods for encapsulants of photovoltaic modules”, solar energy materials & solar cells, vol. 94, issue 2, pp. 246-253, 2010.
[12] a. ndiaye, a. charki, a. kobi, c.m.f. kébé, p.a. ndiaye and v. sambou, “degradations of silicon photovoltaic modules: a literature review”, solar energy, vol. 96, pp. 140-151, 2013.
[13] d. sera, r. teodorescu and p. rodriguez, “pv panel model based on datasheet values”, in proceedings of the ieee international symposium on industrial electronics, vigo, spain, 2007, pp. 2392-2396.
[14] m. forcan, ž. đurišić and j. mikulović, “an algorithm for elimination of partial shading effect based on a theory of reference pv string”, solar energy, vol. 132, pp. 51-63, 2016.
[15] m. forcan, j. tuševljak, s. lubura and m. šoja, “analyzing and modeling the power optimizer for boosting efficiency of pv panel”, ix symposium industrial electronics indel, banja luka, november 2012, pp. 193-198.
[16] m. forcan and ž. đurišić, “the analysis of pv string efficiency under mismatch conditions”, in 4th international symposium on environment friendly energies and applications efea, 2016, pp. 1-6.
[17] c. schwingshackl, m. petitta, j.e. wagner, g. belluardo, d. moser, m. castelli, m. zebisch and a. tetzlaff, “wind effect on pv module temperature: analysis of different techniques for an accurate estimation”, energy procedia, vol. 40, pp. 77-86, 2013.
[18] s. bensalem and m. chegaar, “thermal behavior of parasitic resistances of polycrystalline silicon solar cells”, revue des energies renouvelables, vol. 15, pp. 171-176, 2013.
[19] m.l. priyanka and s.n. singh, “a new method of determination of series and shunt resistances of silicon solar cells”, solar energy materials & solar cells, vol. 91, pp. 137-142, jan. 2007.
[20] d. macdonald and a. cuevas, “reduced fill factors in multicrystalline silicon solar cells due to injection-level dependent bulk recombination lifetimes”, progress in photovoltaics: research and applications, vol. 8, pp. 363-375, 2000.
[21] matlab/simulink. mathworks, inc. natick. massachusetts. united states.
[22] pv module data sheet, available online at http://www.suntellite.cn/en/product/suntellite-module-polycrystalline-20.html
[23] r. dubey, s. chattopadhyay, v. kuthanazhi, j. j. john, b. m. arora, a. kottantharayil, k. l. narasimhan, c. s. solanki, v. kuber, j. vasi, a. kumar and o. s. sastry, “all india survey of photovoltaic module degradation 2013”, national centre for photovoltaic research and education, mumbai, india, 2014, available online at http://www.ncpre.iitb.ac.in/pages/publications_reports.html
[24] g. m. masters, renewable and efficient electric power systems. hoboken, nj: john wiley & sons, 2004, chapters 7-8.
facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 469-482
https://doi.org/10.2298/fuee2204469g
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

design of a four stages vco using a novel delay circuit for operation in distributed band frequencies
mriganka gogoi1,2, pranab kishore dutta1
1assam don bosco university, department of ece, india
2north eastern regional institute of science and technology, department of ece, india

abstract
the manuscript proposes a novel architecture of a delay cell that is implemented in a 4-stage vco which has the ability to operate in two distributed frequency bands. the operating frequency is chosen based on the principle of carrier mobility and the transistor resistance. the vco uses dual delay input techniques to improve the frequency of operation. the design is implemented in cadence 90nm gpdk cmos technology, and simulated results show that it is capable of operating in dual frequency bands of 55 mhz to 606 mhz and 857 mhz to 1049 mhz. at normal temperature (27°c), the power consumption of the circuit is found to be 151 μw at 606 mhz and 157 μw at 1049 mhz, respectively, and the circuit occupies an area of 171.42 µm². the design shows a good tradeoff between the parameters operating frequency, phase noise and power consumption.
key words: ring oscillator, voltage controlled oscillator (vco), tuning range

1. introduction
phase lock loop (pll), one of the key elements of contemporary wireless digital signal processing and instrumentation systems, is crucial for improving the performance of this electronic component.
the parameters associated with the vco, like operating frequency range, power dissipation and phase noise, contribute significantly to the improvement of the pll. there are two widely used vco topologies: lc and ring vcos. the former has a high resolution and frequency, but the operating frequency range is limited and the chip area is large. the latter has many advantages like wide tuning range, easy integration, low chip area, multiphase clock and low power consumption; however, it has a low resolution and poor phase noise performance [1]. ring vcos are divided into two sorts based on their delay stages: 1) vco with a single-ended ring (sero) and 2) vco with a differential ring (dro).

received january 10, 2022; revised june 22, 2022 and november 15, 2022; accepted december 3, 2022
corresponding author: pranab kishore dutta, associate professor, department of ece, nerist, itanagar, arunachal pradesh, 791109, india
e-mail: pkdutta07@gmail.com

seros consume less area compared to dros but have more noise and are hence less efficient [3-7]. dros are more resilient to common mode noise and have a lower swing. the delay cell is the basic element of differential configuration oscillators. many such delay cells were proposed by different researchers at different times. maneatis et al. proposed a delay cell that was used to design a ring oscillator which could oscillate with an operating frequency of 141 mhz. the delay cell was based on a source coupled pair [8]. a wide operating frequency three stage vco was proposed by yan et al. that could operate in the frequency range of 1.3 to 1.8 ghz; however, the power consumption was comparatively high [9]. park et al. designed a 4 stage ring oscillator with low phase noise that operates at 900 mhz. the phase noise was found to be -101 dbc/hz at 100 khz [10]. tu et al.
proposed a novel delay circuit and used it to design a two stage voltage controlled oscillator whose operating range was from 2.5 ghz to 5.2 ghz for a supply voltage of 1.8 v. however, due to the smaller number of stages, the phase noise achieved was -90.1 dbc/hz at an offset frequency of 1 mhz [11]. sheu et al. proposed a new differential delay cell which was implemented in a three stage vco; the tuning range was found to be 479 mhz to 4.09 ghz with phase noise of -93.3 dbc/hz at an offset frequency of 1 mhz [12]. parvizi et al. proposed the design of ring oscillators using two topologies, differential and single-ended. to reduce stage delay and boost tuning range, the vco used a feed-forward technique and a load in terms of inductive impedance [13]. a pll was designed by shruti suman et al. by proposing an improved performance vco. the operating frequency varied from 2.26 ghz to 3.44 ghz as the control voltage changed from 1 v to 3 v, but they did not focus on phase noise [14]. a delay cell for use in a ring oscillator with dual loop was proposed by gao et al., where they also used a control voltage to tune the frequency range. the design was efficient enough to achieve a wideband tuning range while maintaining low phase noise [15]. to determine the optimal dimensions of the vco, gargouri et al. proposed a systematic and efficient optimization method and found an optimal trade-off between various specifications [16]. salem et al. proposed a fault tolerant delay cell for designing a ring oscillator that uses redundant transistor methods to improve reliability, power dissipation and phase noise [17]. kumar et al. presented a vco using a nor gate and a varactor tuning method with inversion mode [18]. changes in the varactor width are used to vary the operating frequency. however, there is still scope for improvement in the phase noise.
a low noise injection locked vco was proposed by lee et al., in which a separate injection signal is employed and the oscillator output locks to the frequency of the injection signal [19]. the circuit showed a wide frequency tuning range and low phase noise with low power consumption. ramazani et al. presented some delay cells using basic inverters and current starved inverters for designing a vco with better frequency stability [20].
the proposed circuit is oriented towards the design of a ring vco with the ability to operate in a distributed tuning range while maintaining a decent tradeoff between phase noise and power consumption in differential configuration. thus the novelty of the work lies in allowing the same vco to work in the high as well as low frequency ranges without altering the physical design of the circuit.
the subsequent sections of the paper are organized as follows: section 2 deals with the proposed vco, section 3 deals with delay circuit analysis, section 4 deals with implementation and section 5 deals with the conclusion.

2. proposed vco
even and odd numbers of stages can be used in differential ring oscillators, but an odd number of stages cannot produce both in-phase and quadrature-phase outputs. the frequency of oscillation depends on factors like driving capability, load and number of stages. in addition, when the number of stages increases, the quantity of energy used, the amount of space needed and the cost increase. additionally, there will be greater phase noise with fewer stages. therefore, maintaining an adequate tradeoff between the various performance characteristics requires an optimal design. a two stage dro has tight constraints on the conditions for oscillation to occur, while three stages limit the output of in-phase and quadrature-phase signals; hence we choose to design a four stage vco [21-23].
the designed vco will be applicable in communication systems where multiphase signals are needed, like phased array transceivers, fractional frequency synthesizers and clock data recovery circuits. moreover, communication systems demand wide range oscillators to cover a variety of standards across multiple frequency bands [24]. the proposed delay cell of the vco is designed using two control frequencies; hence this technique is also known as the dual frequency control technique [25-26]. it comprises inputs vin1+, vin2+, vin1- and vin2-, output voltages vout+ and vout-, and control voltages vcntr1 and vcntr2. considering tdelay as the delay time of the cell, the total delay time of the four stage vco will be 4tdelay and hence the operating frequency will be 1/(4tdelay). the proposed delay cell and four-stage vco are depicted in fig. 1 and fig. 2. the time constants τc and τd estimated during charging and discharging provide a generic equation for finding the oscillation frequency. the following formula is used to compute the time constant:
$$\tau = RC \tag{1}$$
where $R$ is the resistance offered by the charging and discharging path and $C$ is the lumped capacitance, that is, the combined parasitic capacitances. using τc and τd, the time intervals $T_1$ and $T_2$, which are the charging and discharging time intervals of the delay cell, are calculated. they are used for determining the oscillating frequency, which is given by:
$$f_a = \frac{1}{T_1 + T_2} \tag{2}$$
the resistance $R$ in the time constant formula is the resistance of the pmos and nmos transistors, respectively, and is given by:
$$r_p = \frac{1}{\mu_p C_{ox} \frac{W}{L} \left(|V_{gs}| - |V_{tp}|\right)} \tag{3}$$
$$r_n = \frac{1}{\mu_n C_{ox} \frac{W}{L} \left(|V_{gs}| - |V_{tn}|\right)} \tag{4}$$
where $\mu_p$ and $\mu_n$ are the mobilities of the pmos and nmos transistors, $C_{ox}$ is the oxide capacitance, and $W$ and $L$ are the channel width and length of both transistors, respectively. $V_{gs}$ is the applied gate to source voltage, and $V_{tp}$ and $V_{tn}$ are the pmos and nmos threshold voltages. the transistors are assumed to be working in the triode region and the small drain to source voltage $V_{ds}$ is neglected. thus, from the formulae it can be clearly seen that the higher the mobility, the lower the resistance; in other words, resistance is inversely proportional to mobility. moreover, the time constant is directly proportional to resistance; hence the oscillating frequency is inversely dependent on resistance and directly dependent on mobility. so, a transistor with higher mobility can play a crucial role in improving the oscillating frequency. since the mobility of electrons is larger than that of holes, this has an effect on the current flow and on the time constant, or delay time, which in turn affects the oscillation frequency. lowering the resistance will result in a shorter delay time and a greater oscillation frequency. since the nmos time constant is lower than the pmos time constant, the oscillation frequency obtained with nmos transistors will be higher. this idea inspired us to suggest the delay circuit depicted in fig. 1 and to utilise it to create the vco depicted in fig. 2.

fig. 1 delay cell
fig. 2 four stages vco

3. delay circuit analysis
the proposed delay cell is for a dual loop ring based voltage controlled oscillator [27-29]. the primary loop’s inputs are m1 and m2, while the secondary loop’s inputs are m3 and m4. the dual-loop technique amplifies the oscillation. m13 controls the ring vco’s frequency. the latch’s feedback strength is determined by m5, m6, m7, m8, m9, m10, m11 and m12. thus, it is simple to control the delay time of the latch by controlling vcntr2, while vcntr1 helps in maintaining the vco to operate both in the low frequency and high frequency bands.
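the dependence of delay on carrier mobility described in section 2, through relations (1), (3) and (4) and the 1/(4tdelay) expression, can be illustrated numerically. the python sketch below is only an illustration: the μcox values, w/l ratio, voltages and load capacitance are hypothetical placeholders and are not taken from the 90nm gpdk technology or from the proposed design, and the function names are likewise hypothetical.

```python
def triode_resistance(mu_cox, w_over_l, v_gs, v_t):
    """Relations (3)/(4): on-resistance in the triode region with Vds neglected [ohm]."""
    overdrive = abs(v_gs) - abs(v_t)
    if overdrive <= 0:
        raise ValueError("transistor is off: |Vgs| must exceed |Vt|")
    return 1.0 / (mu_cox * w_over_l * overdrive)


def time_constant(resistance, capacitance):
    """Relation (1): tau = R * C [s]."""
    return resistance * capacitance


def ring_frequency(t_delay, stages=4):
    """Operating frequency 1/(stages * t_delay) as stated for the four-stage VCO [Hz]."""
    return 1.0 / (stages * t_delay)


if __name__ == "__main__":
    # Hypothetical illustrative values (NOT from the paper).
    MU_N_COX, MU_P_COX = 300e-6, 100e-6   # A/V^2
    W_OVER_L, V_GS, V_T, C_LOAD = 10.0, 1.0, 0.3, 10e-15

    r_n = triode_resistance(MU_N_COX, W_OVER_L, V_GS, V_T)
    r_p = triode_resistance(MU_P_COX, W_OVER_L, V_GS, V_T)
    print(f"r_n = {r_n:.1f} ohm, tau_n = {time_constant(r_n, C_LOAD):.3e} s")
    print(f"r_p = {r_p:.1f} ohm, tau_p = {time_constant(r_p, C_LOAD):.3e} s")
```

with equal geometry and overdrive, the device with the larger mobility gives the smaller resistance and time constant, which is the effect the delay cell relies on.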
considering the left part of the delay circuit which deals with vout-, the charging and discharging time can be calculated as shown below: initial condition be vout-=vl and vout+=vo, vl and vo being the minimum and maximum output voltages. the charging time is controlled by vin2and the transistor m3, based on the initial condition m5 will be off, is given by 𝜏1 = 𝑟3𝐶𝑙𝑜𝑎𝑑 (5) 𝑟3is equivalent resistance of mosfet m3 and cload is the parasitic capacitive load associated with vout-. 𝑟3 = 1 𝜇𝑝𝐶𝑜𝑥 𝑊 𝐿 (|𝑉𝑔𝑠 | − |𝑉𝑡𝑝|) (6) 𝐶𝐿𝑜𝑎𝑑 = 𝐶𝑑𝑏1 + 𝐶𝑔𝑑1 + 𝐶𝑑𝑏3 + 𝐶𝑔𝑑3 + 𝐶𝑑𝑠3 + 𝐶𝑑𝑏5 + 𝐶𝑔𝑑5 + 𝐶𝑑𝑏7 + 𝐶𝑔𝑑7 + 𝐶𝑑𝑏9 + 𝐶𝑔𝑑9 + 𝐶𝑑𝑏11 + 𝐶𝑔𝑑11 + 𝐶𝑖𝑛_𝑥 (7) where, cin_x is the next stage input capacitance, cdb is drain to body, cgd is gate to drain and cgs is gate to source capacitances of the transistor. across the load capacitance voltage will be 𝑉𝐶𝐿𝑜𝑎𝑑 = 𝑉0 − (𝑉0 − 𝑉𝑙 )exp (− 𝑇1 𝜏1 ) (8) suppose in the time interval 𝑇1capacitor charges upto αv0 then α𝑉0 = 𝑉0 − (𝑉0 − 𝑉𝑙 )exp (− 𝑇1 𝜏1 ) (9) α is a constant with a value ranging from 0 to 1. 𝑇1 = 𝜏1ln { 𝑉0 − 𝑉𝑙 𝑉0(1 − 𝛼) } (10) in the next state when vout-=v0 and vout+=vl. the discharging phenomenon comprises of both charging and discharging time constants and the effective discharging time τ2 is 𝜏2 = {(𝑟1||(𝑟7 + 𝑟13)) − (𝑟3||𝑟5)}𝐶′𝐿𝑜𝑎𝑑 (11) r1, r5, r7 and r13 represents the equivalent resistances of mosfet m1, m5, m7 and m13. 474 m. gogoi, p. k. dutta 𝐶′𝐿𝑜𝑎𝑑 = 𝐶𝑑𝑏1 + 𝐶𝑔𝑑1 + 𝐶𝑑𝑠1 + 𝐶𝑑𝑏3 + 𝐶𝑔𝑑3 + 𝐶𝑑𝑏5 + 𝐶𝑔𝑑5 + 𝐶𝑑𝑏7 + 𝐶𝑔𝑑7 + 𝐶𝑔𝑑9 + 𝐶𝑑𝑏9 + 𝐶𝑑𝑏11 + 𝐶𝑔𝑑11 + 𝐶𝑖𝑛𝑥 (12) c’load is node capacitance during time t2 now voltage across capacitor c’load can be given by 𝑉𝐶′𝐿 = 𝑉𝑙 − (𝑉𝑙 − 𝛼𝑉0)exp (− 𝑇2 𝜏2 ) ( 123) suppose the capacitor c’load discharges to, βvl in the time interval t2 such that β>1 then β𝑉𝑙 = 𝑉𝑙 − (𝑉𝑙 − 𝛼𝑉0)exp (− 𝑇2 𝜏2 ) (14) 𝑇2 = 𝜏2ln { 𝑉𝑙 − 𝛼𝑉0 𝑉𝑙 (1 − β) } (15) 𝑇 = 𝑇1 + 𝑇2 = 𝜏1ln { 𝑉0 − 𝑉𝑙 𝑉0(1 − 𝛼) } + 𝜏2ln { 𝑉𝑙 − 𝛼𝑉0 𝑉𝑙 (1 − β) } (16) and finally, fosc=1/4t, for four stage ring vco. fig. 
3 delay circuit for low voltage of vcntr1

case 1: when vcntr1 is low (0 to 0.3v), the transistors m9 and m10 become dominant, because both are pmos transistors and conduct at lower gate voltages than m11 and m12. the circuit then operates like the circuit shown in fig. 3, in which the normal delay loop's input pair is m1 and m2, while the skewed delay loop's input pair is m7 and m8. transistor m1 shuts off when the voltage at its gate terminal, vin1+, is less than the threshold value. the source current of the secondary input transistor m3 is already flowing towards the capacitor associated with the output node, vout-, because the input voltage at vin2- arrives earlier than at vin1+. this reduces the output node's rise time. in the delay cell, m5 and m6 combine to form a latch. m9 and m10 are cross-coupled transistors that control the load transistors' maximum gate voltages, as well as the latch strength and the frequency of operation. varying the control voltage vcntr2 then varies the frequency of oscillation. the path delay increases due to the action of the pmos transistors m9 and m10, which causes the vco to operate in the low frequency band.

fig. 4 delay cell due to high vcntr1

case 2: when vcntr1 is high (0.7v to 1v), the transistors m11 and m12 are dominant, because both are nmos transistors and conduct at higher gate voltages than m9 and m10. the delay circuit then works as shown in fig. 4. in this case m11 and m12 are the cross-coupled transistors that govern the maximum gate voltages of the load transistors, and hence the latch strength and the frequency of operation.

phase noise: several noise elements influence the phase noise in a ring oscillator. the most prevalent types of noise are white noise and flicker noise.
in contrast to inverter-based delay cells, differential delay cells operate in class a and consume a steady-state current [30]. the main source of flicker noise is the fet that powers the common gate line for all the currents in the delay cells [31]. equations (17) and (18) give the ssb (single side band) phase noise due to white noise and flicker noise, respectively, in differential oscillators.

$$L(f) = \frac{2kT}{I \ln 2}\left[\gamma\left(\frac{3}{4 V_{effd}} + \frac{1}{V_{efft}}\right) + \frac{1}{V_{op}}\right]\left(\frac{f_0}{f}\right)^2 \tag{17}$$

$$L(f) = a\,\frac{K_f}{W L C'_{ox} f}\left(\frac{1}{V_{efft}^{2}}\right)^{2}\frac{f_0^{2}}{f^{3}} \tag{18}$$

where $\gamma$ is the noise factor of the fet, $V_{effd}$ and $V_{efft}$ are the effective gate voltages of the differential delay cell at balanced and unbalanced conditions, $V_{op}$ is the actual output voltage, $I$ is the tail current, $f_0$ is the oscillation frequency, $W$ and $L$ stand for the fet's width and length, $a$ is the ratio of the width of the fet to that of the tail fet and $C'_{ox}$ is the oxide capacitance of the nfet (tail transistor). in our design $a$ is taken to be 1, as $W$ and $L$ are the same for both. the figure of merit used for characterizing vco performance can be obtained from equation (19) [32].

$$FOM = L(f) - 20\log\left(\frac{f_0}{f}\right) + 10\log\left(\frac{P_{dc}}{1\,\mathrm{mW}}\right) \tag{19}$$

where $P_{dc}$ is the dc power consumption. the dimensions of the nmos and pmos devices, shown in table 1, are kept the same, as the goal is to obtain a functional circuit that confirms the topological idea.
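the formulas of this section can be checked numerically. the sketch below evaluates eqs. (5), (6), (10), (15), (16) and (19) in python; it is a minimal illustration, not the authors' simulation flow. the function names, the device values (process transconductance, w/l, load capacitance) and the choices of $\alpha$, $\beta$ and $\tau_2$ are hypothetical. the figure-of-merit call uses the pre-layout numbers quoted in the conclusion, so its result need not coincide with the post-layout entries of table 6.

```python
import math


def triode_resistance(mu_cox, w_over_l, v_gs, v_t):
    """Equivalent triode-region resistance, eq. (6)."""
    return 1.0 / (mu_cox * w_over_l * (abs(v_gs) - abs(v_t)))


def charge_time(tau1, v0, vl, alpha):
    """Charging interval T1 from eq. (10)."""
    return tau1 * math.log((v0 - vl) / (v0 * (1.0 - alpha)))


def discharge_time(tau2, v0, vl, alpha, beta):
    """Discharging interval T2 from eq. (15); requires beta > 1."""
    return tau2 * math.log((vl - alpha * v0) / (vl * (1.0 - beta)))


def oscillation_frequency(t1, t2):
    """f_osc = 1 / (4 T) with T = T1 + T2, eq. (16)."""
    return 1.0 / (4.0 * (t1 + t2))


def figure_of_merit(phase_noise_dbc_hz, f0_hz, offset_hz, p_dc_w):
    """FOM from eq. (19); P_dc is referred to 1 mW."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f0_hz / offset_hz)
            + 10.0 * math.log10(p_dc_w / 1e-3))


# Hypothetical device and bias values, chosen only for illustration.
r3 = triode_resistance(mu_cox=100e-6, w_over_l=1.2, v_gs=1.0, v_t=0.3)
tau1 = r3 * 10e-15   # eq. (5) with an assumed 10 fF load
tau2 = 0.8 * tau1    # assumed; eq. (11) would need r1, r5, r7 and r13
t1 = charge_time(tau1, v0=1.0, vl=0.1, alpha=0.9)
t2 = discharge_time(tau2, v0=1.0, vl=0.1, alpha=0.9, beta=1.5)
print(f"f_osc = {oscillation_frequency(t1, t2) / 1e6:.1f} MHz")

# Values quoted in the text: -92.07 dBc/Hz at 1 MHz offset, 606 MHz, 151 uW.
print(f"FOM = {figure_of_merit(-92.07, 606e6, 1e6, 151e-6):.2f} dBc/Hz")
```

because $T_1$ and $T_2$ scale linearly with $\tau_1$ and $\tau_2$, halving either resistance or load capacitance in this sketch shortens the corresponding interval by the same factor, which is the dependence the analysis relies on.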
table 1 device dimension

| device | nmos | pmos |
| transistors | m1, m2, m7, m8, m11, m12, m13 | m3, m4, m5, m6, m9, m10 |
| aspect ratio | 120/100 | 120/100 |

table 2 variation of the parameters at different temperatures when vcntr1=0.2v and vcntr2 is varied from 0 to 1v (pre layout)

| temp (°c) | tuning range | power consumption | phase noise @ 1 mhz | phase noise @ 10 mhz |
| 0 | 42 mhz-672 mhz (93.75%) | 160 μw | -93.10 dbc/hz | -112.36 dbc/hz |
| 10 | 47 mhz-647 mhz (92.73%) | 155 μw | -92.94 dbc/hz | -112.05 dbc/hz |
| 27 | 55 mhz-606 mhz (90.9%) | 151 μw | -92.07 dbc/hz | -111.25 dbc/hz |
| 70 | 63 mhz-487 mhz (87%) | 144 μw | -91.86 dbc/hz | -111.09 dbc/hz |

table 3 variation of the parameters at different temperatures when vcntr1=0.77v and vcntr2 is varied from 0 to 1v (pre layout)

| temp (°c) | tuning range | power consumption | phase noise @ 1 mhz | phase noise @ 10 mhz |
| 0 | 1040 mhz-1230 mhz (15.44%) | 169 μw | -93.42 dbc/hz | -112.83 dbc/hz |
| 10 | 969 mhz-1161 mhz (16.5%) | 161 μw | -93.09 dbc/hz | -113.23 dbc/hz |
| 27 | 857 mhz-1049 mhz (18.3%) | 157 μw | -93.50 dbc/hz | -113.93 dbc/hz |
| 70 | 585 mhz-771 mhz (24.12%) | 152 μw | -92.88 dbc/hz | -112.34 dbc/hz |

fig. 5 tuning range of vco at different temperatures for vcntr1=0.2v and vcntr2 varied from 0v to 1v (pre layout simulation)

fig. 6 tuning range of vco at normal temperature for vcntr1=0.2v and vcntr2 varied from 0v to 1v (pre and post layout simulation at 27°c)

table 4 corner analysis at vcntr1=0.2v and vcntr2=0.1v (output noise in db, phase noise in dbc/hz)

| process corner | pre layout @ 1 mhz, output noise | pre layout @ 1 mhz, phase noise | post layout @ 1 mhz, output noise | post layout @ 1 mhz, phase noise | pre layout @ 10 mhz, output noise | pre layout @ 10 mhz, phase noise | post layout @ 10 mhz, output noise | post layout @ 10 mhz, phase noise |
| nn | -93.10 | -92.07 | -93.23 | -92.61 | -112.26 | -111.25 | -113.00 | -112.16 |
| ff | -94.68 | -93.32 | -95.38 | -93.87 | -113.45 | -112.33 | -114.12 | -113.11 |
| fs | -96.12 | -95.54 | -96.62 | -95.93 | -116.10 | -114.56 | -116.96 | -115.22 |
| sf | -95.22 | -94.56 | -95.88 | -94.89 | -115.31 | -113.34 | -116.45 | -114.31 |
| ss | -94.20 | -93.40 | -94.66 | -93.84 | -112.87 | -111.86 | -113.10 | -112.53 |

4. implementation

the proposed four-stage vco design is implemented using cadence cmos 90nm technology. the device dimensions used in the circuit are given in table 1. the tuning ranges were analysed by varying vcntr2 from 0v to 1v for different values of vcntr1. vcntr1 was divided into two ranges, the lower one from 0 to 0.5v and the upper one from 0.5v to 1.0v. the optimum values for maximizing the tuning range in the two cases were found to be 0.2v (lower) and 0.77v (upper). the oscillating frequency ranges from 55 mhz to 606 mhz (91% approx.) for the lower band with the control voltage vcntr1=0.2v, as shown in fig. 5 and table 2. whenever the path delay is high, the oscillating frequency is found to be low, and vice versa. fig. 7 and table 3 show the variation of the tuning range due to a change in vcntr2 while keeping vcntr1=0.77v. vcntr2 varies from 1v to 0v and the tuning range is found to be 857 mhz to 1049 mhz (18.30%) at normal temperature. this can be considered operation in the higher frequency band. the benefit of the circuit is therefore that the same circuit can be operated in two different frequency bands, thereby increasing its tuning range.

effect of temperature: through the changes in transconductance gain ($g_m$), threshold voltage ($V_{th}$), electron and hole mobility ($\mu_n$ and $\mu_p$) and parasitic capacitances, the transistors have the biggest impact on the frequency drift. these quantities are obtained as follows [33]:

$$g_m = \mu_n C_{ox} \frac{W}{L}\left(V_{gs} - V_{th}\right) \tag{20}$$

$$V_{th} = V_{th0} - \alpha\left(T - 27^{\circ}\right) \tag{21}$$

$$\mu(T) = \mu(T = 27^{\circ})\left(\frac{T}{27^{\circ}}\right)^{-\frac{3}{2}} \tag{22}$$

the operation of the circuit is tested by varying the temperature; it is found that the operating frequency is reduced with an increase in temperature. the circuit is analysed by varying the temperature from 0°c to 70°c in both pre layout and post layout at vdd=1v, with vcntr1 equal to 0.77v and 0.2v respectively and vcntr2 varied from 0 to 1v. comparative analysis between the pre and post layout simulation with respect to tuning range are shown in fig.
6 and fig. 8; it is found that the changes in frequency tuning range due to the control voltage vcntr2 are close to each other in both cases. the corner analysis of the circuit in terms of output noise and phase noise is given in tables 4 and 5 for all five process corners, namely nn, ff, fs, sf and ss, and the results are found to be satisfactory. the delay circuit layout measures 5.87µm x 6.86µm, while the four-stage vco layout measures 24.98µm x 6.86µm, spanning an area of 171.42µm². they are depicted in fig. 9 and fig. 10.

fig. 7 tuning range of vco at different temperatures for vcntr1=0.77v and vcntr2 varied from 0v to 1v.

fig. 8 tuning range of vco at normal temperature for vcntr1=0.77v and vcntr2 varied from 0v to 1v (pre and post layout simulation at 27°c)

table 5 corner analysis at vcntr1=0.77v and vcntr2=0.1v (output noise in db, phase noise in dbc/hz)

| process corner | pre layout @ 1 mhz, output noise | pre layout @ 1 mhz, phase noise | post layout @ 1 mhz, output noise | post layout @ 1 mhz, phase noise | pre layout @ 10 mhz, output noise | pre layout @ 10 mhz, phase noise | post layout @ 10 mhz, output noise | post layout @ 10 mhz, phase noise |
| nn | -96.23 | -93.50 | -97.11 | -94.78 | -115.31 | -113.93 | -116.35 | -114.42 |
| ff | -95.68 | -94.33 | -96.28 | -94.96 | -116.20 | -114.42 | -116.85 | -115.36 |
| fs | -99.72 | -97.23 | -100.22 | -98.17 | -118.51 | -117.81 | -119.24 | -119.42 |
| sf | -98.22 | -96.48 | -98.89 | -97.33 | -117.54 | -116.71 | -118.21 | -117.36 |
| ss | -97.05 | -93.60 | -97.76 | -94.28 | -115.86 | -114.11 | -116.10 | -115.06 |

fig. 9 layout of the proposed delay cell

fig. 10 layout of the 4 stage vco

the layout design shown in fig. 9 and fig. 10 can be optimized further to reduce the area significantly and to bring the results closer to those obtained at the schematic level. one of the main advantages of the proposed circuit is that the same circuit can be used in the low frequency range as well as the high frequency range by varying the control voltages.
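the percentages quoted next to the tuning ranges in this section and in tables 2 and 3 are consistent with the ratio $(f_{max} - f_{min})/f_{max}$. this definition is inferred from the reported numbers and is not stated explicitly in the text, so the short python check below should be read as an assumption; the function name is hypothetical.

```python
def tuning_range_percent(f_min_mhz, f_max_mhz):
    """Tuning range as a percentage of the maximum frequency (assumed definition)."""
    return 100.0 * (f_max_mhz - f_min_mhz) / f_max_mhz


# (f_min, f_max) pairs in MHz taken from tables 2 and 3 (pre layout).
bands = {
    "lower band, 0 degC": (42, 672),
    "lower band, 27 degC": (55, 606),
    "upper band, 0 degC": (1040, 1230),
    "upper band, 27 degC": (857, 1049),
}
for label, (f_min, f_max) in bands.items():
    print(f"{label}: {tuning_range_percent(f_min, f_max):.2f} %")
```

under this definition a wide lower band such as 55 mhz to 606 mhz yields a large percentage mainly because $f_{min}$ is small relative to $f_{max}$, while the upper band spans a comparable absolute width of about 190 mhz but a much smaller fraction of its maximum frequency.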
however, the tuning range is comparatively low. additionally, the delay circuit contains a large number of transistors, which raises the transistor count of the oscillator even further. the realization column of table 6 indicates whether the compared parameters, namely oscillation frequency, power consumption and phase noise, are measured or simulated. in our case post-layout values are considered.

table 6 comparison parameters

| references | technology (nm) | supply voltage (v) | number of stages (n) | oscillation frequency range (ghz) | power consumption (mw) | phase noise (dbc/hz) | fom (dbc/hz) | realization level |
| 11 | 180 | 1.8 | 2 | 2.5-5.2 (74%) | 17 | -90.1 @ 1 mhz | --- | measured |
| 16 | 180 | 1 | 2 | 0.473-7.54 (93.72%) | 7.41 | -107.1 @ 10 mhz | -150.44 | simulated |
| 31 | 180 | 1.8 | 4 | 0.455 to 0.505, 0.00139 to 0.00145 | 1.98 (lower band) and 9.7 (upper band) | --- | --- | simulated |
| 12 | 180 | 1 | 4 | 0.479-4.09 (88.28%) | 10 | -93.3 @ 1 mhz | -154.4 | measured |
| 18 | 90 | 1 to 3 | 3 | 1.379-1.970 (30%), 0.650-2.584 (74.84%), 0.556-2.584 (78.48%) | 0.129 to 5685 | -89.779 @ 1 mhz | -154.51 | simulated |
| 28 | 65 | 1.2 | 30 | 0.556 | 0.72 | -101.7 @ 1 mhz | -158 | simulated |
| 15 | 65 | 1.8 | 3 | 0.470-0.964 (51.24%) | 4.1 | -116 @ 1 mhz | -169 | measured |
| 29 | 90 | 1.2 | 4 | 9.21 | 2.092 | -137.9 @ 1 mhz | --- | post layout |
| proposed work | 90 | 1 | 4 | 0.048 to 0.57 (91.57%) and 0.82 to 1.01 (18.8%) {distributed band} | 0.151 (lower band) and 0.157 (upper band) | | -152.40 and -160.64 | post layout |

5. conclusion

a four-stage vco is designed using a novel differential delay circuit. the vco operates in two distributed frequency bands, a lower and an upper one, which is one of its main advantages. pre layout simulation shows that the lower operating band is 55 mhz to 606 mhz at normal temperature when the control voltage vcntr1 is maintained at 0.2v while the other control voltage, vcntr2, is varied from 0v to 1v.
the phase noise at 1 mhz and 10 mhz offset is found to be -92.07 dbc/hz and -111.25 dbc/hz in the lower frequency band. the vco operates from 857 mhz to 1049 mhz (upper band) when vcntr1 is 0.77v and vcntr2 is varied from 0v to 1v. in this band the phase noise at 1 mhz and 10 mhz offset is -93.50 dbc/hz and -113.93 dbc/hz. the pre layout power consumption of the vco at 27°c is found to be 151µw and 157µw for the operating frequencies of 606 mhz and 1049 mhz respectively.

references

[1] t. miyazaki, m. hashimoto and h. onodera, "a performance comparison of plls for clock generation using ring oscillator vco and lc oscillator in a digital cmos process", in proceedings of asia and south pacific design automation conference (aspdac), 2004, pp. 545-546.
[2] h. ghonoodi, h. miar-naimi and m. gholami, "analysis of frequency and amplitude in cmos differential ring oscillators", integration, vol. 52, pp. 253-259, january 2016.
[3] m. gogoi and p. k. dutta, "review and analysis of charge-pump phase-locked loop", in proceedings of 1st international conference on electronics systems and intelligent computing (esic), 2020, pp. 565-574.
[4] j. johnson, m. ponnambalam and p. v. chandramani, "comparison of tunability and phase noise associated with injection locked three staged single and differential ended vcos in 90nm cmos", in proceedings of 4th international conference on signal processing, communication and networking, 2017, pp. 1-4.
[5] w. t. lee, j. shim and j. jeong, "design of a three-stage ring-type voltage controlled oscillator with a wide tuning range by controlling the current level in an embedded delay cell", microelectronics j., vol. 44, pp. 1328-1335, dec. 2013.
[6] s. salem, m. tajabadi and m. saneei, "the design and analysis of dual control voltages delay cell for low power and wide tuning range ring oscillators in 65nm cmos technology for cdr applications", aeu international journal electronics communication, vol. 82, pp. 406-412, dec. 2017.
[7] v. muddi, k. d. shinde and b. k.
shivaprasad, "design and implementation of 1ghz current starved voltage control oscillator (vco) for pll using 90nm cmos technology", in proceedings of international conference on control, instrumentation, communication and computational technologies (iccicct), 2015, pp. 335-339.
[8] j. g. maneatis and m. a. horowitz, "precise delay generation using coupled oscillators", ieee j. solid-state circuits, vol. 28, pp. 1273-1282, dec. 1993.
[9] s. t. yan and h. c. luong, "a 3-v 1.3-to-1.8-ghz cmos voltage-controlled oscillator with 0.3-ps jitter", ieee trans. circuits syst. ii: analog and digital signal proc., vol. 45, pp. 876-880, july 1998.
[10] c. h. park and b. kim, "a low-noise, 900-mhz vco in 0.6-μm cmos", ieee j. solid-state circuits, vol. 34, pp. 586-591, june 1998.
[11] w. h. tu, j. y. yeh, h. c. tsai and c. k. wang, "a 1.8 v 2.5-5.2 ghz cmos dual-input two-stage ring vco", in proceedings of asia pacific conference on advanced system integrated circuits, 2004, pp. 134-137.
[12] m. l. sheu, y. s. tiao and l. j. taso, "a 1-v 4-ghz wide tuning range voltage-controlled ring oscillator in 0.18 μm cmos", microelectronics j., vol. 42, pp. 897-902, april 2011.
[13] m. parvizi, a. khodabakhsh and a. nabavi, "low-power high-tuning range cmos ring oscillator vcos", in proceedings of the ieee international conference on semiconductor electronics, 2008, pp. 40-44.
[14] s. suman, k. g. sharma and p. k. ghosh, "design of pll using improved performance ring vco", in proceedings of the international conference on electrical, electronics and optimization techniques (iceeot), 2016, pp. 3479-3483.
[15] h. gao, r. xia, x. wang, t. zhou and m. zhou, "wideband ring oscillator with switched resistor array for low tuning sensitivity", analog integr. circuits and signal process., vol. 89, pp. 493-498, sept. 2016.
[16] n. gargouri, d. b. issa, z. sakka, a. kachouri and m.
samet, "design and optimization of differential ring oscillator for ir-uwb applications in 0.18μm cmos technology", j. circuits syst. comput., vol. 26, pp. 1750080-1-1750080-15, dec. 2016.
[17] s. salem, h. zandevakili, a. mahani and m. saneei, "fault-tolerant delay cell for ring oscillator application in 65 nm cmos technology", iet circuits devices syst., vol. 12, pp. 233-241, nov. 2017.
[18] m. kumar and d. dwivedi, "a low power cmos-based vco design with i-mos varactor tuning control", j. circuits syst. comput., vol. 27, pp. 1850160-1-1850160-14, jan. 2018.
[19] s. y. lee, s. amakawa, n. ishihara and k. masu, "2.4-10 ghz low-noise injection-locked ring voltage controlled oscillator in 90nm complementary metal oxide semiconductor", jpn. j. appl. phys., vol. 50, pp. 04de03-1-04de03-5, april 2011.
[20] a. ramazani, s. biabani and g. hadidi, "cmos ring oscillator with combined delay stages", aeu - int. j. electron. commun., vol. 68, pp. 515-519, june 2014.
[21] m. karimi-ghartemani, h. karimi and m. r. iravani, "a magnitude phase-locked loop system based on estimation of frequency and in-phase/quadrature-phase amplitudes", ieee trans. ind. electron., vol. 51, pp. 511-517, april 2004.
[22] a. sharma, saurabh and s. biswas, "a low power cmos voltage controlled oscillator in 65nm technology", in proceedings of international conference on computer communication and informatics, 2014, pp. 1-5.
[23] s. kamran and n. ghaderi, "a novel high speed cmos pseudo-differential ring vco with wide tuning control voltage range", in proceedings of the iranian conference on electrical engineering (icee), 2017, pp. 201-204.
[24] z. chen and t. lee, "the study of a dual-mode ring oscillator", ieee trans. circuits syst. ii: express briefs, vol. 58, no. 4, pp. 210-214, april 2011.
[25] g. k. sharma, a. k. johar, t. b. kumar and d. boolchandani, "design and analysis of wide tuning range differential ring oscillator (wtr-dro)", analog integr. circuits signal process., vol. 103, pp.
17-29, april 2020.
[26] j. m. kim, s. kim, i. y. lee, s. k. han and s. g. lee, "a low noise four-stage voltage controlled ring oscillator in deep-submicrometer cmos technology", ieee trans. circuits syst. ii: express briefs, vol. 60, no. 2, pp. 71-75, feb. 2013.
[27] i. kovacs and m. neag, "new dual-loop topology for ring vcos based on latched delay cells", in proceedings of the ieee international symposium on circuits and systems (iscas), 2018, pp. 1-5.
[28] t. yoshio, t. kihara and t. yoshimura, "a 0.55 v back-gate controlled ring vco for adcs in 65 nm sotb cmos", in proceedings of the ieee asia pacific microwave conference (apmc), 2017, pp. 946-948.
[29] s. k. saw, s. k. yadav, m. maiti, a. j. mondal and a. majumder, "a design approach of higher oscillation vco made of cs amplifier with varying active load", microsyst. technol., vol. 26, pp. 1-10, feb. 2020.
[30] a. a. abidi, "phase noise and jitter in cmos ring oscillators", ieee j. solid-state circuits, vol. 41, no. 8, pp. 1803-1816, aug. 2006.
[31] s. pahlava and m. b. ghaznavi-ghoushchi, "1.45 ghz differential dual band ring based digitally-controlled oscillator with a reconfigurable delay element in 0.18 μm cmos process", analog integr. circuits signal process., vol. 89, no. 2, pp. 461-467, nov. 2016.
[32] m. katebi, a. nasri and s. toofan, "a wide tuning range and low phase noise vco using new capacitor bank structure", majlesi j. electr. eng., vol. 12, pp. 95-103, 2018.
[33] m. katebi, a. nasri, s. toofan and h. zolfkhani, "a temperature compensation voltage controlled oscillator using a complementary to absolute temperature voltage reference", int. j. eng., vol. 32, no. 5, pp. 710-719, 2019.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp.
437-454 https://doi.org/10.2298/fuee2203437a © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

improving performance of transmission networks using facts through continuation power flow method

jamal alnasseir
electric power department, faculty of mechanical and electrical engineering, damascus university, syria

abstract. over the past 50 years, modern electrical systems have become more complex, as they extend across the geographical boundaries of neighboring countries. power systems therefore face many challenges, because they are exposed to difficult operating conditions. voltage instability is the most frequent of these phenomena, and it can lead to the collapse of the power system. to avoid power outages in the system (especially blackouts), the power system must be analyzed in order to maintain voltage stability under the expected difficult operating conditions. the main objective is to determine the maximum load capacity of the system and the causes of voltage instability. the voltage instability problem is related to the nature of nonlinear loads, so different load characteristics must be taken into consideration when analyzing voltage stability. this study aims to find the maximum load capacity of the studied network by using the continuation power flow (cpf) method. then, the performance of this network with a flexible alternating current transmission system (facts) is evaluated. facts systems present a promising solution for improving voltage stability by improving the power transmission capacity and the controllability of the parameters of existing power networks. the study is conducted on a reference network under normal working conditions; one of the facts systems is then installed to show its effect on improving voltage stability.
the continuation power flow method will be used to find p-v curves, which in turn help to determine the conditions of maximum loading while maintaining stability and to identify the bus bar with the lowest voltage, on which the flexible ac systems will be installed. the software environment matlab/psat will be used for modeling and simulation.

key words: voltage stability, continuation power flow (cpf), maximum load conditions, flexible alternating current transmission system (facts), thyristor-controlled series capacitor (tcsc)

received february 22, 2022; revised april 7, 2022; accepted may 5, 2022
corresponding author: jamal alnasseir
electric power department, faculty of mechanical and electrical engineering, damascus university, syria
e-mail: jamalnasseir@yahoo.de

1. introduction

power systems face many challenges because energy demand is increasing drastically. to meet this demand, the generated power must be increased, and there are various ways of increasing power generation. the power must reach the end consumers through the existing transmission lines, and/or new transmission lines must be built. in any case, the loading of the existing transmission lines will increase. if these transmission lines are overloaded, the problem of voltage stability will appear [1]. the voltage profile of the transmission lines will be affected and, as a result, the power losses will increase. the use of flexible alternating current transmission systems (facts) at specific locations in the system will solve most of these problems in important power transmission lines [1]. facts devices improve the loading capacity of transmission lines and the voltage levels, and reduce power losses, both under normal operating conditions and during faults. facts devices depend on sophisticated power electronic elements, which help control the power flow in transmission lines.
these systems can be connected in series or in parallel to important transmission lines. through them, the reactive and active power can be controlled [1]. the continuation power flow method is considered one of the best methods for load flow analysis [1, 2, 3]. studies have confirmed that this method is effective for studying voltage stability in the transmission system. the cpf is characterized by reduced execution time and computational burden, in addition to accuracy and ease of implementation. this method has been applied to the ieee 11-bus system. in this paper, the svc, tcsc and upfc systems will be connected to a specific bus bar in the studied transmission system in order to improve its voltage stability. the continuation power flow (cpf) method will be used to study the impact of these systems on the transmission system; other studies did not use the cpf method to improve the stability of transmission networks in the presence of facts systems. the cpf method proceeds in steps of prediction and correction, so the jacobian matrix does not become singular. the principle of this method is to locate the weakest transmission line, where one of the flexible alternating current systems will be connected. then, an analytical study will be carried out to compare the performance of the network before and after adding the aforementioned system [1, 3, 4, 5].

2. voltage stability

voltage stability is defined as the ability of the power system to maintain the voltage of all buses within acceptable values during normal conditions or after the occurrence of a disturbance. the system is subjected to voltage instability when overloading occurs: the parameters of the system change and the voltage drops rapidly. in this case, the automatic control units are unable to control the system changes and suppress them accordingly. it may take several seconds or even (10-20) minutes to suppress the changes in the voltage.
if the disturbance keeps occurring, the voltage becomes unstable, which leads to a collapse in the voltage of generators and transmission lines. in other words, the main reason behind the instability of the power system is that the system is unable to meet the demand for reactive power [3, 4, 5, 12, 13].

3. p-v curves

the p-v curves express the changes in voltage when the active power of the load changes. these curves are the result of running load flows at different levels of uniformly distributed load at constant power factor. when the number of system branches increases, the time required to find these curves also increases, because the time required to calculate each load flow increases. the p-v curves provide the voltage stability index of a network as well as the voltage collapse point. voltage stability analysis provides transmission limits through the study of p-v curves. moreover, these curves give results for the entire system and identify the disturbances that lead to system blackouts or emergencies [3, 4, 5, 6, 7, 12, 13].

4. continuation power flow (cpf) method

the cpf method consists of the following steps:

1 − run load flow [3, 4, 5, 6, 7, 12, 13]: the principle of continuation load flow is to trace the solutions of a nonlinear system through steps of prediction and correction, considering the nonlinear equations

$$f(\delta, V, \lambda) = 0 \tag{1}$$

where $\delta$ and $V$ denote the bus voltage angles and magnitudes, and $\lambda$ is the load factor, with $0 \le \lambda \le \lambda_{critical}$. the following equations give the conventional load flow for a bus $i$:

$$P_{Gi} - P_{Li} - P_{Ti} = 0, \qquad Q_{Gi} - Q_{Li} - Q_{Ti} = 0 \tag{2}$$

where $P_{Gi}$, $Q_{Gi}$ are the generated active and reactive power, respectively, $P_{Li}$, $Q_{Li}$ are the active and reactive power of the loads, and $P_{Ti}$, $Q_{Ti}$ are the net power injected into bus $i$.
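the meaning of the p-v curve and of the critical load factor can be illustrated on the smallest possible network. the python sketch below assumes a lossless two-bus system (a source of voltage $E$ feeding a constant power factor load through a reactance $X$), which is not the network studied in this paper, and it uses the closed-form solution of that system rather than the predictor-corrector procedure of the cpf method. the function names, the base load and the per-unit values are hypothetical.

```python
import math


def max_load_power(e, x, phi=0.0):
    """Maximum deliverable active power (nose of the P-V curve), lossless two-bus case."""
    return e ** 2 * (1.0 - math.sin(phi)) / (2.0 * x * math.cos(phi))


def load_voltage(p, e, x, phi=0.0):
    """Upper-branch load voltage for active power p; None beyond the nose point."""
    q = p * math.tan(phi)                      # constant power factor load
    b = e ** 2 - 2.0 * q * x
    disc = b ** 2 - 4.0 * x ** 2 * (p ** 2 + q ** 2)
    if disc < 0.0:
        return None                            # no load flow solution exists
    return math.sqrt((b + math.sqrt(disc)) / 2.0)


# Hypothetical per-unit data: E = 1.0, X = 0.5, base load P0 = 0.2 at unity power factor.
E, X, P0 = 1.0, 0.5, 0.2
lam_critical = max_load_power(E, X) / P0
print(f"critical load factor: {lam_critical:.2f}")
for lam in (0.0, 1.0, 2.0, 3.0, 4.0, lam_critical, 1.05 * lam_critical):
    v = load_voltage(lam * P0, E, X)
    print(f"lambda = {lam:.2f}  V = {'no solution' if v is None else f'{v:.4f}'}")
```

for these values the nose of the curve lies at an active power of 1 pu and a voltage of about 0.707 pu; beyond it the load flow equations have no real solution, which is the point where a conventional newton-raphson load flow stops converging and a continuation method is needed.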
therefore, the net power equations are given as follows:

$$P_{Ti} = \sum_{j=1}^{n} V_i V_j Y_{ij} \cos(\delta_i - \delta_j - \theta_{ij}), \qquad Q_{Ti} = \sum_{j=1}^{n} V_i V_j Y_{ij} \sin(\delta_i - \delta_j - \theta_{ij}) \tag{3}$$

2 − prediction step: the step size depends on the direction of the tangent at the previous solution point. equation (4) gives the tangent [3, 4, 5, 6, 7].

$$d\left[f(\delta, V, \lambda)\right] = 0 \tag{4}$$

applying partial differentiation to equation (4) gives equation (5):

$$\frac{\partial f}{\partial \delta}\, d\delta + \frac{\partial f}{\partial V}\, dV + \frac{\partial f}{\partial \lambda}\, d\lambda = 0 \tag{5}$$

in matrix form this is equation (6), in which the column vector of differentials represents the tangent vector $t$:

$$\begin{bmatrix} \dfrac{\partial f}{\partial \delta} & \dfrac{\partial f}{\partial V} & \dfrac{\partial f}{\partial \lambda} \end{bmatrix} \begin{bmatrix} d\delta \\ dV \\ d\lambda \end{bmatrix} = 0 \tag{6}$$

fig. 1 p-v characteristic [2, 7] (axes: active power loading versus bus voltage; labels: stable region, unstable region, voltage stability limit, critical point (maximum load point))

$$\begin{bmatrix} \dfrac{\partial f}{\partial \delta} & \dfrac{\partial f}{\partial V} & \dfrac{\partial f}{\partial \lambda} \\ & e_k & \end{bmatrix} [t] = \begin{bmatrix} 0 \\ \pm 1 \end{bmatrix} \tag{7}$$

where $e_k$ is a row vector whose elements are all zero except the $k$-th element, which is equal to 1, and

$$[t] = \begin{bmatrix} d\delta \\ dV \\ d\lambda \end{bmatrix} \tag{8}$$

the tangent vector $t$ is obtained from equation (9):

$$[t] = \begin{bmatrix} \dfrac{\partial f}{\partial \delta} & \dfrac{\partial f}{\partial V} & \dfrac{\partial f}{\partial \lambda} \\ & e_k & \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \tag{9}$$

solving these equations gives the predicted solution, with $\sigma$ denoting the step size:

$$\begin{bmatrix} \delta^{*} \\ V^{*} \\ \lambda^{*} \end{bmatrix} = \begin{bmatrix} \delta \\ V \\ \lambda \end{bmatrix} + \sigma \begin{bmatrix} d\delta \\ dV \\ d\lambda \end{bmatrix} \tag{10}$$

3 − correction step: the correction step comes after choosing the size of the prediction step along the tangent vector [3, 4, 5, 6, 7].

$$x = \begin{bmatrix} \delta \\ V \\ \lambda \end{bmatrix}, \qquad x \in \mathbb{R}^{2n_1 + n_2 + 1} \tag{11}$$

where $n_1$ and $n_2$ are numbers of buses and $n$ is the total number of buses in the system; the dimension of $x$ can equivalently be expressed in terms of $n$, $n_g$ and $n_s$, where $n_g$ is the number of pv (generator) buses and $n_s$ is the number of infinite buses [3, 4, 5, 6, 7]. an additional equation is appended to the set of equations to fix the value of one of the state variables:

$$x_k = \eta \tag{12}$$

therefore, the resulting set of equations is given by (13) [3, 4, 5, 6, 7].
$$\begin{bmatrix} f(x) \\ x_k - \eta \end{bmatrix} = [0] \tag{13}$$

figure 2 shows the scheme of the calculation method in the cpf algorithm [10].

fig. 2 flow chart of calculation method used in the cpf algorithm [10] (blocks: start; read network parameters; initialize variables (v, p, q); newton-raphson power flow; determine continuation parameters; calculate tangent factor; prediction solution; check whether reactive power generation limits are violated, with pv to pq or slack q conversion; perform correction; end)

5. type of flexible alternating current transmission systems (facts) [1]

facts controllers are normally connected in series or in parallel to the transmission lines. these controllers enhance the power transfer capability of the existing transmission lines. they also improve the voltage stability of the transmission system. when the power system is subjected to external disturbances, these controllers help it to regain its normal state. effective reactive power management in the transmission system is achieved using these controllers [1, 11]. series compensation improves the maximum power transmission capacity of the line. the net effect is a lower load angle for a given power transmission level and, therefore, a higher stability margin. the reactive-power absorption of a line depends on the transmission current, so when series capacitors are employed, the resulting reactive-power compensation is automatically adjusted proportionately. also, because series compensation effectively reduces the overall line reactance, the net line-voltage drop is expected to become less susceptible to the loading conditions [1, 11]. applying series capacitors in a long line amounts to placing a lumped impedance at a point. therefore, the following factors need careful evaluation:
▪ the voltage magnitude across the capacitor banks (insulation).
▪ the fault currents at the terminals of a capacitor bank.
▪ the placement of shunt reactors in relation to the series capacitors (resonant over-voltages).
▪ the number of capacitor banks and their location on a long line (voltage profile).

shunt devices, in contrast, may be connected permanently or through a switch. shunt reactors compensate for the line capacitance, and because they control over-voltages at no load and light load, they are often connected permanently to the line, not to the bus [1, 11]. shunt capacitors are used to increase the power-transfer capacity and to compensate for the reactive-voltage drop in the line. the application of shunt capacitors requires careful system design. the circuit breakers connecting shunt capacitors should withstand high-charging in-rush currents and also, upon disconnection, should withstand voltages of more than 2 pu, because the capacitors are then left charged for a significant period until they are discharged through a discharge circuit with a large time constant. also, the addition of shunt capacitors creates higher-frequency resonant circuits and can therefore lead to harmonic over-voltages on some system buses [1, 11]. facts systems can thus be classified into three main groups:
▪ series control systems (like tcsc),
▪ shunt control systems (like svc),
▪ shunt-series composite control systems (like upfc) [1, 11].

5.1. thyristor-controlled series capacitor (tcsc)

the thyristor-controlled series capacitor (tcsc) is one of the facts types. it consists of a capacitor connected in parallel with a reactor that is controlled by thyristors, as shown in figure 3. figure 3 also shows the metal-oxide arrester (varistor) installed to avoid over-voltage across the unit. several tcsc units are connected in series to meet the total required compensation, as shown in figure 4 [8, 9, 11, 12].

fig.
3 power circuit of tcsc compensator

fig. 4 series connected tcsc compensators

5.2. static var compensator

the svc is an advanced technology that is widely used in transmission applications for several purposes. the primary purpose is usually rapid control of voltage at weak points in the network. worldwide, the number of installations is increasing steadily. the ieee definition of an svc is as follows: "static var compensator (svc): a shunt-connected static var generator or absorber whose output is adjusted to exchange capacitive or inductive current so as to maintain or control specific parameters of the electrical power system (typically bus voltage)" [8, 9, 11, 12, 13].

svc is an umbrella term for several devices. the svc devices discussed in the following sections are the tcr (thyristor controlled reactor), fc (fixed capacitor) and tsc (thyristor switched capacitor). the components of an svc may include transformers between the high-voltage network bus and the medium-voltage bus where the power electronic equipment is connected, a fixed (usually air-core) reactor of inductance l, and a bidirectional thyristor. the thyristors are fired symmetrically at an angle α in a controlled range of 90° to nearly 180° with respect to the capacitor voltage. the tsc is often used in order to decrease standby losses. figure 5 shows a common structure of an svc [8, 9, 11, 12, 3].

fig. 5 common structure of svc

5.3. unified power flow controllers

the unified power flow controller (upfc) is a facts device that combines series and shunt compensation. it consists of two voltage source converters (vscs) connected to a common dc capacitor bank. the first unit of the upfc is a static compensator (statcom), which is connected to the vsc via a parallel (shunt) transformer and then to the dc bus.
the second unit is a static synchronous series compensator (sssc). it is likewise connected to the vsc via a series transformer and then to the dc bus. the upfc provides power flow control capabilities and instantaneously satisfies the power flow regulation requirements (see figure 6). the major control techniques are as follows [8, 9, 11, 12, 18]:
▪ reactive shunt compensation or bus voltage regulation;
▪ reactive series compensation or line impedance compensation [8, 9, 12, 18].

fig. 6 upfc block diagram (shunt transformer, shunt converter (statcom), dc link, series converter (sssc), series transformer, transmission line)

among these, the upfc achieves shunt voltage regulation by injecting an in-phase or anti-phase voltage that varies within the maximum and minimum injection limits. these limits are set by the ratings of the shunt converter (see figure 7) [8, 9, 12, 18].

fig. 7 upfc voltage injection

6. practical study by using matlab-psat software

in this research, matlab-psat (power system analysis toolbox) is used to model the power system networks. psat runs within the matlab environment and is a software package designed to perform static and dynamic analysis of electrical power systems. it can be used for the following calculations:
▪ load flow,
▪ continuation power flow,
▪ optimal power flow,
▪ static and transient stability analysis of electrical networks,
▪ voltage stability analysis of electrical networks during static and transient conditions.

figure 8 shows the graphical user interface (gui) of matlab-psat, which is used to build the electrical power network. the user can enter the network data and build the single-line diagram using the psat-simulink library, after which the data is saved.
the data file is then loaded into the gui, and the required network studies are carried out.

fig. 8 user interface of matlab-psat

7. results and discussion

7.1. application of the proposed method to a standard ieee 11-bus system

the ieee 11-bus system is a standard system often used by power system specialists for research. figure 9 shows the diagram of the studied network, which consists of 11 buses and four cylindrical-rotor synchronous generators. each generator is rated 20 kv and 900 mva, and all generators have similar parameters. the system has eight transmission lines, two loads and two capacitors. the frequency is 60 hz, the transmission voltage level is 230 kv and the base power is 100 mva.

fig. 9 diagram of ieee 11-bus system

7.2. voltage stability analysis of ieee 11-bus system utilizing (cpf) method

load flow analysis is used to calculate the bus voltages and to determine the weakest buses in the network during normal operating conditions. figure 10 shows the 11-bus system modelled in matlab-psat.

fig. 10 ieee 11-bus system modeled by matlab-psat

the continuation power flow analysis is carried out in matlab-psat. the results for the 11-bus system are presented in table 1, and figure 11 shows the voltage of all buses of the studied network calculated by the cpf method.

table 1 cpf results under normal operating conditions
bus nr.    v [p. u.]
1          1.029
2          1.0089
3          1.029
4          1.0086
5          0.997
6          0.95987
7          0.93523
8          0.90731
9          0.94874
10         0.96691
11         0.99876

table 1 and figure 11 show that the critical voltages in the network are those of bus 6, bus 7, bus 8, bus 9 and bus 10. these buses are therefore the weakest buses in the network and the most sensitive to network changes, and their probability of voltage collapse is high compared with the other buses.
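the screening of weak buses from table 1 can be reproduced with a few lines of python. the sketch below is only an illustration: the voltages are copied from table 1, while the 0.97 p. u. threshold and the function name are assumptions chosen so that the result matches the buses named in the text; they are not part of the psat workflow.

```python
# bus voltages in p.u. copied from table 1 (cpf results, normal operating conditions)
TABLE1_VOLTAGES = {
    1: 1.029, 2: 1.0089, 3: 1.029, 4: 1.0086, 5: 0.997, 6: 0.95987,
    7: 0.93523, 8: 0.90731, 9: 0.94874, 10: 0.96691, 11: 0.99876,
}


def weakest_buses(voltages, threshold=0.97):
    """Return the buses whose voltage is below `threshold`, lowest voltage first.

    The default threshold of 0.97 p.u. is an illustrative assumption,
    not a value taken from the paper.
    """
    weak = [bus for bus, v in voltages.items() if v < threshold]
    return sorted(weak, key=lambda bus: voltages[bus])


if __name__ == "__main__":
    print(weakest_buses(TABLE1_VOLTAGES))  # [8, 7, 9, 6, 10]
```

with this threshold the ranking starts with bus 8, which is also the bus with the lowest p-v curve in the cpf results.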
the software calculates the maximum loading factor of the network. as seen in figure 12 (taken from the matlab command window), after applying the continuation power flow to the studied network, the maximum loading factor is λ = 1.1481 [p. u.]. the network is stable up to this point and unstable beyond it, which is why it is called the maximum loading point.

fig. 11 voltage of all buses calculated by cpf before adding facts

fig. 12 value of maximum loading factor of the 11-bus system without the compensator

figure 13 illustrates the p-v curves calculated by the cpf method, which give the voltages of bus 5 to bus 11 as a function of the loading factor. figure 13 also identifies the buses with critical voltages in the studied network. the critical voltage curves are the lowest among these curves, and bus 8, which has the lowest curve, is the weakest bus.

fig. 13 p-v curves of the 11-bus system before adding facts

7.3. voltage stability analysis of ieee 11-bus system after adding the thyristor-controlled series capacitor (tcsc)

a tcsc compensator is connected in series with the transmission line (9-10), as shown in figure 14.

fig. 14 the studied network after adding the tcsc in series with transmission line (9-10)

the tcsc is placed in series with the transmission line (9-10) because bus 9 and bus 10 were found to be among the weakest buses in the network, as shown by the cpf results in table 1. additionally, this line is one of the network lines that carries the largest load and has large reactive losses (table 2). as observed in table 2, the active power injected into the line (9-10) is 1406.4035 mw, which is the largest value.
furthermore, the active and reactive losses are 20.5187 mw and 203.5155 mvar respectively, so the power losses of this line are larger than those of the other transmission lines. the maximum loading factor reflects these results: after adding the tcsc compensator in the line (9-10), its value is λ = 1.1754 [p. u.] (as indicated in fig. 15).

table 2 load flow results of the ieee 11-bus system before adding facts

fig. 15 the value of maximum loading factor of the 11-bus system with tcsc connected in series with the transmission line (9-10)

fig. 16 presents the values of the maximum loading factor of the studied network with the tcsc connected at different locations. the best location for the tcsc compensator was found to be the line (9-10), where the maximum loading factor λ is the highest. table 3 gives the voltage profile of the network obtained from the continuation power flow method after adding the tcsc compensator in series with line (9-10).

fig. 16 value of maximum loading factor with the tcsc connected in different locations

table 3 load flow results of the studied system after adding tcsc in series with the transmission line (9-10)
bus nr.    v [p. u.]
1          1.028813
2          1.00841
3          1.028533
4          1.007755
5          0.996724
6          0.96131
7          0.938492
8          0.919863
9          0.965957
10         0.967424
11         0.999152

fig. 17 provides the p-v curves after adding the tcsc in series with line (9-10); the graph shows an improvement in the critical voltage levels compared with the results before adding facts. these results are similar to those presented in reference [16], which reports that using a tcsc compensator in the transmission network improves the power transfer capacity and the loading factor.

fig.
17 p-v curves of the 11-bus system after adding tcsc in series with line (9-10)

fig. 16 plot labels: maximum loading factor (lambda), axis range 1.155 to 1.18; tcsc locations (9-10), (8-9), (7-8), (6-7)

7.4. voltage stability analysis of ieee 11-bus system after adding the static var compensator (svc)

an svc compensator is connected in parallel to busbar 8, as shown in fig. 18.

fig. 18 the studied network, after adding the svc in parallel at busbar-8

the svc is connected in parallel at busbar 8 because it is the weakest of the network buses, as shown in table 1. table 4 presents the busbar voltages obtained by applying the continuation power flow to the studied network. the voltage levels of the critical busbars (6 to 10) improved when compared with the first case. the maximum loading factor of the studied network also improved after connecting the svc: λ = 1.1501 [p. u.], as shown in fig. 19.

table 4 load flow results of the studied system after adding svc
bus nr.    v [p. u.]
1          1.0288
2          1.0084
3          1.0291
4          1.0302
5          1.0002
6          0.96945
7          0.95258
8          0.96411
9          0.97936
10         0.9915
11         1.009

fig. 19 value of maximum loading factor after connecting svc

fig. 20 provides the p-v curves after adding the svc; the graph shows an improvement in the critical voltage levels compared with the results before adding the facts system. this result is consistent with reference [17], since the svc compensator is able to improve and maintain the system's voltage profile within acceptable limits, reduce power losses and, in effect, improve the power transfer capability of the system.

fig. 20 p-v curves of the 11-bus system after adding svc

7.5.
voltage stability analysis of ieee 11-bus system after the addition of the upfc compensator

the upfc compensator was connected between busbars 7 and 8, as they are the weakest busbars in the studied network. the rated power of the upfc was 100 mvar. figure 21 shows the model of the studied network in the presence of the upfc.

fig. 21 the studied network after the addition of the upfc between busbars (7-8)

table 5 presents the busbar voltages obtained by applying the continuation power flow to the studied network. the voltage levels of the critical busbars (6 to 10) improved when compared with the first case.

table 5 load flow results of the studied system after adding upfc
bus nr.    v [p. u.]
1          1.028729
2          1.008499
3          1.028578
4          1.008196
5          1.002375
6          0.976393
7          0.966668
8          1.018
9          0.970404
10         0.978061
11         1.002686

table 5 shows that the voltage levels of the studied network obtained from the continuation power flow in the presence of the upfc are better than in the previous cases (tables 2, 3 and 4). the upfc gives the best performance in improving the maximum loading factor of the studied network compared with the shunt compensator and the series compensator, as it controls the voltage and the reactive power between the two busbars (7, 8). the maximum loading factor becomes λ = 1.1918 [p. u.], as shown in figure 22. figure 23 presents the p-v curves of the ieee 11-bus system in the presence of the upfc. the voltage levels are better than in the other cases, i.e., the voltage stability margin of the studied network is larger. the maximum loading point (the so-called voltage collapse point) is reached at a higher load than in the other cases, so the system can sustain a larger load increase before collapse than with the previous compensators.
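the maximum loading factors reported in sections 7.2 to 7.5 can be compared directly. the short python sketch below computes the relative gain in λ for each compensator from the values quoted in the text; the function names are hypothetical and the calculation is only a convenience for reading the results, not part of the cpf method itself.

```python
# maximum loading factors quoted in the text (p.u.)
BASE_LAMBDA = 1.1481
LAMBDA_WITH_FACTS = {"TCSC": 1.1754, "SVC": 1.1501, "UPFC": 1.1918}


def loading_gain_percent(base, compensated):
    """Relative increase of the maximum loading factor, in percent."""
    return 100.0 * (compensated - base) / base


def rank_compensators(base, results):
    """Return (name, gain in percent) pairs, largest gain first."""
    gains = {name: loading_gain_percent(base, lam) for name, lam in results.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gain in rank_compensators(BASE_LAMBDA, LAMBDA_WITH_FACTS):
        print(f"{name}: +{gain:.2f} %")
    # UPFC: +3.81 %
    # TCSC: +2.38 %
    # SVC: +0.17 %
```

the ranking (upfc, then tcsc, then svc) matches the ordering discussed in the text for this particular network.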
the upfc results agree with reference [18], which reports an improvement in the real and reactive power flows through the transmission line when a upfc is introduced and notes that the combined facts system (upfc) offers advantages such as reduced maintenance and the ability to control real and reactive power.

fig. 22 value of maximum loading factor after connecting upfc

fig. 23 p-v curves of the 11-bus system after adding the upfc system

8. conclusion

in this research the continuation power flow method has been used to analyze the possibility of increasing the loading capability of electrical power systems by means of facts systems (tcsc, svc, upfc). the study was carried out on the ieee 11-bus system. the research findings include:
1. the cpf is an effective method for determining the best location for the compensators, namely the weakest node of the network.
2. the use of reactive compensators (series, shunt, or composite) increases the loading capacity of electrical networks. the loading factor λ of the ieee 11-bus system increases from 1.1481 [p. u.] to 1.1754 [p. u.] in the presence of the tcsc, from 1.1481 [p. u.] to 1.1501 [p. u.] in the presence of the svc and from 1.1481 [p. u.] to 1.1918 [p. u.] in the presence of the upfc.
3. the tcsc compensator improves the performance of the network; the size of this improvement depends on the nature of the network, its loads and its line losses.
4. the upfc compensator shows the best performance among the compensators used in increasing the load capacity of electrical networks, regardless of the nature of the electrical network.
5. the use of different types of compensators (series or shunt), such as svc and upfc, is recommended so that their performance can be compared.

references

[1] n. karuppiah, s. muthubalaji, s. ravivarman, md.
asif and a. mandal, "enhancing the performance of transmission lines by facts devices using gsa and bfoa algorithms", int. j. eng. techn., vol. 7, no. 4.6, pp. 203–208, 2018.
[2] a. jalali and m. aldeen, "novel continuation power-flow algorithm", in proceedings of the ieee international conference on power system technology (powercon), 2016, p. 16487968.
[3] m. gudavalli, h. emulapalli and k. cherukupall, "voltage stability analysis using continuation power flow under contigency", j. theor. appl. inf. technol., vol. 99, no. 10, pp. 2373–2383, may 2021.
[4] s. b. bhaladhare, "improving voltage stability by using facts devices", iaset: j. electr. electron. eng. (iaset: jeee), vol. 1, no. 1, pp. 1–10, 2016.
[5] s. d. patel, h. h. raval and a. g. patel, "voltage stability analysis of power system using continuation power flow method", int. j. technol. res. eng., vol. 1, no. 9, pp. 763–767, may 2014.
[6] n. fnaiech, a. jendoubi and f. bacha, "voltage stability analysis in power system using continuation method and psat software", in proceedings of the 6th international renewable energy congress (irec), tunis, 2015, pp. 1–6.
[7] s. greene, i. dobson and f. l. alvarado, "sensitivity of the loading margin to voltage collapse with respect to arbitrary parameters", ieee trans. power syst., vol. 12, no. 1, pp. 262–272, feb. 1997.
[8] leonard l. grigsby, power system stability and control, crc press, 3rd edition, 2012.
[9] v. chauhan, b. singh and j. bala, "enhancement of static voltage stability using tcsc and svc", int. j. sci. eng. res., vol. 8, no. 4, pp. 127–130, april 2017.
[10] j. m. teixeira da silva marques da cruz, "extension of continuation power flow to incorporate dispersed generation", doctor thesis, lisbon technical university, 2016.
[11] r. m. mathur and r. k. varma, thyristor-based facts controllers for electrical transmission systems, john wiley & sons, 2002.
[12] i. g. adebayo, i. a. adejumobi and o. s.
olajire, "power flow analysis and voltage stability enhancement using thyristor controlled series capacitor (tcsc) facts controller", int. j. eng. adv. technol. (ijeat), vol. 2, no. 3, pp. 100–104, feb. 2013.
[13] t. van cutsem, "a method to compute reactive power margins with respect to voltage collapse", ieee trans. power syst., vol. 6, no. 1, pp. 145–156, feb. 1991.
[14] c. sharma and m. g. ganness, "determination of the applicability of using modal analysis for the prediction of voltage stability", in proceedings of the ieee/pes transmission and distribution conference and exposition, 2008, pp. 1–7.
[15] y. zhang, s. rajagopalan and j. conto, "practical voltage stability analysis", in proceedings of the ieee pes power and energy society general meeting, 2010, pp. 1–7.
[16] o. i. adebisi, i. a. adejumobi, p. e. ogunbowale and o. o. ade-ikuesan, "performance improvement of power system networks using flexible alternating current transmission systems devices: the nigerian 330 kv electricity grid as a case study", lautech j. eng. techno., vol. 12, no. 2, pp. 46–55, 2018.
[17] s. kumar, "implementation of tcsc on a transmission line model to analyze the variation in power transfer capability", int. j. res. (ijr), vol. 1, no. 8, pp. 1091–1098, sept. 2014.
[18] j. p. sai kumar reddy and p. janga, "power flow improvement in transmission line using upfc", int. j. electron. commun. technol. (iject), vol. 7, no. 4, pp. 9–12, 2016.

facta universitatis
series: electronics and energetics vol. 27, no 1, march 2014, pp. 1-11
doi: 10.2298/fuee1401001c

microstructural impact on electromigration: a tcad study

hajdin ceric 1,2, roberto lacerda de orio 2, wolfhard h.
zisser 1,2, siegfried selberherr 2

1 christian doppler laboratory for reliability issues in microelectronics at the institute for microelectronics, tu wien, austria
2 institute for microelectronics, tu wien, gußhausstraße 27–29, a-1040 wien, austria

abstract. current electromigration models used for simulation and analysis of interconnect reliability lack an appropriate description of the metal microstructure and consequently have very limited predictive capability. therefore, the main objective of our work was to obtain more sophisticated electromigration models. the problem is addressed through a combination of different levels of atomistic modeling and already available continuum-level macroscopic models. a novel method for an ab initio calculation of the effective valence for electromigration is presented and its application to the analysis of em behavior is demonstrated. additionally, a simple analytical model for the early electromigration lifetime is obtained. we show that its application gives a reasonable estimate for early electromigration failures, including the effect of microstructure.

keywords: electromigration, interconnect, reliability, physical modeling, simulation

1. introduction

electromigration (em) experiments indicate that the copper interconnect lifetime decreases with every new interconnect generation. in particular, fast diffusivity paths cause a significant variation in the interconnect performance and em degradation [1]. in order to produce more reliable interconnects, the fast diffusivity paths must be addressed when introducing new designs and materials. the em lifetime depends on the variation of material properties at the microscopic and atomistic levels. microscopic properties are grain boundaries and grains with their crystal orientation [2]. atomistic properties are configurations of atoms at the grain boundaries, at the interfaces to the surrounding layers, and at the cross-section between grain boundaries and interfaces.
modern technology computer-aided design (tcad) tools, in order to meet the challenges of contemporary interconnects, must cover two major areas: physically based continuum-level modeling and first-principle/atomistic-level modeling. we present a computationally efficient ab initio method for the calculation of the effective valence for em and the atomistic em force. the results of these ab initio calculations are applied to the parameterization of a continuum-level model [7] and to the simulation of the impact of the copper microstructure on the em behavior. additionally, an application of the kinetic monte carlo method in combination with the ab initio method for em analysis is demonstrated. results of ab initio and atomistic calculations are also used for the derivation of a compact model for early em failures in copper dual-damascene m1/via structures. the model combines a complete void nucleation model with a simple mechanism of slit void growth under the via. it is demonstrated that the early em lifetime is well described by a simple analytical expression, from which its statistical distribution can be obtained. moreover, it is shown that the simulation results provide a reasonable estimate for the em lifetimes.

received december 16, 2013
corresponding author: hajdin ceric, christian doppler laboratory for reliability issues in microelectronics at the institute for microelectronics, tu wien, austria (e-mail: ceric@iue.tuwien.ac.at)

2. theoretical background

2.1. electronic density based calculation of effective valence

generally, the effective valence is a tensor field ($\bar{Z}$), which defines a linear relationship between the em force ($\vec{F}$) and an external electric field ($\vec{E}$):

$\vec{F}(\vec{R}) = e\,\bar{Z}(\vec{R})\,\vec{E}$ (1)

several methods have been proposed for the calculation of the effective valence, all of them based on the computation of electron scattering states [3].
density functional theory (dft), in connection with the augmented plane wave (apw) method [4] or the korringa-kohn-rostoker (kkr) method [5], has been established as the most powerful method for the determination of scattering states; however, it requires a demanding computational scheme. the cumbersome representation of scattering wave functions with many parameters is a heavy burden on the stability and accuracy of subsequent numerical steps. in this work we introduce a more robust and efficient method to calculate the effective valence, which relies only on the electron density $\rho(\vec{k}, \vec{r})$. the basic idea is given in the following equation for the tensor components:

$Z_{ij}(\vec{R}) = \frac{\Omega}{4\pi^3} \iiint \mathrm{d}^3k\, \delta(E_F - E(\vec{k}))\, \tau(\vec{k})\, [\vec{v}(\vec{k}) \cdot \hat{x}_i] \iiint \mathrm{d}^3r\, \rho(\vec{k}, \vec{r})\, [\nabla_{\vec{R}} V(\vec{R} - \vec{r}) \cdot \hat{x}_j]$ (2)

where $V$ is the interaction potential between an electron and the migrating atom, $\tau(\vec{k})$ is the relaxation time due to scattering by phonons, $\vec{v}(\vec{k})$ is the electron group velocity, and $\Omega$ is the volume of a unit cell. the first integration is over the k-space and the second over the volume of the crystal. for the calculation of the electron density the dft tool vasp [6] is used. an example of a vasp calculation is given in fig. 1.

the electron density alone provides a qualitative explanation for the fact that the effective valence is higher in the bulk than in the grain boundaries. similar analyses can be performed for atomic structures of different copper/insulator interfaces. higher electron densities lead to higher effective valences, as can be seen from (2) [7]. for an accurate electron density calculation it is necessary to know the exact positions of the atoms in the structure.

fig. 1 portion of the bulk copper crystal. the electron density is represented in two orthogonal planes. it varies from higher values (circular regions around atoms) closer to the atomic nucleus to lower values in the inter-atomic space

2.2.
kinetic monte carlo simulation of electromigration

to use the results of quantum mechanical calculations in kinetic monte carlo simulations, an average driving force along the diffusion jump path must be calculated. in general, the microscopic force field depends on the position of the defect along the diffusion jump path. the average of the microscopic force over the j-th diffusion jump path between locations $\vec{r}_j$ and $\vec{r}_{j+1}$ [3] is

$\bar{\vec{F}}_j = \frac{1}{|\vec{r}_{j+1} - \vec{r}_j|} \int_{\vec{r}_j}^{\vec{r}_{j+1}} \vec{F}(\vec{r})\, \mathrm{d}r$ (3)

the change in diffusion barrier height is equal to the net work done by the microscopic force as the defect is moved from the initial to the final site over the entire jump path. the rates of defect jumps were calculated using the harmonic approximation to transition state theory (tst) [9]. in this approximation the transition rate is given by

$\Gamma = \nu_0 \exp\left(-\frac{E_m}{k_B T}\right)$ (4)

where $E_m$ is the migration energy (barrier), defined as the difference in energy between the transition state and the initial state, and $\nu_0$ is an attempt frequency [10]. for each defect site α the residence time is calculated as [11]

$\tau_\alpha = \frac{1}{\sum_{j=1}^{n_\alpha} \Gamma_{\alpha j}}$ (5)

where $n_\alpha$ is the number of possible jump sites from the site α. a single point defect is created at an arbitrary site, the clock is set to zero, and the defect is released to walk through the system. at each step, the jump direction is decided by a random number according to the local jump probabilities

$p_{\alpha j} = \Gamma_{\alpha j}\, \tau_\alpha$ (6)

the jump is implemented by updating the coordinates of the defect. by repeating the described random walk procedure for millions of defects, the dependence of their concentration on the effective valence tensor and the external field is calculated.

2.3. compact model for lifetime estimation

in order to calculate the mechanical stress in a three-dimensional copper dual-damascene interconnect structure, a complex physically based model including the em equation, the electro-thermal equation, and the mechanical equations has to be solved [7]. korhonen et al.
[14] proposed a simple one-dimensional model, where the solution for the stress at the cathode of a semi-infinite line is given by

$\sigma(t) = \frac{2 e Z^* \rho j}{\Omega} \sqrt{\frac{D_a B \Omega}{\pi k T}}\, \sqrt{t} = a \sqrt{t}$ (7)

where $D_a$ is the effective atomic diffusivity and $B$ is the effective modulus, which depends on the metal and the surrounding materials. the remaining symbols have their usual meaning: $e$ is the elementary charge, $Z^*$ the effective valence, $\rho$ the resistivity, $j$ the current density, $\Omega$ the atomic volume, $k$ boltzmann's constant and $T$ the temperature. void formation occurs as soon as the mechanical stress reaches a critical magnitude at a site of weak adhesion, typically at the copper/capping layer interface [15], [16]. thus, the void nucleation time is determined by the condition $\sigma(t_n) = \sigma_c$, which applied to (7) yields

$t_n = \left(\frac{\sigma_c}{a}\right)^2$ (8)

where $\sigma_c$ is the critical stress. the solution given by (8) is a good approximation to the more complete solution obtained by solving a full physical model [7], [13] numerically, as will be shown later. it should be pointed out that (8) is valid as long as the stress remains significantly smaller than the stress magnitude at the steady-state condition, which holds true for the void formation phase.

fig. 2 early failure mode: slit void growth under the via

2.4. void growth

for a copper dual-damascene m1/via structure with downstream electron flow, em failure analyses [11] indicate that the early failures are caused by slit voids located under the via, as shown in fig. 2. since the void is very thin and does not grow through the line height, void growth can be described by a one-dimensional process, so that the void length is given by

$l_v = v_d\, t$ (9)

where $v_d$ is the drift velocity of the right edge of the void. the atomic flux into the right edge of the void is governed by the diffusivity of the copper/barrier layer interface $D_i$, while the outgoing flux is governed by the surface diffusivity $D_s$. since $D_s \gg D_i$, using the nernst-einstein equation one can write [17]

$v_d = \frac{D_s}{k T}\, e Z^* \rho j$ (10)

the em failure occurs when the void spans the via size, $l_{via}$, so that the void growth time contribution to the em lifetime is given by

$t_g = \frac{l_{via}}{v_d} = \frac{l_{via}\, k T}{D_s\, e Z^* \rho j}$ (11)

3.
results and discussion

the ab initio method described above is applied to the calculation of the effective valence inside grain boundaries, and the calculated value is used to parameterize our continuum-level model [7]. prior to carrying out the ab initio calculation it is necessary to construct grain boundaries with exact positions of atoms. for this purpose an in-house molecular dynamics (md) simulator with a many-atom interatomic potential based on effective-medium theory [8] is used. the total energy of the system is expressed as

$E_{tot} = \sum_{i=1}^{N} F(n_i) + \frac{1}{2} \sum_{i=1}^{N} \sum_{j \neq i} V(r_{ij})$ (12)

for an n-atom system, where $V(r_{ij})$ describes a pair potential and $F(n_i)$ describes the energy due to the electron density. an example of the construction of grain boundaries by means of md simulation is presented in fig. 3.

fig. 3 formation of grain boundaries (circled regions)

ab initio calculations of the effective valence in copper grain boundaries have provided a value 75% lower than in the bulk for 4.3 ev fermi energy (cf. fig. 4), which is in good agreement with the results of sorbello [3]. along with the determination of the effective valence, ab initio calculations predict a lowering of the energy barrier for atomistic transport. knowing the influence of the em force on the diffusion barrier, we use kinetic monte carlo [9] simulations for em, which provide a closer look at the distribution of atoms in the presence of em for a specific atomistic configuration. the dependence of the atomic concentration on the angle between the em force and the jump direction is displayed in fig. 5. the em intensity clearly decreases from θ = 0°, where the em force acts in the direction of the fast diffusivity path, to a minimum at θ = 90°, where the em force is orthogonal to this direction.
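the kinetic monte carlo step of section 2.2 can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the simulator used for the results above: the rate follows the harmonic tst form ν0·exp(−E/(kB·T)), the barrier is lowered by the work F·d·cos θ of the em force over the whole jump, and the residence time is taken as the inverse of the summed rates. all function names and numerical values (0.8 ev barrier, 0.25 nm jump length, 0.01 ev/nm force, 573 k) are hypothetical.

```python
import math
import random

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def em_modified_barrier(e_m_ev, force_ev_per_nm, jump_length_nm, angle_deg):
    """Migration barrier lowered by the work of the EM force along the jump.

    Assumption: the work is F * d * cos(theta) taken over the whole jump path.
    """
    work_ev = force_ev_per_nm * jump_length_nm * math.cos(math.radians(angle_deg))
    return e_m_ev - work_ev


def jump_rates(barriers_ev, temperature_k, attempt_freq_hz=1.0e13):
    """Harmonic TST rates, nu0 * exp(-E / (kB * T)), one per candidate jump."""
    kt_ev = K_B_EV * temperature_k
    return [attempt_freq_hz * math.exp(-e / kt_ev) for e in barriers_ev]


def residence_time(rates):
    """Residence time at a site: inverse of the summed jump rates."""
    return 1.0 / sum(rates)


def jump_probabilities(rates):
    """Probability of each jump: its rate multiplied by the residence time."""
    tau = residence_time(rates)
    return [r * tau for r in rates]


def kmc_step(barriers_ev, temperature_k, rng):
    """Pick one jump at random; return (jump index, elapsed residence time)."""
    rates = jump_rates(barriers_ev, temperature_k)
    index = rng.choices(range(len(rates)), weights=rates, k=1)[0]
    return index, residence_time(rates)


if __name__ == "__main__":
    # hypothetical values: 0.8 eV barrier, 0.01 eV/nm force, 0.25 nm jump, 573 K
    angles = [0.0, 30.0, 60.0, 90.0]
    barriers = [em_modified_barrier(0.8, 0.01, 0.25, a) for a in angles]
    probs = jump_probabilities(jump_rates(barriers, 573.0))
    for angle, p in zip(angles, probs):
        print(f"theta = {angle:4.0f} deg  p = {p:.3f}")
    print(kmc_step(barriers, 573.0, random.Random(0)))
```

in this sketch the jump aligned with the force (θ = 0°) has the lowest barrier and therefore the highest probability, while the orthogonal jump (θ = 90°) is unaffected by the force, which mirrors the qualitative angle dependence described in the text.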
ab initio calculations serve as the basis for a proper consideration of fast diffusivity paths and microstructure in the comprehensive physically based model [7]. the solution of such a model is rather complex, and a detailed description of the numerical approach can be found in [13].

fig. 4 average distribution of the effective valence near a grain boundary. the external electric field is oriented parallel to the grain boundary

fig. 5 concentration difference at four different angles (θ) between the em force and the atom migration paths

fig. 6 shows the mechanical stress close to the via at the cathode end of a simulated line. a high stress develops adjacent to the via, where there is a line of intersection between the copper, the capping layer, and the barrier layer. for a copper dual-damascene m1/via structure with downstream electron flow, this is the typical site for void formation and growth leading to early em failures. since em failure has a statistical character, several lines with different microstructures were simulated in order to obtain a distribution of void nucleation times. in particular, the mechanical stress under the via was monitored for a total of twenty lines; the resulting stress build-up for five different structures is shown in fig. 7. we have observed that the time evolution of the stress curves can be divided into two main parts. in the first part the stress increases linearly with time, while in the second part it increases with the square root of time, as shown in fig. 8 for a typical stress curve. it should be pointed out that kirchheim [18] derived a linear stress increase from a one-dimensional version of a full physical model [7] under the condition that the stress is sufficiently low. in turn, korhonen et al. [14] obtained a square root stress increase, as given by (7), from the solution of a simplified model for em stress build-up.
thus, the stress build-up obtained from our numerical simulations with a rather complete model and for fully three-dimensional structures can be conveniently described by simple analytical solutions. since void nucleation is expected to occur at high stress magnitudes, the second part of the stress curve shown in fig. 8 is fitted by the square root model given in (7), where a is used as the fitting parameter. by fitting the stress curves of all simulated structures, the distribution of the parameter a is determined, as shown in fig. 9. the parameter is well described by lognormal statistics, where the mean and the standard deviation are … mpa/s^{1/2} and …, respectively. once a is known, the void formation time is obtained from (8). since the distribution of a is also determined, we are able to obtain the statistical distribution of the void formation times, shown in fig. 10. due to the lognormal statistics of a, the void formation time also follows a lognormal distribution, where the mean and standard deviation are … h and …. it should be pointed out that filippi et al. [12] estimated a nucleation time of approximately 5 h, which lies within the range predicted by the simulations.

fig. 6 hydrostatic stress distribution (in mpa). high stress develops at the copper/capping/barrier layer intersection adjacent to the via

the void growth time is determined by (11), which is a function of the surface diffusivity. choi et al. [17] obtained an activation energy for surface diffusivity of … ev on clean copper surfaces. it is expected that their measurement delivers a more precise copper surface diffusivity than the typical ones obtained on oxidized surfaces [17] and, therefore, we have used their estimate in our simulations. furthermore, we have assumed that the activation energy follows a normal distribution [19]. as a consequence, both the surface diffusivity and the void growth time are lognormally distributed.
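the step from a normally distributed activation energy to a lognormally distributed diffusivity can be checked numerically. the sketch below assumes the usual arrhenius form D = D0·exp(−Ea/(kB·T)), which is not written out in this section: ln D is then linear in Ea, so a normal Ea gives a normal ln D with standard deviation σ_E/(kB·T). the mean and spread of Ea, the temperature and D0 are hypothetical placeholder values, not the values used in the simulations.

```python
import math
import random
import statistics

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def sample_log_diffusivity(mean_ea, sd_ea, temperature, d0=1.0, n=20000, seed=0):
    """Draw Ea ~ Normal(mean_ea, sd_ea) and return ln D for D = d0 * exp(-Ea / (kB * T))."""
    rng = random.Random(seed)
    kt = K_B * temperature
    return [math.log(d0) - rng.gauss(mean_ea, sd_ea) / kt for _ in range(n)]

# Placeholder inputs (assumptions for illustration only).
MEAN_EA = 0.5         # eV
SD_EA = 0.02          # eV
TEMPERATURE = 573.0   # K

log_d = sample_log_diffusivity(MEAN_EA, SD_EA, TEMPERATURE)
expected_mean = -MEAN_EA / (K_B * TEMPERATURE)
expected_sd = SD_EA / (K_B * TEMPERATURE)

print("ln D mean:", statistics.mean(log_d), "expected:", expected_mean)
print("ln D std :", statistics.stdev(log_d), "expected:", expected_sd)
```

if the void growth time is taken to be inversely proportional to the diffusivity (again an assumption of this sketch), its logarithm is simply −ln D plus a constant, so it inherits the same lognormal spread.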
the mean and the standard deviation of the void growth time distribution are … h and …, respectively. the void formation and the void growth times are of about the same order of magnitude, as shown in fig. 10, which highlights the importance of considering both contributions for the early em lifetime estimation under accelerated test conditions.

fig. 7 stress build-up at the copper/capping/barrier layer intersection for lines with different microstructures

fig. 8 fitting of a numerical solution using a linear and a square root model

as the void nucleation and the void growth times are known, the early em lifetime is given by the combination of (8) and (11), which yields expression (13). the distributions of the em lifetimes are shown in fig. 10, together with the experimental results obtained from filippi et al. [12]. the lognormal mean and standard deviation of the simulated lifetimes are … h and …, respectively. the simulation results provide a reasonable description of the early em lifetimes. a major advantage of (13) is that it is a simple analytical formula which is more rigorously related to the physical mechanisms active during the early em failure development than black's equation. a critical issue arises, however, with regard to the estimation of the parameter a. this parameter is affected by several factors, such as diffusion coefficients, effective valence, mechanical moduli, and microstructure, so that it cannot be defined in closed form in full physical modeling [7], [13]. nevertheless, we have observed that it can be related to korhonen's solution. in this way, it can be directly described by an analytical expression and connected to physical parameters according to (7).

fig. 9 distribution of the square root model fitting parameter. the line represents a lognormal fit

the relative difference between the simulated and experimental lifetimes for the same failure percentile varies between 15% and 20%, as shown in fig. 11. the difference is smaller for shorter lifetimes, since the proposed slit void growth model is more accurate for very early failures, where the void volumes are smaller. such an error magnitude is reasonable, given the required assumptions for the parameters and considering the simplicity of the model.

fig. 10 early em lifetime distribution

fig. 11 error between the simulation and the experimental results

4. conclusion

our work demonstrates a novel approach for the calculation of the em force on an atomistic level and its application to continuum-level modeling. the consideration of the accurate effective valence in grain boundaries allows a realistic simulation of em behavior. the presented combination of atomistic force calculations with a kinetic monte carlo simulation enables sophisticated analyses of vacancy dynamics. a compact model for estimation of the early em lifetimes in m1/via structures of copper dual-damascene interconnects was developed. the model was derived through the combination of a complete model for void nucleation together with a simple slit void growth mechanism under the via. given the simplifications and assumptions made for the simulations, a reasonable approximation to experimental early em failures has been obtained.

acknowledgment: this work was partly supported by the austrian science fund fwf, project p23296-n13.

references
[1] z.-s. choi, r. mönig, and c. v. thompson, "dependence of the electromigration flux on the crystallographic orientations of different grains in polycrystalline copper interconnects," appl. phys. lett., vol. 90, p. 241913, 2007.
[2] e. zschech and p. r.
besser, "microstructure characterization of metal interconnects and barrier layers: status and future," proc. interconnect technol. conf., pp. 233-330, 2000.
[3] r. s. sorbello, "microscopic driving forces for electromigration," in materials reliability issues in microelectronics, edited by j. r. lloyd, f. g. yost, and p. s. ho, vol. 225, pp. 3-10, 1996.
[4] r. p. gupta, "theory of electromigration in noble and transition metals," phys. rev. b, vol. 25, pp. 5118-5196, 1982.
[5] d. n. bly and p. j. rous, "theoretical study of the electromigration wind force for adatom migration at metal surfaces," phys. rev. b, vol. 53, pp. 13909, 2006.
[6] g. kresse and j. furthmüller, "efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set," phys. rev. b, vol. 54, pp. 11169, 1996.
[7] h. ceric, r. l. de orio, j. cervenka, and s. selberherr, "a comprehensive tcad approach for assessing electromigration reliability of modern interconnects," ieee trans. dev. mat. rel., vol. 9, pp. 9, 2009.
[8] k. w. jacobsen, j. k. norskov, and m. j. puska, "interatomic interactions in the effective-medium theory," phys. rev. b, vol. 35, pp. 7423, 1987.
[9] r. sorensen, y. mishin, and a. f. voter, "diffusion mechanisms in cu grain boundaries," phys. rev. b, vol. 62, pp. 3658, 2000.
[10] m. gall, c. capasso, d. jawarani, r. hernandez, h. kawasaki, and p. s. ho, "statistical analysis of early failures in electromigration," j. appl. phys., vol. 90, no. 2, pp. 732-740, 2001.
[11] a. s. oates and m. h. lin, "electromigration failure distribution of cu/low-k dual-damascene vias: impact of the critical current density and a new reliability extrapolation methodology," ieee trans. device mater. rel., vol. 9, no. 2, pp. 244-254, 2009.
[12] r. g. filippi, p.-c. wang, a. brendler, p. s. mclaughlin, j. poulin, b. redder, and j. r. lloyd, "the effect of a threshold failure time and bimodal behavior on the electromigration lifetime of copper interconnects," proc. intl. reliability physics symp., pp. 444-451, 2009.
[13] r. l. de orio, dissertation, technische universität wien, 2010. [online]. available: http://www.iue.tuwien.ac.at/phd/orio/
[14] m. a. korhonen, p. borgesen, k. n. tu, and c.-y. li, "stress evolution due to electromigration in confined metal lines," j. appl. phys., vol. 73, no. 8, pp. 3790-3799, 1993.
[15] r. j. gleixner, b. m. clemens, and w. d. nix, "void nucleation in passivated interconnect lines: effects of site geometries, interfaces, and interface flaws," j. mater. res., vol. 12, pp. 2081-2090, 1997.
[16] m. w. lane, e. g. liniger, and j. r. lloyd, "relationship between interfacial adhesion and electromigration in cu metallization," j. appl. phys., vol. 93, no. 3, pp. 1417-1421, 2003.
[17] z. s. choi, r. mönig, and c. v. thompson, "activation energy and prefactor for surface electromigration and void drift in cu interconnects," j. appl. phys., vol. 102, p. 083509, 2007.
[18] r. kirchheim, "stress and electromigration in al-lines of integrated circuits," acta metall. mater., vol. 40, no. 2, pp. 309-323, 1992.
[19] l. doyen, x. federspiel, l. arnaud, f. terrier, y. wouters, and v. girault, "electromigration multistress pattern technique for copper drift velocity and black's parameters extraction," proc. intl. integrated reliability workshop, pp. 74-78, 2007.

facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 599-609 doi: 10.2298/fuee1704599s

performance of macro diversity wireless communication system operating in weibull multipath fading environment

suad n. suljović 1, dejan milić 1, zorica nikolić 1, stefan r. panić 2, mihajlo stefanović 1, đoko banđur 3
1 university of niš, faculty of electronic engineering, niš, serbia
2 faculty of natural science and mathematics, university of priština, kosovska mitrovica, serbia
3 faculty of technical sciences, university of priština, kosovska mitrovica, serbia

abstract.
in this paper, we consider a wireless mobile radio communication system with macro diversity reception. the signal is subject to weibull small scale fading and gamma large scale fading, resulting in system performance degradation. the receiver uses the macro diversity selection combining (sc) technique in order to reduce the impact of long term fading effects, and two micro diversity sc branches are used to mitigate weibull short term fading effects on system performance. probability density function (pdf) and cumulative distribution function (cdf), as well as level crossing rate (lcr) and average fade duration (afd) of the sc receiver output signal envelope are evaluated. the obtained expressions converge rapidly for all considered values of weibull fading parameter and gamma shadowing severity parameter. mathematical results are studied in order to analyze the influence of weibull fading parameter and gamma shadowing severity parameter on statistical properties of the sc receiver output signal.

key words: weibull short term fading, probability density function, cumulative distribution function, level crossing rate, average fade duration.

received december 3, 2016; received in revised form may 5, 2017
corresponding author: suad n. suljović, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: suadsara@gmail.com)

1. introduction

long term fading and short term fading degrade outage probability and limit channel capacity of wireless communication systems in general, and different techniques can be used to lessen the impact of the fading effects. one of the strategies for mitigating both long term fading (shadowing) and short term fading is the use of macro diversity combining reception. in general, a macro diversity receiver features two or more micro diversity combiners, and it then combines their outputs in order to avoid the possibility of deep fades. such a system simultaneously reduces the influence of long term and short term fading effects on system performance.

there are a number of statistical distributions that can be used to describe small scale signal envelope variation in multipath fading channels, depending on propagation environment and communication scenario. rayleigh and nakagami-m distributions can be used to describe signal envelope in small scale non line-of-sight multipath fading environments, while rician distribution can model signal envelope in line-of-sight multipath fading environments. signal envelope variation in nonlinear multipath fading environments can also be well described by using weibull model [1]. samples of a weibull random process can easily be obtained by taking the samples of a rayleigh random process and raising them to a power. weibull distribution therefore has a parameter related to nonlinearity of environment. when this weibull parameter tends to infinity, weibull multipath fading channel becomes a channel without fading effects. when weibull parameter goes to two, weibull channel reduces to rayleigh channel, and when weibull parameter goes to one, weibull channel becomes exponential fading channel.

first order performance measures of a communication system include outage probability, bit error probability and channel capacity. these performance measures can be calculated by using probability density function of receiver output signal. second order performance measures of a wireless communication system usually encompass average level crossing rate and average fade duration. these performance measures can be evaluated by using joint probability density function of the receiver output signal and the first derivative of output signal. log-normal distribution and gamma distribution can be used to describe variations of signal average power in shadowed channels.
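the statement that weibull samples follow from rayleigh samples raised to a power can be illustrated with a short monte carlo sketch. the exponent 2/α used here is an assumption, chosen so that α = 2 returns the rayleigh envelope itself and α = 1 gives an exponentially distributed variable, as described above; the reference cdf 1 − exp(−x^α/Ω) is the standard weibull form under that convention. function names, sample size and seed are illustrative choices.

```python
import math
import random

def weibull_from_rayleigh(alpha, omega, n, seed=0):
    """Generate n Weibull envelope samples as x = y**(2/alpha), y Rayleigh with E[y^2] = omega."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Inverse-CDF sampling of a Rayleigh envelope: F_y(y) = 1 - exp(-y^2 / omega).
        u = rng.random()                      # u in [0, 1), so 1 - u is in (0, 1]
        y = math.sqrt(-omega * math.log(1.0 - u))
        samples.append(y ** (2.0 / alpha))
    return samples

def weibull_cdf(x, alpha, omega):
    """Reference Weibull CDF under the assumed parameterization."""
    return 1.0 - math.exp(-(x ** alpha) / omega)

def empirical_cdf(samples, x):
    """Fraction of samples not exceeding x."""
    return sum(1 for s in samples if s <= x) / len(samples)

samples = weibull_from_rayleigh(alpha=3.0, omega=1.0, n=20000)
for level in (0.5, 1.0, 1.5):
    print(level, empirical_cdf(samples, level), weibull_cdf(level, 3.0, 1.0))
```

the empirical and reference values agree to within sampling error, which is the sense in which the weibull parameter measures how strongly the rayleigh envelope has been nonlinearly transformed.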
when log-normal model is used to describe long term fading, the expressions for probability density function and cumulative distribution function of the receiver output signal cannot be evaluated in closed form. application of gamma distribution enables tractable calculation of system performance of the wireless communication system in shadowing environment [2].

there are a number of papers in open technical literature considering outage probability, bit error probability and average level crossing rate of macro diversity system with two or more micro diversity receivers operating over shadowed multipath fading channels. in [3], [4], [5] macro diversity system with two micro diversity branches operating over gamma shadowed nakagami-m multipath fading channels is considered. communication channel is described by the use of compound model [6]. system performance of macro diversity system in the presence of log-normal shadowing and rayleigh multipath fading is presented in [7]. average level crossing rate and average fade duration of macro diversity system operating over gamma shadowed multipath fading channel are evaluated in [8], where macro diversity reception in cellular system is considered and its outage probability is calculated.

in this paper, we analyze macro diversity selection combining receiver, with two micro diversity sc branches, operating over gamma shadowed weibull multipath fading channel. macro diversity sc receiver serves to reduce the considered gamma shadowing effects, and micro diversity sc branches mitigate weibull multipath fading effects on system performance. analytical expressions can be obtained for calculation of important performance parameters such as outage probability and bit error probability. to the best of the authors' knowledge, system performance of macro diversity system in weibull fading channel has not been reported in technical literature.

2. weibull random variable

probability density function of weibull random variable is [9]:

\[ p_x(x) = \frac{\alpha}{\Omega}\, x^{\alpha-1} e^{-x^{\alpha}/\Omega}, \quad x \ge 0 \tag{1} \]

where α is weibull fading parameter and Ω is average power of x. cumulative distribution function of weibull random variable is [10,11]:

\[ F_x(x) = \int_0^x p_x(t)\,dt = 1 - e^{-x^{\alpha}/\Omega} \tag{2} \]

weibull random variable x, and its first derivative $\dot{x}$, are:

\[ x = y^{2/\alpha}, \quad \dot{x} = \frac{2}{\alpha}\, y^{2/\alpha-1}\dot{y}, \quad y = x^{\alpha/2}, \quad \dot{y} = \frac{\alpha}{2}\, x^{\alpha/2-1}\dot{x} \tag{3} \]

where y is rayleigh random variable. joint probability density function (jpdf) of x and $\dot{x}$ is [12]:

\[ p_{x\dot{x}}(x,\dot{x}) = |J|\; p_{y\dot{y}}\!\left(x^{\alpha/2},\; \frac{\alpha}{2}\, x^{\alpha/2-1}\dot{x}\right) \tag{4} \]

where the jacobian of the coordinate transform is:

\[ J = \begin{vmatrix} \dfrac{\partial y}{\partial x} & \dfrac{\partial y}{\partial \dot{x}} \\[2mm] \dfrac{\partial \dot{y}}{\partial x} & \dfrac{\partial \dot{y}}{\partial \dot{x}} \end{vmatrix} = \begin{vmatrix} \dfrac{\alpha}{2}\, x^{\alpha/2-1} & 0 \\[2mm] \dfrac{\partial \dot{y}}{\partial x} & \dfrac{\alpha}{2}\, x^{\alpha/2-1} \end{vmatrix} = \frac{\alpha^2}{4}\, x^{\alpha-2} \tag{5} \]

joint probability density function of rayleigh random variable y and its first derivative $\dot{y}$ is [12]:

\[ p_{y\dot{y}}(y,\dot{y}) = p_y(y)\, p_{\dot{y}}(\dot{y}) = \frac{2y}{\Omega}\, e^{-y^2/\Omega} \cdot \frac{1}{\sqrt{2\pi}\,\beta}\, e^{-\dot{y}^2/(2\beta^2)}, \quad \beta^2 = \pi^2 f_m^2\, \Omega \tag{6} \]

where $f_m$ is maximal doppler frequency, and $\dot{y}$ is a zero-mean gaussian random variable [13, 14] with variance $\beta^2$. after substituting (5) and (6) into (4), the expression for jpdf of weibull random variable and its first derivative becomes:

\[ p_{x\dot{x}}(x,\dot{x}) = \frac{\alpha^2\, x^{3\alpha/2-2}}{2\sqrt{2\pi}\,\beta\,\Omega}\; e^{-x^{\alpha}/\Omega}\; e^{-\alpha^2 x^{\alpha-2}\dot{x}^2/(8\beta^2)} \tag{7} \]

the average level crossing rate of weibull random process is [15]:

\[ N_x(x) = \int_0^{\infty} \dot{x}\, p_{x\dot{x}}(x,\dot{x})\, d\dot{x} = \sqrt{2\pi}\, f_m\, \frac{x^{\alpha/2}}{\Omega^{1/2}}\, e^{-x^{\alpha}/\Omega} \tag{8} \]

the selection combining diversity receiver with inputs operating over identical, independent weibull multipath fading channels is considered next. signal envelopes at inputs of a sc receiver are denoted with $x_1$ and $x_2$, and the sc receiver output signal envelope is denoted with x.
pdf of sc receiver output signal envelope is [16]:

\[ p_x(x) = p_{x_1}(x)\,F_{x_2}(x) + p_{x_2}(x)\,F_{x_1}(x) = 2\,p_{x_1}(x)\,F_{x_2}(x) = \frac{2\alpha}{\Omega}\, x^{\alpha-1} e^{-x^{\alpha}/\Omega}\left(1 - e^{-x^{\alpha}/\Omega}\right) \tag{9} \]

cumulative distribution function of sc receiver output signal envelope is [17]:

\[ F_x(x) = F_{x_1}(x)\,F_{x_2}(x) = \left(1 - e^{-x^{\alpha}/\Omega}\right)^2 \tag{10} \]

the jpdf of sc receiver output signal and its first derivative is [17]:

\[ p_{x\dot{x}}(x,\dot{x}) = 2\,p_{x_1\dot{x}_1}(x,\dot{x})\,F_{x_2}(x) = \frac{\alpha^2\, x^{3\alpha/2-2}}{\sqrt{2\pi}\,\beta\,\Omega}\; e^{-x^{\alpha}/\Omega}\; e^{-\alpha^2 x^{\alpha-2}\dot{x}^2/(8\beta^2)} \left(1 - e^{-x^{\alpha}/\Omega}\right) \tag{11} \]

using the previous expression (11), level crossing rate of the process x is [17]:

\[ N_x(x) = \int_0^{\infty} \dot{x}\, p_{x\dot{x}}(x,\dot{x})\, d\dot{x} = 2\,F_{x_2}(x)\,N_{x_1}(x) = 2\sqrt{2\pi}\, f_m\, \frac{x^{\alpha/2}}{\Omega^{1/2}}\, e^{-x^{\alpha}/\Omega}\left(1 - e^{-x^{\alpha}/\Omega}\right) \tag{12} \]

this expression can be used for calculation of average level crossing rate of wireless communication system with sc receiver operating over weibull multipath fading channel.

3. macro diversity system with two micro diversity branches

macro diversity system with two micro diversity sc branches is considered next. received signal experiences correlated gamma long term fading and weibull short term fading, resulting in signal envelope and average power variation. model of the system considered in this paper is shown in figure 1. signal envelopes at inputs of the first micro diversity sc combiner are denoted with $x_{11}$ and $x_{12}$, and at inputs of the second micro diversity sc combiner with $x_{21}$ and $x_{22}$. signal envelopes at the outputs of micro diversity sc combiners are denoted with $x_1$ and $x_2$, and ultimately, at the output of macro diversity sc combiner with x.

fig.
1 model of a macro diversity system, featuring two front-end micro diversity combiners

average signal powers at the inputs of micro diversity sc combiners are denoted with $\Omega_1$ and $\Omega_2$, and they follow correlated gamma distribution [18]:

\[ p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = \frac{(\Omega_1\Omega_2)^{(c-1)/2}}{\Gamma(c)\,(1-\rho)\,\rho^{(c-1)/2}\,\Omega_0^{c+1}}\; e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho)}}\; I_{c-1}\!\left(\frac{2\sqrt{\rho\,\Omega_1\Omega_2}}{\Omega_0(1-\rho)}\right) = \frac{1}{\Gamma(c)} \sum_{i_1=0}^{\infty} \frac{\rho^{i_1}\,(\Omega_1\Omega_2)^{i_1+c-1}}{i_1!\,\Gamma(i_1+c)\,(1-\rho)^{2i_1+c}\,\Omega_0^{2i_1+2c}}\; e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho)}} \tag{13} \]

where Γ(a) denotes the well-known gamma function [19, eq. (8.310)], ρ is correlation coefficient, $\Omega_0$ is the scaling factor proportional to mean value of $\Omega_1$ and $\Omega_2$, c ≥ 1/2 is gamma shadowing parameter, and $I_n(\cdot)$ is the n-th order modified bessel function of the first kind [19, eq. (8.406)]. macro diversity sc receiver selects the branch with the highest signal power. therefore, using the expression (9), probability density function of x can be written as [16]:

\[ p_x(x) = \int_0^{\infty}\! d\Omega_1 \int_0^{\Omega_1}\! d\Omega_2\; p_{x_1}(x|\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty}\! d\Omega_2 \int_0^{\Omega_2}\! d\Omega_1\; p_{x_2}(x|\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2\int_0^{\infty}\! d\Omega_1 \int_0^{\Omega_1}\! d\Omega_2\; p_{x_1}(x|\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \]

\[ = \frac{4\alpha\, x^{\alpha-1}}{\Gamma(c)} \sum_{i_1=0}^{\infty} \frac{\rho^{i_1}}{i_1!\,\Gamma(i_1+c)\,(1-\rho)^{i_1}\,\Omega_0^{i_1+c}} \int_0^{\infty}\! d\Omega_1\; \Omega_1^{i_1+c-2}\, e^{-\frac{\Omega_1}{\Omega_0(1-\rho)}} \left(e^{-x^{\alpha}/\Omega_1} - e^{-2x^{\alpha}/\Omega_1}\right) \gamma\!\left(i_1+c,\; \frac{\Omega_1}{\Omega_0(1-\rho)}\right) \]

\[ = \frac{8\alpha\, x^{\alpha-1}}{\Gamma(c)} \sum_{i_1=0}^{\infty}\sum_{i_2=0}^{\infty} \frac{\rho^{i_1}}{i_1!\,\Gamma(i_1+c)\,(i_1+c)\,(1+i_1+c)_{i_2}\,(1-\rho)^{2i_1+i_2+c}\,\Omega_0^{2i_1+i_2+2c}} \left[ \left(\frac{x^{\alpha}\Omega_0(1-\rho)}{2}\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{2}} K_{2i_1+i_2+2c-1}\!\left(2\sqrt{\frac{2x^{\alpha}}{\Omega_0(1-\rho)}}\right) - \left(x^{\alpha}\Omega_0(1-\rho)\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{2}} K_{2i_1+i_2+2c-1}\!\left(4\sqrt{\frac{x^{\alpha}}{\Omega_0(1-\rho)}}\right) \right] \tag{14} \]

where γ(a, x) is the lower incomplete gamma function, $(a)_n$ is pochhammer symbol [19] and $K_{\nu}(\cdot)$ is the modified bessel function of the second kind of order ν [19, eq.
(8.407)]. using the expression (12), the level crossing rate of macro diversity sc receiver output signal envelope x can be written in the form [20]:

\[ N_x(x) = \int_0^{\infty}\! d\Omega_1 \int_0^{\Omega_1}\! d\Omega_2\; N_{x_1}(x|\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty}\! d\Omega_2 \int_0^{\Omega_2}\! d\Omega_1\; N_{x_2}(x|\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2\int_0^{\infty}\! d\Omega_1 \int_0^{\Omega_1}\! d\Omega_2\; N_{x_1}(x|\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \]

\[ = \frac{8\sqrt{2\pi}\, f_m\, x^{\alpha/2}}{\Gamma(c)} \sum_{i_1=0}^{\infty}\sum_{i_2=0}^{\infty} \frac{\rho^{i_1}}{i_1!\,\Gamma(i_1+c)\,(i_1+c)\,(1+i_1+c)_{i_2}\,(1-\rho)^{2i_1+i_2+c}\,\Omega_0^{2i_1+i_2+2c}} \left[ \left(\frac{x^{\alpha}\Omega_0(1-\rho)}{2}\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{4}} K_{2i_1+i_2+2c-\frac{1}{2}}\!\left(2\sqrt{\frac{2x^{\alpha}}{\Omega_0(1-\rho)}}\right) - \left(x^{\alpha}\Omega_0(1-\rho)\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{4}} K_{2i_1+i_2+2c-\frac{1}{2}}\!\left(4\sqrt{\frac{x^{\alpha}}{\Omega_0(1-\rho)}}\right) \right] \tag{15} \]

using the expression (10), cumulative distribution function of macro diversity sc receiver can be written in the form [21]:

\[ F_x(x) = \int_0^{\infty}\! d\Omega_1 \int_0^{\Omega_1}\! d\Omega_2\; F_{x_1}(x|\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty}\! d\Omega_2 \int_0^{\Omega_2}\! d\Omega_1\; F_{x_2}(x|\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2\int_0^{\infty}\! d\Omega_1 \int_0^{\Omega_1}\! d\Omega_2\; F_{x_1}(x|\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \]

\[ = \frac{2}{\Gamma(c)} \sum_{i_1=0}^{\infty}\sum_{i_2=0}^{\infty} \frac{\rho^{i_1}}{i_1!\,\Gamma(i_1+c)\,(i_1+c)\,(1+i_1+c)_{i_2}\,(1-\rho)^{2i_1+i_2+c}\,\Omega_0^{2i_1+i_2+2c}} \left[ \Gamma(2i_1+i_2+2c) \left(\frac{\Omega_0(1-\rho)}{2}\right)^{2i_1+i_2+2c} - 4\left(\frac{x^{\alpha}\Omega_0(1-\rho)}{2}\right)^{i_1+\frac{i_2}{2}+c} K_{2i_1+i_2+2c}\!\left(2\sqrt{\frac{2x^{\alpha}}{\Omega_0(1-\rho)}}\right) + 2\left(x^{\alpha}\Omega_0(1-\rho)\right)^{i_1+\frac{i_2}{2}+c} K_{2i_1+i_2+2c}\!\left(4\sqrt{\frac{x^{\alpha}}{\Omega_0(1-\rho)}}\right) \right] \tag{16} \]

using expressions (16) and (15), we can easily obtain afd.
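as a small worked illustration of how afd follows from a cdf and an lcr, the sketch below evaluates the ratio for a single micro diversity sc combiner at fixed average power Ω, using our reading of (10) and (12) as closed forms, namely F = (1 − e^{−x^α/Ω})² and N = 2√(2π)·f_m·x^{α/2}·Ω^{−1/2}·e^{−x^α/Ω}·(1 − e^{−x^α/Ω}). averaging over the correlated gamma shadowing, as in (15) and (16), is deliberately left out, so this is only the conditional building block and not the macro diversity result. function names are illustrative.

```python
import math

def sc_cdf(x, alpha, omega):
    """CDF of the dual-branch SC output for fixed omega (assumed form of eq. (10))."""
    return (1.0 - math.exp(-(x ** alpha) / omega)) ** 2

def sc_lcr(x, alpha, omega, f_m=1.0):
    """Level crossing rate of the dual-branch SC output (assumed form of eq. (12))."""
    e = math.exp(-(x ** alpha) / omega)
    return 2.0 * math.sqrt(2.0 * math.pi) * f_m * x ** (alpha / 2.0) / math.sqrt(omega) * e * (1.0 - e)

def sc_afd(x, alpha, omega, f_m=1.0):
    """Average fade duration as the ratio CDF / LCR at threshold x."""
    return sc_cdf(x, alpha, omega) / sc_lcr(x, alpha, omega, f_m)

# Normalized LCR and AFD (f_m = 1) over a few threshold levels, Rayleigh case alpha = 2.
for level_db in (-10, -5, 0, 5):
    x = 10.0 ** (level_db / 20.0)   # threshold relative to sqrt(omega) = 1
    print(level_db, sc_lcr(x, 2.0, 1.0), sc_afd(x, 2.0, 1.0))
```

the 20·log10 mapping between the db axis and the envelope threshold is an assumption of this sketch; the ratio itself does not depend on that choice.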
the afd is defined as the average time over which the signal envelope remains below the specified level after crossing that level in a downward direction, and is determined as [12,15]:

\[ T_x(x) = \frac{F_x(x)}{N_x(x)} \tag{17} \]

where the numerator $F_x(x)$ is the double series given in (16) and the denominator $N_x(x)$ is the double series given in (15), each summed over its own pair of indices.

4. numerical results

numerically obtained results are presented graphically in order to examine the influence of shadowing and fading severity on the concerned quantities. probability density function of macro diversity sc receiver output signal is given in fig. 2. it is evident that the probability density function shifts to the right due to the increase of α, while the change of correlation coefficient ρ causes only slight changes of general pdf behavior. level crossing rate values normalized by maximal doppler shift frequency $f_m$, versus sc receiver output signal, are presented in fig. 3, for several values of weibull fading parameter α, gamma shadowing severity parameter c and correlation coefficient ρ. in fig. 3, abscissa represents arbitrary crossing level, relative to scaling factor $\Omega_0$.
fig. 2 pdf of macro diversity sc receiver output signal (axes: $p_x(x)$ versus x; curves for α = 1, 1.5, 2, 2.5, 3 with ρ = 0.2, and for ρ = 0.4, 0.6, 0.8 with α = 1; $\Omega_0$ = 1, c = 1)

fig. 3 lcr for different fading severity and correlation parameter (axes: $N_x(x)/f_m$ versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with ρ = 0.2, and for ρ = 0.4, 0.6, 0.8 with α = 1; $\Omega_0$ = 1, c = 1)

fig. 4 lcr for different fading and shadowing severity (axes: $N_x(x)/f_m$ versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with c = 1, and for c = 1.5, 2, 2.5, 3 with α = 1; $\Omega_0$ = 1, ρ = 0.2)

average level crossing rate increases as the crossing level increases towards the mean signal level. close to the mean signal level, lcr achieves its maximum and then decreases again with increasing crossing level. sharpness of the peak near the maximum is closely related to weibull fading severity parameter. while the higher values of weibull parameter α correspond to less severe fading conditions, an increase of the correlation parameter slightly worsens the effectiveness of diversity reception. it is evident from fig. 3 that, for severe fading conditions, higher correlation increases the probability that the signal crosses lower threshold levels. general influence of correlation is the same for lower fading severity, but this is not shown in figures.
when correlation coefficient tends to one, the same signal is present simultaneously on both antenna ports and the system will not be able to achieve any diversity gain. fig. 4 shows lcr when shadowing severity parameter increases. this increases the mean signal level, identified by the peak lcr, and it is the consequence of normalization by the scaling factor $\Omega_0$, which we chose previously. going back to (13), we see that averaging over $\Omega_1$, mean value of $\Omega_2$ is $c\,\Omega_0$, and vice versa. this higher mean value is clearly seen as lcr curves shift to the right in fig. 4. by increasing parameter c, shadowing severity decreases, which is analogous to behavior due to weibull parameter α.

cumulative distribution function of macro diversity sc receiver output signal for different system parameters is presented in fig. 5. from the figure, we can conclude that changes of the parameter α show significant influence on the outage probability. due to an increase in the parameter α, the outage probability becomes lower, and the system is more stable at lower threshold levels. cumulative distribution clearly shows that probability of signal staying below the threshold level is lower. an increase of parameter ρ also affects the stability of the system. if ρ rises, the outage probability is greater and the system operation becomes less stable.

fig. 5 cdf for different system parameters (axes: $F_x(x)$ versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with ρ = 0.2, and for ρ = 0.4, 0.6, 0.8 with α = 1; $\Omega_0$ = 1, c = 1)

table 1 shows the convergence of the expression (16) as a function of the variable x. the table lists the number of terms that need to be included in (16) in order for the resulting expression to be accurate to 6 decimal places, for the given parameter values.
it is evident that the expression converges rapidly for the given parameters. we can conclude from table 1 that with an increase of the coefficient α the number of terms that have to be summed is slightly lower, while for greater values of correlation coefficient ρ the required number of terms increases.

table 1 number of terms that should be added in expression (16) in order to reach 6 accurate decimal positions, when parameters α and ρ change.

parameters                      x = -10 db   x = 0 db   x = 10 db
α=1,   Ω0=1, ρ=0.2, c=1              8           13          19
α=1.5, Ω0=1, ρ=0.2, c=1              6           13          19
α=2,   Ω0=1, ρ=0.2, c=1              5           13          19
α=2.5, Ω0=1, ρ=0.2, c=1              5           13          19
α=3,   Ω0=1, ρ=0.2, c=1              5           13          19
α=1,   Ω0=1, ρ=0.4, c=1              9           15          19
α=1,   Ω0=1, ρ=0.6, c=1              9           15          21
α=1,   Ω0=1, ρ=0.8, c=1             13           20          29

fig. 6 afd for different system parameters (axes: normalized afd versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with ρ = 0.2, and for ρ = 0.4, 0.6, 0.8 with α = 1; $\Omega_0$ = 1, c = 1)

fig. 6 presents normalized values of average fade duration for various system parameters. when the crossing threshold level x is below the average signal level, afd stays low, which is the regime in which the system operates normally. better performance is expected in cases where the value of weibull parameter α is higher and correlation coefficient ρ is lower, resulting in lower afd.

5. conclusion

macro diversity receiver with macro diversity sc combiner and two micro diversity sc combiners operating over gamma shadowed multipath fading environment is considered in this paper. received signal experiences combined effects of gamma long term fading and weibull short term fading resulting in system performance degradation.
when shadowing severity parameter tends to infinity the composite channel approaches a simple weibull multipath channel, and when weibull fading parameter tends to infinity the channel tends to a gamma shadowing channel. when weibull fading parameter equals two, the composite fading channel reduces to gamma shadowed rayleigh multipath channel. closed form expressions for probability density function, cumulative distribution function and average level crossing rate of macro diversity sc receiver output signal envelope are calculated. for the special case when weibull parameter is equal to two, we can easily evaluate pdf, cdf and average level crossing rate for the resulting rayleigh signal envelope. the infinite series expressions converge for any values of gamma shadowing severity parameter, weibull fading parameter, and shadowing correlation coefficient. the number of terms that need to be summed in order to achieve desired accuracy depends on gamma severity parameter, weibull fading parameter and correlation coefficient. the number of terms increases as gamma severity parameter and weibull parameter decrease, and as correlation coefficient increases. level crossing rate and average fade duration are presented graphically to show the influence of gamma severity parameter, weibull fading parameter, and correlation coefficient on the sc receiver output signal. as expected, system performance is better when the fading and shadowing severity is lower, and correlation between the diversity branches is relatively low. when the correlation of shadowing effects on the two macro branches is substantial, macro diversity system gains are minimal, and the receiver performance reduces to performance of a micro diversity receiver.

acknowledgement: the paper is supported in part by the projects iii44006 and tr32051 funded by ministry of education, science and technological development of republic of serbia.

references
[1] n.c. sagias, g.k.
karagiannidis, "gaussian class multivariate weibull distributions: theory and applications in fading channels", ieee transactions on information theory, vol. 51, no. 10, 2005. [2] p.s. bithas, "weibull-gamma composite distribution: alternative multipath/shadowing fading model", electronics letters, vol. 45, issue 14, pp. 749-751, 2009. [3] p.m. shankar, "analysis of micro diversity and dual channel macro diversity in shadowed fading channels using a compound fading model", international journal of electronics and communications (aeue), vol. 62, pp. 445-449, 2008. [4] d.b. đosić, d.m. stefanović, č.m. stefanović, "level crossing rate of macro-diversity system with two micro-diversity sc receivers over correlated gamma shadowed α–µ multipath fading channels", iete journal of research, vol. 62, iss. 2, 2016. [5] č. m. stefanović, "macro-diversity system with macro-diversity ssc receiver and two sc microdiversity receivers in the presence of composite fading environment", in proceedings of the 23rd telecommunications forum (telfor), belgrade, pp. 321-324, 2015. [6] p. m. shankar, "performance analysis of diversity combining algorithms in shadowed fading channels", wireless personal communications, vol. 37, issue 1, pp. 61-72, 2006. [7] a. adinoyi, h. yanikomeroglu, s. loyka, "hybrid macro- and generalized selection combining microdiversity in lognormal shadowed rayleigh fading channels", in proceedings of the ieee international conference on communications, vol. 1, 2004, pp. 244-248. [8] s. mukherjee, d. avidor, "effect of micro-diversity and correlated macro-diversity on outages in a cellular system", ieee transactions on wireless communications, vol. 2, no. 1, pp. 50-58, 2003. [9] a. papoulis and s. u.
pillai, "probability, random variables, and stochastic processes", 4th (fourth) ed., edition mc graw-hill, london, uk, europe, isbn-13: 978-0070486584, 2002. [10] m. stefanović, d. milović, a. mitić, m. jakovljević, "performance analysis of system with selection combining over correlated weibull fading channels in the presence of co-channel interference", aeu international journal of electronics and communications, vol. 62, issue 9, pp. 695-700, october 2008. [11] a. golubović, n. sekulović, m. stefanović, d. milić, "performance analysis of dual-branch selection diversity system using novel mathematical approach", facta universitatis, series: electronics and energetics, vol. 30, no 2, pp. 235 – 244, june 2017. [12] s. suljović, d. milić, s. panić, "lcr of sc receiver output signal over α-κ-μ multipath fading channels", facta universitatis, series: electronics and energetics, vol. 29, no, 2, pp. 261 – 268, june 2016. [13] w.c. jakes, microwave mobile communications, piscataway, nj: ieee press, 1994. [14] g. l. stüber, principles of mobile communications, boston, kluwer academic publishers, 1996. [15] f.-p. calmon, m. d. yacoub, mrcs–selecting maximal ratio combining signals: a practical hybrid diversity combining scheme, ieee trans. wireless communications, 2009. [16] b. jaksić, d. stefanović, m. stefanović, p. spalević, v. milenković, "level crossing rate of macro-diversity system in the presence of multipath fading and shadowing", radio-engineering, vol. 24, no.1, 2015. [17] a. marković, z. perić, d. đošić, m. smilić, b. jakšić, "level crossing rate of macro-diversity system over composite gamma shadowed alpha-kappa-mu multipath fading channel", facta universitatis, series: automatic control and robotics, vol. 14, no 2, pp. 99 – 109, 2015. [18] m. bandjur, n. sekulović, m. stefanović, a. golubović, p. spalević, d. 
milić, "second-order statistics of system with micro-diversity and macro-diversity reception in gamma shadowed rician fading channels", etri journal, vol. 35, no. 4, pp. 722-725, 2013. [19] i. s. gradshteyn and i. m. ryzhik, table of integrals, series and products, 6th ed. new york: academic press, 2000. [20] n. sekulović, m. stefanović, "performance analysis of system with micro- and macro-diversity reception in correlated gamma shadowed rician fading channels", wireless personal communications, vol. 65, no. 1, pp. 143–156, 2012. [21] s. panić, d. stefanović, i. petrović, m. stefanović, j. anastasov and d. krstić, "second-order statistics of selection macro-diversity system operating over gamma shadowed κ-μ fading channels", eurasip journal on wireless communications and networking, 2011.

facta universitatis
series: electronics and energetics vol. 30, no 3, september 2017, pp. 375-382
doi: 10.2298/fuee1703375d

mixed mode performance of gaas utb-mosfet with extra insulator region and undoped buried oxide region

shiva prasad das (1), ananya dastidar (2), partha sarkar (1), sushanta k. mohapatra (3)
(1) department of electronics and communication engineering, centre for advanced post graduate studies, biju patnaik university of technology, odisha, india
(2) department of instrumentation and electronics, college of engineering and technology, bhubaneswar, bput, odisha, india
(3) school of electronics engineering, kiit university, bhubaneswar, odisha, india

abstract. investigating the mixed mode performance of gaas utb-mosfets in the nanoscale regime, with a view to “beyond cmos”, is a current trend in the semiconductor industry. here it is proposed to modify conventional models by considering an extra insulator region (ir) and an undoped buried oxide region (ubr) to study the performance related to digital and analog/rf applications.
gaas is considered as the channel material. the ir-utb-soi-n-mosfet has shown promising results with respect to ss, dibl, ft and switching speed.

key words: silicon-on-insulator, utb mosfet, gaas, dibl, analog/rf performance, insulator region.

received september 17, 2016; received in revised form november 30, 2016
corresponding author: sushanta k. mohapatra, school of electronics engineering, kiit university, bhubaneswar, odisha, india (e-mail: skmctc74@gmail.com)

1. introduction

in recent years, there has been a growing demand for integrated circuits (ics) providing better analog/rf applications as well as digital functionalities [1]–[3]. fully depleted (fd) mosfets based on silicon-on-insulator (soi) technology [1], [4], [5] are widely used in mixed mode ics, as they offer a sharp sub-threshold slope, high current drive, high transconductance, reduced parasitic capacitance, and absence of latch-up, which are key parameters for digital applications [6]–[8]. due to their high transconductance to drain current (gm/id) ratio and low body factor, fd-soi-mosfets have been used to design low power circuits that operate at high and low frequency as well as at high temperature, providing better performance than conventional mosfets [9], [10]. the use of a high electron mobility material like gaas is promising, as its higher saturated electron velocity and higher electron mobility allow it to function at much higher frequencies, with less noise, and to be operated at higher power levels than silicon [11], [12]. previously it has been shown by orouji et al. [13] that soi-mosfets with an extra insulator region (ir-soi), in which an insulator region (hfo2) is introduced in the silicon active layer and drain region, provide high reliability due to low gate leakage current and low critical electric field.
the self heating effect (she), which is one of the drawbacks of fd-soi, has been reduced by a new structure, the undoped buried region mosfet (ubr-mosfet) [14]. in this paper, the analog/rf performance, along with some scaling parameters, of the ultra thin body (utb) soi n-channel mosfet (utb-soi-n-mosfet) has been examined together with the utb-soi-mosfet with extra insulator region (ir-utb-soi-n-mosfet), the utb-n-mosfet with undoped buried region under the channel (ubr-utb-soi-n-mosfet), and a new structure, the utb-soi-n-mosfet with extra insulator region and undoped buried region under the channel (ir-ubr-utb-soi-n-mosfet), with the help of the device simulator from silvaco tcad [15].

2. device structure and simulation setup

the schematic representation of the four structures, utb-soi-n-mosfet, ir-utb-soi-n-mosfet, ubr-utb-soi-n-mosfet and ir-ubr-utb-soi-n-mosfet, which were considered for the 2-d simulation, is given in fig. 1. the effective oxide thickness (eot), the gate length (lg), the gaas body thickness (tgaas), the sio2 buried oxide thickness (tbox) and the si substrate thickness (tsub) have been taken as 1.1 nm, 60 nm, 10 nm, 50 nm and 100 nm, respectively, in all four types of structures. the source extension (ls) and the drain extension (ld) have been taken as 70 nm each. the source and drain regions are highly doped with n-type donor ions at a concentration of 10^20 /cm^3 each to reduce the mobility degradation due to coulomb scattering. the silicon substrate is doped with p-type acceptor ions at a concentration of 10^18 /cm^3, and the gaas channel region is doped with p-type acceptor ions at a concentration of 10^16 /cm^3 to avoid threshold voltage variation [16]. the metal gate work function is set to 4.6 ev during simulation [17]. the structures are calibrated to meet the requirements of the international technology roadmap for semiconductors (itrs) for the 45 nm technology node [18]. the 2-d numerical device simulator atlas [15] is used for the simulation of the proposed structures.
the drain bias is fixed to vdd = 1.0 v as per itrs [19]. to study the analog/rf performance, the simulation is carried out at the drain to source voltage vds = 0.5 v (half of the supply voltage, i.e. vdd/2) [20], with the gate to source voltage (vgs) varied from 0 v to 1.0 v. the threshold voltage is obtained by the constant current method at id = 10^-6 a/µm from the id–vgs characteristic curve. in the channel region, the electron and hole shockley-read-hall [21], [22] generation and recombination lifetimes, τn and τp, are set to 1×10^-8 s each. among the material models, the lombardi mobility model [23] is used, which considers the effect of transverse electric fields along with doping and temperature dependent parameters of mobility [24]. the numerical solution used here is based on the drift-diffusion approach [25]. some other material models have also been used, such as the concentration dependent mobility (conmob), the parallel electric field dependence (fldmob), which is required for capturing the velocity saturation effect, shockley-read-hall (srh) and optical models [15]. the fermi-dirac model helps to get results close to ideal values by a rational chebyshev approximation [19].

fig. 1 schematic device structures (a) utb-soi-n-mosfet (b) ir-utb-soi-n-mosfet (c) ubr-utb-soi-n-mosfet (d) ir-ubr-utb-soi-n-mosfet (labels in the drawings: source, gate, drain, substrate, ubr; dimensions tgaas, tbox, tsub, lc, ls, ld)

table 1 structure notation

| notation used in this article | structure |
|---|---|
| a | utb-soi-n-mosfet |
| b | ir-utb-soi-n-mosfet |
| c | ubr-utb-soi-n-mosfet |
| d | ir-ubr-utb-soi-n-mosfet |

3.
result analysis

as described previously, these four types of structures were simulated using the 2-d numerical device simulator, and parameters such as the on-state drive current (ion), off-state leakage current (ioff), ion/ioff ratio, threshold voltage (vth) and power dissipation were evaluated, which are some of the factors affecting the scaling properties of the devices. the surface potential variation along the channel length was also observed. the rf/analog performance analysis was done by extracting parameters such as transconductance (gm), total capacitance (ctotal), q-factor and cut-off frequency (ft) for the four different structures. the sub-threshold slope (ss) was calculated by using the following equation [19]:

$$SS\ (\mathrm{mV/dec}) = \frac{\partial V_{gs}}{\partial (\log I_d)} \qquad (1)$$

another vital parameter responsible for the scaling effect is the drain induced barrier lowering (dibl), which was evaluated by the following equation [26]:

$$DIBL = \frac{V_{th1} - V_{th2}}{0.95} \qquad (2)$$

where vth1 and vth2 are the threshold voltages at vds = 0.05 v and vds = 1 v.

fig. 2 surface potential variation along channel for a, b, c and d at vgs = 1 v (a) at vds = 0.05 v (b) at vds = 1 v

fig. 3 ion and ioff comparison for a, b, c and d (a) at vds = 0.05 v (b) at vds = 1 v

fig. 2 shows the surface potential variation along the channel of structures a, b, c and d at vds = 0.05 v (fig. 2(a)) and at vds = 1 v (fig. 2(b)). the trade-off between ioff and ion for the different structures is shown in fig. 3, at vds = 0.05 v (fig. 3(a)) and at vds = 1 v (fig. 3(b)).
at vds = 0.05 v, structure c gives the best ion/ioff ratio, while at vds = 1 v, structure b shows a significant improvement in the ion/ioff ratio.

fig. 4 (a) static power dissipation for a, b, c, and d, (b) threshold voltage variation at vds = 0.05 v and vds = 1 v

fig. 4(a) presents the static power dissipation (pd = ioff × vdd) [27] for the four types of structures. structure b provides lower static power dissipation than the other three structures. fig. 4(b) shows the threshold voltage variation of the four structures at vds = 0.05 v and vds = 1 v. the extracted values of threshold voltage, sub-threshold slope, dibl and static power dissipation are tabulated for all device structures in table 2. fig. 5 gives the transconductance, i.e.

$$g_m = \frac{\partial I_d}{\partial V_{gs}} \qquad (3)$$

for structures a, b, c and d. fig. 5(a) and fig. 5(b) show the gm variation with id for the four structures at vds = 0.05 v and vds = 1 v, respectively.

fig. 5 trans-conductance (gm) variation with id for a, b, c and d (a) at vds = 0.05 v (b) at vds = 1 v

fig. 6 (a) total capacitance (ctotal) with id for a, b, c and d at vds = 1 v (b) cut-off frequency (ft) variation with id for a, b, c and d at vds = 1 v

fig. 6(a) gives the variation of the total capacitance (ctotal = cgd + cgs) for a, b, c and d at vds = 1 v, where cgd is the parasitic gate to drain capacitance and cgs is the parasitic gate to source capacitance. another important parameter, the cut-off frequency (ft), is plotted in fig. 6(b):

$$f_t = \frac{g_m}{2\pi\,(C_{gs} + C_{gd})} \qquad (4)$$

the q-factor (gm/ss) has been calculated for the four device structures and is given in table 3.
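the extraction steps described in this section (constant-current threshold voltage, equations (1) to (4) and the static power pd = ioff × vdd) can be sketched numerically. the python sketch below is only an illustration and not the authors' atlas post-processing: all function names are hypothetical, the sub-threshold slope is taken as the minimum slope over the sweep (an assumption), and the sweep data and the gm and capacitance values in the example are synthetic.

```python
import numpy as np

def threshold_constant_current(vgs, ids, i_crit=1e-6):
    """Vgs at which Id crosses i_crit (A/um), interpolated on log10(Id).
    Assumes Id increases monotonically with Vgs."""
    return float(np.interp(np.log10(i_crit), np.log10(ids), vgs))

def subthreshold_slope(vgs, ids):
    """Eq. (1): dVgs/d(log10 Id) in mV/dec, minimum over the sweep (assumption)."""
    slope = np.gradient(np.asarray(vgs, dtype=float), np.log10(ids))  # V/dec
    return 1e3 * float(slope.min())

def dibl(vth1, vth2, vds1=0.05, vds2=1.0):
    """Eq. (2): (Vth1 - Vth2) / (Vds2 - Vds1), returned in mV/V."""
    return 1e3 * (vth1 - vth2) / (vds2 - vds1)

def transconductance(vgs, ids):
    """Eq. (3): gm = dId/dVgs by finite differences."""
    return np.gradient(np.asarray(ids, dtype=float), np.asarray(vgs, dtype=float))

def cutoff_frequency(gm, cgs, cgd):
    """Eq. (4): ft = gm / (2*pi*(Cgs + Cgd))."""
    return gm / (2.0 * np.pi * (cgs + cgd))

def static_power(ioff, vdd=1.0):
    """Static power dissipation Pd = Ioff * Vdd."""
    return ioff * vdd

# synthetic sub-threshold sweep with an ideal 70 mV/dec slope (not simulation data)
vgs = np.linspace(0.0, 1.0, 101)
ids = 1e-12 * 10.0 ** (vgs / 0.070)

vth = threshold_constant_current(vgs, ids)             # about 0.42 V
ss = subthreshold_slope(vgs, ids)                      # about 70 mV/dec
dibl_a = dibl(0.420, 0.403)                            # rounded Vth of structure a, table 2
ft_example = cutoff_frequency(2e-3, 0.8e-15, 0.8e-15)  # illustrative gm (S/um) and C (F/um)
print(vth, ss, dibl_a, ft_example)
```

with the rounded threshold voltages of structure a from table 2 (0.420 v and 0.403 v), `dibl` returns about 17.9 mv/v, close to the tabulated 17.678 mv/v; the small difference is consistent with the thresholds being rounded to three decimals.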
table 2 performance parameters-1

| structure | vth1 (v) | vth2 (v) | ss1 (mv/dec) | ss2 (mv/dec) | dibl (mv/v) | pd (×10^-12 w) |
|---|---|---|---|---|---|---|
| a | 0.420 | 0.403 | 69.81 | 71.95 | 17.678 | 1.92 |
| b | 0.420 | 0.404 | 69.68 | 71.83 | 17.589 | 1.82 |
| c | 0.505 | 0.436 | 74.11 | 82.21 | 72.923 | 6.55 |
| d | 0.505 | 0.437 | 74.01 | 81.90 | 71.872 | 6.04 |

table 3 performance parameters-2

| structure | ion1/ioff1 (×10^-9) | ion2/ioff2 (×10^-8) | ctotal (ff/µm) | ft (×10^-11 hz) | q-factor |
|---|---|---|---|---|---|
| a | 1.686 | 3.920 | 1.639 | 2.00 | 24.21 |
| b | 0.681 | 4.132 | 1.655 | 2.03 | 9.32 |
| c | 2.095 | 1.034 | 1.629 | 2.00 | 23.07 |
| d | 0.773 | 1.120 | 1.639 | 2.03 | 7.68 |

4. conclusions

a comparative performance analysis of a new structure was presented, namely the ir-ubr-utb-soi-n-mosfet, which contains an extra insulator region (ir) at the channel source junction, an undoped buried region under the channel, and gaas as the channel material. the scaling and rf parameters of the ir-ubr-utb-soi-n-mosfet have been obtained along with those of the conventional utb-soi-n-mosfet. the analysis shows that the sub-threshold slope, dibl, and the static power dissipation are lower for the ir-utb-soi-n-mosfet than for the other three structures, and that it also provides a better ion/ioff ratio. this structural change can therefore make the device a good candidate for switching and low standby operating power applications.

references
[1] s. cristoloveanu, “silicon on insulator technologies and devices: from present to future,” solid. state. electron., vol. 45, no. 8, pp. 1403–1411, 2001.
[2] m. a. pavanello, j. a. martino, v. dessard, and d. flandre, “analog performance and application of graded-channel fully depleted soi mosfets,” solid. state. electron., vol. 44, no. 7, pp. 1219–1222, 2000.
[3] k. kim, “1.1 silicon technologies and solutions for the data-driven world,” in digest of technical papers 2015 ieee international solid-state circuits conference-(isscc), 2015, pp. 1–7.
[4] j.-t. park and j.-p. colinge, “multiple-gate soi mosfets: device design guidelines,” electron devices, ieee trans., vol. 49, no. 12, pp.
2222–2229, 2002.
[5] a. chaudhry and m. j. kumar, “investigation of the novel attributes of a fully depleted dual-material gate soi mosfet,” electron devices, ieee trans., vol. 51, no. 9, pp. 1463–1467, 2004.
[6] s. cristoloveanu and s. li, electrical characterization of silicon-on-insulator materials and devices, vol. 305. springer science & business media, 2013.
[7] b. vandana, “study of floating body effect in soi technology,” int. j. mod. eng. res., vol. 3, no. june, pp. 1817–1824, 2013.
[8] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “ztc bias point of advanced fin based device: the importance and exploration,” facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 393–405, 2015.
[9] q. xie, c.-j. lee, j. xu, c. wann, j. y.-c. sun, and y. taur, “comprehensive analysis of short-channel effects in ultrathin soi mosfets,” electron devices, ieee trans., vol. 60, no. 6, pp. 1814–1819, 2013.
[10] h.-s. wong, “beyond the conventional transistor,” ibm j. res. dev., vol. 46, no. 2.3, pp. 133–168, 2002.
[11] r. h. reuss et al., “macroelectronics: perspectives on technology and applications,” proc. ieee, vol. 93, no. 7, pp. 1239–1256, 2005.
[12] j. yoon et al., “gaas photovoltaics and optoelectronics using releasable multilayer epitaxial assemblies,” nature, vol. 465, no. 7296, pp. 329–333, 2010.
[13] a. a. orouji and m. k. anvarifard, “soi mosfet with an insulator region (ir-soi): a novel device for reliable nanoscale cmos circuits,” mater. sci. eng. b, pp. 1–7, 2013.
[14] m. rahimian and a. a. orouji, “a novel nanoscale mosfet with modified buried layer for improving of ac performance and self-heating effect,” mater. sci. semicond. process., vol. 15, no. 4, pp. 445–454, 2012.
[15] atlas user manual. silvaco international, santa clara, 2012.
[16] h. a. el hamid, j. r. guitart, and b.
iñíguez, “two-dimensional analytical threshold voltage and subthreshold swing models of undoped symmetric double-gate mosfets,” electron devices, ieee trans., vol. 54 (6), pp. 1402–1408, 2007.
[17] j. p. colinge, “multiple-gate soi mosfets,” solid state electron, vol. 48 (6), pp. 897–905, 2004.
[18] “the international technology roadmap for semiconductors,” 2011.
[19] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “temperature dependence inflection point in ultra-thin si directly on insulator (sdoi) mosfets: an influence to key performance metrics,” superlattices microstruct., vol. 78, pp. 134–143, 2015.
[20] s. chakraborty, a. mallik, and c. k. sarkar, “subthreshold performance of dual-material gate cmos devices and circuits for ultralow power analog/mixed-signal applications,” electron devices, ieee trans., vol. 55 (3), pp. 827–832, 2008.
[21] w. shockley and w. t. read, “statistics of the recombination of holes and electrons,” phys. rev., vol. 87, pp. 835–842, 1952.
[22] r. n. hall, “electron–hole recombination in germanium,” phys. rev., vol. 87, p. 387, 1952.
[23] c. lombardi, s. manzini, a. saporito, and m. vanzi, “a physically based mobility model for numerical simulation of nonplanar devices,” ieee trans. comput. des. integr. circ. syst., vol. 7 (11), pp. 1164–1171, 1988.
[24] p. k. sahu, s. k. mohapatra, and k. p. pradhan, “zero temperature-coefficient bias point over wide range of temperatures for single- and double-gate utb-soi n-mosfets with trapped charges,” mater. sci. semicond. process., vol. 31, pp. 175–183, 2015.
[25] s. selberherr, “analysis and simulation of semiconductor devices,” springer–verlag, wien–new york, 1984.
[26] g. c. patil and s. qureshi, “impact of segregation layer on scalability and analog / rf performance of nanoscale schottky barrier,” j. semicond. technol. sci., vol. 12, no. 1, pp. 66–74, 2012.
[27] k. p. pradhan, d. singh, s. k. mohapatra, and p. k.
sahu, “assessment of iii-v finfets at 20 nm node: a process variation analysis,” procedia comput. sci., vol. 57, pp. 454–459, 2015.

facta universitatis
series: electronics and energetics vol. 27, no 3, september 2014, pp. 389-398
doi: 10.2298/fuee1403389p

analysis of measurement error in direct and transformer-operated measurement systems for electric energy and maximum power measurement

slaviša puzović (1), branko koprivica (2), alenka milovanović (2), milić đekić (2)
(1) edb užice, prijepolje, serbia
(2) faculty of technical sciences čačak, university of kragujevac, serbia

abstract. an analysis of the error in measuring electric energy and maximum power within direct and half-indirect measurement systems at the voltage of 0.4kv is presented in the paper. the analysis involved all the elements of the measurement system, i.e. calibration and testing of the transformer-operated and direct digital energy meters and measuring current transformers. this equipment was also used for measurements in a transformer substation, aiming at error analysis for measurements made under real conditions. the results obtained show a significant negative measurement error introduced by the energy meters under overload conditions. in this operating mode the energy meters register lower values of both the consumed electric energy and the maximum power, which can be interpreted as a loss.

key words: measurement error, digital energy meters, measuring current transformers

1. introduction

in the early 21st century, the power system of serbia is facing numerous strategic challenges, one of the most important being the enhancement of the energy efficiency of the systems for generation, transmission and distribution of electricity. the continual increase in electricity consumption, the changed structure of consumers and the inhibited construction of resources and the network have caused long-term and excessive operation of the power system.
this has resulted in its inefficient operation and has led to substantial electricity losses. these losses may be due to a number of reasons, one of the major factors that requires analysis being the error in measuring electric energy and maximum power (maximum average fifteen-minute active power). the systems for half-indirect measurement of both electric energy and maximum power include measuring current transformers and transformer-operated measurement instruments for measuring active and reactive electric energy, as well as those measurement instruments for electric energy and maximum power measurement in direct systems.

received january 28, 2014; received in revised form march 28, 2014
corresponding author: branko koprivica, faculty of technical sciences čačak, svetog save 65, 32000 čačak, serbia (e-mail: branko.koprivica@ftn.kg.ac.rs)

the measuring instruments need to provide the required accuracy in operation. given that measuring current transformers need to meet the given accuracy class (up to 120% of the given current), the question is whether or not the measuring current transformers exceed the rated accuracy class limits when the primary current is near zero, as well as when it exceeds 120% of the rated current, and even when the overload amounts to 100% [1, 2]. similarly, the question is also raised as to the extent to which changes in the load on the secondary windings of measuring current transformers affect measurement accuracy. this primarily refers to replacing measuring instruments on the secondary windings of measuring current transformers, i.e. replacing electro-mechanical meters with less energy-consuming digital ones. precise determination of the rated power of a measuring current transformer is of utmost importance, as the accuracy class and security factor are adjusted to that power.
transformer-operated instruments for measuring active and reactive electric energy and maximum power, which within the half-indirect measuring system are connected to the secondary windings of the measuring current transformers, are dimensioned to comply with the rated secondary current of the measuring current transformers (1a or 5a). the issue of how these measuring instruments behave when the current through the measuring current transformers exceeds the specified one is raised in [3-4]. the same goes for the measuring instruments within the direct measuring system. the base current in the latter is usually 10a, with the maximum current amounting to 40a, 60a, 80a or 100a, and these instruments are often exploited in conditions where actual current values are twice as high as the maximum ones. the aim of this paper is to examine how measuring current transformers and direct and transformer-operated three-phase energy meters behave under underload and overload conditions, and to determine the resulting measurement error. recent research on the accuracy of measuring current transformers and energy meters considers mostly the impact of non-linear loads, i.e. the distortion of current and voltage, on the value of the measurement error [5-9]. the influence of current and voltage thd has been studied separately for measuring current transformers, as well as for current transformers embedded in energy meters. the analyses presented show a significant influence of thd on the phase error of both types of current transformers. this error is highly dependent on the frequency, so measuring the harmonics may be highly inaccurate. generally, a high error may be expected when the load current is nonsinusoidal. given that the literature does not provide enough information on the measurement errors under underload and overload conditions, the idea was to perform a detailed examination of this issue.
the analysis presented in this paper includes separate laboratory tests on measuring current transformers, and on direct and transformer-operated three-phase energy meters, under underload and overload conditions. furthermore, the paper presents the results obtained through measurements in a 10/0.4kv substation.

2. measuring electric power and energy and measurement errors

the measurement system for electric energy and peak power at the voltage level of 0.4kv includes the measuring current transformers and transformer-operated instruments for measuring active and reactive power, and maximum power, in the transformer-operated measurement system. the measurement system also involves instruments for measuring active and reactive power, and maximum power, within the direct measurement system. in this paper, measuring current transformers of 50a/5a, 100a/5a, 150a/5a and 400a/5a current ratios (manufacturer fmt zaječar) were tested, as well as the digital energy meters of enel belgrade, i.e. two three-phase transformer-operated energy meters and three direct three-phase energy meters.

2.1. measuring current transformer errors

the errors of measuring current transformers include the current, phase and complex error. current error, gi, results from the deviation of the actual transformation ratio from its specified value. it is determined by [1, 2]:

$$g_i = \frac{m_n I_2 - I_1}{I_1} \cdot 100\% \qquad (1)$$

where $m_n = I_{1n}/I_{2n}$ is the indicated transformation ratio, and $I_1$ and $I_2$ are the primary and secondary winding currents. phase error, δi, is defined by the angle between the secondary and primary current phasors. phase error is positive if the secondary current leads the primary current.
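the current error of equation (1) and the sign convention of the phase error can be illustrated with a short calculation. this is a minimal python sketch under stated assumptions: the function names are hypothetical, mn is taken as the nameplate ratio of the transformer, and the readings in the example are invented for illustration and are not measurement results from this paper.

```python
def current_error_percent(i_primary, i_secondary, rated_ratio):
    """Eq. (1): g_i = (m_n * I2 - I1) / I1 * 100 %."""
    return 100.0 * (rated_ratio * i_secondary - i_primary) / i_primary

def phase_error_minutes(phase_secondary_deg, phase_primary_deg):
    """Phase error in angular minutes; positive when the secondary current leads."""
    return 60.0 * (phase_secondary_deg - phase_primary_deg)

# hypothetical reading on a 100a/5a transformer (m_n = 20):
# 100 A in the primary, 4.98 A measured in the secondary
g_i = current_error_percent(i_primary=100.0, i_secondary=4.98, rated_ratio=20.0)

# hypothetical phasor angles: secondary leads the primary by 0.25 degrees
delta_i = phase_error_minutes(phase_secondary_deg=0.25, phase_primary_deg=0.0)

print(f"current error: {g_i:.2f} %, phase error: {delta_i:.1f} min")
```

a negative gi, as in this example, means that the secondary current scaled by the ratio underestimates the primary current.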
given that distortion of the secondary current is possible at increased primary current, as a result of saturation in the core, the complex error of measuring current transformers can be defined as follows:

$$p_i = \frac{100\%}{I_1} \sqrt{\frac{1}{T} \int_0^T \left( m_n i_2(t) - i_1(t) \right)^2 \mathrm{d}t} \qquad (2)$$

the accuracy class of a measuring current transformer is equal to the absolute value of the current error expressed in percentage, at the specified load on the secondary winding and 120% of the rated primary current. standard accuracy classes are 0.1, 0.2, 0.5, 1, 3 and 5. accuracy classes 1, 3 and 5 are not used for measurement of electricity consumption. fig. 1 shows the limit values of the current and phase errors of measuring current transformers of accuracy classes 0.1, 0.2, 0.5 and 1, set out in [10]. thus, for a transformer of accuracy class 1, the limit of the current error is ±1% at 120% of the rated primary current and the specified load on the secondary windings of the measuring current transformer.

fig. 1 error value range: a) current error, b) phase error (limits versus primary current from 20% to 120% of i1n for accuracy classes 0.1, 0.2, 0.5 and 1; current error gi in %, phase error δi in min)

2.2. digital energy meters errors

three-phase energy meters (direct and transformer-operated) are intended for measuring active and reactive electric energy in a three-phase voltage system of the specified frequency of 50 hz. the accuracy of digital measurement groups is set out in [11]. under the referential conditions, the percentage error should not exceed the value of the relevant accuracy class, tables 1 and 2.

table 1 percentage error limits in single-phase and three-phase direct energy meters of accuracy class 1 (ib is the base current, imax is the maximum current)

| current values | power factor | error limit in % |
|---|---|---|
| 0.05 ib ≤ i < 0.1 ib | 1 | ±1.5 |
| 0.1 ib ≤ i ≤ imax | 1 | ±1.0 |
| 0.1 ib ≤ i < 0.2 ib | 0.5 (ind.), 0.8 (cap.) | ±1.5 |
| 0.2 ib ≤ i ≤ imax | 0.5 (ind.), 0.8 (cap.) | ±1.0 |

table 2 percentage error limits in single-phase and three-phase transformer-operated energy meters of accuracy class 1 (in is the rated current, imax is the maximum current)

| current values | power factor | error limit in % |
|---|---|---|
| 0.02 in ≤ i < 0.05 in | 1 | ±1.5 |
| 0.05 in ≤ i ≤ imax | 1 | ±1.0 |
| 0.05 in ≤ i < 0.1 in | 0.5 (ind.), 0.8 (cap.) | ±1.5 |
| 0.1 in ≤ i ≤ imax | 0.5 (ind.), 0.8 (cap.) | ±1.0 |

3. measurement results

3.1. tests with measuring current transformers

the testing of measuring current transformers was performed in fmt zaječar, a measuring transformer company, on measuring current transformers of 50a/5a, 100a/5a, 150a/5a and 400a/5a current ratios. three stem 081 type transformers (50a/5a, 100a/5a, 150a/5a) and one sten 081 type transformer (400a/5a) were used [3]. the testing was performed under the following conditions: voltage: rated phase voltage; current: from 0% to 200% of the rated current; power factor: cosφ=1, cosφ=0.8 (ind.); power: sn=1.25va, sn=2.5va, sn=10va; and frequency: rated frequency of 50 hz. current errors expressed in % and phase errors expressed in minutes were measured at different current values. all the measurements gave a similar distribution of errors, regardless of the measuring current transformer ratio and the load on the secondary winding. typical graphs that show the current and phase errors versus the primary current are given in figures 2 and 3.

fig. 2 variation in the current error with the primary current for different loads on the secondary winding: a) without the designated error limits, according to standard, and b) with the designated error limits, according to standard (broken line) (gi [%] versus i1/in [%] from 0 to 200%; curves for 1.25va, cosφ=1; 2.5va, cosφ=0.8; 2.5va, cosφ=1; 10va, cosφ=0.8)

the results presented suggest that the current and phase errors are lower than the limit values, regardless of the value and power factor of the primary load, and of the load on the secondary windings of the measuring current transformer.

fig. 3 variation in the phase error with the primary current for different loads on the secondary winding: a) without the designated error limits, according to standard, and b) with the designated error limits, according to standard (broken line) (δ [min] versus primary current in % from 0 to 200%; curves for 1.25va, cosφ=1; 2.5va, cosφ=0.8; 2.5va, cosφ=1; 10va, cosφ=0.8)

figure 4 presents the current error for the different current ratios of measuring current transformers and the 2.5va load on the secondary winding at cosφ=1. it can be seen that changing the current ratio, for the same load on the secondary winding, does not affect the current error.

fig. 4 current error for the different current ratios of measuring current transformers (gi [%] versus i1/in [%] from 0 to 200%; curves for 50a/5a, 100a/5a, 150a/5a and 400a/5a)

3.2. testing of digital energy meters

digital energy meters were tested using a control measurement system, i.e. the iskramatic cats system [12, 13]. the testing was done on two three-phase transformer-operated energy meters (manufacturer enel belgrade, type dmg2), and three direct three-phase energy meters, type db2mg, of the same manufacturer [4].
the testing involved the following conditions: voltage: specified phase voltage; current: from 0.5% to 200% of the 5a rated current (transformer-operated energy meters), and from 2.5% to 1000% of the 10a base current (direct energy meters); power factor: cosφ=1, cosφ=0.5 (ind.), cosφ=0.8 (ind.), cosφ=0.8 (cap.); and frequency: the rated frequency of 50 hz. errors were measured for active and reactive electric energy, as well as for the maximum power.

three-phase transformer-operated energy meters

fig. 5 shows the errors in measuring active electric energy for two transformer-operated three-phase energy meters of the same type at cosφ=1. fig. 6 shows the error in measuring active energy at the different power factor values (cosφ=1, cosφ=0.5 (ind), cosφ=0.8 (ind), cosφ=0.8 (cap)) using a three-phase transformer-operated energy meter. a similar distribution of errors occurred when measuring reactive power. the graphs in figs. 5 and 6 show that the measurement errors exceed the error limits set out by the applicable standard.

fig. 5 the comparison of errors g [%] versus current i [%] occurring at measuring active energy in two transformer-operated energy meters of the same type. broken line presents the error range set out by standard

fig. 6 error g [%] versus current i [%] occurring at measuring active energy at the different power factors

direct three-phase energy meters

the base current of the tested direct measurement groups was 10a, whereas their maximum current was imax = 60 a or imax = 80 a. given that the testing was done with currents not exceeding 100a, the error in active and reactive power measurements was within the error range set out by the standard when measurements were performed with the energy meter with maximum current imax = 80 a (tables 3 and 4).
the reason for this is the small difference between the maximum current of the meter and the maximum current used in the testing. in the two meters with imax = 60 a, this difference was substantially greater, which resulted in measurement errors (tables 5 and 6).

table 3 percentage errors g % in direct energy meter, accuracy class 1 (active energy measurement, imax=80a)

no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
i [a] | 0.25 | 0.5 | 1 | 2 | 5 | 10 | 50 | 80 | 100
cosφ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
error limit [%] | ±1 | ±1 | ±1 | ±1 | ±1 | ±1 | ±1 | ±1 | ±1
g % | -0.78 | -0.27 | 0.07 | 0.19 | 0.03 | 0.05 | 0.31 | 0.35 | 0.65

table 4 percentage errors g % in direct energy meter, accuracy class 1 (reactive energy measurement at cosφ=0.5 (ind), cosφ=0.8 (ind), imax=80a)

no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
i [a] | 5 | 5 | 10 | 10 | 50 | 50 | 100 | 100
cosφ | 0.5 | 0.8 | 0.5 | 0.8 | 0.5 | 0.8 | 0.5 | 0.8
error limit [%] | ±2 | ±2 | ±2 | ±2 | ±2 | ±2 | ±2 | ±2
g % | -0.33 | -0.13 | -0.2 | -0.06 | 0.1 | 0.04 | -0.03 | 0.17

table 5 percentage errors g % in direct energy meter, accuracy class 1 (active energy measurement, imax=60a)

no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
i [a] | 0.25 | 0.5 | 1 | 2 | 5 | 10 | 50 | 80 | 100
cosφ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
error limit [%] | ±1 | ±1 | ±1 | ±1 | ±1 | ±1 | ±1 | ±1 | ±1
g % | 1.47 | 0.71 | -0.2 | -0.3 | -0.33 | -0.53 | -0.66 | -2.19 | -4.22

table 6 percentage errors g % in direct energy meter, accuracy class 1 (reactive energy measurement at cosφ=0.5 (ind), cosφ=0.8 (ind), imax=60a)

no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
i [a] | 5 | 5 | 10 | 10 | 50 | 50 | 100 | 100
cosφ | 0.5 | 0.8 | 0.5 | 0.8 | 0.5 | 0.8 | 0.5 | 0.8
error limit [%] | ±2 | ±2 | ±2 | ±2 | ±2 | ±2 | ±2 | ±2
g % | -0.76 -1 -1.05 -4.64

maximum power measurement error

the maximum power error was measured on transformer-operated three-phase energy meters at a current of 9a (180%) and cosφ=1. the results obtained show that the peak power measurement errors at the load of 180%, cosφ=1, at rated voltage and frequency, on the two transformer-operated energy meters were -3.201% and -3.154%, respectively.
specifically, the reference measuring instrument gave the value of 5.9362kw, whereas the energy meters showed the values of 5.744kw and 5.748kw.

laboratory studies do not fully correspond to real conditions, primarily because of the short testing period. in addition, laboratory testing is done for a finite number of measurement points at certain values of current and load power factor. in practice, current values and load type can change very quickly within a wide range of values, whereas long-term current overloads cause the measurement equipment to overheat, which can affect its measurement characteristics and, accordingly, the measurement errors. hence, the equipment tested in the laboratory was set up in a 10/0.4 kv substation for measurements in real conditions. substation feeders on which major changes and long-term overloads can be expected were used in these measurements.

measurements performed in a 10/0.4 kv substation

the measurements included setting up three measurement systems in a 10/0.4 kv substation. the complete measurement system comprised two transformer-operated energy meters dmg2, one direct energy meter db2mg (imax = 80 a) and two sets of measuring current transformers with current ratios of 150a/5a and 50a/5a (fig. 7). the measurement systems were connected to each other so that they measured the same load.

fig. 7 connection diagram of the measurement system in a 10/0.4kv substation (dmg2 5(6)a with stem 081 150/5a, db2mg 10-80a, dmg2 5(6)a with stem 081 50/5a; lines l1, l2, l3, n)

earlier measurements conducted in the substation showed that the current varied within the range of 70a–120a. at these current values, the measuring current transformers with the 150a/5a current ratio operate in the nominal range. on the other hand, the transformers with the 50a/5a current ratio operate under overload.
therefore, one of the transformer-operated energy meters (dmg2) operates in the nominal mode, whereas the other is overloaded. the direct energy meter db2mg is overloaded only part of the time. the average value of the phase voltage during the measurement was 234v. the consumed active and reactive energy and the maximum power were measured over a period of 2h 45min. table 7 shows the results of measurements obtained under the stated conditions. the results indicate a significant difference among the individual measurement systems. it can be assumed that the first measurement system, comprising measuring current transformers with the 150a/5a current ratio and the dmg2 transformer measurement group, measures with an error within the acceptable range (based on the results shown in previous subsections). the relative deviations of the other two measurement systems were calculated with respect to these results. they point to significantly greater deviations than allowed.

table 7 percentage g % errors in direct energy meter, accuracy class 1 (active and reactive energy and peak power)

quantity | dmg2+mct 150/5 | db2mg | dmg2+mct 50/5
wa [kwh] | 138.90 | 135.37 | 119.50
wr [kvarh] | 31.20 | 30.42 | 29.30
pmax [kw] | 71.16 | 66.800 | 57.12
δwa [%] | – | -2.54 | -13.97
δwr [%] | – | -2.5 | -6.09
δpmax [%] | – | -6.13 | -19.73

4. conclusion

this paper presents the results of testing electric energy and maximum power measurement systems used for direct and semi-indirect measurement at the voltage level of 0.4kv. laboratory studies of measuring current transformers indicated that the current and phase errors, regardless of the power factor and primary load values, and the load on the secondary windings of the measuring current transformer, are below the limit values.
however, it is important to note that, when selecting a measuring current transformer, attention should be paid to the load on the secondary winding, as it can affect the measurement error. laboratory testing of transformer-operated energy meters revealed that the measurement errors of active and reactive electric energy and maximum power are:
- within the limits of accuracy class in overloads up to 70% (regardless of the load type),
- beyond the limits of accuracy class in overloads above 70%, i.e.: 1) in 80% overloads the error ranges from 3.154% to 3.5%, and 2) in 100% overloads the error exceeds 9%.

in direct energy meters, measurement results were within the limits of accuracy class when the maximum current of the measurement group was only slightly lower (by up to 20%) than the maximum operating current. higher operating currents result in error values similar to those of transformer-operated energy meters. measurements conducted in the substation confirm the results obtained in laboratory conditions. an increase in the measurement error can be expected under real conditions. the results obtained imply that the energy meters introduce a significant negative measurement error under overload conditions. this implies that, in this operating mode, energy meters register lower values of both the consumed electric energy and the maximum power, which can be interpreted as a loss. future analysis in this area will focus on the influence of current and voltage thd on the errors of measuring current transformers and energy meters.

references
[1] p. duduković, m. đekić, electrical measurements, first edition, naučna knjiga, beograd, 1991. (in serbian)
[2] v. bego, measuring transformers, školska knjiga, zagreb, 1977. (in serbian)
[3] katalog proizvoda strujni transformatori za merenje 0.72 kv, fabrika mernih transformatora zaječar, zaječar, 2010.
[4] catalogues – db2mg, dmg1, dmg2, enel belgrade, belgrade, 2010.
[5] a.e. emanuel, j.a. orr, "current harmonics measurement by means of current transformers", ieee trans. power deliv., vol. 22, pp. 1318–1325, july 2007.
[6] p. mlejnek, p. kaspar, "calibrations of phase and ratio errors of current and voltage channels of energy meter", journal of physics: conference series, vol. 450, p. 012046, 2013.
[7] d. stevanovic, p. petkovic, "the losses at power grid caused by small nonlinear loads", serb. jour. elec. eng., vol. 10, pp. 209–217, feb 2013.
[8] m. soinski, w. pluta, s. zurek, a. kozłowski, "metrological attributes of current transformers in electrical energy meters", in proceedings of the international workshop on 1&2 dimensional measurement and testing, vienna, austria, 2012.
[9] k. draxler, r. styblíkova, "effect of magnetization on instrument transformer errors", jour. elec. eng., vol. 10, pp. 209–217, feb 2013.
[10] srps en 60044-1:2009, merni transformatori deo 1: strujni transformatori, institut za standardizaciju srbije, beograd, 25.02.2009.
[11] srps en 62053-21:2008, oprema za merenje električne energije naizmenične struje deo 21: statička brojila aktivne energije (klase 1 i 2), institut za standardizaciju srbije, beograd, 29.12.2008.
[12] j.g. webster, the measurement, instrumentation and sensors handbook, first edition, crc press, boca raton, fl, usa, 2000.
[13] http://www.iskraemeco.si/emecoweb/eng/products/equipment/iskramatic_cats.html

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 269-283 doi: 10.2298/fuee1602269m

design of iir digital filters with critical monotonic passband amplitude characteristic – a case study

dejan mirković, miona andrejević stošović, predrag petković, vančo litovski
university of niš, faculty of electronic engineering, serbia

abstract. a case study is reported related to the design of iir digital filters exhibiting critical monotonic amplitude characteristic (cmac) in the pass band.
this kind of amplitude characteristic offers several advantages as compared to its non-monotonic counterparts, although it has not been studied thoroughly so far, if at all. after a short overview of how cmacs are generated, arguments will be listed in favor of the iir version of the digital filter function realization. next, the iir implementation of the digital filters will be considered in short. the main part of the paper will be devoted to the design sequence for this kind of filter, which will be illustrated on the example of a band-pass filter obtained by a set of transformations from an all-pole low-pass analogue prototype. this will be the first time a cmac band-pass iir digital filter is reported.

key words: digital filters, iir, monotone amplitude characteristic, all-pole filters

1. introduction

the critical monotonic amplitude characteristic (cmac) filters represent an extension of the broad family of filtering functions having all transmission zeroes at infinity [1]. they exhibit distinctive properties such as monotonic amplitude response in the pass band, reduced group delay distortions, higher symmetry of the pulse response, improved mapping of tolerances, improved sensitivity, and high selectivity. the interest in a digital realization of this kind of filtering function comes from several reasons. first of all, only one sub-class of these functions has already been published in its digital form, the butterworth filters [2]. as shown in [1] and elsewhere, however, practically all sub-classes of cmac functions outperform the butterworth solution in almost every aspect of implementation with the exception of the function's simplicity. this study is a part of our effort to make cmacs more popular and to help bridge the gap between designers and cmac, which has deepened over time [3].
second, due to their monotonic behavior, their sensitivity in the passband is reduced and, accordingly, they offer a good alternative to their non-monotonic counterparts (e.g. chebyshev and least pth [4]). at the same time, this means an improvement in the mapping of the tolerances of the circuit parameters into the tolerances of the attenuation characteristic [5]. finally, they exhibit smaller distortions of the passband group delay, which reduces the complexity of the potential phase-corrector to be used to flatten the group delay characteristic [6]. this also means that cmac have smaller asymmetry of the response to a dirac pulse in the time domain, which may be of crucial importance for some applications in telecommunication and signal processing.

(received april 24, 2015; received in revised form august 3, 2015. corresponding author: dejan mirković, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia, e-mail: dejan.mirkovic@elfak.ni.ac.rs)

it is our opinion that the advantages of cmac filtering functions have not been completely understood in the research and design community. that especially holds for the iir implementation, where no instances of implementation of cmac may be found. the reason for that, in our opinion, is inertia and the need for some additional (mathematical) knowledge for generation of the cmac transfer functions as compared with the chebyshev and butterworth filters. here we try to reopen the subject of cmac design by reporting the results related to the design of a band-pass digital iir filter, which is the first reported implementation of a band-pass cmac filter. the designer first has to decide either to go for fir filters and start the synthesis of transfer functions for each type of cmac from scratch, or to go for iir filters and transform the existing analog data into the digital domain.
in the text below, a short section is devoted to helping with this decision. as a conclusion, the designer will be advised to go for an iir filter with parallel implementation as the most economical solution in almost every respect. next, one is to create the cmac transfer function and to choose among sub-classes. again, a short section will be devoted to this issue. four main sub-classes of cmac will be described from the implementation point of view. the corresponding transfer function generation will be discussed briefly. based on these, a design sequence will be advised for finding the coefficients of the transfer function of iir filters in the z-domain. note that parallel implementation will be recommended and all the calculations will be performed under that presumption. the transformed function will be studied from both the stability and the accuracy point of view. the procedure will be exemplified on the case of a band-pass iir filter. to get it, a low-pass to band-pass transformation was performed in the analog domain. the analog prototype so obtained was then transformed into the z-domain by the bilinear transformation. the implementation obtained in this way was evaluated by simulation of a filter excited by a complex signal in the time domain. various possible computing technologies were taken into account by changing the number of significant figures used in the computations, in order to establish the most economical implementation satisfying the design requirements.

the paper is organized as follows. in the second section arguments will be given for adopting iir digital filters. in the third section the cmac function will be introduced. then, in the fourth section, the application of the bilinear transform to a parallelized analog transfer function will be given. the case study describing the design (and its verification) of a band-pass cmac filter will be given in the fifth section.

2.
properties of the iir digital filters

in digital filter design, one is to decide first on the choice between fir and iir filter functions and then to proceed to the approximation problem. then, one is to choose among different structures exhibiting the same transfer function. in the case of digital filters, the choice is to be made between the canonical (or state variable) and the parallel form. these two are illustrated in fig. 1 for an iir digital filter. it should be noted that if the order of the filter, n, is even, the first order cell at the bottom of fig. 1b is omitted, leaving only second order cells in the filter realization. when deciding between fir and iir filters one has to have in mind several criteria such as complexity of the solution, stability of the system, and processing time. the first criterion may be fragmented into several having the same origin. namely, the complexity of the solution will influence the power consumption, the silicon area and the design effort, especially when special techniques are to be implemented for reduction of the power consumption [7].

fig. 1 realization of an nth order iir filter, a) canonical b) parallel (for n odd)

note that not all of the criteria are of equal weight in design. for some applications the latency, i.e. the computational time, may be of prime importance since it determines the achievable speed. in others, reduction of heating or silicon area may prevail as the main criterion. putting all together, the choice is to be made by taking into account several, if not all, criteria. our detailed study [3] led to the following conclusions.

the use of iir filters has the following advantages:
1. lower complexity (in some cases, e.g. [8], incomparably lower);
2. lower dissipation;
3. lower silicon area;
4. available analog prototypes to transform.

the use of fir filters has the following advantages:
1. lower latency;
2. easier synthesis of linear phase filters;
3. better stability.

the use of the parallel architecture for the iir filters, as shown in fig. 1b, however, mitigates all disadvantages (stability, latency) of the iir filters, while there are no methods to do the same for the fir counterparts. it should be noted here that getting a linear phase by fir filters doubles the complexity of the solution, while using a phase corrector for the iir solution contributes only marginally to its complexity [8]. that was the reason why we adopted the parallel architecture and the iir filter structure for the implementation in the cmac design.

3. cmac filters in the s-domain

polynomial (or all-pole) filters with critical monotonic amplitude characteristics (cmac) in the passband have been available for several decades now [1]. the main property of cmac is related to the critical monotonicity of the amplitude response in the passband, which will first be described here in short. the squared amplitude characteristic may be expressed as

$|h(j\omega)|^2 = 1/\{1 + k(\omega^2)\}$ (1)

where k(ω²) is the characteristic function. in the simplest form (as proposed in [9]), for n even, one has:

$k(\omega^2) = \varepsilon^2 l_n(\omega^2) = \varepsilon^2 \frac{\prod_{i=1}^{n/2} (\omega^2 - \omega_i^2)^2}{\prod_{i=1}^{n/2} (1 - \omega_i^2)^2}$, (2)

where ω is the normalized angular frequency, n is the order of the filter, ε defines the insertion loss at the passband edge, i.e. $\varepsilon^2 = 10^{a_{max}/10} - 1$, 0 < ωi < 1, i=1,2, ..., ⌊n/2⌋ are the abscissae of the inflection points, amax is the maximum allowable attenuation (in db) in the passband, and ⌊ ⌋ denotes the floor function. ln(ω²) is a polynomial with n second order real zeroes located in the interval (−1,1).
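a short worked example may help to fix the role of ε. it assumes that the normalized passband edge is at ω = 1, as the normalization in (2) suggests (the two products coincide there, so ln(1) = 1). the value amax = 3 db is chosen purely for illustration and is not taken from the paper:

```latex
\begin{aligned}
k(1) &= \varepsilon^2 = 10^{a_{max}/10} - 1, \\
|h(j1)|^2 &= \frac{1}{1 + \varepsilon^2} = 10^{-a_{max}/10}, \\
a_{max} = 3\ \mathrm{dB} \;&\Rightarrow\; \varepsilon^2 = 10^{0.3} - 1 \approx 0.995,
\qquad |h(j1)|^2 \approx 0.501 .
\end{aligned}
```

in other words, under this assumption ε alone fixes the attenuation at the passband edge, while the inflection points ωi only shape the response inside the passband.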
since the characteristic function has a maximum number of inflection points in the passband, so do the amplitude characteristic and the attenuation, the last one being defined as

$a(\omega^2) = 10 \log(1/|h(j\omega)|^2)$ [db] (3)

the main property of cmac leads to a good mapping of the element tolerances into the tolerances of the attenuation. namely, as shown in [4], the tolerance of the attenuation may be expressed in the following form:

$\frac{\partial a}{\partial x_i} = \frac{\partial \omega}{\partial x_i} \cdot \frac{\partial a}{\partial \omega}$, (4)

where xi is the ith parameter of the analog circuit. having a maximal number of inflection points (where both the first and the second derivatives are equal to zero) of the amplitude characteristic in the passband, the cmac forces the left-hand side of (4) to go through zero a maximal number of times. note that the derivative of a in (4) does not change its sign if cmac is used, since it is monotonic, which is different from the non-monotonic functions, e.g. c and ls. filters exhibiting cmac characteristic are also known to have lower group delay distortions in the passband than their c and ls counterparts [10]. altogether, the existence of cmac gives the filter designer additional freedom in the choice of the best solution for a filter design problem.

there are four main classes of cmac as discussed in [1] and [10]. they originate from the design criteria implemented for synthesis of the transfer function. these criteria are:
1. maximally flat in the origin. the class of filters thus obtained is called butterworth's after the author [11]. these will be here referred to as b-filters.
2. maximum slope of the characteristic function at the edge of the passband [12] [13] [14]. the name l-filters comes from the fact that for the original derivation legendre polynomials were used.
3. maximum asymptotic attenuation [15]. here, these will be referred to as h-filters.
4. least-squares-monotonic.
in this case, the reflected power in the pass-band was minimized under the critical monotonicity criterion [16]; the resulting filters were named lsm filters.

a catalog of the coefficients of the transfer functions of all four classes of cmac for n up to 10, obtained by these criteria, was published in [1], where a comparative study was also given. to illustrate this here, fig. 2 depicts the passband attenuation characteristics of the above four classes for n=7. in the next section, before proceeding to cmac digital filter design, the arguments for using iir filters will be discussed in short as based on the comparison of the properties of fir and iir filters.

fig. 2 the four main cmac approximants for n=7

4. the bilinear transform and cmac in the z-domain

there are several transformations claiming to preserve some of the original properties of the analog filter function when producing a digital domain counterpart. as listed in [17], these methods may be categorized in two groups. the first group contains the ones which implement a specific criterion of approximation, such as: the impulse response invariant method; the modified impulse response invariant method; the step response invariant method (or zero order hold); the magnitude-invariance method; and the phase-invariance method. the second group contains transformations based on substitution of the complex frequency in the s-domain by an expression being a function of z. these include the matched-z transform method, and three methods obtained by approximation of the analog integrator by a digital one. these are known as the backward euler (backward difference) method; the trapezoidal method or the bilinear transform method, discussed in [18]; and the second order formula introduced in [17]. the most popular among all of these is the bilinear transform.
its main properties are simplicity of implementation and good preservation of the properties of the amplitude characteristic of the analog filter. it preserves the stability of the analog prototype. it introduces distortions (reduced by increasing the sampling rate) into the phase (group delay) characteristic which, however, are of no importance in many applications. it is implemented via the following substitution in the analog transfer function:

$s = \frac{2}{t} \cdot \frac{z - 1}{z + 1}$, (5)

where z is the complex variable of the digital domain, and t is the sampling period, t=1/fs, fs being the sampling frequency. in that way

$h_a(s) = h_a\left(\frac{2}{t} \cdot \frac{z - 1}{z + 1}\right) = h_d(z)$ (6)

is obtained, where ha stands for the transfer function of the analog filter, while hd stands for that of the digital filter. the procedure for applying the bilinear transform to a parallelized analog transfer function, together with the stability analysis and numerical considerations, was discussed in [3] and will not be repeated here. instead, in the sequel, we will go for the design of a band-pass filter obtained by a low-pass-to-band-pass transformation in the s-domain and then transposed into the z-domain. our goal with that design is to study all steps that remain to be performed in order to get an implementable design and to analyze the implementation problems related to the limited number of binary digits that arise in real life situations.

5. design, vhdl modeling, and simulation of cmac iir filters

the following steps are to be performed in order to get an implementable design of the filter: creation of the band-pass filter in the s-domain; performing the s-to-z transform; conversion of the decimal coefficient values into binary; scaling the coefficients to become implementable in fixed point arithmetic; and verification of the design by simulation of the filter hardware.
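the first two steps of this list can be sketched for a single pole as follows. this is only an illustration, not the authors' code: it assumes the standard low-pass to band-pass mapping s_lp = (s_bp² + ω0²)/(bwr·ω0·s_bp) and the bilinear substitution s = (2/t)(z−1)/(z+1), and the function names are hypothetical. the numerical inputs (real pole of the low-pass lsm prototype, f0 = 3 khz, fbw = 900 hz, fs = 50 khz) are those of the case study; with them the sketch should reproduce band-pass pole pair 1/2 of table 1 and, to about six decimal places, the cell iv denominator of table 2.

```python
import cmath
import math


def lp_to_bp_poles(p_lp, bw_r, w0=1.0):
    """Map one low-pass prototype pole to its two band-pass poles.

    Assumes s_lp = (s_bp**2 + w0**2) / (bw_r * w0 * s_bp), i.e. the roots of
    s_bp**2 - (bw_r * w0 * p_lp) * s_bp + w0**2 = 0.
    """
    b = bw_r * w0 * p_lp
    root = cmath.sqrt(b * b - 4.0 * w0 * w0)
    return (b + root) / 2.0, (b - root) / 2.0


def bilinear_denominator(p, fs):
    """Return (d1, d2) of 1 + d1*z^-1 + d2*z^-2 for the pole pair (p, conj(p))."""
    k = 2.0 * fs              # 2/t in the bilinear substitution
    a1 = -2.0 * p.real        # analog denominator: s^2 + a1*s + a2
    a2 = abs(p) ** 2
    d0 = k * k + a1 * k + a2  # coefficient of z^0 before normalization
    d1 = (2.0 * a2 - 2.0 * k * k) / d0
    d2 = (k * k - a1 * k + a2) / d0
    return d1, d2


# Design values of the case study.
F0, FBW, FS = 3000.0, 900.0, 50000.0
P_LP_REAL = complex(-0.5510897460, 0.0)   # real pole of the low-pass prototype

# Step 1: normalized band-pass poles (w0 = 1 rad/s, bw_r = fbw / f0).
bp_pole, bp_pole_conj = lp_to_bp_poles(P_LP_REAL, FBW / F0)

# Step 2: denormalize by 2*pi*f0, then map the pole pair to the z-domain.
d1, d2 = bilinear_denominator(bp_pole * 2.0 * math.pi * F0, FS)

print("band-pass pole:", bp_pole)
print("d1 =", d1, "d2 =", d2)
```

the numerators of the parallel sections depend on the partial-fraction residues, which are not reproduced here, so only the denominator coefficients are computed.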
concurrently, based on the transfer function evaluation performed after taking into account the finite number of digits used for the representation of the coefficients (after quantization), a final decision can be made about the acceptability of the given approximation, i.e. the selected number of binary digits. that and scaling are steps of crucial importance for defining the quality of the final solution.

the example filter will be created based on the following requirements:
a) band-pass (bp);
b) central frequency: f0 = 3 khz;
c) bandwidth: fbw = 900 hz;
d) sampling frequency: fs = 50 khz;
e) order of the prototype low-pass filter n=7;
f) pass-band amplitude approximation: lsm;
g) s-to-z transform used: bilinear;
h) architecture: parallel combination of transposed direct form ii (tdf ii) filter sections.

the well known [19] low-pass to band-pass transform was used:

$\Omega = \frac{1}{bw_r} \cdot \frac{\omega^2 - \omega_0^2}{\omega_0\, \omega}$, (7)

where Ω is the angular frequency of the prototype filter, while ω is the angular frequency of the band-pass filter. ω0 is the central angular frequency, while bwr=bw/ω0. bw is the bandwidth of the filter expressed as angular frequency. after the substitution of slp=jΩ and sbp=jω, (7) becomes a second order algebraic equation with complex coefficients, which is usually solved by the geffe algorithm [20]. the new function has fourteen poles obtained by solving (7), as listed in table 1 (together with the poles of the prototype lp lsm filter), and seven zeroes in the origin. in this case bwr=fbw/f0=0.3 and ω0=1 rad/s were used.

table 1 pole locations of the bp and lp lsm filter in the s-domain

no. | band-pass real part | band-pass imaginary part | low-pass real part | low-pass imaginary part
1/2 | -0.08266346190 | ±0.99657751935 | -0.1179475625 | ±0.9751626241
3/4 | -0.02025317565 | ±1.15676424786 | -0.3342221750 | ±0.7735798237
5/6 | -0.01513109310 | ±0.86421546064 | -0.4935853895 | ±0.4252967357
7/8 | -0.05591895577 | ±1.12151432405 | -0.5510897460 | 0.0
9/10 | -0.04434769671 | ±0.88944037693 | |
11/12 | -0.07876429908 | ±1.06309950994 | |
13/14 | -0.06931131778 | ±0.93551048922 | |

next, the transfer function of the band-pass filter was expressed as a sum of partial fractions to enable parallel realization. before the bilinear s-to-z transformation was applied, the poles of the band-pass filter were denormalized: every pole coordinate was multiplied by 2π∙f0. based on this, the coefficients of the biquads were calculated and the s-to-z-domain mapping was enabled. the resulting coefficients of the biquads in the z-domain are given in the first row (entitled full precision) of table 2. this concludes the synthesis procedure.

we proceed now with the realization. as the first step, we encounter the necessity to express the coefficient values with a finite number of digits, as physical implementation is expected. this process is usually referred to as quantization. only fixed point, two's complement representation of the biquad coefficients is considered. fig. 3 shows the transfer function's pole locations in the z-plane for various binary word lengths used to represent the coefficient values. note that in all cases the poles are confined within the unit circle, which confirms our claim that parallel realization will mitigate stability problems in iir realizations.

fig. 3 z-plane pole location of the bp lsm filter, a) unit circle, b) zoomed poles location.

the following notation was used for the quantized version of the filter coefficients: q[n f]. n stands for the number of bits of the whole digital word and f for the number of bits allocated for the digits after the decimal point.
accordingly, q[n f] will populate the range (rng) in increments (inc) as follows:

$m = \lceil \log_2(|c_{max}|) \rceil + 1; \quad f = n - m$, (8a)

$rng = \left[ -\frac{2^{n-1}}{2^f},\ \frac{2^{n-1} - 1}{2^f} \right]; \quad inc = \frac{1}{2^f}$, (8b)

where cmax is the coefficient with maximal absolute value, m is the number of bits allocated for the integer part plus the sign, and f is the number of bits allocated for the fractional part. the symbol ⌈ ⌉ denotes the ceiling function. two operations are performed over the coefficients: first, scaling is done with the help of the results of (8a) and the appropriate number of bits is determined for the integer and the fractional part; second, the coefficients are quantized, i.e. mapped to the appropriate values in the range given by (8b) using the round to nearest method. decimal and hexadecimal representations of the coefficients quantized with q[16 14] are given in the second and third row of table 2, respectively. observing table 2, one can see that the coefficient with maximal absolute value is the d1 coefficient of the first section; therefore m = 2, f = 14 are required for the 16 bit representation. for these parameters the range rng = [−2, 1.99993896484375] is covered in inc = 0.00006103515625 increments.

assuming absence of any other source of computational error or noise, we calculated the attenuation characteristic of the filter for different quantization formats of the coefficients as discussed above. the results for the example bp lsm filter are depicted in fig. 4. observing fig. 4, one can conclude that variants with 16-bit word length and higher produce amplitude characteristics that start to agree with the one obtained with full precision. therefore, the 16-bit representation can be used if attenuation larger than 50 db is not required (observing the lower stop-band in fig. 5). of course, one can use q[24 22] or q[32 30] if a more accurate design is required.
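the scaling and quantization rules (8a) and (8b) can be sketched in a few lines. this is a minimal illustration rather than the authors' implementation: the function names are hypothetical, ties are rounded upward (the text only says "round to nearest"), and out-of-range values are saturated, which is an added assumption. the sample inputs are full-precision coefficients of section i from table 2.

```python
import math


def q_format(c_max, n):
    """Eq. (8a): m integer bits (sign included) and f fractional bits for an n-bit word."""
    m = math.ceil(math.log2(abs(c_max))) + 1
    return m, n - m


def q_range(n, f):
    """Eq. (8b): lowest value, highest value and increment of q[n f]."""
    low = -(2 ** (n - 1)) / 2 ** f
    high = (2 ** (n - 1) - 1) / 2 ** f
    return low, high, 1.0 / 2 ** f


def quantize(c, n, f):
    """Round c to the nearest q[n f] value.

    Returns the quantized value and its two's complement hexadecimal word.
    Ties are rounded upward and out-of-range values saturate (assumptions).
    """
    code = math.floor(c * 2 ** f + 0.5)
    code = max(-(2 ** (n - 1)), min(2 ** (n - 1) - 1, code))
    hex_word = format(code & (2 ** n - 1), "0{}x".format((n + 3) // 4))
    return code / 2 ** f, hex_word


# Full-precision coefficients of section i (table 2).
D1_I = -1.886085860514888
C0_I = 0.0033086494281706663
C1_I = -0.0018617614854014842

m, f = q_format(D1_I, 16)
low, high, inc = q_range(16, f)
for name, c in (("d1", D1_I), ("c0", C0_I), ("c1", C1_I)):
    value, word = quantize(c, 16, f)
    print(name, value, word)
```

with the 16-bit word this gives m = 2 and f = 14, and the quantized values and hexadecimal words can be compared directly with the q[16 14] rows of table 2.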
table 2 original and quantized filter coefficients

numerator, full precision
cell   c0                        c1                        c2
i      +0.0033086494281706663    -0.0018617614854014842    -0.0051704109135721505
ii     +0.0067062285319410561    +0.0085906290469084413    +0.0018844005149673854
iii    -0.044039453119340377     -0.0120564809558326       +0.031982972163507775
iv     +0.06277875243767074      0                         -0.062778752453249653
v      -0.029334088252293972     +0.015384970101239978     +0.044719058353533951
vi     -0.006864819539353288     -0.013389525398100878     -0.00652470585874759
vii    +0.0074447307119548771    +0.0032630379244648162    -0.0041816927874900617

numerator, q[16 14] decimal
cell   c0                  c1                  c2
i      +0.00329589843750   -0.00189208984375   -0.00518798828125
ii     +0.00671386718750   +0.00860595703125   +0.00189208984375
iii    -0.04406738281250   -0.01208496093750   +0.03198242187500
iv     +0.06280517578125   +0.00000000000000   -0.06280517578125
v      -0.02935791015625   +0.01538085937500   +0.04473876953125
vi     -0.00683593750000   -0.01336669921875   -0.00653076171875
vii    +0.00744628906250   +0.00323486328125   -0.00421142578125

numerator, q[16 14] hexadecimal
cell   c0     c1     c2
i      0036   ffe1   ffab
ii     006e   008d   001f
iii    fd2e   ff3a   020c
iv     0405   0000   fbfb
v      fe1f   00fc   02dd
vi     ff90   ff25   ff95
vii    007a   0035   ffbb

denominator, full precision
cell   d1                     d2
i      -1.886085860514888     +0.98894784642499967
ii     -1.8601293281281488    +0.96799935587139696
iii    -1.8323004517946606    +0.95057717438073264
iv     -1.8083338880121447    +0.94157013740084117
v      -1.7935719318070145    +0.94450186279953363
vi     -1.7923157567661698    +0.96044412883386077
vii    -1.8052459566148433    +0.98552821289667847

denominator, q[16 14] decimal
cell   d1                  d2
i      -1.88610839843750   +0.98895263671875
ii     -1.86010742187500   +0.96801757812500
iii    -1.83227539062500   +0.95056152343750
iv     -1.80834960937500   +0.94158935546875
v      -1.79357910156250   +0.94451904296875
vi     -1.79229736328125   +0.96044921875000
vii    -1.80523681640625   +0.98553466796875

denominator, q[16 14] hexadecimal
cell   d1     d2
i      874a   3f4b
ii     88f4   3df4
iii    8abc   3cd6
iv     8c44   3c43
v      8d36   3c73
vi     8d4b   3d78
vii    8c77   3f13

after
16-bit representation is adopted, we perform an additional verification, but now in the time domain. fig. 5 depicts the results of the time domain simulation using coefficients quantized with q[16 14]. both the signals and the corresponding spectra are presented. the spectra shown in fig. 5b and 5d are obtained with an nfft = 65536 point fft. the input test signal is
$s_{in} = \sin(2\pi f_0 t) + \sin(2\pi f_1 t) + \sin(2\pi f_2 t) + \sin(2\pi f_3 t)$, (9)
with: f0 = 3 khz, f1 = 374.60 hz, f2 = 749.97 hz, and f3 = 5999.76 hz. the bandwidth is limited by [fl, fu] = [2583.56, 3483.56] hz. the values of the test frequencies are picked to match integer multiples of the fft resolution bin (fs/nfft) in order to minimize spectral leakage in the resulting fft image of the spectrum.
fig. 4 attenuation of the 14th order lsm band-pass filter: a) pass-band, b) stop-band
observing the spectra in fig. 5b and 5d one can see that after filtering there is only one dominant bin, at frequency f0, while the others are filtered out.
fig. 5 time domain simulation of the bp filter mathematical model: a) input waveform, b) input spectrum, c) output waveform and d) output spectrum
for hardware implementation a versatile vhdl code was written. it combines the second and/or first order cells presented in fig. 1b. illustrative schematics of the described second order tdf ii cell and of the top-level filter cell are shown in fig. 6a and 6b, respectively. the appropriate number/position of bits at each signal path is labeled as well.
fig. 6 schematic representation of a) second order tdf ii and b) top level filter cell
each delay block (z^-1) is realized as a register. parallel multipliers and ripple carry adders (with add and subtract functions) are designed for the multiplication and summing operations, respectively. according to fig.
6a it can be concluded that a second order cell requires five multipliers and four adders. on the other hand, assuming zero values for the c2,i and d2,i coefficients, a first order cell follows from the second order one. therefore, a first order cell will require three multipliers and only two adders. to ensure successful synthesis the whole filter is described structurally. each individual block, starting from the basic ones, i.e. multipliers, adders and registers, up to the top level entity, is described. therefore, no predesigned structures are assumed, making the code as portable as possible. tdf_ii_i represents a first or second order tdf ii cell. din and dout are the input and output digital words. index bounds and constants in fig. 6b are defined as follows,
$i = 0, \ldots, \lfloor r/2 \rfloor + \mathrm{mod}(r, 2) + p - 1$, (10a)
$j = \begin{cases} \max(i), & \text{for } k = 0 \\ \max(i) + 1, & \text{for } k \neq 0 \end{cases}$, (10b)
where i is the index of the filter’s section and j is the number of adders used to sum the outputs of the sections. the order of the resulting transfer function is marked with r. it should be emphasized that the order of the resulting band-pass/stop transfer function is doubled compared to the low-pass prototype function. the symbols ⌊ ⌋, max(x) and mod(x,y) denote the floor function, the maximal value, and the modulo operator (remainder after division of x by y), respectively. parameter p is the flag that detects the existence (1 - exist, 0 - do not exist) of two real poles/residues in the resulting band-pass/stop transfer function. if this is the case, two first order sections are generated. finally, k represents the direct term of the partial fraction expansion of the resulting transfer function. this term is always zero if the order of the numerator polynomial, m, is less than the order of the denominator polynomial, n. otherwise, this term is of the order m – n. in filter transfer functions the only such case is m = n, which gives k as a simple constant factor.
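a behavioural sketch of the structure described above may help to see where the five multipliers and four adders of a second order cell come from, and how the direct term k enters the output sum. this is a floating-point python illustration, not the vhdl code of the paper; the function names are hypothetical, and the section transfer function is assumed to be (c0 + c1 z^-1 + c2 z^-2)/(1 + d1 z^-1 + d2 z^-2).

```python
def tdf2_section(x, c0, c1, c2, d1, d2):
    """Second order transposed direct form II cell (first order if c2 = d2 = 0)."""
    s1 = s2 = 0.0                        # the two delay registers (z^-1 blocks)
    y = []
    for xn in x:
        yn = c0 * xn + s1                # 1 multiplier, 1 adder
        s1 = c1 * xn - d1 * yn + s2      # 2 multipliers, 2 adders
        s2 = c2 * xn - d2 * yn           # 2 multipliers, 1 adder
        y.append(yn)
    return y

def parallel_filter(x, sections, k=0.0):
    """Sum of independently driven sections plus the direct (buffer) branch k*x.

    Assumes at least one section is given.
    """
    outs = [tdf2_section(x, *sec) for sec in sections]
    return [k * xn + sum(col) for xn, col in zip(x, zip(*outs))]

impulse = [1.0, 0.0, 0.0, 0.0]
print(tdf2_section(impulse, 1.0, 0.0, 0.0, -0.5, 0.0))
print(parallel_filter(impulse,
                      [(1.0, 0.0, 0.0, -0.5, 0.0), (1.0, 0.0, 0.0, 0.5, 0.0)],
                      k=2.0))
```

every section sees the same input and the outputs meet only at the final summing node, which is the property the parallel realization relies on.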
therefore, a branch with the k factor is nothing but a simple buffer stage. possible values for parameters k and p are given in table 3.

table 3 possible values for p and k parameters
filter type    even order     odd order
               p    k         p    k
high-pass      0    1         0    1
band-pass      0    0         1    0
band-stop      0    1         1    1

accordingly, the vhdl entity accepts the generics and has the interface ports shown in table 4. vhdl code sample is given below. next, the vhdl description was verified by logic simulation with the excitation described in the previous section. it is important to mention that when dealing with hardware implementation two important effects must be examined, namely saturation and round-off noise. saturation depends strongly on the input signal waveform and on the filter’s architecture and coefficient values. moreover, internal states usually saturate at different speeds, making the tracking of the saturation process a nontrivial task.

table 4 generics and ports of vhdl entity
            symbol   description
generics    n        word length
            f        number of bits after decimal point
            r        order of resulting transfer function
            p        flag for detecting two real poles/residues
            k        direct term of partial fraction expansion
            cfs      coefficients of the filter
ports       clk      clock signal at sample rate frequency
            rst      reset signal active at negative level
            x        input n bit signal
            y        output n bit signal

round-off noise is the direct consequence of a fixed-point representation. simply, the product of two n-bit fixed-point numbers is a 2n-bit number. this product must eventually be quantized to n bits by rounding or truncation, which results in the round-off noise. a number of techniques can be used to mitigate this problem [21], [22]. the most commonly used technique to prevent saturation and round-off noise is the dynamic range scaling (or simply scaling) of the input signal prior to the filtering action.
namely, each input signal value should be scaled down into a specific range which, ideally, ensures no saturation in any of the internal and external nodes of the filter. luckily, since two’s complement representation is exploited, saturation of the internal nodes can be allowed since it will be interpreted as an overflow (wrap-around) effect. for example, an overflow occurs when the sum of two positive numbers yields a negative result and vice versa; otherwise the result is correct. a similar effect occurs when the internal state values reach the boundaries of the dynamic range, i.e. the first value larger/smaller than the maximal/minimal one is interpreted as the minimal/maximal value of the range. this can be tolerated as long as the final values of the output signal are valid, i.e. wrap-around does not coincide with the moment of the output signal acquisition. this is where parallel realization, adopted in this work, again outperforms the cascade one. saturation conditions are drastically relaxed when using parallel realization, especially as the order of the filter increases. the occurrence of wrap-around is reduced as well. this is simply because, no matter how large the filter order is, all second and/or first order cells process the input signal independently of each other. therefore, scaling of the input signal applies to all cells at once. the bottleneck is, of course, the output summing node. nevertheless, the net sensitivity to saturation is reduced when the constraints regarding saturation are relaxed at each individual cell. one may also choose to omit scaling and rely completely on two’s complement representation, but this technique requires sound knowledge about the input signal waveform and an algorithm for tracking and handling the wrap-around effect. also, all possible cases have to be predicted, therefore extensive simulations are required. this is usually too expensive in real-world applications, therefore some form of scaling is always applied.
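the two fixed-point effects discussed above, wrap-around of intermediate sums and rounding of the double-length product, can be illustrated with a small python sketch. it assumes the q[16 14] format of the example; the helper names are hypothetical and the numbers are chosen only for illustration.

```python
import math

N, F = 16, 14                      # q[16 14]: 16-bit word, 14 fractional bits

def to_fixed(v):
    """Real value -> integer code, round to nearest."""
    return math.floor(v * (1 << F) + 0.5)

def wrap(v, n=N):
    """Interpret an integer as an n-bit two's complement word (overflow wraps)."""
    half = 1 << (n - 1)
    return (v + half) % (1 << n) - half

def fixed_mul(a, b):
    """Multiply two q[16 14] codes; the 2N-bit product is rounded back to F fractional bits."""
    return wrap((a * b + (1 << (F - 1))) >> F)

a, b, c = to_fixed(1.5), to_fixed(1.0), to_fixed(-1.25)
partial = wrap(a + b)       # 2.5 is outside [-2, 2): the partial sum wraps to -1.5
total = wrap(partial + c)   # the final sum 1.25 is in range, so it comes out correct
print(partial / (1 << F), total / (1 << F))
print(fixed_mul(to_fixed(0.5), to_fixed(0.5)) / (1 << F))
```

the intermediate overflow is harmless here only because the final sum fits into the range, which mirrors the condition stated in the text.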
moreover, the scaling technique is quite easy to implement in the digital domain, since each scaling down/up by two is nothing but a simple shift right/left operation. accordingly, the scaling operation can be implemented as tunable (programmable). even non-linear scaling can be implemented if high accuracy is required. unfortunately, there is no scaling technique which provides a closed-form, general solution; in the end it all depends on the concrete application. this implies that some exploration of the time domain simulation results is inevitable in the design process to determine the appropriate scaling factor. finally, combining several techniques to cope with saturation and round-off noise may result in a more efficient solution, but scaling with a fixed coefficient, because of its simplicity, is still considered suitable for a variety of applications and is therefore utilized in this design as well. since our test signal is known in advance, a fixed scaling coefficient is to be determined. finding the maximum and minimum of the function given in (9), one can determine the range of the input signal, i.e. rngin = [−3.73, 3.73]. using time domain simulation, a scaling factor of four turned out to be suitable. this can also be concluded intuitively when looking at the range of the filter’s coefficients. namely, dividing rngin by four gives rngnew = [−0.93, 0.93], which is smaller than half of the filter’s coefficient range rng = [−2, 1.99993896484375], leaving enough headroom for the values of the internal states to spread without reaching saturation. after filtering, the output is scaled up and the obtained results are presented in fig. 7.
fig. 7 time domain simulation of the bp filter vhdl model: a) output waveform, b) output spectrum
a sound representation of a signal’s spectrum using the fft usually requires a large number of samples. this inevitably leads to longer time domain simulation.
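the choice of the fixed scaling factor described above can be reproduced numerically. the python sketch below samples the four-tone test signal, finds its peak, and picks the smallest power-of-two scaling that brings the peak below one (half of the q[16 14] range). the sampling frequency FS is a hypothetical value chosen for the illustration, since it is not given in this part of the text, and the helper names are assumptions as well.

```python
import math

F0, F1, F2, F3 = 3000.0, 374.60, 749.97, 5999.76   # test tones in hz
FS = 48000.0       # hypothetical sampling frequency (assumption, not from the text)
NFFT = 65536       # number of samples, as used for the spectra of the model

def s_in(t):
    """Four-tone test signal: sum of unit-amplitude sines."""
    return sum(math.sin(2.0 * math.pi * f * t) for f in (F0, F1, F2, F3))

def scale_down(code, shift):
    """Scaling an integer word down by 2**shift is an arithmetic shift right."""
    return code >> shift

samples = [s_in(n / FS) for n in range(NFFT)]
peak = max(abs(v) for v in samples)

shift = 0
while peak / (1 << shift) >= 1.0:   # keep the scaled peak inside (-1, 1)
    shift += 1

print(peak, shift, 1 << shift)
```

because the scaling factor is a power of two, it costs only a shift in hardware, and the same shift in the opposite direction restores the output level after filtering.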
to minimize the duration of the time domain logic simulation, a smaller number of fft points is desired than in the case of the purely mathematical model, which simulates faster (fig. 5b, 5d). therefore, nfft = 16384 is chosen for representing the output signal spectrum in fig. 7b. it turns out that this number gives a satisfactory compromise between simulation time and fft accuracy. finally, comparing fig. 7a with fig. 5c and fig. 7b with fig. 5d, one can see that the time and frequency domain representations of the output signal obtained by simulating the mathematical and hardware models of the filter match. this shows that the hardware representation successfully implements the desired behavior of the designed filter.
6. conclusion
with this case study we intended to fulfill two main goals. first, we wanted to raise the awareness of the salient advantages of the cmac filtering functions as compared with their non-monotonic counterparts. to achieve this, we gave a short overview of the properties of cmac amplitude characteristics. the second goal was to give, for the first time, design results characterizing the amplitude characteristic of band-pass iir digital filters. accordingly, we went through several steps. first, we gave arguments on the choice of iir filters. then, we gave arguments for the parallel implementation of digital filters that was used throughout the design process. next, we described and exemplified the complete design procedure, including the verification steps needed to support the design decisions taken on the way. all that was performed on the example of a band-pass cmac iir digital filter, a solution that is reported here for the very first time.
acknowledgement: this research was funded by the ministry of education, science and technological development of republic of serbia under contract no. tr32004.
references
[1] d. topisirović, v. litovski, and m.
andrejević stošović, “unified theory and state-variable implementation of critical-monotonic all-pole filters,” international journal of circuit theory and applications, vol. 43, no. 4, pp. 502–515, 2015. [2] jf. kaiser, digital filters, in: ff. kuo, jf. kaiser (eds.) system analysis by digital computer. wiley: new york, 1996, chapter 7, p. 245. [3] d. mirković, m. andrejević stošović, p. petković and v. litovski, “iir digital filters with critical monotonic pass-band amplitude characteristic,” aeu international journal of electronics and communications, vol. 69, no. 10, pp. 1495-1505, oct. 2015, issn 1434-8411. [4] ds. humpherys, the analysis, design, and synthesis of electrical filters, prentice-hall, 1970 [5] k. geher, theory of network tolerances, akademiai kiadó: budapest, hungary, 1971. [6] v. litovski, “synthesis of monotonic passband sharp cutoff filters with constant group delay response,” circuits and systems, ieee transactions on, vol. 26, no. 8, pp. 597–602, aug 1979. [7] rabey j. low power design essentials. springer science + business media, llc: new york, 2009. [8] mf. quélhas, a. petraglia, mr. petraglia, “efficient group delay equalization of discrete-time iir filters,” in proceedings of the xii european signal processing conference, eusipco-2004, vol. 1, vienna, austria, 2004, pp. 125-128. [9] rabrenović d, jovanović v. low-pass filters with critical monotonic magnitude. publications of faculty of electrical engineering, eta series: belgrade, 1973, pp. 59–68. [10] b. d. rakovich, “designing monotonic low-pass filters – comparison of some methods and criteria,” international journal of circuit theory and applications, vol. 2, no. 3, pp. 215–221, 1974. [11] s. butterworth, “on the theory of filter amplifiers,” experimental wireless and the wireless engineer, vol. 7, 1930, pp. 536-541. [12] a. papoulis, “optimum filters with monotonic response,” in proceedings of the ire, vol. 46, no. 3, pp. 606–609, 1958. [13] a. 
papoulis, “on monotonic response filters,” proceedings of the ire, vol. 47, pp. 332–333, 1959. [14] m. fukada, “optimum filters of even orders with monotonic response,” circuit theory, ire transactions on, vol. 6, no. 3, pp. 277–281, 1959. [15] p. halpern, “optimum monotonic low-pass filters,” circuit theory, ieee transactions on, vol. 16, no. 2, pp. 240–242, may 1969. [16] b. rakovich and v. litovski, “least-squares monotonic lowpass filters with sharp cutoff,” electronics letters, vol. 9, no. 4, pp. 75–76, february 1973. [17] d. mirković, p. petković, and v. litovski, “a second order s-to-z transform and its implementation to iir filter design,” compel the international journal for computation and mathematics in electrical and electronic engineering, vol. 33, no. 5, pp. 1831–1843, 2014. [18] w. park, k.-s. park, and h.-m. koh, “active control of large structures using a bilinear pole-shifting transform with h∞ control method,” engineering structures, vol. 30, no. 11, pp. 3336–3344, 2008. [19] h. orchard and g. c. temes, “filter design using transformed variables,” circuit theory, ieee transactions on, vol. 15, no. 4, pp. 385–408, 1968. [20] p. geffe, “designers guide to active bandpass filters,” part iii’, edn, vol. 19, no. 7, 1974. [21] k. k. parhi, scaling and round-off noise in: vlsi digital signal processing systems: design and implementation. john wiley & sons, 2007, chapter 11. [22] k. prasad and p. sathyanarayana, “signal scaling in cascade digital filters,” circuits, systems and signal processing, vol. 8, no. 4, pp. 421–426, 1989. instruction facta universitatis series: electronics and energetics vol. 30, n o 2, june 2017, pp. 179 185 doi: 10.2298/fuee1702179d a new lumped element bridged-t absorptive bandstop filter  suhash c. dutta roy formerly at the department of electrical engineering, indian institute of technology, delhi, new delhi india abstract. 
following a brief review of previous work on bandstop filters, the inadequacy of a recent work to obtain a perfect notch or perfect absorption at the notch frequency ω0 is demonstrated. a simple and elegant alternative solution, based on purely analytical arguments, is then presented. the resulting network is shown to achieve perfect matching as well as perfect absorption at the notch frequency and has several other advantages. a comparison has also been made with the conventional bridged-t band-stop filter. key words: bandstop filter, bridged-t network, circuit design. 1. introduction bandstop filters are circuits which reject, to within a specified tolerance, a band of frequencies around a centre frequency at which there is complete rejection. such filters are known by various names, such as band rejection filters, notch filters, null networks etc. and are required in many situations in communication and instrumentation. bandstop filters have fascinated a large number of researchers, including the present author, who has written papers on the analysis [1-7], design [8] and its limitations [9], and analysis and applications of dual input techniques to such filters [10-12]. all these contributions relate to analog circuits. bandstop filters are also required in digital signal processing, and the author and his students have done extensive work on digital notch filters, using both fir and iir techniques [13-21]. of these, [21] is a review of fir notch filter design, which appeared in this journal. at low frequencies, passive rc networks are mostly used, except in situations where a selectivity, defined as (notch frequency)/(3 db stop bandwidth), is required to be more than half. in the latter cases, either active rc filters or lc networks are to be used. for high frequencies, lc networks are easily designed and implemented. 
at microwave frequencies, distributed networks are preferred over lumped networks, although the latter have the advantage of occupying less space, and as is well known, space is at a premium in microwave integrated circuits. examples of lumped element microwave bandstop filters can be found in [22-29], while bandstop filters with distributed elements can be found in [30,31].
received november 3, 2016
corresponding author: suhash c. dutta roy, department of electrical engineering, indian institute of technology, 164, hauz khas apartments, new delhi 110016, india (e-mail: s.c.dutta.roy@gmail.com)
2. scope and organization of the paper
this paper is concerned with the design of a band-stop filter which achieves a perfect notch and perfect absorption at some frequency ω0. in this context, we first demonstrate, in section 3, the inadequacy of a recent solution proposed by chieh and rowland [32], by network theoretic arguments. in section 4, we present a new, simple and elegant alternative design, based on purely analytical arguments. the resulting network is shown to achieve perfect matching as well as perfect absorption at the notch frequency, and has several other advantages. a normalized design is discussed in section 5, and the simulation results are presented. a comparison of the new design with the conventional bridged-t bandstop filter is made in section 6. finally, section 7 gives the concluding comments.
3. chieh and rowland's design
chieh and rowland [32] proposed the symmetrical network of fig. 1 where
$z_1(j\omega) = 1/(j\omega c)$, (1a)
$z_2(j\omega) = r_2 + j\omega l_2 + 1/(j\omega c_2)$ (1b)
and
$z_3(j\omega) = r_1 + j\omega l_1 + 1/(j\omega c_1)$, (1c)
and both z2 and z3 resonate at the same frequency ωo. for ready reference, we reproduce here the expressions for the z-parameters of the network and the scattering parameters, in slightly different forms:
fig.
1 the bridged-t network
$z_{11} = z_{22} = z_2 + (z_1^2 + z_1 z_3)/(2 z_1 + z_3)$, (2)
$z_{12} = z_{21} = z_2 + z_1^2/(2 z_1 + z_3)$, (3)
$s_{12} = s_{21} = 2 z_{21} z_o/[(z_{11} + z_o)^2 - z_{21}^2]$, (4)
and
$s_{11} = s_{22} = (z_{11}^2 - z_o^2 - z_{21}^2)/[(z_{11} + z_o)^2 - z_{21}^2]$. (5)
note from (2) and (3) that
$z_{11} = z_{21} + z_1 z_3/(2 z_1 + z_3)$. (6)
from (4) and (5), we observe that for a perfect notch as well as perfect absorption at the frequency ωo, we require
$z_{21}(j\omega_o) = 0$ (7)
and
$z_{11}(j\omega_o) = z_o$. (8)
from (1), we have
$z_1(j\omega_o) = 1/(j\omega_o c)$, $z_2(j\omega_o) = r_2$, and $z_3(j\omega_o) = r_1$. (9)
substituting these values in (3) gives, on simplification,
$z_{21}(j\omega_o) = r_2 + 1/[j\omega_o c (2 + j\omega_o c r_1)]$, (10)
which cannot be made zero. also, under this condition,
$z_{11}(j\omega_o) = r_2 + (1 + j\omega_o c r_1)/[j\omega_o c (2 + j\omega_o c r_1)]$, (11)
which cannot be equal to zo if the latter is purely resistive, which is usually the case. equation (8) can be satisfied only if zo is a complex series rc impedance. thus the network of fig. 1 with the element values given by (1) can achieve neither a perfect notch nor perfect absorption.
4. the new design
the problem to be solved can be restated as follows: given ωo and ro and the network topology of fig. 1, find z1, z2 and z3 such that
$z_{21}(j\omega_o) = z_2(j\omega_o) + [z_1(j\omega_o)]^2/[2 z_1(j\omega_o) + z_3(j\omega_o)] = 0$, (12)
and
$z_{11}(j\omega_o) = z_{21}(j\omega_o) + z_1(j\omega_o) z_3(j\omega_o)/[2 z_1(j\omega_o) + z_3(j\omega_o)] = r_o$, (13)
where zo has been assumed to be resistive, equal to ro. in view of (12), (13) reduces to
$z_{11}(j\omega_o) = z_1(j\omega_o) z_3(j\omega_o)/[2 z_1(j\omega_o) + z_3(j\omega_o)] = r_o$. (14)
from (14), z3 is expressed in terms of z1 as
$z_3(j\omega_o) = 2 r_o z_1(j\omega_o)/[z_1(j\omega_o) - r_o]$. (15)
combining this with (12) and simplifying, we get
$z_2(j\omega_o) = [r_o - z_1(j\omega_o)]/2$. (16)
we can now choose a z1. if we take z1(jωo) = 1/(jωoc), as in [32], then (15) gives, on simplification,
$z_3(j\omega_o) = [2 r_o/(1 + \omega_o^2 c^2 r_o^2)] + j\omega_o [2 c r_o^2/(1 + \omega_o^2 c^2 r_o^2)]$, (17)
which represents a series combination of an inductance l3 and a resistance r3, where
$r_3 = 2 r_o/(1 + \omega_o^2 c^2 r_o^2)$ and $l_3 = 2 c r_o^2/(1 + \omega_o^2 c^2 r_o^2)$.
(18)
similarly, (16) gives
$z_2(j\omega_o) = (r_o/2) + j\omega_o/(2 \omega_o^2 c)$, (19)
which also represents a series combination of an inductance l2 and a resistance r2, where
$r_2 = r_o/2$ and $l_2 = 1/(2 \omega_o^2 c)$. (20)
in theory, c can be chosen to have any value, but as we shall see, it will be most convenient to choose c from the expression for r3 given in (18), which gives
$c = [(2 r_o/r_3) - 1]^{1/2}/(\omega_o r_o)$. (21)
note that if we choose
$c = 1/(\omega_o r_o)$, (22)
then r3 becomes equal to ro. also, under this condition, (18) and (20) give
$l_3 = r_o/\omega_o$ and $l_2 = r_o/(2 \omega_o)$. (23)
this choice of c is advantageous because then z3 can be obtained by a series combination of z2 and z2, and there is no spread in the element values of the network. also note that lossy inductors can be used with ease because their losses can be absorbed in their series resistances. finally, the element values of the network are consolidated as
$c = 1/(\omega_o r_o)$, $l_3 = 2 l_2 = r_o/\omega_o$ and $r_3 = 2 r_2 = r_o$. (24)
fig. 2 the normalized design of the absorptive bandstop filter
5. a normalized design
it is always convenient to have a normalized design which can be denormalized by impedance and frequency scaling. let ro = 1 ohm and ωo = 1 rad/sec. then (24) gives the element values as
c = 1 f, l3 = 2 l2 = 1 h and r3 = 2 r2 = 1 ohm. (25)
the resulting network is shown in fig. 2. this network has been simulated with matlab and the obtained plots of │s11(jω)│ and │s21(jω)│ are shown in fig. 3. these plots exactly match the theoretical predictions.
6. comparison with the conventional bridged-t bandstop filter
it may be noted that compared to the network proposed in [32], the conventional bridged-t bandstop filter [3] performs better because it achieves a perfect notch, although not perfect absorption. in this network, z1(jω)=1/(jωc), z2(jω)= r+jωl, and z3(jω)=r.
(26)
the network then achieves a perfect notch at ω=[2/(lc)]^{1/2} under the condition l=crr, but it cannot achieve s11(jωo)=0 unless zo is a parallel combination of a capacitor c and a resistor r/2, which is not the usual case. also, if we choose r=r, then there is no spread in the component values. further, as in the proposed alternative, a lossy inductor can be used here. in addition, in comparison with the networks of [32] and that proposed here, it uses the least number, viz. three, of reactive elements, yielding a transfer function of order three.
fig. 3 performance of the normalized design. the upper curve is a plot of │s21(jω)│ and the lower curve represents │s11(jω)│
7. concluding comments
it has been shown that the network proposed in [32] achieves neither a perfect notch nor perfect absorption. an alternative solution is proposed here purely by analytical, rather than physical or heuristic, arguments, which achieves these two objectives simultaneously. the element values are obtained very simply, rather than by numerical and parametric methods as in [32]. also, the new solution uses only two capacitors, instead of four, which reduces the order of the transfer function by two. by an appropriate choice of the elements, there is no spread in the element values. a normalized design has been presented and the resulting characteristics of │s11(jω)│ and │s21(jω)│ have been plotted. a comparison of the two circuits has also been made with the conventional bridged-t bandstop filter.
acknowledgement: the author thanks professor y. v. joshi for his help in the preparation of the manuscript and performing the simulation.
references
[1] s. c. dutta roy, “analyzing the parallel-t rc network – yet another method”, iete j. educ., vol. 44, pp. 111-116,
[2] s. c. dutta roy, “a quick method of analyzing parallel ladder networks”, int. j. elect. eng. educ., vol. 13, pp. 70-75, jan. 1976.
[3] s. c.
dutta roy, “miller’s theorem revisited”, circuits, syst. signal process., vol. 19, pp. 487-499, dec. 2000.
[4] s. c. dutta roy, “on second order digital bandpass and bandstop filters”, iete j. educ., vol. 49, pp. 59-63, may-aug. 2008.
[5] s. c. dutta roy, “interference rejection in a uwb system: an example of lc driving point synthesis”, iete j. educ., vol. 50, pp. 55-58, may-aug. 2009.
[6] s. c. dutta roy, “on some three terminal lumped and distributed rc null networks”, ieee trans. circuit theory, vol. ct-11, pp. 98-103, mar. 1964.
[7] s. c. dutta roy and b. a. shenoi, “notch networks using distributed rc elements”, proc. ieee, vol. 54, pp. 1220-1221, sept. 1966.
[8] s. c. dutta roy, “on the design of parallel-t resistance capacitance networks for maximum selectivity”, j.ite, vol. 8, pp. 218-223, sept. 1962.
[9] s. c. dutta roy, “parallel-t rc network: limitations of design equations and shaping the transmission characteristic”, indian j. pure appl. phys., vol. 1, pp. 175-181, may 1963.
[10] s. c. dutta roy, “dual input null networks”, proc. ieee, vol. 55, pp. 221-222, feb. 1967.
[11] s. c. dutta roy and n. choudhury, “an application of dual input networks”, proc. ieee, vol. 56, pp. 647-646. may 1970.
[12] s. c. dutta roy and r. p. sah, “dual input distributed rc notch filter”, ind. j. pure appl. phys., vol. 9, pp. 762-763, sept. 1971.
[13] s. c. dutta roy, s. b. jain and b. kumar, “design of digital notch filters”, iee proc. – vision, image signal process., vol. 141, pp. 334-338, oct. 1994.
[14] b. kumar, s. b. jain and s. c. dutta roy, “on the design of fir notch filters”, iete j. res., vol. 43, pp. 65-68, jan.-feb. 1997.
[15] s. b. jain, b. kumar and s. c. dutta roy, “design of fir notch filters by using bernstein polynomials”, int. j. circuit theory applic., vol. 25, pp. 135-139, mar.-apr. 1997.
[16] s. c. dutta roy, s. b. jain and b. kumar, “design of digital fir notch filters from second order iir prototype”, iete j. res., vol. 43, pp. 275-279, jul.-aug. 1997.
[17] s. b. jain, b. kumar and s. c. dutta roy, “semi-analytic method for the design of digital fir filters with specified notch frequency”, signal process., vol. 59, pp. 235-241, 1997.
[18] y. v. joshi and s. c. dutta roy, “design of iir digital notch filters”, circuits, syst. signal process., vol. 16, pp. 415-427, 1997.
[19] y. v. joshi and s. c. dutta roy, “design of lift notch filters with different passband gains”, iee proceedings – vision, image signal process., vol. 147, pp. 11-19, feb. 1998.
[20] y. v. joshi and s. c. dutta roy, “design of iir multiple notch filters based on all-pass filters”, ieee trans. circuits syst.-ii: trans. briefs, vol. 46, pp. 134-138, feb. 1999.
[21] s. c. dutta roy, b. kumar and s. b. jain, “fir notch filter design – a review” (invited paper), facta universitatis (nis) – series: electron. energ., vol. 14, pp. 295-327, dec. 2001.
[22] a. s. alkanhal, “compact bandstop filters with extended upper passbands”, active and passive components, vol. 2008, doi: http://dx.doi.org/10.1155/2008/356049.
[23] o. p. gupta and r. j. wenzel, “design tables for a class of optimum new bandstop filters”, ieee trans. microw. theo. tech., vol. 18, pp. 402-404, july
[24] k. s. k. yeo and p. vijaykumar, “quasi-elliptic microstrip bandstop filters using tap coupled open loop resonators”, prog. electromag. res., vol. 35, pp. 1-11, 2013.
[25] y. s. mezaal, h. t. eyyuboglu and j. k. ali, “wide bandpass and narrow bandstop microstrip filters based on hilbert fractal geometry: design and simulation results”, plos one, vol. 9, e115412, 2014.
[26] m. m. bait-sawailam, “miniaturized bandstop filters using slotted complementary networks”, int. j. dig. inform. and wireless commun., vol. 4, pp. 401-407, mar. 2014.
[27] d. r. jachowski, “narrowband absorption bandstop filtres with multiple signal paths”, us patent no. 7323955b2, pub: jan. 29 2008.
[28] w. m. pathelbab and m. b.
steer, “design of bandstop filters utilising circuit prototypes”, iee microw. ant. propag., vol. 1, pp. 523-526, march 2007.
[29] d. r. jachowski, “tunable lumped element notch filter with constant bandwidth”, in proc. of the ieee int. wireless inf. technol. syst. conf., pp. 1-4, 2010.
[30] t. c. lee, j. lee, e. j. naglich and d. peroulis, “octave tunable lumped element notch filter with resonator q-independent zero reflection coefficient”, in proc. of the ieee mtt-s int. digest, pp. 1-4, 2014.
[31] j. lee, t. lee and w. j. chappel, “lumped element realization of absorptive bandstop filter with anomalously high spectral isolation”, ieee trans. microw. theo. tech., vol. 60, pp. 2424-2430, aug. 2012.
[32] j. s. chieh and j. rowland, “quasi-lumped element bridged-t absorptive bandstop filter”, ieee microw. wireless compon. lett., vol. 26, pp. 264-266, apr. 2016.

facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 137-151 doi: 10.2298/fuee1401137b
evaluating system security using transaction level modelling
aisha bushager 1, mark zwolinski 2
1 department of information systems, college of information technology, university of bahrain, bahrain
2 electronics and computer science, university of southampton, southampton so17 1bj, uk
abstract. the design of secure systems requires the use of security analysis techniques. security objectives have to be considered during the early stages of system development and design; an executable model will give the designer the advantage of exploring the vulnerabilities early, and therefore enhancing the system security. in this work we create an executable model of a smart card system using systemc with the transaction level modelling (tlm) extensions. the model includes the security protocols and transactions. the model is used to compare a number of authentication mechanisms with different probabilities of failure.
in addition, a number of probable attacks, including theft of a private key and denial of service, were modelled to examine the vulnerabilities. the executable model shows that security protocols and transactions can be effectively simulated in order to design improvements to withstand different types of security attacks.

key words: security modelling, systemc, transaction level modelling, protocols, smart cards.

1. introduction

robust and secure system design requires the selection and implementation of a set of policies, procedures, architectures, technology, and personnel. however, there is no system that is 100% secure; there will always be a way to breach the system. the objective in security analysis is to identify the weak points. this requires modelling and simulation tools. we have used an executable model of a smart card system as an exemplar, including the security protocols and transactions, to allow examination of the security strengths and weaknesses by executing tests on the model. this paper extends work previously presented [1].

received january 12, 2014
corresponding author: mark zwolinski, electronics and computer science, university of southampton, southampton so17 1bj, uk (e-mail: mz@ecs.soton.ac.uk)

2. related work

security protocols are sets of rules designed to ensure particular security goals. however, designing and implementing these protocols is difficult and they may fail against various attacks. to be able to integrate the security protocols effectively at early stages of development, modelling languages and techniques are used to better visualize the entire system. one such modelling tool is communicating sequential processes (csp), a process algebra that provides a mathematical framework for describing and analysing security properties and protocols [2]. however, to be able to use csp, the designer must have specialized knowledge and training, which limits the usage of this method.
gspml [3] is a visual security protocol modelling language. again, this language introduces notations and complex models that are targeted at security specialists. stereotypes and tags are used to create and present security requirements and assumptions; constraints may be attached, and these must be satisfied by the modelling elements that carry the related stereotype [4]. the unified modelling language (uml) version 2.0 has been widely used to model security protocols [5]. for example, umlsec [4], [6] is an extension to uml for integrating security related information into uml specifications, by specifying security requirements through stereotypes, tagged values, and constraints [7]. an adversary can be created in umlsec to model possible threats to a system. umlsec was used to find possible vulnerabilities in common electronic purse specifications (ceps) [4]; it was also used to define security permissions that enforce restrictions on the workflows of a system [8].

none of the above modelling languages provides an automatic transition from design to code implementation. a designer would like to have an executable model that allows better testing of the designed model and therefore bridges the gap between the design phase and the code implementation phase. in our work, an executable model is produced using systemc with the tlm extensions [9]. systemc has been used to produce a methodology to simulate security attacks on smart cards with fault injection [10], and it has also been used to create an environment for design verification of smart cards using security attack simulation [11]. in tlm, communication among computational components is modelled by channels, and transaction requests are handled by calling interface functions of these channel models [12].

3. using uml to model smart card transactions

as an illustration of our methodology, we use a smart card system.
because smart cards are used to store sensitive data such as pins, passwords, and keys, they are likely targets for criminal attacks. the main purpose of an attack is to get hold of this data. attackers might perform various types of attack, in varying numbers, on the smart card system.

3.1. overview of a smart card system

figure 1 is a use case diagram that gives an overview of the basic components and functions of any smart card system. the use case diagram is a behavioural uml diagram that presents the system functionality. in our system, the actors illustrated in the figure represent the main components of the system, which are the user, smart card, smart card reader, client, server, and database. the use cases represent the functions or services that take place while the system is operating. the focus of the analysis in this study will be on the functions of three main components, which are the user, the smart card, and the smart card reader.

fig. 1 overview of a smart card system

the system combines three security mechanisms and a smart card ("what the user has"). the mechanisms are pin, biometrics, and pki. the first two mechanisms are responsible for user identification and verification: a pin is "what the user knows", and the biometrics are "who the user is". pki verifies the devices in the system. when the user decides to use the smart card, the first step is to insert the smart card in the smart card reader. the smart card reader has a number of jobs: it has to verify and authenticate the user and smart card, commit transactions, and exchange and confirm the user details with the other system components. to be able to demonstrate the transactions of the system, another type of uml diagram has to be used (figure 2). the following sections describe the registration phase and the verification phase of the smart card system and the potential threats and attacks.

3.2.
smart card registration system

to be able to demonstrate the transactions and message sequence between the smart card system objects, a sequence diagram is used (e.g. figure 2), which is a behavioural diagram that shows the interactions of system processes. the user provides the required information along with the biometric evidence. the system then saves the user details in the smart card and captures the fingerprint, which is the biometric method used in the proposed design, and produces a template that is stored in the system and the smart card. then, the registration system requests a pin from the user to be used in future verification processes.

fig. 2 registration phase in pin, biometrics (fingerprint), and pki smart card system

the pin is stored in the smart card for future verification. finally, the smart card system requests a private key from the certificate authority (ca) to generate a digital signature [13]. the ca, on the other hand, requests user verification from the registration system and generates a pair of keys for the user. the ca also issues a digital certificate corresponding to the public key, and sends the private key to the smart card to generate a digital signature that combines the private key and the biometric template of the user.

3.3. smart card system verification

figure 3 shows the transactions that take place when the user uses the smart card in a security environment that combines pin, biometrics, and pki security methods. the sender first enters the pin; the smart card reader extracts the stored pin from the smart card and starts the comparison process. if the match is successful, the smart card reader asks for another proof, which is the sender's fingerprint; otherwise, the transaction is aborted after allowing the sender three attempts to enter the pin.
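the pin comparison step with its three-attempt limit can be sketched as a short retry loop. this is a minimal illustration in python, not the authors' systemc code: the function name `verify_pin`, the `max_attempts` parameter, and the choice of a constant-time comparison are assumptions made for the sketch.

```python
import hmac
from typing import Iterable


def verify_pin(stored_pin: str, entered_pins: Iterable[str], max_attempts: int = 3) -> bool:
    """Return True if one of the first max_attempts entries matches the stored PIN.

    Hypothetical helper: the source only states that the transaction is
    aborted after three attempts; everything else here is illustrative.
    """
    for attempt, entered in enumerate(entered_pins, start=1):
        if attempt > max_attempts:
            break  # attempt limit reached: abort the transaction
        # Constant-time comparison (an assumption, chosen to avoid leaking
        # information through timing during the PIN comparison).
        if hmac.compare_digest(stored_pin.encode(), entered.encode()):
            return True
    return False
```

in this sketch a correct pin on the second try is accepted, while a correct pin offered only as a fourth entry is rejected because the attempt limit has already been reached.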
the sender scans the finger through the smart card reader scanner; the reader extracts the sender's biometric feature and produces a template. the matching process then takes place, and the result decides whether the sender has permission to access the system or not. if the match is true, the smart card releases the sender's private key. next, the sender starts to send a message to the receiver; the message is digitally signed with the sender's private key, and the system requests the receiver's public key from the ca to encrypt the message. the ca sends the digital certificate and the message is encrypted using both the sender's private key and the receiver's public key; the digital envelope is then ready to be sent securely to the receiver. finally, the receiver sends a request to the ca to get the sender's public key to decrypt the message. again, using both the sender's public key and the receiver's private key, the receiver is able to decrypt the message successfully.

fig. 3 verification processes in pin, biometrics (fingerprint), and pki smart card system

these security methods should achieve the security goals of confidentiality, integrity, authentication, and non-repudiation. however, each mechanism has its pros and cons. for example, fingerprints have disadvantages: how can we know that the biometric provided is not subject to misuse? if a user were clever and powerful enough to fool the system with a false fingerprint, the system would be breached and an intruder would have access to the real user's credentials and privileges. the pki method has its disadvantages as well. if one breach takes place during the transaction, the sender and the receiver can both suffer security loss.

3.4. smart card system threats

threats are the possible means by which a security policy may be breached [14].
a threat source can be any person, thing, event, or idea that poses danger to an asset within a system in terms of confidentiality, integrity, availability, or legitimate use. moreover, threats can be deliberate or accidental [14]. if deliberate, a threat can be categorized as passive, such as network sniffing, or active, such as negligence, errors, attempts to gain unauthorized access to the system, or changing the value of a particular transaction by malicious persons. therefore, possible threats on a smart card system include unauthorized system access, hacking and system intrusion, information leakage or theft, integrity violation (errors and omissions by insiders or outsiders), distributed denial of service, illegitimate use (dishonest or disgruntled insiders or outsiders), system penetration and tampering. threat sources have different motivations that may lead to various attacks on any government or business information system; therefore, the parties involved in the smart card system must be familiar with the human threat environments and their different motivations.

3.5. possible attacks on a smart card system

attacks may occur at every stage of a product's lifecycle, from the development stage, through the manufacturing stage, to actual usage. attacks that take place at the development stage and the manufacturing stage of a smart card are most likely to be carried out by an insider [15]. attacks during the smart card use stage can be physical or logical [15]. physical attacks may manipulate the semiconductor itself and usually require equipment like microscopes, focused ion beams, etc. [16]. side-channel attacks consist of observing behaviour while the information is being processed and include timing analysis and power analysis [17].
in contrast, logical attacks, or so-called software attacks, do not attack the hardware properties directly; they are more focused on the communication and flow of information between the smart card and the terminal [15]. attackers can write malicious software that can be employed in a software attack on a smart card; for example, in smart cards that support java card it is possible to load and run software. examples of logical attacks are bug exploits, illegal bytecode, and attacks during pin comparison. other types of attacks take place during the authentication phase of the smart card system, where the user identity is authenticated using different types of authentication mechanisms like biometrics [18].

3.6. modelling attacks using umlsec

uml diagrams express the smart card system protocol and processes, represent the transactions that take place while messages are exchanged during the registration and verification processes, and show where the areas are that could be vulnerable to attacks. it is also essential, however, to test the model against possible attacks. umlsec was used to model attacks, using stereotypes such as secrecy and secure information flow along with their tags and constraints. an adversary type in umlsec can have a function called threat that allows the adversary to commit delete, read, and insert attacks. nevertheless, the model is still static and not executable.

4. animating the model using systemc tlm

systemc was developed to support the need for a language that improves the overall productivity for designers in the electronic systems field [9]. it supports the development of complex systems by the design and verification of hardware system components at a high level of abstraction. the systemc library is open source and written in c++. in addition, it contains a lightweight kernel that schedules the processes.
the systemc library provides concurrent and hierarchical modules, ports, channels, processes, and clocks. large designs are always broken down hierarchically to be able to manage complexity; structural decomposition of the simulated model in systemc is specified with modules. the module is the smallest container with state, behaviour, and structure for hierarchical connectivity [9]. within a module, we use a thread process, which is associated with its own thread of execution. once the thread starts executing, it is in complete control of the simulation until it chooses to return control to the simulator. hence, the thread process is used to model sequential behaviour [9]. a thread can pass control back to the simulator in two ways. one way is to exit (return), in which case the thread is stopped for good; the other way is to call wait. therefore, every thread contains an infinite loop and usually has at least one wait function.

the tlm library is built on top of systemc and allows abstract communications to be modelled in a structured manner. in tlm, communication between components is modelled by channels and transaction requests, which are implemented by calling interface functions of the channel models [12]. the initiator port and the target port are distinguished in tlm. an initiator is a module that creates new transactions and passes them on by calling a method of one of the core interfaces. the target is a module that receives the transactions from the initiator. a system component can be an initiator, a target, or an interconnect. the interconnect module accesses a transaction but does not act as an initiator or a target for that transaction; for example, routers can be interconnect modules in a system. another important element in tlm is the generic payload, which allows data abstraction.

4.1.
smart card system simulation

the executable model produced in our work shows the sequence of transactions that occur in the smart card system while the smart card is used; they correspond to the transactions in figure 3. hence, in the executable model, the smart card system objects (the lifelines in the uml diagram) are represented as modules in systemc, and the arrows are represented as tlm transactions. the modules have two types of socket: an initiator socket that is responsible for sending the transactions and a target socket that is responsible for receiving the transactions; both sockets are defined in the module structure. the sender module communicates with the smart card module and the smart card reader module. an initiator socket from the sender to the smart card is created, along with another initiator socket to the smart card reader module, to allow the sender to send transactions to both modules. the initiator is responsible for calling the transport function to send the payload to the target socket. on the other hand, a target socket is created and then registered in the constructor; the target socket receives the payload from the transport function for processing and response.

the next step is creating the threads that correspond to the processes taking place in each module, creating the payloads that are transferred from one module to another, creating functions, and setting events and variables. in the smart card executable model, the authentication methods used are pin and biometrics. the user, modelled as part of the sender module, enters the pin. if the pin is correct, the user enters the fingerprint. the number of attempts allowed for the sender is programmable. the executable model counts the number of attempts, and compares the inserted pin and fingerprint with the saved pin and fingerprint template in the smart card.
also, there is a time limit for inserting the pin and fingerprint; if it is exceeded, a timeout message appears. if the number of incorrect attempts exceeds the limit, the system blocks the smart card and saves the smart card id in the banned smart card list. errors in entering the correct pin vary: they could be wrong digits, taking a long time to insert the correct pin, or an attacker trying to insert the pin randomly. the same steps take place when entering a fingerprint. successful attempts at pin and fingerprint entry confirm that the sender is a legitimate user. therefore, when the sender passes the authentication step, the smart card releases the private key. then the transactions related to signing the message with the private and public keys take place, and finally the system sends the digitally signed message to the receiver.

in reality, the user enters the pin and scans the fingerprint through an input device like a keypad, biometric scanner, or touch pad. however, our executable model can randomise the pin and fingerprint entries, and also randomise the correct and incorrect time. a simple pseudo-random number generator is used to randomise the pin and fingerprint entries along with the correct and incorrect time in seconds. the simple random number generator is fast and makes it easy to adjust the ratios, change the range of sample smart cards to be tested, and modify the probabilities of failure. an arbitrary ratio of successful pin and fingerprint entries is used; it can be modified to allow flexibility in testing different probabilities of failure. the executable model has the smart card system objects and their related transactions. the lifelines in the uml diagram are represented as objects (modules in systemc), and the arrows are represented as transactions using tlm. the transitions in the output correspond to the transaction numbers in the uml diagram in figure 3.
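the randomised entry of pin and fingerprint can be illustrated with a small monte carlo sketch. it is written in python rather than systemc and is not the authors' implementation: the names `simulate_step` and `simulate_cards`, the uniform timing model, and the `max_entry_time_s` parameter are assumptions; only the failure probabilities (15% and 10%), the 10 second limit, and the three-attempt limit come from the text.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    passed: bool        # True if an entry was accepted within the allowed attempts
    wrong_entries: int  # entries rejected because the value did not match
    timeouts: int       # entries rejected because they took too long


def simulate_step(rng, p_fail, time_limit_s=10.0, max_entry_time_s=12.0, max_attempts=3):
    """Simulate one authentication step (PIN or fingerprint) for one card."""
    wrong = timeouts = 0
    for _ in range(max_attempts):
        # Assumed timing model: entry time is uniform on [0, max_entry_time_s].
        entry_time = rng.uniform(0.0, max_entry_time_s)
        if entry_time > time_limit_s:
            timeouts += 1
        elif rng.random() < p_fail:
            wrong += 1
        else:
            return StepResult(True, wrong, timeouts)
    return StepResult(False, wrong, timeouts)


def simulate_cards(n_cards, p_fail_pin=0.15, p_fail_bio=0.10, seed=1, **step_kwargs):
    """Run PIN followed by fingerprint for n_cards and count the outcomes."""
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    counts = {"granted": 0, "banned_pin": 0, "banned_bio": 0}
    for _ in range(n_cards):
        if not simulate_step(rng, p_fail_pin, **step_kwargs).passed:
            counts["banned_pin"] += 1   # card blocked at the PIN step
        elif not simulate_step(rng, p_fail_bio, **step_kwargs).passed:
            counts["banned_bio"] += 1   # card blocked at the fingerprint step
        else:
            counts["granted"] += 1
    return counts
```

calling `simulate_cards(3000)` returns the number of cards granted access and the number banned at each step for one seeded run. because the timing model is an assumption, the counts are not expected to reproduce the figures reported in the paper.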
the designer can observe the attempts to enter the right pin and biometric along with the required timing. this allows the testing of the effectiveness of the authentication methods used. by running the simulation on different numbers of smart cards with different probabilities of failure, it is possible to evaluate the effectiveness of each authentication method.

4.2. testing the authentication methods

validation of the authentication methods in the smart card system is based on two proposed models. the first model uses a pin followed by a biometric authentication method, while the second model reverses the sequence. the main reason for carrying out these correctness tests is to check that the simulation using the executable model is actually working. the purpose of these tests is to verify:
• the functionality/workability of the smart card simulation tool and the availability of test results;
• the reliability of the smart card simulation tool through simulation;
• the degree of flexibility in assigning thresholds and failure probabilities, which will assist in customising the simulation tool based on the industry and sector in which the smart card system will be used;
• the speed of testing, which allows users of the simulation tool to obtain results and manipulate thresholds with ease and flexibility.

the following tests have been performed:
1. pin followed by biometrics.
2. biometrics followed by pin.

for each of these tests, an arbitrary probability of failure has been assigned to each of the authentication methods. for example, the probability of failure for the pin is set at 15%, for the biometrics (fingerprint) it is set at 10%, and the time allowed for entering the correct pin and correct fingerprint is set at 10 seconds for each.
the reason for assuming that the pin has a slightly higher probability of failure is that the pin authentication method is weaker than the biometrics, and thus there is a higher probability of successful attacks and of user errors. the first test (pin followed by biometrics) used 100 to 3,000 smart cards. table 1 displays the results for the authentication method based on the scenarios of potential failure/error.

table 1 results from testing the pin followed by biometrics authentication method

remarks                               number of simulated smart cards
                                      100    500    1000   1500   2000   2500   3000
good pin decoded                      100    500    998    1490   1976   2464   2950
pin incorrect/re-enter correct pin    16     102    207    302    394    493    587
timeout error (pin)                   9      58     125    189    257    299    376
good bio decoded                      100    500    998    1490   1976   2464   2950
bio incorrect/re-enter correct bio    13     38     82     126    167    200    234
timeout error (bio)                   11     58     124    171    236    299    359

the results may be interpreted according to the industry and sector of use, which dictate the levels of acceptable thresholds and probabilities of failure. initially, when examining the relationship between the expected and observed results of failure attempts across all sample sizes, we are able to confirm that it is a linear relationship and that observed failure attempts are always below the expected range. in a sample of 3,000 cards, there are 963 failed pin attempts (587 incorrect entries plus 376 timeouts), which is over 30% of the sample size. this failure percentage alerts us to the vulnerability of the system. this entails a low level of acceptance of usage from both parties due to the increased risks represented by the use of this method. having such a high degree of risk and vulnerability in the system will expose it to numerous additional threats from different sources. the results of the expected and observed pin and biometric failure attempts are listed in table 2 and recorded as percentages of the total sample size.

table 2 percentages of expected and observed pin followed by biometrics failure attempts

number of smart cards          100    500    1000   1500   2000   2500   3000
percentage observed (pin)      8      11     11     11     11     11     11
percentage expected (pin)      15     15     15     15     15     15     15
percentage observed (bio)      8      6      7      7      7      7      7
percentage expected (bio)      10     10     10     10     10     10     10

when comparing the observed pin failure attempts to the biometrics failure attempts, it is noted that the percentages are 11% and 7%, respectively. although the difference is relatively small, it indicates that the pin authentication method requires additional monitoring, particularly in avoiding risks of external threats that pose potential harm to the users and to system confidentiality and privacy. furthermore, under the simulation of 1,000 smart cards, it is noted that two cards have been banned for reaching the maximum attempts of pin entry. however, as the sample size increases, the number of banned smart cards grows significantly, as illustrated in figure 4.

fig. 4 smart cards banned in pin and biometrics proposed model

for example, when simulating 3,000 smart cards, about 50 of them were banned during the pin authentication step. on the other hand, for the biometrics authentication method, it is noted that no smart cards have been banned. this is a clear indication of the level of security that the use of biometric authentication provides when adopted by smart cards, particularly ones that store and have access to sensitive data.

in the second test, the initial expectation is that the use of biometrics authentication first will decrease the possibility of failure attempts and attacks. this mechanism supports the security concept of using something you own (smart card), something you are (biometrics), and something you know (pin).
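as a cross-check on the first test, the observed percentages in table 2 can be recomputed from the counts in table 1. the source does not state how the percentages were derived; the sketch below assumes that failed entries (incorrect entries plus timeouts) are divided by three attempts per card, an inference that happens to reproduce every observed value in table 2 after rounding. the names `CARDS`, `ATTEMPTS_PER_CARD`, and `failure_percentages` are illustrative.

```python
# Counts copied from Table 1 (PIN followed by biometrics).
CARDS = [100, 500, 1000, 1500, 2000, 2500, 3000]
PIN_INCORRECT = [16, 102, 207, 302, 394, 493, 587]
PIN_TIMEOUT = [9, 58, 125, 189, 257, 299, 376]
BIO_INCORRECT = [13, 38, 82, 126, 167, 200, 234]
BIO_TIMEOUT = [11, 58, 124, 171, 236, 299, 359]

# Assumption (not stated in the source): each card contributes three
# possible attempts to the denominator.
ATTEMPTS_PER_CARD = 3


def failure_percentages(incorrect, timeout):
    """Failed entries as a rounded percentage of all possible attempts."""
    return [
        round(100 * (bad + late) / (ATTEMPTS_PER_CARD * cards))
        for bad, late, cards in zip(incorrect, timeout, CARDS)
    ]


pin_observed = failure_percentages(PIN_INCORRECT, PIN_TIMEOUT)
bio_observed = failure_percentages(BIO_INCORRECT, BIO_TIMEOUT)
```

under this assumption `pin_observed` evaluates to 8, 11, 11, 11, 11, 11, 11 and `bio_observed` to 8, 6, 7, 7, 7, 7, 7, matching the observed rows of table 2. the same counts give the 963 failed pin attempts quoted for 3,000 cards (587 + 376).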
[fig. 4 plot: failure attempts against number of smart cards (100 to 3000); series: max pin attempts/card banned, max fingerprint/card banned]

when using the biometrics authentication method before the pin, the number of banned smart cards is recorded at 7 and 2, respectively, for a sample size of 3,000 smart cards. this is low compared to when the pin is used prior to the biometrics, where the number of banned smart cards was 50 and 0, respectively, for a sample size of 3,000. given the benefits to the user and administrator, as well as the practicality of using the biometrics and pin authentication methods across most industries, it is recommended to adopt this method in the given order as it provides better security levels.

in summary, the executable model developed using systemc tlm allowed the designer to test the proposed models that support a combination of authentication methods; by running simulations on different numbers of smart cards with different authentication methods and recording the results, the designer can examine the robustness of the proposed models in terms of enhancing security, specifically during the phase of authenticating the smart card system users. the simulation tool provided a quick, automated, and flexible environment to test the proposed models, in addition to allowing the designer to observe and modify the transactions whenever changes are required. testing the proposed model against physical and logical attacks while the smart card is in use showed that an attacker has the chance to get hold of the user's private key, and therefore to violate a number of security properties like authentication, confidentiality, privacy, and integrity. this in essence shows that the system is vulnerable to threats and to successful attacks.
yet, to reduce the probability of successful attacks, our approach allows the designer to modify the executable model and test it against future attacks.

4.3. simulating attacks on smart card system

there are different types of attacks that have different probabilities of occurrence and different consequences for the smart card system and its users. each attack targets different areas of the system and has a specific goal; some attacks violate the smart card system authentication, privacy, and confidentiality, like attacks on the pin or attacks on biometrics. other attacks violate the smart card system integrity, reliability, and even authentication, like invasive attacks, side channel attacks, etc. figure 5 is a uml sequence diagram that demonstrates the types of attacks that may occur in any smart card system, even though safeguards and controls like pin, biometrics, and pki are in place. the purple callouts represent the types of possible attacks that an attacker can carry out in that precise area; the red callouts represent the attacks that are created in the executable model to test the system robustness.

the executable model allows us to simulate an attack on the system. an attack on any part of the system is essentially another transaction inserted into the model. for example, to simulate an attack that allows the attacker to steal the private key released from the smart card object, which is coded as a state machine, an attacker is implemented as a class that can intrude into multiple modules in a thread-safe manner. thus, a transaction is effectively inserted into the model with one line of code at the appropriate point in the smart card module.

fig. 5 possible attacks on pin, biometrics (fingerprint), and pki smart card system

now, the model waits for transitions 1 to 8 to occur, and then the attacker interferes and attacks the system after transition 8, where the private key is released (figure 6).
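the idea that an attack is just one more transaction inserted at a chosen point can be illustrated outside systemc. the following python sketch is an analogy, not the authors' code: the classes `Model` and `Attacker`, the method `insert_after`, and the payload key are hypothetical names introduced for illustration, and the log format only loosely imitates the simulation output.

```python
from typing import Callable, Dict, List

# A hook receives the model and the payload of the transition it follows.
Hook = Callable[["Model", dict], None]


class Model:
    """Toy stand-in for the executable model: it only logs transitions."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self._hooks: Dict[int, List[Hook]] = {}

    def insert_after(self, transition: int, hook: Hook) -> None:
        """Register an extra transaction to run once a transition has ended."""
        self._hooks.setdefault(transition, []).append(hook)

    def run_transition(self, owner: str, transition: int, payload: dict) -> None:
        self.log.append(f"{owner}: begin transition {transition}")
        self.log.append(f"{owner}: end transition {transition}")
        for hook in self._hooks.get(transition, []):
            hook(self, payload)  # inserted transactions run here


class Attacker:
    """Hypothetical attacker that copies one field out of a payload."""

    def __init__(self) -> None:
        self.stolen: dict = {}

    def steal(self, field: str) -> Hook:
        def hook(model: Model, payload: dict) -> None:
            if field in payload:
                self.stolen[field] = payload[field]
                model.log.append(f"attacker stole the {field}")
        return hook
```

with this structure, arming the attack is a single call such as `model.insert_after(8, attacker.steal("private key"))`, which mirrors the one-line insertion described above; the rest of the model is left unchanged.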
    smartcard_reader_object: begin transition 8
    smartcard_reader_object: end transition 8
    sender_object: end transition 5
    attacker initialized, @104 s
    attacker stole the private key, @104 s
    smartcard_object: begin transition 9
    smartcard_object: end transition 9

fig. 6 simulated private key theft

in this example, the attacker has to conduct a physical or logical attack to be able to get hold of the private key. for example, the attacker can carry out a successful side channel attack, invasive attack, attack during pin comparison, or attack on biometrics. the executable model in this study does not simulate the physical or logical attack; it only assumes that a physical or logical attack has taken place. for that reason, it simulates an attack and creates an attacker class with features that allow the attacker to modify the transitions and, as a result, gain access to the user's secret information, specifically the private key.

another example of utilising the executable model in attack simulation is modelling another sort of attack, which is carried out on the key exchange operation. this time the attacker monitors the public keys exchanged between the users and the ca, and gets hold of the users' public keys. being able to interfere with the key exchange protocol opens a door for the attacker to carry out attacks that result in network disruption and loss of user trust, for example a man-in-the-middle attack [19] or a multi-protocol attack [20].
this example focuses on modelling an attack that allows the attacker to interfere with the transactions exchanged between the user and the receiver and to get hold of the data exchanged without either user knowing. by modelling the attack, it is possible to point out a gap in the protocol that allows an attacker to monitor the flow of data, interfere within the transactions, and get hold of the public keys exchanged (figure 7).

    smartcard_object: begin transition 13
    certificate_authority_object: begin transition 14
    certificate_authority_object: end transition 14
    attacker stole the receiver public key, @203 s
    smartcard_object: end transition 13
    smartcard_object: begin transition 15
    smartcard_object: end transition 15
    smartcard_object: begin transition 16
    smartcard_object: end transition 16
    receiver_object: begin transition 17
    certificate_authority_object: begin transition 18
    certificate_authority_object: end transition 18
    attacker stole the sender public key, @206 s
    receiver_object: end transition 17

fig. 7 simulated public key theft

a denial of service (dos) attack is simulated using the same model. the attack aims at violating the availability property of the system security. the dos attack takes place against the certificate authority server; the attacker attempts to exhaust the server, which results in the server being unable to provide its services to legitimate users. the following is part of the dos attack simulation output:

as the output shows, the transactions of the smart card system are running normally; however, when the dos attack successfully takes place, the service is denied and the attacker gets hold of the users' public keys exchanged among the system objects. in addition, the subsequent transactions fail to occur because the certificate authority server is unavailable.
this attack shows that the availability property has been violated: the system users will not be able to use their smart cards until the certificate authority server recovers from the attack. individual dos requests are indistinguishable from legitimate sign-in requests; the only difference lies in the frequency of sign-in attempts and their origin. a large number of sign-in attempts in rapid succession can be indicative of a dos attack. hence, smart card systems can be protected from dos attacks by identifying a high frequency of login attempts from a source and denying service to that source. another effective way is to limit the number of login attempts a user is allowed at a time.

in summary, the executable model developed using systemc tlm allowed the designer to test the proposed models that support a combination of authentication methods; by running simulations on different numbers of smart cards with different authentication methods and recording the results, the designer can examine the robustness of the proposed models in terms of enhancing security, specifically during the phase of authenticating the smart card system users. the simulation tool provided a quick, automated, and flexible environment to test the proposed models, in addition to allowing the designer to observe and modify the transactions whenever changes are required. the systemc tlm executable model also allowed the designer to discover the weak points of the system and point out vulnerabilities; the successful attacks indicate that there are weaknesses in the security protocol. to reduce the probability of successful attacks, the designer can modify the executable model to test against future attacks. in contrast with the uml diagram, the animation makes it possible to see the attack actually happening.
moreover, it is possible to make changes easily within the model and to try a number of attacks to test the system's robustness, simply by inserting transactions into the uml diagram and transforming them into transactions within the systemc tlm executable model.

5. conclusion

uml diagrams, along with their extensions, are an excellent way of modelling systems; they have features that show the designer how things should work. however, uml does not allow the designer to see what happens if something goes wrong with the system. therefore, to be able to see things happening and to reason about the system, simulation has to take place. systemc tlm was used to transform a static uml model into an executable model. the executable model provides the opportunity to see the transaction flow between the system objects in an animated manner. in addition, it allows the simulation of attacks in different parts of the system. the model gives a clear view of the weaknesses in the security requirements, methods, and protocols used in the smart card system.

references

[1] a. bushager and m. zwolinski, "modelling smart card security protocols in systemc tlm", in: embedded and ubiquitous computing (euc), 2010 ieee/ifip 8th international conference on, 2010, pp. 637–643.
[2] s. schneider, "security properties and csp", in: proceedings ieee symposium on security and privacy, 1996, pp. 174–187.
[3] j. mcdermott, "visual security protocol modeling", in: proceedings of the 2005 workshop on new security paradigms, nspw '05. new york, ny, usa: acm, isbn 1-59593-317-4, 2005, pp. 97–109.
[4] j. jürjens, "umlsec: extending uml for secure systems development", in: uml 2002 – the unified modeling language, 2002, pp. 412–425.
[5] object management group, introduction to omg's unified modeling language (uml), 2005. url: http://www.omg.org/gettingstarted/what_is_uml.htm.
[6] j. jürjens, "modelling audit security for smart-card payment schemes with umlsec", in: proceedings of sec 2001 – 16th international conference on information security, 2001, pp. 93–108.
[7] j. jürjens, "using umlsec and goal-trees for secure systems development", in: proceedings of the 2002 acm symposium on applied computing, 2002, pp. 1026–1031.
[8] j. jürjens, j. schreck, and y. yu, "automated analysis of permission-based security using umlsec", in: fundamental approaches to software engineering, 11th international conference, fase 2008, budapest, hungary, march 29-april 6, 2008, proceedings, 2008, pp. 292–295.
[9] ieee standard systemc language reference manual, ieee std 1666-2005.
[10] k. rothbart, u. neffe, c. steger, r. weiss, e. rieger and a. muehlberger, "high level fault injection for attack simulation in smart cards", in: proceedings of asian test symposium, 2004, pp. 118–121.
[11] k. rothbart, u. neffe, c. steger, r. weiss, e. rieger and a. muehlberger, "extended abstract: an environment for design verification of smart card systems using attack simulation in systemc", in: acm/ieee international conference on formal methods and models for co-design, 2005, pp. 253–254.
[12] l. cai and d. gajski, "transaction level modeling: an overview", in: proceedings of the 1st ieee/acm/ifip international conference on hardware/software codesign and system synthesis, codes+isss '03. new york, ny, usa: acm, isbn 1-58113-742-7, 2003, pp. 19–24.
[13] c. williams, "configuring enterprise public key infrastructures to permit integrated deployment of signature, encryption and access control systems", in: military communications conference, milcom 2005, ieee, 2005, vol. 4, pp. 2172–2175.
[14] r. j. anderson, security engineering: a guide to building dependable distributed systems, 2nd ed., wiley publishing, 2008, isbn 9780470068526.
[15] w. rankl, "overview about attacks on smart cards", information security technical report, 2003, vol. 8, pp. 67–84.
[16] k. markantonakis, m. tunstall, g. hancke, i. askoxylakis, and k. mayes, "attacking smart card systems: theory and practice", information security technical report, 2009, vol. 14, pp. 46–56.
[17] k. baddam and m. zwolinski, "evaluation of dynamic voltage and frequency scaling as a differential power analysis countermeasure", in: vlsid '07: proceedings of the 20th international conference on vlsi design. washington, dc, usa: ieee computer society, isbn 0-7695-2762-0, 2007, pp. 854–862.
[18] x. leng, "smart card applications and security", information security technical report, 2009, vol. 14, pp. 36–45.
[19] c. y. yang, c. c. lee and s. y. hsiao, "man-in-the-middle attack on the authentication of the user from the remote autonomous object", international journal of network security, 2005, pp. 81–83.
[20] a. m. johnston and p. s. gemmell, "authenticated key exchange provably secure against the man-in-the-middle attack", journal of cryptology, 2002, pp. 139–148.

facta universitatis
series: electronics and energetics vol. 29, no 3, september 2016, pp. 419-435
doi: 10.2298/fuee1603419m

using internet of things in monitoring and management of dams in serbia

rastko martać 1, nikola milivojević 1, vladimir milivojević 1, vukašin ćirović 1, dušan barać 2

1 institute for the development of water resources “jaroslav černi”, serbia
2 faculty of organizational sciences, university of belgrade, serbia

abstract. this paper discusses harnessing the internet of things in monitoring and managing dams in the republic of serbia. large dams are of major importance, primarily because of their use for electricity generation, but the risks associated with them must be taken seriously into account. there is a need to consolidate information related to dam facilities in order to use it for dam management in the republic of serbia.
an information system has been developed based on the existing systems, allowing the use of intelligent networked sensors. the aim of the paper is to describe the possibilities of applying the internet of things within a specific system for dam safety management. in order to facilitate the inclusion of a large number of intelligent sensors, a new data acquisition module for communication with sensors in the monitoring network is defined. the system should provide timely alerts in case safety parameters deviate from the expected values.

key words: internet of things, cloud, dams, dam safety management, monitoring, serbia

received july 10, 2015; received in revised form november 13, 2015
corresponding author: rastko martać, institute for the development of water resources “jaroslav černi”, jaroslava černog 80, belgrade, serbia (e-mail: mrastko@gmail.com)

1. introduction

most of the dams in serbia were built in the sixties and seventies of the 20th century. the safety risk increases with the age of the structure, which is why the management and safety of the facility have to be improved so that possible adverse situations can be considered in time [1]. it should be noted that these facilities are of vital importance for society: they are used to produce electricity, supply water to cities, provide flood control, and can assist river navigation. many dams are multipurpose, providing more than one of the above benefits. their damage or possible failure can cause serious consequences for the environment.

in order to provide support for the management of complex systems of hydro power plants, it is necessary to establish communication between metering systems and computer models. the complexity of the management of water resources is due to the conflicting demands of different users (hydropower, agriculture, etc.) for limited resources, and this complexity increases in extreme weather conditions, such as droughts and floods, which affect populated areas.

dam safety management is a long-term and continuous process that has to be improved permanently [2] [3] [4]. in this respect, procedures and processes of dam safety management must continually be improved in all aspects, both in terms of measuring equipment and in the management and use of data in the procedures for determining the safety of facilities. a modern system for dam safety management should be established, so that it primarily provides operational monitoring of dam safety in real time and enables operational conclusions on the status of dam safety practically on a daily basis. the whole concept of technical monitoring, with a posteriori reasoning after a few months, or even more than a year, loses much of its meaning and importance (the past practice was based on the preparation of periodic reports on the behavior of the dam).

the modern concept of dam safety management should rest on a physically based and software-supported technical system [5] [6]. the physical foundation of this concept relates to the provision of data of importance to the safety of the dam and the accurate measurement of relevant physical quantities, which are to be tracked on the dam with installed equipment for technical monitoring. today's level of information and telecommunication infrastructure enables the implementation of advanced systems for measuring, acquiring and archiving data. these systems should be able to collect monitoring data automatically, to perform data validation and to archive the data securely, so as to provide users with data in a unified and efficient manner. with long-term monitoring of instrument operation, a database of measurements obtained by reliable instruments could be formed. the implementation and use of iot on dams enables the creation of such databases of reliable instruments, which can give a more precise evaluation of dam safety.
internet of things (iot) is a network of physical objects with embedded electronics, software and sensors that allow users to obtain timely and accurate data through services for data exchange between manufacturers, users or other connected devices [7]. reliable data could enable users to react in the right way at the right time in case of critical situations or natural disasters, and in some cases to predict events.

the aim of the paper is to describe the possibilities of applying the internet of things within a specific system for dam safety management. the idea is to improve the system of data collection by implementing cloud computing and a wireless sensor network (wsn). all data processing would be moved to the cloud to free up local computer resources, while the wsn would provide more reliable data.

2. literature review

2.1. dam safety management

observing the safety of dams is one of the important measures for ensuring dam safety [8]. it is an important and indispensable activity in the operation and management of a dam. computer software plays a vital role in monitoring the safety of dams. many dam owners have developed information systems for dam safety management supervision to facilitate the management and analysis of data. fujian electric power company in east china has 27 dams of different types: concrete, earth, arch and embankment dams. all these dams are located in remote rural areas, making it difficult to manage safety information for all of them. it is therefore important to develop an information system for remote control of the monitoring system, which collects and transfers dam safety monitoring data so that all this information can be processed, analyzed and evaluated in order to reach an effective decision on the status of dam safety. such a remote information system was successfully developed jointly by all the participants in the business.
it was applied to a group of dams of fujian electric power company, where the staff can use the system for the analysis and evaluation of observation data.

lately, an increase in damage to and failures of dams due to aging, earthquakes and unusual changes in climate can be seen [9]. for these reasons, the safety of dams is gaining in importance every day in terms of disaster management at the national level. numerous organizations in the world are responsible for dam safety, among them: the international commission on large dams (icold), committee on dam safety and dam security (codss), association of state dam safety officials (asdso), the interagency committee on dam safety (icods), the national dam safety review board (ndsrb) and dam safety interest group (dsig). kwater (korea water resource corporation), which currently runs and manages 30 large dams, developed a system for dam safety (kdsms). this system is used for consistent and efficient management of dam safety. kdsms consists of data for a dam and reservoir, a hydrological information system, a management system for the area of control and data, a system of instruments and observations including the monitoring of earthquakes, a system for improving research and security, and the information system of the corporation.

for effective control of the dam life cycle, it is very important to implement real-time diagnosis and a reasonable estimate of dam safety based on prototype observation [10]. the development of iewsds (intelligent early-warning systems of dam safety) is an important approach to realizing this goal. huai-zhi su et al. regarded the dam as a vital and intelligent system and constructed a bionic model of dam safety, which consists of a system of observations (the nerves), central processing units (the brain), and tools for decision-making (the body). with the above-described model and system engineering, the authors have designed iewsds.
an intelligent machine that performs reasoning is the central processing unit of iewsds; it performs data analysis and applies the algorithms for diagnosis and assessment of dam safety. because of the persistent non-linear and dynamic characteristics, the system has adopted a combined model based on a wavelet network for approximating and predicting the behavior of the dam. the safety status of the dam changes dynamically, requiring qualitative and quantitative assessment of changes in its behavior. in their paper, huai-zhi su et al. propose an extended method of assessment [10]. the application shows that the bionic model is feasible and indicates the key technologies for its operation. such systems can provide technical support to improve dam safety management, prolonging the life of dams and avoiding accidents.

disposal of tailings is of great importance for mining, because the processing of ores produces a large amount of tailings [11]. in the past few years there have been catastrophic accidents at tailings dams and tailings mines, which have caused enormous damage and great loss of human life. to improve the safety of tailings dams, the tailings dam monitoring and pre-alarm system (tdmpas) was introduced. it is based on iot and cloud computing and can monitor the saturation line, the water level and the deformation of the dam in real time. tdmpas has helped engineers in the mines to monitor the dam 24/7 and to automatically receive pre-alarm information from remote locations in any weather conditions. tdmpas was applied at several mines and showed that its use in monitoring the physical condition of tailings dams was justified.

2.2. the application of iot

numerous works related to the application of iot in observation systems have been published.
for instance, the authors of [12] deal with a real-time localization system based on zigbee technology, intended to provide prompt support for the safe management of dam construction sites. the system is based on tracking technology using wireless sensors and a set of servers that run software for processing the collected data, visually monitoring the condition of the site in real time, and communicating remotely with other systems such as erp and crm. the low-power tracking technology is network hardware based on zigbee, combined with fingerprinting software. the proposed real-time system for observing employees was successfully implemented at the xiluodu arch dam construction site.

the implementation and development of the internet of things (iot) is closely connected with the construction of smart grids [13]. generally, using wireless communication and observation technology, all electrical devices can be connected to the iot, so that the smart grid becomes an interactive real-time electricity network. qiaoming zou et al. summarize the current state of that area and analyze the current structure and characteristics, as well as the key technologies that enable the implementation of the iot. the authors present a concrete analysis and discussion of the implementation of iot in asset management and in the automatic meter reading system of the smart grid, and give conclusions about the prospects for applying iot in smart grids.

the operating state of tailings ponds, which are an important production area in a mine, directly affects the safety of people and property, as well as production at the mine [14]. by building a system for the safety surveillance of tailings ponds using gis technology, one can not only manage the data and information on tailings scientifically and effectively, but also take full advantage of the computer's capacity for storing massive data.
the interactive operation of gis spatial query and analysis facilitates accurate and convenient search, management, alteration and statistics of data. by observing the height of the seepage line of the dam body, the water level in the pond, the index of dry coast, and the deformation and deviation of the dam body, information such as the fluctuation of the water level can be obtained promptly. this is important for timely forecasting of the stability of the dam body, and thus for achieving safe management of tailings ponds and early warning of danger.

lately, much attention has been paid to climate change and to the control and management of the environment, so the integrated information system (iis) is gaining in importance. the paper [15] presents a new iis that combines iot, cloud computing, geo-informatics (remote sensing rs, geographic information system gis, global positioning system gps) and e-science for monitoring and management of the living environment, with a case study of regional climate change and its environmental impact. in order to collect data and other information at the perception layer, multiple sensors and web services have been used. both private and public networks were used to access and transport mass data and other information in the network layer. the result of this case study shows a visible trend of increasing air temperature in xinjiang over the past 55 years and an apparently growing trend in rainfall since the early 1980s [15]. besides the correlation between environmental indicators and meteorological elements, the availability of water resources is a decisive factor for the terrestrial ecosystem in the area. the study shows that the iis contributed greatly, not only in terms of data collection using iot, but also through the use of web services and applications based on the cloud platform and e-science, and that the effectiveness of evaluation and monitoring can still be improved.
3. management and monitoring of large dam safety

in [16] and [17], the authors describe the current state of dams in the republic of serbia. over time, sensors cease to operate or provide inaccurate values, so it is necessary to replace them or install new, modern sensors. although the system of dam maintenance in serbia is not up to date and not fully equipped on all dams, the dams have not suffered serious disasters or major problems, primarily owing to the good design and the high quality of the works during their construction. however, despite the fact that so far there has been no major damage to individual structures that could have jeopardized their safety and stability or reduced their functionality, it must be kept in mind that, especially with aging dams, various problems can be expected to emerge, as already indicated by some peculiar features which will be described later.

most dams have technical monitoring systems that are essential for monitoring and controlling the state of the facilities. these systems generally date from the time the facilities were built and have since been neither significantly renewed nor further developed. often, those systems are technologically outdated, so failures of old instruments and gaps in the data needed for monitoring dam safety are not uncommon. the monitoring system of the dam thus becomes incomplete with respect to the type and frequency of monitoring. in the last few years, the reconstruction of monitoring systems (djerdap 1 [16], gruza [17]) has started. significant activities on these issues are expected in the forthcoming period.
it is necessary to establish a modern, functional and optimized system for technical monitoring of most of the remaining dams, in the form of an automatic telemetry acquisition system, which should allow continuous automatic measurement and recording of measurement data at a given time interval. in the future, the growing body of collected data will create conditions for a more detailed analysis of the condition and behavior of the facilities during operation, which is an integral part of the concept of dam safety management. it would enable a precise definition of the trend in dam behavior and should provide an opportunity for early detection of possible anomalies in the condition of the dam. a dam equipped in this way would offer good initial conditions for the development and application of a modern system for controlling dam safety.

on many dams in serbia, the state of the monitoring system can be assessed as partially satisfactory. this means that an assessment of the condition of the facilities can be made from all available observation results, but steps need to be taken to improve the monitoring. implementing a new, up-to-date monitoring system can provide a more accurate assessment of the state of the system.

in recent years, steps have been taken to improve the system of technical surveillance by implementing advanced information technologies and a software system for managing dam safety. because the first system is a prerequisite for the latter, a phased development in the area of dam safety in serbia should be expected. the complete system has been applied on the rock-fill dam prvonek, near vranje, while the realization of systems for the high concrete gravity dams "djerdap 1" and "djerdap 2" is in progress.
an advanced system for technical measurement usually provides the following mechanisms: automatic acquisition, validation, archiving and access to all relevant data obtained in the system of technical surveillance. the core of this system is an information system for technical measurement, whose purpose is to provide technical support in the collection, management and processing of measurement data. the aim is to merge diverse data from the entire system of technical monitoring in one place, each with a reliability score, and to make access to all data simple, interactive and fast.

establishing a system for dam safety management implies the existence of an advanced system of technical monitoring, and thus of the information system of technical surveillance. relying on the advanced system of technical surveillance as a source of reliable data, it is possible to develop a set of statistical models and physically based mathematical models, together with the accompanying mathematical apparatus for monitoring the state and analyzing the safety of the dam. the established system of dam safety management is used for:
• tracking and monitoring the behavior, which consists of continuous monitoring, measurement and determination of the agreement between measured values and their expected values,
• checking the dam safety, which may be initial, periodic or extraordinary, and refers to determining the condition of the facilities and their degree of safety.

the activity of monitoring and tracking behavior relies on established statistical models that, based on measured inputs, provide the expected value of a variable. if the measured value deviates from the expected one only within permissible limits, it can be concluded that the system has undergone no major changes.
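the comparison of a measured value with the expected value from a statistical model can be sketched as follows. this is only a minimal illustration in python under stated assumptions, not the model used in the described system: a single input (water level), a straight-line least-squares fit, a permissible limit of k residual standard deviations, the function names, and all sample numbers are assumptions made here.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Least-squares fit y ~ a + b*x; returns (a, b, sigma).

    sigma is the residual standard deviation (n - 2 degrees of freedom).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = sqrt(ssr / (len(x) - 2))
    return a, b, sigma


def check_measurement(model, x_new, y_measured, k=3.0):
    """Compare a measurement with the model's expected value.

    The permissible limit k * sigma is an assumption of this sketch.
    """
    a, b, sigma = model
    expected = a + b * x_new
    deviation = y_measured - expected
    return {
        "expected": expected,
        "deviation": deviation,
        "alarm": abs(deviation) > k * sigma,
    }


# Hypothetical history: reservoir level (m) vs. crest displacement (mm).
levels = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]
displacements = [2.0, 2.5, 2.9, 3.6, 4.0, 4.5]
model = fit_line(levels, displacements)

print(check_measurement(model, 105.0, 3.3))  # small deviation
print(check_measurement(model, 105.0, 4.2))  # large deviation
```

in an operational setting, a result with the alarm flag set would be the trigger for an extraordinary safety check rather than a conclusion about the dam by itself.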
in a modern automated system, this process is carried out daily and raises an alarm when the measurements indicate that the facility does not behave as expected. the alarm is a signal that a special safety check should be performed.

checking the safety of dams is carried out to determine the condition of the facility and its degree of safety, by checking the behavior of the facility in a series of scenarios, that is, situations that are relevant from the standpoint of dam safety. this check is done periodically, after the expiry of a defined period, or extraordinarily, when the system of technical surveillance and the statistical models have shown that the facility behaves differently than expected. in the analysis of the state of complex real structures it is not possible to define a priori the inhomogeneity and the actual characteristics of the material completely, while a large number of different indicators of the state of the facility are being measured. to determine the current state of all parameters, it is therefore necessary to establish mechanisms for the assimilation of real measurements. in practice, based on the measurements of relevant physical quantities, the physical parameters of the system (e.g. elastic moduli, filtration coefficients, etc.) are calibrated so that the calculated quantities match the measured ones more closely. in this way, the zones where changes have occurred can be identified. only with the updated model is it possible to carry out a safety analysis and, based on that analysis, to decide which measures must be undertaken to improve the safety of the dam. dam safety management is reflected in the use of systems that support it, and it continues through the entire life cycle of the dam.

3.1. software system to support the dam safety management

the software system to support dam safety management, shown in [18], was realized on the principles of service-oriented architecture (soa), which enables not only the use of data in real time, but also expandability and interconnection with other information systems. to create this system, commercially available technologies were used, such as sql server databases, the .net framework, ado.net for connecting to databases, and web services. the system architecture is shown in the following figure.

fig. 1 the structure of the software system for managing dam safety (adopted from [19])

the software system consists of the following components:
• interface with the system for technical monitoring
• a number of modules for statistical analysis
• numeric module for the simulation of surface leakage
• numeric module for stress-strain analysis
• numeric module for data assimilation

the applications that are an integral part of the solution allow users to see current measurements (measurements in real time) as well as the estimate of the state of the dam at that time. for details see [19].

4. the acquisition module for communication with sensors in the monitoring network

dams have many different instruments, such as rain gauges, water level gauges, flow meters, precipitation meters, etc. in order to improve the observation of dams, it is necessary to bring these instruments into a single network and allow them to communicate with each other. because of the large number of different instruments, a common way of communicating between devices is essential. this can be achieved with the help of sensorml and wireless sensor networks (wsns).

4.1. sensor model language

the goal is to make all types of devices discoverable and accessible using standard web services and schemas [20].
a standard xml encoding scheme can be used for metadata describing sensors, sensor platforms, sensor tasking interfaces, and sensor-derived data, provided that connections can be layered with web and internet protocols. a sensor can enable direct communication by publishing an xml description of its control interface, so that it is possible to receive real-time or stored monitoring data, determine the sensor's location, identify the characteristics of its monitoring capabilities, and even request specific monitoring tasks. sensor web enablement (swe) standards are open standards built on universally accepted standards for the internet, the web and spatial location; they are foundational standards for communicating with sensors, actuators and processors whose location matters [21]. they are a key enabler for the internet of things.

the sensor model language (sensorml) 2.0 provides a standard encoding and supports the internet of things (iot) and the web of things (wot) by providing the ability to describe a sensor (or other online processing component) and to provide a link to the real-time values coming from this component [20]. sensorml is a main component that provides the sensor information necessary for discovery, processing, and geo-registration of sensor observations.

an example on the web page http://www.sensorml.com/sensorml-2.0/examples/iot simple.html describes a sensor with a simple data stream consisting of temperature. it is a combination of a simple sensor and iot. the data themselves can be accessed through the url [22]. accessing this url would return either the latest value(s) or open up an html stream of real-time values. the proposal of the authors of this paper is to use a web service that accesses the sensor's data via the above-mentioned url. in addition to obtaining real data, the role of the web service is also the transmission and storage of real data in the central database.
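the proposed web service (read the sensor's data url, then store the values centrally) can be sketched as follows. this is a hedged illustration in python with sqlite, not the authors' implementation, which is built on .net and sql server. the response format (one "timestamp,value" pair per line), the table layout, and all function names are assumptions made for this sketch; the real format is whatever the sensor's sensorml description declares.

```python
import sqlite3
from urllib.request import urlopen


def fetch_latest(url, timeout=10):
    """Return the body of the sensor's data URL as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_readings(text):
    """Parse lines of 'timestamp,value' (assumed format) into tuples."""
    readings = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        timestamp, value = line.split(",", 1)
        readings.append((timestamp, float(value)))
    return readings


def store_readings(conn, sensor_id, readings):
    """Append readings to a (hypothetical) central measurement table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurement "
        "(sensor_id TEXT, ts TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?)",
        [(sensor_id, ts, value) for ts, value in readings],
    )
    conn.commit()


def collect(conn, sensor_id, url, fetch=fetch_latest):
    """Fetch, parse and store; returns the number of stored readings."""
    readings = parse_readings(fetch(url))
    store_readings(conn, sensor_id, readings)
    return len(readings)
```

the fetch function is passed in as a parameter so that the same service code can be exercised against a recorded response without a live sensor.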
The end user calls the web service through the software described in the previous section. The web service can be used for all types of sensors.

4.2. Wireless sensor network

The constant evolution of low-cost technologies with embedded wireless transmitters and low-power yet powerful chipsets has led to the widespread development and use of wireless sensor networks (WSNs). A WSN can scale from tens to hundreds of nodes and integrate seamlessly with existing wired measurement and control systems [15]. A clustered network architecture, which takes advantage of multi-hop communication and clustering, is adopted to lower energy consumption. A wireless sensor network consists of a number of smart nodes, gateways or sink nodes, and a computer management center [23]. Sensor data are shared among smart nodes and sent to a distributed or centralized system for analytics, which can be in the cloud or local [24].

A WDSN (a WSN applied to dams) is a self-organized wireless network with a dynamic topology, consisting of sensor nodes and gateway nodes. The sensor nodes collect dam data on water level, shift, stress, leakage, temperature, rainfall, seepage and displacement in the dam sections, and these data are transferred to the database server through the gateway nodes. The sensor used in a WDSN differs from a conventional one: it is an intelligent sensor that can not only perceive variations in the measured physical quantity and output the corresponding information, but also communicate with other sensors. It has several parts, such as sensitive components, embedded processors, storage and a power supply. These smart sensors are very important for assessing the reliability of the dam, because information about the functionality of each device is available at any moment. The WDSN structure is shown in Fig. 2.

Fig. 2 Wireless sensor network (WSN)

The whole network is divided into several clusters, each of which is a monitoring area. The wireless sensor nodes in each cluster can communicate with each other and transmit data to the gateway over multiple hops. The gateways can also communicate with each other and transmit the data to the sink.

4.3. Communication services

For automated measurements that are not included in an information system that could provide data to users outside the system, it is necessary to set up special services for communication with the measuring systems. Because of the specific reliability requirements of the measurement system, direct access to the data is not recommended. These services collect data locally and send them on for processing and validation.

In the structure of the communication services, the data collection module is directly connected to the measuring systems and has a central role [25]. The module collects data from various sources, translates them into a standard format and passes them to a service for processing and validation. Depending on the number and types of measuring systems involved, the services include software acquisition components, which are divided into: components for communication with passive sensors, components for communication with active sensors, components for communication with passive data loggers, components for communication with active data loggers, and components for information contained in files. Each of these components must implement the appropriate interface of the data collection module. The number of components of one type is limited only by computing resources, while the number of component types is determined by the configuration of the measuring systems.
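One way to read this requirement is that every acquisition component exposes the same collection interface, so that the data collection module can treat all of them uniformly. The sketch below is a minimal illustration with hypothetical class and method names (`AcquisitionComponent`, `collect`, `collect_all`) and an assumed record layout; only two of the five component types are shown.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]  # assumed standard format: sensor, time, value


class AcquisitionComponent(ABC):
    """Common interface assumed to be required by the data collection module."""

    @abstractmethod
    def collect(self) -> List[Record]:
        """Return newly acquired measurements in the standard format."""


class PassiveSensorComponent(AcquisitionComponent):
    """Polls a passive sensor through a caller-supplied read function."""

    def __init__(self, sensor_id: str, read: Callable[[], Tuple[str, str]]):
        self.sensor_id = sensor_id
        self.read = read

    def collect(self) -> List[Record]:
        timestamp, raw_value = self.read()
        return [{"sensor": self.sensor_id, "time": timestamp,
                 "value": float(raw_value)}]


class FileComponent(AcquisitionComponent):
    """Reads rows of 'sensor,time,value' from CSV text."""

    def __init__(self, text: str):
        self.text = text

    def collect(self) -> List[Record]:
        rows = csv.reader(io.StringIO(self.text))
        return [{"sensor": s, "time": t, "value": float(v)} for s, t, v in rows]


def collect_all(components: List[AcquisitionComponent]) -> List[Record]:
    """Data collection module: merge records from all configured components."""
    records: List[Record] = []
    for component in components:
        records.extend(component.collect())
    return records
```

With this shape, extending the configuration amounts to adding another subclass and listing it among the configured components; the collection loop itself does not change.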
Because the set of component types follows the configuration of the measuring systems, a concrete implementation of these services at a given facility does not have to contain all the components; components can be added later if the configuration is extended.

Fig. 3 Communication services with measurement systems (adopted from [25])

5. Detecting sensor failure and continuing further work

During the life cycle of the dam instruments the risk of failure of individual sensors naturally increases, so the safety assessment of the dam must be possible without the measurements from these sensors. Consequently, in order to apply the IoT in monitoring and dam safety management, an adaptive algorithm for detecting sensor failure is needed. The algorithm should signal in time which measurements are missing, i.e. without which sensors the decision on the safety of the facility has been made.

The algorithm for sensor failure detection, suitable for use in IoT, is the link between the adaptive system for modeling the behavior of the dam and the acquisition module for communication with sensors in the monitoring network. There are several different approaches to modeling the behavior of the dam. The earliest models were based on statistical [26] and numerical [27] methods. The development of artificial intelligence has enabled the application of new techniques such as artificial neural networks [28], genetic algorithms [29] and adaptive neuro-fuzzy systems [30]. Adaptive models that provide results in real time are the most suitable for application in the Internet of Things. One such model is described in [18]. It is a hybrid system that combines statistical models and genetic algorithms, so that it can model the expected behavior of the dam. The basis of this system is the linear regression model, which is sensitive to changes in the set of input parameters.
For this reason, an adaptive part based on genetic algorithms is added to the system. In accordance with the theory of genetic algorithms, the linear regression model is treated as an optimization problem, where each regression model represents one individual within the population. Based on the available measurements, the regressor generator creates a corresponding set of functions that can be applied. This means that there is always an alternative to the main model in case some information is not available, so that dam safety monitoring is not compromised. At the same time, when data are missing, the communication module can identify which sensor in the network the incomplete information came from, and the system should promptly raise an alert about a partial malfunction of that sensor. If the entire set of data from a sensor is missing (all measurements that the sensor performs), the system reports a complete malfunction of the sensor.

The results of the regression models are the parameters on which the current state of the dam is estimated, and an alarm is raised if the parameters deviate from the expected values. Along with this information, it is also important to report which measurements are available in the system and the state of the sensors, because, as noted earlier, the regression model is formed from the measurements available in the system. Incomplete measurements could jeopardize the credibility of the results obtained from the regression model. For this reason, the condition of the sensors is an important factor in making a correct decision about the real state of the dam.

Further development will be directed towards the use of collected data in advanced numerical models (FEM etc.) and the implementation of cloud computing.

5.1. FEM

The finite element method (FEM) is used to model the stress-deformation and filtration phenomena at the dam.
FEM can form a physical model of the structure with the surrounding rock mass. To make this model match the real dam, it is necessary to reproduce a particular phenomenon that occurred at the dam during operation. Material parameters are calibrated on the basis of the technical monitoring results, and FEM then yields a realistic model of the dam, which serves to monitor the behavior of the structure and to anticipate undesirable situations during further operation [31], [32]. An example of an arch dam model is shown in Fig. 4.

Fig. 4 FEM arch dam

To carry out safety analysis on a model of the present state, a numerical module for assimilation of measured data should be developed. Based on data from the information system for technical monitoring, this module should assimilate the measurements, i.e. determine updated values of the FEM model parameters. The core of the module consists of the optimization algorithms required for the assimilation of measurements and automated communication with the numerical modules. The up-to-date parameters of the individual material models that form the FEM model describe the real state of the structure.

5.2. Big data and cloud in remote sensing

The Internet of Things (IoT) is a concept that includes all the objects around us as part of the Internet. The coverage of IoT is very wide and includes a variety of smart devices such as smart phones, digital cameras, smart rain gauges, outside temperature sensors and a variety of other types of sensors. When all these devices are interconnected, they enable much more intelligent processes and services that can be used in various areas. Such a large number of devices and sensors on dams connected to the Internet provides a multitude of services and produces a large amount of data (big data).
Cloud computing is a model for on-demand access to a repository of configurable resources (budget, networks, servers, storage, applications, services, software, etc.), which can easily provide such infrastructure, applications and software. Cloud-based platforms help us connect to the things that surround us, so that they can be accessed from anywhere at any time. The cloud acts as a front end for accessing the IoT. Applications that interact with devices such as sensors have special requirements: massive storage to record big data, large computing power to process data in real time, and a high-speed Internet connection to allow high data throughput [33].

6. Proof of concept

The main goal of the practical work is dam safety. Computers with limited resources need to be less burdened, i.e. the execution of operations should be moved to the server. Furthermore, it is necessary to increase the reliability of the dam safety system. The new system would be implemented on the Prvonek dam, which is one of the most recently built dams in Serbia and has modern sensors.

Fig. 5 describes the system currently implemented on the Prvonek dam. The figure shows the data flow from the measuring instrument to the end user. The measured data are temporarily stored in a data logger. Every data logger has its own software for downloading data, which is installed on the acquisition server. The downloaded data are in CSV (comma-separated values) format. The acquisition server sends the data to the central server. End users use specific software for data analysis. The software installed on the end user's computer uses the resources of that computer and not of the server, so complex operations may require a lot of local computing resources.

Fig. 5 Current data flow on dam Prvonek

The next figure (Fig. 6) shows further project improvement.
All data from the acquisition servers are sent to the central server in the cloud. All data transformation and processing are performed in the cloud. This represents an ETL (extract, transform and load) process. All operations use the server's resources. The end user's computer works only with data prepared for reports and has much more free resources for other operations. All data are available to end users 24/7, from any place at any time.

Fig. 6 Model for cloud and big data

The new system can be applied to all instruments: rain gauges, water level gauges, flow meters, precipitation meters, etc., and it is useful for all types of dams. Implementing the WSN architecture from Fig. 2 will make the system of sensors more reliable and the data more accurate. The nodes within the WSN communicate among themselves and send data about a malfunctioning sensor in real time through a gateway to the sink node, which forwards the data to the cloud server.

The software for mathematical calculations generates a statistical curve of dam stability from the received data. This curve has to stay within specific limits. The curve can be generated from the data of different instruments. The most frequently used instruments are the piezometer, coordinometer, clinometer and thermometer. The piezometer measures the level of ground and underground flows, the coordinometer measures dam movements, the clinometer measures the angle of movement, and the thermometer measures temperature. If an instrument malfunctions, a new formula that excludes that instrument is generated automatically by specific algorithms and provides approximately the same curve as if all instruments were in perfect working order. The end user launches the software for generating the curve. All received data are stored in a database on the cloud server.
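The idea of regenerating the formula without the failed instrument can be illustrated with an ordinary least-squares fit over whichever instruments are currently reporting. This is a simplified stand-in and not the method of [18], which selects regressors with genetic algorithms; the sketch below simply refits on the remaining columns. The function names, instrument readings and target values are hypothetical.

```python
from typing import Dict, List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(history: Dict[str, Sequence[float]], target: Sequence[float],
        available: Sequence[str]) -> Dict[str, float]:
    """Least-squares fit of `target` using only the available instruments."""
    cols = [[1.0] * len(target)] + [list(history[name]) for name in available]
    k = len(cols)
    # Normal equations (A^T A) x = A^T y; adequate for a small illustration.
    ata = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    aty = [sum(p * q for p, q in zip(cols[i], target)) for i in range(k)]
    return dict(zip(["intercept", *available], solve(ata, aty)))


# Hypothetical history of two instruments and a measured dam response.
history = {"piezometer": [1.0, 2.0, 3.0, 4.0, 5.0],
           "thermometer": [10.0, 12.0, 9.0, 11.0, 13.0]}
displacement = [3.5, 4.2, 4.4, 5.1, 5.8]

full_model = fit(history, displacement, ["piezometer", "thermometer"])
# Thermometer reported as failed: regenerate the formula without it.
reduced_model = fit(history, displacement, ["piezometer"])
print(full_model, reduced_model)
```

The reduced model is only an approximation of the full one, which is why the text stresses reporting which instruments were excluded alongside the resulting curve.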
The new system provides a database with more reliable data, which enables better analyses and reporting.

7. Conclusion

In this paper the authors give an example of a possible application of recent technologies such as the Internet of Things, SensorML and wireless sensor networks together with software for dam safety management. The combination of these technologies and software improves the functionality of dams. Sensor technology, computer technology and network technology are advancing together while the demand grows for ways to connect information systems with the real world. Linking diverse technologies in this fertile market environment, integrators are offering new solutions for plant security, industrial controls, meteorology, geophysical survey, flood monitoring, risk assessment, tracking, environmental monitoring, defense, logistics and many other applications [34].

The Internet of Things, as a trending technology, allows sensors to become intelligent by connecting them to the Internet, which allows them to communicate with each other. Application of IoT in modern business significantly improves the operations of companies. Application of IoT on dams would provide more efficient recording of failed sensors, which would significantly reduce the probability of damage. With the collected data about failed sensors, it is possible to build a database of instrument reliability, which directly reflects the reliability of dams.

Combining WSN, big data and cloud computing with IoT would greatly improve the operation of dams. All these technologies produce a lot of data, which requires massive data storage. The cloud, a technology that is gaining momentum alongside IoT, could allow large amounts of data to be stored on the web. With cloud computing, end users could access the data anytime and anywhere.
All data processing would be done in the cloud, which would make the data collection system considerably faster and more reliable.

Using the latest technologies such as big data, cloud computing and IoT will improve the operation of dams in Serbia and significantly reduce the chance of failure. Serbia has good quality dams, so what is needed is to start implementing new technologies in order to prevent potential failures. Implementing the system for managing and monitoring dam safety, together with the new technologies, reduces the risk of a major dam failure.

Acknowledgement: The authors would like to thank the Ministry of Education, Science and Technological Development, Republic of Serbia, for financial support (project number 174031).

References

[1] "Association of State Dam Safety Officials," April 2012. [Online]. Available: http://www.damsafety.org/media/documents/downloadabledocuments/livingwithdams_asdso2012.pdf.
[2] David S. Bowles, Loren R. Anderson and Terry F. Glover, "The practice of dam safety risk assessment and management: its roots, its branches, and its fruit," 1998.
[3] David S. Bowles, Loren R. Anderson, Terry F. Glover and Sanjay S. Chauhan, "Dam safety decision-making: combining engineering assessments with risk information," 2003.
[4] Charles R. Farrar and Keith Worden, "An introduction to structural health monitoring," The Royal Society, 2007.
[5] Shen Zhen-zhong, Chen Yun-ping, Wang Cheng, Li Tao-fan and Li Ze-yuan, "Development of real-time monitoring and early warning system of dam safety," vol. 3, 2010.
[6] Jesung Jeon, Jongwook Lee, Donghoon Shin and Hangyu Park, "Development of dam safety management system," Advances in Engineering Software, vol. 40, no. 8, pp. 554-563, 2009.
[7] I. Bojanova, "Defining the Internet of Things," Computing Now, 16 March 2015. [Online]. Available: http://www.computer.org/web/sensing-iot/content?g=53926943&type=article&urltitle=defining-theinternet-of-things.
[8] F. Bao T., S. Gu C. and Y. Zhang, "Remote safety monitoring management information system for dam group," in 2nd International Conference on Structural Health Monitoring of Intelligent Infrastructure, Shenzhen, 2006.
[9] Jeon Jesung, Lee Jongwook, Shin Donghoon and Hangyu Park, "Development of dam safety management system," Advances in Engineering Software, vol. 40, no. 8, pp. 554-563, 2009.
[10] H. Z. Su and Z. P. Wen, "Intelligent early-warning system of dam safety," Proceedings of 2005 International Conference on Machine Learning and Cybernetics, vols. 1-9, pp. 1868-1877, 2005.
[11] Enji Sun, Xingkai Zhang and Zhongxue Li, "The internet of things (IoT) and cloud computing (CC) based tailings dam monitoring and pre-alarm system in mines," Safety Science, vol. 50, no. 4, pp. 811-815, 2012.
[12] Peng Lin, Junfeng Guan and Qingbin Li, "A real-time ZigBee-based location system in Xiluodu arch dam," Civil Engineering, Architecture and Sustainable Infrastructure II, pts 1 and 2, vols. 438-439, pp. 1329-1333, 2013.
[13] Qiaoming Zou, Lijun Qin and Qiyan Ma, "The application of the Internet of Things in the smart grid," Materials Science and Information Technology, pts 1-8, vols. 433-440, no. 2012, pp. 3388-3394, 2011.
[14] Yang Yingxin, Hou Chunhua and Han Yanxia, "Design of safety monitoring system of tailing pond based on GIS technology," Electronic Information and Electrical Engineering, vol. 19, pp. 408-410, 2012.
[15] Shifeng Fang, Li Da Xu and Yunqiang Zhu, "An integrated system for regional environmental monitoring and management based on Internet of Things," IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1596-1605, 2014.
[16] Dragan Maksimović, Tina Savić-Tomić and Maja Pavić, "Uticaj inoviranog oskultacionog sistema na pouzdanost osmatranja, upravljanja i održavanja glavnog objekta HE “Djerdap 1”," in SDVB Prvi kongres, Bajina Bašta, 2008.
[17] Ljubomir Petrović and Srdjan Djurić, "Osavremenjavanje sistema osmatranja na brani Gruža," in SDVB Prvi kongres, Bajina Bašta, 2008.
[18] B. Stojanović, M. Milivojević, M. Ivanović, N. Milivojević and D. Divac, "Adaptive system for dam behavior modeling based on linear regression and genetic algorithms," Advances in Engineering Software, vol. 65, pp. 182-190, 2013.
[19] N. Milivojević, N. Grujović, D. Divac, V. Milivojević and R. Martać, "Information system for dam safety management," ICIST, vol. 1, pp. 56-60, 2014.
[20] Mike Botts and Lance McKee, "Sensors Online," 1 April 2013. [Online]. Available: http://www.sensorsmag.com/networking-communications/a-sensor-model-language-moving-sensor-data-internet-967.
[21] "The OGC approves SensorML 2.0, advanced standard for Internet of Things," OGC, February 2014. [Online]. Available: http://www.opengeospatial.org/node/1971.
[22] "Internet of Things simple sensor (SensorML 2.0 examples)," [Online]. Available: http://www.sensorml.com/sensorml-2.0/examples/iotsimple.html.
[23] Xinying Miao, Jinkui Chu, Linghan Zhang and Jing Qiao, "Development of wireless sensor network for dam monitoring," Journal of Information & Computational Science, vol. 6, no. 9, pp. 1609-1616, 2012.
[24] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, "Wireless sensor networks: a survey," Computer Networks, no. 38, pp. 393-422, 2002.
[25] Razvoj sistema za podršku optimalnom održavanju visokih brana u Srbiji (TR37013), 2011-2015.
[26] R. Ardito and G. Cocchetti, "Statistical approach to damage diagnostic of concrete dam by radar monitoring: formulation and pseudo-experimental test," Engineering Structures, vol. 28, no. 14, pp. 2036-2045, 2006.
[27] Yifeng Chen, Ran Hu, Wenbo Lu, Dianqing Li and Chuangbing Zhou, "Modeling coupled processes of non-steady seepage flow and non-linear deformation for a concrete-faced rockfill dam," Computers & Structures, vol. 89, no. 13-14, pp. 1333-1351, 2011.
[28] J. Mata, "Interpretation of concrete dam behaviour with artificial neural network and multiple linear regression models," Engineering Structures, vol. 33, no. 3, pp. 903-910, 2011.
[29] Chang Xu, Dongjie Yue and Chengfa Deng, "Hybrid GA/SIMPLS as alternative regression model in dam deformation analysis," Engineering Applications of Artificial Intelligence, vol. 25, no. 3, pp. 468-475, 2012.
[30] Vesna Ranković, Nenad Grujović, Dejan Divac, Nikola Milivojević and Aleksandar Novaković, "Modelling of dam behaviour based on neuro-fuzzy identification," Engineering Structures, vol. 35, pp. 107-113, 2012.
[31] D. Divac, D. Vučković and M. Živković, "Modeliranje filtracionih i naponsko-deformacionih procesa u interakciji akumulacionog jezera, brane i stenske mase, na primerima brane Sv. Petka u Makedoniji i brane Prvonek kod Vranja," Građevinski kalendar, 2004, pp. 9-57.
[32] M. Kojić, R. Slavković, M. Živković and N. Grujović, Metod konačnih elemenata I (linearna analiza), Kragujevac: Mašinski fakultet u Kragujevcu, 1998.
[33] Prahlada B. B. Rao, Payal Saluja, Neetu N. Sharma, Ankit Mittal and Shivay Veer Sharma, "Cloud computing for Internet of Things & sensing based applications," 2012 Sixth International Conference on Sensing Technology (ICST), pp. 374-380, 2012.
[34] "Sensor Web Enablement (SWE)," OGC, [Online]. Available: http://www.opengeospatial.org/ogc/markets-technologies/swe.
[35] L. Nachabe, M. Girod-Genet and B. El Hassan, "Unified data model for wireless sensor network," IEEE Sensors Journal, vol. 15, no. 7, pp. 3657-3667, 2015.
[36] A. Ghosh and S. K. Das, "Coverage and connectivity issues in wireless sensor networks: a survey," Pervasive and Mobile Computing, no. 2, pp. 303-334, 2008.
[37] Y. Sang, H. Shen, Y. Inoguchi, Y. Tan and N. Xiong, "Secure data aggregation in wireless sensor networks: a survey," pp. 315-320, 2006.
[38] "OGC," [Online]. Available: http://www.opengeospatial.org/standards/sensorml.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 27, No 1, March 2014, pp. 113 - 127
DOI: 10.2298/FUEE1401113A

Bandgap Engineering of Carbon Allotropes

Vijay K. Arora
Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai 81310
Department of Electrical Engineering and Physics, Wilkes University, Wilkes-Barre, PA 18766, U.S.A.

Abstract. Starting from the graphene layer, the bandgap engineering of carbon nanotubes (CNTs) and graphene nanoribbons (GNRs) is described by applying an appropriate boundary condition. The linear E-k relationship of graphene transforms to a parabolic one as the momentum vector in the tube direction is reduced to dimensions smaller than the inverse of the tube diameter of a CNT. A similar transition is noticeable for a narrow-width GNR. In this regime, effective mass and bandgap expressions are obtained. A CNT or GNR displays a distinctly 1D character suitable for applications in quantum transport.

Key words: bandgap engineering, graphene, carbon nanotube, graphene nanoribbon, NEADF, carrier statistics

1. Introduction

Carbon allotropes have their basis in graphene, a single layer of graphite with carbon atoms arranged in a honeycomb lattice. Graphene has many extraordinary electrical, mechanical, and thermal properties, such as high carrier mobility, ambipolar electrical field effect, tunable band gap, room temperature quantum Hall effect, high elasticity, and superior thermal conductivity. It is projected to be a material of scientific legend, comparable only to penicillin as a panacea.
There is a modern adage: silicon comes from geology and carbon comes from biology. The cohesive band structure of graphene rolled into a CNT in a variety of chiral directions has recently been reported [1]. Graphite, a stack of graphene layers, is found in pencils. As shown in Fig. 1, formation of these allotropes originates from a graphene layer through various cutouts. A carbon nanotube (CNT) is a rolled-up sheet of graphene (also see Fig. 2). A fullerene molecule is a "buckyball," a nanometer-size sphere. A graphene nanoribbon (GNR) is a cutout from a graphene sheet with a narrow width and high aspect ratio.

Received January 10, 2014
Corresponding author: Vijay K. Arora, Wilkes University (e-mail: vijay.arora@wilkes.edu)

Fig. 1 Carbon allotropes arising from a graphene sheet to form the zero-dimensional (0D) buckyball, one-dimensional (1D) CNT, and three-dimensional (3D) graphite. Each layer of two-dimensional (2D) graphite can be converted to a 1D GNR by making the width smaller. Copyright Macmillan Publishers Limited [2], A. K. Geim and K. S. Novoselov, "The rise of graphene," Nature Materials, vol. 6, pp. 183-191, Mar 2007

The carbon atom ($^{12}_{6}\mathrm{C}$), with 6 electrons, has the electronic configuration $1s^2 2s^2 2p^2$. It is a tetravalent material with four of its electrons in shell 2 and still able to accommodate 4 more in the 2p orbitals. However, carbon orbitals can hybridize because the s-orbital and p-orbitals of carbon's second electronic shell have very similar energies [3]. As a result, carbon can adapt to form chemical bonds with different geometries. Three $sp^2$ orbitals form $\sigma$-bonds residing in the graphene plane of Fig. 2. These are strong bonds that demonstrate superior electronic properties. The fourth electron, with a $\pi$-bond, is a delocalized conduction electron. Each atom contributes 1/3 of a $\pi$-electron to a hexagon. With 6 atoms forming the corners of a hexagon, each hexagon contributes 2 $\pi$-electrons.
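The counting argument above can be turned into numbers. The sketch below is a minimal cross-check, not part of the original derivation: it divides the two $\pi$-electrons per hexagon by the area of a regular hexagon of side $a_{cc} = 0.142$ nm, and then assumes that a line density follows by multiplying the areal density by the tube circumference $\pi d_t$ or the ribbon width $W$. The function names are hypothetical.

```python
import math

A_CC_NM = 0.142            # C-C bond length in nm
ELECTRONS_PER_HEXAGON = 2  # 6 corner atoms, each contributing 1/3 electron


def hexagon_area_nm2(a_cc_nm: float = A_CC_NM) -> float:
    """Area of a regular hexagon with side a_cc: (3*sqrt(3)/2) * a_cc**2."""
    return 1.5 * math.sqrt(3.0) * a_cc_nm ** 2


def areal_density_per_m2(a_cc_nm: float = A_CC_NM) -> float:
    """Pi-electrons per square metre of graphene."""
    per_nm2 = ELECTRONS_PER_HEXAGON / hexagon_area_nm2(a_cc_nm)
    return per_nm2 * 1e18  # 1 m^2 = 1e18 nm^2


def cnt_line_density_per_m(d_t_nm: float) -> float:
    """Assumed conversion: areal density times circumference pi*d_t."""
    return areal_density_per_m2() * math.pi * d_t_nm * 1e-9


def gnr_line_density_per_m(width_nm: float) -> float:
    """Assumed conversion: areal density times ribbon width W."""
    return areal_density_per_m2() * width_nm * 1e-9


print(f"{areal_density_per_m2():.3e} m^-2")
print(f"{cnt_line_density_per_m(1.0):.3e} m^-1 for d_t = 1 nm")
print(f"{gnr_line_density_per_m(1.0):.3e} m^-1 for W = 1 nm")
```

Both line densities scale linearly with the transverse dimension, so values for other diameters or widths follow by simple multiplication.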
The areal density of $\pi$-electrons is $n_g = 2/A_h$, where $A_h = 3\sqrt{3}\,a_{cc}^2/2$ is the area of the hexagon. With a C-C bond length of $a_{cc} = 0.142$ nm, the areal density is $n_g = 3.82 \times 10^{19}\ \mathrm{m^{-2}}$. The intrinsic line density of a CNT is expected to be a function of diameter, $n_{CNT} = 1.2 \times 10^{11}\, d_t(\mathrm{nm})\ \mathrm{m^{-1}}$, when the sheet is rolled into a CNT. Similarly, the line density of a GNR is expected to be $n_{GNR} = 3.82 \times 10^{10}\, W(\mathrm{nm})\ \mathrm{m^{-1}}$.

Fig. 2 The rolled up graphene sheet into a carbon nanotube

In recent years, there has been a transformation in the way quantum and ballistic transport is described [3]. Equilibrium carrier statistics with a large number of stochastic carriers is the basis of any transport. The nonequilibrium Arora's distribution function (NEADF) [4] seamlessly transforms the stochastic carrier motion in equilibrium, with no external influence, into a streamlined one in a high-field-initiated extreme nonequilibrium, for current to flow and saturate. A new paradigm for characterization and performance evaluation of carbon allotropes is emerging from the application of NEADF to graphene and its allotropes.

2. Bandgap engineering

The electronic quantum transport in a carbon nanotube (CNT) is sensitive to the precise arrangement of carbon atoms. There are two families of CNTs: single-wall (SWCNT) and multiple-wall (MWCNT). The diameter of a SWCNT spans a range of 0.5 to 5 nm. The lengths can exceed several micrometers and can be as large as a centimeter. A MWCNT is a cluster of multiply nested or concentric SWCNTs. The focus here is on SWCNTs. Depending on the chirality, the rollup of Fig. 2 can lead to either a semiconducting or a metallic state. When the arrangement of carbon atoms is changed by mechanical stretching, a CNT is expected to change from semiconducting to metallic or vice versa. Several unique properties result from the cylindrical shape and the carbon-carbon bonding geometry of a CNT.
Wong and Akinwande [3] vividly connect the physics and technology of a graphene nanolayer to that of a CNT, with splendid outcomes. The CNT band structure arising from the six-fold Dirac K-points, with equivalency of the K and K' points, can lead to complex mathematics. However, once the nearest-neighbor tight-binding (NNTB) formalism is applied, the resulting Dirac cone, as revealed in Fig. 2, gives useful information for a variety of chirality directions. In fact, the K-points offer much simplicity for quantum transport applications. In the metallic state the E-k relation is linear. However, the Fermi energy and associated velocity are different in the semiconducting state. The intrinsic Fermi energy $E_{Fo} = 0$ applies to an undoped sample with no induced carriers. Induction of carriers moves the Fermi level into the conduction band (n-type) or the valence band (p-type).

The linear E-k relation can be written in terms of $k_t$, the momentum vector in the longitudinal direction of the tube, and $k_c$, the momentum vector in the direction of rollup. The description becomes

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_t^2 + k_c^2} \tag{1}$$

The K-point degeneracy is crucial to a profound understanding of the symmetries of graphene folding into a CNT in a given chiral direction. Symmetry arguments indicate two distinct sets of K points satisfying the relationship $K' = -K$, confirming the opposite phase with the same energy. There are three K and three K' points, each K (or K') rotated from the other by $2\pi/3$. The zone degeneracy $g_K = 2$ is based on the two distinct sets of K and K' points, in addition to the spin degeneracy $g_s = 2$. There are six equivalent K points, and each K point is shared by three hexagons; hence $g_K = 2$ holds for graphene as well as for the rolled-up CNT. The phase $k_c C_h$ of the propagating wave $e^{i k_c C_h}$ in the chiral direction means that the rolled-up CNT must satisfy the boundary condition

$$k_c C_h = \nu \left(\frac{2\pi}{3}\right) \tag{2}$$

where $\nu = (n - m) \bmod 3 = 0, 1, 2$ is the band index.
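Eqs. (1) and (2) already fix the gap of a given tube, and a short numerical sketch may help make this concrete. It rests on several assumptions that are not stated at this point in the text: the circumference is taken as $C_h = \pi d_t$, the diameter is computed from the chiral indices with the standard geometric relation $d_t = \sqrt{3}\,a_{cc}\sqrt{n^2 + nm + m^2}/\pi$, and the Fermi velocity is set to $10^6$ m/s. The gap is then twice the minimum energy $\hbar v_F k_c$ at $k_t = 0$. Function names are hypothetical.

```python
import math

A_CC_NM = 0.142                # C-C bond length (nm)
HBAR_EV_S = 6.582119569e-16    # reduced Planck constant (eV s)
V_F_M_PER_S = 1.0e6            # Fermi velocity (m/s), assumed value
HBAR_VF_EV_NM = HBAR_EV_S * V_F_M_PER_S * 1e9  # about 0.658 eV nm


def band_index(n: int, m: int) -> int:
    """Band index nu = (n - m) mod 3, always 0, 1 or 2."""
    return (n - m) % 3


def diameter_nm(n: int, m: int) -> float:
    """Tube diameter from chiral indices (standard geometric relation)."""
    lattice_constant = math.sqrt(3.0) * A_CC_NM
    return lattice_constant * math.sqrt(n * n + n * m + m * m) / math.pi


def bandgap_ev(n: int, m: int) -> float:
    """Gap at k_t = 0 from Eqs. (1)-(2): 2*hbar*v_F*k_c, k_c = 2*nu/(3*d_t)."""
    k_c_per_nm = 2.0 * band_index(n, m) / (3.0 * diameter_nm(n, m))
    return 2.0 * HBAR_VF_EV_NM * k_c_per_nm


for chirality in [(10, 4), (13, 0), (10, 5)]:
    nu = band_index(*chirality)
    kind = "metallic" if nu == 0 else "semiconducting"
    print(chirality, nu, kind,
          round(diameter_nm(*chirality), 3), "nm",
          round(bandgap_ev(*chirality), 3), "eV")
```

The three chiralities used in the loop all have diameters close to 1 nm but fall into different band indices, which illustrates how chirality, rather than diameter alone, decides between metallic and semiconducting behavior.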
(n  m) mod 3 is an abbreviated form of (n-m) modulo 3 that is the remainder of the euclidean division of (n  m) by 3. the quantization condition transforms to kc = v(2/3dt) when ch = dt is used for a cnt's circular parameter. using (2 / 3 ) c t k d in eq. (2) yields the band structure as given by 2 2 2 | | 3 fo f fo f t t e e v k e v k d            (3) the equation is re-written to introduce the bandgap near 0 t k  2 2 3 32 1 1 3 2 2 2 gt t t t fo f fo t ed k d k e e v e d                         (4) with the binomial expansion 1/ 2 1 (1 ) 1 2 x x   near the lowest point (kt  0) of the subband and keeping first order term transforms (3) to 2 2 * 2 2 g t fo t e k e e m         (5) with 4 0.88 . 3 g f t t ev nm e v d d    and * 2 0.077 3 t o t f o t m nm m d v m d    (6) here 63 10 / 2 cc f a v m s    with 3.1 ev  the c-c bond strength. bandgap engineering of carbon allotropes 117 fig. 3 e vs kt graph with chirality (10,4) for v = 0, (13,0) for v = 1, and (10,5) for v = 2, all with diameter dt  1.0 nm. solid line is used for exact formulation (eq. 3) and dash-dot line showing parabolic approximation (eq. 4) fig. 3 displays the ev  kt relationship for v = 0 metallic and v = 1(2) semiconducting (sc1(2)) states. v is the band index confined to these three values only as v = 3 is equivalent to v = 0 and pattern repeats itself in these three modes. the diameter is dt  1.0 nm for the chosen chirality directions: (10,4) for v = 0, (13,0) for v = 1, and (10,5) for v = 2. assuming that each of these configurations are equally likely, about 1/3 rd of the cnts are metallic and 2/3 semiconducting with bandgap that varies with chirality. as seen in fig. 3, the curvature near kt = 0 is parabolic making it possible to define the effective mass that also depends on chirality. fig. 
4 shows the bandgap of cnts with different chiral configurations, covering the metallic ($\nu = 0$) and the two semiconducting ($\nu = 1, 2$) states. as expected, the bandgap is zero for the metallic state, in agreement with eq. (4). the sc2 bandgap is twice as large as that of sc1. fig. 4 also shows the chirality leading to $\nu = 0$, 1 or 2. the nntb bandgap is likewise shown. it also exhibits the wide-bandgap nature of sc2.

fig. 4 calculated band gap $E_g$ (eV) as a function of cnt diameter $d_t$ (nm), showing agreement with the nntb calculation. chiral vectors are indicated against the corresponding points: (10,5), (10,4), (15,15), (8,7), (13,0), (19,0), (18,2), (21,8), (32,10), (16,5), (26,0), (32,0), (38,0).

gnrs, as shown in fig. 5, are strips of graphene with ultra-thin width (<50 nm). the electronic states of gnrs largely depend on the edge structures. the precise values of the bandgaps are sensitive to the passivation of the carbon atoms at the edges of the nanoribbons. just as for a cnt, the dependence of the bandgap on inverse width is preserved.

fig. 5 armchair and zigzag graphene nanoribbons (gnr), where the edges look like an armchair and a zigzag, respectively.

in momentum k-space, there are bonding and anti-bonding wavefunctions. in the absence of a magnetic field, forward ($+k$) and backward ($-k$) moving states have identical eigenenergies, as is well known both for parabolic semiconductors and for graphene with $E = \hbar v_{Fo} |k|$. this degeneracy occurs both at $K$ and $K'$ with $K' = -K$. the total phase change as one starts from one of the three $K$ points, moves to the other two, and returns to the same one is $2\pi$. the angular spacing between $K$ and $K'$ in k-space is $(2\nu + 1)\pi/6$ where $\nu = 0, 1, 2$. $\nu = 3$ is equivalent to $\nu = 0$, repeating the pattern. this gives $k_w W = (2\nu + 1)\pi/6$ with $\nu = 0$, 1 and 2 for equivalent $K$ or $K'$ points, where $k_w$ is the momentum vector along the width and $k_l$ is along the length of the nanoribbon.
the small width of gnrs can lead to quantum confinement of carriers, which can be modeled by analogy with the standing waves in a pipe open at both ends. the dispersion for a gnr is then given by

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_l^2 + \left[\frac{(2\nu + 1)\pi}{6W}\right]^2} \qquad (7)$$

this leads to the bandgap equation

$$E = E_{Fo} \pm \frac{E_g}{2}\sqrt{1 + \left[\frac{6 W k_l}{(2\nu + 1)\pi}\right]^2} \qquad (8)$$

in the parabolic approximation, as for the cnt, the bandgap and effective mass are given by

$$E_g = (2\nu + 1)\,\frac{\pi \hbar v_F}{3W} = (2\nu + 1)\,\frac{0.69\ \text{eV nm}}{W} \qquad (9)$$

and

$$\frac{m_l^*}{m_o} = (2\nu + 1)\,\frac{\pi \hbar}{6 W v_F m_o} = (2\nu + 1)\,\frac{0.06\ \text{nm}}{W} \qquad (10)$$

fig. 6 gives the bandgap as a function of width, with experimental values indicated that are spread between the $\nu = 1$ and $\nu = 2$ configurations. the gnr bandgap and the associated transport framework are sketchy compared with those of a cnt.

3. carrier statistics

the theoretical development of electronic transport in a graphene nanostructure is complicated by the linear e-k relation with zero effective mass of a dirac fermion, as reviewed in a number of notable works [5-7]. ideal graphene is a monoatomic layer of carbon atoms arranged in a honeycomb lattice. the monoatomic layer makes graphene a perfect two-dimensional (2d) material. as a 2d nanolayer, a graphene sheet bears some resemblance to a metal-oxide-semiconductor field-effect transistor (mosfet). wu et al. [5] give an excellent comparison of the linear e-k (energy versus momentum) relation in a graphene nanolayer to the quadratic one in a nano-mosfet. the six-valley parabolic band structure in a mosfet, even though anisotropic, has a finite effective mass [8-10]. as graphene is a relatively new material with a variety of allotropes, the landscape of electronic structure and applications over the whole range of electric and magnetic fields is in its infancy [11].

fig. 6 the gnr bandgap as a function of width $W$. markers are experimental data (m. han et al., phys. rev. lett. 98, 206805 (2007)).

the dirac cone described in eq.
(1) shows the rise in energy with the magnitude of the momentum vector:

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_x^2 + k_y^2} \qquad (11)$$

where $k$ is the momentum vector, which in circular coordinates has components $k_x = k\cos\theta$ and $k_y = k\sin\theta$, and $v_F = (1/\hbar)\,dE/dk$ is constant due to the linear rise of energy $E$ with momentum vector $k$. $\hbar v_F$ is the gradient of the e-k dispersion. the linear dispersion of the dirac cone is confirmed up to $\pm 0.6$ eV [3, 12]. $v_F \approx 10^6$ m/s is the accepted value of the fermi velocity near the fermi energy, which lies at the cone apex, $E_F - E_{Fo} = 0$, for intrinsic graphene with $E_{Fo} = 0$ as the reference level. the fermi velocity vectors are randomly oriented in the graphene sheet.

the deviation $E_F - E_{Fo}$ of the fermi energy from the dirac point defines the degeneracy of the fermi energy, which itself depends on the 2d carrier concentration $n_g$ as given by [4]

$$n_g = N_g\,\mathfrak{F}_1(\eta) \qquad (12)$$

with

$$N_g = (2/\pi)\,(k_B T/\hbar v_F)^2 \qquad (13)$$

$$\eta = (E_F - E_{Fo})/k_B T \qquad (14)$$

$\mathfrak{F}_j(\eta)$ is the fermi-dirac integral (fdi) of order $j$ [13, 14], with $j = 1$ for graphene. the linear carrier density of a cnt is similarly described by

$$n_{cnt} = N_{cnt}\,\mathfrak{F}_{cnt}(\eta, e_g) \qquad (15)$$

with $e_g = E_g/k_B T$ and $N_{cnt} = D_o k_B T$ the effective density of states, where $D_o = 4/\pi\hbar v_{Fo} = 1.93$ eV$^{-1}$ nm$^{-1}$. $\mathfrak{F}_{cnt}(\eta, e_g)$ is the cnt integral that can be evaluated numerically [1]. equilibrium carrier statistics for the gnr are similarly obtainable.

in equilibrium, the velocity vectors are randomly oriented in the tubular direction, with half oriented in the positive x-direction and half in the negative x-direction for a tube along the x-axis. this makes the vector sum of the velocity vectors equal to zero, as expected. however, the average magnitude of the carrier motion is not zero at a finite temperature. the group average velocity of a carrier in essence informs the speed of a propagating signal.
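the graphene carrier statistics of eqs. (12)-(14) can be sketched numerically as follows. this is a minimal illustration, assuming the usual normalization of the order-1 fermi-dirac integral, $\mathfrak{F}_1(\eta) = \int_0^\infty x\,dx/(1 + e^{x - \eta})$, and $\hbar v_F \approx 0.658$ eV nm; the simpson-rule integrator, the 300 K default and the function names are illustrative choices, not taken from the cited works.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # boltzmann constant (eV/K)
HBAR_VF_EV_NM = 0.658        # assumed hbar * v_F for v_F ~ 1e6 m/s (eV nm)


def fermi_dirac_integral_1(eta, steps=4000):
    """order-1 fermi-dirac integral by composite simpson's rule.

    the upper limit is truncated 40 thermal units above max(eta, 0),
    where the integrand is negligible. steps must be even.
    """
    upper = max(eta, 0.0) + 40.0
    h = upper / steps

    def integrand(x):
        return x / (1.0 + math.exp(x - eta))

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def graphene_density_per_cm2(eta, temperature_k=300.0):
    """n_g = N_g * F_1(eta), with N_g from eq. (13), in cm^-2."""
    thermal_ev = K_B_EV_PER_K * temperature_k
    n_eff_per_nm2 = (2.0 / math.pi) * (thermal_ev / HBAR_VF_EV_NM) ** 2
    return n_eff_per_nm2 * fermi_dirac_integral_1(eta) * 1e14  # nm^-2 -> cm^-2


if __name__ == "__main__":
    for eta in (-5.0, 0.0, 5.0):
        print(eta, graphene_density_per_cm2(eta))
```

in the nondegenerate limit the integral reduces to $\exp(\eta)$, and in the strongly degenerate limit it grows as $\eta^2/2$, which the sketch reproduces.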
it is also a useful parameter giving information as velocity vectors are re-aligned in the direction of an electric field [15] as it sets the limit at saturation velocity that is the ultimate attainable velocity in any conductor. in a ballistic transport when electrons are injected from the contacts, the fermi velocity of the contacts plays a predominant role [16, 17]. it is often closely associated with the maximum frequency of the signal with which the information is transmitted by the drifting carriers. formally, the carrier group velocity is defined as 1 ( ) de v e dk  (16) the magnitude of the velocity of eq. (8.5.13) can be related to the dos by rewriting it as 1 ( ) 2 ( ) s cnt gde dn v e dn dk d e   (17) where 2 s gdn dk   (with 2 s g  for spin degeneracy) in the k-space and dcnt(e) = dn / de is the dos for a single valley. when multiplied with the dos and the fermi-dirac distribution function and divided by the electron concentration given by eq. (3.9.15) the magnitude of the velocity vector, the intrinsic velocity vi, for a cnt is given by 0 0 ( ) i f cnt v v u  , / cnt cnt cnt u n n (18) bandgap engineering of carbon allotropes 121 the name intrinsic is given to this velocity as it is intrinsic to the sample as compared to the drift velocity that is driven by an external field. similarly, the intrinsic velocity of gnr follows the same pattern ( , ) gnr gnrgnr g n n e as in eq. (15). fig. 7 the normalized intrinsic velocity vi / vfo as a function of normalized carrier concentration ucnt = ncnt / ncnt for different chiralties. p is parabolic approximation with effective mass the intrinsic velocity and unidirectional velocity for arbitrary degeneracy are shown in fig. 7 for band index v = 0,1,2. the intrinsic velocity is not equal to the fermi velocity vfo  10 6 m/s for semiconducting samples approaching vfo as expected in strong degeneracy. however, for a metallic cnt, the intrinsic velocity is the intrinsic fermi velocity. 
the fermi velocity in the parabolic model is calculable. however, it has no physical meaning, as the parabolic approximation works only in the nondegenerate regime.

nonequilibrium carrier statistics are challenging, given the variety of approaches in the published literature with no convergence in sight. neadf is a natural extension of fermi-dirac statistics, with the electrochemical potential $E_F$ changing by $q\,\vec{\mathcal{E}}\cdot\vec{\ell}$ during the free flight of a carrier, where $\vec{\mathcal{E}}$ is the electric field and $\ell$ is the mean free path (mfp) [15]. neadf is given by

$$f(E, \mathcal{E}, \ell) = \frac{1}{e^{(E - E_F + q\,\vec{\mathcal{E}}\cdot\vec{\ell})/k_B T} + 1} \qquad (19)$$

neadf is the key to the transformation of stochastic velocity vectors into streamlined ones, giving an intrinsic velocity that is the fermi velocity in the metallic state but substantially below the fermi velocity in the semiconducting state. this intrinsic velocity $v_i$ is lowered by the onset of quantum emission. the nature of the quantum (photon or phonon) depends on the substrate. a sample of the intrinsic velocity in a cnt is given in fig. 7. the intrinsic velocity is the average of the magnitude of the stochastic velocity with the fermi-dirac distribution, which is obtained from eq. (19) when the electric field is zero ($\mathcal{E} = 0$). this is the limit on the drift velocity as stochastic vectors in equilibrium transform to streamlined unidirectional vectors. in the metallic state, the saturation velocity is limited to $v_{Fo}$. however, in a semiconducting state, it is substantially below $v_{Fo}$. the parabolic (p) approximation, although simple in appearance, is not valid in the degenerate realm.

[fig. 7 legend: m (10,4); sc1 (13,0); sc1p (13,0); sc2 (10,5); sc2p (10,5)]

an equal number of electrons have directed velocity moments along and opposite to the electric field ($n_+ = n_- = n/2$) in equilibrium, where $n$ is the total concentration and $n_\pm/n$ is the fraction antiparallel (+) and parallel ($-$) to the electric field applied in the $-x$ direction.
in the presence of an electric field, the concentration $n_+$ of electrons moving opposite to the field dominates ($n_+ \gg n_-$), with $n_+/n_- = \exp(2q\mathcal{E}\ell/k_B T)$. the net fraction of electrons going in the +x-direction (opposite to an applied electric field $\mathcal{E} = V/L$) is $\Delta n/n = \tanh(q\mathcal{E}\ell/k_B T)$, where $\Delta n = n_+ - n_-$. the drift velocity $v_d$ as a function of electric field then naturally follows as [15]

$$v_d = v_u \tanh(q\mathcal{E}\ell/k_B T) \qquad (20)$$

here $v_u$ is the unidirectional intrinsic velocity appropriate for twice the carrier concentration, as an electron cannot be accommodated in an already filled state because of the pauli exclusion principle. in the nondegenerate domain, the distinction between $v_u$ and $v_i$ is not necessary. quantum emission can also be accommodated to obtain the saturation velocity $v_{sat} = v_u \tanh(E_Q/k_B T)$, which is smaller than $v_u$ by the quantum emission factor $\tanh(E_Q/k_B T)$. this factor approaches unity when the energy $E_Q$ of a quantum goes substantially beyond the thermal energy ($E_Q \gg k_B T$). in the other extreme, when the emitted quanta are much smaller in energy, bose-einstein statistics limit $E_Q/k_B T \approx 1$.

eq. (20) is strictly for nondegenerate statistics, and that too for 1d nanostructures. different expressions are obtained for 2d and 3d nanostructures. however, use of the tanh function unifies the current-voltage profile, which can be usefully employed for characterization in the wake of the failure of ohm's law. one way to preserve eq. (20) for degenerate statistics is to define the degeneracy temperature $T_e$ for electrons. the low-field carrier drift velocity is obtained when $\Delta n/n$ is multiplied by $v_{Fo}$:

$$v_d = (\Delta n/n)\,v_{Fo} = [\mathfrak{F}_{-1}(\eta)/\mathfrak{F}_0(\eta)]\,(q\mathcal{E}\ell/k_B T)\,v_{Fo} \qquad (21)$$

the fdi approximates to $\mathfrak{F}_j(\eta) \approx \exp(\eta)$ for all values of $j$ for nondegenerate statistics.
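the tanh velocity-field law of eq. (20) and the quantum-emission-limited saturation velocity can be illustrated with the short python sketch below. the 70 nm mean free path, the 300 K temperature, the unidirectional velocity of $10^6$ m/s and the 0.2 eV quantum are illustrative assumptions only, and the function names are hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # boltzmann constant (eV/K)


def critical_field_v_per_m(mfp_m, temperature_k):
    """field at which the energy gained in one mfp, q*E*l, equals k_B*T."""
    return K_B_EV_PER_K * temperature_k / mfp_m


def drift_velocity(field_v_per_m, mfp_m, temperature_k, v_u=1.0e6):
    """eq. (20): v_d = v_u * tanh(q*E*l / k_B*T), energies taken in eV."""
    x = field_v_per_m * mfp_m / (K_B_EV_PER_K * temperature_k)
    return v_u * math.tanh(x)


def saturation_velocity(quantum_ev, temperature_k, v_u=1.0e6):
    """v_sat = v_u * tanh(E_Q / k_B*T) for an emitted quantum of energy E_Q."""
    return v_u * math.tanh(quantum_ev / (K_B_EV_PER_K * temperature_k))


if __name__ == "__main__":
    mfp, temp = 70e-9, 300.0   # illustrative values
    e_c = critical_field_v_per_m(mfp, temp)
    for ratio in (0.01, 0.1, 1.0, 10.0):
        print(ratio, drift_velocity(ratio * e_c, mfp, temp))
    print(saturation_velocity(0.2, temp))
```

well below the critical field the response is linear (ohmic), while well above it the drift velocity saturates at $v_u$.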
the degenerate mobility expression is obtained from (11) by retaining the factor in bracket giving / o fo b e q v k t  (22) with 1 2 3 2 ( ) ( ) e ud id t v t v        (23) where te is the degeneracy temperature signifying the higher energy of the degenerate electrons that is substantial higher than the thermal energy kbt. obviously te / t = 1 as expected. however, for strongly degenerate statistics te / t =  giving te = ef / kb. the fermi energy is ef = ncnt / do for metallic degenerate cnt. te is therefore equal to te = ncnt / kbdo. 4. i-v characteristics in the graphene and cnt because of linear e-k relationship, the mobility expression is different from other semiconductors. a rudimentary analysis of the mobility in terms of mfp is to change mobility expression / * / *o oq m q m v    by replacing *m v k  bandgap engineering of carbon allotropes 123 ( ) / f fo f e e v . this gives a simple mobility expression that has been utilized in [4] in extracting mfp. the expression obtained from this analogy is / ( ) o o f f fo q v e e     (24) the resistance then can be obtained from either 2 /or l w , with 2 2 21 ( )on opn p q    , for graphene with n2  ng replacement for areal density for graphene and n1  ncnt for cnt in 1 1 1 11 ( )o on opr l n p q      ). the velocity response to the high electric field is discussed in [4]. neadf's transformation of equilibrium stochastic velocity vectors into a streamlined mode in extreme nonequilibrium leads to velocity saturation in a towering electric field. in a metallic cnt, the randomly oriented velocity vectors in equilibrium are of uniform fermi velocity vfo = 1.010 6 m/s [1]. the saturation current isat = ncntqvfo arises naturally from this saturation, where ncnt = 1.5310 8 m 1 is the linear carrier concentration along the length of the tube consistent with experimentally observed [18] isat = 21 a. q is the electronic charge. 
the carrier statistics [1] gives ef = 67.5 mev which is larger than the thermal energy for all temperatures considered (t = 4, 100, and 200 k), making applicable statistics strongly degenerate. the transition from ohmic to nonohmic saturated behavior initiates at the critical voltage ( / )c bv k t q l for nondegenerate statistics with energy kbt and ( / )c fv e q l for degenerate statistics with energy ef. the mfp extracted from ro = 40 k is 70 nm that gives mobility [4] 2 / 10, 000 / o f f q v e cm vs   . the possibility of ballistic transport is miniscule given 1l m  . the ballistic transport in 2d systems is extensively discussed by arora and co-workers,[17, 19] where it is shown that the ballistic conduction degrades substantially the mobility in a 2d ballistic conductor with length smaller than the ballistic mfp. it may be tempting to apply the same formalism to 1d nanowire or nano cnt. however, the surge in resistance in a 1d resistor contradicts expected vanishing resistance for a ballistic conductor. a highfield resistance model[3] that employs the onset of phonon emission consistent with phonon-emission-limited mfp q of tan et. al [20] explains very well the saturation in 2d gaas/algaas quantum well. phonon-emission-limited mfp is generalized to any energy quantum by arora, tan, and gupta [4]. q is the distance that a carrier travels before gaining enough energy q qq  e to emit a quantum of energy q with the probability of emission given by the bose-einstein statistics. /q q q  e is infinite in equilibrium, very large in low electric field, and a limiting factor only in an extremely high electric field becoming comparable or smaller than the low-field mfp. that is why in the published literature on cnt, it is considered a high-field mfp, distinct from low-field scatteringlimited mfp. it is q that was used by yao et. al [18] to interpret the linear rise r / ro = 1 + (v / vc) in resistance with the applied voltage. 
here $R = V/I$ is the direct resistance. this direct resistance $R$ cannot replicate the incremental signal resistance $r = dV/dI$. therefore, the description of yao et al. [18] is deficient in not employing the distribution function, and hence does not correctly attribute the source of current saturation, the transition point to current saturation, and the paradigm leading to the rise of direct and incremental resistance.

neadf has a recipe for nonohmic transport leading to current saturation consistent with velocity saturation. as stated earlier, $n_+ \gg n_-$ in the presence of an electric field, with $n_+/n_- = \exp(2q\mathcal{E}\ell/k_B T)$. the net fraction of electrons going in the direction opposite to an applied electric field $\mathcal{E} = V/L$ is then $\Delta n/n = \tanh(q\mathcal{E}\ell/k_B T)$. the current-voltage relation with the tanh function ($I = \Delta n\,q\,v_{Fo}$) is derived from rigorous degenerate statistics [15] with $V_c = (E_F/q)(L/\ell)$ and the magnitude of the velocity vector equal to the fermi velocity $v_{Fo}$ for a metallic cnt. the current-voltage characteristics in a cnt are given by

$$I = I_{sat} \tanh(V/V_c) \qquad (25)$$

fig. 8 is a plot of eq. (25) along with the experimental data of yao et al. [18]. also shown are the lines at temperatures $T = 4$, 100, and 200 K following the rigorous degenerate statistics [15]. the distinction between the direct ($R = V/I$) and differential ($r = dV/dI$) modes of resistance is crucial when the i-v relation is nonlinear. $R$ and $r$ are given by

$$R/R_o = (V/V_c)/\tanh(V/V_c) \qquad (26)$$

$$r/R_o = \cosh^2(V/V_c) \qquad (27)$$

this relationship is in direct contrast to $R/R_o = 1 + (V/V_c)$ with $V_c = I_{sat} R_o$ used by yao et al. [18], which can be obtained from eq. (26) by using the approximation $\tanh(x) \approx x/(1 + x)$. $I_o$ of yao et al. is the same as $I_{sat}$.

fig. 8 i-v characteristics of a cnt of length 1 µm. th stands for theoretical curves derived from degenerate statistics. tanh curves display eq. (25).

as shown in fig. 9, the rise in $r/R_o$ is exponential compared to the linear rise in $R/R_o$.
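the contrast between the direct and differential resistance of eqs. (25)-(27) can be checked numerically with the sketch below. it assumes the saturation current of 21 µA and the low-field resistance of 40 kΩ quoted in the text for the 1 µm metallic cnt, with $V_c = I_{sat} R_o$; the function names are hypothetical, and the linear form is included only for comparison.

```python
import math

I_SAT_A = 21e-6             # saturation current quoted in the text (A)
R_O_OHM = 40e3              # low-field (ohmic) resistance quoted in the text
V_C = I_SAT_A * R_O_OHM     # critical voltage, V_c = I_sat * R_o (about 0.84 V)


def current_a(v):
    """eq. (25): I = I_sat * tanh(V / V_c)."""
    return I_SAT_A * math.tanh(v / V_C)


def direct_resistance_ratio(v):
    """eq. (26): R / R_o = (V / V_c) / tanh(V / V_c), with R = V / I."""
    x = v / V_C
    return 1.0 if x == 0.0 else x / math.tanh(x)


def differential_resistance_ratio(v):
    """eq. (27): r / R_o = cosh^2(V / V_c), with r = dV / dI."""
    return math.cosh(v / V_C) ** 2


def linear_resistance_ratio(v):
    """linear form R / R_o = 1 + |V| / V_c, shown for comparison."""
    return 1.0 + abs(v) / V_C


if __name__ == "__main__":
    for v in (0.1, 0.5, 1.0, 2.0, 4.0):
        print(v, current_a(v), direct_resistance_ratio(v),
              differential_resistance_ratio(v), linear_resistance_ratio(v))
```

at low voltage both ratios tend to unity; above $V_c$ the differential ratio grows far faster than the direct one, which stays close to the linear form.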
the potential divider rule between channel and contacts will make the lower-length resistor more resistive [21]. hence great care is needed to ascertain the critical voltage vc of the contact and channel regions. -5 0 5 -20 0 20 v (v) i (  a ) 4 k exp 100 k exp 200 k exp 4 k th 100 k th 200 k th tanh bandgap engineering of carbon allotropes 125 fig. 9 r-v characteristics of a cnt of length 1 m. markers and lines have same legend as in fig.7. the differential resistance r (eq. 27) rises sharply than the direct resistance r (eq. 26) fig. 8 makes it clear that both direct slope i/v giving inverse resistance r 1 and incremental slope dv/di giving differential (incremental) resistance r 1 decrease as voltage is increased ultimately reaching zero in the regime of saturation. however, in the work cited [4] the r is shown to decrease while conductance dv/di increases with applied voltage. it may be noted that incremental resistance increases almost exponentially as indicated in fig. 9 for v > vc and hence the curves are limited to a mv range to indicate superlinear surge of incremental resistance. direct resistance does follow the linear rise with applied voltage. the following observations are made consistent with the experimental data: 1. ohmic transport is valid so far the applied voltage across the length of the channel is below its critical value (v < vc). 2. the transition to nonlinear regime at the onset of critical electric field corresponding to energy gained in a mean free path is comparable to the thermal energy for nondegenerate statistics and fermi energy for degenerate statistics [21, 22]. 3. resistance surge effect in ballistic channels corroborate well with that observed by yao et. al [18] preceded by what was pointed out by greenberg and del alamo[23] in 1994. the surge in contact region will change the distribution of voltage between contacts and the channel.in this light, yao et. 
al [18] correctly conjectured that the measured resistance to be a combination of the resistance due to the contacts and the scattering-limited resistance of the cnt channel. the application of neadf in cnt [1] gives not only the comprehensive overview of metallic and semiconducting band structure of cnt, but also elucidates the rise of resistance due to the limit imposed on the drift velocity by the fermi velocity. 4. onset of quantum emission lowers the saturation velocity. however, if quantum is larger than the thermal energy, its effect on transport is negligible [22]. it is important to employ bose-einstein statistics [4] to phase-in the possible presence of acoustic phonon emissions in addition to optical phonons or for that matter photons as transitions are induced by transfer to higher quantum level induced by an electric field. the phonon emission, generalized to quantum emission with bose-einstein statistics, is effective in lowering the saturation velocity only if the energy of the quantum is higher than the thermal energy. quantum emission does not affect the ohmic mobility or for that matter ohmic resistance. -5 0 5 50 100 150 200 250 v (v) r ( k  ) r r 126 v. k. arora 5. conclusions carbon nano allotropes offer distinct advantage in meeting the expectations of more than moore era. the paper reviews the complete landscape as graphene is rolled into a cnt or cut into a gnr. the rollover effect in terms of metallic (m) and semiconducting sc1 and sc2 is distinctly new given the competing explanation for chiral and achiral cnts, making metallic cnt distinct from a semiconducting one. the same can be said of gnr where there is a complete absence of a semiconducting state. the neadf is unique for high-field applications as it seamlessly makes a transition from ohmic domain to nonohmic domain. it is the nonohmic domain that is not interconnected in the published literature, necessitating the use of hot-electron temperature. 
neadf clearly shows that hot-electron temperature is not necessary in description of high-field transport. the formalism presented connects very well the low-field and high-field regimes. the drive to reduce the size below the scattering-limited mfp is expected to eliminate scattering. this expectation goes against the experimental observation of rapid rise in the resistance [13, 21, 23-25]. this scattering-free ballistic transport, as is known in the literature, gives a resistance quantum h / 2q 2 = 12.9 k. however, if the length of cnt is larger than the scattering-limited mfp, the resistance will rise almost linearly. that is perhaps the reason that observed experimental resistance of 40.0 k as observed by yao et. al [18] exceeds its ballistic value for a 1  m resistor. in fact, greenberg and del alamo [23] have demonstrated that resistance surge in the parasitic regions degrades the performance of an ingaas transistor. to sum it up, explorations of new physical phenomena on this length scale require the contributions from many different fields of science and engineering, including physics, chemistry, biology, materials science, and electrical engineering. however, quantum physics forms the backbone of understanding at a nanoscale in biochemical sciences where most applications of nanoensemble is apparent. this review exhibits new phenomena at the interface between the microscopic world of atoms and the macroscopic world of everyday experience that occur at the nanoscale. such studies will undoubtedly lead to further applications with enormous benefit to society. acknowledgement: the paper is a part of the research done at the universiti teknologi malaysia (utm) under the utm distinguished visiting professor program and utm research university grant (gup) q.j130000.2623.04h32 of the ministry of education (moe). references [1] v. k. arora and a. bhattacharyya, "cohesive band structure of carbon nanotubes for applications in quantum transport," nanoscale vol. 
5, pp. 10927-10935, 2013. [2] a. k. geim and k. s. novoselov, "the rise of graphene," nature materials, vol. 6, pp. 183-191, mar 2007. [3] p. h. s. wong and d. akinwande, carbon nanotube and graphene device physics. cambridge: cambridge university press, 2011. [4] v. k. arora, m. l. p. tan, and c. gupta, "high-field transport in a graphene nanolayer," journal of applied physics, vol. 112, p. 114330, 2012. [5] y. h. wu, t. yu, and z. x. shen, "two-dimensional carbon nanostructures: fundamental properties, synthesis, characterization, and potential applications," journal of applied physics, vol. 108, p. 071301, oct 1 2010. [6] a. h. castro neto, f. guinea, n. m. r. peres, k. s. novoselov, and a. k. geim, "the electronic properties of graphene," reviews of modern physics, vol. 81, pp. 109-162, jan-mar 2009. bandgap engineering of carbon allotropes 127 [7] k. s. novoselov, s. v. morozov, t. m. g. mohinddin, l. a. ponomarenko, d. c. elias, r. yang, i. i. barbolina, p. blake, t. j. booth, d. jiang, j. giesbers, e. w. hill, and a. k. geim, "electronic properties of graphene," physica status solidi b-basic solid state physics, vol. 244, pp. 4106-4111, nov 2007. [8] m. l. p. tan, v. k. arora, i. saad, m. taghi ahmadi, and r. ismail, "the drain velocity overshoot in an 80 nm metal-oxide-semiconductor field-effect transistor," journal of applied physics, vol. 105, p. 074503, 2009. [9] i. saad, m. l. p. tan, a. c. e. lee, r. ismail, and v. k. arora, "scattering-limited and ballistic transport in a nano-cmos circuit," microelectronics journal, vol. 40, pp. 581-583, mar 2009. [10] v. k. arora, m. l. p. tan, i. saad, and r. ismail, "ballistic quantum transport in a nanoscale metaloxide-semiconductor field effect transistor," applied physics letters, vol. 91, p. 103510, 2007. [11] v. e. dorgan, m. h. bae, and e. pop, "mobility and saturation velocity in graphene on sio(2)," applied physics letters, vol. 97, p. 082112, aug 2010. [12] i. gierz, c. riedl, u. starke, c. r. ast, and k. 
kern, "atomic hole doping of graphene," nano letters, vol. 8, pp. 4603-4607, dec 2008. [13] v. k. arora and m. l. p. tan, "high-field transport in graphene and carbon nanotubes," presented at the international conference on electron devices and solid state circuits 2013 (edssc2013), ieeexplore digital library, hong kong polytechnic university, 2013. [14] v. k. arora, nanoelectronics: quantum engineering of low-dimensional nanoensemble. wilkes-barre, pa: wilkes university, 2013. [15] v. k. arora, d. c. y. chek, m. l. p. tan, and a. m. hashim, "transition of equilibrium stochastic to unidirectional velocity vectors in a nanowire subjected to a towering electric field," journal of applied physics, vol. 108, pp. 114314-8, 2010. [16] v. k. arora, m. s. z. abidin, m. l. p. tan, and m. a. riyadi, "temperature-dependent ballistic transport in a channel with length below the scattering-limited mean free path," journal of applied physics, vol. 111, mar 1 2012. [17] v. k. arora, m. s. z. abidin, s. tembhurne, and m. a. riyadi, "concentration dependence of drift and magnetoresistance ballistic mobility in a scaled-down metal-oxide semiconductor field-effect transistor," appl. phys. lett., vol. 99, p. 063106, 2011. [18] z. yao, c. l. kane, and c. dekker, "high-field electrical transport in single-wall carbon nanotubes," physical review letters, vol. 84, pp. 2941-2944, 2000. [19] v. k. arora, "ballistic transport in nanoscale devices," presented at the mixdes 2012 : 19th international conference mixed design of integrated circuits and systems, wasaw, poland, 2012. [20] l. s. tan, s. j. chua, and v. k. arora, "velocity-field characteristics of selectively doped gaas/alxga1xas quantum-well heterostructures," physical review b, vol. 47, pp. 13868-13871, 1993. [21] m. l. p. tan, t. saxena, and v. arora, "resistance blow-up effect in micro-circuit engineering," solidstate electronics, vol. 54, pp. 1617-1624, dec 2010. [22] v. k. 
arora, "theory of scattering-limited and ballistic mobility and saturation velocity in lowdimensional nanostructures," current nanoscience, vol. 5, pp. 227-231, may 2009. [23] d. r. greenberg and j. a. d. alamo, "velocity saturation in the extrinsic device: a fundamental limit in hfet's," ieee trans. electron devices, vol. 41, pp. 1334-1339, 1994. [24] v. k. arora, "quantum transport in nanowires and nanographene," presented at the 28th international conference on microelectronics (miel2012), nis, serbia, 2012. [25] t. saxena, d. c. y. chek, m. l. p. tan, and v. k. arora, "microcircuit modeling and simulation beyond ohm's law," ieee transactions on education, vol. 54, pp. 34-40, feb 2011. instruction facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp. 411 424 doi: 10.2298/fuee1403411d implementation of artificial neural networks based ai concepts to the smart grid  marko dimitrijević, miona andrejević stošović, jelena milojković, vančo litovski faculty of electronic engineering, university of niš, serbia abstract. ict and energy are two economic domains that became among the most influential to the growth of modern society. these, in the same time, due to exploitation of natural resources and producing unwanted effects to the environment, represent a kind of menace to the eco system and the human future. implementation of measures to mitigate these unwanted effects established a new paradigm of production and distribution of electrical energy named smart grid. it relies on many novelties that improve the production, distribution and consumption of electricity among which one of the most important is the ict. among the ict concepts implemented in modern smart grid one recognizes the artificial intelligence and, specifically the artificial neural network. 
here, after reviewing the subject and setting the case, we are reporting some of our newest results aiming at broadening the set of tools being offered by ict to the smart grid. we will describe our result in prediction of electricity demand and characterization of new threats to the security of the ict that may use the grid as a carrier of the attack. we will use artificial neural networks (anns) as a tool in both subjects. key words: smart grid, ict, artificial intelligence, ann, prediction, security. 1. introduction in our recent studies we addressed the problem of interaction of the ict and energy sector including the specific interrelation through the subject of security [1, 2]. most of the claims reported were later on confirmed in the literature as, for example, in [3, 4, 5, 6]. it is our intention here to report on some aspects of these interrelations and, via some new case studies, to demonstrate how much the modern energy distribution system may be supported by ict. in particular, we intend to emphasize the potential role of the artificial intelligence in improving the implementation of the new emerging concepts of production, consumption and distribution of electricity. the ict industry plays a vital role in the global economy and is a major driver of growth and development [3]. several of the most transformative economic trends (e.g., social media, big data, multi-channel retail, etc.) involve the use of ict.  received january 31, 2014; received in revised form june 5, 2014 corresponding author: miona andrejević stošović university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, (miona.andrejevic@elfak.ni.ac.rs) 412 m. dimitrijević, m. andrejević stošović, j. milojković, v. litovski in addition to its positive implications for economic growth, ict‟s greenhouse gasses (ghg) abatement potential must also be considered [3]. 
the ict industry accounted for 1.9% of total global ghg emissions in 2011, which is significantly less than its overall contribution to gdp. nonetheless, this is a significant amount of emissions that the industry must address, especially as we expect even faster adoption of ict in the future. however, in the last several years there have been promising strides toward decreasing the growth rate of ict emissions.

early on, sustainable ict focused on green ict initiatives that minimize the ecological impact of the development, management, use, and disposal of computing resources. this is called the first wave of sustainable ict [7]. green ict tends to be product-oriented and mostly focused on reducing energy costs and carbon emissions for data centres and desktops. several studies were reported on the energy footprint of computers and data centres [8, 9, 10]. as concerns about ict's impact on the environment have risen, these issues have become limiting factors in determining the feasibility of deploying new ict systems, even though processing power is widely available and affordable.

on the other side, the electric power sector went through revolutionary transformations that include deregulation, use of alternative energy sources, and introduction of ict. at the distribution level, the new requirements call for the development of:
- distribution grids accessible to distributed generation (dg) and renewable energy sources (ress), either self-dispatched or dispatched by local distribution system operators,
- distribution grids enabling local energy demand management interacting with the users through smart metering systems, and
- distribution grids that benefit transmission dynamic control techniques and overall level of power security, quality, reliability, and availability.

the key technology supposed to fulfil these requirements today is named smart grid.
smart grids and smart power systems in the energy sector can have major impacts on improving energy distribution and optimizing energy usage [11]. defining the smart grid in a concise way is not an easy task, as the concept is relatively new and various alternative components build up a smart grid. some authors even argue that it is “too hard” to define the concept [12]. looking at different definitions reveals that the smart grid has been defined in different ways by different organizations and authors. here is one of them: “a ‘smart grid’ is a set of software and hardware tools that enable generators to route power more efficiently, reducing the need for excess capacity and allowing two-way, real time information exchange with their customers for real time demand side management (dsm). it improves efficiency, energy monitoring and data capture across the power generation and transmission and distribution network” [13].

the need for implementation of ai within the smart grid has been recognized by the professional and scientific community [5, 14]. for example, the work in [15] surveys some of the most relevant applications of ann techniques to the field of energy systems. these applications serve a wide variety of purposes, such as modeling solar energy heat-up response [16], prediction of the global solar irradiance [17], adaptive critic design [18], or even security issues as reviewed in [19]. the idea behind these applications is based on learning how system performances can be related to certain input values, for instance, how weather conditions (solar or wind) determine the energy output that can be expected [20].

in the past decades anns have emerged as a technology with great promise for identifying and modeling data patterns that are not easily discernible by traditional methods. a comprehensive review of ann use in forecasting may be found in [21].
among the many successful implementations we may mention [22, 23, 24]. applications of anns for security purposes were discussed in [5, 6].

putting all together, at this moment, one may state that the ai concepts, and especially anns, may be implemented in the following aspects of the life of modern distributed energy resources.
• various forecasting tasks, like renewable energy forecasting, storage forecasting and demand forecasting, that need intelligent rules. we will address this issue later on.
• protection. being fault tolerant by nature, anns are most likely a very good means for localizing faults within a micro grid and, at the same time, capable of isolating it in case of a fault in the main grid.
• intelligent diagnosis of equipment in the micro grid. anns are a better option for diagnosing faults in electrical equipment for the following reasons:
  ◦ they can interpolate from previous learning and give a more accurate response to unseen data, making them better at handling uncertainty.
  ◦ they are fault tolerant, so they handle corrupt or missing data more effectively.
  ◦ they are good non-linear function approximators by nature, making them better at equipment diagnostics.
  ◦ they are more suitable for extracting the relationship between input and output in fault detection and diagnosis applications.
• demand side management. it appears that demand-side management technologies that simply rely on reacting to control or price signals will not be enough. rather, what is necessary are more sophisticated approaches that are truly adaptive to the state of the grid, that are able to learn the correct response given any particular situation, and that can look ahead and predict both supply and demand trends in the near future, in order to prepare for future reductions in available supply, or to make the most effective use of supply when it is available.
• intelligent data processing including data-mining.
the main challenge to be tackled in the smart grid comes from the vast amount of information involved in it. in contrast to traditional grids, in which the consumption metering information was only retrieved monthly, smart grids present a new scenario in which all the interconnected nodes are gathering information about many different matters, and not only consumption (e.g. real-time prices, peak loads, network status, power quality issues, etc.) [25]. in this sense, one of the main challenges for computational intelligence is how to intelligently manage such an amount of information so that conclusions and inferences can be drawn to support the decision making process.
• security. here we see the grid as a highly interconnected, vulnerable communication network exposed to all kinds of malicious cyber attacks such as eavesdropping, tampering and even jeopardizing the physical structure of the system.

the two case studies we are reporting here are interrelated by the fact that they both use artificial neural networks to improve the performance of the grid: the first (prediction) may be seen as a base for protection of the grid from overload, while the second is related to profiling the loads connected to the grid and protecting them from misuse. in addition, both solutions rely on the measured data generated by modern metering systems (ami/amr) [26, 27].

the paper is organized as follows. in the second section we give a brief review of anns and the structures we are using for interpolation and extrapolation. then, in the third section, the implementation of anns in load prediction related to next day peak-load forecasting is given. note that the method implemented here is general in the sense that we have applied it to other types of load prediction such as short, medium, and long term.
the implementation of the very same ann structures to the new eavesdropping method, related to the profiling of the loads (in this case a computer) connected to the grid, will be described in the fourth section.

2. a short review of the methods of ann implementation

we will first briefly introduce the feed-forward neural networks that will be used as a basic structure for prediction throughout this paper.

fig. 1 a fully connected feed-forward ann

the network is depicted in fig. 1. it has only one hidden layer, which has been proven sufficient for this kind of problem [28]. the indices in, h, and o in this figure stand for input, hidden, and output, respectively. for the set of weights, w(k, l), connecting the input and the hidden layer we have: k=1,2,..., m_in, l=1,2,..., m_h, while for the set connecting the hidden and output layer we have: k=1,2,..., m_h, l=1,2,..., m_o. the threshold is here denoted as θ_{x,r}, r=1,2,..., m_h or m_o, with x standing for h or o, depending on the layer. the neurons in the input layer simply distribute the signals, while those in the hidden layer are activated by a sigmoidal (logistic) function. finally, the neurons in the output layer are activated by a linear function. the learning algorithm used for training is a version of the steepest-descent minimization algorithm [29]. the initialization problem was solved according to the literature [30]. the number of hidden neurons, m_h, is of main concern. to get it we applied a procedure that is based on the procedures given in the literature [28, 31, 32].

for prediction purposes we developed two structures [33]. the first one was named time controlled recurrent (tcr). it is depicted in fig. 2. the second was named feed-forward accommodated for prediction (ffap). its structure is depicted in fig. 3. later on, these two structures were further elaborated, as discussed in the succeeding section.
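the basic structure of fig. 1 (signal-distributing inputs, one logistic hidden layer, linear outputs) can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' implementation: the function names, the nested-list layout w[k][l] of the weights, and the convention that the threshold is subtracted from the weighted sum are all assumptions made here, and training (steepest descent) is not shown.

```python
import math


def logistic(s):
    """Sigmoidal (logistic) activation used by the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, w_in_h, theta_h, w_h_o, theta_o):
    """Forward pass of a fully connected net with one hidden layer.

    x       : list of m_in input signals (input neurons only distribute them)
    w_in_h  : w_in_h[k][l] is the weight from input k to hidden neuron l
    theta_h : thresholds of the m_h hidden neurons
    w_h_o   : w_h_o[k][l] is the weight from hidden neuron k to output l
    theta_o : thresholds of the m_o output neurons
    Assumption: each threshold is subtracted from the weighted sum.
    """
    m_h = len(theta_h)
    m_o = len(theta_o)
    # hidden layer: logistic activation of (weighted sum - threshold)
    hidden = [
        logistic(sum(x[k] * w_in_h[k][l] for k in range(len(x))) - theta_h[l])
        for l in range(m_h)
    ]
    # output layer: linear activation of (weighted sum - threshold)
    return [
        sum(hidden[k] * w_h_o[k][l] for k in range(m_h)) - theta_o[l]
        for l in range(m_o)
    ]


# toy network (hypothetical values): 2 inputs, 2 hidden neurons, 1 output
y = forward([1.0, 2.0],
            [[0.0, 0.0], [0.0, 0.0]],  # input-to-hidden weights
            [0.0, 0.0],                # hidden thresholds
            [[1.0], [1.0]],            # hidden-to-output weights
            [0.0])                     # output threshold
```

with all input weights set to zero each hidden neuron outputs logistic(0) = 0.5, so the single linear output of the toy network is the sum of the two hidden outputs.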
it is worth mentioning that, in our opinion, for deterministic forecasting one always needs at least two predictions that support each other. since no knowledge of the forecasting outcome is available, the second prediction is the only means to corroborate the first one. having in mind, however, that both predictions carry the same uncertainty, we decided to accept the average of the two as the best final prediction.

fig. 2 time controlled recurrent (tcr) ann

fig. 3 the feed-forward accommodated for prediction (ffap) structure

3. prediction of peak-load at suburban level

electric load prediction is essential for power generation and operation [34]. it is vital in many aspects such as providing price effective generation, system security, and planning. among others, it enables: scheduling fuel purchases, scheduling power generation, planning of energy transactions, and assessment of system safety [35].

load forecast errors imply high extra costs: if the load is underestimated, one has extra costs caused by the damages due to lack of energy or by overloading system elements; if the load is overestimated, the network investment costs exceed the real needs, and the fuel stocks are overvalued, locking up capital investment.

in a smart grid context, prediction allows for developing computationally efficient learning algorithms that can accurately predict both the prosumers' (producer/consumer) consumption and generation profiles (instead of only the usage profile for a consumer) as well as the price of electricity in real time in order to inform profitable trading decisions.
given this, a number of researchers have suggested that more sophisticated tariffs, such as real-time pricing (rtp) or spot pricing (where the price per kwh of electricity consumed is different for each half-hour and is provided to the consumer a day, or a few hours, ahead of time), in conjunction with more sophisticated ‘agents’ that can autonomously respond to these price signals, would avoid this [36]. consequently, the quality of load forecasts greatly influences the economic planning in areas such as generation capacity, purchasing fuel, assessing the system's security, maintenance scheduling, and energy transmission [37].

the power load value is determined by several environmental and social factors. seasonal and daily profiles are the most apparent influences. temperature and air humidity are the primary parameters determining the energy consumption generally, and especially in urban residential areas. working times, holidays, and weekends are characterized by specific load profiles. environmental disasters, sudden increases of large loads or outages, and important social events further complicate the load-time function. all together, the load curve is a nonlinear function of many variables that map themselves into it in an unknown way.

in what follows, our newest results in the application of artificial neural networks (anns) for prediction of daily peak loads at suburban level will be presented.

3.1. problem formulation

we took data for the implementation of our method from the unite 1999 competition file [38]. the task was: given the peak values for the previous days, predict the peak-load value for the next day. according to studies of the behaviour of the consumers, in general, one may expect the peak-value to happen at about 19.00 hours. there are some exceptions, but these do not influence the general method we implement.
when speaking about the peak-value itself, one may recognize a regular periodicity with, unfortunately, some exceptions. fig. 4 represents the daily peak-value for one month (april 1997) extracted from [38]. note how difficult it is to recognize the periodicity of the phenomenon.

fig. 4 the daily peak-value for one month (april 1997) extracted from [38]

the problem may be stated as follows. given the series (t_k, f(t_k)), k=1,2,...,n, where t_k is the time instant (namely the day in the calendar), f(t_k) the peak-value at that day, and k the counter, the last known peak-value is at the n-th day. our task is to predict the peak-value at the (n+1)st day.

for the purpose of prediction in the subject of electricity we developed two ann structures named etcr and effap [39], which we implement simultaneously. the idea is the following: when predicting, one is making a step into the dark. if one wants to have any confidence in the prediction, one has to have at least two predictions that support each other. then, since both are of equal importance, instead of accepting one of them the average is calculated and stated as the final result. we give a rudimentary description of the etcr and effap anns in what follows.

for the verification of the method we undertook the task of predicting the daily peak-values in may 1997 and comparing them with the data given by the unite 1999 competition.

3.2. the etcr solution

the etcr ann structure tailored for the application at hand is depicted in fig. 5. the name stands for extended time controlled recurrent. it is a recurrent ann with two feed-back loops. the first one feeds back the peak-values of the most recent days, while the second feeds back the peak values from the two previous weeks, but of the same day in the week as the one to be predicted. in this way we implement two principles.
first, we claim that only the most recent values have influence on the current value and there is no need for a huge amount of useless data. second, one has to exploit the pseudo-periodic behaviour of the consumers, since the same days in the week have similar load profiles. the etcr is supposed to approximate the function:

y_i = f(i, y_{i-1}, y_{i-2}, y_{i-3}, y_{i-4}, y_{i-7}, y_{i-14})    (1)

where the samples are the daily-peak values. when progressing in time, i will raise its value by one.

fig. 5 etcr: extended time controlled recurrent according to (1)

fig. 6 the extended feed forward accommodated for prediction (effap) according to (2)

as the first test of the method we predicted the peak-value for april 30, 1997, which according to the unite 1999 data was 609 kw. the resulting ann had 7 input terminals, 2 output terminals, and 5 neurons in the hidden layer. after bringing a proper excitation we got as a prediction y={625.3241}, which is listed in table 1.

3.3. the effap solution

the effap ann tailored for the application at hand is depicted in fig. 6. the name stands for extended feed forward accommodated for prediction. it is a feed forward ann with three inputs, one of them being the time i, while the rest are the peak-values from the previous weeks. there are five outputs, each of them supposed to learn the same function but shifted in time by one day. the following set of functions approximates the phenomenon:

{y_{i+1}, y_i, y_{i-1}, y_{i-2}, y_{i-3}} = f(i, y_{i-6}, y_{i-13}).    (2)

of course, this network is approximating the very same function as the etcr does, but in a different manner.

as a result for april 30th 1997, the effap ann obtained after training had 3 input neurons, 5 output neurons, and 5 neurons in the hidden layer. after proper excitation the following prediction was obtained: y={653.2675}. the result is again listed in table 1.
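the way the samples enter (1) and (2), and the averaging of the two predictions quoted above for april 30, 1997, can be sketched as follows. this is a hedged illustration only: the helper names, the 0-based list `peaks` (with `peaks[k]` the peak of day k), and the use of the absolute relative error are assumptions made here, and the toy series is synthetic, not the unite data.

```python
def etcr_inputs(peaks, i):
    """Input vector of Eq. (1) for predicting day i:
    (i, y_{i-1}, y_{i-2}, y_{i-3}, y_{i-4}, y_{i-7}, y_{i-14})."""
    if i < 14:
        raise ValueError("Eq. (1) needs at least 14 earlier days")
    return [i] + [peaks[i - d] for d in (1, 2, 3, 4, 7, 14)]


def effap_sample(peaks, i):
    """Training pair of Eq. (2): inputs (i, y_{i-6}, y_{i-13}) and
    targets (y_{i+1}, y_i, y_{i-1}, y_{i-2}, y_{i-3})."""
    if i < 13 or i + 1 >= len(peaks):
        raise ValueError("day index out of range for Eq. (2)")
    inputs = [i, peaks[i - 6], peaks[i - 13]]
    targets = [peaks[i + 1], peaks[i], peaks[i - 1], peaks[i - 2], peaks[i - 3]]
    return inputs, targets


def percent_error(predicted, actual):
    """Absolute error relative to the actual value, in percent (assumed metric)."""
    return 100.0 * abs(predicted - actual) / actual


def combine(etcr_prediction, effap_prediction):
    """Final prediction: plain average of the two, as described in the text."""
    return 0.5 * (etcr_prediction + effap_prediction)


# synthetic toy series (hypothetical data, only to show the indexing)
peaks = list(range(100, 130))
x_etcr = etcr_inputs(peaks, 20)
x_effap, t_effap = effap_sample(peaks, 20)

# numbers quoted in the text for april 30, 1997
actual = 609.0
final = combine(625.3241, 653.2675)
errors = [round(percent_error(p, actual), 3) for p in (625.3241, 653.2675, final)]
```

the two inputs y_{i-6} and y_{i-13} of (2) are the peaks of the same weekday as the predicted day i+1, one and two weeks earlier.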
table 1 prediction of the peak-value consumption on april 30th 1997 of the unite data

no.  expected  etcr      etcr   effap     effap  average     average  hidden neurons
     value               %                %      prediction  %        etcr  effap
1    609       625.3241  2.68   653.2675  7.27   639.2958    4.975    5     5

3.4. overall solution

as stated above, the final solution to the prediction problem in our method is obtained by averaging the etcr and the effap predictions. it is shown in table 1, too. the result is encouraging.

to get a complete picture of the capabilities of the method we made a prediction for every day in may 1997. our first partial results were published in [40], while here we are giving complete results for the whole month, as shown in fig. 7. these allow for a real evaluation of the properties of the method.

by inspection of fig. 7 we conclude that the method proposed may be implemented for prediction of the peak-load at suburban level. the discrepancy between the actual and the predicted values is lower than 17% even in the worst case. in 22 out of 30 days the error was lower than 10%, while in 12 out of 30 days the error was lower than 5%.

fig. 7 error of prediction (y-axis) as a function of the day in the month may 1997 (x-axis)

4. a very specific view to the security

within the research of the behaviour of computers from the power consumption point of view [10], different software packages were implemented in order to create the energy profile of the computer under different “loading” conditions. we noticed, however, that not only the power consumed, but also the total harmonic distortion (thd) was dependent on the application running within the pc. so, table 2 contains all harmonics generated by one personal computer (dell optiplex 980, intel core i7 cpu @ 2.8ghz, 4gb ram, 500gb hdd) under different working conditions. approximately 50 harmonics were observed in a sample (200ms, 10000 samples) of a grid current.
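a 200 ms record of 10000 samples corresponds to a sampling rate of 50 khz. if one assumes a 50 hz mains frequency (an assumption made here, not stated in the text), the record holds exactly ten periods, so each harmonic falls on an exact dft bin. the sketch below shows one possible way to obtain harmonic amplitudes and the thd from such a record; the function names and the synthetic test current are hypothetical, and the authors' actual measurement procedure is the one described in [27], not this code.

```python
import cmath
import math


def harmonic_amplitudes(samples, sample_rate, fundamental, n_harmonics):
    """Return [dc, a_1, ..., a_n]: the mean value and the peak amplitudes of
    harmonics 1..n, each from a single-bin DFT at h * fundamental.
    Exact only if the record spans an integer number of fundamental periods."""
    n = len(samples)
    amps = [sum(samples) / n]  # dc component (signed mean)
    for h in range(1, n_harmonics + 1):
        step = -2j * math.pi * h * fundamental / sample_rate
        acc = sum(s * cmath.exp(step * k) for k, s in enumerate(samples))
        amps.append(2.0 * abs(acc) / n)
    return amps


def thd(amps):
    """Standard THD definition: rms of harmonics 2..n over the fundamental."""
    return math.sqrt(sum(a * a for a in amps[2:])) / amps[1]


# synthetic current (hypothetical): fundamental of 100 plus a 3rd harmonic of 20
SAMPLE_RATE = 10000 / 0.2   # 10000 samples in 200 ms -> 50 kHz
MAINS = 50.0                # assumed mains frequency in Hz
signal = [
    100.0 * math.sin(2 * math.pi * MAINS * k / SAMPLE_RATE)
    + 20.0 * math.sin(2 * math.pi * 3 * MAINS * k / SAMPLE_RATE)
    for k in range(10000)
]
amps = harmonic_amplitudes(signal, SAMPLE_RATE, MAINS, 5)
distortion = thd(amps)
```

for the synthetic current the fundamental and third-harmonic amplitudes are recovered and the thd is the ratio of the two, i.e. 0.2; a full 50-harmonic analysis only requires a larger `n_harmonics`.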
since even harmonics have incomparably smaller values than the odd ones, in table 2 only the dc, the main, and the odd harmonics are presented. fig. 8 illustrates two columns of table 2.

table 2 odd harmonics extracted from one string measurement in eight different states of the workstation

harm.  off         idle        video       cpu         gpu         multimedia  physical    file system
no.                                        arithmetic  rendering   cpu         disks       benchmark
       (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
dc     -0.55       -0.84       1.3         -0.52       -0.68       -1.3        -0.23       -0.51
1      89.7        400.26      475.4       785.73      747.73      394.33      381.54      411.72
3      3.05        47.9        54.03       34.6        35.84       47.79       48.05       47.73
5      8.55        23.18       23.52       28.7        28.42       22.83       23.53       24.14
7      8.94        11.41       12.3        17.43       16.77       9.74        6.96        9.61
9      3.08        9.19        7.7         10.12       9.26        9.17        8.63        9.5
11     8.76        6.17        7.24        12.27       11.13       6.12        5.36        5.53
13     2.77        1.4         1.73        6.01        5.81        1.99        2.49        2.96
15     6.28        9.81        12.19       5.98        6.84        9.32        9.94        8.92
17     4.81        3.66        5.1         8.91        9.9         5.6         3.76        3.71
19     0.69        4.16        5.05        5.74        5.68        3.3         5.75        7.31
21     0.92        7.39        6.52        4.89        5.12        6.65        5.55        5.29
23     0.62        5.17        7.15        6.06        7.19        5.55        4.56        4.3
25     0.53        4.12        6.2         5.86        4.63        4.6         5.2         4.76
27     0.94        5.18        8.31        2.29        1.28        4.2         3.07        6.35
29     0.62        6.61        6.35        2.94        4.3         5.85        4.93        6.26
31     0.54        4.89        3.64        2.54        3.61        4.98        3.96        5.16
33     1.08        7.58        5.23        4.48        3.67        7.84        8.2         7.34
35     0.47        3.98        2.72        1.71        1.59        4.27        4.17        2.94
37     0.45        2.61        2.09        0.51        0.93        2.98        3.19        2.2
39     0.58        3.9         2.83        2.94        3.55        3.97        4.7         2.81
41     0.54        1.29        0.97        1.26        0.56        1.54        0.96        1.11
43     0.24        1.28        0.46        1.24        0.67        1.39        1.24        1.82
45     0.27        1.91        0.85        1.44        1.79        2.2         1.93        1.77
47     0.39        0.94        0.98        0.34        0.48        0.55        0.9         1.03
49     0.21        0.36        0.53        1.95        1.78        0.7         1.34        0.95

fig. 8 measured odd harmonics in two cases: physical disc drive active and cpu loaded by arithmetic computations. the first harmonic is omitted for convenience

the question is: what would this table have to do with security? there are many security issues related to the grid.
among them, the most vulnerable subsystem, looking from the ict point of view, is the advanced metering infrastructure (ami). while it could bring significant benefits, it is potentially subject to security violations such as tampering with software in the meters, eavesdropping on its communication links, or abusing the copious amount of private data the new meters are able to collect. in addition to securing market sensitive data from competitors, information systems for the power grid need to defend against malicious attacks [41] that intend to harm the power grid as a whole. the more comprehensive an information system becomes, the greater the consequences of a successful attack and thus the greater the need for security measures.

one of the ways of eavesdropping on a home, an office, or a company is monitoring the power consumption and creating an energy profile of the subject [42]. with this information a large number of malicious actions can be undertaken, such as burglaries and other damaging security breaches. here we expose an additional way of eavesdropping, in which the harmonic structure of the current drawn from the grid is the basis for information on the activities within a home or an office.

the problem will be illustrated on the example depicted in table 2. here the pc takes the role of the whole entity that is under surveillance. we will show in what follows how one can precisely find the state in which the computer is, based on measurements of the supply current taken by its ac/dc converter from the grid. note that, in the example depicted in table 2, power factor correction was applied within the converter.

while there are several possibilities that allow information to be extracted from table 2 about the state in which the computer is, here we will use anns. an ann was trained to create a response recognizing which one of the sets of harmonics of table 2 is present at its input. its structure is depicted in fig. 9.
to simplify, for the proper vector of harmonics, the corresponding output of the ann was forced to unity while the rest of the outputs were kept at zero. in other words, it was trained to recognize which software was running within the computer. full success was achieved: after training, the ann classified perfectly.

fig. 9 artificial neural network that eavesdrops on the personal computer based on information on harmonics in its mains current

to make the problem harder, i.e. to introduce the possible variations due to measurement errors, we transformed table 2 so that every entry was recalculated by the formula

x_{new} = x \cdot [1 + (2 \cdot rnd - 1) \cdot 0.025],    (3)

where rnd is a pseudo-random number with uniform distribution within the [0,1] segment. in other words, a “noise” of amplitude (peak-to-peak) as large as 5% of the harmonic value was added as “measurement disturbance”. again, as can be seen from table 3, excellent classification was obtained.
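formula (3) is easy to reproduce. the following sketch applies it to a vector of harmonics; the function names and the seeded generator are assumptions made for illustration, and the three sample values are simply the first entries of the “idle” column of table 2.

```python
import random


def add_measurement_noise(x, rng=random):
    """Eq. (3): x_new = x * [1 + (2*rnd - 1) * 0.025], rnd uniform on [0, 1].
    The factor (2*rnd - 1) lies in [-1, 1], so the disturbance is at most
    +/-2.5% of x, i.e. 5% peak-to-peak."""
    return x * (1.0 + (2.0 * rng.random() - 1.0) * 0.025)


def perturb(vector, rng=random):
    """Apply Eq. (3) independently to every entry of a vector of harmonics."""
    return [add_measurement_noise(x, rng) for x in vector]


# harmonics 1, 3 and 5 of the "idle" state, taken from Table 2
idle = [400.26, 47.9, 23.18]
noisy_idle = perturb(idle, random.Random(0))  # seeded only for repeatability
```

each call draws a fresh pseudo-random number per entry, so repeated calls give different noisy copies of the same column, all within 2.5% of the original values.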
table 3 responses of the ann to noisy input data (rows: input vector; columns: ann's output; each response row is followed by its target row)

input             off           idle          video         cpu           gpu           multimedia    physical      file system
vector                                                      arithmetic    rendering     cpu           disks         benchmark
(1)     response  0.94189       -0.00826428   -4.98446e-05  0.0596502     0.00545632    -2.68923e-05  0.00254522    0.00128351
        target    1             0             0             0             0             0             0             0
(2)     response  -0.100789     0.936809      -6.30066e-05  0.107029      -0.00390563   -4.56815e-05  0.0353001     0.0301201
        target    0             1             0             0             0             0             0             0
(3)     response  0.0747284     -0.0347075    1.00742       -0.0946782    0.0368009     6.60139e-06   0.0172143     -0.00950488
        target    0             0             1             0             0             0             0             0
(4)     response  0.0530374     -0.00513355   -3.01133e-05  0.94394       0.00599003    4.07148e-06   -0.00314594   0.0039932
        target    0             0             0             1             0             0             0             0
(5)     response  -0.0714551    0.141341      0.000249561   0.347383      0.694706      2.93044e-05   -0.0165517    -0.0935344
        target    0             0             0             0             1             0             0             0
(6)     response  -0.0390391    -0.068559     -2.64327e-05  0.0464038     -0.0182126    0.994595      0.0357881     0.0513166
        target    0             0             0             0             0             1             0             0
(7)     response  0.0221675     -0.0245939    -7.75624e-06  -0.0287134    0.0235965     -8.00252e-07  1.01758       -0.010466
        target    0             0             0             0             0             0             1             0
(8)     response  0.0524894     -0.0178626    -6.26366e-05  -0.0587603    0.0177179     1.40437e-06   0.00103932    1.00386
        target    0             0             0             0             0             0             0             1

finally, eight new sets of “harmonics” were created artificially by permutations within the rows in table 2, and the newly created columns were used as excitation to the ann. none succeeded in deceiving the network.

to conclude, there are robust classification mechanisms whose implementation may give a malicious attacker, having a sophisticated tool based on current monitoring, an opportunity to monitor every activity within a computer and, in general, a data centre or similar. note that the spectrum of a current taken by a household is not much more complicated than that of the computer, since the main consumers in the household are linear loads and do not generate additional harmonics. from that point of view, we consider our method applicable to a broader list of situations than just a computer.

5. conclusion

the modern electricity distribution system is gradually evolving into a very large and very complex structure in which ict plays an increasingly important role. it is nowadays most frequently referred to as the smart grid. there is an almost unlimited number of possible applications of ict subsystems within the smart grid, and one cannot say that the smart grid is a fixed structure whose capabilities are finally set. a special offer of the ict to the smart grid is artificial intelligence, and particularly artificial neural networks.

here we present our attempts to contribute to the development of the smart grid toward an advanced, reliable and secure system. the case studies reported are part of the same project, since the same methodology is implemented and they consider two important and interrelated aspects: the profiling of the load and the protection of the grid. in particular, we discussed some of the most recent results produced within the laboratory for electronic design automation at the university of niš, serbia, which are related to load prediction at suburban level and to a new way of cyber-attack on the ict connected to the grid. both results are based on our own methodology of measurements and our own concepts of implementation of anns.

as for the load prediction, it is worth mentioning that the results reported are part of a set of implementations of our concept to short term [43], medium term [40], and long term [44] prediction of electricity loads. when appropriate, e.g. for short term prediction, a real-time implementation of the prediction was carried out [39].

the results related to profiling the computer as seen from the grid, however, are brand new and will be further elaborated and applied to more complex computer loads such as data centres or company networks.

acknowledgement: this research was partly funded by the ministry of education, science and technological development of republic of serbia under contract no tr32004.
references
[1] v. litovski, p. petković, “why the power grid needs cryptography?”, proc. of the symposium on industrial electronics indel 2008, banja luka, 06.11.-08.11., 2008, pp. 75-81. reprinted in: electronics, issn 1450-5843, vol. 13, no. 1, june 2009, pp. 30-36.
[2] m. dimitrijević, j. milojković, s. bojanić, o. nieto-taladriz, and v. litovski, “ict and power: new challenges and solutions”, int. j. reasoning-based intelligent systems, vol. 5, no. 1, 2013, pp. 32-41. publisher: inderscience enterprises, issn: 1755-0556, e-issn: 1755-0564.
[3] -, “gesi smarter 2020: the role of ict in driving a sustainable future”, the boston consulting group, http://gesi.org/smarter2020.
[4] s. iyer, “cyber security for smart grid, cryptography, and privacy”, hindawi publishing corporation, int. j. of digital multimedia broadcasting, vol. 2011, article id 372020, 8 pages.
[5] w. wang, and z. lu, “cyber security in the smart grid: survey and challenges”, computer networks, vol. 57, pp. 1344–1371, 2013.
[6] f. aloul, a. r. al-ali, r. al-dalky, m. al-mardini, and w. el-hajj, “smart grid security: threats, vulnerabilities and solutions”, international journal of smart grid and clean energy, vol. 1, no. 1, pp. 1-6, 2012.
[7] r. harmon, h. demirkan, “the next wave of sustainable it”, it professional, vol. 13, no. 1, pp. 19-25, jan./feb. 2011, doi:10.1109/mitp.2010.140.
[8] -, “electricity consumption and efficiency trends in the enlarged european union”, institute for environment and sustainability, 2007, http://www.eubusiness.com/topics/energy/electricity-jrc.bk/
[9] a. p. bianzino, a. k. raju, d. rossi, “greening the internet: measuring web power consumption”, it pro, january/february 2011, published by the ieee computer society, pp. 48-53.
[10] o. nieto, et al., “energy profile of a personal computer”, proceedings of the lvi conf.
of etran, zlatibor, serbia, june 2012, isbn 978-86-80509-67-9, proc. on a disc, paper el3.3-1-4.
[11] r. adam, w. wintersteller, from distribution to contribution. commercializing the smart grid, booz & company, munich, 2008.
[12] j. miller, “the smart grid – how do we get there?”, smart grid news, june 26, 2008. http://www.smartgridnews.com/
[13] -, “smart 2020: enabling the low carbon economy in the information age”, climate group, gesi 2008, www.theclimategroup.org/assets/resources/publications/smart2020report.pdf.
[14] d. ramchurn, p. vytelingum, a. rogers, and n. r. jennings, “putting the 'smarts' into the smart grid: a grand challenge for artificial intelligence”, communications of the acm, vol. 55, no. 4, april 2012.
[15] s. kalogirou, k. metaxiotis, and a. mellit, “artificial intelligence techniques for modern energy applications”, intelligent information systems and knowledge management for energy: applications for decision support, usage, and environmental protection, igi global, pp. 1-39, 2010.
[16] s. kalogirou, c. neocleous, and c. schizas, “artificial neural networks for modelling the starting up of a solar steam generator”, applied energy, vol. 60, pp. 89–100, 1998.
[17] p. l. zervas, h. sarimveis, j. a. palyvos, n. g. c. markatos, “model-based optimal control of a hybrid power generation system consisting of photovoltaic arrays and fuel cells”, journal of power sources, vol. 181, pp. 327–338, 2008.
[18] p. j. werbos, “approximate dynamic programming for real time control and neural modelling”, in d. a. white and d. a. sofge (eds.), handbook of intelligent control, van nostrand reinhold, new york, 1992, pp. 493-525.
[19] y. mansour, e. vaahedi, m. a. el-sharkawi, “dynamic security contingency screening and ranking using neural networks”, ieee trans. power syst., vol. 8, no. 4, pp. 942–950, july 1997.
[20] d. riley, g. k. venayagamoorthy, “characterization and modeling of a grid connected photovoltaic system using a recurrent neural network”, in proc.
ieee int. joint conf. neural networks, san jose, ca, july 31–aug. 5, 2011.
[21] g. zhang, b. e. patuwo, and m. y. hu, “forecasting with artificial neural networks: the state of the art”, international journal of forecasting, vol. 14, no. 1, pp. 35-62, march 1998.
[22] j. g. m. zade, and r. noori, “prediction of municipal solid waste generation by use of artificial neural network: a case study”, int. j. environmental research, vol. 2, no. 1, pp. 13-22, 2008.
[23] s. canu, y. grandvalet, and x. ding, “one step ahead forecasting using multilayered perceptron”, working paper de l'universite de technologie de compiegne.
[24] j. connor, and r. douglas martin, “recurrent neural networks and robust time series prediction”, ieee trans. on neural networks, vol. 5, no. 2, pp. 240-254, march 1994.
[25] y. simmhan, s. aman, b. cao, m. giakkoupis, a. kumbhare, q. zhou, d. paul, c. fern, a. sharma, v. prasanna, “an informatics approach to demand response optimization in smart grids”, technical report, computer science dept., usc, 2011.
[26] c. king, “advanced metering infrastructure (ami) overview of system features and capabilities”, emeter corporation, https://www.smartgrid.gov/sites/default/files/doc/files/overview_ami_system_features_capabilities_200405.pdf
[27] m. dimitrijević, and v. litovski, “power factor and distortion measuring for small loads using usb acquisition module”, journal of circuits, systems, and computers, vol. 20, no. 5, pp. 867-880, august 2011.
[28] t. masters, practical neural network recipes in c++, academic press, san diego, 1993.
[29] z. zografski, “a novel machine learning algorithm and its use in modeling and simulation of dynamical systems”, proc. of 5th annual european computer conference, ieee compeuro'91, bologna, italy, pp. 860-864, 1991.
[30] t. denoeux and r. lengelle, “initializing back propagation networks with prototypes”, neural networks (pergamon press), vol.
6, pp. 351-363, 1993.
[31] g.-b. huang and h. a. babri, “upper bound on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation function”, ieee trans. on neural networks, vol. 9, pp. 224-228, 1998.
[32] e. b. baum and d. haussler, “what size net gives valid generalization”, neural computation, vol. 1, pp. 151-160, 1989.
[33] j. milojković, and v. litovski, “comparison of some ann based forecasting methods implemented on short time series”, 9th symposium on neural network applications in electrical engineering, neurel-2008, pp. 179-179, belgrade, serbia, 2008.
[34] h. m. al-hamadi, s. a. soliman, “short-term electric load forecasting based on kalman filtering algorithm with moving window weather and load model”, electric power systems research, vol. 68, no. 1, 2004, pp. 47-59.
[35] s. tzafestas, and e. tzafestas, “computational intelligence techniques for short-term electric load forecasting”, journal of intelligent and robotic systems, vol. 31, no. 1-3, 2001, pp. 7-68.
[36] f. schweppe, b. daryanian, and r. tabors, “algorithms for a spot price responding residential load controller”, power engineering review, vol. 9, no. 5, pp. 49–50, 1989.
[37] f. liu, r. d. findlay, q. song, “a neural network based short term electric load forecasting in ontario canada”, in int. conf. on computational intelligence for modelling control and automation, and int. conf. on intelligent agents, web technologies and internet commerce, (cimca-iawtic'06), 2006, pp. 119–125.
[38] worldwide competition within the eunite network. (2001). [online] available: http://neuron.tuke.sk/competition
[39] j. milojković, v. litovski, “dynamic one step ahead prediction of electricity loads at suburban level”, proc. of the first ieee int. workshop on smart grid modeling and simulation – at ieee smartgridcomm 2011, sgms2011, brussels, october 2011, proc. on disc, paper no. 25.
[40] j. milojković, v.
litovski, “one day ahead peak electricity load prediction”, ix symposium industrial electronics, indel 2012, banja luka, november 2012, pp. 261-267. [41] f. cleveland, “iec tc57 security standards for the power system information infrastructure beyond simple encryption”, june 2007. iec tc57 wg15 security standards white paper ver. 11. http://www.xanthus-consulting.com/pages/publications.htm [42] m. andrejević stošović, m. dimitrijević, v. litovski, “computer security vulnerability seen from the electricity distribution grid side”, applied artificial intelligence, taylor & francis ltd., 2014, accepted for publication. [43] j. milojković, and v. litovski, “new ann models for short term forecasting of electricity loads”, proc. of the 7th eurosim congress on modelling and simulation vol.2: full papers (cd), czech technical university in prague, faculty of electrical engineering, dept. of computer science and engineering, prague, czech republic isbn 978-80-01-04589-3, september 2010. [44] j. milojković, v. litovski, o. nieto-taladriz, and s. bojanić, “forecasting based on short time series using anns and grey theory – some basic comparisons”, in proc. of the 11th int. work-conference on artificial neural networks, iwann 2011, june 2011, torremolinos-málaga (spain). j. cabestany, i. rojas, and g. joya (eds.): part i, lncs 6691, pp. 183–190, 2011, © springer-verlag berlin heidelberg 2011. issn: 0302-9743. instruction facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp. 399 410 doi: 10.2298/fuee1403399c effective combining of color and texture descriptors for indoor-outdoor image classification  stevica cvetković 1 , saša v. nikolić 1 , slobodan ilić 2 1 university of niš, faculty of electronic engineering, niš, serbia 2 technische universitat münchen (tum), munich, germany abstract. 
although many indoor-outdoor image classification methods have been proposed in the literature, most of them omit a comparison with basic methods that would justify the need for complex feature extraction and classification procedures. in this paper we propose a relatively simple but highly accurate method for indoor-outdoor image classification, based on a combination of carefully engineered mpeg-7 color and texture descriptors. in order to determine the optimal combination of descriptors in terms of fast extraction, compact representation and high accuracy, we conducted comprehensive empirical tests over several color and texture descriptors. the descriptor combination was used for training and testing of a binary svm classifier. we show that proper descriptor preprocessing before svm classification has a significant impact on the final result. comprehensive experimental evaluation shows that the proposed method outperforms several more complex indoor-outdoor image classification techniques on two public datasets.

key words: feature extraction, image classification, image color analysis, image edge detection, support vector machines.

1. introduction

indoor-outdoor image classification is a problem that attracts considerable attention from researchers working on content-based image retrieval [1]. it is a restricted case of the general image classification problem, and its result serves as the basis for deciding on further processing steps depending on the scene type. for instance, the assumption that indoor and outdoor images are usually taken under different illumination conditions can be used to choose the subsequent color correction approach [2]. furthermore, indoor-outdoor classification can be exploited in many image processing applications such as image orientation detection [3], image retrieval [4], or robotics [5].
received january 28, 2014; received in revised form march 28, 2014
corresponding author: stevica cvetković, university of niš, faculty of electronic engineering (e-mail: stevica.cvetkovic@elfak.ni.ac.rs)

several approaches for indoor-outdoor image classification have been proposed in the literature so far. recent methods show an evident trend towards introducing additional information about the camera or the scene [6], [7]. however, this kind of information, which includes exposure time, object distance or flash-fired status, is commonly unavailable to the system. there is also evident use of domain-specific assumptions about indoor-outdoor scenes, such as the presence of sky or grass in outdoor images [8], [9], or intentional favoring of specific image partitions [10]. in our work, the goal is to define an effective and efficient method for indoor-outdoor image classification based on standardized low-level descriptors and proven machine learning techniques. another goal is to achieve sufficient generality of the method and to avoid introducing domain-specific knowledge about the scene. to this end we propose a carefully engineered procedure for the composition of mpeg-7 color and texture descriptors characterized by efficient extraction, compact representation and high discriminative power. we conducted comprehensive empirical tests to determine the optimal combination of descriptors for this purpose. after an important preprocessing procedure, the combined descriptors are fed to the input of an optimally tuned support vector machine (svm) classifier. we have empirically shown a large impact of feature preprocessing before svm on the accuracy of the system. experimental evaluation will show that the proposed method outperforms some more complex state-of-the-art indoor-outdoor classification techniques on two standard datasets.
to be successfully applied for the image classification task, an image descriptor (feature) should be highly discriminative and invariant against image content [11]. such a descriptor should generate features with high variance and good distribution over category samples. in addition, it should be robust to different levels of image quality and resolution. there is a large collection of visual descriptors available in the literature with corresponding strengths and weaknesses [12], [13]. in order to provide standardized descriptors of image and video content, mpeg-7 standard defines three classes of still image visual descriptors: color, texture and shape descriptors [14], [15]. each class of visual features characterizes only a certain aspect of image content, so the combination of features is necessarily employed to provide an appropriate description of image content. we performed exhaustive experiments on combining several mpeg-7 color and texture descriptors which were carefully chosen to meet the requirements of fast calculation, compact representation and high discriminative power. once features have been extracted, method for automatic image classification should be applied. approaches for image classification can be roughly grouped in two categories: (a) learning-based methods that are able to learn optimal parameters based on input training samples. these methods include svm [6], [8], [10], [16]-[18], neural networks [19], [20], decision trees [2], hidden markov models [36], etc. (b) non-parametric methods that perform classification directly on the data, without learning the parameters. the most widely used non-parametric method is k-nearest neighbors (k-nn) which determines image class based on the class of its most similar images [4], [21], [22]. 
although non-parametric methods require no learning steps and are able to naturally handle a large number of classes, they often suffer from high variation along the decision boundary caused by finite sampling in terms of bias-variance decomposition [23]. as a consequence, their accuracy could be inferior compared to learning-based methods [24]. in addition, the processing time of non-parametric methods is considerably larger than that of learning-based methods, which makes them inconvenient for large-scale classification systems. for our purpose we propose to apply a binary svm image classifier on preprocessed feature vectors formed by a combination of several visual descriptors. we applied classification over low-level features only, i.e. features that can be automatically extracted without any a priori knowledge of the image content.

although our paper does not involve fundamentally new procedures, it has several main contributions: (1) it empirically shows that a baseline method for indoor-outdoor image classification can reach satisfactorily good results without using complex image descriptors and sophisticated machine learning techniques, (2) it gives an extensive review of the indoor-outdoor image classification topic, which is lacking in the literature, and (3) it provides a comprehensive statistical analysis of descriptor combination with feature scaling methods in order to demonstrate the importance of each component.

in the rest of the paper we first give an overview of the previous research in the field of indoor-outdoor image classification. in section 3, the reasons for choosing specific mpeg-7 descriptors are explained, including a brief overview of feature extraction procedures as well as the strategy for their combination. then, details of the svm image classification method are presented, including a description of the feature preprocessing step and svm parameter selection.
finally, testing methodology and results are presented and discussed. 2. previous research the research on indoor-outdoor image classification can be traced back to the work of szummer and picard [21] who applied a two-stage classification approach on features that combine ohta color space histogram and multi-resolution simultaneous autoregressive model (msar). at the first stage, they used k-nn to classify subblocks of the image, while the final decision was based on the majority rule. the accuracy of 90.3% is achieved on a set of over 1300 consumer images. in a similar approach, serrano et al. [18], extracted lst color histogram and wavelet texture features for classification of image sub-blocks. they used linear svm classifiers to train color and texture features separately. recognition rate was 90.2%, on a set of 1200 images. in [4], indoor-outdoor classification is proposed at the highest level of a hierarchical image classification method. color moments in the luv color space were computed for 10x10 image subblocks. concatenation of the feature vectors of all subblocks produced final feature vector. finally, k-nn classifiers have been evaluated on a database of 6931 vacation photographs achieving accuracy of 90.5%. straight edges were used as a feature in a method proposed in [22]. the authors claim that the proportion of straight edges in indoor images is larger in comparison to outdoor images. the final classification of the image is based on a k-nn rule applied to the proportion of straight edges contained in sub-blocks of the image. in addition, a multi-resolution estimates are used to improve the results. tests conducted on a set of 872 photographs, reported classification accuracy of 90.71%. gupta et al. [19] use a fuzzy clustering method to initially segment an image into sub-regions. segments are then described using simple color, texture, and shape features. 
the probabilistic neural network is finally applied for the classification, with a reported accuracy of 92.36% on a benchmark set of 902 images. indoor-outdoor classification is used to improve automatic illuminant estimation in [2]. the feature vector consists of color, texture and edge information. decision forests of classification and regression trees are used for classification. testing was performed on a collection of 6785 images, downloaded from the web or acquired by digital cameras. they reported a classification accuracy of 93.1%.

table 1 chronological overview of the published methods for indoor-outdoor image classification

authors                  year   classifier        accuracy (in %)   number of images
szummer & picard [21]    1998   knn               90.30             1343
vailaya et al. [4]       2001   knn               90.50             6931
serrano et al. [18]      2004   svm               90.20             1200
serrano et al. [8]       2004   svm+bayesian      90.70             1200
payne & singh [22]       2005   knn               90.71             872
boutell & luo [6]        2005   svm+bayesian      94.10             5120
liu et al. [7]           2005   lda+boosting      92.20             13000
lu et al. [9]            2005   gmm+lda           93.80             1400
gupta et al. [19]        2007   neural network    92.36             902
bianco et al. [2]        2008   decision forest   93.10             6785
kim et al. [10]          2010   svm               90.26             1276

all of the previously mentioned methods are concerned only with low-level features that are extracted directly from digital images, with no impact of human perception. in addition, several methods based on high-level image information, i.e. semantic assumptions about the scene, have been proposed. the authors of [18] extended their approach in [8] by introducing semantic detectors for grass and sky to improve the classification accuracy. the results from the svm sub-block classification and the semantic detectors were integrated using a bayesian classifier. a classification accuracy of 90.7% was reported on a set of 1200 consumer images.
boutell and luo [6] proposed a fusion of low-level image information with the camera metadata provided in the exchangeable image file format (exif), such as exposure time, flash fired and subject distance. first, they applied svm to classify images by low-level color and texture features. then, a bayesian network was used to classify low-level features integrated with exif metadata. on a benchmark set of 5120 images, the reported accuracy was 94.1%. in a similar approach [7], the combination of color moments and edge direction histogram was extracted as low-level features. to improve the classification accuracy, the authors utilized 14 exif features associated with the images. the linear discriminant analysis (lda) algorithm was utilized to implement linear combinations of all extracted features. finally, the combined features are used together with the original features in a boosting classification algorithm. on a large set of about 13000 digital photographs, they achieved 92.2% accuracy. the authors of [9] first trained gaussian mixture models (gmm) to describe the color-texture properties of image patches for 20 predefined materials (building, blue sky, bush, etc.). these models are then applied to a test image to produce 20 probability density response maps, which are later used to train lda classifiers for scene categories. a database of 1400 photos taken by 43 persons was used for testing. the indoor-outdoor classification rate was 93.8%. in [10], the authors assume that foreground objects (human bodies and faces), which often appear in the central part of the image, may negatively affect the system performance. they partitioned the image into five blocks and extracted an edge and color orientation histogram (ecoh) for each block. then, the features are weighted according to the block positions (the central part is weighted less) and concatenated to generate the final feature vector. an svm classifier evaluated on 1276 images obtained a 90.26% classification rate.
3. feature extraction and combination

statistical analysis of global visual descriptors for image retrieval in [11], [12] has shown a large overlap in the information they extract. although mpeg-7 suggests a number of different descriptors, most of them are highly dependent on each other [11]. on the other hand, some of them, like the dominant color descriptor (dcd), are too computationally expensive for practical applications. to extract features with compact representation and low computational cost, we considered the following five mpeg-7 descriptors: scalable color descriptor (scd), color structure descriptor (csd), color layout descriptor (cld), homogeneous texture descriptor (htd), and edge histogram descriptor (ehd). in order to test their behavior and select the best descriptors for combining, we have conducted exhaustive experiments. we will show that a combination of only a few of them is sufficient for successful indoor-outdoor image classification. first, we give a brief overview of the selected descriptors and describe the method that we used for their combination. further details about descriptor extraction procedures can be found in [14], [25], [26].

color descriptors

scalable color descriptor (scd) measures color distribution over an entire image. it is a histogram in the hsv color space that is encoded using the haar transform. the histogram is extracted in hsv space uniformly quantized to 16 levels of h, 4 levels of s and 4 levels of v, giving 256 bins in total. the bin values are truncated into an 11-bit integer representation and non-linearly mapped into a 4-bit representation. this representation gives a higher significance to smaller values with high probability. to reduce the size of this representation, the histogram values are encoded using the haar transform.
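as a minimal sketch of the uniform 16 x 4 x 4 quantization described above, the following python snippet maps an hsv pixel to one of the 256 histogram bins. the value ranges (h in degrees, s and v in [0, 1]), the bin ordering and the function names are assumptions made for illustration; the snippet omits the non-linear 4-bit mapping and the haar encoding, and it is not the normative mpeg-7 extraction procedure.

```python
def hsv_bin(h, s, v, h_levels=16, s_levels=4, v_levels=4):
    """Map an HSV triple to a bin index of a 16 x 4 x 4 = 256-bin histogram.

    Assumptions (not taken from the paper): h is given in degrees,
    s and v lie in [0, 1], and bins are ordered h-major, then s, then v.
    """
    h = h % 360.0
    # min(...) keeps the upper boundary value (e.g. s == 1.0) in the last level
    h_idx = min(int(h / 360.0 * h_levels), h_levels - 1)
    s_idx = min(int(s * s_levels), s_levels - 1)
    v_idx = min(int(v * v_levels), v_levels - 1)
    return (h_idx * s_levels + s_idx) * v_levels + v_idx


def hsv_histogram(pixels):
    """Count pixels per bin; `pixels` is an iterable of (h, s, v) triples."""
    hist = [0] * 256
    for h, s, v in pixels:
        hist[hsv_bin(h, s, v)] += 1
    return hist
```

every pixel falls into exactly one bin, so the histogram entries sum to the number of pixels regardless of image resolution.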
the scd representation is scalable in terms of the number of coefficients, varying from 16 up to 256. our experiments have shown that using more than 64 coefficients does not necessarily lead to a significant accuracy improvement. therefore, we used the following representation of the descriptor:

$$f^{scd} = (f_1^{scd}, \ldots, f_{64}^{scd}) \qquad (1)$$

color structure descriptor (csd) extends the image color histogram with information about the local spatial structure of the color. it is based on the concept of a color structure histogram, which counts the number of times a particular color is contained within the 8x8 window as the window scans over the image. an mpeg-7 specific color space, denoted as hmmd [14], is used for the extraction. it is first non-uniformly quantized into n colors [27], determining the number of bins in the color structure histogram. then, the window scans over the entire image, and for each color which is present within the window, it increments the corresponding histogram bin. finally, histogram values are normalized and non-linearly quantized to 8 bits/bin. in our experiments we used a csd containing 64 bins:

$$f^{csd} = (f_1^{csd}, \ldots, f_{64}^{csd}) \qquad (2)$$

color layout descriptor (cld) has been designed to efficiently represent the spatial layout of colors inside an image. it is obtained by applying the discrete cosine transformation (dct) on local representative colors of 64 image blocks in the ycbcr color space. the descriptor is characterized by compact representation, invariance to resolution changes and low computational complexity. the extraction process starts with partitioning each rgb color channel of the image into 8x8=64 non-overlapping blocks to guarantee resolution invariance. then, a single representative color is computed for each block by simple pixel averaging. in the next step, conversion to the ycbcr color space is done and the color channels are transformed by dct to obtain three sets of 64 dct coefficients.
finally, the zigzag-scanned dct coefficients are concatenated into a feature vector containing the most informative elements of each ycbcr color channel. our rough experiment has shown that a feature vector with 22 elements (10 for y, 6 for cb and 6 for cr) represents a good choice:

$$f^{cld} = (f_{y,1}^{cld}, \ldots, f_{y,10}^{cld}, f_{cb,1}^{cld}, \ldots, f_{cb,6}^{cld}, f_{cr,1}^{cld}, \ldots, f_{cr,6}^{cld}) \qquad (3)$$

texture descriptors

homogeneous texture descriptor (htd) characterizes the region texture by the mean energy and the energy deviation from a set of 30 frequency channels. the descriptor is extracted by first partitioning the frequency space into 30 equidistant channels. the individual feature channels are filtered with a bank of 2-d gabor functions, and the mean and standard deviation of the energy in each of the channels are calculated. the final form of the descriptor, which consists of 62 coefficients, is

$$f^{htd} = (f_{dc}^{htd}, f_{sd}^{htd}, e_1^{htd}, \ldots, e_{30}^{htd}, d_1^{htd}, \ldots, d_{30}^{htd}) \qquad (4)$$

the first two components are the mean and standard deviation of the complete image, and $e_i^{htd}$ and $d_i^{htd}$ are the mean energy and energy deviation of the corresponding i-th frequency channel, respectively.

edge histogram descriptor (ehd) represents the spatial distribution of 5 types of edge orientations inside local image partitions called sub-images. one local edge histogram is generated for each of 4x4=16 sub-images, representing the distribution of five edge orientations inside a sub-image. to generate the local edge histogram, edges in the sub-image are extracted and classified into five categories depending on the orientation (vertical, horizontal, 45° diagonal, 135° diagonal, and non-directional). since there are 16 sub-images, the final edge histogram will contain 16x5 = 80 bins formed by concatenation of the local histograms:

$$f^{ehd} = (f_1^{ehd}, \ldots, f_{80}^{ehd}) \qquad (5)$$

combination of descriptors

when using multiple visual features for image classification, the crucial problem is how to combine them in order to measure image similarity.
generally, there are two approaches for feature combination (fusion, aggregation, composition, merging) [17], [28], [34]. the first one, named “early fusion”, performs a combination of features before the estimation of the distances between images. in contrast, “late fusion” applies classification on each feature separately, after which it integrates these results into the final decision. an obvious disadvantage of the late fusion approach is its computational cost, as every feature requires a separate classification stage. another disadvantage is the potential loss of correlation in the mixed feature space [17]. since our goal is to develop a computationally efficient and accurate method, we focused on the “early fusion” approach. specifically, we create the final feature vector by concatenating the extracted feature vectors (3 color and 2 texture), where not all feature vectors are necessarily involved. formally, the most extensive form of the final feature vector is:

$$f_i = (f^{scd}, f^{csd}, f^{cld}, f^{htd}, f^{ehd}) \qquad (6)$$

when considering the combination of only two features (e.g. cld+ehd), we will use the final feature vector in the form $f_i = (f^{cld}, f^{ehd})$. as will be shown in the experimental evaluation, not all the features have to be combined to achieve the best performance. our experiments have shown that the combination of only a few of them is sufficient for fast and accurate indoor-outdoor image classification. depending on the number of features chosen for image representation, the final feature vector will contain from 22 up to 292 elements. the feature vector formed in this way will serve as the input of the svm classifier described in the following section.

4. svm based indoor-outdoor image classification

svm is one of the most popular machine learning methods for classification of multimedia content [6], [8], [10], [16]-[18].
it is a supervised machine learning technique that learns from examples in order to predict the values of previously unseen data. svm can be formalized as an optimization problem which finds the best hyperplane separating two or more groups of vectors by maximizing the size of the margin between groups. in order to get the best svm performance, it is crucial to apply appropriate feature scaling and svm parameter tuning [35]. the procedures that we performed are described in detail below.

although many existing svm approaches apply some sort of feature scaling, its impact on svm classification performance is still not sufficiently clear. our intention is to empirically test the significance of feature scaling procedures before svm classification. in general, complex image features may contain significantly different ranges of values since they are combined from several components (e.g. texture and color information). as a consequence, components with a higher variance will be dominant in determining the distance between images. to avoid a high influence of feature components with a large variance, each element of the feature vector must be scaled using an appropriate method.

let us consider a collection of m indoor-outdoor images where each image is represented by its n-dimensional feature vector formed by a combination of several mpeg-7 descriptors. we examined two basic and efficient feature scaling methods: a) linear min-max scaling [11], and b) scaling to zero mean and unit variance (“z-score”) [29]. the linear min-max method can be mathematically represented as:

$$f'_i(j) = \frac{f_i(j) - \min_i f_i(j)}{\max_i f_i(j) - \min_i f_i(j)}, \quad i = 1, \ldots, m; \; j = 1, \ldots, n \qquad (7)$$

where $f'_i(j)$ represents the element at position j in the scaled feature vector of image i, and $\min_i f_i(j)$ and $\max_i f_i(j)$ are the minimal and maximal elements at position j among all training feature vectors. the resulting feature vector will be normalized to the range [0, 1]. this approach has the advantage that the relative distributions (variances) of both rows and columns of the feature matrix are preserved.

another approach to be considered is scaling to zero mean and unit variance (“z-score”). it is defined by:

$$f'_i(j) = \frac{f_i(j) - \mathrm{mean}_i f_i(j)}{\mathrm{stdev}_i f_i(j)}, \quad i = 1, \ldots, m; \; j = 1, \ldots, n \qquad (8)$$

where $\mathrm{mean}_i f_i(j)$ and $\mathrm{stdev}_i f_i(j)$ represent the mean and standard deviation of the elements at position j among all training feature vectors. the comparative evaluation of the two scaling approaches will reveal their influence on svm classification performance. as we will show later on, “z-score” outperforms the first method for every descriptor tested, and hence was chosen as the optimal one. it will be shown that proper feature scaling may increase classification accuracy by up to several percentage points.

besides appropriate scaling of feature vectors, svm requires the choice of a kernel function with corresponding parameters. we considered a commonly used non-linear gaussian rbf kernel with l2 norm, over the scaled feature vectors. for optimal selection of the svm parameter pair we applied “n-fold cross validation”, which separates the training dataset into n subsets and tests every subset using an svm classifier trained on the remaining subsets. a systematic “grid-search” [13] was performed over various pairs of values to select the pair with the best accuracy. in order to limit the search complexity, parameter values for evaluation were sampled to form a grid of equidistant steps.

5. experimental evaluation

image datasets

currently, there is an evident lack of a comprehensive standard dataset for indoor-outdoor image classification testing. most of the methods proposed in the literature use their own custom image datasets. for the purpose of objectivity, we test and compare our results only with relevant methods whose test datasets are publicly available. thus, the methods presented in [19] and [10] were used for comparison, using the datasets they provided.
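before turning to the individual datasets, the two scaling rules of eqs. (7) and (8) can be illustrated with a short python sketch. the statistics are computed on the training vectors only and then applied to any vector, as stated in section 4. the function names, the use of the population standard deviation and the handling of constant columns are assumptions made for illustration, since the paper does not specify them.

```python
import math


def fit_min_max(train):
    """Per-position minimum and maximum over the training vectors (eq. 7)."""
    columns = list(zip(*train))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(vectors, lows, highs):
    """Scale with training statistics; training vectors end up in [0, 1]."""
    return [
        # assumption: a constant column (max == min) is mapped to 0.0
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(v, lows, highs)]
        for v in vectors
    ]


def fit_z_score(train):
    """Per-position mean and standard deviation over the training vectors (eq. 8)."""
    columns = list(zip(*train))
    means = [sum(c) / len(c) for c in columns]
    # assumption: population standard deviation (division by m, not m - 1)
    stds = [math.sqrt(sum((x - mu) ** 2 for x in c) / len(c))
            for c, mu in zip(columns, means)]
    return means, stds


def apply_z_score(vectors, means, stds):
    """Shift to zero mean and scale to unit variance using training statistics."""
    return [
        # assumption: a constant column (stdev == 0) is mapped to 0.0
        [(x - mu) / sd if sd > 0 else 0.0
         for x, mu, sd in zip(v, means, stds)]
        for v in vectors
    ]


# two components with very different ranges, as in a color + texture vector
train = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
lows, highs = fit_min_max(train)
means, stds = fit_z_score(train)
scaled_min_max = apply_min_max(train, lows, highs)
scaled_z_score = apply_z_score(train, means, stds)
```

after either scaling, the two components of the toy vectors take identical values, so neither of them dominates a distance computed between images.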
the first image dataset is the iitm-scid2 (extended scene classification image database) introduced in [19]. it contains 902 indoor-outdoor images with a wide variation of scenes and resolutions, ranging from 80x80 up to 2048x1536 pixels. out of this dataset, 193 indoor and 200 outdoor images are used for training, while 249 indoor and 260 outdoor images are used for testing. compared to the second dataset, the images of this dataset show a large variation in scene content and resolution, which makes this dataset more suitable for testing real-world performance. the second dataset, hereafter referred to as corel-inout, was provided by the authors of [10]. its basis is wang’s image database [30], extended with various images obtained from the web. it consists of a total of 1276 indoor-outdoor images of different scenes, all of size 256x256 pixels. specifically, 650 of the images were used for the training of classifiers, among which 320 are indoor and 330 outdoor images. for the verification phase, the other 626 images, composed of 310 indoor and 316 outdoor images, are used. examples of images from both datasets are shown in fig. 1.

fig. 1 examples of test images: a) iitm-scid2 indoor, b) iitm-scid2 outdoor, c) corel-inout indoor, d) corel-inout outdoor

test results

we have tested the performance of indoor-outdoor image classification using various combinations of descriptors as well as two feature scaling approaches. the prototype system is implemented in matlab, where mpeg-7 features are extracted using a c++ implementation [31]. for svm classification we utilized the matlab implementation of the libsvm library [32]. in the first experiment we tested the impact of feature vector scaling on svm classification accuracy.
table 2 presents the accuracy of svm classification for different single descriptors, when feature vectors are scaled using two approaches: min-max and “z-score”.

table 2 impact of feature scaling method on the svm classification accuracy (in %)

descriptor (dimension)   iitm-scid2 dataset      corel-inout dataset
                         min-max    z-score      min-max    z-score
scd (64)                 81.73      84.68        70.73      70.93
csd (64)                 84.87      87.43        80.67      83.23
cld (22)                 78.98      82.71        80.83      82.59
htd (62)                 79.17      82.12        74.76      79.39
ehd (80)                 83.10      87.03        83.70      83.87

we have performed a comprehensive experimental evaluation in order to find the combination of mpeg-7 features that gives the best classification performance. table 3 presents the svm classification accuracy of the proposed method for different combinations of mpeg-7 color and texture descriptors. note that each combined descriptor includes at least one color and one texture descriptor. the same tests are performed on the iitm-scid2 and corel-inout image datasets.

table 3 accuracy of svm classification using different combinations of descriptors and “z-score” preprocessing (in %)

descriptor combination   dimension   iitm-scid2 dataset   corel-inout dataset
scd+htd                  126         88.61                81.95
scd+ehd                  144         91.36                87.38
csd+htd                  126         88.02                86.74
csd+ehd                  144         90.37                88.66
cld+htd                  84          88.02                87.70
cld+ehd                  102         91.55                91.53
scd+csd+htd              190         89.59                84.82
scd+csd+ehd              208         91.55                88.66
scd+cld+htd              148         92.34                88.66
scd+cld+ehd              166         93.71                91.53
csd+cld+htd              148         91.16                89.30
csd+cld+ehd              166         92.93                91.05
scd+htd+ehd              206         91.36                89.30
csd+htd+ehd              206         92.73                91.05
cld+htd+ehd              164         92.34                92.01
scd+csd+cld+htd          212         91.94                88.82
scd+csd+cld+ehd          230         93.32                91.05
scd+csd+htd+ehd          270         92.34                89.78
scd+cld+htd+ehd          228         93.32                92.17
csd+cld+htd+ehd          228         93.71                92.49
scd+csd+cld+htd+ehd      292         93.71                92.01

it can be observed that the combination of four descriptors csd+cld+htd+ehd gives the best overall results for both datasets, and therefore can be considered the optimal combination of mpeg-7 descriptors.
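the dimensions listed in table 3 follow directly from the single-descriptor dimensions in table 2. the python sketch below enumerates every combination with at least one color and one texture descriptor and builds the fused vector by simple concatenation (early fusion). the helper names are hypothetical and the zero-filled feature vectors are placeholders, not real descriptor values.

```python
from itertools import combinations

# single-descriptor dimensions, as listed in table 2
DIMS = {"scd": 64, "csd": 64, "cld": 22, "htd": 62, "ehd": 80}
COLOR = ("scd", "csd", "cld")
TEXTURE = ("htd", "ehd")


def valid_combinations():
    """All descriptor subsets with at least one color and one texture descriptor."""
    names = COLOR + TEXTURE
    result = []
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            has_color = any(n in COLOR for n in combo)
            has_texture = any(n in TEXTURE for n in combo)
            if has_color and has_texture:
                result.append(combo)
    return result


def early_fusion(features, combo):
    """Concatenate the per-descriptor vectors of one image in the given order."""
    fused = []
    for name in combo:
        fused.extend(features[name])
    return fused


combos = valid_combinations()
dims = {"+".join(c): sum(DIMS[n] for n in c) for c in combos}

# placeholder features for one image (zeros stand in for extracted values)
features = {name: [0.0] * size for name, size in DIMS.items()}
fused = early_fusion(features, ("cld", "ehd"))
```

the enumeration yields 21 combinations, from 84 elements (cld+htd) up to 292 elements for all five descriptors, matching the rows of table 3.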
it can also be noted that, among combinations of two descriptors, cld+ehd performs better than all others. among combinations of three descriptors, scd+cld+ehd gives the best average accuracy. a general observation is that introducing an additional descriptor does not necessarily improve performance. if a fast and sufficiently accurate descriptor is required, then cld+ehd is a reasonable choice, providing an excellent cost/performance ratio.

finally, table 4 presents the results of our method using the most accurate combination of mpeg-7 descriptors (csd+cld+htd+ehd) with “z-score” scaling and svm classification, compared to the results of methods [19] and [10] on the iitm-scid2 and corel-inout datasets, respectively.

the results presented in table 4 show that the proposed method outperforms both compared methods. we have achieved 93.71% classification accuracy on the iitm-scid2 dataset, which is better than the 92.36% reported in [19]. on the second dataset, the result of 92.49% is an improvement of over 2 percentage points compared to [10]. since the accuracy is above 92% on both datasets, it may be concluded that the proposed method is very effective for indoor-outdoor image classification. the high quality of the results should also be noted, given the relatively small size of the training datasets and the fact that an svm generally requires a rather large dataset of images to obtain good generalization capabilities.

table 4 accuracy comparison of different methods for indoor-outdoor image classification (in %)

method               iitm-scid2 dataset             corel-inout dataset
                     total    indoor   outdoor      total    indoor   outdoor
gupta et al. [19]    92.36    94.00    90.80        -        -        -
kim et al. [10]      -        -        -            90.26    90.00    90.29
our method           93.71    95.58    91.92        92.49    93.55    91.46

6. conclusion

we have presented a relatively simple but highly accurate method for indoor-outdoor image classification based on a combination of mpeg-7 features and svm classification. since we intended to create a computationally efficient method, we chose to apply a combination of low-level color and texture features in which all features contribute equally to the final result. we have empirically found that the combination of four mpeg-7 descriptors (csd+cld+htd+ehd), scaled to zero mean and unit variance before input into the svm classifier, outperforms all others. the combination of only two descriptors, cld+ehd, is a good trade-off if computational costs must be reduced further while retaining a high level of accuracy. experiments conducted on two public datasets achieved 93.71% and 92.49% accuracy, which is comparable to the top results previously published in the literature. future research will be targeted towards using regions of interest (roi) [33] to improve performance.

references

[1] r. datta, d. joshi, j. li, j. z. wang, “image retrieval: ideas, influences, and trends of the new age,” acm computing surveys, vol. 40, no. 2, pp. 1-60, 2008.
[2] s. bianco, g. ciocca, c. cusano, r. schettini, “improving color constancy using indoor-outdoor image classification,” ieee transactions on image processing, vol. 17, no. 12, pp. 2381-2392, 2008.
[3] l. zhang, m. li, h.-j. zhang, “boosting image orientation detection with indoor vs. outdoor classification,” proceedings of wacv ’02, washington, dc, usa, ieee computer society, 2002, pp. 95-99.
[4] a. vailaya, m. a. t. figueiredo, a. k. jain, h.-j. zhang, “image classification for content-based indexing,” ieee transactions on image processing, vol. 10, no. 1, pp. 117-130, 2001.
[5] j. collier, a. ramirez-serrano, “environment classification for indoor/outdoor robotic mapping,” proceedings of canadian conference on computer and robot vision crv’09, kelowna, british columbia, canada, 2009, pp. 276-283.
[6] m.
boutell, j. luo, “beyond pixels: exploiting camera metadata for photo classification,” pattern recognition, vol. 38, no. 6, pp. 935-946, 2005.
[7] x. liu, l. zhang, m. li, h. zhang, d. wang, “boosting image classification with lda-based feature combination for digital photograph management,” pattern recognition, vol. 38, pp. 887-901, 2005.
[8] n. serrano, a. savakis, j. luo, “improved scene classification using efficient low-level features and semantic cues,” pattern recognition, 37(9), pp. 1773-1784, 2004.
[9] l. lu, k. toyama, g. d. hager, “a two level approach for scene recognition,” proceedings of cvpr’05, washington, dc, ieee computer society, 2005, pp. 688-695.
[10] w. kim, j. park, c. kim, “a novel method for efficient indoor–outdoor image classification,” journal of signal processing systems, vol. 61, no. 3, pp. 251-258, 2010.
[11] h. eidenberger, “statistical analysis of content-based mpeg-7 descriptors for image retrieval,” multimedia systems, vol. 10, no. 2, pp. 84-97, 2004.
[12] t. deselaers, d. keysers, h. ney, “features for image retrieval: a quantitative comparison,” proceedings of dagm sspr’04, tübingen, germany, springer, 2004, pp. 228-236.
[13] t. deselaers, d. keysers, h. ney, “features for image retrieval: an experimental comparison,” information retrieval, vol. 11, pp. 77-107, 2008.
[14] b. s. manjunath, p. salembier, t. sikora, introduction to mpeg-7, san francisco, ca, usa, wiley, 2002.
[15] s. chang, t. sikora, a. puri, “overview of the mpeg-7 standard,” ieee transactions on circuits and systems for video technology, vol. 11, no. 6, pp. 688-695, 2001.
[16] s. n. lindstaedt, r. mörzinger, r. sorschag, v. pammer, g. thallinger, “automatic image annotation using visual content and folksonomies,” multimedia tools and applications, vol. 42, no. 1, pp. 97-113, 2009.
[17] c. g. m. snoek, m. worring, a. w. m. smeulders, “early versus late fusion in semantic video analysis,” proceedings of acm multimedia ’05, new york, ny, usa, acm, 2005, pp. 399-402.
[18] n. serrano, a. savakis, a. luo, “a computationally efficient approach to indoor/outdoor scene classification,” proceedings of icpr’02, quebec city, canada, 2002, pp. 146-149.
[19] l. gupta, v. pathangay, a. patra, a. dyana, s. das, “indoor versus outdoor scene classification using probabilistic neural network,” eurasip journal on advances in signal processing, pp. 1-11, 2007.
[20] s. park, “content-based image classification using a neural network,” pattern recognition letters, vol. 25, no. 3, pp. 287-300, 2004.
[21] m. szummer, r. w. picard, “indoor-outdoor image classification,” proceedings of iwcbaivd’98, ieee computer society, 1998, pp. 42-51.
[22] a. payne, s. singh, “indoor vs. outdoor scene classification in digital photographs,” pattern recognition, vol. 38, no. 10, pp. 1533-1545, 2005.
[23] h. zhang, a. c. berg, m. maire, j. malik, “svm-knn: discriminative nearest neighbor classification for visual category recognition,” in proceedings of cvpr’06, new york, ny, usa, 2006, pp. 2126-2136.
[24] m. varma and d. ray, “learning the discriminative power-invariance trade-off,” proceedings of iccv’07, rio de janeiro, brazil, 2007, pp. 1-8. available: http://dx.doi.org/10.1109/iccv.2007.4408875
[25] b. s. manjunath, j. r. ohm, v. v. vinod, a. yamada, “color and texture descriptors,” ieee trans. circuits and systems for video technology, vol. 11, no. 6, pp. 703-715, 2001.
[26] a. yamada, m. pickering, s. jeannin, l. cieplinski, j. r. ohm, m. kim, mpeg-7 visual part of experimentation model version 10.0. iso/iec jtc1/sc29/wg11/n4063, 2001.
[27] r. datta, j. li, j. z. wang, “content-based image retrieval: approaches and trends of the new age,” proceedings acm sigmm mir ’05, new york, ny, usa, acm, 2005, pp. 253-262.
[28] s. ayache, g. quénot, j. gensel, “classifier fusion for svm-based multimedia semantic indexing,” proceedings of ecir’07, berlin, germany, springer-verlag, 2007, pp. 494-504.
[29] r. j. larsen and m. l. marx, an introduction to mathematical statistics and its applications, pearson prentice hall, 2006.
[30] j. z. wang, j. li, g. wiederhold, “simplicity: semantics-sensitive integrated matching for picture libraries,” ieee transactions on pattern analysis and machine intelligence, vol. 23, no. 9, pp. 947-963, 2001.
[31] m. bastan, h. cam, u. gudukbay, o. ulusoy, “bilvideo-7: an mpeg-7 compatible video indexing and retrieval system,” ieee multimedia, vol. 17, pp. 62-73, 2010.
[32] c.-c. chang and c.-j. lin, “libsvm: a library for support vector machines,” acm transactions on intelligent systems and technology, vol. 2, no. 3, pp. 1-27, 2011.
[33] j. lee, j. nang, “content-based image retrieval method using the relative location of multiple rois,” advances in electrical and computer engineering, vol. 11, no. 3, pp. 85-90, 2011.
[34] m. soysal and a. a. alatan, “combining mpeg-7 based visual experts for reaching semantics,” in proc. of vlbv03, madrid, 2003.
[35] d. lu and d. weng, “a survey of image classification methods and techniques for improving classification performance,” international journal of remote sensing, vol. 28, issue 5, 2007, pp. 823-870.
[36] j. li and j. z. wang, “automatic linguistic indexing of pictures by a statistical modeling approach,” ieee trans. on pami, vol. 25, no. 9, 2003, pp. 1075-1088.

facta universitatis
series: electronics and energetics vol. 27, no 2, june 2014, pp. 251-258
doi: 10.2298/fuee1402251c

cmos ic radiation hardening by design

alessandra camplani, seyedruhollah shojaii, hitesh shrimali, alberto stabile, valentino liberali
infn-milano and department of physics, università degli studi di milano
via g. celoria, 16 – 20133 milano, italy

abstract.
design techniques for radiation hardening of integrated circuits in commercial cmos technologies are presented. circuits designed with the proposed approaches are more tolerant to both total dose and single event effects. the main drawback of the techniques for radiation hardening by design is the increase of silicon area, compared with a conventional design.

key words: radiation hardening, cmos technology, integrated circuits

1. introduction

commercial integrated circuits (ics) may not have an adequate level of immunity to radiation to guarantee good reliability in harsh environments. radiation-hard circuits undergo a set of qualification tests before being used in space (satellites) or in nuclear applications (high energy physics, nuclear power plants, medical equipment for radiology and radiotherapy). however, it is worth remarking that any electronic equipment can be affected by low dose rate radiation from various sources: e.g., natural radioactivity in materials, high energy cosmic rays, x-ray scanners in airports, etc.

the evolution of ic fabrication technology towards an ever denser integration scale has a twofold effect on radiation tolerance: at modern nano-scale sizes, devices are more tolerant to cumulative (long-term) effects, but on the other hand they are more prone to soft errors due to single events. therefore, the design of complex integrated systems should account for such effects.

in recent years, specific techniques have been developed to obtain integrated circuits with a high immunity to radiation. radiation tolerance can be increased either by modifying the fabrication process (rhbp: radiation hardening by process), or by adopting design techniques (rhbd: radiation hardening by design). in this paper, rhbd techniques are presented that achieve a satisfactory tolerance to both total dose and single event effects in mos devices and circuits.
received february 3, 2014
corresponding author: valentino liberali
infn-milano and department of physics, università degli studi di milano, via g. celoria, 16 – 20133 milano, italy
(e-mail: valentino.liberali@mi.infn.it)

2. interaction between radiation and silicon

the interaction between external radiation (photons such as x-rays and γ-rays, charged particles such as protons, electrons and heavy ions, or neutral particles) and a semiconductor may cause two main phenomena: ionization and displacement.

2.1 ionization phenomenon

when radiation interacts with the semiconductor material, an electron in the valence band may acquire enough energy to pass into the conduction band. therefore, an electron-hole pair (hep) is generated: a free electron is present in the conduction band and a hole in the valence band. if an electric field exists in the ionization region (e.g., in biased devices), heps are separated and carriers move within the semiconductor, giving an extra (parasitic) current. then, the carriers may recombine, remain trapped, or drift into an electrode.

the ionization phenomenon is measured with the linear energy transfer (let). the let indicates the quantity of energy lost by the incident particle along its path into the target material. the let depends on the atomic number and the energy of the incident particle, on the target material and on the collision location:

$$\mathrm{LET} = \frac{1}{\rho} \cdot \frac{dE}{dx} \quad \left[\mathrm{MeV \cdot cm^2/mg}\right] \qquad (1)$$

where $\rho$ is the density of the target material, and $dE/dx$ indicates the average energy transferred into the target material per unit length along the particle trajectory.
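as a numerical illustration of the let definition above, the following python sketch converts a stopping power into a let value and then estimates the charge generated per micrometer of track in silicon. all names and constants are assumptions introduced for this example and do not come from the paper: a silicon density of about 2.33 g/cm³ and a mean energy of about 3.6 ev per electron-hole pair are commonly quoted approximate values.

```python
# Assumed, approximate constants (not taken from the paper).
SILICON_DENSITY_MG_PER_CM3 = 2330.0   # about 2.33 g/cm^3
PAIR_ENERGY_EV = 3.6                  # mean energy to create one pair in silicon
ELEMENTARY_CHARGE_C = 1.602e-19       # coulomb


def let_from_stopping_power(de_dx_mev_per_cm,
                            density_mg_per_cm3=SILICON_DENSITY_MG_PER_CM3):
    """LET = (1/rho) * dE/dx, returned in MeV*cm^2/mg."""
    return de_dx_mev_per_cm / density_mg_per_cm3


def generated_charge_fc_per_um(let_mev_cm2_per_mg,
                               density_mg_per_cm3=SILICON_DENSITY_MG_PER_CM3,
                               pair_energy_ev=PAIR_ENERGY_EV):
    """Charge of one polarity generated per micrometer of track, in fC/um."""
    de_dx_mev_per_cm = let_mev_cm2_per_mg * density_mg_per_cm3
    energy_ev_per_um = de_dx_mev_per_cm * 1e6 / 1e4   # MeV -> eV, cm -> um
    pairs_per_um = energy_ev_per_um / pair_energy_ev
    return pairs_per_um * ELEMENTARY_CHARGE_C * 1e15  # C -> fC


if __name__ == "__main__":
    let = let_from_stopping_power(2330.0)             # hypothetical dE/dx in MeV/cm
    print(f"LET = {let:.2f} MeV*cm^2/mg")
    print(f"charge = {generated_charge_fc_per_um(let):.2f} fC/um")
```

under these assumptions a let of 1 mev·cm²/mg corresponds to roughly 10 fc of generated charge per micrometer of track, which gives a feeling for how the let relates to the parasitic current discussed above.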
ionization effects can be divided into two main categories:
• temporary ionization effects are due to hep separation and generation of a parasitic current;
• fixed ionization effects are due to trapping of carriers in insulators, where the mobility of carriers is lower than in the semiconductor, or at the interface between insulator and semiconductor; when positive charges are trapped, a shift of device parameters occurs, and circuit performance may be affected.

2.2 displacement

when a neutral particle interacts with the silicon lattice, it transfers energy to lattice atoms. a transferred energy greater than 20 ev can displace a silicon atom, which moves toward an interstitial position, and the displaced atom may displace other atoms along its trajectory. defects due to atom displacement in the silicon lattice act as energy levels within the band-gap. these levels alter the electrical properties of the semiconductor (e.g., lifetime of minority carriers, doping density, mobility, etc.).

3. radiation effects on ics

damaging effects due to radiation can be divided into two major categories: cumulative effects due to a long-time exposure to radiation, and single event effects due to the interaction with a single particle.

3.1 cumulative effects

from the viewpoint of circuit performance, cumulative effects can be divided into total ionizing dose (tid) effects, caused either by charged particles (e.g., electrons or protons) or by photons (x-rays and γ-rays), and displacement damage dose (ddd) effects, caused by massive particles (e.g., neutrons, protons, or heavy ions).

in cmos integrated circuits, the region most sensitive to cumulative effects is the gate oxide. when a particle collides with the oxide, heps are generated; if the ionized region is crossed by an electric field, electrons and holes are separated.
electrons are quickly collected by neighboring electrodes because their mobility is approximately 20 $\mathrm{cm^2/(V\,s)}$, while holes move slowly by hopping transport toward the sio2-si interface, because their mobility ranges from $10^{-4}\ \mathrm{cm^2/(V\,s)}$ to $10^{-11}\ \mathrm{cm^2/(V\,s)}$. these holes remain trapped in the oxide for a long time (approximately from $10^{3}$ s to $10^{6}$ s) [1]. the trapped holes can be seen as fixed positive charges, which introduce a negative shift in threshold voltage $\Delta V_{th}$, given by:

$$\Delta V_{th} = -\frac{q}{C_{ox}}\,\Delta N_{ot} = -\frac{q\,t_{ox}}{\varepsilon_{ox}}\,\Delta N_{ot} \qquad (2)$$

where $q$ is the elementary charge, $C_{ox} = \varepsilon_{ox}/t_{ox}$ is the oxide capacitance per unit area, $N_{ot}$ is the density of trapped holes in the oxide, $\varepsilon_{ox}$ is the dielectric constant of the oxide, and $t_{ox}$ is the oxide thickness. to a first approximation, $\Delta V_{th}$ is proportional to $t_{ox}^2$, since the density of trapped holes itself increases with the oxide thickness. for very thin gate oxides (e.g., for thicknesses lower than approximately 3 nm), the threshold shift becomes negligible [2].

however, field oxides are thick (approximately in the range from 100 nm to 1000 nm) and trap positive charges. charge trapping occurs especially in the shallow trench isolation (sti) regions at the transition between the thick field oxide and the thin gate oxide. the region on the side of an sti can be modeled as a parasitic transistor in parallel with the mos transistor channel. parasitic transistors have the same length as the designed transistors; however, their threshold voltage is larger, due to the thick oxide, so the parasitic transistors are normally turned off. the positive charges trapped in the thick oxide region attract negative carriers, and this charge can be seen as a fixed charge on the gate of the parasitic transistors, which could turn on, thus creating a parasitic path between drain and source in parallel with the mos transistor channel. in an nmos transistor, tid may induce a parasitic channel between the source and the drain, leading to a leakage current when the nmos device is in the “off” state (fig. 1).
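as a numerical illustration of the threshold-shift expression (2) above, the following python sketch evaluates the shift for a thin gate oxide and for a thick field oxide. it is a minimal sketch under stated assumptions: the function names, the relative permittivity of 3.9 for sio2 and the trapped-hole density of 10¹¹ cm⁻² are values chosen for this example, not figures from the paper, and the trapped holes are treated as a sheet charge per unit area.

```python
# Assumed constants for illustration (not taken from the paper).
ELEMENTARY_CHARGE_C = 1.602e-19          # coulomb
VACUUM_PERMITTIVITY_F_PER_CM = 8.854e-14 # farad per centimeter
SIO2_RELATIVE_PERMITTIVITY = 3.9         # typical value for silicon dioxide


def oxide_capacitance_per_area(t_ox_nm):
    """C_ox = eps_ox / t_ox, in F/cm^2, for an oxide thickness given in nm."""
    t_ox_cm = t_ox_nm * 1e-7
    eps_ox = SIO2_RELATIVE_PERMITTIVITY * VACUUM_PERMITTIVITY_F_PER_CM
    return eps_ox / t_ox_cm


def threshold_shift_v(delta_n_ot_per_cm2, t_ox_nm):
    """Delta V_th = -q * Delta N_ot / C_ox, in volts (negative for trapped holes)."""
    return (-ELEMENTARY_CHARGE_C * delta_n_ot_per_cm2
            / oxide_capacitance_per_area(t_ox_nm))


if __name__ == "__main__":
    trapped = 1e11   # hypothetical trapped-hole density, cm^-2
    for t_ox_nm in (3, 10, 100, 500):
        shift_mv = 1e3 * threshold_shift_v(trapped, t_ox_nm)
        print(f"t_ox = {t_ox_nm:4d} nm -> delta V_th = {shift_mv:8.1f} mV")
```

for a fixed trapped-hole density the shift grows linearly with the oxide thickness; because a thicker oxide also collects more holes, the dependence on thickness is stronger in a real device. this is consistent with the observation that thin gate oxides are little affected while thick field oxides dominate the tid response.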
furthermore, channel carriers can be trapped at the si-sio2 interface [3], decreasing carrier mobility and transconductance.

fig. 1 holes trapped in the shallow trench isolation (sti)

in a pmos transistor, tid causes an increase of the threshold voltage and a reduction of the effective channel width. the latter effect is negligible for usual transistor sizes; however, for very narrow pmos devices (with $W \ll 1\ \mu\mathrm{m}$), this effect must be taken into account [4].

ddd effects are due to collisions between neutral particles and nuclei of silicon belonging to the lattice structure [5]. lattice defects at the si-sio2 interface introduce energy states in the band-gap, which may trap channel carriers. the threshold voltage shift is:

$$\Delta V_{th} = -\frac{Q_{it}}{C_{ox}} \qquad (3)$$

where $Q_{it}$ is the trapped charge at the interface, which depends on device biasing. moreover, trap states due to lattice defects facilitate electron transitions between valence band and conduction band, and the carrier mobility decreases [6]:

$$\mu = \frac{\mu_0}{1 + \alpha \cdot \Delta N_{it}} \qquad (4)$$

where $\mu_0$ is the pre-irradiation mobility, $\alpha$ is a parameter dependent on the chosen technology, and $\Delta N_{it}$ is the number of charges trapped at the interface.

it is important to point out that nowadays tid effects are negligible in the ic core. therefore, only the circuits at the ic periphery (pad ring) require special care, due to the higher voltage and the thicker oxide of the periphery transistors.

3.2 single event effects

single event effects (see) are due to charge generation in a reverse-biased p-n junction in the cmos ic. the junction may be part of a mos transistor (drain-body or source-body), or may be a well-substrate junction.

fig. 2 charge generation and parasitic current in a reverse-biased p-n junction

the electric field in the reverse-biased p-n junction separates electrons and holes.
the generated carriers are collected by neighbouring electrodes, thus giving a parasitic current with a peak due to carrier drift, followed by a tail due to carrier diffusion (fig. 2).

from a functional viewpoint, the current due to a see may cause a soft error, which is a non-destructive and temporary effect, or a hard error, which causes irreversible effects and is destructive.

a soft error is a non-destructive see, i.e., an effect that does not cause permanent damage to the ic [7]. soft errors occur when the total parasitic charge generated is larger than the critical charge of the affected node.

a single event transient (set) is a transient glitch which affects the voltage of a node in combinational logic. transients are temporary; however, they may propagate to adjacent nodes, where the effects of other sets can add up. sometimes, the sum of several sets can trigger damaging effects [8].

a single event upset (seu) occurs when a see changes the logic value of a memory cell (e.g., a latch), or when set propagation toggles the data stored in a memory [9]. if a seu affects two or more memory cells, a multiple bit upset (mbu) occurs. a seu in the control logic may lead to a single event functional interruption (sefi).

a single event latch-up (sel) is due to a see that triggers a positive-gain loop formed by parasitic bipolar transistors in cmos technology, leading to a high current in the loop, which may damage the ic interconnections if the device is not turned off promptly [10].

other destructive sees are the single event burnout (seb), which occurs in high-voltage devices when an avalanche multiplication mechanism is triggered by a parasitic charge in a reverse-biased p-n junction [11], and the single event gate rupture (segr), in which the displacement effect combined with a high parasitic gate current can result in a gate oxide rupture [12]. seb and segr occur in power mos transistors, and are not a concern for cmos logic.
hence, they will not be considered in the following sections of the paper.

sensitivity to see is measured with the cross section (in square centimeters), which represents the sensitive area of the device.

4. design of radiation-hardened mos devices

special design techniques can be adopted to improve device tolerance to radiation.

4.1 nmos transistors

edge-less transistors (elt) are mos transistors with an annular gate shape. this geometry has been shown to reduce current leakage due to cumulative effects in nmos transistors, even at very high total doses, at the expense of a larger area, as shown in fig. 3(a) [13]-[14].

fig. 3 layout of (a) nmos elts; (b) conventional pmos transistors

when using elts, the internal side of the ring-shaped transistor should be used as the drain terminal of the mos device, and the external side should be the source terminal. in this way, the design minimizes the area of the drain, which is the most sensitive node for see, thus reducing the cross-section.

4.2 pmos transistors

pmos transistors are not prone to current leakage, since hole trapping does not attract channel carriers. therefore, pmos transistors do not require the elt shape, and they can be designed with conventional geometry, as shown in fig. 3(b), in order to save area and to maintain the ratio between pull-up and pull-down transistor sizes.

4.3 guard rings

the use of double guard rings around p-wells and n-wells, biased to constant voltages, prevents sel [14]. moreover, the use of guard rings around transistors of the same type biased at different voltages reduces inter-device leakage, since positive charges trapped in the sti oxide cannot induce a parasitic channel between n-type diffusions at different voltages (fig. 4). fig. 5 shows a detail of the layout of a logic circuit employing both guard rings and elts.

fig. 4 cross-section of two nmos transistors: (a) without guard rings; (b) with guard rings between the two transistors

fig. 5 portion of a layout with elts and guard rings

compared to conventional layout design, elts and guard rings require a larger silicon area. therefore, a higher level of radiation tolerance can be achieved only at the expense of a larger area [15].

5. design of radiation-hardened cmos circuits

an ic designer may use other radiation hardening techniques, such as redundancy and error correcting codes at the architectural level, and optimization of logic cells at the circuit level.

5.1 architectural solutions

at the architectural level, radiation hardness can be improved by using redundancy, such as error correcting codes (ecc). another example is “scrambling” in a memory array: the physical location of bits does not correspond to the logical bit position, to avoid logical multiple bit upsets (mbu) due to see. a further improvement can be obtained by storing each bit of a byte in a different memory array, and by providing each memory array with separate bit-line and word-line decoders, to avoid mbus due to address upset [15].

5.2 logic circuits

sensitivity to see can be analyzed through injection of “soft faults” in different circuit locations [17]. simulation results show that the nodes most sensitive to set are the circuit nodes which are not directly connected to the voltage supplies. therefore, set sensitivity can be reduced by using fully cmos logic and by minimizing the number of transistors which are not directly connected to the supplies [18]. to mitigate sefi, the number of feedback loops in the circuits must be minimized.

6. conclusion

this paper has presented an overview of the effects due to the interaction between radiation and ics. the overview also emphasizes some design techniques developed to avoid or to mitigate radiation effects.
it is important to remark that design solutions to improve radiation hardness lead to an increase of the ic area. nevertheless, they should be adopted when robustness in a radiation environment is an important requirement. in addition, unlike other approaches (shielding or component selection), rhbd techniques can be applied to different fabrication processes in order to increase the overall radiation hardness.

references

[1] p. j. mcwhorter and p. s. winokur, “simple technique for separating the effects of interface traps and trapped-oxide charge in metal-oxide-semiconductor transistors,” appl. phys. lett., vol. 48, pp. 133–135, jan. 1986.
[2] n. s. saks, m. g. ancona, and j. a. modolo, “radiation effects in mos capacitors with very thin oxides at 80 k,” ieee trans. nucl. sci., vol. 31, pp. 1249–1255, dec. 1984.
[3] f. b. mclean, “a framework for understanding radiation-induced interface states in sio2 mos structures,” ieee trans. nucl. sci., vol. 27, pp. 1651–1657, dec. 1980.
[4] m. gaillardin, v. goiffon, s. girard, m. martinez, p. magnan, and p. paillet, “enhanced radiation-induced narrow channel effects in commercial 0.18 μm bulk technology,” ieee trans. nucl. sci., vol. 58, pp. 2807–2815, dec. 2011.
[5] g. baccarani and m. r. wordeman, “transconductance degradation in thin-oxide mosfet’s,” ieee trans. electron devices, vol. 30, pp. 1295–1304, oct. 1983.
[6] j. r. schwank, f. w. sexton, m. r. shaneyfelt, and d. m. fleetwood, “total ionizing dose hardness assurance issues for high dose rate environments,” ieee trans. nucl. sci., vol. 54, pp. 1042–1048, aug. 2007.
[7] t. c. may, “soft errors in vlsi: present and future,” ieee trans. comp., hybrids, manufact. technol., vol. 2, pp. 377–387, dec. 1979.
[8] j. l. andrews, j. e. schroeder, b. l. gingerich, w. a. kolasinski, r. koga, and s. e. diehl, “single event error immune cmos ram,” ieee trans. nucl. sci., vol. 29, pp. 2040–2043, dec. 1982.
[9] l. t. clark, k. c. mohr, k. e. holbert, x. yao, j. knudsen, and h. shah, “optimizing radiation hard by design sram cells,” ieee trans. nucl. sci., vol. 54, pp. 2028–2036, dec. 2007.
[10] johnston, “the influence of vlsi technology evolution on radiation induced latchup in space systems,” ieee trans. nucl. sci., vol. 43, pp. 505–521, apr. 1996.
[11] j. h. hohl and k. f. galloway, “analytical model for single event burnout of power mosfets,” ieee trans. nucl. sci., vol. 34, pp. 1275–1280, dec. 1987.
[12] f. wheatley, j. l. titus, and d. i. burton, “single-event gate rupture in vertical power mosfets; an original empirical expression,” ieee trans. nucl. sci., vol. 41, pp. 2152–2159, dec. 1994.
[13] g. anelli, m. campbell, m. delmastro, f. faccio, s. floria, a. giraldo, e. heijne, p. jarron, k. kloukinas, a. marchioro, p. moreira, and w. snoeys, “radiation tolerant vlsi circuits in standard deep submicron cmos technologies for the lhc experiments: practical design aspects,” ieee trans. nucl. sci., vol. 46, pp. 1690–1696, dec. 1999.
[14] c. calligaro, v. liberali, a. stabile, m. bagatin, s. gerardin, and a. paccagnella, “a multi-megarad, radiation hardened by design 512 kbit sram in cmos technology,” in proc. ieee int. conf. on microelectronics (icm), cairo, egypt, dec. 2010, pp. 375–378.
[15] m. benigni, v. liberali, a. stabile, and c. calligaro, “design of rad-hard sram cells: a comparative study,” in proc. ieee int. conf. on microelectronics (miel), niš, serbia, may 2010, pp. 279–282.
[16] a. stabile, v. liberali, and c. calligaro, “a radiation hardened 512 kbit sram in 180 nm cmos technology,” in proc. int. conf. on electronics, circuits and systems (icecs), hammamet, tunisia, dec. 2009, pp. 655–658.
[17] do, v. liberali, a. stabile, and c. calligaro, “layout-oriented simulation of non-destructive single event effects in cmos ic blocks,” in proc. eur. conf. on radiation and its effects on components and systems (radecs), bruges, belgium, sep. 2009.
[18] a. stabile, v.
liberali, and c. calligaro, “design of a rad-hard library of digital cells for space applications,” in proc. int. conf. on electronics, circuits and systems (icecs), malta, sept. 2008, pp. 149–152.

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp.
461-474 doi: 10.2298/fuee1603461d

designing an intelligent home media center

igor đurić 1, vanjica ratković-živanović 2, milica labus 1, dragana groj 1, nikola milanović 1
1 faculty of organizational sciences, university of belgrade, serbia
2 radio television of serbia, belgrade, serbia

abstract. this paper presents the design and implementation of a personal intelligent home media center. the primary goal was to increase the quality of life through the use of ambient intelligence in smart homes. the solution presented here uses a client-server architecture with network-attached storage for storing all multimedia content. sensors are used to detect a person's presence, and ambient intelligence techniques are used to recommend the most suitable multimedia content to end-users. the major advantages of this personal intelligent home media center are speed, intelligence, inexpensive components and scalability. for evaluation purposes, the implementation was carried out within one home media center.

key words: smart home, ambient intelligence, media center

1. introduction

ambient intelligence (hereinafter: ami) offers an opportunity to realize an old dream: the smart or intelligent home. people spend a lot of time in their homes, and both social and technological drivers are broadening the scope of activities undertaken at home. advances in technology are ultimately improving the way the home environment can react and adapt to residents' needs. besides sleeping, rest and entertainment are the main functions performed at home. radio, tv and music records/cds have been the dominant forms of entertainment in the home environment for decades. with advances in technology, all this content became digital, and a new form of multimedia entertainment (listening, watching, and interacting) was developed. each home potentially has many devices that store different digital content (videos, music and photos) and media storage units (cds, dvds, etc.).
each device has limited memory available, and often the same content is stored in many places. the idea behind a home media center is to integrate all these devices and units and to allow centralized storage, search, playback, internet streaming and often even recording of digital content. there is no universal plug and play solution, since each home has a different setup of devices.

received june 30, 2015; received in revised form november 10, 2015
corresponding author: igor đurić, faculty of organizational sciences, university of belgrade, jove ilića 154, 11000 belgrade, serbia (e-mail: igor@elab.rs)

in this paper we provide an overview of ambient intelligence applications in smart homes and of the technological challenges behind the implementation of intelligent home media centers. we also explore some of the existing smart or intelligent home media center solutions. in particular, we focus on the implementation of a broadly implementable and affordable intelligent home media center based on inexpensive hardware and open source software. our goal was to develop an intelligent home media center which can respond to the specific needs and moods of its users. our system is in the testing phase, and future research will rely on users' feedback.

2. literature review

2.1. ambient intelligence in smart homes

ambient intelligence is a new paradigm for an intelligent environment which uses information and communications technologies to become active, adaptive and responsive to people's presence and their needs, thereby improving the quality of their lives [1]. there are various definitions of ami, but they all highlight the following features of the underlying technologies: sensitive, responsive, adaptive, transparent, ubiquitous and intelligent [2].
according to the european information society technologies advisory group, ami is "a set of properties of an environment that we are in the process of creating", and it should be treated as an "imagined concept" and not as a set of specific requirements [3]. ami focuses on users in their environment and emphasizes ease of use, user empowerment and support for human interactions in a seamless, unobtrusive and often invisible way [4].

ami is an emerging discipline today. not only is the necessary supporting technology present, but user demand has also reached the critical point needed for successful development. as cook, augusto and jakkula point out, ami incorporates aspects of context-aware computing (note 1), disappearing computers (note 2), and pervasive/ubiquitous computing (note 3), and enriches them with artificial intelligence research in the fields of machine learning, agent-based software, and robotics [2].

ami applications have been developed in many fields such as smart homes, health monitoring and assistance, hospitals, transportation, emergency services, education, workplaces, etc. in this project we have focused on smart homes. "smart home" is the term commonly used to define a residence that has appliances, lighting, heating, air conditioning, tvs, computers, entertainment audio and video systems, security, and camera systems that are capable of communicating with each other and which can be controlled remotely [5]. another definition, according to the smart homes association, is: "the integration of technology and services through home networking for a better quality of living" [6].

smart homes make life easier and more convenient. no matter where the resident is, the smart system will raise an alert if something is going wrong in the house. for example, not only would a resident be woken by a fire alarm notification, but the smart home would also unlock the doors, dial the fire department and light the path to safety.
1 "context-aware computing is a style of computing in which situational and environmental information about people, places and things is used to anticipate immediate needs and proactively offer enriched, situation-aware and usable content, functions and experiences." gartner it glossary (www.gartner.com/it-glossary/)
2 "the most profound technologies are those that disappear. they weave themselves into the fabric of everyday life until they are indistinguishable from it" [30]
3 "ubiquitous computing" was first defined by weiser [31]; ibm later called it "pervasive computing" [29]

there are many areas where ami is applied in smart home functions: home automation, communication and socialization, resting, refreshing, entertainment, sport, working and learning [7].

home automation covers basic housing support functions such as heating, piping, ventilation and air-conditioning (hpac), lighting and electrical installations, but also home security functions and functions that increase autonomy and support independent living, especially for elderly residents. one example is the european ist amigo project, which developed a networked home system of heterogeneous devices and services from the following domains: personal computing, mobile computing, consumer electronics and home automation [8]. besides improving the quality of people's lives, home automation solutions address energy efficiency by providing comprehensive support for energy savings [9]. electricity bills go down when lights are automatically turned off as a person leaves the room, and rooms can be heated or cooled depending on who is there at that moment. some devices can track how much energy each appliance is using and command it to use less.
communication and socialization functions are already well established in homes through the use of landline phones, internet, tv, mobile phones and other hand-held/hands-free devices. one of the further developments in this area will be dynamic networking, where ami technologies would seamlessly put people in contact based on comparable permanent patterns of interest or specific requests, using context modeling techniques [10].

applications of ambient intelligence embedded in clocks, beds, lamps, windows, floors, ceilings, furniture, etc., can improve sleeping and other forms of relaxation at home. important issues to address here are activity recognition and conflict resolution [11]. another ami trend is to use the time that is spent in the bathroom anyway for other functions; for example, a bathroom mirror could display the clock, news, weather, and advice on health improvement based on the person's weight [12].

entertainment and sport activities are not necessarily done at home, but ami technologies can again enrich this experience. for example, voice recognition could be combined with databases so that a resident can turn on the music by simply humming a few lines of a song [12]. one of the challenges is to strike the right balance between relatively passive enjoyment of multimedia entertainment and interactive engagement with it. regarding exercising at home, friedewald et al. envision that a future trend is to integrate physical exercise capacities into "ordinary" furniture placed in the living room, bedroom or even kitchen [7].

ami technologies promise tremendous benefits for elderly persons living alone. smart systems can notify residents when it is time to take their medicine, alert the hospital if a resident has fallen, track how much residents are eating, automatically turn off the water before a tub overflows, or turn off the oven if no one is present in the home.
smart home systems provide an opportunity for adult children who live elsewhere to participate in the care of their aging parents [6].

an ami environment consists of sensors, controllers and intelligent agents. sensors gather data from the real world, on the basis of which intelligent agents perceive the state of the environment and its users. intelligent agents are systems that can decide what to do and then do it [13]. they reason about the gathered data using a variety of ami techniques, and act upon the environment using controllers. thus, sensing, reasoning, and acting are the main functional parts of ami algorithms [2].

sensors can be wired or wireless. they can be integrated into the environment or attached to persons or items. an example of the latter is rfid tags, which can be coupled with an rfid reader to monitor the movement of tagged objects. when analyzing sensor data, ami systems may employ a centralized or a distributed model [14]. recent research suggests that wired and wireless distributed computing is a key means of accomplishing established ami goals [1].

reasoning is accomplished through modeling of the user's behavior, activity prediction and recognition, decision making, and spatial-temporal reasoning [15]. in the mavhome (managing an intelligent versatile home) smart home project, a data mining pre-processor identifies common sequential patterns in data and then uses those patterns to build a hierarchical model of resident behavior [16]. luhr goes even further and uses video data to find sequential association rules in resident actions [17]. other examples of projects which adaptively control home environments by anticipating the location, routes and activities of their residents are the neural network house [18], the intelligent home [19] and the placelab [20].

ami systems execute actions through various controllers, robots and other intelligent and assistive devices.
mobile robot assistants, developed to assist elderly individuals with mild cognitive and physical impairments and to support nurses in their daily activities, are already found in nursing homes [21]. the goal of many leading smart home projects associated with wearable/implantable monitoring systems and assistive robotics is to allow older people to live autonomously in a comfortable and secure environment [22].

an important challenge for ambient intelligence is how to make technology learn about people and their identity and apply such knowledge in varying contexts, while at the same time securing a sufficient degree of privacy and protection against misuse, so that people will trust the intelligent world that surrounds them. the future of ami depends on how successfully it addresses these security and trust issues [23]. other areas of further research within the domain of smart homes are better use of resources, home security, appliance management, digital entertainment, energy management and assistive computing/health care, as well as smart environments to support elderly and disabled persons [6].

2.2. intelligent home media centers

various solutions for intelligent media players are available on the market, with numerous advantages as well as some challenges. in this paper we discuss some of the best media players that were examined as a starting point in our development of a personal intelligent home media center.

one widely used open-source media player is kodi (formerly xbmc), developed by the xbmc foundation, a non-profit technology consortium. it is a media center for playing videos, music, pictures, games, and more. kodi runs on linux, os x, windows, ios and android, with a 10-foot user interface for use with televisions and remote controls. it allows users to play and view videos, music, podcasts, and other digital media files from local and network storage media and the internet.
one of the challenges is that it needs a 3d-capable graphics hardware controller for rendering. an additional issue is that kodi's internal cross-platform video and audio players (dvdplayer and paplayer) cannot play any audio or video files that are protected/encrypted with drm (digital rights management) technologies for access control [24].

other examples of home media center solutions are the noontec n5 gigalink and the cisco linksys media center extender with dvd.

noontec n5 gigalink is a reliable storage server which can back up a large number of multimedia files (hard drive required), such as digital pictures, music and movies. it supports upnp and dlna, so users can browse pictures, listen to their favorite music or watch 1080p high-definition movies on a high-definition tv, ps3, xbox 360 or other players connected to the home network, which gives them the experience of a truly digital home. it supports mobile access from iphone, ipad, and android smartphones and tablets connected through the local area network. it can also act as a file server, ftp server and samba server [25].

the linksys dma2200 connects the latest 1080p dvd players with windows media center and streams the user's digital music, movies and photos wirelessly to any tv in the home. with elegant and easy-to-navigate menu screens, users can play dvds, view family slide shows, browse their music collection by cover art, listen to entire playlists or choose from a vast selection of internet radio stations from all over the world.
the linksys media center extender and the user's windows vista media center pc together give a complete pvr solution, allowing users to watch, pause, rewind and record live tv (pc-embedded or optional tv tuner required).

another approach is a self-made personalized smart tv which uses a client-server architecture together with ami to implement popular tv functions. the three most distinctive functions of this approach are recognizing the user's gestures to control the tv, creating collaborative recommendations from social opinions, and using environmental collaboration to enable context-aware services [25].

the final approach, similar to our solution, is a custom-made intelligent home media center such as the intelligent multimedia service system (imss) [27] and the ubiquitous-hybrid multimedia system (u-hms) [28]. imss is based on context awareness and ubiquitous computing. it provides multimedia interoperability among incompatible multimedia devices, device-specific video encoding, and copyright and license management. similarly, u-hms offers multimedia interoperability among incompatible devices, transparent services, an authentication method and security services. it is based on a wireless sensor network (wsn), context awareness, and mpeg-21 dia/video transcoding. a disadvantage of these two systems is that they depend on external services such as a certificate authority (ca), digital object identifier (doi), multimedia/content management system (mms/cms) and license management system (lms).

3. design of an intelligent home media center

this chapter describes the development of an intelligent home media center together with the basic usage scenarios. all required components are listed with a short description and their required and desirable features.
the following hardware components are needed to set up an intelligent home media center:
• network-attached storage (hereinafter: nas) for storing all data
• main server for sharing meta data between clients and for hosting the web server
• two microcomputers
• sensors
• power switches
• mobile devices (mobile phone, laptop)
• router
• output device (tv).

communication between devices goes mainly through the main server. one microcomputer communicates directly with the nas storage when its multimedia content is played. all other communication goes through the main server. this approach makes it possible to store the history of all actions in the system.

fig. 1 design of an intelligent home media center

3.1. infrastructure overview

nas storage

nas storage will be used to store all multimedia content. this storage should be always on and accessible to all users inside the private network. to the outside world this storage should be invisible. the server used for nas storage should have a large hard drive for storing a lot of multimedia content. the processor does not need to be very powerful, and the amount of ram is not crucial. scheduled tasks will run on this server to scan the whole system and send information about newly available multimedia content to the main server. the file share on this server should be available at all times to all users in the network. this server does not need any kind of graphical user interface (gui), since only data will be passed between the nas storage and other devices. an additional requirement is a file share which is accessible from different operating systems, both mobile and desktop.

main server

this server should have more processing power than the nas server. the amount of ram on this server should also be larger than on the nas server, because it will communicate with a number of end-users and devices at once.
on the other hand, its hard drive does not need to be large. this server will have a database with meta data about all multimedia content available in the system. it should be possible to share the content meta data inside the private network over http, and with outside networks over http and email. that requires an email server running on the main server. the main server should also have a web server installed, through which other devices will access the list of available multimedia content. in addition, the main server should work as a "main switch" which allows end-users to turn off any device in the system. the main server will also be able to turn on any device in the system automatically when needed. these features should also be accessible over the web through web services. there is no need for a desktop gui on the main server, but terminal access over ssh is required.

microcomputers

end-users will browse multimedia content over http using the microcomputer(s). software for playing multimedia content is required, as well as a graphics card which can handle high-quality video. processing power and the amount of ram for these computers should be average. these computers should be able to work with sensors or readers which can recognize the presence of certain persons, and to notify the main server about it. the hard drive capacity of these computers can be very low. these computers must have a gui which can be controlled with an input device (such as a tv remote).

sensors

the system should contain sensors which can detect the presence of a certain person or device inside the network. these sensors should only register the presence. the microcomputer should send the information about the detected person to the main server.
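as a rough illustration of this notification step, the sketch below builds the message a microcomputer might send to the main server once a sensor reports a known tag. the tag ids, the field names and the json layout are assumptions made for this example; the design above does not specify them, and the actual transport (a rest call to the main server) is left out.

```python
import json
from datetime import datetime, timezone

# hypothetical mapping from sensor tag ids to household members (assumed data)
KNOWN_TAGS = {"04:a2:19:7f": "alice", "04:b7:55:c1": "bob"}


def build_presence_event(tag_id, detected_at=None):
    """Return a JSON string describing a detected person, or None for unknown tags."""
    person = KNOWN_TAGS.get(tag_id.lower())
    if person is None:
        return None  # the sensor only registers presence; unknown tags are ignored
    when = detected_at or datetime.now(timezone.utc)
    return json.dumps({
        "event": "presence",          # assumed event name
        "person": person,
        "detected_at": when.isoformat(),
    })
```

the returned string would be the body of the request sent to the main server; keeping the payload construction separate from the network call makes it easy to check without any hardware attached.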
sensors should also be able to record additional variables inside an intelligent home media center, such as: what the weather is, whether the light is on, whether the person is moving, etc. to meet this requirement, for example, temperature sensors could be used to measure temperature, nfc tag readers to recognize persons, etc.

other devices

the system should also contain power switches, mobile devices, a router and one output device. power switches are used to turn on the microcomputers. mobile devices will be used for controlling the system through a native application which communicates with the main server. this application will provide basic functionality for controlling the system, and it should be able to communicate with the main server from both private and public networks. the system must have a router which provides internet access for all computers inside the private network. this router must have an option to set a static identifier for all devices. some services on the devices must also be visible from the outside world. a client output device must be present in the system. on the output device users will watch multimedia content and see the status of the system.

3.2. software communications and relations

communication between software components is illustrated in figure 2. a mysql database will be installed on the main server. only the main server should have access to this database. the main server should also host a web page together with an api visible to the outside world, and should be reachable over ssh. the android application will communicate with the main server through the web page and rest api and over an ssh connection. the microcomputer with various sensors will communicate only with the main server and only via the rest api. the nfc tag reader will be connected to one microcomputer.
the other microcomputer, with a bundle of sensors, will be connected to a tv, and it will communicate with both the samba share server on the nas and the web interface on the main server. the nas storage will have an ssh connection open to the outside world and a samba file share which makes all multimedia content available to all users of the intelligent home media center.

fig. 2 software communications and relations

3.3. usage scenario

the activity diagram for the usage scenario is presented in figure 3. when a person comes within range of the intelligent home media center, the microcomputer detects this through its sensors. the microcomputer contacts the main server to inform it about the presence of the person. the main server turns on all devices in the network. users of the intelligent home media center can control the center from mobile devices or from the microcomputer which is connected to the output device. through the microcomputer and output device the user can browse all multimedia content in the network. the system will also be able to suggest to the end-user what to watch or listen to. for this particular use case the concept of ambient intelligence should be implemented. based on the data received from sensors and on the user profile, the microcomputer will offer appropriate content.

fig. 3 activity diagram for the usage scenario

4. implementation

in this chapter the implementation of the intelligent home media center is described in detail. implementation details for each component are discussed, together with best practices and approaches.

4.1. nas storage

as the operating system on the nas storage we chose linux centos 7 with a minimal installation. a samba file share is installed on the server. besides the samba share, only the perl programming language and the apache server are installed. this server communicates with the main server using rest (representational state transfer) web services.
an xml settings file on this server stores all paths for multimedia content inside the system. cron jobs are used for scheduled scanning of the system. a cron job runs a perl script each day at 5:00 am. the perl script first reads the xml settings file and searches each of the paths for available content. for each movie, the perl script retrieves meta data from the open movie database (omdb) api through a rest client. when all data is collected, it is packed in json format and sent to the main server, again over rest. since this operation is resource-intensive, it is run when the nas has no other pending requests. if system usage is over 50%, a bash script creates a new cron job which starts the same perl script 30 minutes later.

the apache server is used to serve images over http. since not all images are accessible on the server, another daily cron job creates symbolic links from the images folder to the root folder of the apache server. there is also a rest service which is used to start bash scripts. bash scripts are used for basic data manipulation and for shutting down the system. each time the nas server receives a request from the network, it informs the main server about it.

4.2. main server

the main server machine has more processing power than the one used for the nas server. the main server also does not require a gui, but it requires a stable and powerful operating system. again, centos 7 with a minimal installation was chosen. on the main server the apache web server is installed alongside a mysql database, an email client and a php rest server. the php rest server is used to collect all requests from devices in the system. the main server receives meta data about multimedia content from the nas server. all received meta data is stored in a database. the main server can also receive a rest request for turning off a device in the system, sending content by email or displaying data.
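the way such requests could be routed on the main server is sketched below in python rather than php, purely for illustration. the action names, the request layout and the handler bodies are assumptions for this example; in the described system the handlers would use ssh, the email client and the mysql database respectively.

```python
# hypothetical handlers: each one only reports what it would do
def turn_off_device(params):
    return {"status": "ok", "action": "turn_off", "device": params["device"]}


def send_email(params):
    return {"status": "ok", "action": "send_email", "recipients": params["recipients"]}


def display_data(params):
    # a real handler would query the meta data database here
    return {"status": "ok", "action": "display", "items": []}


# assumed action names; the paper only lists the three kinds of request
HANDLERS = {
    "turn_off": turn_off_device,
    "send_email": send_email,
    "display": display_data,
}


def handle_request(request):
    """Dispatch one decoded REST request (a dict) to the matching handler."""
    handler = HANDLERS.get(request.get("action"))
    if handler is None:
        return {"status": "error", "reason": "unknown action"}
    try:
        return handler(request.get("params", {}))
    except KeyError as missing:
        return {"status": "error", "reason": f"missing parameter {missing}"}
```

a table of handlers keeps every request type behind one entry point, which fits the stated aim of logging all actions centrally on the main server.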
the main usage scenarios are:
• meta data about multimedia content is synchronized between the nas server and the main server database: when the nas server sends meta data about multimedia content, all meta data for content which is no longer on the nas server is removed from the main server database, and all new content is added to the database.
• when a request for turning off a device is received, the main server turns off the selected device through an ssh connection. turning off can also be scheduled over ssh.
• when a request for sharing meta data by email is received, the main server sends the meta data via email to the desired addresses.
• when a request for displaying data is received, the main server collects data from the database and sends it in json format. this use case can occur only when the native mobile application is communicating with the server.

the apache web server hosts a web page which displays all multimedia meta data from the database. besides that, the web page also implements the following functionality: a turn-off request for a selected device, the power status of all devices in the system, requests for putting devices to sleep, and a download request for desired content through a torrent client.

one of the implemented web pages is the "what to do" page. when the user visits this page, the system asks whether the user wants to view images, play music or watch videos. based on a few simple questions ("how do you feel?", "how much time do you have?", "what would you like to do?") and the user's responses, the system offers appropriate multimedia content to the user. this small feature is based on the concept of ambient intelligence. each time the user views some multimedia content, the microcomputer writes to a database the genre of the content, the weather, the time of day (so that we can track what the user wants to do at what time) and the user's mood.
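the kind of record written after each viewing could look like the sketch below. the field names and the four time-of-day buckets are assumptions for illustration; the text only states that genre, weather, time of day and mood are stored.

```python
from dataclasses import dataclass, asdict
from datetime import datetime


def time_of_day(moment):
    """Bucket a timestamp into a coarse part of the day (assumed boundaries)."""
    hour = moment.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


@dataclass
class ViewingRecord:
    user: str
    genre: str
    weather: str
    mood: str
    watched_at: datetime

    def to_row(self):
        """Flatten the record into a dict ready to be stored or sent as JSON."""
        row = asdict(self)
        row["time_of_day"] = time_of_day(self.watched_at)
        row["watched_at"] = self.watched_at.isoformat()
        return row
```

storing the time-of-day bucket next to the raw timestamp means later lookups by context do not have to recompute it.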
based on the user's answers, the time of day, the weather and previous data, the system offers certain multimedia content to the user. the table with user profiles contains zero or more entries for each user. when the user has answered all the questions, the data from the user profile database is checked. if there are no entries for the user, all content that fulfills the user's request is displayed. if there are entries in the user profile database, only profile entries younger than two months are selected (we assume that the user's habits and expectations for multimedia content change over time). if there are entries with the same weather and time of day as at the moment the user answered the questions, they are used. otherwise all available entries from the user profile database are used. finally, a hash of the user's preferences is created. for example, if the user wants to watch a movie, the hash with the user's preferences contains the favorite genre, the average duration of the movies the user watched, the average imdb rating of the movies the user watched, etc. the preferences hash, together with the user's request, is used to search for appropriate content for the user. if the user wants to exercise and has less than half an hour available, the main server will offer p90x cardio training, since this training lasts only 20 minutes and can be done indoors. an example of using the "what to do" feature on an android device is presented below:

fig. 4 android application for the intelligent home media center

4.3. microcomputers

we chose two raspberry pi 2 model b microcomputers because of their price, performance and availability. we chose the raspbian operating system because of its expandability and large user community. one microcomputer is connected to the output device.
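returning briefly to the "what to do" selection of section 4.2, its steps can be sketched as follows. this is one reading of the description, not the authors' code: "two months" is approximated as 60 days, "all available" is taken to mean all recent entries, and the entry keys (`watched_at`, `weather`, `time_of_day`, `genre`, `runtime_min`, `imdb_rating`) are hypothetical names.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import mean

MAX_AGE = timedelta(days=60)  # "two months", approximated (assumption)


def select_entries(entries, now, weather, time_of_day):
    """Keep recent profile entries, preferring those recorded in the same context."""
    recent = [e for e in entries if now - e["watched_at"] <= MAX_AGE]
    same_context = [
        e for e in recent
        if e["weather"] == weather and e["time_of_day"] == time_of_day
    ]
    # fall back to every recent entry when nothing matches the current context
    return same_context or recent


def build_preferences(entries):
    """Summarise the selected entries into a preferences hash (a dict)."""
    if not entries:
        return None  # no usable history: show everything matching the request
    genres = Counter(e["genre"] for e in entries)
    return {
        "favorite_genre": genres.most_common(1)[0][0],
        "avg_runtime_min": mean(e["runtime_min"] for e in entries),
        "avg_imdb_rating": mean(e["imdb_rating"] for e in entries),
    }
```

the resulting dict plays the role of the preferences hash; combining it with the user's answers to pick concrete titles is left out of the sketch.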
on the raspberry pi connected to the output device, two additional components have been installed: the libcec library and the kodi media player. the libcec library has been installed to monitor events from the remote control of the output device to which the raspberry pi is connected. the keyboard arrow and enter keys have been mapped, together with a few other buttons, to allow browsing menus with the output device's remote and turning off the raspberry pi from the remote. the kodi media player is used to play movies and music from the nas storage.

an nfc tag reader is connected to the other raspberry pi. the purpose of the nfc tag reader is to detect when certain persons are nearby, within the range of the intelligent home media center. when a person's presence is detected, the system sends a rest request to the main server to turn on all devices. the nfc tag reader is placed at the entrance door. users have nfc tag bracelets and key chains. when entering, the user should place the nfc tag near the reader so that the tag is recognized.

4.4. mobile devices

we used a mobile phone with the android operating system. a native application has been developed to communicate with the main server over the rest api. this application can communicate with the main server even from outside the private network, because port 80 on the main server is visible to the outside world.

4.5. output device

we used an lg tv as the output device. our tv has hdmi-cec support (this feature is called simplink on lg tvs). it is important to use an hdmi cable with cec support; otherwise the tv's remote events will not be passed to the microcomputer.

4.6. components setup

the main server, nas server and router are in the same room. since a large amount of data is transferred through the nas server, it is connected to the router via a lan cable. the main server is also connected to the router via a lan cable because it communicates frequently with the nas server. all other devices in the system are connected to the router over wi-fi. since the microcomputers do not have built-in wi-fi receivers, small external wi-fi receivers are provided.
the nfc card reader is connected to one of the microcomputers to detect the presence of certain tags nearby. the second microcomputer is connected to the tv. one or more mobile devices can be in the private network.

5. conclusion
in this paper we have explained in detail the design and implementation of a personal intelligent home media center. our solution has multiple advantages compared to a classic home media center, such as a dvd player.

designing an intelligent home media center 473

compared to a dvd player or a home theater, our solution is more comfortable to use because it allows users to control the system from multiple devices (such as a mobile phone, laptop, tv remote, etc.). also, most home theaters support only a few multimedia content types, whereas a self-built solution makes it possible to stream almost any content. compared to a smart tv, our solution is cheaper, has a richer set of features, and is open for communication with any device over the rest server. the intelligent home media center is developed to satisfy the personal needs of the end user. the solution is very scalable, and it is easy to add more components such as clients, servers and other devices. it is inexpensive to build, fast and intelligent. it uses sensors to detect a person's presence and concepts of ambient intelligence to recommend appropriate media to the user. it is easily maintainable: devices can be replaced or upgraded, and additional software features can be implemented.
there are also some disadvantages of the design and implementation of this personal intelligent home media center. it requires good knowledge of various information technologies. consequently, compared to other alternatives, it is not a plug & play solution. there are many components in the system, and if one component is not configured correctly, the whole system will not work. we will try to address these disadvantages in our future work.
further development of our personal intelligent home media center will focus on integration with digital interactive television. based on the recognized mood of the user and the user's individual profile, the system could automatically select tv content. this integration could be especially exploited in the interactive tv advertising landscape [32], [33]. in addition, further development will introduce additional security features, such as parental control and user privacy.

references
[1] c. benavente-peces, a. ahrens, j. filipe, “advances in technologies and techniques for ambient intelligence”, journal of ambient intelligence and humanized computing, 2014, vol. 5, no. 5, pp. 621-622.
[2] d. j. cook, j. c. augusto, v. r. jakkula, “ambient intelligence: technologies, applications, and opportunities”, pervasive and mobile computing, vol. 5, no. 4, pp. 277-298.
[3] k. ducatel, m. bogdanowicz, f. scapolo, ambient intelligence: from vision to reality. ist advisory group, 1–31. retrieved from http://scholar.google.com/scholar?hl=en&btng=search&q=intitle:ambient+intelligence+:+from+vision+to+reality#1
[4] k. ducatel, m. bogdanowicz, f. scapolo, j. leijten, j.-c. burgelman, istag scenarios for ambient intelligence in 2010. society, 58. retrieved from ftp://ftp.cordis.europa.eu/pub/ist/docs/istagscenarios2010.pdf
[5] internet source: http://www.smarthomeusa.com/smarthome, retrieved 2015
[6] j.-c. rosslin, k. tai-hoon, “applications, systems and methods in smart home technology: a review”, international journal of advanced science and technology, 2010, vol. 15, pp. 37-48.
[7] m. friedewald, o. da costa, y. punie, p. alahuhta, s. heinonen, “perspectives of ambient intelligence in the home environment”, telematics and informatics, 2005, vol. 22, no. 3, pp. 221-238.
[8] amigo: ambient intelligence for the networked home environment. (2008). retrieved from http://www.hitech-projects.com/euprojects/amigo.
[9] a. de paola, s. gaglio, g. lo re, m. ortolani, “sensor 9k: a testbed for designing and experimenting with wsn-based ambient intelligence applications”, pervasive and mobile computing, vol. 8, no. 3, pp. 448–466, 2012.
[10] a. sorici, g. picard, o. boissier, a. zimmermann, a. florea, “consert: applying semantic web technologies to context modeling in ambient intelligence”, computers & electrical engineering, vol. 44, pp. 280-306, 2012.
[11] f. sebbak, a. chibani, y. amirat, a. mokhtari, f. benhammadi, “an evidential fusion approach for activity recognition in ambient intelligence environments”, robotics and autonomous systems, vol. 61, no. 11, pp. 1235-1245, 2013.
[12] peterson, k. e. (2002). if high-tech it is your idea of paradise, welcome to valhalla.
[13] s. russell, p. norvig, “artificial intelligence: a modern approach”, international dental journal, vol. 60.
[14] i. f. akyildiz, w. su, y. sankarasubramaniam, e. cayirci, “a survey on sensor networks”, ieee communications magazine, vol. 40, no. 8, pp. 102-105, 2002.
[15] galton, a. (2000). qualitative spatial change. oxford university press.
[16] d. cook, m. youngblood, s. das, “a multi-agent approach to controlling a smart environment”, designing smart homes, pp. 165-182.
[17] s. lühr, g. west, s. venkatesh, “recognition of emergent human behaviour in a smart home: a data mining approach”, pervasive and mobile computing, vol. 3, no. 2, pp. 95-116, 2007.
[18] m.c. mozer, “lessons from an adaptive home”, smart environments, pp. 271-294, 2005.
[19] v. lesser, m. atighetchi, b. benyo, b. horling, a. raja, r. vincent, s. x. q. zhang, “the intelligent home testbed”, environment, vol. 2, no. 15, 1999.
[20] s. s. intille, k. larson, j. s. beaudin, j. nawyn, e. munguia tapia, p. kaushik, “a living laboratory for the design and evaluation of ubiquitous computing technologies”, in proceedings of the chi ’05 extended abstracts on human factors in computing systems, 2005, pp. 1941-1944.
[21] j. pineau, m. montemerlo, m. pollack, n. roy, s. thrun, “towards robotic assistants in nursing homes: challenges and results”, robotics and autonomous systems, vol. 42, no. 3-4, pp. 271-281, 2003.
[22] m. chan, d. estève, c. escriba, e. campo, “a review of smart homes-present state and future challenges”, computer methods and programs in biomedicine, vol. 91, no. 1, pp. 55-81, 2008.
[23] m. friedewald, e. vildjiounaite, y. punie, d. wright, “privacy, identity and security in ambient intelligence: a scenario analysis”, telematics and informatics, vol. 24, no. 1, 2007.
[24] internet source: http://kodi.tv/about/, retrieved 2015
[25] internet source: http://www.digilifeonline.com.au/, retrieved 2015
[26] l. wei-po, c. kaoli, j.-y. huang, “a smart tv system with body-gesture control, tag-based rating and context-aware recommendation”, knowledge-based systems, vol. 56, pp. 167-178, 2014.
[27] j. park, h. park, s. lee, j. choi, d. lee, “intelligent multimedia service system based on context awareness in smart home”, context, pp. 1146–1152, 2005.
[28] j. h. park, s. lee, j. lim, l.t. yang, “u-hms: hybrid system for secure intelligent multimedia data services in ubi-home”, journal of intelligent manufacturing, vol. 20, no. 3, pp. 337-346, 2009.
[29] krill, p. (2000). ibm research envisions pervasive computing.
[30] m. weiser, “the computer for the twenty-first century”, scientific american, vol. 265, no. 3, pp. 94–104, 1991.
[31] m.
weiser, “hot topics-ubiquitous computing”, computer, vol. 26, no. 10, 1993.
[32] e. athanasiadis, s. mitropoulos, “a distributed platform for personalized advertising in digital interactive tv environments”, journal of systems and software, vol. 83, no. 8, pp. 1453-1469, 2010.
[33] g. lekakos, d. papakiriakopoulos, k. chorianopoulos, an integrated approach to interactive and personalized tv advertising. channels, 1-10, 2001.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 603-617
https://doi.org/10.2298/fuee2204603d
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n
original scientific paper

chaotic seismic signal modeling based on noise and earthquake anomaly detection
leila dehbozorgi, reza akbari-hasanjani, reza sabbaghi-nadooshan
department of electrical engineering, central tehran branch, islamic azad university, tehran, iran

abstract. since ancient times, people have tried to predict earthquakes using simple perceptions such as animal behavior. the prediction of the time and strength of an earthquake is of primary concern. in this study, chaotic signal modeling based on noise is used to detect anomalies before an earthquake with artificial neural networks (anns). anns are efficient tools for solving complex problems such as prediction and identification. the effective features of the chaotic signal model are obtained by considering noise and the detection of anomalies five minutes before an earthquake occurrence. the neuro-fuzzy classifier and the mlp neural network showed acceptable accuracies of 84.6491% and 82.8947%, respectively. the results demonstrate that the proposed method is an effective seismic signal model based on noise and anomaly detection before an earthquake.
key words: artificial neural networks, chaos, earthquake, entropy, prediction, seismic signal processing, wavelet transforms

1.
introduction
earthquake prediction is a branch of seismology and should be distinguished from earthquake warning systems, which provide a real-time warning to regions that might be affected. the purpose of a chaotic signal model that considers noise and the detection of anomalies before an earthquake is to warn of an impending major earthquake in order to reduce death and destruction. in the 1970s, scientists were optimistic that a practical method for predicting earthquakes would soon be found [1]. however, further devastating earthquakes occurred that caused destruction and loss of life exceeding 6,300 persons in the m7.2 1995 kobe earthquake in japan, 15,000 in the m7.4 1999 izmit earthquake in turkey, and over 30,000 in the m6.7 2003 bam earthquake in iran [2].

received april 4, 2022; revised july 26, 2022; accepted august 10, 2022
corresponding author: reza sabbaghi-nadooshan, department of electrical engineering, central tehran branch, islamic azad university, tehran, iran
e-mail: r_sabbaghi@iauctb.ac.ir

there are many common methods of detecting anomalies before an earthquake, which use artificial neural networks (anns), genetic programming (gp), and radial basis function networks. artificial neural networks have applications in areas such as identification, prediction, and image processing. in ref. [3], a back-propagation neural network and newmark displacement analysis were used to examine the earthquake risk in the manjil-rudbar area damaged in 1990. in order to evaluate earthquake signals, it is better to use factual information than the null hypothesis. ref. [4] considered a model for noisy signals and the detection of anomalies before an earthquake using anns and obtained acceptable results for the ghir station in iran.
researchers have developed software for short-term earthquake prediction using pressure reduction and temperature rise, which resulted in 70.5% accurate forecasting in japan; the accuracy of this network is not optimal for prediction [5]. ref. [6] used location-related parameters in a neural network to predict earthquakes in iran. the researchers in ref. [7] used the deep learning model dlep for earthquake prediction, which uses explicit and implicit features; however, it offers no suitable time frame for earthquake prediction. ref. [8] discusses a neural network that predicts the arrival time of the earthquake p-wave in taiwan; the time frame for earthquake prediction is short. ref. [9] used a grnn neural network to predict earthquakes on the iranian plateau. ref. [22] examines the possibility of using the dlis algorithm to identify and reconstruct the location, size, and thickness distribution of several complex defects. ref. [23] used 8 mini-stations of a new region located in north sumatra and an svm model (one of the machine learning tools in digital signal processing) to distinguish seismic activities. the proposed model has acceptable accuracy, but more data are needed to test the network performance. ref. [24] used deep learning to predict earthquakes by investigating the p-wave, but the time frame for earthquake prediction is short. ref. [25] also used deep learning and neural networks to predict earthquakes; the prediction is made 3 seconds before the earthquake, which is a short time for a forecast. ref. [26] suggested the augmented linear mixing model (almm) method. that article focuses mostly on image processing, object recognition, and classification. for this purpose, a dictionary is defined to model spectral variables.
that paper focuses on image processing rather than signal processing. the dictionary bears little similarity to the training data of a neural network, and the application of image processing methods to signal processing needs more investigation. ref. [27] proposed a model called fourier-based rotation-invariant feature boosting (frifb) to increase the speed of calculations and reduce complexity. in that approach, the fourier transform is calculated in polar coordinates and then the subsequent analyses are performed. in this article, we define several extracted frequency features, which are used in almost the same way as in that article but with some differences. for example, to extract newer and different features, we applied the signal divider to the fft and psd and then extracted statistical features for each part. this is explained in more detail in sections 2-3. in ref. [10], two feature groups, consisting of 54 and 87 features, were compared for detecting anomalies before an earthquake. the accuracy values of the data classifier and the mlp neural network are 60.6383% and 55.8511% for the feature matrices with dimensions of 54 and 87, with a total of 626 records. this method employed much more data than previous methods. it does not reach the desired accuracy, but it provides a time frame before the earthquake in which to predict it. most previous articles use a short time frame to predict earthquakes, and the number of features is small and limited to geological features. for this reason, it is impossible to make an accurate decision about which types of features are effective in the occurrence of an earthquake. this study aims to determine the most desirable feature matrix for detecting anomalies within 5 minutes before an earthquake and for chaotic signal modeling.
in this study, a new class of effective features is developed for chaotic signal modeling before an earthquake using intelligent networks, with a more extensive database for generalization than previous methods. a model is then evaluated for noisy signals and the detection of anomalies before an earthquake using neuro-fuzzy and mlp classifiers. the innovations of this article are the use of more data for training and testing the classifiers, the use of the 5 minutes before an earthquake to predict it, the use of features different from those of previous articles, and the comparison of the performance of the neuro-fuzzy and mlp classifiers. most papers use only geological or frequency-related features. in this article, we examine different types of features and determine the effectiveness or ineffectiveness of each.
the rest of the paper is organized as follows: section 2 discusses the basic concepts of the features and the ann structure. section 3 proposes the design method and discusses the simulation results. section 4 concludes with the obtained results.

2. data and methodology
chaotic signal modeling based on noise was employed with neuro-fuzzy and mlp classifiers, using a large amount of data and some new features. the seismic waves are processed to detect anomalies before an earthquake onset. the method is divided into six main stages (fig. 1): (1) consider the earthquake onset to select an observation window and detect anomalies; (2) slice the rest of the signal into two sections; (3) high-pass filter the signals to reject baseline drift; (4) extract features from the filtered signal; (5) feed the feature vector to the intelligent networks; (6) after training and testing the classifiers, select the effective features using the uta algorithm [11,12].
the selected signals were processed using a high-pass butterworth filter with a cut-off frequency (fc) of 0.04 hz to remove baseline drift [4]. there was not enough evidence showing how an earthquake relates to a known feature, so a mixture of time, time-scale, and chaos features was extracted, and the effective features were selected after achieving acceptable accuracy [4, 10]. the whole process of the algorithm is shown in fig. 1.

2.1. features
2.1.1. statistical features
the ten statistical features evaluated are mode, mean, variance, covariance, maximum, minimum, signal standard deviation, median, skewness (sk, the deviation from symmetry) and kurtosis (k, the stretch factor). the sk and k features are defined as follows, where x_i and μ denote the signal and the mean of the signal, respectively, E[·] denotes the average over the data, and n is the number of data [13].

$$ sk = \frac{E\left[(x_i - \mu)^3\right]}{\left(E\left[(x_i - \mu)^2\right]\right)^{3/2}} \tag{1} $$

$$ k = \frac{E\left[(x_i - \mu)^4\right]}{\left(E\left[(x_i - \mu)^2\right]\right)^{2}} - 3 \tag{2} $$

fig. 1 flowchart of chaotic modeling based on noise and detecting anomalies before the earthquake
[flowchart boxes: entering signal; separating the earthquake signal from the main signal; remove 5 minutes before the earthquake; dividing the rest of the signal into two equal parts; filtering both signal sections to remove low frequencies; extracting the features for each section; neural network (output = 1?), yes: anomalies before the earthquake occur, no: no anomalies detected; uta algorithm; effective features selection; finish]

2.1.2. chaos features
chaotic systems are highly dependent on initial conditions. in other words, if two trajectories start very close to each other, they diverge from each other rapidly and exponentially if and only if their processes have chaotic behavior.
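the sensitivity to initial conditions described above can be illustrated with a minimal sketch. it uses the logistic map with r = 4, a standard textbook chaotic system chosen here only for illustration; it is not part of the paper's method, and the starting point and the initial separation are arbitrary assumptions.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map (illustrative chaotic system, not from the paper)."""
    return r * x * (1.0 - x)


def separation_history(x0, delta0, steps):
    """Iterate two trajectories that start delta0 apart and record their distance."""
    a, b = x0, x0 + delta0
    history = [abs(b - a)]
    for _ in range(steps):
        a, b = logistic_map(a), logistic_map(b)
        history.append(abs(b - a))
    return history


# Two starting points that differ by 1e-10 (assumed values).
history = separation_history(x0=0.2, delta0=1e-10, steps=40)
for step in (0, 10, 20, 30, 40):
    print(step, history[step])
```

the printed distances grow by many orders of magnitude within a few dozen iterations, which is the behavior that the chaos features try to capture.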
the rate at which the two trajectories separate after a time period t is measured by the lyapunov exponent (λ) in eq. (3), where x_0 is a point on a trajectory, x_0 + Δx_0 is a nearby point on a different trajectory, Δx_0 is the initial separation between the two points and approaches zero, and Δx(x_0, t) is the separation after time t.

$$ \lambda = \lim_{t \to \infty} \frac{1}{t} \ln \frac{\left|\Delta x(x_0, t)\right|}{\left|\Delta x_0\right|} \tag{3} $$

there are three states for the lyapunov exponent (λ): (1) λ > 0: the system is chaotic. (2) λ < 0: the system is not chaotic. (3) λ = 0: the system reaches a steady-state condition [13].

2.1.3. signal divider
a signal divider is applied to classify the data between the maximum and minimum signal values. the signal range is divided into 16 equal classes, and the amount of data in each class is extracted as a feature.

2.1.4. entropy
entropy is a measure of the disorder of a system. the entropy H(X) of a discrete random variable X is evaluated as follows [14], where p(x) is the probability of occurrence of x.

$$ H(X) = -\sum_{x} p(x) \log_2 p(x) \tag{4} $$

2.1.5. discrete wavelet transform (dwt)
the wavelet transform can be seen as the projection of a signal onto a set of basis functions named wavelets. a wavelet transform is built from a mother wavelet function and has an excellent localization characteristic in the time-scale domain [15]. most of the energy of a wavelet function is concentrated in a short interval and is damped quickly. of the various types of wavelet functions, the daubechies wavelet is one of the most common. in wavelet transforms, the signal passes through an internal filter and is divided into a low-frequency (ca) and a high-frequency (cd) component. the dwt of the signal x[n] is defined in terms of the approximation coefficients W_φ[j_0,k] and the detail coefficients W_ψ[j,k], as shown as follows.
$$ W_\varphi[j_0, k] = \frac{1}{\sqrt{M}} \sum_{n} x[n]\, \varphi_{j_0,k}[n] \tag{5} $$

$$ W_\psi[j, k] = \frac{1}{\sqrt{M}} \sum_{n} x[n]\, \psi_{j,k}[n], \quad \text{for } j \ge j_0 \tag{6} $$

where n = 0,1,2,…,M-1, k = 0,1,2,…,2^j-1, j = 0,1,2,…,J-1, and M is the number of samples to be transformed using the wavelet function. the basis functions φ_{j,k}[n] and ψ_{j,k}[n] are defined as follows, where φ[n] is the scaling function and ψ[n] is the wavelet function [4,16].

$$ \varphi_{j,k}[n] = 2^{j/2}\, \varphi[2^{j} n - k] \tag{7} $$

$$ \psi_{j,k}[n] = 2^{j/2}\, \psi[2^{j} n - k] \tag{8} $$

the daubechies 2 wavelet transform is applied in five successive steps. the output array of each step has half the length of the input selected at that step. the statistical values of each step are used as features.

2.1.6. fast fourier transform (fft)
the fast fourier transform (fft) is calculated using eq. (9), which corresponds to an n×n transform matrix [17]. the statistical features and the data classifier (signal divider) evaluated on the fft of the signal are used as features.

$$ X(k) = \sum_{n=0}^{N-1} x(n) \exp\left(-j 2\pi k n / N\right), \quad k = 0, 1, 2, \ldots, N-1 \tag{9} $$

2.1.7. power spectral density (psd)
the power spectral density (psd) function shows the strength of the variations (energy) as a function of frequency. it shows at which frequencies the variations are strong and at which frequencies they are weak. the energy within a specific frequency range is obtained by integrating the psd over that range. the psd is computed directly by computing the autocorrelation function R(τ) and then transforming it. the results are given in the following formulas for a signal s(t).

$$ p(t) = s^{2}(t) \tag{10} $$

$$ S(f) = \int_{-\infty}^{+\infty} R(\tau) \exp\left(-j 2\pi f \tau\right) d\tau = \mathcal{F}\left(R(\tau)\right) \tag{11} $$

the power of the signal in a frequency band can be calculated as:

$$ P = \int_{f_1}^{f_2} S(f)\, df + \int_{-f_2}^{-f_1} S(f)\, df \tag{12} $$

afterward, the statistical features of the signal's psd were derived as psd features.

2.1.8. trajectory
a trajectory is a path followed by an object moving through space as a function of time.
in the present study, a signal with n data points is considered. each part of the signal is represented as {x(t1), x(t2), …, x(tn)}, where t1, t2, …, tn are the times at which the data of the time series are stored [18]. first, the graph of x(n+1) versus x(n) is drawn as the signal trajectory and is then divided into 16 houses (cells). the number of data points that fall in each house is stored in a matrix and used as a feature.

2.2. classification networks
2.2.1. multilayer perceptron (mlp) network
the multilayer perceptron (mlp) is a well-known feed-forward neural network that is commonly used for classification because of its good performance. generally, an mlp contains an input layer, an output layer and one or more hidden layers. after the structure of the network is formed, the neurons are connected by weighted links and the weights are trained using a training algorithm (fig. 2) [19].

fig. 2 the structure of multi-layer perceptron network (input layer, first hidden layer, second hidden layer, output layer)

2.2.2. neuro-fuzzy classification networks
neuro-fuzzy systems combine two significant paradigms: fuzzy logic and neural networks [20]. fuzzy logic programming in matlab software includes conditional statements. the neural network consists of several nodes which are connected to each other by weights (fig. 3).

fig. 3 network representation of the fuzzy system [10]

$$ f(x) = \frac{\sum_{l=1}^{M} \bar{y}^{l} \left[ \prod_{i=1}^{n} \exp\left(-\left(\frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}}\right)^{2}\right) \right]}{\sum_{l=1}^{M} \left[ \prod_{i=1}^{n} \exp\left(-\left(\frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}}\right)^{2}\right) \right]} \tag{13} $$

here, the three parameters σ_i^l, ȳ^l and x̄_i^l are defined in the learning phase and must be determined to design a neuro-fuzzy system, and M is the number of rules considered.
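as a minimal sketch, the output of the fuzzy system in eq. (13) could be evaluated as follows: each rule multiplies gaussian memberships over the inputs, and the output is the rule-weighted average. the function name, the argument names and the two-rule toy parameters are illustrative assumptions, not values from the paper; only the decision threshold of 0.49 is quoted from the results section, and the direction of the comparison is assumed.

```python
import math

THRESHOLD = 0.49  # classification threshold quoted in the paper; ">=" is an assumption


def fuzzy_output(x, centers, sigmas, y_bar):
    """Evaluate eq. (13) for one input vector.

    x       : list of n input features
    centers : M lists of n Gaussian centres (x-bar_i^l)
    sigmas  : M lists of n Gaussian widths  (sigma_i^l)
    y_bar   : M rule outputs                (y-bar^l)
    """
    a = 0.0  # weighted sum over the rules (numerator)
    b = 0.0  # plain sum over the rules (denominator)
    for c_l, s_l, y_l in zip(centers, sigmas, y_bar):
        z_l = 1.0  # product of the Gaussian memberships of rule l
        for x_i, c_i, s_i in zip(x, c_l, s_l):
            z_l *= math.exp(-((x_i - c_i) / s_i) ** 2)
        a += y_l * z_l
        b += z_l
    return a / b


# Hypothetical two-rule, two-input system used only to exercise the function.
centers = [[0.0, 0.0], [1.0, 1.0]]
sigmas = [[0.5, 0.5], [0.5, 0.5]]
y_bar = [0.0, 1.0]

for x in ([0.0, 0.0], [0.5, 0.5], [1.0, 1.0]):
    f = fuzzy_output(x, centers, sigmas, y_bar)
    print(x, round(f, 4), int(f >= THRESHOLD))
```

with many inputs, the product of memberships can underflow to zero for every rule, so a practical implementation would likely work with logarithms or guard the division.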
input x passes through a product gaussian operator to become z^l; the result of this stage then passes through the summation operator b and the weighted summation operator a. finally, the output f is calculated [20].

$$ z^{l} = \prod_{i=1}^{n} \exp\left(-\left(\frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}}\right)^{2}\right) \tag{14} $$

$$ b = \sum_{l=1}^{M} z^{l} \tag{15} $$

$$ a = \sum_{l=1}^{M} \bar{y}^{l} z^{l} \tag{16} $$

$$ f = a / b \tag{17} $$

2.3. feature selection
in the uta algorithm, the average of one feature over all instances is calculated. the selected feature is then replaced by this mean value in all input vectors, and the trained network is tested with the new feature matrix. if the recognition accuracy of the system decreases, the feature is effective; if the result does not change or improves, the feature is considered ineffective (a noisy feature) and should be removed from the input vector [11].

3. analysis of results
the database contains 760 records of earthquakes with magnitudes of 5 to 7 on the richter scale from the international institute of earthquake engineering and seismology, for 21 earthquake recording stations in iran (between 2004 and 2010). the sampling frequency is 50 hz (fig. 4). table 1 shows the date, time, geographical location, depth, and magnitude of each earthquake.
table 1 characteristics of 5 to 7 richter earthquakes recorded between 2004 and 2009

date of occurrence    time of occurrence       magnitude and geographical characteristics
year  month  day      hour  minute  second     latitude  longitude  depth  magnitude
2004  10     6        11    14      26.1       28.8      57.9       14.1   5.2
2004  10     7        12    54      56.1       28.4      57.2       15.9   5
2004  10     7        21    46      15.2       37.3      54.5       16.8   6.2
2004  10     16       10    4       33.9       33.5      45.7       18     5
2005  3      13       3     31      27.3       27.3      61.5       54.8   6.1
2005  5      1        18    58      38.8       30.8      56.9       14.2   5.1
2005  5      14       18    4       57.1       30.7      56.6       14.1   5.2
2005  6      19       4     46      4.5        33.1      58.2       15     5.2
2005  8      9        5     9       19.7       28.8      52.6       18     5
2005  11     27       16    30      39.1       27.0      55.7       14.1   5.2
2005  11     29       5     57      3          37.5      54.6       15     5
2005  12     26       23    15      51.1       32.1      49.1       32.9   5.2
2005  12     27       21    53      15         28.1      56.1       15     5.1
2006  2      18       11    3       31.5       30.7      55.8       14.1   5
2006  2      28       7     31      3.4        28.1      56.7       18     5.8
2006  3      25       7     28      57.3       27.5      55.8       15.8   5.5
2006  3      25       9     55      16         27.6      56.0       15.9   5.1
2006  3      25       10    0       37         27.4      55.7       15     5
2006  3      30       19    36      18         33.6      48.9       15     5.1
2006  3      31       1     17      2.3        33.6      48.9       14.1   6.1
2006  3      31       11    54      2.6        33.8      48.7       17.5   5.2
2006  6      28       21    2       9.2        26.8      55.9       10     5.6
2006  7      18       23    27      5.5        26.2      61.1       46     5
2006  11     5        20    6       40.2       37.4      48.8       14.1   5
2007  3      26       6     36      50         29.1      58.4       14.1   5
2007  6      18       14    29      49.4       34.5      50.8       17.3   5.6
2008  3      9        3     51      6.4        33.3      59.1       17.9   5
2008  8      27       21    52      39.9       32.3      47.3       32.5   5.6
2008  9      10       11    0       35.1       26.9      55.7       6.7    5.8
2008  10     25       20    17      16.9       26.6      54.8       14.2   5.1
2008  12     7        13    36      20.8       26.9      55.7       11     5.2
2008  12     9        15    9       27.4       27.0      55.8       15     5
2009  7      22       3     53      2.6        26.7      55.8       14.2   5.2
2009  10     4        21    50      49.6       31.8      49.4       15     5.1

fig. 4 distribution of stations in the iran broadband national center seismology [21]

for training, 70% of the data were randomly selected and the remaining 30% of the data was used for testing in matlab r2017a software.
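a random 70/30 split of the 760 records could be sketched as follows. the paper only states that the split is random, so the shuffling procedure, the fixed seed and the use of integer record ids are assumptions made for the sake of a reproducible example.

```python
import random


def split_records(record_ids, train_fraction=0.7, seed=0):
    """Shuffle the record ids and cut them into a training and a testing part."""
    ids = list(record_ids)
    random.Random(seed).shuffle(ids)  # fixed seed (assumed) for repeatability
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


train_ids, test_ids = split_records(range(760))
print(len(train_ids), len(test_ids))
```

with 760 records this gives 532 training and 228 testing records.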
initially, for anomaly detection, 380 records that contain 20 minutes of signal and were followed by an earthquake were selected as the sig1 group, and 380 records of equal length that were not followed by an earthquake in the next five minutes were selected as the sig2 group. the five minutes immediately before the earthquake were deleted from each record (fig. 5). the first five minutes of sig2 and the last five minutes of sig1 were separated to extract the features for the feature vector. a fourth-order high-pass butterworth filter (fc = 0.04 hz) was applied to remove the low frequencies, and the signals were then normalized (fig. 6).

fig. 5 classification of seismic signals before the earthquake: no earthquake happens after sig2, and an earthquake happens 5 minutes after sig1

fig. 6 a) the original signal, b) the filtered signal

after filtering the 15000 samples (5 minutes at 50 hz), the statistical and chaos features are derived for each record; the chaos features use chaostest.m in matlab [4]. chaostest.m tests for the existence of a positive dominant lyapunov exponent λ and also returns the local lyapunov exponents. its second output parameter is h, which is the result of the hypothesis test on λ at level α. the value p is the observed probability. another output is orders, which gives the triplet (l, m, q) that minimizes the schwarz information criterion; it is used to obtain the best coefficients and to calculate λ. the confidence interval (ci) for λ is determined at level α (α is a fixed number with a default value of 0.05). the dwt is implemented for five steps, and eight statistical features (mode, mean, variance, covariance, maximum, minimum, median, and signal standard deviation) are saved for each step. the signal divider is evaluated for the fft, and eight statistical features are calculated for the fft and psd, respectively.
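the signal divider and the basic statistical features could be sketched as follows for any sequence of values, for example raw samples or fft magnitudes. equal-width classes, the population form of the variance and the function names are assumptions; covariance is omitted because the paper does not define it for a single signal.

```python
import statistics


def signal_divider(values, n_classes=16):
    """Count the samples in each of n_classes equal-width amplitude classes."""
    lo, hi = min(values), max(values)
    counts = [0] * n_classes
    if hi == lo:  # constant signal: put everything in the first class
        counts[0] = len(values)
        return counts
    width = (hi - lo) / n_classes
    for v in values:
        # the maximum value would index one past the end, so clamp it
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    return counts


def basic_statistics(values):
    """Seven of the statistical features named in the text (population variance assumed)."""
    return {
        "mode": statistics.mode(values),
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "maximum": max(values),
        "minimum": min(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),
    }


samples = list(range(32))  # toy data, not a seismic record
print(signal_divider(samples))
print(basic_statistics([1, 2, 2, 3]))
```

the 16 counts and the statistics of each transformed signal would then be concatenated into the feature vector.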
moreover, the graph of x(n+1) versus x(n) is drawn as the signal trajectory and then divided into 16 houses. the amount of data that falls in each house is saved as a feature. the input feature vector has 260 values per instance. the neuro-fuzzy classifier has 260 inputs, 14 neurons (rules), and one output. the classification threshold of the neuro-fuzzy classifier is 0.49. furthermore, the mlp neural network has 260 neurons in the input layer, two hidden layers, and an output layer consisting of two neurons. the neuro-fuzzy classifier and the mlp neural network were successfully trained in matlab, and the testing results are presented for both networks. the networks have one output; each output value uniquely represents one category (0: no earthquake; 1: earthquake). after 3503 training iterations, both classifiers were tested; the results indicated that the neuro-fuzzy classifier was better than the mlp network and could detect anomalies five minutes before an earthquake with an acceptable accuracy of 84.6491% (fig. 7; table 2).

fig. 7 difference between the output of neuro-fuzzy classifier and real output after 3503 epochs

table 2 neuro-fuzzy classifier's performance compared with mlp before feature selection

classifier       neuro-fuzzy classifier    multilayer perceptron (mlp) network
accuracy         84.6491%                  81.1404%
sensitivity      71.93%                    80.70%
specificity      97.37%                    81.58%
average error    0.1237  0.1691  0.1663

fig. 8 compares the performance of the neuro-fuzzy classifier and the mlp before feature selection. this figure shows that the neuro-fuzzy classifier produces better results for accuracy.

fig. 8 neuro-fuzzy classifier's performance compared to mlp before feature selection (accuracy %, mlp and neuro-fuzzy)

after training and testing, the uta algorithm was implemented for feature selection and the ineffective features were deleted.
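the uta step described in section 2.3 could be sketched as follows for any already trained classifier: each feature in turn is frozen at its mean, the model is re-evaluated, and the feature is kept only if the accuracy drops. the function names and the toy classifier and data are hypothetical stand-ins, not the paper's networks or records.

```python
def accuracy(predict, rows, labels):
    """Fraction of rows for which the trained model predicts the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def uta_select(predict, rows, labels):
    """Return (effective, ineffective) feature indices for a trained model."""
    baseline = accuracy(predict, rows, labels)
    effective, ineffective = [], []
    for j in range(len(rows[0])):
        mean_j = sum(row[j] for row in rows) / len(rows)
        # replace feature j by its mean in every input vector
        frozen = [row[:j] + [mean_j] + row[j + 1:] for row in rows]
        if accuracy(predict, frozen, labels) < baseline:
            effective.append(j)      # accuracy dropped: the feature matters
        else:
            ineffective.append(j)    # unchanged or better: treat as noisy
    return effective, ineffective


def toy_predict(row):
    """Hypothetical trained classifier that only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0


rows = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [0.8, 2.0]]
labels = [0, 0, 1, 1]
print(uta_select(toy_predict, rows, labels))
```

here feature 0 is reported as effective and feature 1 as ineffective, because freezing feature 1 leaves the predictions unchanged.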
this algorithm decreased the input vector dimensions to 150 for the mlp network and 29 for the neuro-fuzzy classifier. both classifiers were trained and tested again (table 3). table 4 shows some of the more effective features of both classifiers. the results show that frequency characteristics have priority for both classifiers and that the neuro-fuzzy classifier produced better results for accuracy and sensitivity.

table 3 neuro-fuzzy classifier's performance compared with mlp after feature selection

classifier       neuro-fuzzy classifier    multilayer perceptron (mlp) network
accuracy         84.6491%                  82.8947%
sensitivity      74.56%                    71.05%
specificity      94.74%                    94.74%
average error    0.1512  0.1782  0.1580

table 4 some of the more effective features after implementation of the uta algorithm for the neuro-fuzzy classifier and the mlp neural network

mlp neural network                   neuro-fuzzy classifier
mean of angle (fft)                  mean of angle (fft)
mean of angle of normalize (fft)     median of entropy
max of data                          covariance of ca (dwt)
mean of abs (fft)                    signal standard deviation of (psd)
max of ca (dwt)                      mean of ca (dwt)
mean of cd (dwt)                     trajectory

fig. 9 shows the accuracy of this study compared with other studies. the accuracy improvement over previous articles is obtained using the following formula:

$$ \text{improvement}\,(\%) = \left(1 - \frac{\text{present implementation result}}{\text{previous implementation result}}\right) \times 100 \tag{18} $$

accuracy (the values are reported as magnitudes, since each ratio is larger than one):
improvement (%) (this study, [4]) = (1 − (84.6491/60.8491)) × 100 = 39.079%
improvement (%) (this study, [10]) = (1 − (84.6491/60.6383)) × 100 = 39.59%
improvement (%) (this study, [7]) = (1 − (84.6491/50)) × 100 = 69.2982%
improvement (%) (this study, [8]) = (1 − (84.6491/75)) × 100 = 12.8654%

it can be seen that the accuracy of the proposed design is better than that of the previous articles. the improvement is even close to 70% (fig. 10). fig.
11 shows the results of the present study for the neuro-fuzzy classifier after the implementation of the new features. it shows that the neuro-fuzzy classifier performs better than the mlp network.

fig. 9 neuro-fuzzy classifier's performance for this study (ts) compared with the other studies (bar chart of accuracy, %: other studies [8] 75; other studies [7] 50; previous researches [4] 60.8696; previous researches [10] 60.6383; this study 84.6491)

fig. 10 comparison of the accuracy improvement of the proposed design with previous articles (bar chart of improvement in accuracy, %, for [4], [10], [7], [8])

fig. 11 neuro-fuzzy classifier's performance compared with mlp after feature selection

4. conclusion
in this article, the proposed method can detect anomalies before an earthquake by using new features. one of the innovations of this article is the extraction of new features. we also considered a longer period of time than the other articles for detecting the anomaly before the earthquake, and then evaluated two types of classifiers. finally, we chose the best network and the most effective features. the proposed method provided a new matrix of features that was capable of chaotic signal modeling based on noise and of detecting anomalies during the five minutes before the earthquake with an acceptable accuracy of 84.6491%. moreover, the results indicate that the uta algorithm decreased the input feature dimensions without loss of accuracy. the selected features demonstrated that chaotic signal modeling based on noise and the detection of anomalies before an earthquake depend strongly on frequency features, followed by entropy, trajectory, chaotic and statistical features. future work would be to collect more earthquake data globally, add more frequency-dependent parameters to the feature vector, and use committee machines to increase the classification accuracy.
it is also possible to extract a new feature from a combination of two or three features, for example a combination of entropy, classification and frequency features, or other possible combinations. references [1] r. j. geller, d. d. jackson, y. y. kagan and f. mulargia, "earthquakes cannot be predicted", science, vol. 275, pp. 1616-1617, 1997. [online]. available: http://moho.ess.ucla.edu/~kagan/geller_et_al_1997.pdf [2] s. uyeda, t. nagao, and m. kamogawa, "short-term earthquake prediction: current status of seismoelectromagnetics", tectonophysics, vol. 470, no. 3-4, pp. 205-213, 2009. [3] a. m. rajabi, m. khodaparast, and m. mohammadi, "earthquake-induced landslide prediction using back-propagation type artificial neural network: case study in northern iran", natural hazards, vol. 110, no. 1, pp. 679-694, 2022. [4] l. dehbozorgi, "case study of seismic signals for ghir station before the earthquake", bulletin of earthquake science and engineering, vol. 5, no. 4, pp. 131-143, 2019. [online]. available: http://www.bese.ir/article_240366.html?lang=en [5] h. shiraishi, "developing and validating earthquake prediction software", international journal of engineering and techniques, vol. 8, pp. 63-69, 2022. [6] m. yousefzadeh, s. a. hosseini, and m. farnaghi, "spatiotemporally explicit earthquake prediction using deep neural network", soil dynamics and earthquake engineering, vol. 144, p. 106663, 2021. [7] r. li, x. lu, s. li, h. yang, j. qiu, and l. zhang, "dlep: a deep learning model for earthquake prediction", in proceedings of the 2020 international joint conference on neural networks (ijcnn), 2020, pp. 1-8. [8] y.-j. chiang, t.-l. chin, and d.-y. chen, "neural network-based strong motion prediction for on-site earthquake early warning", sensors, vol. 22, no. 3, p. 704, 2022. [9] s. yaghmaei-sabegh, "earthquake ground-motion duration estimation using general regression neural network", scientia iranica, vol. 25, no.
5, pp. 2425-2439, 2018. [10] l. dehbozorgi and f. farokhi, "notice of retraction: effective feature selection for short-term earthquake prediction using neuro-fuzzy classifier", in proceedings of the 2010 second iita international conference on geoscience and remote sensing, 2010, vol. 2, pp. 165-169. [11] j. utans, j. moody, s. rehfuss, and h. siegelmann, "input variable selection for neural networks: application to predicting the us business cycle", in proceedings of 1995 conference on computational intelligence for financial engineering (cifer), 1995, pp. 118-122. [12] m. f. redondo and c. h. espinosa, "a comparison among feature selection methods based on trained networks", in proceedings of the neural networks for signal processing ix: proceedings of the 1999 ieee signal processing society workshop (cat. no. 98th8468), 1999, pp. 205-214. [13] k. majumdar and m. h. myers, "amplitude suppression and chaos control in epileptic eeg signals", computational and mathematical methods in medicine, vol. 7, no. 1, pp. 53-66, 2006. [14] s. byun et al., "entropy analysis of heart rate variability and its application to recognize major depressive disorder: a pilot study", technology and health care, vol. 27, no. s1, pp. 407-424, 2019. [15] k. sui and h.-g. kim, "research on application of multimedia image processing technology based on wavelet transform", eurasip journal on image and video processing, vol. 2019, no. 1, pp. 1-9, 2019. [16] a. qin, z. shang, j. tian, y. wang, t. zhang, and y. y. tang, "spectral–spatial graph convolutional networks for semisupervised hyperspectral image classification", ieee geoscience and remote sensing letters, vol. 16, no. 2, pp. 241-245, 2018. [17] a. l. zheleznyakova, "physically-based method for real-time modelling of ship motion in irregular waves", ocean engineering, vol. 195, pp. 106686, 2020. [18] y. xue, p. j. ludovice, and m. a. 
grover, "dynamic coarse graining in complex system simulation", in proceedings of the 2011 american control conference, 2011, pp. 5031-5036. [19] n. singh and r. khan, "speaker recognition and fast fourier transform", international journal, vol. 5, no. 7, 2015. [20] r. tabbussum and a. q. dar, "performance evaluation of artificial intelligence paradigms—artificial neural networks, fuzzy logic, and adaptive neuro-fuzzy inference system for flood prediction", environmental science and pollution research, vol. 28, no. 20, pp. 25265-25282, 2021. [21] international institute of earthquake engineering and seismology. [online]. available: http://www.iiees.ac.ir. [22] j. tong, m. lin, x. wang, j. li, j. ren, l. liang, y. liu, "deep learning inversion with supervision: a rapid and cascaded imaging technique", ultrasonics, 122, 106686, 2022. [23] m. sinambelaa, m. situmoranga, k.tarigana, s. humaidia, makmur siraitb, "waveforms classification of northern sumatera earthquakes for new mini region stations using support vector machine", advanced science engineering information technology, vol.11, no. 2, 2021. [24] w. yanwei, l. xiaojun, w. zifa, et al., "deep learning for p-wave arrival picking in earthquake early warning", earthq. eng. eng., vol. 20, pp. 391-402, 2021. [25] m. s. abdalzaher, m. s. soliman, s. m. el-hady, a. benslimane, et al., "a deep learning model for earthquake parameters observation in iot system-based earthquake early warning", ieee internet of things journal, vol. 9, no. 11, pp. 8412-8424, 2022. [26] d. hong, n. yokoya, j. chanussot, x. x. zhu, "an augmented linear mixing model to address spectral variability for hyperspectral unmixing", ieee transactions on image processing, vol. 28, no. 4, 2019. [27] x. wu, d. hong, j. chanussot, y. xu, r. tao, y. wang, "fourier-based rotation-invariant feature boosting: an efficient framework for geospatial object detection", ieee geoscience and remote sensing letters, vol. 17, no. 2, 2020. 
facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 183-203 doi: 10.2298/fuee1402183j
plasmonic enhancement of light trapping in photodetectors
zoran jakšić 1, marko obradov 1, slobodan vuković 1,2, milivoj belić 2
1 center of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, serbia
2 science program, texas a&m university at qatar, p.o. box 23874 doha, qatar
abstract. we consider the possibility of using plasmonics to enhance light trapping in semiconductor detectors such as solar cells and infrared detectors for night vision. plasmonic structures can transform propagating electromagnetic waves into evanescent waves with the local density of states vastly increased within subwavelength volumes compared to the free space, thus surpassing the conventional methods for photon management.
we show how one may utilize plasmonic nanoparticles both to squeeze the optical field into the active region and to increase the optical path by mie scattering, apply ordered plasmonic nanocomposites (subwavelength plasmonic crystals or plasmonic metamaterials), or design nanoantennas to maximize absorption within the detector. we show that many approaches used for solar cells can also be utilized in the infrared range if different redshifting strategies are applied.
key words: plasmonics, metamaterials, nanoantennas, solar cells, infrared detectors, light trapping
1. introduction
an important requirement posed in photodetector design is to maximize the useful photon flux for a given physical thickness of the active region of the device [1]. probably the most important type of such devices nowadays are solar cells [2-4]. they are basically photovoltaic detectors where an optical signal (radiation of the sun) is converted to voltage and thus to useful energy. since materials for solar cells are expensive, it is of interest to make their active region as thin as possible. another important class of devices are infrared (ir) detectors [5] used in e.g. remote sensing, night vision, etc. since they are intended for larger wavelengths – typically they operate within the atmospheric windows at (3-5) μm or (8-12) μm – their thickness is usually relatively small compared to the operating wavelength.
received january 14, 2014
corresponding author: zoran jakšić, center of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, njegoševa 12, 11000 belgrade, serbia (e-mail: jaksa@nanosys.ihtm.bg.ac.rs)
actually, the thickness of both solar cells and night vision devices may be in the subwavelength domain, i.e. smaller than the operating wavelength. the requirement posed to the designers in both situations is to maximize optical trapping within such thin active regions.
an important aspect of decreasing the thickness in the case of general semiconductor detectors is that it is followed by an increase of the response speed. thus the basic task in the design of such detectors is to maintain or even improve quantum efficiency in the operating wavelength range while decreasing the thickness as much as possible. the engineering methods dedicated to maximization of the available optical flux in photodetectors are termed the photon management or the light management [6]. several general strategies are available for this purpose [7], as shown in fig. 1. fig. 1 strategies for maximization of optical flux in photodetector first, one may perform external light concentration and collect optical energy from an incident area larger than the physical dimensions of the detector active region itself (photon collector). a typical example of this approach would be the use of concentrating lenses or reflectors that gather irradiation from the so-called optical area and focus it onto the electric area of the detector. non-imaging collectors can be used to that purpose [8-11]. after the signal has reached the active area of the detector, various antireflection coatings and structures can be used to decrease the reflected component of the incident radiation and to allow as large part of it as possible to enter the active region itself [12]. all of these structures basically match the impedance of the free space/detector environment to that of the detector material. once inside the detector, one can increase the optical path through the active region, which can be done by backside reflectors redirecting radiation back to the active region, or by various scattering structures at the front and at the back side of the device which change the path of the beam to make it longer and make use of total internal reflection to return the beam to the active region. 
it is also possible to utilize resonant structures (resonant cavity enhancement) [13], thus obtaining a narrow-bandwidth response, or to incorporate the photodetector in a photonic crystal cavity [14, 15]. another important approach to detector enhancement after the beam has entered the active region is to perform internal optical concentration (spatial localization), i.e. to fabricate structures that squeeze the optical space from a larger volume to a smaller one, thus increasing the local density of states of optical energy within the latter. the last two approaches, i.e. optical path increase and spatial localization, belong to the light trapping schemes. the advent of nanostructuring technologies brought an impetus to this field. various building blocks with nanometer dimensions have been proposed for e.g. solar cell energy harvesting improvement, including nanoparticles, nanowires, different core-shell geometries, colloidal quantum dots, etc. [16, 17]. recently the use of plasmonics appeared as a novel approach to nanotechnological improvement of photodetector light trapping [18-23]. basically, plasmonics represents the use of coupled electron oscillations and surface-bound electromagnetic waves called surface plasmons polaritons (spp). this is achieved through utilization of metal-dielectric nanocomposites that can be designed to obtain almost any desired optical properties and thus almost complete control over electromagnetic propagation in and around such structures [24]. even values of optical parameters not ordinarily met in nature can be obtained, like near-zero or even negative values of refractive index [25, 26]. this ability to engineer optical parameters at will has led to almost complete control over the propagation of electromagnetic waves and resulted in the appearance of transformation optics [27-29], where one optical space is transformed into another.
one of the obvious applications of plasmonics has been to “squeeze” the optical space to a much smaller volume than that of the free space. in this way high localizations of the electromagnetic field became possible, i.e. local densities of electromagnetic states much larger than those in the free space. in this paper we consider the use of plasmonics in light trapping in (ultra)thin photodetectors including solar cells and night vision detectors. after considering the fundamental limits to photon management in detectors from the point of view of subwavelength structures, we investigate the basic schemes for light trapping using plasmonics. we analyze the applicability of plasmonic nanoparticles both for field scattering and localization within the detector, the use of subwavelength plasmonic crystals and the possibility to redshift the device response utilizing the designer plasmons. we consider the utilization of dedicated optical antennas (nanoantennas) for detector enhancement. at the end we show how some of the schemes utilized for visible and near infrared radiation can be applied for night vision detectors through the application of different redshifting strategies.
2. fundamental limits to light management in detectors
we consider a general case of a photodetector as a device that converts optical energy into another form of energy. most often this energy is an electrical signal, although other forms may be used like thermal [30], motion (e.g. cantilever-based detectors) [31], optical signal at another frequency (up- or down-converted) [32, 33] etc. basically, different light management approaches are intended to improve absorption of light in the detector and ensure a higher degree of this conversion. obviously, the efficiency of any conversion is limited by basic physical laws. the question is what the fundamental limits of photodetector enhancement through light management are. fig.
2 the structure of the active region of a detector with corrugated surface and ideal backside reflector
a detector system is presented in fig. 2. a background optical flux is incident to the active area of a photodetector with a thickness d. both in the case of solar cells and night vision photodetectors the optical flux is blackbody radiation, described by planck's law. in a general case the detector material may incorporate nanostructuring that could localize the optical field and create hotspots with high density of states. a perfect mirror is placed at the rear side of the device – i.e. it is assumed that the incident light is unidirectional, while the internal radiation is bidirectional. the detector surface is corrugated in order to increase the optical path through the detector. the corrugation may be random or ordered, but in both cases its basic purpose is to change the direction of light incident upon the active surface and to make use of total internal reflection to ensure repeated passing of the beams through the active region. light can escape if the direction of the internal beam falls within the escape cone, for which according to snell's law $\sin\theta_{cr} = 1/n$ ($\theta_{cr}$ is the critical angle of total reflection, $n$ is the refractive index of the active region). we first consider the case limited by geometrical optics, which has been established by yablonovitch [34-37]. in literature it is variably denoted as the conventional limit, the ergodic light trapping limit, the ray-optics limit and the lambertian limit. it is assumed that the detector active material can be described by an effective absorption coefficient $\alpha$ isotropic throughout the device and that the detector thickness is much larger than the operating wavelength in free space ($d \gg \lambda/2n$), so that one considers a bulk process. the absorbance within the photodetector for a single pass across the structure (absorption without enhancement) is

$$A(\omega) = 1 - \exp(-\alpha(\omega) d) \approx \alpha(\omega) d, \tag{1}$$

i.e.
the absorbance is equal to the optical thickness of a photodetector, which is defined as the $\alpha d$ product. since a bulk case is considered, it is further assumed that interference/diffraction effects can be neglected and that the intensity of light within the detector medium is in equilibrium with external blackbody radiation. the density of states within the medium is proportional to $n^2$. the next assumptions are that the equipartition theorem is valid (the internal occupation of states is equal to the external one, the internal states are ergodic) and that the surface corrugation performs a full randomization of the incident signal over space. this is not always satisfied, but the assumption holds in a vast majority of cases. a sufficient condition for randomization of light by multiply scattering corrugated surfaces is that these surfaces upon averaging behave as lambertian. the internal distribution of the light within the medium is then isotropic. according to the statistical ray optics approach [34] the relation between internal and external intensity of light is

$$I_{int}(\omega, x) = 2 n^2(\omega, x)\, I_{ext}(\omega). \tag{2}$$

the same result is also obtained according to the principle of detailed balancing of the light [38] applied between the light incident to a small surface element of the detector active area and escaping from that same element through the loss cone and by applying the brightness or radiance theorem (e.g. [39]) stating that the spectral radiance of light cannot be increased by passive optical devices (based on the principle of reversibility). to determine the enhancement of absorption, one has to consider the loss of light due to various mechanisms. according to yablonovitch [34, 35] there are three such mechanisms: the escape of light through the light cone, the losses due to imperfect reflection at the surfaces and the absorption in bulk.
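as a numerical illustration of the escape-cone condition and of relation (2), the sketch below evaluates the critical angle and the internal intensity enhancement for a single refractive index. the value n = 3.5 is an assumption chosen for illustration (it is not taken from the text), and dispersion of n is ignored.

```python
import math


def critical_angle(n):
    """Critical angle of total internal reflection, sin(theta_cr) = 1/n (radians)."""
    return math.asin(1.0 / n)


def internal_intensity_factor(n):
    """Ratio I_int / I_ext = 2 n^2 from relation (2), with a perfect rear mirror."""
    return 2.0 * n ** 2


n = 3.5  # assumed refractive index of the active region (illustrative only)
theta_cr = critical_angle(n)
print(f"critical angle = {math.degrees(theta_cr):.2f} deg")
print(f"I_int / I_ext  = {internal_intensity_factor(n):.1f}")
```

a higher refractive index narrows the escape cone and raises the internal intensity, which is why high-index absorbers benefit most from surface randomization.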
the absorbance of a photon is the ratio of the rate at which absorption occurs and the sum of the absorption and the photon loss through the escape cone. for the volume absorption in the limiting case when $\alpha d \ll 1$ and taking into account the angle of the loss cone $\theta$, this expression is

$$\frac{A(\omega)}{\alpha(\omega) d} = \frac{4 n^2(\omega)}{\sin^2\theta}, \tag{3}$$

so that the absorption enhancement limit in the bulk case with internal randomization becomes $4n^2/\sin^2\theta$. for $\theta = \pi/2$ this assumes the more often used simple form

$$\frac{A(\omega)}{\alpha(\omega) d} = 4 n^2(\omega). \tag{4}$$

the next case we consider are the devices with plasmonic localization for the enhancement of absorption. in this case many of the above assumptions introduced for the ergodic limit are not valid. the crucial points are that the light distribution now is not isotropic (and actually the volumes with a strongly enhanced density of electromagnetic states may be deeply subwavelength) and that the thickness of the detector is usually subwavelength. a number of treatises are dedicated to the situations in which the ray optics limit is exceeded and optical modes are confined at subwavelength scale [40-42]. however, until now no generally valid solution has been given for the extension of the ray optics limit [43].
3. plasmonics for light trapping
surface plasmons polaritons (spp) are oscillations of free electrons in a conductive material near an interface with a dielectric, coherently coupled with electromagnetic radiation at the interface. the conductive material can be characterized by a negative value of dielectric permittivity, while that in the dielectric is positive. typically the conductive material is metal (most often used being gold and silver, although other metals are used like chromium, copper, various alloys, alkali metals, etc.), however other materials are used too, for instance transparent conductive oxides like indium tin oxide, zinc oxide, tin oxide, etc.
(in near infrared), different semiconductors like silicon carbide, gallium arsenide (mid infrared), intermetallics, graphene and some other materials, all being denoted as plasmonic materials [44-46]. the spp is related with electromagnetic waves that are confined to the interface between positive and negative permittivity materials and are evanescent in perpendicular direction, i.e. they exponentially decay away from the interface. spps can be propagating along the interface, or they can be nonpropagating, i.e. spatially confined to e.g. a metal nanoparticle (localized surface plasmons polaritons). generally, the rapidly expanding field of research and application of spp-based phenomena is denoted as plasmonics [24, 47-49]. the field of plasmonics is dedicated to the use of spps in a similar way electrons are used in electronics. this is achieved via engineering of nano-composites that combine materials with positive and with negative values of dielectric permittivity in a certain frequency range. plasmonic nanocomposites can be one-dimensional (1d) like planar metal-dielectric superlattices, two-dimensional (2d) like cylindrical metallic nanowires, or three-dimensional (3d) like spherical metallic nanoparticles embedded in dielectrics. these structures can be periodic, quasiperiodic [50], aperiodic [51] or fully random [52]. the building blocks of these functions themselves may have different shapes, from simple to complex and from regular to irregular [53]. even in their simplest version, spps at the plane boundary between two semi-infinite media with opposite signs of dielectric permittivity are inhomogeneous electromagnetic waves (i.e. not plane waves) that propagate along the interface, and whose energy is concentrated in the narrow region near the boundary plane. this is possible only in a frequency range where the absolute value of the negative dielectric permittivity on one side is greater than the positive value on the other side of the interface. 
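the existence condition stated above can be checked numerically. the sketch below tests it for a given pair of permittivities and, as an added assumption, evaluates the standard textbook dispersion relation of an spp at a single planar interface, k_spp = k0·sqrt(ε_m ε_d / (ε_m + ε_d)), which is not written out in the text. the permittivity values and the wavelength are hypothetical.

```python
import cmath
import math


def supports_spp(eps_metal, eps_dielectric):
    """Condition from the text: negative permittivity whose magnitude exceeds eps_dielectric."""
    re = complex(eps_metal).real
    return re < 0 and abs(re) > eps_dielectric


def spp_wavevector(wavelength, eps_metal, eps_dielectric):
    """Complex SPP wavevector at a planar interface (standard textbook relation, assumed here)."""
    k0 = 2.0 * math.pi / wavelength
    return k0 * cmath.sqrt(eps_metal * eps_dielectric / (eps_metal + eps_dielectric))


wavelength = 800e-9          # m, hypothetical
eps_m = -20.0 + 1.0j         # hypothetical metal permittivity
eps_d = 1.0                  # air
k0 = 2.0 * math.pi / wavelength
k_spp = spp_wavevector(wavelength, eps_m, eps_d)
print(supports_spp(eps_m, eps_d))
print("Re(k_spp)/k0 =", k_spp.real / k0)
print("propagation length (m) =", 1.0 / (2.0 * k_spp.imag))
```

a real part of k_spp larger than k0 indicates a bound wave that is evanescent on both sides of the interface, while the imaginary part sets the propagation length.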
spps are strongly tm (transverse-magnetic) polarized, and because of that they are called polaritons. in other words, the magnetic field and the wavevector of the spp lie in the plane of the interface, while the electric field of the wave has components both perpendicular and parallel to the wavevector. therefore, spps are neither longitudinal nor transversal waves. it should be noted that a te polarized component of the electromagnetic field cannot satisfy the maxwell equations with standard boundary conditions in the form of a surface wave. plasmonic nanocomposites with two or more metal-dielectric interfaces within distances less than, or comparable to, the plasmonic material skin depth (~25 nm for au or ag) produce strong coupling of neighbouring spps and highly pronounced nonlocal effects. a plethora of new modes and possible novel effects may appear in such structures [54, 55]. sophisticated theoretical and numerical methods are necessary in order to achieve desired nanocomposite design levels. an important disadvantage of spps is their resonant nature, which causes a narrow bandwidth of operation. another one is their large wave damping due to collisions of free carriers in the epsilon-negative material, which leads to shorter spp lifetimes and/or propagation lengths and high absorption of incident radiation. the relative dielectric permittivity of plasmonic materials is negative below the plasma frequency, and its dispersion is well described by the electron resonance model of drude [56], also denoted as the drude-sommerfeld model

$$\varepsilon(\omega) = \varepsilon_\infty - \frac{\omega_p^2}{\omega(\omega + i\gamma)}, \tag{5}$$

where $\omega_p$ is the plasma frequency, $\gamma$ denotes the damping factor describing losses (i.e.
defines the imaginary part of the complex dielectric permittivity), while $\varepsilon_\infty$ is the asymptotic relative dielectric permittivity. the plasma frequency is determined by the properties of free carriers as

$$\omega_p^2 = \frac{n_e e^2}{\varepsilon_0 m^*}, \tag{6}$$

where $n_e$ is the electron concentration, $e$ is the free electron charge ($1.6\cdot10^{-19}$ C), $\varepsilon_0$ is the free space (vacuum) permittivity ($8.854\cdot10^{-12}$ F/m), and $m^*$ is the electron effective mass. the damping factor can be calculated from the material scattering data as

$$\gamma = \frac{e}{\mu m^*}, \tag{7}$$

where $\mu$ is the mobility of free carriers. if interband transitions from the valence to the conduction bands exist, dielectric permittivity is described by the lorentz model [57]

$$\varepsilon(\omega) = \varepsilon_\infty + \frac{\omega_p'^2}{\omega_0^2 - \omega^2 - i\gamma'\omega}, \tag{8}$$

where $\omega_0$ is the resonant frequency of the electron oscillator, while the apostrophe in the plasma frequency $\omega_p'$ and the damping factor $\gamma'$ denotes that these values are related with the concentration of bound electrons taking part in the interband transitions. since one is able to tailor a plasmonic nanostructure, dispersion relations can be designed within it, even enabling optical behavior that surpasses that of natural materials. the structures thus obtained are known as plasmonic metamaterials [25]. in that case one can obtain modes with superluminal group velocities (“fast light”), near-zero ones (“slow light”) or even negative ones (“left-handed light,” propagating in the direction opposite to that of the phase velocity) [58]. the possibility to obtain an arbitrary frequency dispersion makes it possible to convert propagating far field modes into spatially localized near-field modes, thus obtaining a strongly increased density of states. the same energy is compacted into a much smaller space, thus ensuring much higher energy densities. this ensures highly enhanced interaction of optical radiation with the photodetector material. this kind of engineering of optical absorption ensures its maximization in the active area, leading to vastly increased photodetector response and sensitivity compared to other light trapping schemes. a drawback of the use of plasmonics in photodetection are the large absorption losses in metal, which result in a large part of the energy being converted to heat instead of the useful signal. this topic is a field of active investigation, and various schemes are used to avoid it [59]. one of the approaches is the use of alternative plasmonic materials, for instance transparent conductive oxides like tin oxide, indium tin oxide or zinc oxide [60], which are routinely used in solar cells because of their transparency at visible wavelengths. another such material for solar cell enhancement is graphene [45]. the applicability of plasmonics for photodetector enhancement was recognized very early, in the 1970s and 1980s, and actually some of the first proposed applications of surface plasmons polaritons were in photodetection [61, 62]. a large body of papers has been published on various methods of plasmonic enhancement in solar cells [19, 59]. surface plasmon polariton-mediated light trapping schemes may be roughly divided into the following groups according to the particular mechanism used (bearing in mind that a single trapping scheme may include more than one of these):
- enhanced mie scattering on plasmonic nanoparticles or nanovoids through plasmonic enlargement of effective cross-section [63].
- coupling into guided modes (which may be propagating or spp modes) [19]
- field localization and generation of hotspots near the surface of plasmonic material (using embedded nanoparticles, nanoantennas, metamaterials) [20]
- use of plasmon-based singular optics (optical vortices, i.e.
circular flow of field in a corkscrew fashion around phase singularities in the optical near field around plasmonic nanostructures) [59]
- use of metamaterial-based transformation optics to map the optical space into a desired shape and with an increased density of states (optical superconcentrators and superabsorbers, optical black holes) [64]
- plasmon-enhanced up-conversion media (reverse of luminescent materials used for down-conversion) [65]
the plasmonic structures to be used for one or more of the above purposes include the following:
- nanoparticles and nanovoids – used as scatterers and as nanoantennas for field coupling and localization. may be arranged in an ordered fashion (pattern) or disordered.
- diffractive structures (gratings, lattices) – used for field coupling into guided modes; may be ordered or disordered.
- subwavelength plasmonic crystals (spc) – used for field coupling and localization. may be periodic [66] or quasiperiodic [67] in 1d, 2d or 3d.
plasmonic structures may be used as resonant enhancers, in which case they offer a narrow-bandwidth operation, or may be nonresonant, with a wide-bandwidth operation [68].
4. plasmonic nanoparticles as mie scatterers
the scattering cross-section of a plasmonic nanoparticle is greatly enhanced due to plasma resonance compared to non-plasmonic ones. the effective cross-section may be an order of magnitude larger than the geometrical cross-section. thus a 10% surface coverage would suffice for practically 100% efficiency of conversion from incident propagating modes into surface plasmons polaritons.
fig. 3 light trapping utilizing plasmonic nanoparticles stochastically placed on the detector surface (labels: plasmonic nanoparticles (field concentrators), active region, substrate)
fig. 4 geometries for plasmonic scatterers for light trapping within photodetector.
a) nanoparticles embedded in top dielectric; b) nanoparticles on top of the active region; c) nanoparticles embedded within the active region; d) nanoparticles on the back side
usually the conventional mie theory is utilized for the calculation of effective cross-sections for absorption and scattering on nanoparticles [69]. mie theory is valid for noninteracting nanoparticles (i.e. those where the interparticle distance is large enough to prevent their electromagnetic coupling). in nanoparticles interacting through near-field coupling or far-field dipole interactions, various additional phenomena appear, like splitting and shifting of plasmon resonances.
the simplest case is scattering on a spherical plasmonic nanoparticle that can be considered as an electric dipole. its scattering cross-section at a wavelength $\lambda$ can be calculated as [70, 71]
$$C_{scat} = \frac{8\pi^3}{3\lambda^4}\,|\alpha|^2, \qquad (9)$$
where
$$\alpha = 3V\,\frac{\varepsilon_{np}/\varepsilon_d - 1}{\varepsilon_{np}/\varepsilon_d + 2}. \qquad (10)$$
here $\varepsilon_{np}$ is the complex and wavelength-dispersive relative dielectric permittivity of the plasmonic nanoparticle, $\varepsilon_d$ is the permittivity of the surrounding dielectric medium and $V$ is the geometrical volume of the nanoparticle. the plasmon resonance and the maximum scattering cross-section are achieved at $\varepsilon_{np} = -2\varepsilon_d$. the absorption cross-section is determined as
$$C_{abs} = \frac{2\pi}{\lambda}\,\mathrm{Im}(\alpha). \qquad (11)$$
an elongated ellipsoid may be taken as a generalization of the sphere and corresponds to a single-wire nanorod antenna. this structure is the basic building block out of which more complex forms are built. mie theory is again applicable to this case, in a somewhat modified form. the dipole moment induced by an external field in an elongated ellipsoid is
[equation (12): induced dipole moment of the ellipsoid, not legible in the source]
and its resonant frequency
[equation (13): resonant frequency of the ellipsoid, not legible in the source]
where $r$ is the short radius of the ellipsoid and $R/2$ its longer radius.
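the dipole expressions (9)-(11) lend themselves to a quick numerical check. the sketch below is only an illustration and not the authors' calculation. it assumes a lossy drude permittivity for the particle (the helper `drude_permittivity`, the damping rate `GAMMA`, and the numerical values of radius, plasma wavelength and embedding permittivity are all assumptions chosen for the example), takes λ to be the vacuum wavelength, and writes the prefactor of (9) in the equivalent form (2π/λ)^4/(6π).

```python
import numpy as np

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def drude_permittivity(wavelength, plasma_wavelength, gamma, eps_inf=1.0):
    """Assumed lossy Drude model: eps = eps_inf - wp^2 / (w^2 + i*gamma*w)."""
    omega = 2.0 * np.pi * C0 / wavelength
    omega_p = 2.0 * np.pi * C0 / plasma_wavelength
    return eps_inf - omega_p**2 / (omega**2 + 1j * gamma * omega)


def polarizability(eps_np, eps_d, radius):
    """Eq. (10): alpha = 3V (eps_np/eps_d - 1) / (eps_np/eps_d + 2)."""
    volume = 4.0 / 3.0 * np.pi * radius**3
    ratio = eps_np / eps_d
    return 3.0 * volume * (ratio - 1.0) / (ratio + 2.0)


def cross_sections(wavelength, eps_np, eps_d, radius):
    """Eqs. (9) and (11): dipole scattering and absorption cross-sections."""
    alpha = polarizability(eps_np, eps_d, radius)
    k = 2.0 * np.pi / wavelength
    c_scat = k**4 * np.abs(alpha) ** 2 / (6.0 * np.pi)  # same as 8*pi^3/(3*lambda^4)*|alpha|^2
    c_abs = k * np.imag(alpha)
    return c_scat, c_abs


def lossless_resonance(plasma_wavelength, eps_d, eps_inf=1.0):
    """Wavelength at which Re(eps_np) = -2*eps_d for the Drude model with gamma -> 0."""
    return plasma_wavelength * np.sqrt(eps_inf + 2.0 * eps_d)


# Illustrative (assumed) parameters, not measured material data.
RADIUS = 60e-9               # particle radius, m
PLASMA_WAVELENGTH = 625e-9   # plasma wavelength, m
GAMMA = 5e13                 # damping rate, rad/s (assumption)
wavelengths = np.linspace(1.5e-6, 5.0e-6, 3501)

peaks = {}
for eps_d in (8.0, 10.0, 12.0):
    eps_np = drude_permittivity(wavelengths, PLASMA_WAVELENGTH, GAMMA)
    c_scat, c_abs = cross_sections(wavelengths, eps_np, eps_d, RADIUS)
    peaks[eps_d] = wavelengths[np.argmax(c_scat)]
    estimate = lossless_resonance(PLASMA_WAVELENGTH, eps_d)
    print(f"eps_d = {eps_d:4.1f}: C_scat peaks near {peaks[eps_d] * 1e6:.2f} um, "
          f"lossless estimate {estimate * 1e6:.2f} um")
```

under these assumptions the resonance condition $\varepsilon_{np} = -2\varepsilon_d$ reduces to $\lambda_{res} = \lambda_p\sqrt{\varepsilon_\infty + 2\varepsilon_d}$, so the scattering peak moves to longer wavelengths as the embedding permittivity grows. the dipole formulas ignore retardation and substrate effects, so the numbers are indicative only.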
in the most general case, the shapes of the nanoparticles vary widely and may assume different complex forms (e.g. various convex and concave polyhedra, including stellated and other forms [72]). this reflects strongly in their plasmonic response [19], since in principle sharper forms will cause larger field localizations. mie theory has been generalized to some of the more complex forms, but in the most general case the response is calculated numerically.
5. diffractive plasmonic couplers
an obvious approach to light trapping using plasmonics is to integrate the detector structure with a diffractive plasmonic structure (diffractive optical element, doe) [73], and generally with a corrugated metal layer, to act as a coupler with propagating modes. the simplest doe is the conventional diffraction grating.
the parameter of a general doe that determines the degree of coupling with propagating modes is its diffraction efficiency. the diffraction efficiency depends on the geometrical and material parameters of the plasmonic doe, i.e. the complex refractive index of the plasmonic material (for instance, transparent conductive oxides will generally have lower losses and longer resonant wavelengths than metals) and the dimensions of the doe features (in the case of plasmonic diffraction gratings, the parameters of influence are the lattice constant, i.e. the grating element spacing, and the shape and height of the grating ridges). thus its value can be tailored and optimized by a proper choice of these parameters.
the guided modes into which propagating modes are coupled by a doe can be propagating optical modes (the conventional waveguide modes) and surface plasmon polariton modes. in an ideal case for a photodetector, all propagating modes will be converted to plasmonic ones.
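as a rough illustration of how the lattice constant enters the coupling to spp modes, the sketch below uses the textbook phase-matching condition for a grating coupler on a flat metal/dielectric interface, Re(k_spp) = k0·n_d·sin θ + m·2π/Λ, with k_spp = k0·sqrt(ε_m ε_d/(ε_m + ε_d)). this relation is not given in the text above; it is brought in here as an assumption, and the function names and the permittivity values are hypothetical. it says nothing about the diffraction efficiency itself, which requires a full electromagnetic calculation.

```python
import cmath
import math


def spp_effective_index(eps_metal, eps_dielectric):
    """Effective index of an SPP on a flat metal/dielectric interface (textbook result)."""
    return cmath.sqrt(eps_metal * eps_dielectric / (eps_metal + eps_dielectric))


def grating_period(wavelength, eps_metal, eps_dielectric, angle_deg=0.0, order=1):
    """Period that satisfies Re(k_spp) = k0*n_d*sin(theta) + order*2*pi/period.

    eps_dielectric is assumed real and positive (lossless dielectric).
    """
    n_spp = spp_effective_index(eps_metal, eps_dielectric).real
    n_d = math.sqrt(eps_dielectric)
    mismatch = n_spp - n_d * math.sin(math.radians(angle_deg))
    return order * wavelength / mismatch


# Hypothetical values: a Drude-like metal with eps = -50 + 3j in air at 1.55 um.
EPS_METAL = -50.0 + 3.0j
EPS_AIR = 1.0
WAVELENGTH = 1.55e-6

period_normal = grating_period(WAVELENGTH, EPS_METAL, EPS_AIR)
period_oblique = grating_period(WAVELENGTH, EPS_METAL, EPS_AIR, angle_deg=30.0)
print(f"normal incidence: period = {period_normal * 1e6:.3f} um")
print(f"30 deg incidence: period = {period_oblique * 1e6:.3f} um")
```

because the spp effective index is only slightly larger than the index of the dielectric for a good conductor, the first-order period at normal incidence comes out just below the wavelength, and it grows for oblique incidence.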
figure 5 shows two different geometries for incorporation of doe into thin photodetectors: a) back-side doe, b) top-side doe. the configuration shown in fig. 5a is the more common of the two [74]. however, the second one (fig. 5b) can perform an additional function as a light collector.
[fig. 5 labels: diffractive pattern, ar/dielectric, active layer, buffer, substrate; panels a) and b)]
fig. 5 two geometries for incorporation of plasmonic doe into a photodetector. a) bottom doe, b) top doe
depending on the structure of the doe coupler, the propagation lengths of the spp modes may be shorter or longer [75].
besides its function as a light trapping structure, a doe can also serve as a light collector by virtue of functioning as a non-imaging light concentrator [76]. in addition, a doe may perform impedance matching between free space and the photodetector material, thus behaving basically as a diffractive antireflection structure. for instance, 1d metallic gratings (i.e. a metal surface with an array of parallel slits) have been shown to act as such impedance-matching structures [77]. this means that such gratings exhibit wideband extraordinary transmission. since this is a non-resonant phenomenon, it ensures a wide bandwidth and a broad range of incident angles.
the diffractive structure may have the form of a conventional diffraction grating with parallel ridges of metal, or may be more complex (e.g. a lattice/fishnet, etc.). in the most general case it will have the form of a holographic optical element with fully tailorable properties that can be computer generated [78].
a plasmonic doe may function in narrow-bandwidth mode near resonance, but also as a non-resonant element with wide bandwidth. a plasmonic doe built into a photodetector may simultaneously perform its function as a coupler and an electromagnetic field concentrator, but it may also be built to perform as a plasmonic waveguide [79, 80].
6. subwavelength plasmonic crystals and designer plasmons
a further generalization of diffractive plasmonic structures is that to subwavelength plasmonic crystals (spc) [81]. an spc may be defined as a 1d, 2d or 3d plasmonic structure with its period much smaller than the operating wavelength (a rule of thumb is that the period is at least ten times smaller than the operating wavelength). thus the details of the structure are not “seen” by the incident light and it behaves as an effective medium whose optical parameters depend on its design, which allows engineering of the frequency dispersion of such materials. the number of different possible kinds of spc is virtually limitless.
plasmonic metamaterials may be regarded as a special class of spc and are defined as structures possessing electromagnetic properties that are not readily found in nature [25], the most often researched among such properties being the possibility to reach negative values of the effective refractive index [82].
the spc structures ensure light localization and can therefore be straightforwardly utilized to enhance optical absorption in photodetectors. in addition, owing to the large number of possible modes in such structures [boba], it is possible to utilize them at the same time to match the impedance between free space and the photodetector, so that they effectively behave as an antireflective diffraction structure.
as an example of spc for the enhancement of solar cells, fan et al. [83] fabricated an ordered 2d array of metal cubes (or rather cuboids) on a semiconductor surface to improve light trapping.
among spc structures within the context of photodetection, one of the more frequently encountered ones are 2d arrays of nanoapertures in opaque metal films.
such structures first drew attention for their ability to transmit light in spite of the dimensions of the nanoapertures being much smaller than the operating wavelength, and were denoted as extraordinary optical transmission (eot) arrays [84]. this behavior is a consequence of resonant excitation of spp at their surface, which forces the electromagnetic waves incident on the whole surface to pass through the apertures. since such behavior effectively corresponds to impedance matching between propagating waves and the perforated metal film, the eot arrays act as efficient antireflective structures.
however, there is another useful application of the eot arrays (and generally of structured metal-dielectric surfaces) in photodetection, and it is based on the properties of the surface waves that propagate along them.
[fig. 6 labels: detector active region, plasmon enhancement]
fig. 6 metallodielectric eot structure introducing “designer” plasmons with structurally tunable plasma frequency
pendry et al. [85] have shown that for a surface wave that propagates along a perforated metal film one is able to introduce an effective permittivity of the form
$$\varepsilon_{in\text{-}plane} = \frac{\pi^2 d^2 \varepsilon_{hole}}{8a^2}\left(1 - \frac{\pi^2 c^2}{a^2\omega^2\varepsilon_{hole}\mu_{hole}}\right), \qquad (14)$$
where $\varepsilon_{hole}$ and $\mu_{hole}$ are the permittivity and permeability of the material within the holes, $a$ is the hole side length (in the case of square holes, as shown in fig. 6), $d$ is the period of the hole array, $c$ is the speed of light in vacuum and $\omega$ is the angular frequency, so that $k_0 = \omega/c$ is the wavevector in vacuum. the effective plasma frequency of such a material is
$$\omega_p = \frac{\pi c}{a\sqrt{\varepsilon_{hole}\mu_{hole}}}. \qquad (15)$$
in other words, the effective dielectric permittivity of an eot array has a form identical to that of plasmonic materials. such surface waves that mimic spp were named designer plasmons by pendry, and are also known as “spoof” plasmons. their main advantage is that one is able to tune the effective plasma frequency by a proper choice of geometry and material parameters and thus to shift it at will.
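the geometric tunability of the designer-plasmon plasma frequency can be made concrete with a few lines of code. the sketch below reads eqs. (14) and (15) in the form given by pendry et al. [85], i.e. ε = (π²d²ε_hole/(8a²))·(1 − ω_p²/ω²) with ω_p = πc/(a·sqrt(ε_hole μ_hole)); the function names, the 9 µm target wavelength and the high-index hole filling with ε_hole = 12 are hypothetical choices made for illustration only.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def spoof_plasma_frequency(a, eps_hole=1.0, mu_hole=1.0):
    """Eq. (15): omega_p = pi*c / (a*sqrt(eps_hole*mu_hole)), in rad/s."""
    return math.pi * C0 / (a * math.sqrt(eps_hole * mu_hole))


def effective_permittivity(omega, a, d, eps_hole=1.0, mu_hole=1.0):
    """Eq. (14), rewritten using omega_p from eq. (15)."""
    omega_p = spoof_plasma_frequency(a, eps_hole, mu_hole)
    prefactor = math.pi**2 * d**2 * eps_hole / (8.0 * a**2)
    return prefactor * (1.0 - (omega_p / omega) ** 2)


def hole_side_for_plasma_wavelength(plasma_wavelength, eps_hole=1.0, mu_hole=1.0):
    """Invert eq. (15) with omega_p = 2*pi*c/lambda_p: a = lambda_p / (2*sqrt(eps*mu))."""
    return plasma_wavelength / (2.0 * math.sqrt(eps_hole * mu_hole))


TARGET = 9e-6  # hypothetical target plasma wavelength in the 8-10 um band, m
for eps_hole in (1.0, 12.0):  # 12.0 is a hypothetical high-index hole filling
    a = hole_side_for_plasma_wavelength(TARGET, eps_hole)
    print(f"eps_hole = {eps_hole:4.1f}: hole side a = {a * 1e6:.2f} um")
```

the inversion shows that the effective plasma wavelength equals twice the hole side times sqrt(ε_hole μ_hole), so filling the holes with a high-index dielectric allows smaller apertures for the same target wavelength. below ω_p the effective permittivity of eq. (14) is negative, which is the metal-like regime in which the surface wave is bound.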
an obvious application of this approach is in infrared detectors, and structures tuned to the range of 8-10 µm have been reported [86].
a paradigm that appeared in the wake of metamaterials is transformation optics [27-29, 87], the use of conformal mapping to transform one optical space into another, thus ensuring bending of light at will and tailoring of the density of states within a given volume. in the general case this is ensured through the use of gradient index metamaterials [88, 89]. probably the best known examples of transformation optics are the so-called cloaking devices [29, 90], but from the point of view of photodetection much more interesting concepts are superfocusing and superconcentrators [91], superabsorbers [92, 93] including optical black holes [94], superscatterers [95], etc. in their 2010 paper aubry et al. [64] proposed the use of transformation optics to ensure broadband light harvesting.
7. nanoantennas for photodetection enhancement
a nanoantenna or optical antenna [68, 96-98] is a plasmonic structure that redirects propagating waves into an evanescent field (and vice versa), linking propagating and spatially localized modes in a highly efficient manner. the degree of localization can be tailored by proper design of the nanoantenna and can be deeply subwavelength. thus the interaction with the photodetector active region can be vastly enhanced. nanoantennas are isolated structures, i.e. they are not connected to feeding circuitry like conventional antennas.
with this in mind, a simple spherical nanoparticle may be regarded as the most basic nanoantenna. its scattering properties are briefly presented in section 4 of this paper.
various types of nanoantennas have been experimentally produced and presented in the literature. fig. 7 shows some of the basic geometries, including the most basic type, the nanosphere.
if two such spheres are brought together, they form a nanodimer with a coupling gap of subwavelength width between them (denoted as the feed gap). a field hotspot appears in the feed gap, where localization is deeply subwavelength and field enhancement is very strong. in this manner larger field localizations are obtained than with single structures.
another generalization is the introduction of the elongated ellipsoid (also described in section 4), which within this context can be described as a dipole nanorod antenna, one of the most basic nanoantenna geometries. if two nanorods acting as linear dipoles are aligned and brought together to a subwavelength distance, ensuring end-to-end coupling, they form a two-wire nanoantenna [68]. this is another basic type of optical antenna. it can be further generalized by introducing two additional dipoles perpendicular to the first ones, all four having a joint feed gap (the cross-antenna).
nanoparticles can be ordered in an array (nanoparticle chain) to form an optical antenna [99] that effectively behaves as a linear nanorod antenna.
another prototypical structure is the bowtie nanoantenna [100], consisting of two triangular shapes aligned along their axes and forming the feed gap with their tips. such geometry ensures a broader bandwidth together with large field localizations in the feed gap. a diabolo-type nanoantenna has been proposed in [101].
an optical yagi-uda nanoantenna can be fabricated by placing a resonant nanorod antenna between a reflector nanorod and a group of director nanorods [102]. as with such antennas in the radio-frequency domain, good directivity is obtained.
more exotic shapes include spiral nanoantennas [103] and those with fractal geometries [104]. a plethora of other shapes can be used, e.g. split rings and various crescent shapes.
an important group are nanoantennas making use of the babinet principle (a metal shape surrounded by dielectric and a dielectric-filled hole in metal with identical shape and size have identical diffraction patterns). thus bow-tie holes in metal substrates are used, as well as two holes as a babinet equivalent of a nanodimer, arrays of nanoholes, crossed arrays of nanoholes, etc. [105].
fig. 7 some different types of experimental plasmonic nanoantennas
the obvious way to use nanoantennas in photodetection is for coupling between propagating and localized modes and for field localization, especially through the use of hotspots within the feed gaps. a large number of works have been dedicated to the use of optical nanoantennas for photodetector enhancement [59, 68, 97, 106]. the applicability of optical antennas for photodetection was recognized very early [61]. today it is still one of the foci of interest in the application of optical antennas [59, 106].
one of the alternative approaches is to use a schottky metal-semiconductor junction where the optical antenna forms the metal part of the metal-dielectric contact at the semiconductor detector surface [107]. photoexcitation generates hot electron-hole pairs by plasmon decay and the electrons are injected over the schottky barrier, thus directly generating photocurrent. a problem with this approach is the low efficiency obtained when using hot electrons.
9. redshifting methods for nanoparticle-based plasmon-assisted infrared detection
most of the approaches described in this paper are applicable in different parts of the spectrum (subwavelength plasmonic crystals/designer plasmon structures and optical antennas). however, the use of metal nanoparticles as mie scatterers is limited to frequencies near the surface plasmon resonance, which for the usual plasmonic materials (good metals) lies in the ultraviolet or visible part of the spectrum.
this makes them unsuitable for night vision devices and infrared detection. in this section we consider possible strategies to ensure the usability of plasmonic particles in the ir range [108]. the main point is that one needs to shift their resonance toward longer wavelengths, i.e. to perform a redshift of the characteristics.
one obvious approach is to use materials with a lower plasma frequency. it is known that the plasma frequency of transparent conductive oxides is redshifted compared to metals and can be further shifted through proper doping and fabrication techniques [109-111].
another pathway toward redshifting is the immersion of plasmonic nanoparticles into a high refractive index material [70], either by incorporating them into a dielectric film at the detector surface or by utilizing core-shell particles with an external dielectric layer.
finally, one of the possible methods is the adjustment of interparticle spacing.
fig. 8 shows the calculated scattering cross-section for a spherical dipole indium tin oxide nanoparticle with a radius of 60 nm. the assumed doping concentration was 1.2·10^21 cm^–3, which together with an effective mass of m* = 0.4 m0 furnishes a plasma frequency of 4.8·10^14 hz. the nanoparticle is placed at the top of the active surface of the detector and is embedded in dielectric, a layout similar to that shown in fig. 4b. the finite element method was utilized for simulation; no approximations were used.
the plasmon resonance redshift, seen as the shift of the maxima of the scattering cross-section dispersion curves in fig. 8, is caused by the increase in the permittivity of the embedding dielectric. figure 9 shows the radial distribution of the scattered electric field, presenting forward and back scattering. spreading of the forward scattering region with the increase of the permittivity of the dielectric layer is readily seen in figure 9a, as well as for larger operating wavelengths in figure 9b. finally, fig.
10 shows the electric field x-axis component (parallel to the incident light polarization) around the spherical nanoparticle at the surface of the detector for a permittivity of the embedding layer of 8.
[fig. 8 plot: scattering cross-section (×10^–14 m^2) versus wavelength (µm, 2.0 to 4.0) for a spherical nanoparticle, r = 60 nm, ε_substr = 10; curves for ε_diel = 8, 10 and 12]
fig. 8 spectral dependence of scattering cross-section for an embedded ito particle, r=60 nm, λp=625 nm
[fig. 9 plots: polar scattering diagrams, panels a) and b)]
fig. 9 radial distribution of electric field around an ito nanoparticle obtained by finite element modeling; r=60 nm, λp=625 nm. light is incident from top. a) scattering curves obtained for dielectric permittivity values 8, 10 and 12 for an operating wavelength of 3 µm. b) scattering curves for operating wavelengths of 2, 3 and 4 µm. permittivity of the dielectric layer is 12
fig. 10 field enhancement around ito nanoparticle for infrared detector enhancement, calculated by fem simulation. light is incident from right. r=60 nm, λp=625 nm, λ=2.6 µm, permittivity of the dielectric layer is 8
10. conclusion
a broad overview is given of the currently available possibilities to use plasmonics for the enhancement of different classes of photodetectors, stressing solar cells and night vision devices. the consideration is based on the point of view of non-imaging photodetection devices (single detector elements) intended for detection of a broadband spectrum that can be represented as blackbody radiation. a classification of the approaches proposed until now is given, including some original results by the authors.
the list of the available methods and approaches is necessarily far from complete, since both plasmonics and solar cell research are rapidly expanding fields, and new ideas and approaches appear almost every day.
acknowledgement: the paper is a part of the research funded by the serbian ministry of education and science within the projects tr32008 and iii45016 and by the qatar national research fund within the project nprp 09-462-1-074.
references
[1] a. shah, p. torres, r. tscharner, n. wyrsch, and h. keppner, “photovoltaic technology: the case for thin-film solar cells,” science, vol. 285, no. 5428, pp. 692-698, 1999.
[2] a. mcevoy, t. markvart, and l. castañer, solar cells, elsevier, amsterdam, 2013.
[3] g. li, r. zhu, and y. yang, “polymer solar cells,” nat. photonics, vol. 6, no. 3, pp. 153-161, 2012.
[4] s. j. fonash, solar cell device physics, elsevier, amsterdam, 2010.
[5] a. rogalski, infrared detectors, crc press, boca raton, 2010.
[6] r. b. wehrspohn, and j. üpping, “3d photonic crystals for photon management in solar cells,” journal of optics, vol. 14, no. 2, 2012.
[7] z. jakšić, and z. djurić, “cavity enhancement of auger-suppressed detectors: a way to background-limited room-temperature operation in 3-14 μm range,” ieee j. sel. top. quant. electr., vol. 10, no. 4, pp. 771-776, 2004.
[8] j. h. atwater, p. spinelli, e. kosten, j. parsons, c. van lare, j. van de groep, j. garcia de abajo, a. polman, and h. a. atwater, “microphotonic parabolic light directors fabricated by two-photon lithography,” appl. phys. lett., vol. 99, no. 15, 2011.
[9] i. m. bassett, w. t. welford, and r. winston, “nonimaging optics for flux concentration,” progress in optics 27, e. wolf, ed., pp. 161-226: elsevier, 1989.
[10] w. t. welford, and r. winston, high collection nonimaging optics, academic press, 1989.
[11] r. winston, j. c. minano, and p. g. benitez, nonimaging optics, academic press, 2005.
[12] d. h. raguin, and g. m.
morris, “antireflection structured surfaces for the infrared spectral region,” appl. opt., vol. 32, no. 7, pp. 1154-1167, 1993.
[13] m. s. ünlü, and s. strite, “resonant cavity enhanced photonic devices,” j. appl. phys., vol. 78, no. 2, pp. 607-639, 1995.
[14] b. temelkuran, e. ozbay, j. p. kavanaugh, g. tuttle, and k. m. ho, “resonant cavity enhanced detectors embedded in photonic crystals,” appl. phys. lett., vol. 72, no. 19, pp. 2376-2378, 1998.
[15] z. djurić, z. jakšić, d. randjelović, t. danković, w. ehrfeld, and a. schmidt, “enhancement of radiative lifetime in semiconductors using photonic crystals,” infrared phys. technol., vol. 40, no. 1, pp. 25-32, 1999.
[16] l. cao, p. fan, a. p. vasudev, j. s. white, z. yu, w. cai, j. a. schuller, s. fan, and m. l. brongersma, “semiconductor nanowire optical antenna solar absorbers,” nano lett., vol. 10, no. 2, pp. 439-445, 2010.
[17] m. m. adachi, a. j. labelle, s. m. thon, x. lan, s. hoogland, and e. h. sargent, “broadband solar absorption enhancement via periodic nanostructuring of electrodes,” scientific reports, vol. 3, 2013.
[18] w. l. barnes, a. dereux, and t. w. ebbesen, “surface plasmon subwavelength optics,” nature, vol. 424, no. 6950, pp. 824-830, 2003.
[19] h. a. atwater, and a. polman, “plasmonics for improved photovoltaic devices,” nat. mater., vol. 9, no. 3, pp. 205-213, 2010.
[20] j. a. schuller, e. s. barnard, w. cai, y. c. jun, j. s. white, and m. l. brongersma, “plasmonics for extreme light concentration and manipulation,” nat. mater., vol. 9, no. 3, pp. 193-204, 2010.
[21] s. pillai, k. r. catchpole, t. trupke, and m. a. green, “surface plasmon enhanced silicon solar cells,” j. appl. phys., vol. 101, no. 9, 2007.
[22] v. e. ferry, m. a. verschuuren, h. b. t. li, e. verhagen, r. j. walters, r. e. i. schropp, h. a. atwater, and a. polman, “light trapping in ultrathin plasmonic solar cells,” opt. express, vol. 18, no. 13, pp. a237-a245, 2010.
[23] k. r. catchpole, and a.
polman, “plasmonic solar cells,” opt. express, vol. 16, no. 26, pp. 21793-21800, 2008.
[24] s. a. maier, plasmonics: fundamentals and applications, springer science+business media, new york, ny, 2007.
[25] w. cai, and v. shalaev, optical metamaterials: fundamentals and applications, springer, dordrecht, the netherlands, 2009.
[26] s. a. ramakrishna, and t. m. grzegorczyk, physics and applications of negative refractive index materials, spie press bellingham, wa & crc press, taylor & francis group, boca raton fl, 2009.
[27] u. leonhardt, “optical conformal mapping,” science, vol. 312, no. 5781, pp. 1777-1780, 2006.
[28] u. leonhardt, and t. g. philbin, “transformation optics and the geometry of light,” progress in optics, e. wolf, ed., pp. 69-152, amsterdam, the netherlands: elsevier science & technology 2009.
[29] j. b. pendry, d. schurig, and d. r. smith, “controlling electromagnetic fields,” science, vol. 312, no. 5781, pp. 1780-1782, 2006.
[30] e. h. putley, “thermal detectors,” optical and infrared detectors, r. j. keyes, ed., berlin: springer-verlag, 1983.
[31] p. g. datskos, n. v. lavrik, and s. rajic, “performance of uncooled microcantilever thermal detectors,” review of scientific instruments, vol. 75, no. 4, pp. 1134-1148, 2004.
[32] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by up-conversion of sub-bandgap light,” j. appl. phys., vol. 92, no. 7, pp. 4117-4122, 2002.
[33] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by down-conversion of high-energy photons,” j. appl. phys., vol. 92, no. 3, pp. 1668-1674, 2002.
[34] e. yablonovitch, “statistical ray optics,” j. opt. soc. am., vol. 72, pp. 899-907, 1982.
[35] e. yablonovitch, and g. d. cody, “intensity enhancement in textured optical sheets for solar cells,” ieee transactions on electron devices, vol. ed-29, no. 2, pp. 300-305, 1982.
[36] t. tiedje, e. yablonovitch, g. d. cody, and b. g.
brooks, “limiting efficiency of silicon solar cells,” ieee transactions on electron devices, vol. ed-31, no. 5, pp. 711-716, 1984.
[37] p. campbell, and m. a. green, “limiting efficiency of silicon solar cells under concentrated sunlight,” ieee transactions on electron devices, vol. ed-33, no. 2, pp. 234-239, 1986.
[38] w. shockley, and h. j. queisser, “detailed balance limit of efficiency of p-n junction solar cells,” j. appl. phys., vol. 32, no. 3, pp. 510-519, 1961.
[39] m. born, and e. wolf, principles of optics, 7th ed., cambridge university press, cambridge 1999.
[40] z. yu, a. raman, and s. fan, “fundamental limit of nanophotonic light trapping in solar cells,” proc. nat. acad. sci. u.s.a., vol. 107, no. 41, pp. 17491-17496, 2010.
[41] z. yu, a. raman, and s. fan, “thermodynamic upper bound on broadband light coupling with photonic structures,” phys. rev. lett., vol. 109, no. 17, 2012.
[42] d. m. callahan, j. n. munday, and h. a. atwater, “solar cell light trapping beyond the ray optic limit,” nano lett., vol. 12, no. 1, pp. 214-218, 2012.
[43] v. ganapati, o. d. miller, and e. yablonovitch, “light trapping textures designed by electromagnetic optimization for subwavelength thick solar cells,” ieee journal of photovoltaics, 2013.
[44] a. boltasseva, and h. a. atwater, “low-loss plasmonic metamaterials,” science, vol. 331, no. 6015, pp. 290-291, 2011.
[45] p. avouris, and m. freitag, “graphene photonics, plasmonics, and optoelectronics,” ieee j. sel. top. quant. electr., vol. 20, no. 1, 2014.
[46] z. jakšić, s. m. vuković, j. buha, and j. matovic, “nanomembrane-based plasmonics,” j. nanophotonics, vol. 5, pp. 051818.1-20, 2011.
[47] s. a. maier, and h. a. atwater, “plasmonics: localization and guiding of electromagnetic energy in metal/dielectric structures,” j. appl. phys., vol. 98, no. 1, pp. 1-10, 2005.
[48] e. ozbay, “plasmonics: merging photonics and electronics at nanoscale dimensions,” science, vol. 311, no. 5758, pp. 189-193, 2006.
[49] r. b. m.
schasfoort, and a. j. tudos, eds., “handbook of surface plasmon resonance,” cambridge, uk: royal society of chemistry 2008.
[50] c. bauer, g. kobiela, and h. giessen, “2d quasiperiodic plasmonic crystals,” scientific reports, vol. 2, pp. 0681.1-6, 2012.
[51] m. maksimović, and z. jakšić, “emittance and absorptance tailoring by negative refractive index metamaterial-based cantor multilayers,” j. opt. a-pure appl. opt., vol. 8, no. 3, pp. 355-362, 2006.
[52] k. vynck, m. burresi, f. riboli, and d. s. wiersma, “photon management in two-dimensional disordered media,” nat. mater., vol. 11, no. 12, pp. 1017-1022, 2012.
[53] z. jakšić, “optical metamaterials as the platform for a novel generation of ultrasensitive chemical or biological sensors,” metamaterials: classes, properties and applications, e. j. tremblay, ed., pp. 1-42, hauppauge, new york: nova science publishers, 2010.
[54] s. m. vuković, z. jakšić, and j. matovic, “plasmon modes on laminated nanomembrane-based waveguides,” j. nanophotonics, vol. 4, pp. 041770, 2010.
[55] s. m. vuković, z. jakšić, i. v. shadrivov, and y. s. kivshar, “plasmonic crystal waveguides,” appl. phys. a, vol. 103, no. 3, pp. 615-617, 2011.
[56] p. drude, the theory of optics, dover publications, mineola, new york, 2005.
[57] h. a. lorentz, the theory of electrons, dover publications, mineola, new york, 1952.
[58] p. w. milonni, fast light, slow light and left-handed light, taylor & francis, abingdon, oxford, 2004.
[59] s. v. boriskina, h. ghasemi, and g. chen, “plasmonic materials for energy: from physics to applications,” materials today, vol. 16, no. 10, pp. 375-386, 2013.
[60] s. franzen, “surface plasmon polaritons and screened plasma absorption in indium tin oxide compared to silver and gold,” j. phys. chem. c, vol. 112, no. 15, pp. 6027-6032, 2008.
[61] b. l. twu, and s. e. schwarz, “properties of infrared cat-whisker antennas near 10.6 μ,” appl. phys. lett., vol. 26, no. 12, pp. 672-675, 1975.
[62] s. r. j. brueck, v. diadiuk, t. jones, and w. lenth, “enhanced quantum efficiency internal photoemission detectors by grating coupling to surface plasma waves,” appl. phys. lett., vol. 46, no. 10, pp. 915-917, 1985.
[63] d. derkacs, s. lim, p. matheu, w. mar, and e. yu, “improved performance of amorphous silicon solar cells via scattering from surface plasmon polaritons in nearby metallic nanoparticles,” appl. phys. lett., vol. 89, no. 9, pp. 093103, 2006.
[64] a. aubry, d. y. lei, a. i. fernández-domínguez, y. sonnefraud, s. a. maier, and j. b. pendry, “plasmonic light-harvesting devices over the whole visible spectrum,” nano lett., vol. 10, no. 7, pp. 2574-2579, 2010.
[65] t. trupke, m. green, and p. würfel, “improving solar cell efficiencies by up-conversion of sub-bandgap light,” j. appl. phys., vol. 92, no. 7, pp. 4117-4122, 2002.
[66] g. shvets, and y. a. urzhumov, “electric and magnetic properties of sub-wavelength plasmonic crystals,” j. opt. a-pure appl. opt., vol. 7, no. 2, pp. s23-s31, 2005.
[67] c. bauer, and h. giessen, “light harvesting enhancement in solar cells with quasicrystalline plasmonic structures,” opt. express, vol. 21, no. 103, pp. a363-a371, 2013.
[68] p. biagioni, j.-s. huang, and b. hecht, “nanoantennas for visible and infrared radiation,” reports on progress in physics, vol. 75, no. 2, pp. 024402, 2012.
[69] m. quinten, optical properties of nanoparticle systems: mie and beyond, wiley-vch, weinheim, germany, 2011.
[70] m. schmid, r. klenk, m. c. lux-steiner, m. topič, and j. krč, “modeling plasmonic scattering combined with thin-film optics,” nanotechnology, vol. 22, no. 2, 2010.
[71] v. e. ferry, j. n. munday, and h. a. atwater, “design considerations for plasmonic photovoltaics,” adv. mat., vol. 22, no. 43, pp. 4794-4808, 2010.
[72] t. k. sau, and a. l. rogach, eds., “complex-shaped metal nanoparticles: bottom-up syntheses and applications,” weinheim, germany: wiley-vch, 2012.
[73] d. c. o'shea, t. j.
suleski, a. d. kathman, and d. w. prather, diffractive optics: design, fabrication, and test, spie publications, bellingham, washington, 2003.
[74] p. spinelli, e. ferry, j. van de groep, m. van lare, a. verschuuren, i. schropp, a. atwater, a. polman, v. e. ferry, m. a. verschuuren, r. e. i. schropp, and h. a. atwater, “plasmonic light trapping in thin-film si solar cells,” journal of optics, vol. 14, no. 2, 2012.
[75] p. berini, “long-range surface plasmon polaritons,” adv. opt. photon., vol. 1, no. 3, pp. 484-588, 2009.
[76] r. d. r. bhat, n. c. panoiu, s. r. j. brueck, and r. m. osgood jr, “enhancing the signal-to-noise ratio of an infrared photodetector with a circular metal grating,” opt. express, vol. 16, no. 7, pp. 4588-4596, 2008.
[77] a. alù, g. d'aguanno, n. mattiucci, and m. j. bloemer, “plasmonic brewster angle: broadband extraordinary transmission through optical gratings,” phys. rev. lett., vol. 106, no. 12, 2011.
[78] p. genevet, j. lin, m. a. kats, and f. capasso, “holographic detection of the orbital angular momentum of light with plasmonic photodiodes,” nature communications, vol. 3, 2012.
[79] p. berini, “plasmon-polariton waves guided by thin lossy metal films of finite width: bound modes of symmetric structures,” phys. rev. b, vol. 61, no. 15, pp. 10484-10503, 2000.
[80] p. berini, “plasmon-polariton waves guided by thin lossy metal films of finite width: bound modes of asymmetric structures,” phys. rev. b, vol. 63, no. 12, pp. 1254171-12541715, 2001.
[81] i. i. smolyaninov, w. atia, and c. c. davis, “near-field optical microscopy of two-dimensional photonic and plasmonic crystals,” phys. rev. b, vol. 59, no. 3, pp. 2454-2460, 1999.
[82] j. b. pendry, a. j. holden, d. j. robbins, and w. j. stewart, “magnetism from conductors and enhanced nonlinear phenomena,” ieee t. microw. theory, vol. 47, no. 11, pp. 2075-2084, 1999.
[83] r. h. fan, l. h. zhu, r. w. peng, x. r. huang, d. x. qi, x. p. ren, q. hu, and m.
wang, “broadband antireflection and light-trapping enhancement of plasmonic solar cells,” phys. rev. b, vol. 87, no. 19, 2013. [84] t. w. ebbesen, h. j. lezec, h. f. ghaemi, t. thio, and p. a. wolff, “extraordinary optical transmission through sub-wavelength hole arrays,” nature, vol. 391, no. 6668, pp. 667-669, 1998. [85] j. b. pendry, l. martín-moreno, and f. j. garcia-vidal, “mimicking surface plasmons with structured surfaces,” science, vol. 305, no. 5685, pp. 847-848, 2004. [86] j. rosenberg, r. v. shenoi, t. e. vandervelde, s. krishna, and o. painter, “a multispectral and polarizationselective surface-plasmon resonant midinfrared detector,” appl. phys. lett., vol. 95, no. 16, 2009. [87] h. chen, c. t. chan, and p. sheng, “transformation optics and metamaterials,” nat. mater., vol. 9, no. 5, pp. 387-396, 2010. [88] d. r. smith, j. j. mock, a. f. starr, and d. schurig, “gradient index metamaterials,” phys. rev. e, vol. 71, no. 3, pp. 036609, 2005. plasmonic enhancement of light trapping in photodetectors 203 [89] m. dalarsson, m. norgren, n. dončov, and z. jakšić, “lossy gradient index transmission optics with arbitrary periodic permittivity and permeability and constant impedance throughout the structure,” journal of optics (united kingdom), vol. 14, no. 6, pp. 065102, 2012. [90] a. alù, and n. engheta, “achieving transparency with plasmonic and metamaterial coatings,” phys. rev. e, vol. 72, no. 1, pp. 016623, 2005. [91] a. i. fernández-domínguez, s. a. maier, and j. b. pendry, “collection and concentration of light by touching spheres: a transformation optics approach,” phys. rev. lett., vol. 105, no. 26, 2010. [92] j. ng, h. chen, and c. t. chan, “metamaterial frequency-selective superabsorber,” opt. lett., vol. 34, no. 5, pp. 644-646, 2009. [93] n. i. landy, s. sajuyigbe, j. j. mock, d. r. smith, and w. j. padilla, “perfect metamaterial absorber,” phys. rev. lett., vol. 100, no. 20, 2008. [94] e. e. narimanov, and a. v. 
kildishev, “optical black hole: broadband omnidirectional light absorber,” appl. phys. lett., vol. 95, no. 4, 2009. [95] t. yang, h. chen, x. luo, and h. ma, “superscatterer: enhancement of scattering with complementary media,” opt. express, vol. 16, no. 22, pp. 18545-18550, 2008. [96] l. novotny, and n. van hulst, “antennas for light,” nat. photonics, vol. 5, no. 2, pp. 83-90, 2011. [97] a. alu, and n. engheta, “theory, modeling and features of optical nanoantennas,” ieee t. antenn. propag., vol. 61, no. 4, pp. 1508-1517, 2013. [98] p. bharadwaj, b. deutsch, and l. novotny, “optical antennas,” adv. opt. phot., vol. 1, no. 3, pp. 438483, 2009. [99] a. f. koenderink, “plasmon nanoparticle array waveguides for single photon and single plasmon sources,” nano lett., vol. 9, no. 12, pp. 4228-4233, 2009. [100] p. j. schuck, d. p. fromm, a. sundaramurthy, g. s. kino, and w. e. moerner, “improving the mismatch between light and nanoscale objects with gold bowtie nanoantennas,” phys. rev. lett., vol. 94, no. 1, 2005. [101] t. grosjean, m. mivelle, f. i. baida, g. w. burr, and u. c. fischer, “diabolo nanoantenna for enhancing and confining the magnetic optical field,” nano lett., vol. 11, no. 3, pp. 1009-1013, 2011. [102] j. li, a. salandrino, and n. engheta, “shaping light beams in the nanometer scale: a yagi-uda nanoantenna in the optical domain,” phys. rev. b, vol. 76, no. 24, 2007. [103] e. n. grossman, j. e. sauvageau, and d. g. mcdonald, “lithographic spiral antennas at short wavelengths,” appl. phys. lett., vol. 59, no. 25, pp. 3225-3227, 1991. [104] g. volpe, g. volpe, and r. quidant, “fractal plasmonics: subdiffraction focusing and broadband spectral response by a sierpinski nanocarpet,” opt. express, vol. 19, no. 4, pp. 3612-3618, 2011. [105] y. alaverdyan, b. seplveda, l. eurenius, e. olsson, and m. käll, “optical antennas based on coupled nanoholes in thin metal films,” nature physics, vol. 3, no. 12, pp. 884-889, 2007. [106] c. simovski, d. morits, p. 
facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 221-234 doi: 10.2298/fuee1402221d

causal models of electrically large and lossy dielectric bodies

antonije djordjević 1, dragan olćan 1, mirjana stojilović 2, miloš pavlović 3, branko kolundžija 1, dejan tošić 1

1 university of belgrade – school of electrical engineering, belgrade, serbia
2 university of applied sciences western switzerland, yverdon-les-bains, switzerland
3 wipl-d d.o.o., belgrade, serbia

abstract. this paper presents a novel formula for the complex permittivity of lossy dielectrics, which is valid in a broad frequency range and ensures a causal impulse response in the time domain.
the application of this formula is demonstrated through the analysis of wet soil, where the coefficients of the formula are tuned to match measured data from the literature. additionally, an analytical expression for the impulse response of the relative permittivity is derived. the influence of the frequency dependence of the complex permittivity on the causality of responses is illustrated through the analysis of 1-d, 2-d, and 3-d electromagnetic systems. being the most complex, the 3-d system is also used as a test bed for comparing the computational limitations of two commercially available solvers, cst and wipl-d.

key words: causal response, complex permittivity, wet soil, impulse response, cst, wipl-d.

received january 23, 2014
corresponding author: antonije djordjević, university of belgrade – school of electrical engineering, bulevar kralja aleksandra 73, p.o. box 35-54, 11120 belgrade, serbia (e-mail: edjordja@etf.bg.ac.rs)

1. introduction

contemporary software tools for electromagnetic (em) simulation can be efficiently used for modeling and analysis of various complex systems. yet, when a system comprises electrically large but highly detailed objects filled with lossy dielectrics, as is usually the case for simulations at microwave and millimeter-wave frequencies, software limitations can easily be reached. an example of a large and complex system is a human body, which is often modeled when considering body-area networks. in order to find the response of such a system in the time domain, there are two general simulation strategies: to apply a time-domain solver, or to apply a frequency-domain solver and then use the inverse discrete fourier transform.

if a frequency-domain solver is used, the system needs to be analyzed at a large number of equispaced frequencies.
at the lower end of the frequency range, the electrical dimensions of the body are small compared with the wavelength, so that integral-equation solvers are the preferred choice [1]. at the upper end of the frequency range, various asymptotic techniques may be used, provided that the shape of the body is simple (e.g., a sphere). if, however, the shape of the body is highly detailed, then an integral-equation solver (is) is preferred again. however, the computational resources required by an is (memory and processor time) increase quickly with the simulation frequency. that is why contemporary commercial integral-equation solvers often fail to analyze objects whose overall dimensions exceed several tens or hundreds of wavelengths. in this paper, we challenge the limits of two commercial solvers, implemented in the wipl-d [2] and cst [3] electromagnetic simulation tools.

the time-domain response of any real (physical) system is causal: the response cannot start before the excitation. all time-domain solvers implement a time-stepping procedure and naturally incorporate causality, as in [3]. however, this is not the case with frequency-domain solvers. to ensure a causal response, the relative permittivity of the dielectric must be modeled properly. if the dielectric is lossless, the relative permittivity can be independent of frequency. yet, if the dielectric is lossy, its variation with frequency must be described by an appropriate function of frequency, as discussed in section 2. otherwise, a noncausal, and thus unrealistic, response will be obtained.

in order to illustrate the causality issues, we use wet soil as an example. the parameters of the soil are evaluated based on measured data available in the literature, presented in section 3.1. these data are fitted in a very broad frequency range, as described in section 3.2. the fitting function involves the broadband term from [4].
in the time-domain analysis, a dispersive relative permittivity is described by the corresponding impulse response. hence, for the broadband term from [4], we derive in section 3.3 the corresponding impulse response, which is not available in the literature.

in section 4 we present three examples to demonstrate the differences between a causal and a noncausal model of the soil. the examples are sorted by computational complexity. the first example is a one-dimensional (1-d) electromagnetic problem (plane-wave propagation). the second example is a two-dimensional (2-d) problem (a cylindrical dielectric scatterer). finally, the third example is the most complex one: a three-dimensional (3-d) problem consisting of a large dielectric cube and two dipole antennas.

2. causality issues with frequency-dependent permittivity

parameters of lossy media are frequency-dependent. thus, a linear nonmagnetic medium is characterized by the complex relative permittivity $\varepsilon_r(j\omega) = \varepsilon_r'(\omega) - j\varepsilon_r''(\omega)$, where $\omega = 2\pi f$ is the angular frequency and $f$ is the frequency. here, the negative of the imaginary part, $\varepsilon_r''(\omega)$, takes into account both conductive and dielectric losses.

a physical system is causal: the response cannot occur before the excitation. consequently, $\varepsilon_r'(\omega)$ and $\varepsilon_r''(\omega)$ are not mutually independent. under certain conditions [5], they are related by the hilbert transform, or, equivalently, the kramers-kronig relations (dispersion relations).

time-domain solvers that are capable of dealing with dispersive parameters inherently use causal models of media. usually, they fit the relative permittivity in the frequency domain using simple terms, most often debye terms [6], so that the impulse response is obtained analytically using the fourier transform. the impulse response is afterwards used in convolution integrals in the time domain.
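as a minimal illustration of the debye-term approach described above, the python sketch below evaluates a single debye term $\Delta\varepsilon/(1 + s/\omega_0)$ together with its causal impulse response $\Delta\varepsilon\,\omega_0 e^{-\omega_0 t}$ for $t \ge 0$, and provides a numerical laplace integral for checking that the two form a transform pair. the names `delta_eps` and `omega0` and the helper `laplace_numeric` are assumptions introduced only for this sketch; they are not taken from any solver.

```python
import cmath
import math


def debye_term(s, delta_eps, omega0):
    """Single Debye relaxation term delta_eps / (1 + s/omega0)."""
    return delta_eps / (1.0 + s / omega0)


def debye_impulse(t, delta_eps, omega0):
    """Impulse response of the Debye term; identically zero for t < 0 (causal)."""
    if t < 0.0:
        return 0.0
    return delta_eps * omega0 * math.exp(-omega0 * t)


def laplace_numeric(h, s, t_max, n=20000):
    """Trapezoidal estimate of the one-sided Laplace integral of h on [0, t_max]."""
    dt = t_max / n
    total = 0.5 * (h(0.0) + h(t_max) * cmath.exp(-s * t_max))
    for k in range(1, n):
        t = k * dt
        total += h(t) * cmath.exp(-s * t)
    return total * dt
```

with, for instance, `delta_eps = 5` and a relaxation frequency of 1 ghz (both hypothetical), integrating `debye_impulse` over about twenty time constants with `laplace_numeric` should reproduce `debye_term` at the same `s` to within the quadrature error.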
frequency-domain solvers can deal with any kind of frequency dependence of $\varepsilon_r(j\omega)$, because they are not bound by causality issues. however, if we use the results of the frequency-domain analysis to compute the time-domain response, $\varepsilon_r(j\omega)$ should be such as to provide a causal response. otherwise, the response in the time domain would exhibit non-physical behavior. for example, it could start before the excitation, thus violating strict causality [7], or the speed of propagation of electromagnetic fields could exceed the speed of light in a vacuum, violating einstein’s causality.

although it is often stated in the literature that $\varepsilon_r''(\omega)$ can be evaluated from $\varepsilon_r'(\omega)$ [8], and vice versa, the required numerical integration is not easy, and sometimes not even feasible. the reasons lie in singular, highly oscillatory, or even diverging integrands, and in the infinite integration limits of the hilbert transform. even analytically, the integrals cannot be evaluated in many important practical cases because they are divergent or undefined [5].

as a simple example, let us consider a leaky dielectric characterized by $\varepsilon_r'(\omega) = \mathrm{const}$ (equal to the electrostatic relative permittivity) and by a constant conductivity $\sigma(\omega) = \mathrm{const}$, which is independent of $\varepsilon_r'$. the equivalent (complex) permittivity of the material is $\varepsilon_r(j\omega) = \varepsilon_r' - j\sigma/(\omega\varepsilon_0)$, so that $\varepsilon_r''(\omega) = \sigma/(\omega\varepsilon_0)$. clearly, if only $\varepsilon_r'$ is known, it is impossible to find $\varepsilon_r''(\omega)$ without an additional piece of information, and vice versa.

in consequence, the data for $\varepsilon_r(j\omega)$ that satisfy causality conditions can be most reliably and easily supplied in terms of an analytic function of the complex frequency $s$ ($s = j\omega$ on the imaginary axis). this function, $\varepsilon_r(s)$, cannot have poles in the right half-plane. it can have only simple poles on the imaginary axis, where it must possess conjugate symmetry: $\varepsilon_r(-j\omega) = \varepsilon_r^*(j\omega)$. hence, $\varepsilon_r(s^*) = \varepsilon_r^*(s)$.
the function $\varepsilon_r(s)$ can be supplied directly by the user, in an analytic form. alternatively, the user tabulates the frequency-dependent data for $\varepsilon_r'(\omega)$ and $\varepsilon_r''(\omega)$, and the solver evaluates an appropriate interpolation formula, as in [3].

for direct analysis in the time domain, the impulse response of $\varepsilon_r$ is needed. it is convolved with the vector $\varepsilon_0\mathbf{E}(t)$ to obtain $\mathbf{D}(t)$. this convolution is the time-domain counterpart of the relation $\mathbf{D} = \varepsilon_r\varepsilon_0\mathbf{E}$ in the frequency domain.

3. complex relative permittivity of wet soil

in order to clearly demonstrate the difference between causal and noncausal models, it is preferable to have a medium with relatively high losses. in that case, the causality issues can be noted even after an em wave propagates along a short distance. we have selected wet soil as an example of a dispersive medium throughout the remainder of this paper. we have characterized the soil based on experimental data presented in subsection 3.1. the analytic approximation for $\varepsilon_r(s)$ is given in subsection 3.2.

3.1. experimental data on relative permittivity of water and soil

in this paper we use two sets of measured relative permittivity values: those of water and those of soil. we combine them to estimate the relative permittivity of wet soil.

measurement results and an analytical model for the relative permittivity of water are given in [9]. the measured complex permittivity is fitted by a constant ($\varepsilon_{r\infty}$), two debye (relaxation) terms, and a frequency-independent conductivity ($\sigma$), as

$$\varepsilon_r(s) = \varepsilon_{r\infty} + \frac{\Delta\varepsilon_{r1}}{1 + s/\omega_{1w}} + \frac{\Delta\varepsilon_{r2}}{1 + s/\omega_{2w}} + \frac{\sigma}{s\varepsilon_0}. \tag{1}$$

in this model, conductive losses are attributed to $\sigma$ and polarization losses to the two debye terms.

the measured relative permittivity of soil is given in [10], and comprises data for $\varepsilon_r'$, $\varepsilon_r''$, and $\sigma$.
although it is not sufficiently clear from the report, $\varepsilon_r''$ and $\sigma$ are not independent; they are related by $\varepsilon_r'' = \sigma/(\omega\varepsilon_0)$, which can be verified from the numerical results presented in the paper. in other words, two distinct descriptions of losses are used in reference [10]. in the first description, all losses (conductive and polarization) are attributed to $\varepsilon_r''$. in the second description, all losses are attributed to $\sigma$. in this paper we use results from the middle row of fig. 36 in [10].

3.2. broadband approximation of relative permittivity of wet soil

in [4], an approximation for the frequency-dependent complex relative permittivity of lossy dielectrics is proposed, covering a very wide frequency range. it uses a logarithmic function, which provides a practically constant $\varepsilon_r''(\omega)$, while $\varepsilon_r'(\omega)$ slowly decays with frequency. this approximation yields a causal response in the time domain. the complete expression for the relative permittivity is given by

$$\varepsilon_r(j\omega) = \varepsilon_\infty' + \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10} - j\frac{\sigma}{\omega\varepsilon_0}, \tag{2}$$

where $m_1 = \log_{10}\omega_1\,[\mathrm{rad/s}]$ and $m_2 = \log_{10}\omega_2\,[\mathrm{rad/s}]$. the first term is the relative permittivity at very high frequencies, the second term is the broadband logarithmic term, and the third term comes from the conductivity, which is assumed to be independent of frequency.

for the frequency range where $\omega_1 \ll \omega \ll \omega_2$, the real part of the logarithmic term is

$$\mathrm{Re}\left\{\frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10}\right\} = \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\left|\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}\right|}{\ln 10} \approx \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2}{\omega}}{\ln 10}, \tag{3}$$

and it decays linearly with the logarithm of the frequency, by $\Delta\varepsilon'/(m_2 - m_1)$ per decade. in the same frequency band, the imaginary part of the logarithmic term,

$$\mathrm{Im}\left\{\frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10}\right\} = \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\arg\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10} \approx \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{-\pi/2}{\ln 10}, \tag{4}$$

is practically constant.
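the mid-band behavior stated in (3) and (4) can be checked numerically. the sketch below evaluates the logarithmic term of (2) at $s = j\omega$ and compares it with the two approximations. the function names and any corner frequencies or $\Delta\varepsilon'$ passed to them are illustrative assumptions of this sketch.

```python
import cmath
import math


def log_term(omega, delta_eps, omega1, omega2):
    """Broadband logarithmic term of eq. (2), evaluated at s = j*omega."""
    m1 = math.log10(omega1)
    m2 = math.log10(omega2)
    ratio = (omega2 + 1j * omega) / (omega1 + 1j * omega)
    return delta_eps / (m2 - m1) * cmath.log(ratio) / math.log(10.0)


def midband_parts(omega, delta_eps, omega1, omega2):
    """Approximations (3) and (4), meaningful only for omega1 << omega << omega2."""
    slope = delta_eps / (math.log10(omega2) - math.log10(omega1))
    real = slope * math.log(omega2 / omega) / math.log(10.0)
    imag = -slope * (math.pi / 2.0) / math.log(10.0)
    return real, imag
```

evaluating `log_term` one decade apart inside the band should show the real part dropping by the slope $\Delta\varepsilon'/(m_2 - m_1)$ while the imaginary part stays close to the constant returned by `midband_parts`. the agreement degrades as $\omega$ approaches either corner frequency.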
for angular frequencies below $\omega_1$ or above $\omega_2$, the imaginary part of the logarithmic term tends to zero, while the real part tends to a constant. this logarithmic function can replace several debye terms in a wide frequency range. formula (2) is often quoted as the “djordjevic-sarkar” model, and it has been built into ansoft [11], agilent [12], simberian [13], and other software.

we approximate the parameters of wet soil by combining the soil parameters from [10], the logarithmic term from [4], and the approximation for pure water at 25 °c, based on data from [9] and [14]. the measured data for the soil are in the frequency range from 0.1 ghz to 3 ghz, whereas the approximation is valid outside of this frequency range as well. our approximation for the permittivity of wet soil reads:

$$\varepsilon_r(s) = 1 + (1 - p)\,\frac{\Delta\varepsilon_r}{(m_2 - m_1)\ln 10}\,\ln\frac{\omega_2 + s}{\omega_1 + s} + p\,\Delta\varepsilon_{rw}\left(\frac{1 - p_w}{1 + s/\omega_{1w}} + \frac{p_w}{1 + s/\omega_{2w}}\right) + \frac{\sigma}{s\varepsilon_0}, \tag{5}$$

where
• $p = 0.11$ is the relative contribution of water,
• $\omega_1 = 2\pi f_1$, where $f_1 = 1\ \mathrm{MHz}$ is the lower cutoff frequency of the broadband term,
• $\omega_2 = 2\pi f_2$, where $f_2 = 100\ \mathrm{THz}$ is the upper cutoff frequency of the broadband term,
• $m_1 = \log_{10}\omega_1\,[\mathrm{rad/s}]$,
• $m_2 = \log_{10}\omega_2\,[\mathrm{rad/s}]$,
• $\Delta\varepsilon_r = \Delta\varepsilon_{rd}\,(m_2 - m_1)$ is the total variation of the real part of the broadband term, where $\Delta\varepsilon_{rd} = 1.61$ is the slope per decade,
• $\sigma = 0.025\ \mathrm{S/m}$ is the constant conductivity,
• $\omega_{1w} = 2\pi f_{1w}$, where $f_{1w} = 25\ \mathrm{GHz}$ is the location of the first debye term for water,
• $\omega_{2w} = 2\pi f_{2w}$, where $f_{2w} = 200\ \mathrm{GHz}$ is the location of the second debye term for water,
• $\Delta\varepsilon_{rw} = 76.5$ is the total variation of the real part of the permittivity of water, and
• $p_w = 0.065$ is the relative contribution of the second debye term.

fig. 1 compares the measured data from [10] with the results obtained from our approximation formula in the frequency range from 0.1 ghz to 3 ghz.
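a minimal python sketch of formula (5) with the parameter values listed above is given below. it assumes the reading of (5) in which the broadband term is weighted by $1 - p$, the two water debye terms by $p$, and the conductivity enters as $\sigma/(s\varepsilon_0)$. the function names and the effective-conductivity helper $\omega\varepsilon_0\varepsilon_r''$ are assumptions of this sketch, not part of the original formulation.

```python
import cmath
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]

P = 0.11          # relative contribution of water
F1 = 1e6          # lower cutoff of the broadband term [Hz]
F2 = 100e12       # upper cutoff of the broadband term [Hz]
SLOPE = 1.61      # slope of the broadband term per decade
SIGMA = 0.025     # constant conductivity [S/m]
F1W = 25e9        # first Debye term of water [Hz]
F2W = 200e9       # second Debye term of water [Hz]
D_EPS_RW = 76.5   # total variation of the real part for water
PW = 0.065        # relative contribution of the second Debye term


def eps_wet_soil(f):
    """Complex relative permittivity from eq. (5) at s = j*2*pi*f (f must be nonzero)."""
    s = 2j * math.pi * f
    w1, w2 = 2.0 * math.pi * F1, 2.0 * math.pi * F2
    w1w, w2w = 2.0 * math.pi * F1W, 2.0 * math.pi * F2W
    # delta_eps_r / (m2 - m1) equals the slope per decade
    broadband = SLOPE * cmath.log((w2 + s) / (w1 + s)) / math.log(10.0)
    water = D_EPS_RW * ((1.0 - PW) / (1.0 + s / w1w) + PW / (1.0 + s / w2w))
    return 1.0 + (1.0 - P) * broadband + P * water + SIGMA / (s * EPS0)


def effective_conductivity(f):
    """All losses expressed as a conductivity: omega * eps0 * eps_r'' [S/m]."""
    return 2.0 * math.pi * f * EPS0 * (-eps_wet_soil(f).imag)
```

calling `eps_wet_soil` over 0.1 ghz to 3 ghz gives the quantities that would be compared with the measured permittivity and conductivity. the conductivity term makes the function singular at `f = 0`, so that point has to be excluded.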
the measured data exhibit stochastic errors, but the achieved agreement can be considered good.

fig. 1 comparison of measured and analytically calculated parameters of wet soil: (a) relative permittivity and (b) conductivity. the results show good agreement.

3.3. impulse response

for time-domain solvers, we need the impulse response of $\varepsilon_r(s)$ from equation (5). this response can be evaluated by summing the responses of all terms. the first term is a constant (unity), the second term is a broadband term, two debye terms follow, and the last term has the form $1/s$. the impulse responses of all terms, except the broadband term, are elementary and can be found in standard tables of the inverse laplace transform. however, the response for the broadband term is not available in the literature. hence, we have evaluated it analytically using the inverse laplace transform, so that the impulse response corresponding to (5) reads:

$$\varepsilon_r(t) = \delta(t) + (1 - p)\,\frac{\Delta\varepsilon_r}{(m_2 - m_1)\ln 10}\,\frac{e^{-\omega_1 t} - e^{-\omega_2 t}}{t}\,h(t) + p\left((1 - p_w)\,\Delta\varepsilon_{rw}\,\omega_{1w}e^{-\omega_{1w}t} + p_w\,\Delta\varepsilon_{rw}\,\omega_{2w}e^{-\omega_{2w}t}\right)h(t) + \frac{\sigma}{\varepsilon_0}\,h(t), \tag{6}$$

where $\delta(t)$ is the dirac (delta) function and $h(t)$ is the heaviside (step) function.

4. examples of (non)causal response

in this section, we present three examples to demonstrate differences in the time-domain response when using a causal and when using a noncausal model of wet soil. the examples are ordered according to the complexity and dimensionality of the analyzed electromagnetic problems, from the simplest to the most complex ones.

4.1. plane wave

we consider a uniform plane wave that propagates through a homogeneous nonmagnetic medium. this is a one-dimensional (1-d) electromagnetic problem. the excited wave is described by a delta function. the distance of wave propagation is $d = 0.5\ \mathrm{m}$.
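the plane-wave experiment can be sketched in a few lines of python. the sketch assumes the transfer function $e^{-jkd}$ with $k = \omega\sqrt{\varepsilon_r}/c_0$, a grid of 4096 frequency samples with a 1 mhz step, and a direct evaluation of the inverse discrete fourier sum at selected time instants. the function names, the coarse 0.25 ns time grid, and the 5 ns probe time are assumptions of this sketch, and the code is not the authors' implementation.

```python
import cmath
import math

C0 = 299_792_458.0       # speed of light in vacuum [m/s]
EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]


def eps_wet_soil(f):
    """Eq. (5) at s = j*2*pi*f (f > 0), with the parameter values of section 3.2."""
    s = 2j * math.pi * f
    w1, w2 = 2.0 * math.pi * 1e6, 2.0 * math.pi * 100e12
    w1w, w2w = 2.0 * math.pi * 25e9, 2.0 * math.pi * 200e9
    p, pw, slope, d_eps_rw, sigma = 0.11, 0.065, 1.61, 76.5, 0.025
    broadband = slope * cmath.log((w2 + s) / (w1 + s)) / math.log(10.0)
    water = d_eps_rw * ((1.0 - pw) / (1.0 + s / w1w) + pw / (1.0 + s / w2w))
    return 1.0 + (1.0 - p) * broadband + p * water + sigma / (s * EPS0)


def spectrum(eps_model, d, n_freq, df):
    """Transfer function exp(-j*k*d) sampled at 0, df, ..., (n_freq - 1)*df."""
    h = [1.0 + 0.0j]  # dc sample: k*d tends to zero as f -> 0
    for k in range(1, n_freq):
        f = k * df
        # principal square root keeps Im(k) <= 0, i.e. an attenuated wave
        wavenumber = 2.0 * math.pi * f * cmath.sqrt(eps_model(f)) / C0
        h.append(cmath.exp(-1j * wavenumber * d))
    return h


def impulse_response(h, df, times):
    """Inverse DFT sum of a one-sided spectrum of a real signal, at given times."""
    out = []
    for t in times:
        acc = h[0].real
        for k in range(1, len(h)):
            acc += 2.0 * (h[k] * cmath.exp(2j * math.pi * k * df * t)).real
        out.append(acc * df)
    return out


def precursor_ratio(eps_model, t_early=5e-9):
    """|response at t_early| relative to the peak magnitude found on 0 ... 12 ns."""
    h = spectrum(eps_model, d=0.5, n_freq=4096, df=1e6)
    times = [0.25e-9 * i for i in range(49)]
    peak = max(abs(v) for v in impulse_response(h, 1e6, times))
    early = abs(impulse_response(h, 1e6, [t_early])[0])
    return early / peak
```

`precursor_ratio(eps_wet_soil)` and `precursor_ratio(lambda f: 16.7251 - 2.3640j)` compare the two models at an instant that precedes the arrival of the pulse. a frequency-independent lossy permittivity is expected to leave a visible precursor there, whereas the dispersive model should not.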
we analyze the propagation in the frequency domain at 4096 frequency points, starting from 0, with a step of 1 mhz. thereafter, we use the inverse discrete fourier transform to obtain the response in the time domain. we consider two models. in the first, the complex relative permittivity is given by (5). in the second, the complex relative permittivity is independent of frequency and equal to $\varepsilon_r = 16.7251 - j2.3640$ (which is an estimated mean value of the permittivity of wet soil in the first model). the results are shown in fig. 2. for the first model, a causal response is obtained. it has a crisp start at 6 ns. for the second model, a noncausal response is obtained. it is characterized by a premature and “lazy” leading edge of the pulse.

fig. 2 time-domain response when a plane wave propagates through a causal and a noncausal medium. the causal response has a crisp start, while the noncausal response has an early start and a slow leading edge.

4.2. dielectric cylinder

we consider an infinitely long cylinder of square cross-section, whose side length is 0.5 m. the cross-section of the cylinder is shown as an inset in fig. 3. the axis of the cylinder coincides with the z-axis of the cartesian coordinate system. hence, we deal here with a two-dimensional (2-d) em system.

fig. 3 electric field z-component at the axis of a dielectric cylinder for the cases when the permittivity is frequency-constant (noncausal) and frequency-dependent (causal).

a uniform plane electromagnetic wave illuminates the cylinder. the wave propagates parallel to the x-axis, in the negative x-direction. the electric-field vector of the wave is a gaussian pulse in the time domain, defined as

$$\mathbf{E}(t) = E_0\,e^{-\frac{(t - t_0)^2}{2\tau^2}}\,\mathbf{i}_z, \tag{7}$$

where $E_0 = 1\ \mathrm{V/m}$, $t_0 = 3\ \mathrm{ns}$, $\tau = 0.1\ \mathrm{ns}$, and $\mathbf{i}_z$ is the unit vector in the z-direction. these data are for the transversal plane at $x = 1\ \mathrm{m}$.
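the gaussian pulse of (7) and its spectral content can be examined with the short python sketch below. it assumes the standard result that a gaussian of width parameter 0.1 ns has a relative spectral magnitude $e^{-(2\pi f\tau)^2/2}$. the constant name `TAU` for the width parameter and both function names are assumptions of this sketch.

```python
import math

E0 = 1.0      # peak electric field [V/m]
T0 = 3e-9     # pulse center [s]
TAU = 0.1e-9  # width parameter of the gaussian [s]


def gaussian_pulse(t):
    """Magnitude of the incident field of eq. (7) at time t [s]."""
    return E0 * math.exp(-((t - T0) ** 2) / (2.0 * TAU ** 2))


def relative_spectrum(f):
    """|E(f)| / |E(0)| of the gaussian pulse at frequency f [Hz]."""
    return math.exp(-((2.0 * math.pi * f * TAU) ** 2) / 2.0)
```

`relative_spectrum` shows how strongly the excitation has decayed at a chosen upper analysis frequency, and `gaussian_pulse(0.0)` shows that the excitation is negligible at the start of the time window.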
since the electric-field vector is parallel to the cylinder axis, the electromagnetic field in this system is a 2-d transverse magnetic field, usually termed the tm mode.

the numerical analysis is performed using the method of moments (mom) with the surface integral-equation formulation (pmchwt), a piecewise-constant approximation for the electric and magnetic surface currents, and the point-matching testing procedure [1], [15]. as the response, we calculate the z-component of the electric field at the cylinder axis (i.e., at point o in the inset of fig. 3).

the frequency-domain analysis is done from 9.99512 mhz to 10.235 ghz over 1024 equidistant frequency samples. the total number of unknowns increases with the analysis frequency, but it does not exceed 1600. the total analysis time is 815 s on a desktop computer with an intel i7 cpu and 32 gb of ddr3 ram. the time-domain response is calculated for the time interval from 0 to 100 ns over 2048 equidistant time samples.

the results for the constant permittivity in the whole frequency range, $\varepsilon_r = 16 - j3$, which is a noncausal model, and for $\varepsilon_r(s)$ given by (5), which is a causal model, are shown in fig. 3 for the first 15 ns. as in fig. 2, the response obtained by the noncausal model shows a premature beginning of the leading edge. in contrast, the causal model has a clear start, which allows for much more precise timing when evaluating the beginning of the impulse response.

4.3. dielectric cube

as the most resource-demanding problem, we consider the three-dimensional (3-d) system shown in fig. 4. it consists of a cube, made of a lossy dielectric, and two symmetrical dipoles. the side of the cube is $2c$, where we take two values for $c$: $c = 100\ \mathrm{mm}$ for a smaller cube and $c = 250\ \mathrm{mm}$ for a larger cube. inside the cube, at its center (which coincides with the coordinate origin o), one symmetrical dipole (dipole #1) is located.
the length of one arm of the dipole is 5 mm (the overall dipole length is 10 mm). the wire radius is 0.1 mm (the diameter is 0.2 mm). another dipole (dipole #2) is located outside the cube. the arm length of this dipole is 20 mm (40 mm overall) and the wire radius is 0.5 mm (the diameter is 1 mm). the dipoles are mutually parallel, and parallel to the height of the cube. the distance between the dipole centers (feeding points, ports) is $2c$.

fig. 4 two dipole antennas, one of which is inside a lossy dielectric cube. not drawn to scale.

both cubes are analyzed at 1000 frequencies: 10 mhz, 20 mhz, ..., 10000 mhz (10 ghz) using the program wipl-d [2]. for each frequency, the impedance and scattering parameters are computed. the nominal impedance for the scattering parameters is 50 Ω. two models of the dielectric are used: a noncausal model, for which the complex relative permittivity is frequency independent, $\varepsilon_r = 16 - j3$ (i.e., $\varepsilon_r' = 16$ and $\varepsilon_r'' = 3$), and a causal model, for which the complex relative permittivity is evaluated from equation (5).

for both cubes and for both dielectric models, the impulse responses for s21 and z21 are evaluated. the results are shown in figs. 5–8.

wipl-d simulations are done using wipl-d pro version 11, on a desktop computer with an intel core i7 3820 cpu @ 3.60 ghz, 64 gb of ddr3 ram, an nvidia geforce gtx590, and the microsoft windows 7 pro 64-bit operating system. the em system is modeled using three symmetry planes in order to maximally reduce the computer resources needed for the analysis. although such symmetry introduces a parasitic image of the second (larger) dipole, the effect of the parasitic dipole is negligible as it is located far away.

first, we analyze the smaller cube ($c = 100\ \mathrm{mm}$).
in the case of the constant relative permittivity, $\varepsilon_r = 16 - j3$, the analysis takes 26,931 s (i.e., approximately 7.5 hrs), and the total number of unknowns increases with frequency from 1,202 at the lowest frequency (10 mhz) up to 4,943 at the highest frequency (10 ghz). in the case when the relative permittivity is given by equation (5), the simulation takes 26,172 s (i.e., approximately 7.3 hrs). the total number of unknowns increases with frequency from 1,202 at the lowest frequency up to 4,803 at the highest frequency.

thereafter, we analyze the larger cube ($c = 250\ \mathrm{mm}$). the analysis for the constant permittivity lasts 113,753 s (i.e., approximately 31.6 hrs). the number of unknowns is 1,202 at the lowest frequency and rises with frequency up to 15,233 at the highest frequency. the analysis for the relative permittivity given by equation (5) lasts 105,484 s (i.e., approximately 29.3 hrs), while the number of unknowns is in the range from 1,202 to 13,621.

note that in both cases, the analysis of the cube with the frequency-dependent permittivity takes slightly less time than the analysis with the constant permittivity. this is because wipl-d allocates resources by taking into account the electrical size of the structure at the operating frequency. at higher frequencies, the modulus of the frequency-dependent permittivity is smaller than the modulus of the constant permittivity, thus demanding fewer unknowns.

the results for $c = 100\ \mathrm{mm}$ obtained by wipl-d are compared with the results obtained by the program cst [3], which uses a time-domain solver (fig. 5). cst simulations were performed using the microwave studio software from the cst studio 2013 package, on a windows 7 64-bit server equipped with two intel(r) xeon(r) cpus @ 2 ghz and 192 gb of ram. unfortunately, cst cannot analyze the case $c = 250\ \mathrm{mm}$ on the given hardware configuration. the cst time-domain solver uses the causal model of the relative permittivity. the hexahedral mesh size was about 7 million cells.
the cube interior was filled with a frequency-dispersive material, defined using the appropriate permittivity values given by equation (5) for each frequency point. the background was filled with a vacuum. the model boundaries were set to “open”. two symmetry planes (one magnetic-field and one electric-field symmetry plane) were defined to reduce the total computational load. the solver accuracy was set to 80 db. the excitation signal was of a 10 ghz cst-default gaussian type. the simulation was run so as to ultimately yield 1001 equidistant frequency points in the 0 to 10 ghz frequency range. the total simulation time is approximately 67.5 hours.

the time-domain solver in cst evaluates only the scattering parameters. thereby, only one port is excited, so that in one run of the program the parameters s11 and s21 are evaluated. in order to calculate the impedance parameter z21, the parameter s22 is needed as well. however, the computation of s22 requires another full run of cst, which was not performed in order to avoid the long run time. consequently, only the impulse response for s21 is shown (fig. 5).

the simulation for the noncausal response in cst can be performed using a frequency-domain solver only. the simulation parameters are as follows: integral solver (is), a mesh with 1313 surfaces, a vacuum background, open boundaries, two symmetry planes, solver accuracy 1e-3, s-parameters normalized to 50 Ω, 3rd-order solver, and the solver type is mom. the results are also shown in fig. 5. the total simulation time is approximately 52 hours. the agreement between the results evaluated by wipl-d and by cst is very good, both for the causal model and the noncausal model.

fig. 5 impulse response for $c = 100\ \mathrm{mm}$, for s21, and zoom-in (inset), as computed by the cst time-domain solver for the causal model, by the cst frequency-domain solver for the noncausal model, and by wipl-d for both the causal and noncausal models.

fig. 6 shows the impulse response for z21 for the smaller cube. figs. 7 and 8 show the impulse response for the larger cube, for s21 and z21, respectively. these results were computed only by wipl-d.

the noncausal model of the dielectric yields a premature start of the response, which is more visible for z21 than for s21. the explanation is in the shape of the spectrum of these two parameters. the spectrum of the parameter z21 is wider than the spectrum of s21. hence, the inadequate variations of the permittivity of the noncausal model have influence in a wider frequency range for z21 than for s21.

fig. 6 impulse response for $c = 100\ \mathrm{mm}$, for z21, and zoom-in (inset), as computed by wipl-d for the noncausal and causal models.

fig. 7 impulse response for $c = 250\ \mathrm{mm}$, for s21, and zoom-in (inset), as computed by wipl-d for the noncausal and causal models.

fig. 8 impulse response for $c = 250\ \mathrm{mm}$, for z21, and zoom-in (inset), as computed by wipl-d for the noncausal and causal models.

5. conclusion

this paper presents an analytical expression for the complex permittivity of wet soil, valid in a broad frequency range, which ensures a causal response in the time domain. the parameters of the formula are tuned to fit the measured data for soil and water in a broad range of frequencies. the impulse response, needed for direct analysis in the time domain, is derived, too. the discrepancies between the causal and noncausal responses, and their relations with the complex permittivity of the material, are illustrated through several examples of different dimensionality and complexity.
it is shown that in all cases the causal response has a crisp start, while the noncausal response has an early and slow leading edge. additionally, a model of a 3-d em system, being the most complex example, is used to test the present-day limits of some commercial em solvers. acknowledgement: the paper is a part of the research done within the project tr32005 of the serbian ministry of education, science, and technological development.

facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 283-300 https://doi.org/10.2298/fuee2202283s © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper optimization of the 3p keys kernel parameters by minimizing the ripple of the spectral characteristic nataša savić, zoran milivojević, zoran veličković academy of applied technical and preschool studies, niš, serbia abstract. the ideal interpolation kernel is described by the sinc function, and its spectral characteristic is the box function. due to the infinite length of the ideal kernel, it is not achievable. therefore, convolutional interpolation kernels of finite length, which should better approximate the ideal kernel in a specified interval, are formed.
the approximation function should have a small numerical complexity, so as to reduce the interpolation execution time. in the scientific literature, great attention is paid to the polynomial kernel of the third order. however, the time and spectral characteristics of the third-order polynomial kernels differ significantly from the shape of the ideal kernel. therefore, the accuracy of cubic interpolation is lower. by optimizing the kernel parameters, it is possible to better approximate the ideal kernel. this will increase the accuracy of the interpolation. the first part of the paper describes a three-parameter (3p) keys interpolation kernel, r. after that, the algorithm for optimizing the parameters of the 3p keys kernel is shown. first, the kernel is decomposed into components, and then the fourier transform is applied to each kernel component. in this way the spectral characteristic of the 3p keys kernel, h, was determined. then the spectral characteristic was expanded into a taylor series, ht. by requiring the elimination of the members of the taylor series that most strongly affect the ripple of the spectral characteristic, the optimal kernel parameters (αopt, βopt, γopt) were determined. the second part of the paper describes an experiment in which the interpolation accuracy of the 3p keys kernel was tested. parametric cubic convolution (pcc) interpolation, with the 3p kernel, was performed over the images from the test database. the test database is created from standard test images, which are intensively used in digital image processing. by analyzing the interpolation error, which is represented by the mean square error, mse, the accuracy of the interpolation was determined. the results (αopt, βopt, γopt, msemin) are presented in tables and graphs.
detailed comparative analysis showed higher interpolation accuracy with the proposed 3p keys interpolation kernel, compared to the interpolation accuracy with the 1p keys and 2p keys interpolation kernels. finally, the numerical values of the optimal kernel parameters, which are determined by the optimization algorithm proposed in this paper, were experimentally verified. key words: convolution, interpolation, interpolation kernel, pcc interpolation, keys kernel received november 23, 2021; received in revised form april 5, 2022 corresponding author: nataša savić academy of applied technical and preschool studies, generala milojka lešjanina 39, 18000 niš, serbia e-mail: natasa.savic@akademijanis.edu.rs

1. introduction

interpolation is the process of estimating intermediate values between discrete samples of a continuous signal. among other things, interpolation can be realized by applying a convolution between a discrete signal and a continuous interpolation kernel. the interpolation kernel significantly affects the accuracy and execution time of interpolation [1]. for interpolation of band-limited signals, the ideal interpolation kernel is of the form sin(x)/x (denoted sinc), where -∞ ≤ x ≤ +∞ [1, 2]. the spectral characteristic of the sinc interpolation kernel is a rectangular function, hsinc. the sinc kernel cannot be practically realized because it has infinite length. for this reason, there is a need to truncate the sinc interpolation kernel to a finite length. as a consequence of the truncation, the spectral characteristic of the kernel deviates from the ideal, rectangular, characteristic, which leads to: a) ripple in the passband and stopband, and b) finite slope in the transition band. the idea is to approximate the truncated sinc interpolation kernel with a low-degree polynomial function. in this way, the interpolation kernel has a small numerical complexity, and thus, allows a higher interpolation speed.
these features of the kernel are especially important when it is implemented in real-time systems. signal interpolation using finite length interpolation kernels is realized by applying convolution. a polynomial zeroth-degree kernel allows interpolation by rounding to the nearest neighbor [3, 4]. nearest-neighbor interpolation is the most efficient in terms of computational speed, but it generates the largest interpolation error. a linear, first-degree interpolation kernel is described in [5]. a quadratic, second-degree interpolation kernel is described in [3, 6]. a cubic, third-degree interpolation kernel, intended for parametric cubic convolution, pcc, is described in [1, 5]. using numerical examples, it has been shown that cubic convolution is more precise than nearest-neighbor and linear interpolation [7-9]. the parameterization of the cubic interpolation kernel, by introducing the kernel parameter α, is shown in [1]. the paper [1] is one of the basic papers in the field of interpolation in digital image processing. later, in the scientific literature, the parametric interpolation kernel from [1] was named, in honor of the author, the 1p keys interpolation kernel. by changing the value of the kernel parameter α, the characteristics of the kernel can be changed and, in this way, adjusted to the signal that is interpolated. the process of changing the kernel parameter for customization is called parameter optimization. in [1], the optimization of the parameter α was performed by minimizing the interpolation error, expanding the error function into a taylor series around f = 0 (maclaurin series). in this way, it is shown that the optimal value of the parameter is αopt = -0.5. the ripple of the spectral characteristic is reduced by eliminating the members of the taylor series that predominantly influence the ripple.
in [10], the ripple of the spectral characteristic was reduced by eliminating the members of the taylor series that affect the concavity of the spectral characteristic. in [11], the reduction of ripple of the spectral characteristic was achieved with α = -0.5. the construction of a two-parameter interpolation kernel is shown in [12, 13]. this kernel is based on the extended parameterization of the 1p keys kernel [1]. in the scientific literature, this kernel is called the 2p keys kernel. optimal values of kernel parameters (αopt = 0.1, βopt = 0.2975) in estimating the fundamental frequency of the speech signal were determined in [14]. further expansion of parameterization, in order to improve the characteristics of the kernel, led to the construction of the 3p keys kernel [15]. the optimal values of kernel parameters in the estimates of the fundamental frequency of the speech signal are αopt = 1.7, βopt = -4.7, γopt = -3.8. a detailed analysis of the estimation error, presented using mse, shows a higher accuracy of estimation using 3p kernels compared to the use of 1p keys and 2p keys kernels [15]. in the paper [16] the results on the precision of the interpolation of audio signals, which was realized using the 3p keys kernel [15], are presented. audio test signals were acquired by recording g tones (g1-g7) on a steinway b concert piano. a detailed comparative analysis showed that the interpolation error with the 3p keys kernel was: a) 7.374 times smaller than with the 1p keys kernel, and b) 2.4166 times smaller than with the 2p keys kernel. encouraged by the results of these papers, which indicate that increasing the number of interpolation kernel parameters reduces the interpolation error, the authors of this paper performed optimization of the 3p keys kernel parameters, in order to increase similarity with the ideal kernel, sinc.
the optimized kernel created in this way will further reduce the interpolation error. in this paper, the process of optimization of the parameters of the 3p keys kernel [15] in the spectral domain is presented. optimization of kernel parameters was performed by minimizing the ripple of the spectral characteristic. the first part of the paper describes the algorithm for optimizing kernel parameters. first, by applying the fourier transform to the 3p kernel, r, the analytical form of the spectral characteristic, h, was determined. after that, the spectral characteristic was approximated using the taylor series, ht. the ripple reduction was achieved by eliminating the members of the taylor series, ht, which have a dominant effect on the ripple increase. then, the degree of similarity of the spectral characteristics of the ideal sinc kernel, hsinc, and the optimized kernel, hopt, was determined by comparative analysis. mse was used as a measure of similarity [11]. finally, the optimal parameters, (αopt, βopt, γopt), were determined based on the minimum of the mse. the second part of the paper presents the results of an experiment in which the optimal parameters for 1p keys, 2p keys and 3p keys kernels were determined. an algorithm for interpolation of test images, estimation of the interpolation error, and determination of experimental optimal parameters is described. for the purposes of the experiment, the image test base was formed. the image test base consists of: a) standard test images for digital signal processing (lena, barbara, cameraman, peppers, boats, tulips, and watch), and b) images from the bsds500 image base [17]. test images from the bsds500 base have numeric labels, so they will be named in the same way later in this paper. by applying the algorithm to each image, the optimal parameters and the corresponding estimation errors were determined. the results are presented in tables and graphs.
finally, a comparative analysis of the experimental results with the results obtained by optimizing the spectral characteristic was performed. the comparative analysis determines: a) the accuracy of interpolation using mse and b) the accuracy of estimating kernel parameters using absolute error. in the last part of the paper, an analysis of the execution time of all analyzed kernels was performed. testing of the execution time was performed on a desktop computer s2ac43p, processor: intel (r) pentium (r), cpu: g3220 3 ghz, ram: 8 gb and a windows 10 operating system. the matlab r2017b program was applied (to determine the execution time, the tic and toc functions are used). it should be emphasized that the realized experiment, within which the algorithm for pcc interpolation is described, is intended exclusively for the comparative analysis of the interpolation accuracy of the 1p keys, 2p keys and 3p keys kernels. it was implemented using matlab. therefore, the interpolation execution time is, in this case, not of primary importance, because no real-time condition is set. the paper is organized as follows: section 2 describes the 3p keys kernel. section 3 describes the 3p keys kernel parameterization algorithm. experimental results and comparative analysis are presented in section 4. section 5 is the conclusion.

2. keys parametric interpolation kernels

in paper [1], a fundamental paper in the field of convolutional interpolation, the author defined a parametric interpolation kernel. the kernel was intended for image interpolation. later, in the scientific literature, the interpolation kernel from [1] was called the 1p keys kernel.

2.1. 1p keys kernel

the proposed 1p keys kernel [1] is defined as:

$$r(x) = \begin{cases} (\alpha+2)|x|^3 - (\alpha+3)|x|^2 + 1, & |x| \le 1, \\ \alpha|x|^3 - 5\alpha|x|^2 + 8\alpha|x| - 4\alpha, & 1 < |x| \le 2, \\ 0, & |x| > 2, \end{cases} \tag{1}$$

where α is the parameter of the 1p keys kernel.
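As an illustration of eq. (1), the following minimal Python sketch evaluates the 1P Keys kernel and uses it to interpolate a unit-spaced 1-D signal at a fractional position. The function names `keys_1p` and `interpolate`, and the choice to treat samples outside the signal as zero, are assumptions made for this sketch; they are not taken from the paper, whose own implementation is in MATLAB.

```python
import math


def keys_1p(x, alpha=-0.5):
    """1P Keys kernel of eq. (1); alpha = -0.5 is the optimal value quoted from [1]."""
    ax = abs(x)
    if ax <= 1.0:
        return (alpha + 2.0) * ax**3 - (alpha + 3.0) * ax**2 + 1.0
    if ax <= 2.0:
        return alpha * (ax**3 - 5.0 * ax**2 + 8.0 * ax - 4.0)
    return 0.0


def interpolate(samples, t, alpha=-0.5):
    """Estimate the signal at fractional index t from unit-spaced samples.

    The kernel has length 4, so only the four nearest samples contribute.
    Samples outside the signal are treated as zero (an assumption of this sketch).
    """
    base = math.floor(t)
    value = 0.0
    for n in range(base - 1, base + 3):
        if 0 <= n < len(samples):
            value += samples[n] * keys_1p(t - n, alpha)
    return value
```

For example, `interpolate([0.0, 1.0, 2.0, 3.0], 1.5)` reproduces the midpoint of a linear ramp, because the four kernel weights at offsets ±0.5 and ±1.5 sum to one for any α.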
the length of this kernel is l = 4.

2.2. 2p keys kernel

a modification of the 1p keys kernel, with the introduction of the second kernel parameter, with length l = 6, is shown in [13]. the analytical form of the 2p keys kernel is:

$$r(x) = \begin{cases} (\alpha-\beta+2)|x|^3 - (\alpha-\beta+3)|x|^2 + 1, & |x| \le 1, \\ \alpha|x|^3 - (5\alpha-\beta)|x|^2 + (8\alpha-3\beta)|x| - (4\alpha-2\beta), & 1 < |x| \le 2, \\ \beta|x|^3 - 8\beta|x|^2 + 21\beta|x| - 18\beta, & 2 < |x| \le 3, \\ 0, & |x| > 3, \end{cases} \tag{2}$$

where α and β are the parameters of the kernel. for β = 0 the 1p keys kernel is obtained. in [12-14], it was shown that the precision of the pcc interpolation with the 2p keys kernel was increased compared to the pcc interpolation with the 1p keys kernel.

2.3. 3p keys kernel

the results in [12-14] show that the precision of the pcc interpolation with the 2p keys kernel, compared to interpolation with the 1p keys kernel, is increased. with the idea of further increasing the interpolation accuracy, the parameterization of the 1p kernel, using three parameters, was performed [15]. the three-parameter kernel is called the 3p keys kernel. the analytical form of the 3p keys kernel is:

$$r(x) = \begin{cases} (\alpha-\beta+\gamma+2)|x|^3 + (-\alpha+\beta-\gamma-3)|x|^2 + 1, & |x| \le 1, \\ \alpha|x|^3 + (-5\alpha+\beta-\gamma)|x|^2 + (8\alpha-3\beta+3\gamma)|x| + (-4\alpha+2\beta-2\gamma), & 1 < |x| \le 2, \\ \beta|x|^3 + (-8\beta+\gamma)|x|^2 + (21\beta-5\gamma)|x| + (-18\beta+6\gamma), & 2 < |x| \le 3, \\ \gamma|x|^3 - 11\gamma|x|^2 + 40\gamma|x| - 48\gamma, & 3 < |x| \le 4, \\ 0, & |x| > 4, \end{cases} \tag{3}$$

where α, β and γ are the parameters of the 3p keys kernel. as an example, fig. 1.a shows the time characteristics of the ideal interpolation kernel, rsinc, and the 3p keys kernel, rαβγ, for kernel parameters α = -1.2, β = -0.1 and γ = -0.1.

fig.
1 characteristics of the ideal sinc and 3p keys kernels (α = -1.2, β = -0.1, γ = -0.1): a) time characteristics (rsinc, rαβγ) and b) spectral characteristics (hsinc, hαβγ)

3. optimization of the keys 3p kernel parameters

the spectral characteristic, h, of the 3p keys kernel (eq. (3)) is different from the spectral characteristic, hsinc, of the ideal interpolation kernel rsinc (fig. 1.b). the deviation of the spectral characteristic h from hsinc is described as the ripple of the spectral characteristic. the optimization process minimizes the difference between the spectral characteristics h and hsinc. optimization involves selecting the kernel parameters α, β, and γ, so as to minimize the mean square error between h and hsinc. in this way, the optimal parameters of the 3p keys kernel αopt, βopt and γopt are obtained.

3.1. algorithm for minimizing the ripple of the spectral characteristic

this part of the paper conducts the optimization of the keys 3p kernel parameters by minimizing the ripple of the spectral characteristic. the algorithm for parameter optimization consists of the following steps:

input: r, the 3p keys kernel.
output: αopt, βopt and γopt, the kernel parameters.
step 1: decomposition of the 3p keys kernel r into its components r0, r1, r2 and r3.
step 2: determining the spectral characteristic h(f) by applying the fourier transform to the kernel components r0, r1, r2 and r3.
step 3: expansion of the spectral characteristic h(f) into the taylor series ht(f).
step 4: eliminating the coefficients of the members of ht(f) which dominantly affect the ripple of the spectral characteristic, and determining the optimal kernel parameters αopt, βopt and γopt.

a more detailed explanation of the algorithm steps (step 1 to step 4) is given below.

3.2. kernel components (step 1)

the 3p keys kernel r (eq. (3)) can be represented as the sum of the kernel components:

$$r(x) = r_0(x) + \alpha\, r_1(x) + \beta\, r_2(x) + \gamma\, r_3(x), \tag{4}$$

where

$$r_0(x) = \begin{cases} 2|x|^3 - 3|x|^2 + 1, & |x| \le 1, \\ 0, & |x| > 1, \end{cases} \tag{5}$$

$$r_1(x) = \begin{cases} |x|^3 - |x|^2, & |x| \le 1, \\ |x|^3 - 5|x|^2 + 8|x| - 4, & 1 < |x| \le 2, \\ 0, & |x| > 2, \end{cases} \tag{6}$$

$$r_2(x) = \begin{cases} -|x|^3 + |x|^2, & |x| \le 1, \\ |x|^2 - 3|x| + 2, & 1 < |x| \le 2, \\ |x|^3 - 8|x|^2 + 21|x| - 18, & 2 < |x| \le 3, \\ 0, & |x| > 3, \end{cases} \tag{7}$$

and

$$r_3(x) = \begin{cases} |x|^3 - |x|^2, & |x| \le 1, \\ -|x|^2 + 3|x| - 2, & 1 < |x| \le 2, \\ |x|^2 - 5|x| + 6, & 2 < |x| \le 3, \\ |x|^3 - 11|x|^2 + 40|x| - 48, & 3 < |x| \le 4, \\ 0, & |x| > 4, \end{cases} \tag{8}$$

are the components of the 3p keys kernel. fig. 2 shows the components of the 3p keys kernel r0, r1, r2 and r3.

fig. 2 3p keys kernel components: r0, r1, r2 and r3

3.3. spectral characteristic of the 3p keys kernel (step 2)

in order to optimize the parameters α, β, and γ of the 3p keys kernel r in the spectral domain, the spectral characteristic of the kernel, h, was obtained by using the fourier transform (ft):

$$h(f) = ft(r(x)) = ft\big(r_0(x) + \alpha r_1(x) + \beta r_2(x) + \gamma r_3(x)\big) = h_0(f) + \alpha h_1(f) + \beta h_2(f) + \gamma h_3(f), \tag{9}$$

where h0, h1, h2 and h3 are the spectral components of the 3p keys kernel:

$$h_0(f) = \int_{-\infty}^{\infty} r_0(x)\, e^{-2\pi i x f}\, dx, \tag{10}$$

$$h_1(f) = \int_{-\infty}^{\infty} r_1(x)\, e^{-2\pi i x f}\, dx, \tag{11}$$

$$h_2(f) = \int_{-\infty}^{\infty} r_2(x)\, e^{-2\pi i x f}\, dx, \tag{12}$$

and

$$h_3(f) = \int_{-\infty}^{\infty} r_3(x)\, e^{-2\pi i x f}\, dx. \tag{13}$$

by substituting eq. (5) in eq. (10), the following is obtained:

$$h_0(f) = \int_{-1}^{0} (-2x^3 - 3x^2 + 1)\, e^{-2\pi i x f}\, dx + \int_{0}^{1} (2x^3 - 3x^2 + 1)\, e^{-2\pi i x f}\, dx. \tag{14}$$

by substituting eq. (6) in eq. (11), the following is obtained:

$$h_1(f) = \int_{-2}^{-1} (-x^3 - 5x^2 - 8x - 4)\, e^{-2\pi i x f}\, dx + \int_{-1}^{0} (-x^3 - x^2)\, e^{-2\pi i x f}\, dx + \int_{0}^{1} (x^3 - x^2)\, e^{-2\pi i x f}\, dx + \int_{1}^{2} (x^3 - 5x^2 + 8x - 4)\, e^{-2\pi i x f}\, dx. \tag{15}$$

by substituting eq. (7) in eq.
(12) is obtained:

$$\begin{aligned} h_2(f) = {} & \int_{-3}^{-2} (-x^3 - 8x^2 - 21x - 18)\, e^{-2\pi i x f}\, dx + \int_{-2}^{-1} (x^2 + 3x + 2)\, e^{-2\pi i x f}\, dx + \int_{-1}^{0} (x^3 + x^2)\, e^{-2\pi i x f}\, dx \\ & + \int_{0}^{1} (-x^3 + x^2)\, e^{-2\pi i x f}\, dx + \int_{1}^{2} (x^2 - 3x + 2)\, e^{-2\pi i x f}\, dx + \int_{2}^{3} (x^3 - 8x^2 + 21x - 18)\, e^{-2\pi i x f}\, dx. \end{aligned} \tag{16}$$

by substituting eq. (8) in eq. (13), the following is obtained:

$$\begin{aligned} h_3(f) = {} & \int_{-4}^{-3} (-x^3 - 11x^2 - 40x - 48)\, e^{-2\pi i x f}\, dx + \int_{-3}^{-2} (x^2 + 5x + 6)\, e^{-2\pi i x f}\, dx + \int_{-2}^{-1} (-x^2 - 3x - 2)\, e^{-2\pi i x f}\, dx \\ & + \int_{-1}^{0} (-x^3 - x^2)\, e^{-2\pi i x f}\, dx + \int_{0}^{1} (x^3 - x^2)\, e^{-2\pi i x f}\, dx + \int_{1}^{2} (-x^2 + 3x - 2)\, e^{-2\pi i x f}\, dx \\ & + \int_{2}^{3} (x^2 - 5x + 6)\, e^{-2\pi i x f}\, dx + \int_{3}^{4} (x^3 - 11x^2 + 40x - 48)\, e^{-2\pi i x f}\, dx. \end{aligned} \tag{17}$$

after applying euler's formula and partial integration, the spectral components of the kernel can be written in the following form:

$$h_0(f) = \frac{6\sin^2(\pi f) - 3\pi f \sin(2\pi f)}{2\pi^4 f^4}, \tag{18}$$

$$h_1(f) = \frac{3\sin^2(2\pi f) - 4\pi f \sin(2\pi f) - \pi f \sin(4\pi f)}{2\pi^4 f^4}, \tag{19}$$

$$h_2(f) = \frac{-3\sin^2(\pi f) - 3\sin^2(2\pi f) + 3\sin^2(3\pi f)}{2\pi^4 f^4} + \frac{3\pi f \sin(2\pi f) - 3\pi f \sin(4\pi f) - \pi f \sin(6\pi f)}{2\pi^4 f^4}, \tag{20}$$

and

$$h_3(f) = \frac{3\big(\sin^2(\pi f) - \sin^2(3\pi f) + \sin^2(4\pi f)\big)}{2\pi^4 f^4} + \frac{\pi f \big(-3\sin(2\pi f) + 2\sin(4\pi f) - 3\sin(6\pi f) - \sin(8\pi f)\big)}{2\pi^4 f^4}. \tag{21}$$

spectral components h0 (eq. (18)), h1 (eq. (19)), h2 (eq. (20)) and h3 (eq. (21)) are shown in fig. 3.

fig. 3 spectral components h0, h1, h2 and h3 of the 3p keys kernel

3.4. optimal kernel parameters (step 3, step 4)

in order to determine the optimal parameters of the 3p keys kernel r in the spectral domain, the taylor expansion ht of the spectral characteristic h (eq. (9)) was determined.

(step 3) by expansion into a taylor series in the neighborhood of f = 0 (maclaurin series), the spectral components of the kernel were obtained:

$$h_{0t}(f) = 1 - \frac{4}{15}(\pi f)^2 + \frac{1}{35}(\pi f)^4 - \frac{8}{4725}(\pi f)^6 + \frac{2}{31185}(\pi f)^8 + \dots, \tag{22}$$

$$h_{1t}(f) = -\frac{8}{15}(\pi f)^2 + \frac{16}{35}(\pi f)^4 - \frac{232}{1575}(\pi f)^6 + \frac{4112}{155925}(\pi f)^8 + \dots, \tag{23}$$

$$h_{2t}(f) = -\frac{8}{15}(\pi f)^2 + \frac{272}{105}(\pi f)^4 - \frac{4232}{1575}(\pi f)^6 + \frac{205808}{155925}(\pi f)^8 + \dots, \tag{24}$$

and

$$h_{3t}(f) = -\frac{16}{15}(\pi f)^2 + \frac{256}{35}(\pi f)^4 - \frac{25904}{1575}(\pi f)^6 + \frac{2640832}{155925}(\pi f)^8 + \dots. \tag{25}$$

by substituting eqs. (22)-(25) in eq. (9), the following is obtained:

$$\begin{aligned} h_t(f) & = h_{0t}(f) + \alpha h_{1t}(f) + \beta h_{2t}(f) + \gamma h_{3t}(f) \\ & = 1 - \frac{4}{15}(1 + 2\alpha + 2\beta + 4\gamma)(\pi f)^2 + \frac{1}{105}(3 + 48\alpha + 272\beta + 768\gamma)(\pi f)^4 \\ & \quad - \frac{8}{4725}(1 + 87\alpha + 1587\beta + 9714\gamma)(\pi f)^6 + o(f^8). \end{aligned} \tag{26}$$

(step 4) the minimization of the ripple of the spectral characteristic (eq. (26)) is carried out by eliminating the dominant members of the spectral characteristic:

$$\begin{cases} 1 + 2\alpha + 2\beta + 4\gamma = 0, \\ 3 + 48\alpha + 272\beta + 768\gamma = 0, \\ 1 + 87\alpha + 1587\beta + 9714\gamma = 0. \end{cases} \tag{27}$$

solving the system of equations eq. (27) gives:

$$\alpha_{opt} = -\frac{4945}{8064} \approx -0.6132, \qquad \beta_{opt} = \frac{409}{2688} \approx 0.1522, \qquad \gamma_{opt} = -\frac{157}{8064} \approx -0.0195. \tag{28}$$

by substituting the optimal parameter αopt = -0.5 [11], the optimal interpolation 1p keys kernel, ropt_1p, was obtained:

$$r_{opt\_1p}(x) = \begin{cases} 1.5|x|^3 - 2.5|x|^2 + 1, & |x| \le 1, \\ -0.5|x|^3 + 2.5|x|^2 - 4|x| + 2, & 1 < |x| \le 2, \\ 0, & |x| > 2. \end{cases} \tag{29}$$

the spectral characteristic of the 1p keys kernel, hopt_1p, is shown in fig. 4. by substituting the optimal parameters αopt = -0.5938, βopt = 0.0938 [12], the optimal interpolation 2p keys kernel, ropt_2p, was obtained:

$$r_{opt\_2p}(x) = \begin{cases} 1.3124|x|^3 - 2.3124|x|^2 + 1, & |x| \le 1, \\ -0.5938|x|^3 + 3.0628|x|^2 - 5.0318|x| + 2.5628, & 1 < |x| \le 2, \\ 0.0938|x|^3 - 0.7504|x|^2 + 1.9698|x| - 1.6884, & 2 < |x| \le 3, \\ 0, & |x| > 3. \end{cases} \tag{30}$$

the spectral characteristic of the 2p keys kernel, hopt_2p, is shown in fig. 4.
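The linear system of eq. (27) is small enough to check by exact rational arithmetic. The sketch below is a generic Gauss-Jordan elimination written for this illustration (the helper name `solve_linear_system` is hypothetical and not part of the paper); it moves the constant terms of eq. (27) to the right-hand side and recovers the fractions quoted in eq. (28).

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve matrix * unknowns = rhs by Gauss-Jordan elimination over exact rationals."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs] with Fraction entries.
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Normalise the pivot row, then clear the column in all other rows.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Eq. (27) rewritten as A * (alpha, beta, gamma)^T = b.
coefficients = [
    [2, 2, 4],
    [48, 272, 768],
    [87, 1587, 9714],
]
constants = [-1, -3, -1]

alpha_opt, beta_opt, gamma_opt = solve_linear_system(coefficients, constants)
print(alpha_opt, beta_opt, gamma_opt)
```

Using `Fraction` rather than floating point keeps the result exact, so the rounded decimals in eq. (28) can be compared against the true rational values.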
by substituting the optimal parameters αopt = -0.6132, βopt = 0.1522, γopt = -0.0195 (eq. (28)) in eq. (3), the optimal interpolation 3p keys kernel, ropt_3p, was obtained:

$$r_{opt\_3p}(x) = \begin{cases} 1.2151|x|^3 - 2.2151|x|^2 + 1, & |x| \le 1, \\ -0.6132|x|^3 + 3.2377|x|^2 - 5.4207|x| + 2.7962, & 1 < |x| \le 2, \\ 0.1522|x|^3 - 1.2371|x|^2 + 3.2937|x| - 2.8566, & 2 < |x| \le 3, \\ -0.0195|x|^3 + 0.2145|x|^2 - 0.78|x| + 0.936, & 3 < |x| \le 4, \\ 0, & |x| > 4. \end{cases} \tag{31}$$

the spectral characteristic of the 3p keys kernel is shown in fig. 4. moreover, fig. 4 shows the spectral characteristic of the ideal rsinc kernel (hsinc). paper [11] presents the total mean square error, mset, i.e. the difference between the spectral characteristic h and the ideal box characteristic hsinc:

$$mse_t = \frac{1}{k} \sum_{k=0}^{k-1} \big| h(f_k) - h_{sinc}(f_k) \big|^2. \tag{32}$$

fig. 4 spectral characteristic of the ideal interpolation kernel hsinc and optimal spectral characteristics h of: a) 1p (αopt = -0.5), b) 2p (αopt = -0.5938, βopt = 0.0938) and c) 3p (αopt = -0.6132, βopt = 0.1522, γopt = -0.0195) keys kernel

4. experimental results and analysis

4.1. experiment

an experiment was carried out with the aim of determining: a) the interpolation accuracy with the 3p keys kernel, and b) the interpolation execution time with the 3p keys kernel, te, in relation to interpolations with the 1p and 2p keys kernels. interpolations were performed on test images, ti, from the image base. the image base is created from: a) some standard test images used in digital image processing, and b) test images from the bsds500 base. some test images from the image base are in color (rgb) and some are in black-white (y). in this experiment, interpolations were performed on black-white images. therefore, color images were transformed into black-white images in accordance with the colorimetric equation y = 0.3r + 0.59g + 0.11b. the experiment was performed as follows.
first, the experimental optimal values of the kernel parameters for: a) 1p keys (αopt), b) 2p keys (αopt, βopt) and c) 3p keys (αopt, βopt, γopt) were determined, using the algorithm described below. after that, a comparative analysis was performed of the estimation error of the optimal kernel parameters obtained: a) by optimizing the ripple of the spectral characteristic and b) by experiments. finally, a comparative analysis of interpolation accuracy between the proposed 3p keys kernel and the 1p keys and 2p keys kernels was performed. for these reasons, the test image, ti, which, for analysis purposes, is presented as a two-dimensional matrix with dimensions (l x k), was transformed into a one-dimensional matrix. the transformation was performed by connecting the rows of the test image matrix one after the other, and, in this way, a one-dimensional matrix, x, with dimensions n = l x k, was obtained (these activities are realized by the algorithm described in section 4.3). the interpolation is organized as follows. the interpolation of the intensity of the pixel i, x(i), was performed by convolution between the interpolation kernel and the intensities of the pixels x(i-k), x(i-k+1), ..., x(i+k), where k is the length of the interpolation kernel. the interpolated value of pixel i is x̂_i. on the other hand, the intensity of the pixel i is known (x(i)), and, in the experiment, it is considered to be the true value of the pixel intensity. further analysis involved defining the interpolation error. the interpolation error was defined by mse (eq. (32)), which was calculated between the true, x(i), and the interpolated intensity, x̂_i, of the pixel i. mse was used in a comparative analysis of interpolation accuracy between the interpolation results obtained with the 1p, 2p and 3p keys kernels. the interpolation results (msemin) are presented using graphs and tables.
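The pre-processing described above (luminance conversion with y = 0.3r + 0.59g + 0.11b, row-by-row flattening, and the mse between true and interpolated intensities) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the representation of an image as nested lists of (r, g, b) tuples are assumptions, and the paper's own experiment was implemented in MATLAB.

```python
def to_luminance(rgb_image):
    """Convert rows of (r, g, b) pixels to luminance with y = 0.3r + 0.59g + 0.11b."""
    return [[0.3 * r + 0.59 * g + 0.11 * b for (r, g, b) in row] for row in rgb_image]


def flatten_rows(image):
    """Concatenate the image rows one after the other into a 1-D list.

    With 0-based indices, pixel (row, col) of an image with k columns
    ends up at position row * k + col.
    """
    return [pixel for row in image for pixel in row]


def mean_square_error(true_values, estimates):
    """Mean of the squared differences between true and estimated values."""
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - e) ** 2 for t, e in zip(true_values, estimates)) / len(true_values)
```

A 2 x 3 image therefore becomes a vector of n = 6 samples, which is the form in which the pcc interpolation of section 4.3 processes it.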
by comparative analysis of msemin, the precision of interpolation with the 3p keys kernel, in relation to the precision of interpolation with the 1p and 2p keys kernels, was determined. in addition, the execution time of the pcc interpolation, te, was determined. testing of the execution time was performed on a desktop computer s2ac43p, processor: intel (r) pentium (r), cpu: g3220 3 ghz, ram: 8 gb and a windows 10 operating system. the matlab r2017b program was applied (to determine the execution time, te, the tic and toc functions are used). execution time was measured for: a) complete convolution with kernels (eq. (1), eq. (2) and eq. (3)), where, based on the kernel parameters α, β and γ, the coefficients of the third-order polynomials are calculated, and then the values of the polynomials are calculated, b) convolution with the optimized kernel parameters (eq. (29), eq. (30) and eq. (31)), where the coefficients of the polynomials were previously calculated, and, after that, the values of the polynomials are calculated, and c) convolutional kernel execution time, without interpolation. all interpolation execution times were determined as the arithmetic mean of the results for 100000 interpolations.

4.2. image base

for the purpose of the experiment, in which the accuracy of pcc interpolation of images is tested, an image base was created. the image base consists of: a) standard test images for digital signal processing, and b) images from the bsds500 image base [17]. standard test images are: lena (512 x 512, rgb) (fig. 5.a), barbara (225 x 675, rgb) (fig. 5.b), cameraman (225 x 675, y) (fig. 5.c), peppers (225 x 675, rgb) (fig. 5.d), boats (225 x 675, rgb) (fig. 5.e), tulips (512 x 768, rgb) (fig. 5.f), and watch (768 x 1024, rgb) (fig. 5.g). test images from the bsds500 base have numeric labels: 3096 (321 x 481, rgb) (fig. 5.h), 14037 (321 x 481, rgb) (fig.
5.i), 295087 (321 x 481, rgb) ( fig. 5.j), 126007 (321 x 481, rgb) (fig. 5.k), 260058 (321 x 481, rgb) (fig. 5.l), 160068 (321 x 481, rgb) (fig. 5.m), 241004 (321 x 481, rgb) (fig. 5.n), 197017 (321 x 481, rgb) (fig. 5.o), 143090 (321 x 481, rgb) (fig. 5.p). a) b) c) d) e) f) g) h) i) j) k) l) m) n) o) p) fig. 5 test image for digital image processing: a) lena, b) barbara, c) cameraman, d) pappers, e) boats, f) tulips, d) watch. test images from bsds500 database, with numeric labels: h) 3096, i) 14037, j) 295087, k) 12607, l) 260058, m) 160068, n) 241004, o) 197017, p) 143090 optimization of the 3p keys kernel parameters ... 295 4.3. algorithm for interpolation error determining the following algorithm performs interpolation of the test images, determines the interpolation error and determines the mse depending on the parameters α, β and γ. optimal parameters were determined by minimizing mse. algorithm is realized in the following steps: input: (r0, r1, r2, r3) – 3p keys kernel parameters, (αmin, δα, αmax, βmin, δβ, βmax, γmin, δγ, γmax) parameter boundaries and iteration steps, l – kernel length, ti (l x k) test image. output: αopt, βopt, γopt. optimal parameters. mseα, mseαβ, mseαβγ. step 1: converting a color image to a black-white image. if test image == color image 0.3 0.59 0.11 i t r g b=  +  +  end step 2: transformation of the image ti (l x k) into a one-dimensional matrix x: for = 1 : l for k = 1 : k (( 1) ) ( , ) i x k k t k−  + = end k end the dimensions of the one-dimensional matrix x are (1, n), where n = l x k. for γ = γmin : δγ :γmax. for β = βmin : δβ : βmax for α = αmin : δα : αmax step 3: construction of the kernel: 0 1 2 3 r r r r r  = + + + , step 4: the length of interpolation frame is: 2 1m l=  − for i = 1: n-m+1, step 5: selecting the i-th frame: xi = x (1: i+m-1) step 6: estimation of ˆ i x by applying pcc: ˆ [1: 2 : ] i i x x m r=  , where the symbol  stands for convolution. 
step 7: estimation error is: ˆ( ) ( ) i i e i x l x= − end i step 8: mean square error of estimation of 1p kernel: 1 2 1 ( ) 1 ( 1) | ( ) | n m k mse n m e k  − + = = − +  , end α step 9: mean square error of estimation of 2p kernel: ( )mse mse  = , end β step 10: mean square error of estimation of 3p kernel: ( )mse mse  = , end γ step 11:. optimal values of 3p kernel parameters: , , ( , , ) arg min( ) opt opt opt mse       = . 296 n. savić, z. milivojević, z. veličković the described algorithm had the purpose of testing the interpolation error with the 3p keys kernel in relation to the interpolation error with the 1p and 2p keys kernels. the algorithm was implemented in matlab, and, except for testing, is not intended for realtime systems. therefore, the execution time of the algorithm is not of dominant importance. however, in the experiment, using the matlab function tic and toc, for the case of applying 1p, 2p and 3p keys kernels, the interpolation execution time, te, is determined. based on the execution time, a comparative analysis was performed. 4.4. experimental results using the test algorithm described in section 4.3, interpolation of the test images was performed. interpolation for some values of α, β and γ parameters from the specified range has been performed. in addition, interpolation with the 1p, 2p and 3p keys kernels with all parameters from the range was performed. for each interpolation, the interpolation error, mse, is determined. based on the minimum interpolation error, msemin, the optimal interpolation kernel parameter was determined. figure 5.a shows the dependence of the mseα on the parameter α, for the 1p keys kernel (test image boats). the optimal parameter, αopt, was determined as ( ) arg min( )opt mse   = . figure 5.b shows the dependence of mseαβ on the parameters α and β for the 2p keys kernel (test image boats). 
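the parameter sweep of section 4.3 and the arg min selection can be sketched in a few lines. the sketch below is only an illustration under stated assumptions: it is written in python with numpy rather than in the matlab used for the experiment, it uses the classic one-parameter (1p) keys kernel instead of the 3p kernel of eq. (1) to eq. (3), it works on a synthetic one-dimensional signal instead of a test image, and the function names (keys_kernel, midpoint_mse, optimal_alpha) are hypothetical.

```python
import numpy as np

def keys_kernel(s, alpha):
    """Classic one-parameter (1P) Keys cubic convolution kernel r(s).

    Assumption: this is the standard Keys kernel, used here as a stand-in
    for the 3P kernel, whose definition is not reproduced in this sketch.
    """
    s = np.abs(np.asarray(s, dtype=float))
    near = (alpha + 2) * s**3 - (alpha + 3) * s**2 + 1   # |s| <= 1
    far = alpha * (s**3 - 5 * s**2 + 8 * s - 4)          # 1 < |s| <= 2
    return np.where(s <= 1, near, np.where(s <= 2, far, 0.0))

def midpoint_mse(x, alpha):
    """MSE between the odd samples of x and their estimates.

    The even samples x[0], x[2], ... are treated as known; every odd sample
    lies halfway between two of them and is estimated from its four nearest
    even neighbours (kernel length 4, frame length 2*4 - 1 = 7).
    """
    known = x[0::2]
    # Kernel weights at distances 1.5, 0.5, 0.5, 1.5 from the midpoint.
    w = keys_kernel(np.array([-1.5, -0.5, 0.5, 1.5]), alpha)
    est = np.convolve(known, w, mode="valid")   # weights are symmetric
    truth = x[3:3 + 2 * len(est):2]             # sample centred in each frame
    return float(np.mean((truth - est) ** 2))

def optimal_alpha(x, alphas):
    """Grid search: return (alpha_opt, MSE_min) over the candidate alphas."""
    mses = np.array([midpoint_mse(x, a) for a in alphas])
    k = int(np.argmin(mses))
    return float(alphas[k]), float(mses[k])

if __name__ == "__main__":
    t = np.arange(512)
    # Hypothetical test signal: two sinusoids instead of an image row.
    x = np.sin(2 * np.pi * t / 40) + 0.5 * np.sin(2 * np.pi * t / 13)
    alphas = np.arange(-1.0, 1.0 + 1e-9, 0.01)
    a_opt, mse_min = optimal_alpha(x, alphas)
    print(f"alpha_opt = {a_opt:.2f}, MSE_min = {mse_min:.3e}")
```

extending the sketch to two or three parameters would only add nested loops over β and γ around the same mse evaluation, which is why the cost of the exhaustive search grows with the product of the three grid sizes.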
for the 2p keys kernel, the optimal parameters αopt and βopt were determined as $(\alpha_{opt}, \beta_{opt}) = \arg\min_{\alpha,\beta}(\mathrm{MSE}_{\alpha\beta})$. the minimum interpolation errors, msemin, and the corresponding optimal kernel parameters for all test images from the image base are shown in: a) table 1 (1p keys, αopt, mse1p), b) table 2 (2p keys, αopt, βopt, mse2p) and c) table 3 (3p keys, αopt, βopt, γopt, mse3p). table 4 shows the execution time of pcc convolution for: a) complete convolution with the kernels (label in the table: int1) (eq. (1), eq. (2) and eq. (3)), b) convolution with the optimized kernel parameters (label in the table: int2) (eq. (29), eq. (30) and eq. (31)) and c) the convolutional kernel alone, without interpolation (label in the table: kert). all interpolation execution times were determined as the arithmetic mean over 100000 interpolations.

fig. 5 dependence of mse on kernel parameters for the test image boats: a) 1p keys kernel and b) 2p keys kernel

table 1 optimal parameter α and minimum mse for 1p keys kernel

| image base | image | αopt | mse1p |
|---|---|---|---|
| dsp test base | lena | -0.3000 | 11.3234 |
| dsp test base | barbara | -0.1000 | 247.0271 |
| dsp test base | cameraman | -0.5000 | 0.3133 |
| dsp test base | peppers | -0.6200 | 75.7521 |
| dsp test base | boats | -0.3000 | 263.2390 |
| dsp test base | tulips | -0.7000 | 14.5797 |
| dsp test base | watch | -0.4000 | 49.9283 |
| bsds500 base | 3096 | 0.200 | 0.7933 |
| bsds500 base | 14037 | -0.6000 | 10.0185 |
| bsds500 base | 295087 | 0.1000 | 2.3780 |
| bsds500 base | 126007 | -0.4000 | 19.1678 |
| bsds500 base | 260058 | -0.300 | 4.8327 |
| bsds500 base | 160068 | 0.6000 | 0.5835 |
| bsds500 base | 241004 | -0.300 | 6.4673 |
| bsds500 base | 197017 | 0.3000 | 6.4499 |
| bsds500 base | 143090 | -0.01 | 18.2042 |
| mean | | -0.1706 | 45.6911 |

table 2 optimal parameters α and β, and minimum mse for 2p keys kernel

| image base | image | αopt | βopt | mse2p |
|---|---|---|---|---|
| dsp test base | lena | -0.3000 | -0.1000 | 11.3137 |
| dsp test base | barbara | -0.1000 | 0 | 247.0271 |
| dsp test base | cameraman | -0.3000 | -0.2000 | 0.3114 |
| dsp test base | peppers | -0.5400 | 0.1000 | 75.2829 |
| dsp test base | boats | -0.4000 | 0.1000 | 262.7854 |
| dsp test base | tulips | -0.6000 | 0.2000 | 14.1536 |
| dsp test base | watch | 0 | 0.3000 | 49.2893 |
| bsds500 base | 3096 | 0.2100 | 0.0030 | 0.6346 |
| bsds500 base | 14037 | -0.300 | 0.200 | 7.9427 |
| bsds500 base | 295087 | 0.0400 | -0.0100 | 1.9018 |
| bsds500 base | 126007 | -0.4000 | 0.0100 | 15.3342 |
| bsds500 base | 260058 | -0.300 | -0.010 | 3.8658 |
| bsds500 base | 160068 | 0.7000 | 0.1000 | 0.4663 |
| bsds500 base | 241004 | -0.3000 | -0.0300 | 5.1729 |
| bsds500 base | 197017 | 0.4000 | 0.0900 | 5.1557 |
| bsds500 base | 143090 | -0.0100 | 0.0040 | 14.5632 |
| mean | | -0.1375 | 0.0473 | 44.7000 |

table 3 optimal parameters α, β and γ, and minimum mse for 3p keys kernel

| image base | image | αopt | βopt | γopt | mse3p |
|---|---|---|---|---|---|
| dsp test base | lena | -0.3000 | -0.1000 | -0.0500 | 11.3130 |
| dsp test base | barbara | -0.1000 | -0.3000 | -0.3000 | 242.1622 |
| dsp test base | cameraman | 0.3000 | -0.1000 | 0.1000 | 0.3113 |
| dsp test base | peppers | -0.5200 | 0.1000 | -0.0200 | 75.2664 |
| dsp test base | boats | 0.5000 | 0.2000 | 0.0500 | 262.7747 |
| dsp test base | tulips | -0.6000 | 0.2000 | -0.0500 | 14.1407 |
| dsp test base | watch | -0.1000 | 0.1000 | -0.1500 | 49.2107 |
| bsds500 base | 3096 | 0.2000 | -0.007 | -0.005 | 0.4760 |
| bsds500 base | 14037 | -0.300 | 0.2000 | -0.010 | 5.9566 |
| bsds500 base | 295087 | 0 | 0 | 0.0400 | 1.4259 |
| bsds500 base | 126007 | -0.400 | 0.1000 | 0.0800 | 11.4961 |
| bsds500 base | 260058 | -0.400 | -0.060 | 0.0001 | 2.8970 |
| bsds500 base | 160068 | 0.7000 | 0 | -0.110 | 0.3491 |
| bsds500 base | 241004 | -0.300 | 0 | 0.0400 | 3.8780 |
| bsds500 base | 197017 | 0.400 | 0.0800 | -0.014 | 3.8666 |
| bsds500 base | 143090 | -0.020 | 0.0900 | 0.1000 | 10.9091 |
| mean | | -0.1587 | 0.0314 | -0.0187 | 43.5271 |

table 4 execution time for pcc interpolation

| execution time te (s) | te_1p_keys | te_2p_keys | te_3p_keys |
|---|---|---|---|
| int1 | $1.4305 \cdot 10^{-6}$ | $2.4876 \cdot 10^{-6}$ | $2.5225 \cdot 10^{-6}$ |
| int2 | $1.1903 \cdot 10^{-6}$ | $2.0704 \cdot 10^{-6}$ | $2.0990 \cdot 10^{-6}$ |
| kert | $5.6499 \cdot 10^{-7}$ | $5.6492 \cdot 10^{-7}$ | $5.6489 \cdot 10^{-7}$ |

4.5. comparative analysis

according to the mean values presented in table 1, table 2 and table 3:
a) the mse of the 2p keys kernel is $\overline{\mathrm{MSE}}_{1p} / \overline{\mathrm{MSE}}_{2p}$ = 45.6911 / 44.7000 = 1.0222 times smaller than that of the 1p keys kernel,
b) the mse of the 3p keys kernel is $\overline{\mathrm{MSE}}_{1p} / \overline{\mathrm{MSE}}_{3p}$ = 45.6911 / 43.5271 = 1.0497 times smaller than that of the 1p keys kernel, and
c) the mse of the 3p keys kernel is $\overline{\mathrm{MSE}}_{2p} / \overline{\mathrm{MSE}}_{3p}$ = 44.7000 / 43.5271 = 1.0269 times smaller than that of the 2p keys kernel.

the optimal values of the kernel parameters, determined by minimizing the ripple of the spectral characteristic of the 3p keys kernel (eq. (28)), are: αopt = -0.6132, βopt = 0.1522 and γopt = -0.0195. the experimental results (table 3) show that the mean values of the optimal kernel parameters over all test images are: $\overline{\alpha}_{opt\_3p}$ = -0.1587, $\overline{\beta}_{opt\_3p}$ = 0.0314 and $\overline{\gamma}_{opt\_3p}$ = -0.0187. the absolute errors of the kernel parameters determined by the algorithm for minimizing the ripple of the spectral characteristic, relative to the experimentally determined kernel parameters, are:
a) $\Delta\alpha_{3p} = |\alpha_{opt} - \overline{\alpha}_{opt\_3p}| = |-0.6132 - (-0.1587)| = 0.4545$,
b) $\Delta\beta_{3p} = |\beta_{opt} - \overline{\beta}_{opt\_3p}| = |0.1522 - 0.0314| = 0.1208$,
c) $\Delta\gamma_{3p} = |\gamma_{opt} - \overline{\gamma}_{opt\_3p}| = |-0.0195 - (-0.0187)| = 0.0008$.
the total absolute error of the kernel parameter estimation for all test images is $e_t = \sqrt{\Delta\alpha_{3p}^2 + \Delta\beta_{3p}^2 + \Delta\gamma_{3p}^2} = 0.4703$.

in accordance with the results presented in table 4, for a complete convolution with non-optimized kernels (eq. (1), eq. (2) and eq. (3)) (label in table 4: int1), the execution time, te, when applying:
a) the 2p keys kernel compared to the 1p keys kernel is te_2p_keys / te_1p_keys = $2.4876 \cdot 10^{-6} / 1.4305 \cdot 10^{-6}$ = 1.7389 times larger,
b) the 3p keys kernel compared to the 1p keys kernel is te_3p_keys / te_1p_keys = $2.5225 \cdot 10^{-6} / 1.4305 \cdot 10^{-6}$ = 1.7633 times larger, and
c) the 3p keys kernel compared to the 2p keys kernel is te_3p_keys / te_2p_keys = $2.5225 \cdot 10^{-6} / 2.4876 \cdot 10^{-6}$ = 1.014 times larger.

in accordance with the results presented in table 4, for a complete convolution with optimized kernels (eq. (29), eq. (30) and eq. (31)) (label in table 4: int2), the execution time, te, when applying:
a) the 2p keys kernel compared to the 1p keys kernel is te_2p_keys / te_1p_keys = $2.0704 \cdot 10^{-6} / 1.1903 \cdot 10^{-6}$ = 1.7393 times larger,
b) the 3p keys kernel compared to the 1p keys kernel is te_3p_keys / te_1p_keys = $2.0990 \cdot 10^{-6} / 1.1903 \cdot 10^{-6}$ = 1.7634 times larger, and
c) the 3p keys kernel compared to the 2p keys kernel is te_3p_keys / te_2p_keys = $2.0990 \cdot 10^{-6} / 2.0704 \cdot 10^{-6}$ = 1.013 times larger.

the convolutional kernel execution time, te, without interpolation (label in the table: kert) is approximately $5.649 \cdot 10^{-7}$ s for all keys kernels. the reason is that all kernels, after optimization, have the same numerical complexity: a third-order polynomial with constant coefficients. when a 3p keys interpolation kernel with optimized parameters is applied, the convolution execution time is te_3p_keys / te_3p_keys_opt = $2.5225 \cdot 10^{-6} / 2.0990 \cdot 10^{-6}$ = 1.2017 times shorter than with the non-optimized kernel.

the results of the described experiment and the detailed comparative analysis of the interpolation error, expressed through the mse, indicate that the accuracy of interpolation increased when the 3p keys kernel was applied, relative to the 1p and 2p keys kernels. the testing algorithm is implemented in the matlab programming language.
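the ratios and errors quoted in this comparative analysis follow directly from the mean rows of tables 1 to 3 and from table 4, and can be rechecked with a few lines of arithmetic. the following python sketch is only a convenience for the reader and is not part of the matlab experiment; the dictionary and function names are hypothetical, the mean mse of the 3p kernel is taken as 43.5271 (table 3), and γopt is taken as -0.0195.

```python
import math

# Mean minimum MSE per kernel (last rows of Tables 1-3).
mse = {"1P": 45.6911, "2P": 44.7000, "3P": 43.5271}

# Execution times in seconds (Table 4).
te = {
    "INT1": {"1P": 1.4305e-6, "2P": 2.4876e-6, "3P": 2.5225e-6},
    "INT2": {"1P": 1.1903e-6, "2P": 2.0704e-6, "3P": 2.0990e-6},
}

def ratio(values, num, den):
    """Return values[num] / values[den]."""
    return values[num] / values[den]

# Spectral-domain optimum versus the experimental means of Table 3.
spectral = (-0.6132, 0.1522, -0.0195)
experimental = (-0.1587, 0.0314, -0.0187)
abs_err = [abs(s - e) for s, e in zip(spectral, experimental)]
total_err = math.sqrt(sum(d * d for d in abs_err))

if __name__ == "__main__":
    pairs = (("1P", "2P"), ("1P", "3P"), ("2P", "3P"))
    for a, b in pairs:
        print(f"MSE {a}/{b}: {ratio(mse, a, b):.4f}")
    for label, row in te.items():
        for a, b in pairs:
            print(f"{label} te {b}/{a}: {ratio(row, b, a):.4f}")
    print("absolute errors:", [round(d, 4) for d in abs_err])
    print(f"total error: {total_err:.4f}")
```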
the interpolation execution times were measured using the matlab functions tic and toc. the experiment showed only the precision of interpolation with the 3p keys kernel relative to the precision with the 1p and 2p keys kernels. in addition, the relative ratio of the interpolation execution times was determined. however, for real-time interpolation, the convolution algorithm must be written in a compiled programming language (for example, c), where optimizations made during compilation can reduce the program execution time (eq. 31). in this way, image processing can be provided in real-time mode, among other platforms, on personal computers.

5. conclusion

the paper presents an algorithm for optimizing the parameters of the 3p keys interpolation kernel. parameter optimization was performed in the spectral domain by minimizing the ripple of the spectral characteristic. first, the spectral characteristic was expanded in a taylor series, and then the terms of the taylor series that contribute most to the ripple of the spectral characteristic were eliminated. from the conditions for eliminating the dominant terms of the taylor series, the optimal values of the parameters of the 3p keys kernel (αopt = -0.6132, βopt = 0.1522, γopt = -0.0195) were determined. verification of the accuracy of the 3p keys kernel when interpolating images was performed experimentally. the interpolation accuracy is expressed through the mse interpolation error. detailed comparative analysis showed that the 3p keys kernel, with experimentally determined optimal parameters, has an interpolation accuracy 1.0497 times higher than the 1p keys kernel and 1.0269 times higher than the 2p keys kernel. based on the presented results, it is concluded that the 3p keys kernel is superior to the 1p and 2p keys kernels and that the interpolation error is very small.
experimental results show that the 3p keys kernel, with the optimal parameters determined by the optimization algorithm presented in this paper, interpolated the test images with great precision. compared to the ideal sinc kernel, the 3p keys kernel with optimal parameters has a small numerical complexity and is therefore suitable for implementation in convolutional interpolation in real-time systems.

references

[1] r. g. keys, "cubic convolution interpolation for digital image processing", ieee trans. acoust., speech, & signal processing, vol. assp-29, pp. 1153–1160, dec. 1981.
[2] e. meijering, m. unser, "a note on cubic convolution interpolation", ieee transactions on image processing, vol. 12, no. 4, pp. 447–479, april 2003.
[3] n. dodgson, "quadratic interpolation for image resampling", ieee transactions on image processing, vol. 6, no. 9, pp. 1322–1326, sept. 1997.
[4] o. rukundo, b. maharaj, "optimization of image interpolation based on nearest neighbor algorithm", in proceedings of the international conference on computer vision theory and applications (visapp), 2014, vol. 1, pp. 641–647.
[5] s. s. rifman, "digital rectification of erts multispectral imagery", in proceedings of the symp. significant results obtained from the earth resources technology satellite-1, 1973, vol. 1, sec. b, pp. 1131–1142.
[6] t. b. deng, "frequency-domain weighted-least-squares design of signal-dependent quadratic interpolators", iet signal process., vol. 4, no. 1, pp. 102–111, feb. 2010.
[7] n. gajalakshmi, s. karunanithi, "cubic convolution and osculatory interpolation for image analysis", international journal of creative research thoughts (ijcrt), vol. 9, issue 12, pp. 836–841, december 2021.
[8] y. li, f. qi, y. wan, "improvements on bicubic image interpolation", in proceedings of the ieee 4th advanced information technology, electronic and automation control conference (iaeac), 2019, pp. 1316–1320.
[9] s.-h. hong, l.
wang, t.-k. truong, "an improved approach to the cubic-spline interpolation", in proceedings of the 25th ieee international conference on image processing (icip), 2018, pp. 1468–1472.
[10] k. s. park, r. a. schowengerdt, "image reconstruction by parametric cubic convolution", computer vision, graphics & image processing, vol. 23, pp. 258–272, 1982.
[11] e. meijering, k. zuiderveld, m. viergever, "image reconstruction by convolution with symmetrical piecewise nth-order polynomial kernels", ieee transactions on image processing, vol. 8, no. 2, pp. 192–201, feb. 1999.
[12] z. milivojević, n. savić, d. brodić, p. rajković, "optimization parameters of two parameter keys kernel in the spectral domain", in proceedings of the xv international scientific-professional symposium infoteh-jahorina, bosnia, 2016, pp. 392–397.
[13] r. hanssen, r. bamler, "evaluation of interpolation kernels for sar interferometry", ieee transactions on geoscience and remote sensing, vol. 37, no. 1, pp. 318–321, jan. 1999.
[14] z. milivojević, d. brodić, "estimation of the fundamental frequency of the real speech signal compressed by mp3 algorithm", archives of acoustics, vol. 38, no. 3, pp. 363–373, 2013.
[15] z. milivojević, n. savić, d. brodić, "three-parametric cubic convolution kernel for estimating the fundamental frequency of the speech signal", computing and informatics, vol. 36, pp. 449–469, 2017.
[16] n. savić, z. milivojević, "optimization of the 3p keys kernel parameters for interpolacion of audio signals", in proceedings of the international scientific conference unitech'20, gabrovo, bulgaria, 2020, pp. 200–205.
[17] https://www2.eecs.berkeley.edu/research/projects/cs/vision/bsds/

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 153–182
doi: 10.2298/fuee1402153s

fiber optics engineering: physical design for reliability

ephraim suhir
bell laboratories, murray hill, nj, portland state university, portland, or, usa, technical university, vienna, austria, bordeaux university, bordeaux, france, ariel university, ariel, israel, ers co., los altos, ca, usa

abstract. the review part of the paper addresses analytical modeling in fiber optics engineering. attributes and significance of predictive modeling are indicated and discussed. the review is based mostly on the author's research conducted at bell laboratories, physical sciences and engineering research division, murray hill, nj, usa, during his tenure with bell labs for about twenty years, and, to a lesser extent, on his recent work in the field. the addressed topics include, but are not limited to, the following major fields: bare fibers; jacketed and dual-coated fibers; coated fibers experiencing thermal and/or mechanical loading; fibers soldered into ferrules or adhesively bonded into capillaries; roles of geometric and material non-linearity; dynamic response to shocks and vibrations; as well as possible applications of nanomaterials in new generations of coating and cladding systems. the extension part is concerned with a new, fruitful and challenging direction in optical engineering: probabilistic design for reliability (pdfr) of opto-electronic and photonic systems, including fiber optics engineering. the rationale behind the pdfr concept is that the difference between a highly reliable optical fiber system and an insufficiently reliable one is "merely" in the level of the never-zero probability of failure. it is the author's belief that when the operational reliability of an optical fiber system and product is imperative, the ability to predict, quantify, assure and, if possible and appropriate, even specify this reliability is highly desirable.
key words: fiber optics engineering, optical fibers, design-for-reliability, predictive modeling, probabilistic assessments

received january 6, 2014
corresponding author: ephraim suhir, ers co., los altos, ca, usa, 727 alvina ct., los altos, ca 94024, 650-969-1530, cell. 408-410-0886 (e-mail: suhire@aol.com)

1. physical design-for-reliability in fiber optics engineering

1.1. fiber optics engineering (foe)

three major objectives are pursued in fiber optics engineering (foe), as far as its short- and long-term reliability is concerned: 1) failure-free functional (optical) performance; 2) high physical (structural, mechanical) reliability; and 3) satisfactory environmental durability. the physical design for reliability (dfr) effort deals primarily with the second objective but, to an extent, with the other two as well. the dfr effort employs methods and approaches of reliability physics and structural analysis and is aimed at evaluating stresses, strains and displacements in fiber optics structures, carrying out the physical design of these structures, and assessing and assuring their short- and long-term reliability. the physical dfr effort treats fiber optics products as structures: the materials interaction, the size and configuration of the structural elements in the product, the physical nature and magnitude of the applied loads, and the ability to quantify reliability are as important in this effort as the optical properties and characteristics of the employed materials. the application of methods and approaches of dfr in foe systems enables one to design, fabricate and operate a viable and reliable product [1]-[11]. like dfr in the traditional and much better developed branches of engineering (civil, aircraft, space, maritime, automotive, etc.), dfr in foe considers the specifics associated with the properties of the materials used, the typical structures employed, and the nature, magnitude and variability of the applied loads.
typical foe structures are bare or composite (coated) rods and beams of various lengths and flexural rigidities. these structural elements could be soldered into ferrules, adhesively bonded into capillaries, or embedded into various materials and media. typical materials are silica glasses; polymers (coatings, adhesives, and even polymer light-guides); semiconductors, including compound semiconductors; and metals, first of all solders, both "hard" (e.g., gold-tin) and "soft" (e.g., silver-tin) ones. typical loads include internal (thermal) loads caused by dissimilar materials and/or by temperature gradients, and/or by high- or low-temperature environments (temperature extremes); external (mechanical) loads due to inevitable or imposed, but always critical, deformations; and possible dynamic loads caused by shocks, vibrations, acoustic noise or impact, etc. high voltage, electric current, ionizing radiation and/or extensive light output from a powerful laser source are also considered as loads (stressors, stimuli). dfr in foe pursues, but might not be limited to, the following major objectives:
1) determine and idealize, for the sake of predictive modeling, the most likely loading conditions;
2) evaluate the stresses, strains, displacements and, when methods and approaches of fracture mechanics are applicable, also the fracture characteristics of the fiber optics materials and structures;
3) assure, typically on a probabilistic basis, that the acceptable strength and reliability criteria will remain, during the lifetime of the product, within the limits allowable from the standpoint of the product's structural integrity, elastic stability, dependability, availability and normal operation.
while an optical engineer is, and should be, concerned first of all with the functional (optical) performance of the foe product, an adequate performance of this product cannot be assured if its ability to withstand elevated stresses (physical reliability) and to exhibit adequate environmental durability (ability to withstand degradation and aging in high temperature and/or humidity environments) is not taken care of. accordingly, we consider the following stress-strain analysis problems encountered in foe:
1) role and attributes of, and challenges in, predictive modeling in foe problems: the emphasis is on analytical (mathematical) modeling;
2) thermal stress in fiber optics structures: it is this stress and the associated strains (displacements) that are the most typical and most detrimental in these structures;
3) bending of bare fibers caused by the ends off-set;
4) bare fibers under the combined action of bending and tension;
5) role of the structural and materials nonlinearity;
6) coated fibers and stresses that occur in the glass material during the design, fabrication, operation and proof-testing of such fibers;
7) micro-bending of dual-coated fibers intended for long haul communication;
8) solder materials and joints, and fibers soldered into ferrules;
9) dynamic response of electronic and photonic systems, including optical fibers, to shocks and vibrations;
10) new nano-material and its applications in fiber optics, photonics and beyond;
11) some special foe problems: strain-free planar optical waveguides; apparatus and method for thermostatic compensation of temperature sensitive optical devices; stresses and strains in fused bi-conical taper couplers; "curling" phenomenon during drawing of optical fibers; effect of voids.
the extension part deals with a novel direction in "high-tech" engineering: probabilistic design for reliability (pdfr) of electronic and photonic systems, including optical fibers and interconnects. the objective of this direction is to provide quantitative probabilistic assessments of the likelihood of operational failures of oe materials, devices and systems. the pdfr direction is based on the rationale that when reliability is imperative, the ability to predict, quantify, assure and, if possible and appropriate, even specify it is highly desirable, or even a must.

1.2. predictive modeling (pm) in fiber optics engineering (foe): role, attributes, challenges

modeling is the major approach of any science, whether pure or applied. research and engineering models can be experimental or theoretical. experimental models are typically of the same physical nature as the actual phenomenon or object. they reproduce a notion or an object of interest in a simplified way and often on a different scale. theoretical models represent real phenomena using abstract notions. the goal of a theoretical model is to reveal non-obvious, often even paradoxical, relationships hidden in the available intuitively obvious and/or experimentally proven input information [12]-[17]. a theoretical model can be either analytical or numerical (computational). analytical models often employ more or less sophisticated mathematical methods of analysis. today's numerical models are computer-aided. the most widespread model in stress-strain evaluations and physical design for reliability in foe is finite-element analysis (fea). experimental and theoretical models have their merits and drawbacks and their areas of application, and should be viewed as equally important and equally indispensable for the design of a viable, reliable and cost-effective foe product. one should always try to avoid the reproach that, because his/her only tool is a hammer, all problems look like nails to him/her. although the role of theoretical modeling, mostly based on computer simulations, has dramatically increased in foe during the last two decades, the situation is still essentially different from that in the traditional areas of applied science. the majority of studies dealing with the physical design and performance of foe materials and products are experimental, and there are several reasons for that. first, experiments can be carried out with "full autonomy", i.e. without necessarily requiring theoretical support. unlike theory, testing can be, and in effect is, used for the final proof of the viability and reliability of an foe product. that is why testing procedures are essential requirements of military and commercial specifications for such products. second, experiments in the foe field, expensive as they are, are considerably less costly than experiments on, e.g., hulls in naval architecture, fuselages in the aerospace field, or objects in civil engineering, where "specimens" might cost millions of dollars. third, foe experiments are much easier to design, organize and conduct than those in the macro-engineering world. fourth, materials whose properties are unknown are often, and successfully, employed in various foe applications. lack of information about the properties of such materials is often viewed as an obstacle to implementing theoretical modeling. finally, many of the leading specialists in foe (experimental physicists, materials scientists, chemists, chemical engineers) traditionally use experimental methods as their major research tool. some of them simply do not feel that adding theoretical modeling will make an appreciable difference in the state-of-the-art of what they do. it is not surprising that eleven out of twelve bell labs nobel laureates were experimentalists. on the other hand, the application of experimental modeling, unlike theoretical modeling, requires, as a rule, considerable time and is often associated with significant expense.
what is even more important, though, is that experimental data inevitably reflect the combined action of a variety of factors affecting the phenomenon or the product of interest. this often makes experimentation insufficient for understanding the behavior and the performance of an foe material or device. such a lack of insight inevitably leads to tedious, time-consuming and costly experimental procedures. as a rule, experimental data cannot simply be extended to new situations or new designs that are appreciably different from those tested. it is always easy to recognize purely empirical relationships obtained by formal processing of experimental data and not based on rational theoretical considerations reflecting the physical nature of the phenomenon of interest. purely experimental relationships contain, as a rule, fractional exponents and coefficients, odd units, etc. although such relationships may have a certain practical value, the very fact of their existence should be attributed to the lack of knowledge in the given area of applied science. typical examples are a power law (e.g., the one used in proof-testing of optical fibers, when their delayed fracture, also known as "static fatigue", is evaluated) or an inverse power law (e.g., the numerous relationships of the coffin-manson type used to evaluate the lifetime of solder joint interconnections).
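how such purely empirical power laws arise from formal processing of test data can be illustrated by a minimal sketch. it assumes a generic relationship y = c·x^p fitted by least squares in log-log coordinates; the data are synthetic, the "true" coefficient and exponent are arbitrary hypothetical numbers rather than values for any real fiber or solder joint, and the function name fit_power_law is introduced only for this illustration.

```python
import numpy as np

def fit_power_law(x, y):
    """Least-squares fit of y = c * x**p in log-log coordinates.

    Returns (c, p). The exponent p is whatever the data dictate, which is
    why empirical laws of this kind typically have fractional exponents.
    """
    p, log_c = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(log_c)), float(p)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stress = np.linspace(1.0, 3.0, 20)        # hypothetical, arbitrary units
    true_c, true_p = 5.0e3, -18.3             # hypothetical inverse power law
    scatter = np.exp(rng.normal(0.0, 0.1, stress.size))
    life = true_c * stress**true_p * scatter  # synthetic "measured" lifetimes
    c, p = fit_power_law(stress, life)
    print(f"fitted exponent p = {p:.2f}, coefficient c = {c:.3g}")
```

the fit returns a coefficient and an exponent, but says nothing about why the exponent has the value it has or whether it would hold outside the tested range, which is the limitation of purely empirical relationships noted above.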
in view of the above, here is what could be gained by using theoretical modeling:
1) unlike experimentation, predictive modeling is able to shed light on the role of each particular parameter that affects the behavior and performance of the material, structure or system of interest;
2) although testing can reveal insufficiently robust elements, it is incapable of detecting superfluously reliable ones; "over-engineered" (superfluously robust) objects may have excessive weight and be more costly than necessary; in mass production of expensive products, superfluous reliability may entail substantial and unnecessary additional costs; predictive modeling might be able to reveal the "over-engineered" and, hence, cost-ineffective elements of an foe design;
3) theoretical modeling can often predict the result of an experiment in less time and at a lower expense than it would take to perform the actual experiment;
4) in many cases, theory serves to discourage wasting time on useless experiments; numerous attempts to build impossible heat engines have been prevented by a study of the theoretical laws of thermodynamics; while this is, of course, a classical and outstanding example of the triumph of a theory, there are also numerous, though less famous, examples in which plenty of time and expense were saved because of prior theoretical modeling of a problem of interest;
5) in the majority of research and engineering projects, a preliminary theoretical analysis enables one to obtain valuable information about a phenomenon or an object to be investigated, and gives an experimentalist an opportunity to decide what should be tested or measured, and how, and in what direction success might be expected;
6) by shedding light on "what affects what", theoretical modeling often serves to suggest new experiments: theoretical analyses of thermal stresses in bi-material assemblies (e. suhir, asme j. appl. mech., vol. 53, no. 3, sept. 1986) and in semiconductor thin films (s. luryi and e. suhir, applied physics letters, vol. 49, no. 3, july 1986) triggered numerous experimental investigations aimed at the rational physical design of semiconductor crystal grown assemblies;
7) theory can be used to interpret empirical results, to bridge the gap between different experiments, and to extend the existing experience to new materials and products;
8) one cannot do without a good theory when developing rational (optimal) designs; the idea of optimization of structures, materials, functions and costs, although new in foe, has penetrated many areas of modern engineering; no progress in this direction could be achieved, of course, without the application of theoretical methods of optimization.

1.3. analytical vs. numerical modeling

analytical modeling [18]-[23] occupies a special place in the predictive modeling effort: it is able not only to come up with relationships that clearly indicate "what affects what", but, more importantly, can often explain the physics of phenomena, and especially paradoxical situations, better than fea modeling, or even experiments, can. although the basics of fea modeling have been known since the mid-thirties or so, it is only since the mid-1950s, when high-speed and powerful computers became available, that fea modeling has become the major research tool for theoretical evaluations in many areas of engineering. since the mid-1970s, fea has become the major modeling tool in electronics and photonics as well. this can be attributed, first of all, to the developments in computer science and engineering and to the availability of numerous powerful and flexible computer programs. these programs enable one to obtain, within a reasonable time, a solution to almost any stress-strain related problem. the broad application of computers, however, has by no means made analytical solutions, whether exact, approximate or asymptotic, unnecessary or even less important.
simple and easy-to-use analytical relationships have invaluable advantages, because of the clarity and compactness of the obtained information and the explicit indication of the role of various factors affecting the given phenomenon or the behavior of the given material or device. these advantages are especially significant when the parameter under investigation depends on more than one variable. as to the asymptotic techniques, they can be successful in many cases in which there are difficulties in the application of computational methods, e.g., in various problems containing singularities. such problems are often encountered in foe, because of the wide employment of assemblies comprised of dissimilar materials. but even when the application of fea encounters no difficulties, it is always advisable to investigate the problem analytically before carrying out fea analyses. such a preliminary investigation helps to reduce computer time and expense, to develop the most feasible and effective preprocessing model and, in many cases, to avoid fundamental errors. let us indicate several attributes of the analytical modeling effort in comparison with fea:
1) fea was originally developed for structures with complicated geometry and/or with complicated boundary conditions (such as, e.g., avionics structures), for which it might be difficult to apply analytical approaches. as a consequence, fea has been especially widely used in those areas of engineering in which structures of complex configuration are typical (aerospace, maritime and offshore structures, some civil engineering structures, etc.). in contrast, foe structures are usually characterized by relatively simple geometries and can be easily idealized as cylindrical beams, flexible rods, rectangular or circular plates, various composite structures of relatively simple geometry, etc. there is therefore an obvious incentive for a broad application of analytical modeling in foe.
2) the adjacent structural elements in foe often have dimensions (thicknesses) that differ by orders of magnitude. typical examples are dual-coated fibers, thin-film systems fabricated on thick substrates, and adhesively bonded assemblies, in which the bonding layer (or the primary coating) is, as a rule, significantly thinner than the bonded components (secondary coating). since the mesh elements in a fea model must be compatible, fea of such structures often becomes a problem in itself, especially in regions of high stress concentration. such a situation does not occur, however, when an analytical approach is used.
3) there is often an illusion of simplicity in applying fea procedures. some users of fea programs believe that they are not even supposed to have any prior knowledge of structural analysis and materials physics, and that the "black box" they deal with will automatically provide the right answer, as long as they push the right keys on the computer. at times, a hasty, thoughtless, and incompetent application of computers can result in more harm than good by creating an impression that a solution has been obtained when, actually, this "solution" is simply wrong. it is well known to those with hands-on experience with fea that although it might be easy to obtain a fea solution, it might be quite difficult to obtain the right solution. and how would one know that the right solution has been obtained, if there is nothing to compare it with? in effect, one has to have a good background in reliability and materials physics to develop an adequate, feasible, and economic preprocessing model and to correctly interpret the obtained information, and preliminary analytical modeling can be of significant help in that. clearly, if the fea data are in good agreement with the results of analytical modeling (which is usually based on quite different assumptions), then there is reason to believe that the obtained solution is accurate enough.
a crucial requirement for an effective analytical model is its simplicity and clear physical meaning. a good analytical model, which can be of real help in "high-tech" engineering, should produce simple, easy-to-use and physically meaningful relationships that clearly indicate the role of the major factors affecting a phenomenon or an object of interest. one authority in applied physics remarked, perhaps only partly in jest, that the degree of understanding of a phenomenon is inversely proportional to the number of variables used for its description. although an experimental approach, unsupported by theory, is "blind," theory, not validated by an experiment, is "dead." it is the experiment that forms a basis for a theoretical model, provides the input data for theoretical modeling, and determines the viability, accuracy, and limits of application of a theoretical model. limitations of a theoretical model are different in different problems and, in the majority of cases, are not known beforehand. it is the experimental modeling, therefore, that is the "supreme and ultimate judge" of a theoretical model. a physical experiment can often be rationally included into a theoretical solution to an applied problem. even when some relationships and structural characteristics lend themselves, in principle, to theoretical evaluation, it is sometimes simpler and more accurate to determine these relationships empirically. a good example is the spring constant of an elastic foundation provided by the primary coating in dual-coated optical fibers.
1.4. bending of bare fibers
bending of bare fibers, idealized as single-span beams clamped at the ends and subjected to lateral and/or angular misalignment(s), was examined, based on the engineering beam theory, in application to the stress-strain evaluations in optical fiber interconnects [24]-[34].
angular misalignments and lateral ends-offsets might be due to the inability of the given technology to ensure good alignment of the interconnect ends and/or end cross-sections, but might also be essential, and quite often even desirable, features of a particular design. elevated optical fiber curvatures, caused by misalignments, affect both the functional (optical) performance and the mechanical (structural) reliability of the fiber interconnects. these curvatures and the resulting bending stresses can be predicted and, if necessary, minimized, thereby also reducing the added transmission losses in, and improving the structural reliability of, an interconnect. sometimes it might be particularly easy and effective to minimize the maximum curvatures by simply rotating the end cross-section of the interconnect. the directions and the angles of rotation could be predicted depending on the measured lateral misalignments [24], [25], [31]-[34]. an important factor in the assessment of the level of the reactive axial tensile forces in an interconnect experiencing ends off-set is the magnitude of the off-set for the given interconnect length (span). reactive forces arise because the supports of the actual interconnect cannot move closer when the interconnect is subjected to the ends off-set. in such a situation the interconnect experiences, in addition to bending, also reactive tension. this tension might be neglected, nevertheless, if the misalignment is small compared to the interconnect length [26]. how small is "small" could be determined based on a more general, but still simple, predictive model that takes into consideration the possible occurrence of appreciable tensile reactive forces [27]. if the reactive stresses are not negligible and have to be accounted for, this still could be done on the basis of the linear theory of bending of beams [27], although the level of the tensile forces is no longer proportional to the level of the ends-offset.
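for the small-offset case in which the reactive tension is neglected, the engineering beam theory mentioned above gives a closed-form estimate of the peak curvature and bending stress. the sketch below is illustrative only: it assumes a clamped-clamped beam with a purely lateral ends off-set and no end rotation, for which the standard beam-theory result is a maximum end moment of 6*E*I*delta/L^2. the function name, the numerical values, and the silica modulus of 72 gpa are assumptions and are not taken from refs. [24]-[34].

```python
def offset_bending(delta, span, fiber_radius, youngs_modulus):
    """Peak curvature (1/m) and bending stress (Pa) at the clamped ends.

    Assumes a clamped-clamped beam of length `span` whose ends are offset
    laterally by `delta`, with no end rotation and no reactive tension
    (small-offset, linear beam theory only).
    """
    # The end moment is M = 6*E*I*delta/L^2, so the curvature is M/(E*I).
    max_curvature = 6.0 * delta / span**2
    # Outer-fiber bending stress: sigma = E * r * curvature.
    max_stress = youngs_modulus * fiber_radius * max_curvature
    return max_curvature, max_stress


# Illustrative numbers only: 125 um diameter silica fiber, 5 mm span, 0.1 mm offset.
curvature, stress = offset_bending(
    delta=0.1e-3, span=5.0e-3, fiber_radius=62.5e-6, youngs_modulus=72.0e9
)
print(f"max curvature = {curvature:.1f} 1/m, max stress = {stress / 1e6:.0f} MPa")
```

because the stress scales as delta/L^2, halving the span quadruples the bending stress for the same off-set, which is why the off-set must always be judged relative to the interconnect length.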
the situation is different if the interconnect experiences significant ends offset [28]. if this is the case, the nonlinear euler "elastica" theory can be employed to accurately predict the configuration of the misaligned fiber and the level of the tensile forces for the given (measured) ends off-set. for very large end off-sets the nonlinear stress-strain behavior of the silica material has to be accounted for (see section 1.6 below).
the effect of the ends-offset, as far as the bending and the reactive tensile stresses are concerned, depends on the flexural rigidity of the interconnect. it is different, therefore, for bare and coated fibers. the models suggested in refs. [24]-[34] can be employed also for coated interconnects, by just evaluating and using their increased flexural rigidity and then considering the distribution of the induced tensile force between the silica fiber and its coating. partially coated interconnects present a particular challenge [26], as far as the ability to determine the induced stresses is concerned. if the highest stresses are expected to occur at the clamped ends of the interconnect, it might make a difference whether the interconnect is soldered or adhesively bonded into the support structures, and whether its coating becomes part of these structures: the lateral and/or the axial compliance of the clamped fiber at its support cross-sections might provide appreciable stress relief for the misaligned fiber and should be considered. in a conservative analysis, one could get away, however, with assuming ideally rigid supports. such an assumption will result in an overestimation of the actual stresses in bending and/or in (reactive) tension.
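the two quantities mentioned above for coated interconnects, the increased flexural rigidity and the share of the tensile force carried by the glass, can be sketched for a concentric circular cross-section. this is a minimal illustration, not the procedure of the cited references: it assumes a perfectly bonded, linearly elastic coating, and the function name, the radii, and the moduli are hypothetical.

```python
import math


def coated_fiber_rigidities(r_fiber, r_coating, e_fiber, e_coating):
    """Return (flexural rigidity in N*m^2, fraction of axial force in the glass).

    Assumes a solid circular fiber of radius `r_fiber` inside a concentric,
    perfectly bonded coating of outer radius `r_coating`.
    """
    # Second moments of area of the solid core and of the annular coating.
    i_fiber = math.pi * r_fiber**4 / 4.0
    i_coating = math.pi * (r_coating**4 - r_fiber**4) / 4.0
    flexural_rigidity = e_fiber * i_fiber + e_coating * i_coating

    # Axial rigidities E*A; the axial force splits in proportion to them.
    axial_fiber = e_fiber * math.pi * r_fiber**2
    axial_coating = e_coating * math.pi * (r_coating**2 - r_fiber**2)
    fiber_force_share = axial_fiber / (axial_fiber + axial_coating)
    return flexural_rigidity, fiber_force_share


# Illustrative values only: 125 um glass fiber, 250 um coated diameter,
# 72 GPa glass and a 1 GPa polymer coating.
rigidity, share = coated_fiber_rigidities(62.5e-6, 125.0e-6, 72.0e9, 1.0e9)
print(f"flexural rigidity = {rigidity:.3e} N*m^2, glass carries {share:.1%} of the force")
```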
it is noteworthy that the occurrence of the tensile stress in an optical fiber of finite length subjected to a deliberately applied lateral off-set of its ends can be used in a unique and effective test vehicle for the evaluation of the tensile strength of the fiber, including its "static fatigue" (delayed fracture) [29]. indeed, the developed models for the prediction of the tensile force in a fiber subjected to the given (imposed) ends offset enable one to develop a simple and effective experimental setup. in fibers with significant ends off-set the bending stress is significantly lower than the tensile one, and could be neglected, especially if the ends of the fiber are allowed to rotate. such a setup, therefore, mimics the pull-test conditions well. while tensile loading on an optical fiber interconnect always has a negative effect on the state of stress in it, i.e., always leads to elevated stresses, especially in the presence of the ends off-set, moderate compression can have, strange as it may sound, a positive effect on the induced stresses [30], [31]. even if the compressive force exceeds the critical (buckling) force, it can still be tolerated, as long as the distance between the interconnect ends is controlled and cannot become smaller than the distance between the supports, which is determined by the thermal contraction mismatch between the optical fiber and its enclosure. the desired compression can be evaluated beforehand and then implemented into the actual design by choosing the most suitable material of the enclosure: when the structure is fabricated at an elevated temperature and is subsequently cooled down to a low (room) temperature, the thermal contraction mismatch between the materials of the fiber and the enclosure will lead to the desired (required) level of the compressive stress and the displacement in the fiber.
1.5.
pigtail configuration
pigtails in laser package designs present a particular challenge, as far as their bending and optimized configuration are concerned. various situations encountered when a pigtail is employed to connect a laser package to the "outside world" were addressed and analyzed [35]-[38]. it has been shown, in particular [35], that by rotating the package inside the enclosure, one could dramatically reduce the induced curvatures. this should be done, however, with caution, since ideally straight pigtails cannot be recommended. this is because, while the initial bending stress in them is indeed zero, the situation could be worsened dramatically if the structure with a high expansion enclosure is heated up, thereby leading to undesirable and significant tensile stresses in the pigtail. it is usually preferred that the pigtail is kept "loose" and, owing to that, is able to accommodate appreciable axial deformations in tension or compression without being stressed. if a pigtail experiences two-dimensional bending on a plane [38], appreciable bending stress relief can be obtained by simply forcing the pigtail to be configured as a quarter of a circumference, so that all its points have the same curvature and, hence, experience the same bending stress. this stress can be appreciably lower than the maximum bending stress at the clamped end of a clamped-free pigtail. more practical and more complicated situations take place when a pigtail is bent on a cylindrical surface [36], [37]. such a design was considered for lasers intended for at&t undersea long haul communication technologies. achieving an optimized geometry of such a pigtail was certainly a challenge.
1.6. consideration of structural and material nonlinearity
consideration of the structural (geometric) and materials (physical) nonlinearity might be necessary, if the fiber experiences significant bending and/or axial deformations [39]-[48].
the effect of the structural nonlinearity, which is due to the significant bending deformations of optical fibers, takes place when the induced displacements are no longer proportional to the applied forces. the stress-strain relationship might still be linear, however, i.e., hooke's law is still fulfilled. this is the case, e.g., of fiber interconnects with moderate end-offsets. it has been established, however [39]-[43], that silica glasses exhibit highly non-linear, although still elastic, stress-strain relationships when the applied strains are not low enough. young's modulus in these materials becomes strain dependent. experiments have indicated that it increases with an increase in the tensile stress and decreases with an increase in the compressive stress, even well below the stress that leads to buckling. when a silica fiber specimen is subjected to significant bending deformations, its neutral axis shifts at the given cross-section of the specimen, because of the non-linear stress-strain behavior of the material, in the direction of the layer subjected to tension. this phenomenon takes place, particularly, when the specimen is subjected to two-point bending tests [39]-[47]. both the geometric nonlinearity caused by large bending deformations (the shape of the bent fiber) and the material's nonlinearity have to be considered, so that the maximum bending stress in the bent fiber is predicted in the most accurate fashion. if there is a need to establish the actual shape of the bent fiber, the euler "elastica" approach might be necessary, and this shape is expressed in elliptic functions. attributes associated with the role of fiber coating, if any, can be easily incorporated, if necessary, into the analytical stress model [48].
1.7.
thermal stress in coated fibers
coated fibers, whether polymer coated or metalized or otherwise protected, are widely employed for better short- and long-term reliability of the silica material, which is both brittle and moisture-sensitive. the addressed problems encountered during design, manufacturing, testing, and reliability assessments for coated fibers include: evaluation of the effect of coating on the bending stresses; understanding the possible delamination modes and mechanisms; improving strippability of coated fibers; prediction of the magnitude and distribution of stresses occurring during proof (pull-out) testing, and others. thermal loading is responsible for many failures in photonics engineering, including optical fiber systems [49]-[61]. such loading could be caused by the thermal expansion (contraction) mismatch of the dissimilar materials in the structure (and particularly of the fiber and its coating) and/or by the non-uniform distribution of temperature (temperature gradients) in the system. steady-state or variable thermal loading takes place during the normal operation of optical assemblies and systems, as well as during their fabrication, testing, transportation and storage. thermal stresses, strains and displacements are the major contributors to the functional, structural and environmental failures of the optical equipment. this is true even for optical fiber systems, although optical fibers, unlike copper wires, do not dissipate heat. creep and stress relaxation phenomena might lead to excessive and undesirable displacements in foe systems. complete loss in optical coupling efficiency can occur, because of the excessive displacements due to the lateral (often less than 0.2 micrometers) or angular (often less than a fraction of one percent of a degree) misalignment in the gap between two light-guides or between a light source and a light-guide.
this could be caused particularly by the thermal stress related deformations and/or, e.g., by stress relaxation in the laser weld. as is known, tiny temperature-induced changes in the distances between bragg gratings written on an optical fiber can be detrimental to its functional performance. for this reason thermal control of the ambient temperature is sometimes needed to ensure sufficient protection of an optical device sensitive to the change in temperature. the requirements for the structural (physical) behavior of the materials and structures in optoelectronics and photonics are often based, therefore, on the functional (optical) requirements and specifications, while the requirements for the structural reliability or for the environmental durability might be significantly less stringent. the importance of addressing thermal stresses in, and particularly of modeling the physical behavior and performance of, coated optical fibers was discussed in refs. [49]-[61], where a number of practically important fiber optics structures were considered and analyzed. a simple analytical stress model has been recently developed for the prediction of thermal stresses in a cylindrical tri-material body [60], with application to silicon photonics technologies, when a metalized optical fiber is soldered into a silicon chip. the developed model is applicable also to situations when a fiber is soldered into a ferrule, or is adhesively bonded into a capillary. it is concluded particularly that the adequate bonding material (e.g., a "soft" tin-lead or a "hard" gold-tin solder) should be selected and its thickness should be established, for low enough thermally induced stresses in it, based on the developed model, so that the short- and long-term reliability of the materials, and, first of all, the solder material, is not compromised.
being analytical, rather than fea based, this model is quite general and can be used in various other technologies and structures, even well beyond the field of photonics, when cylindrical tri-material bodies comprised of dissimilar materials and experiencing temperature excursions are employed. in bi-material soldered or adhesively bonded assemblies, the bonding layer is much thinner than the bonded components and/or its young's modulus is considerably lower than young's moduli of the materials of the bonded components. owing to that, the cte of the bonding material does not have to be accounted for, and the engineering predictive model can be developed for a bi-material assembly and made, therefore, relatively simple. however, when the intermediate (bonding) material is not thin and/or its young's modulus is not small, the material becomes "an equal partner" with the materials of the bonded components. then a more complicated model has to be developed to account for the roles of all the materials in such a tri-material assembly. the development of such a model is particularly challenging for a cylindrical body, such as a silicon photonics assembly [60].
1.8. coated fibers with low modulus coating at the ends
interfacial thermally induced shearing and peeling stresses that are due to the interaction of the dissimilar materials in coated fibers are often the major cause of an insufficient short- and long-term reliability of the fibers. since both categories of the interfacial stresses concentrate at the fiber ends and decrease with an increase in the compliance of the coating system [62]-[64], there is an obvious incentive for employing low modulus coating materials at the ends of optical fiber interconnects.
particularly, the maximum thermally induced interfacial shearing stresses at the ends of a jacketed fiber can be minimized, if the lengths of the end portions of the coating are established, for the given young's modulus of the coating material and the given thickness of the coating layer, in such a way that the shearing stress at the fiber ends becomes equal to the shearing stress at the boundary between the mid-portion and the peripheral portions of the bonding layer. the maximum shearing stress in such an inhomogeneously coated fiber takes place at two locations: at the fiber ends and at the boundaries between the mid-portion and the peripheral portions of the coated fiber. this stress could be significantly lower than in a fiber with a homogeneous coating. moreover, the maximum stresses in an inhomogeneously coated fiber will be even lower than in a fiber coated by a homogeneous layer whose young's modulus is the same as the young's modulus of the low modulus material at the peripheral portions of a fiber with an inhomogeneous coating. such a paradoxical situation [65] is due to the fact that stiff mid-portions of bonded joints bring down the relative longitudinal interfacial displacements of the bonded materials not only in the fiber mid-portion, where the interfacial thermal stresses are low anyway, but also at the fiber ends, where the maximum interfacial stresses occur. these stresses decrease with a decrease in the peripheral displacements.
1.9. micro-bending phenomenon in dual-coated fibers
dual-coated optical fibers are fabricated at elevated temperatures and operated at low temperature conditions. it is imperative that coated fibers intended for long-haul communications remain stable at low temperatures, i.e., do not buckle (do not "microbend") within the primary coating as a result of the thermal contraction mismatch of the high expansion (contraction) secondary coating and the low expansion (contraction) fiber.
low-temperature micro-bending, while most likely harmless from the standpoint of the level of bending thermal stresses, can result in substantial added transmission losses [66]-[77]. the low-temperature micro-bending phenomenon is a good illustration of a situation in which it is the need for a failure-free functional (optical) performance, rather than the physical reliability, that determines the requirements for the adequate structural (physical) design of an optical fiber system. the simplest analytical models [67]-[74] suggest that the fiber prone to low temperature micro-bending be treated as an infinitely long beam lying on a continuous elastic foundation. this foundation is provided by the coating system, and, first of all, by the low-modulus primary coating. as long as such a beam-on-elastic-foundation predictive model is considered, particular attention should be paid to how the spring constant of the elastic foundation is determined. in the early publications preceding vangheluwe's pioneering work [67] it was simply assumed that this constant was equal to the young's modulus of the primary coating material. vangheluwe, using the plane strain theory-of-elasticity approximation, obtained, assuming an ideally rigid secondary coating, a simple and physically meaningful formula for the spring constant. vangheluwe's formula indicates that the spring constant of interest depends on both the elastic constants of the primary coating material (young's modulus and poisson's ratio) and its thickness. vangheluwe's formula could result, however, in a considerable overestimation of the spring constant and, hence, in an overestimation of the critical (buckling) force, for some actual, not very stiff, secondary coating materials [69]. the more general formula [69] accounts for the finite rigidity of both the primary and the secondary coating.
in the case of thick and not very high-modulus secondary coatings, the compliance of both coating layers should be considered. another significant finding, as far as the low-temperature micro-bending phenomenon is concerned, has to do with the role of the initial local curvatures [68]. while the initial curvatures do not change the magnitude of the critical force, they affect the pre-buckling behavior of the compressed fiber. when the compressive force increases, an initially straight fiber remains straight up to the very moment of buckling, while the localized curvatures in a fiber with such curvatures gradually increase with an increase in the compressive force. this could cause appreciable additional deflections of the glass fiber and, as a consequence, considerable added transmission losses even at moderately low temperatures, well above the buckling temperatures. it has been shown particularly that, from the standpoint of the pre-buckling behavior of a fiber, certain curvature lengths are less favorable than others: a dual-coated fiber supported by an elastic foundation provided by the low-modulus coating behaves, with respect to the distributed localized initial curvatures, like a narrowband filter that enhances the curvatures that are close to the post-buckling configuration of the fiber (regardless of whether buckling occurs or not), and suppresses all the other, "non-resonant", curvatures. the developed analytical models are simple, easy-to-use, and clearly indicate the role of various factors affecting the pre-buckling behavior of the fiber. the obtained solutions indicate what could possibly be done to bring down, if necessary, the induced curvatures and the resulting added transmission losses in the fiber. the numerical examples are carried out for silicone/nylon coated systems extensively studied experimentally by japanese engineers [66]. the theoretical predictions agree well with the experimental observations.
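the "narrowband filter" behavior described above can be illustrated with the textbook model of a compressed beam on a winkler (spring) foundation with a sinusoidal initial imperfection. this is a hedged sketch, not the model of ref. [68]: it assumes an imperfection of the form a0*sin(pi*x/l), for which the total amplitude under a compressive force T is a0*T_l/(T_l - T), with T_l = EI*(pi/l)^2 + K*(l/pi)^2. the function names and all numerical values, including the spring constant, are hypothetical.

```python
import math


def mode_critical_force(half_wave, flexural_rigidity, spring_constant):
    """Buckling force for a sinusoidal mode with half-wavelength `half_wave`."""
    q = math.pi / half_wave
    return flexural_rigidity * q**2 + spring_constant / q**2


def curvature_amplification(half_wave, force, flexural_rigidity, spring_constant):
    """Ratio of the total to the initial amplitude of a sinusoidal imperfection."""
    t_mode = mode_critical_force(half_wave, flexural_rigidity, spring_constant)
    if force >= t_mode:
        raise ValueError("force reaches the buckling load of this mode")
    return t_mode / (t_mode - force)


# Illustrative values only: 125 um silica fiber and an assumed foundation
# spring constant of 1 MPa (force per unit length per unit deflection).
EI = 72.0e9 * math.pi * (62.5e-6) ** 4 / 4.0
K = 1.0e6

# The mode that buckles first has half-wavelength pi*(EI/K)^(1/4) and
# critical force 2*sqrt(K*EI).
resonant_half_wave = math.pi * (EI / K) ** 0.25
t_critical = 2.0 * math.sqrt(K * EI)

force = 0.8 * t_critical  # well inside the pre-buckling range
for scale in (0.5, 1.0, 2.0):
    gain = curvature_amplification(scale * resonant_half_wave, force, EI, K)
    print(f"half-wavelength x{scale}: amplification = {gain:.2f}")
```

in this sketch the imperfection whose half-wavelength matches the buckling mode is amplified most strongly, while shorter and longer ones are amplified much less, which is the filter-like behavior the text describes.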
it is noteworthy in this connection that it has been observed [76] that external (mechanical) periodic loading with a period of about 100 nm can also cause appreciable micro-bending losses in dual-coated fibers, and therefore should be avoided in actual designs. this period is rather close to the predicted critical "periods" of initial curvatures in the low-temperature micro-bending situation.
1.10. proof-testing of coated fibers
the stress-strain related problems that arise during proof-testing of coated optical fibers were addressed, based on the analytical predictive modeling, in refs. [78]-[83]. the considered problems include: the role of the lengths of test specimens in pull-testing [82]; the buffering effect of the coating on the acceptable length of the test specimens in pull (proof) [83] and in bending [80] tests; the magnitude and the distribution of the interfacial stresses during pull-out testing [81], as well as stresses in coated fibers stretched on a capstan during the manufacturing process [79]. in the brief discussion that follows we elaborate on some of the more important aspects of the physical phenomena associated with proof-testing of coated optical fibers. it is well known in materials science that if one intends to experimentally determine the young's modulus of a material and/or its flexural strength through three- or four-point bending, the specimen should be long enough (say, its length should be at least 12-15 times larger than its height), so that lateral shearing deformations do not occur in the specimen and do not affect the test data [78]. a problem encountered during pull testing of a glass fiber, one end of which is soldered or adhesively bonded (and is therefore rigidly or elastically clamped) while the other end is subjected to a pulling (tensile) force [82], is somewhat different, of course, but also has to do with the intent to obtain clear information from testing.
this could be done if the specimen is long enough, so that the tensile stresses prevail considerably over the bending stresses. considering that the pulling force will always form a certain angle with the fiber axis, the question is what could be done to minimize the effect of the associated bending stresses. to answer this question, a simple analytical model has been developed for the evaluation of the bending stress caused by the misalignment of the ends of a glass fiber specimen soldered into a ferrule and subjected to tension during pull testing. it is shown that the bending stress can be reduced considerably by using sufficiently long specimens, and it is indicated how long such specimens should be so that only the tensile stress needs to be accounted for. it is also shown how the uncertainty in the prediction of the inevitable misalignment of the fiber ends can be considered when establishing the appropriate specimen length. the tensile force experienced by a dual-coated optical fiber specimen during its reliability (proof) testing is applied to the fiber's secondary coating and is transmitted to the glass fiber at a certain distance from the specimen's ends. although it is true that, in accordance with saint-venant's principle, the glass fiber will be subjected, at a certain distance from the specimen ends, to the same stress that it would experience if the external force were applied to both the fiber and the coating, it is also true that, because of the buffering effect of the coating, the effective length of the fiber under testing, when the testing force is applied to the coating only, might be reduced appreciably in comparison with the fiber's actual length. a simple analytical stress model for the evaluation of this effect was developed [83] and was used to establish the appropriate minimum length of a dual-coated test specimen, so that the experimental data would be consistent and physically meaningful.
it has been found that it is the axial compliance of the secondary coating, which experiences the direct action of the external loading, and the interfacial compliance of the coating system that determine the buffering effect of this system. it was concluded that for any finite compliance of the coating, even a very low one, one could always employ a long enough specimen, in which the major mid-portion of the glass fiber would be loaded to practically the same level as in an infinitely long specimen, when the external force is distributed between the glass fiber and its coating proportionally to the axial rigidities of these structural elements. the developed model can be used for selecting the appropriate length of coated optical fiber specimens in reliability (proof) testing. it can be used also beyond the fiber optics technologies area, when composite structures of the type in question are employed and tested.
1.11. elastic stability of optical fiber interconnects
analytical models for the evaluation of the elastic stability of optical fiber interconnects have been developed
1) to understand the role of the nonlinear stress-strain relationship [84],
2) to assess the role of the hydrostatic pressure, if any [85],
3) to evaluate the role of the ends off-set [86],
4) to find out whether there is sufficient incentive for using thicker coatings for higher elastic stability [87],
5) to investigate the role of the finite length of the interconnect [88], [91] on the critical stress, including the situation when the interconnect is partially stripped of its coating [89], and
6) to analyze the effect of the lateral compliance of the interconnect on the level of the buckling forces [90].
in the brief discussion that follows we indicate some important physical aspects of some of the phenomena associated with the above efforts.
the analysis of the effect of the nonlinear stress-strain relationship on the elastic stability of optical glass fibers [84] has been carried out under the assumption that this relationship, obtained for the case of uniaxial tension, is also valid in the case of compression: just the sign in front of the nonlinear term in the formula for the strain-dependent young's modulus should be changed. it is clear that since the critical force is proportional, in accordance with the well-known euler formula, to the young's modulus of the material, and this modulus decreases with an increase in the compressive force, an approach that ignores such a reduction will overestimate the magnitude of the critical force, and, hence, will not be conservative. in the studies addressing low-temperature micro-bending of infinitely long dual-coated fibers and elastic stability of short bare fibers the role of the nonlinear stress-strain relationship has been evaluated for strains not exceeding 5%, and therefore it has been indicated that future experimental research should include evaluation of the nonlinear stress-strain relationship, both in tension and compression, for higher strains and for high-strength fibers, such as, e.g., fibers protected by metallic coatings. the author of this review is not aware of whether such research has been conducted. the analysis of the effect of the hydrostatic pressure in dual-coated optical fibers on the induced stresses in the fiber [85] has indicated that all the normal stresses in the fiber (radial, tangential, axial) are proportional to this pressure. it has been found also that hydrostatic pressure results in lower micro-bending losses. calculations of the elastic stability of coated fiber specimens subjected to compression were carried out using analytical modeling [86], [87] for 2 mm and 5 mm long interconnects for the cases of bare (uncoated) fibers, as well as for coated fibers with 62.5 μm and 187.5 μm thick coatings.
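the sign convention for the strain-dependent young's modulus described above can be sketched as follows. the linear-in-strain form E = E0*(1 + alpha*strain), the nonlinearity parameter alpha = 6, and the low-strain modulus of 72 gpa are assumptions made for illustration; the text itself does not state the formula or its constants. strain is taken as positive in tension and negative in compression, so the change of sign in front of the nonlinear term happens automatically.

```python
def strain_dependent_modulus(strain, e0=72.0e9, alpha=6.0):
    """Young's modulus (Pa) at a signed strain (tension > 0, compression < 0).

    Assumed form: E = E0 * (1 + alpha * strain). Both `e0` and `alpha`
    are illustrative values, not taken from the reviewed references.
    """
    return e0 * (1.0 + alpha * strain)


def euler_overestimate(compressive_strain, alpha=6.0):
    """Factor by which a constant-modulus Euler estimate exceeds the reduced one.

    The Euler force is proportional to E, so the ratio of the two estimates
    is E0 / E(strain), evaluated here at the given compressive strain magnitude.
    """
    return 1.0 / (1.0 - alpha * compressive_strain)


for strain in (0.005, 0.01, 0.02):
    stiffened = strain_dependent_modulus(+strain) / 1e9
    softened = strain_dependent_modulus(-strain) / 1e9
    factor = euler_overestimate(strain)
    print(f"strain {strain:.3f}: E_tension = {stiffened:.1f} GPa, "
          f"E_compression = {softened:.1f} GPa, overestimate x{factor:.3f}")
```

under these assumed constants the linear (constant-modulus) euler estimate is always the larger of the two, which is the non-conservative tendency noted in the text.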
the compressive, bending and total stresses in the glass fiber at the pre-buckling, buckling and post-buckling conditions were computed with consideration of the nonlinear stress-strain relationship in the silica material. it has been found that the stresses in the fiber are strongly dependent on its length and the coating thickness. the nonlinear stress-strain relationship plays, however, a minor role, unless the specimen is shorter than 2 mm.
the incentive for the evaluation of the effect of the length of a coated fiber, idealized as a beam lying on a continuous elastic foundation (provided by the coating system), on the critical stress in it [88], [90] is due to the fact that the critical (buckling) force for a beam, in the absence of an elastic foundation, is highly dependent on its length: in accordance with the euler formulas, this force is inversely proportional to the beam’s length squared and is proportional to the beam’s flexural rigidity. on the other hand, the critical force for a long enough beam lying on a continuous elastic foundation is independent of the beam’s length and, as is known from the theory of such beams, is equal to the doubled square root of the product of the spring constant of the foundation and the beam’s flexural rigidity. the following natural questions arise in this connection:
1) for what lengths do both the beam’s length and the spring constant of the foundation play a role and should be accounted for? in other words, if the beam on an elastic foundation is not long enough, how does its finite length affect, if at all, the critical force?
2) what role, if any, do the arrangements of the beam’s supports at its ends play, as far as the critical force is concerned, and is this role dependent on the beam’s length?
in other words, is the above mentioned well-known formula for the critical force for a long enough beam, calculated as the doubled square root of the product of the spring constant of the elastic foundation and the beam’s flexural rigidity, valid for any long enough beam lying on an elastic foundation, regardless of the arrangements of its end supports, or is this not always the case? the developed analytical model enabled one to obtain answers to these questions and, as a by-product, to provide practical guidance for designers of coated fiber interconnects. an easy-to-use and physically meaningful diagram [90] based on the developed analytical models has been suggested to determine stability/instability zones for the given compressive force, the spring constant of the foundation, the length of the beam (fiber) and its flexural rigidity. both the mechanical and thermally induced compressive forces were considered. it has also been shown that the critical force for a long enough beam with a free (unsupported) end is half of the magnitude of the force in a beam with both ends supported. the obtained solution has been extended to a fiber with a stripped-off coating at its end portion, when the stripped-off end of the fiber interconnect (connector) is subjected to compression [89]. a situation when the critical force for the coated portion of the fiber is equal to the critical force for its stripped-off portion was particularly addressed, and recommendations for the corresponding length of the elastically stable stripped-off portion have been suggested. the model developed for a cantilever beam lying on a continuous elastic foundation and subjected to the combined action of the concentrated compressive and lateral forces at the free end of the beam (coated fiber) [90] was used to explain the effect of the lateral compliance of such a beam (i.e., its propensity to deflect under the action of the given lateral force) on its elastic stability.
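the interplay between the beam’s length and the foundation stiffness discussed above can be illustrated numerically. the sketch below is not the model of [88]-[90]; it uses the classical textbook result for a beam with both ends hinged, in which the critical force is the minimum over the number of half-waves n of (nπ/l)²·ei + k·(l/(nπ))². the fiber diameter, young’s modulus and, in particular, the spring constant k are assumed values chosen for illustration only.

```python
import math

def euler_critical_force(ei, length):
    """Euler force for a hinged-hinged beam with no elastic foundation."""
    return math.pi ** 2 * ei / length ** 2

def long_beam_foundation_force(ei, k):
    """Length-independent critical force for a long beam on an elastic
    foundation: twice the square root of (spring constant * flexural rigidity)."""
    return 2.0 * math.sqrt(k * ei)

def finite_beam_foundation_force(ei, k, length, max_modes=500):
    """Classical hinged-hinged result for a finite beam on an elastic
    foundation: minimise over the number of buckling half-waves n."""
    return min(
        (n * math.pi / length) ** 2 * ei + k * (length / (n * math.pi)) ** 2
        for n in range(1, max_modes + 1)
    )

if __name__ == "__main__":
    # Assumed (hypothetical) inputs: bare silica fiber, 125 um diameter.
    diameter = 125e-6                      # m
    youngs_modulus = 72e9                  # Pa, typical value for silica
    inertia = math.pi * diameter ** 4 / 64
    ei = youngs_modulus * inertia          # flexural rigidity, N*m^2
    k = 1.0e6                              # N/m^2, assumed foundation constant

    p_long = long_beam_foundation_force(ei, k)
    for length_mm in (1, 2, 5, 10, 50):
        length = length_mm * 1e-3
        p_finite = finite_beam_foundation_force(ei, k, length)
        print(f"L = {length_mm:3d} mm: P_cr = {p_finite:.3f} N "
              f"(ratio to long-beam value {p_finite / p_long:.3f})")
```

under these assumptions the ratio printed in the last column shows at which lengths the finite-length correction matters and at which lengths the long-beam formula is adequate. other end conditions (e.g., a free end) require a different solution and are not covered by this sketch.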
it is clear that the flexural rigidity of the beam and the presence of a compressive or a tensile force are equally important when assessing the role of the lateral compliance. indeed, while a tensile axial force results in an increased effective flexural rigidity of the beam, a compressive force results in a lower effective flexural rigidity. in an extreme situation, when the compressive force is significant and becomes equal to its critical value, the beam buckles, i.e., its effective flexural rigidity becomes zero. in another extreme case, when the tensile force is large, the beam’s effective flexural rigidity increases, and a significant lateral force is needed to bend the beam. these phenomena can be used in fiber optics to increase, if necessary, the elastic stability (the critical force) by applying a tensile force to the fiber. this could be done, e.g., by placing the fiber into an enclosure whose cte is even lower than that of the fiber, say, in an enclosure built of carbon nano-tubes (cnts). as is known, at low and room temperatures, the cte of single-wall cnts in the axial direction could even be negative. on the other hand, if one intends to increase the lateral compliance of the fiber, a high expansion enclosure could be used. such an enclosure will apply compression to the silica fiber. the modeling technique could be similar to the one used in [30], where a fiber with an initial ends off-set was considered.
1.12. solder materials and joints, and fibers soldered into ferrules
solder materials and joints are as important in photonics and, particularly, in foe, as they are in microelectronics [92], [93]. there are, however, specific requirements for the solder materials and joints used in photonics. these requirements are associated with the ability to achieve high alignment, high yield stress, low propensity to creep, etc.
it has been shown [92] that a low expansion enclosure with a good thermal expansion (contraction) match with silica is not always the right choice (solution) from the standpoint of the thermally induced stresses in metalized fibers soldered into ferrules, and in the solder material itself. indeed, low expansion enclosures result in tensile radial stresses in the solder ring, and could lead to the delamination of the metallization from the fiber and/or to excessive tensile radial deformations in the solder. on the other hand, high expansion (contraction) enclosures might result in high compressive stresses in the solder material, and in unfavorable low cycle fatigue conditions during temperature cycling of the joint. the most feasible material of the enclosure and/or the thickness of the solder ring, and/or the physical properties of the solder material could and should be found based on the developed model.
1.13. dynamic response of optoelectronic structures to shocks and vibrations
numerous problems associated with the dynamic response of electronic and photonic structures to shocks and vibrations were addressed in refs. [94]-[106]. the major findings, conclusions and recommendations could be summarized as follows:
1) the maximum acceleration is typically used in electronics and photonics engineering as the major reliability criterion. it is suggested that this criterion can indeed be used in this capacity when the functional (electrical, optical, thermal) performance of the product is evaluated. it could be misleading, however, when structural (physical) reliability is critical [95]. it is the dynamic stress, and not the maximum acceleration (deceleration), that should be used as a suitable and adequate criterion of the dynamic strength of the material or a device. this stress may or may not be proportional to the maximum acceleration.
2) drop tests are often replaced in electronics and photonics engineering by shock tests, which are simpler to design and conduct, and whose results are easier to interpret. it has been found that such a replacement can be justified if the dynamic response of the device under test is as close to an instantaneous impact as possible [97], [98], [103];
3) electronic and opto-electronic systems are often tested “on the board level”. the model [99] contains an exact solution to a highly nonlinear equation for the principal coordinate for the dynamic response of a board to an impact (shock) loading. the model can be used to evaluate the dynamic response characteristics of the board (with surface mounted devices on it) that experiences highly nonlinear vibrations as a result of the shock impact applied to the board’s support contour in drop or shock tests. the model has been developed under an assumption that the size of the surface-mounted devices in the xy plane is small, so that the surface mounted devices do not change the flexural rigidity of the board, but contribute significantly to its mass and, hence, to the inertial forces.
4) electronic and photonic systems often experience periodic impacts that could be idealized and modeled as a train of instantaneous impulses [101]. the developed model enables one to evaluate the dynamic response of such systems to a train of periodic impacts, including the situation when such shocks generate quasi-chaotic vibrations in the system. smoluchowski’s (fokker-planck) equation is used to describe and to characterize the quasi-random vibrations caused in such a nonlinear system by periodic impulses.
1.14. new nano-particle material (npm) and its applications in fiber-optics
an advanced technology for making nano-particle material (npm) based optical silica fiber coatings has been developed under grants from darpa/navy [107]-[116].
the developed technology enables one to create ultra-thin, highly cost-effective, highly mechanically reliable, and highly environmentally durable coatings for silica light-guides. the obtained results have demonstrated the performance superiority of the developed technology over polymer-coated and metallized fibers, as well as the potential that the npm has for various commercial and military applications in micro- and opto-electronics and related areas. it can also have many attractive applications well beyond the “high-tech” field. this npm-based coating has all the merits of polymer and metal coatings, but is free of the majority of their shortcomings. the developed material is an unconventional inhomogeneous “smart” composite material, which is equivalent to a homogeneous material with the following major properties: 1) low young’s modulus, 2) immunity to corrosion, 3) good-to-excellent adhesion to adjacent material(s), 4) non-volatility, 5) stable properties at temperature extremes (from -220 °C to +350 °C), 6) very long (practically infinite) lifetime, 7) “active” hydrophobicity: the material provides a moisture barrier (to both water and water vapor), and, if necessary, can even “wick” moisture away from the contact surface; 8) ability for “self-healing” and “healing”: the npm is able to restore its own dimensions, when damaged, and is able to fill existing or developed defects (cracks and other “imperfections”) in contacted surfaces; 9) very low (near unity) effective refractive index (if needed). npm can be designed, depending on the application, to enhance those properties that are most important for the pursued application. the npm properties have been confirmed through testing. the tests have demonstrated the outstanding mechanical reliability, extraordinary environmental durability and, in particular applications, improved optical performance of the light-guide.
it is always desirable to provide application-specific modifications of the npm to master/optimize its properties and performance. because it is a nano-material, its surface chemistry and its performance depend strongly upon the contact materials and surfaces. the following npm applications are viewed as the most attractive ones.
1) npm is able to hermetically seal packages, components and devices, such as laser packages, mems, displays and plastic leds;
2) npm can be used as an effective protective coating for various metal and non-metal surfaces, well beyond the area of micro- and opto-electronics: in cars, aerospace structures, offshore and ocean structures, marine vehicles, civil engineering structures (bridges, towers, etc.), tubes, pipes and pipe-lines, etc. these applications benefit because the material is actively hydrophobic, does not induce additional stresses (owing to its low modulus), is inexpensive, is easy to apply, has a practically infinite lifetime, and is self-healing. application of this material can result in a significant resistance of a metal surface to corrosion and, in addition, in a substantial increase in the fracture toughness of the material, both initially and during the system’s operation (use);
3) the npm can be added to the formulation of various coatings, such as paints, thereby providing protective benefits without changing the application techniques;
4) because of its low refractive index, the npm can be used, if necessary, as an effective cladding of optical silica fibers. the use of the npm cladding eliminates the need to dope silica for obtaining light-guide cores. the new preform will consist of a single (undoped and, hence, less expensive) silica material;
5) a derivative application is flexible light-guides. multicore flexible fiber cables employing npm are able to provide high spatial image resolution.
as such, they might find important applications when there is a need to provide direct high-resolution image transmission from secluded areas. possible applications can be found in bio-medicine, nondestructive evaluations, oil and other geological explorations, in ocean engineering, or in other situations when an image needs to be obtained and transmitted from relatively inaccessible locations. in such applications, the plane (“butt”) end of the fiber bundle (cable) will play the role of a small size pixel array. the transmitted image can be concurrently or subsequently enlarged to a desirable size, as needed;
6) another derivative application is a multicore fiber cable. ultra-small diameter glass fibers with an npm-based cladding/coating can be placed in large quantities within an npm medium (“multiple cores in a single cladding”). in addition, owing to a much better inner-outer refractive index ratio in the npm-based fibers, such cables will be characterized by very low signal attenuation;
7) yet another derivative application is sensor systems. the npm-based fibers can be used in optical sensor systems that employ optical fibers embedded in a laminar or a cast material. such systems are used, e.g., in composite airframes. with the npm used as a cladding or, at least, as a coating of the silica optical fiber, the optical performance and the structural reliability of the light-guide will be improved dramatically compared with the conventional systems;
8) ultra-thin planar light-guides are yet another derivative application of the npm. in the new generation of planar light-guides, npm can be used as the top cladding material. it will replace the silicon or polymer claddings that are considered in today’s planar light-guides. all the advantages of the npm cladding material discussed above for optical fibers are equally applicable to planar light-guides. these are thought to have a “bright” future in the next generation of computers and other photonic devices.
a modification of the npm has been developed and tested as an attractive substitute for the existing hermetic and non-hermetic optical fiber coatings. the following major activities were undertaken and the following results were obtained:
1) the drawing (manufacturing) process and the drawing tower were adequately retrofitted to adjust them to the characteristics of the developed npm and to the npm layer application procedure;
2) the conducted mechanical tests have demonstrated remarkable strength (up to 7.5 GPa = 765 kgf/mm² = 1088 kpsi) and attractive quality (low strength variability) of the manufactured npm-based fibers. such high strength characteristics have never been achieved before, even in lab conditions;
3) the environmental tests have shown that even at the humidity level of 100% (samples were immersed in water for 24 hours) the mechanical strength of these fibers is on the order of the strength of the best quality fibers at the “dry” conditions in the previous tests;
4) there is reason to believe that the achieved performance is still not the limit of the npm-based technology and that higher fiber strengths and better environmental stability are feasible through further “fine tuning” and further optimization of the npm and the drawing procedure;
5) the optical performance of the npm-based fibers (in terms of the attenuation level) is almost two-fold better than the optical performance of the reference (existing) samples. the estimated lower limit of the attenuation of the npm-based optical fibers with a silica glass core and a stepwise refractive index change can potentially reach record values for the tested type of multi-mode fibers (getting even below 1 dB/km in a specific spectral “window”).
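the three strength figures quoted in item 2 above (gigapascals, kilograms-force per square millimeter and thousands of pounds per square inch) can be cross-checked with a short calculation. the sketch below uses only the standard definitions of the units; the function names are introduced here for illustration.

```python
# Standard unit definitions expressed in pascals.
KGF_PER_MM2_IN_PA = 9.80665e6       # 1 kgf/mm^2 (standard gravity 9.80665 m/s^2)
KPSI_IN_PA = 6.894757293168e6       # 1 kpsi = 1000 lbf/in^2

def gpa_to_kgf_per_mm2(stress_gpa):
    """Convert a stress in GPa to kgf/mm^2."""
    return stress_gpa * 1e9 / KGF_PER_MM2_IN_PA

def gpa_to_kpsi(stress_gpa):
    """Convert a stress in GPa to kpsi."""
    return stress_gpa * 1e9 / KPSI_IN_PA

if __name__ == "__main__":
    strength_gpa = 7.5  # value quoted in the text
    print(f"{strength_gpa} GPa = {gpa_to_kgf_per_mm2(strength_gpa):.1f} kgf/mm^2")
    print(f"{strength_gpa} GPa = {gpa_to_kpsi(strength_gpa):.1f} kpsi")
```

rounded to the nearest integer, both conversions agree with the values given in the text.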
the obtained results clearly demonstrated the performance superiority of the developed technology and a great potential (scientific, technological and commercial) of the future products, which makes the project attractive for commercialization.
1.15. some special foe problems
1. application of the mechanical approach to the evaluation of low-temperature added transmission losses in single-coated (jacketed) optical fibers [117] enables one, based on the developed analytical stress model, to evaluate the threshold of such losses from purely structural (mechanical) calculations, without resorting to much more complicated optical calculations or measurements. the model is based on the experimentally obtained evidence that the temperature threshold of the elevated added transmission losses coincides with the threshold of the elevated thermally induced (“hoop”) stresses applied by the polymer jacket to the silica fiber, and it has been confirmed by optical measurements. the model sheds light on the physics of the losses in question. the model can also be used to assess the incentive for employing a dual-coated system, in which the thermally induced pressure on the glass fiber will be reduced.
2. analytical models [118]-[120] were used to predict the thermal stresses in fused biconical taper (fbt) light-wave couplers. the stresses are caused by the thermal contraction mismatch of the high-expansion coupler and its low-expansion substrate. the challenge in the modeling is due to the non-prismaticity of the fbt structure and the nonlinear stress-strain relationship of the fbt material.
3. elevated lateral gradients of the ctes and young’s moduli (in the direction of the fiber diameter) can possibly be responsible for the fiber “curling” during drawing of optical silica fibers [121].
the analysis was carried out on the basis of both analytical and fea modeling, and an excellent agreement between the analytical modeling and fea data has been observed.
4. the apparatus and method for thermostatic compensation of temperature-change-sensitive opto-electronic devices [122] were also based on analytical modeling. in accordance with the invention, temperature-sensitive devices are mounted within a thermostatic structure that provides temperature compensation by applying compressive or tensile forces to stabilize the performance of the device across a significant operating temperature range. in a preferred embodiment, an optical fiber refractive index grating is thermostatically compensated to minimize changes in the reflection wavelength of the grating. various methods and devices are known in the art to compensate for temperature induced thermal expansion. the patent [122] provides the simplest and most effective solution to the thermal compensation problem, when regular and readily available materials can be used to solve the problem.
2. probabilistic design for reliability in fiber optics engineering
2.1. qualification testing (qt)
the short-term goal of a particular opto-electronic device manufacturer is to conduct and pass the established qt, without questioning whether the tests are adequate. the ultimate long-term goal of opto-electronic industries, whether aerospace, military, or commercial, regardless of a particular manufacturer or a product, is to make their deliverables reliable in actual operation. it is well known, however, that today’s electronic devices that passed the existing qt often fail in the field (in operation conditions). are the existing opto-electronic qt specifications adequate? do opto-electronic industries need new approaches to qualify their devices into products?
could the existing qt specifications and practices be improved to an extent that, if the device passed the qt, there is a quantifiable way to assure that its performance will be satisfactory? at the same time, there is a perception, perhaps a substantiated one, that some electronic products “never fail”. it is likely that such a perception exists because these products are superfluously durable, are more robust than is needed for a particular application and, as a consequence, are more costly than necessary. to prove that this is indeed the case, one has to find a consistent way to quantify the level of the opto-electronic product robustness in the field. then one could establish whether a possible and controlled reduction in the reliability level could be translated into a significant cost reduction.
2.2. probabilistic design for reliability (pdfr)
the probabilistic design for reliability (pdfr) concept enables one to provide affirmative answers to the above questions. the concept suggests that one 1) conducts highly focused and highly cost-effective failure-oriented accelerated testing (foat); 2) carries out simple and physically meaningful predictive modeling (pm) to understand the physics of failure; 3) predicts, using the results of the foat and pm carried out, the probability of failure (pof) in the field; 4) carries out sensitivity analyses (sa) to establish the acceptable pof; 5) revisits, reviews and revises the existing qt practices, procedures, and specifications; and 6) develops and widely implements the pdfr concept, methodologies and algorithms, considering that “nobody and nothing is perfect”, that the probability of failure is never zero, but could be predicted and, if necessary, minimized, controlled, specified and even maintained (assured) at an acceptable level. in effect, the only difference between a highly reliable and an insufficiently reliable product is “merely” in the level of the operational pof.
the prognostication and health monitoring (phm) approaches and techniques that are very popular today could be very helpful at all the stages of the design, manufacturing and operation of the product. the reliability evaluations and assurances cannot be delayed, however, until the device is made (although this is often the case in many current practices). reliability should be “conceived” at the early stages of the device design; implemented during manufacturing; qualified and evaluated by (electrical, optical, environmental and mechanical) testing at the design, product development and manufacturing stages; checked (screened) during production (by implementing an adequate burn-in process); and, if necessary and appropriate, monitored and maintained in the field during the product’s operation, especially at the early stages of the product’s use, by employing, e.g., technical diagnostics, prognostication and health monitoring (phm) methods and instrumentation.
three classes of engineering products, including opto-electronic and particularly fiber optics products, should be distinguished from the reliability point of view:
1) class i includes some military or aerospace objects, such as warfare, military aircraft, battle-ships, space-craft. cost is important, but is not a dominating factor;
2) class ii includes objects like long-haul communication systems, civil engineering structures (bridges, tunnels, towers), passenger elevators, ocean-going vessels, offshore structures, commercial aircraft, railroad carriages, cars, some medical equipment. the product has to be made as reliable as possible, but only for a certain specified level of demand (stress, loading);
3) class iii includes consumer products, commercial electronics, agricultural equipment. the typical market is the consumer market.
2.3.
reliability, cost effectiveness and time to market
reliability, cost effectiveness and time-to-market considerations play an important role in the design, materials selection and manufacturing decisions in commercial electronics, and are the key issues in competing in the global market-place, at least for class iii products. a company cannot be successful if its products are not cost effective, or do not have a worthwhile lifetime and service reliability to match the expectations of the customer. too low a reliability can lead to a total loss of business. product failures have an immediate, and often dramatic, effect on the profitability and even the very existence of a company. profits decrease as the failure rate increases. this is due not only to the increase in the cost of replacing or repairing parts, but, more importantly, to the losses due to the interruption in service, not to mention the losses due to reduced customer confidence and acceptance. these make obvious dents in the company’s reputation and, as a consequence, affect its sales. each business, whether small or large, should try to optimize its overall approach to reliability. “reliability costs money”, and therefore a business must understand the cost of reliability, both the “direct” cost (the cost of its own operations) and the “indirect” cost (the cost to its customers and their willingness to make future purchases and to pay more for more reliable products).
2.4. failure oriented accelerated testing (foat)
it is impractical and uneconomical to wait for failures when the mean-time-to-failure for a typical present-day electronic device (equipment) is on the order of hundreds of thousands of hours. accelerated testing (at) enables one to gain greater control over the reliability of a product. at has become a powerful means of improving reliability [3], [4].
this is true regardless of whether (irreversible or reversible) failures will or will not actually occur during the foat (“testing to fail”) or the qt (“testing to pass”). in order to accelerate the material’s (device’s) degradation and/or failure, one has to deliberately “distort” (“skew”) one or more parameters (temperature, humidity, load, current, voltage, etc.) affecting the device’s functional or mechanical performance and/or its environmental durability. at uses elevated stress levels and/or higher stress-cycle frequency as effective stimuli to precipitate failures over a short time frame. the “stress” in reliability engineering (re) does not necessarily have to be mechanical or thermo-mechanical: it could be electrical current or voltage, high (or low) temperature, high humidity, high frequency, high pressure or vacuum, cycling rate, or any other factor (stimulus) responsible for the reliability of the device or the equipment. at must be specifically designed for the product under test. the experimental design of at should consider the anticipated failure modes and mechanisms, typical use conditions, and the required or available test resources, approaches and techniques. some of the most common at conditions (stimuli) are: high temperature (steady-state) soaking/storage/baking/aging/dwell; low temperature storage; temperature (thermal) cycling; power cycling; power input and output; thermal shock; thermal gradients; fatigue (crack propagation) tests; mechanical shock; drop shock (tests); random vibration tests; sinusoidal vibration tests (with the given or variable frequency); creep/stress-relaxation tests; electrical current extremes; voltage extremes; high humidity; radiation (uv, cosmic, x-rays, alpha particles); space vacuum.
2.5. qualification testing (qt) and failure oriented accelerated testing (foat)
qt is a must. industry cannot do without qt. its objective is to prove that the reliability of the product-under-test is above a specified level.
qt enables one to “reduce to a common denominator” different products, as well as similar products produced by different manufacturers. qt reflects the state-of-the-art in a particular field of engineering, and the typical requirements for the product performance. however, if a product passes today’s qt for opto-electronic products, it is not always clear why it was good, and if it fails the tests, it is usually equally unclear what could be done to improve its reliability. since qt is not failure oriented, it is unable to provide the most important ultimate information about the reliability of the product: the reliability physics behind the failure and the pof after the given time in service under the given operation conditions. foat, on the other hand, is aimed, first of all, at revealing and understanding the physics of the expected or occurred failures. that is why it could be referred to as knowledge oriented testing. unlike qt, foat is able to detect the possible failure modes and mechanisms. foat end points are cycles or durations that are scaled to the use environments. another possible objective of the foat is, time permitting, to accumulate failure statistics. thus, foat deals with the two major aspects of re: the physics and the statistics of failure. adequately planned, carefully conducted, and properly interpreted foat provides a consistent basis for the prediction of the pof after the given time in service. this information can be helpful in understanding what should be changed to design a viable and reliable product, because any structural, materials and/or technological improvement can be “translated”, using the foat data, into the pof for the given duration of operation under the given service (environmental) conditions. well-designed and thoroughly implemented foat can dramatically facilitate the solutions to many engineering and business-related problems associated with cost effectiveness and time-to-market.
foat should be conducted in addition to the qt. there might also be situations when foat can be used as an effective substitute for the qt, especially for new products, when acceptable qualification standards do not yet exist. while it is the qt that makes a device into a product, it is the foat that enables one to understand the reliability physics behind the product and, based on the appropriate pm, to create a reliable product with the predicted or even specified pof.
2.6. burn-in testing (bit) as a special type of failure oriented accelerated testing (foat)
burn-in (“screening”) testing (bit) is widely implemented to detect and eliminate infant mortality failures. bit could be viewed as a special type of manufacturing foat. bit is needed to stabilize the performance of the device in use. bit is supposed to stimulate failures in defective devices by accelerating the stresses that will cause these devices to fail, without damaging good items. the bathtub curve of a device that has undergone bit is supposed to consist of the steady-state and wear-out portions only. the rationale behind the bit is based on a concept that mass production of electronic devices generates two categories of products that passed qt: 1) robust (“strong”) components that are not expected to fail in the field and 2) relatively unreliable (“weak”) components (“freaks”) that, if shipped to the customer, will most likely fail in the field.
2.7. failure oriented accelerated testing (foat): predictive modeling (pm)
foat cannot do without simple and meaningful predictive models. it is on the basis of such models that one decides which parameter should be accelerated, how to process the experimental data and, most importantly, how to bridge the gap between what one “sees” as a result of the accelerated testing and what he/she will possibly “get” in the actual operation conditions.
by considering the fundamental physics that might constrain the final design, pm can result in significant savings of time and expense and shed additional light on the physics of failure. pm can be very helpful in predicting reliability at conditions other than those of the foat and can provide important information about the device performance. modeling can be helpful in optimizing the performance and lifetime of the device, as well as in arriving at the best compromise between reliability, cost effectiveness and time-to-market. a good foat pm does not need to reflect all the possible situations, but should be simple, should clearly indicate what affects what in the given phenomenon or structure, and should be suitable (flexible) for new applications, new environmental conditions and new technology developments, as well as for the accumulation, on its basis, of reliability statistics. the scope of the model depends on the type and the amount of information available. a foat pm does not have to be comprehensive, but has to be sufficiently generic and should include all the major variables affecting the phenomenon (failure mode) of interest. it should contain all the most important parameters that are needed to describe and characterize the phenomenon of interest, while parameters of the second order of importance should not be included in the model.
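as a hedged illustration of how a simple predictive model can bridge accelerated-test and field conditions, the python sketch below evaluates the acceleration factor of the classical boltzmann-arrhenius relation, `af = exp[(ea / k) * (1 / t_use - 1 / t_test)]`, with temperatures in kelvin. the function name, the activation energy of 0.7 ev and the use and test temperatures of 55 and 125 degrees celsius are assumptions chosen only for illustration; they are not values from this paper, and the extensions of the arrhenius model that the paper refers to are not reproduced here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant, eV/K


def arrhenius_acceleration_factor(ea_ev, t_use_c, t_test_c):
    """Acceleration factor between a test and a use temperature.

    ea_ev    -- activation energy in eV (assumed, must come from test data)
    t_use_c  -- use (field) temperature in degrees Celsius
    t_test_c -- accelerated-test temperature in degrees Celsius
    """
    t_use_k = t_use_c + 273.15    # convert to kelvin
    t_test_k = t_test_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_test_k)
    return math.exp(exponent)


# Hypothetical inputs, for illustration only.
af = arrhenius_acceleration_factor(ea_ev=0.7, t_use_c=55.0, t_test_c=125.0)
print(f"acceleration factor: {af:.1f}")
print(f"1000 test hours correspond to about {1000.0 * af:.0f} field hours")
```

under this model, a time-to-failure observed at the test temperature is multiplied by the acceleration factor to estimate the time-to-failure at the use temperature. the result is very sensitive to the assumed activation energy, which in practice has to be established experimentally for the failure mechanism of interest.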
the most widespread foat pm are: power law (used when the physics of failure is unclear); boltzmann-arrhenius' equation (used when there is a belief that the elevated temperature is the major cause of failure) and its numerous extensions; coffin-manson's and related equations; crack growth equations (used to assess the fracture toughness of brittle materials); miner-palmgren's rule (used to consider the role of fatigue when the yield stress is not exceeded); creep rate equations; weakest link model (used to evaluate the mttf in extremely brittle materials with defects); and the stress-strength interference model, which is, perhaps, the most flexible and well substantiated model.

2.8. safety factor (sf)

direct use of the probability of non-failure is often inconvenient, since, for highly reliable items, this probability is expressed by a number which is very close to one, and, for this reason, even significant changes in the item's (system's) design, which have an appreciable impact on the item's reliability, may have a minor effect on the probability of non-failure. in those cases when both the mean value, $\langle\psi\rangle$, and the standard deviation, $s_\psi$, of the margin of safety (or any other suitable characteristic of the item's reliability, such as stress, time-to-failure, temperature, displacement, affected area, etc.) are available, the safety factor (safety index, reliability index) sf can be used as a suitable reliability criterion. if the probability distribution density $f(\psi)$ of the random safety margin $\psi$ for the ttf is anticipated or established, then the mean value $\langle\psi\rangle$ and the standard deviation $s_\psi$ of this margin can be determined as

$$\langle\psi\rangle = \int_0^\infty f(\psi)\,\psi\,d\psi, \qquad s_\psi = \sqrt{\int_0^\infty f(\psi)\,\bigl(\psi - \langle\psi\rangle\bigr)^2\,d\psi},$$

and the corresponding sf can be evaluated as $\mathrm{sf} = \langle\psi\rangle / s_\psi$.
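the definitions of the mean, the standard deviation and the sf can be checked numerically. the python sketch below integrates an assumed density of the safety margin with the trapezoid rule and returns the mean, the standard deviation and their ratio. the rayleigh density, its scale parameter, the finite upper integration limit and the function names are hypothetical choices made only for illustration; the paper does not prescribe a particular distribution.

```python
import math


def safety_factor(pdf, upper, n=20_000):
    """Return (mean, std, sf) for a safety margin with density `pdf`.

    The integrals over [0, infinity) are approximated by the trapezoid
    rule on [0, upper]; `upper` must be large enough that the density
    is negligible beyond it.
    """
    h = upper / n

    def integrate(g):
        total = 0.5 * (g(0.0) + g(upper))
        for i in range(1, n):
            total += g(i * h)
        return total * h

    # Renormalize to compensate for the truncated upper limit.
    norm = integrate(pdf)
    mean = integrate(lambda x: pdf(x) * x) / norm
    var = integrate(lambda x: pdf(x) * (x - mean) ** 2) / norm
    std = math.sqrt(var)
    return mean, std, mean / std


def rayleigh_pdf(x, sigma):
    """Rayleigh density, used here only as an assumed example."""
    return (x / sigma**2) * math.exp(-(x**2) / (2.0 * sigma**2))


# Hypothetical safety margin: Rayleigh distributed with scale 2.0.
mean, std, sf = safety_factor(lambda x: rayleigh_pdf(x, 2.0), upper=40.0)
print(f"mean = {mean:.4f}, std = {std:.4f}, sf = {sf:.4f}")
```

for a rayleigh density the ratio of the mean to the standard deviation equals sqrt(pi / (4 - pi)) regardless of the scale parameter, which gives a simple analytical check of the numerical result.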
the sf establishes both the upper limit of the reliability characteristic of interest (through the mean value of the corresponding margin of safety) and the accuracy with which this characteristic is defined (through the corresponding standard deviation). the structure of the sf indicates that it is acceptable for a system characterized by a high mean value of the safety margin (i.e., a system whose bearing capacity with respect to a certain stress/reliability characteristic is significantly higher than the level of loading) to have a less accurately defined deviation from this mean value than a system characterized by a low mean value of the safety margin (i.e., a system whose bearing capacity is much closer to the possible level of loading). in other words, the uncertainty in the evaluation of the safety margin should be smaller for a more vulnerable design.

2.9. do opto-electronic (oe) industries need new approaches to qualify their devices into products?

it should be widely recognized that the probability of a failure is never zero, but could be predicted and, if necessary, controlled and maintained at an acceptably low level. one effective way to achieve this is to implement the existing methods and approaches of prm techniques and to develop adequate pdfr methodologies. these methodologies should be based mostly on foat and on a widely employed predictive modeling effort. foat should be carried out in a relatively narrow but highly focused and time-effective fashion for the most vulnerable elements of the design of interest. if the qt has a solid basis in foat, pm and pdfr, then there is reason to believe that the product of interest will be sufficiently robust in the field. the qt could be viewed as "quasi-foat", a sort of "initial stage of foat" that more or less adequately replicates the initial non-destructive, yet full-scale, stage of foat.
we expect that the suggested approach to the dfr and qt will be accepted by the engineering and manufacturing communities, implemented in engineering practice and adequately reflected in future editions of the qt specifications and methodologies. the pdfr-based qt will still be non-destructive. such qts could be designed, therefore, as a sort of mini-foat that, unlike the actual, "full-scale" foat, is non-destructive and conducted on a limited scale. the duration and conditions of such "mini-foat" qt should be established based on the observed and recorded results of the actual foat, and should be limited to the stage when no failures in the actual full-scale foat were observed. prognostics and health management (phm) technologies (such as "canaries") should be concurrently tested to make sure that the safe limit is not exceeded.

it is important to understand the reliability physics that underlies the mechanisms and modes of failure in electronics and photonics components and devices. no statistics can replace an understanding of the reliability physics underlying a particular design and its modes of failure. statistical assessments could and should be conducted when there is good reason to believe that an adequately reliable product is on the way. as to the foat, it should be thoroughly implemented, so that the qt is based on the foat information and data. the pdfr concept should be widely employed. since foat cannot do without predictive modeling, the role of such modeling, both computer-aided and analytical, is essential in making the suggested new approach to product qualification practical and successful.

3. conclusion

the application of the methods and approaches of materials physics and structural analysis can be very helpful in creating viable and reliable fiber optics products and networks.
the probabilistic design for reliability (pdfr) concept enables one to design and fabricate a viable and reliable optoelectronic product.

facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp. i i

editorial

since my appointment as the new editor-in-chief of facta universitatis: series electronics and energetics in october 2013, we have published a series of three special anniversary issues dedicated to the journal's majestic age of a quarter of a century, one special issue devoted to the internet of things, an emerging paradigm and a cutting-edge technology, as well as 9 regular issues. over the past three years, we have received submissions and published papers from a very broad geographical area, making facta universitatis: series electronics and energetics a truly international journal.
the published papers in both special and regular issues have not only met the goals consistent with our focused aims, but have surpassed our expectations in quality and practical value. as a consequence, facta universitatis: series electronics and energetics has recently been selected for coverage in thomson reuters’ products and services, and beginning with all content published in 2016, the journal will be indexed and abstracted in the recently launched emerging sources citation index (esci). note that journals in esci have passed an initial editorial evaluation and can continue to be considered for inclusion in scie, which has a rigorous evaluation procedure and selection criteria. therefore, our job is not finished yet and the journal will have to be developed and improved further. we will continue to insist that all published papers are of high quality and practical value, thus leading to their worldwide citation, i.e. to the journal’s inclusion in scie. this is the fun part of this job; often it is the journey that is more enjoyable than the destination itself. as the editor-in-chief, i, along with our editorial team, promise to continue to develop and improve facta universitatis: series electronics and energetics in order to keep it at the forefront of science and technology.

ninoslav stojadinović
editor-in-chief

facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp. 647-651
doi: 10.2298/fuee1604647l

transient voltage suppressor based on diode-triggered low-voltage silicon controlled rectifier

xiang li, shurong dong, zhihui yu, jie zeng, weihuai wang
institute of photonics and microelectronics, department of information sciences and electronic engineering, zhejiang university, hangzhou, china

abstract. transient voltage suppressors (tvs) have been widely used for esd protection of electronic systems. a good tvs is usually costly, as it needs some special processes and extra masking layers for fabrication.
a novel tvs design based on the standard cmos process would be much more attractive. this work proposes a new tvs device using a cmos compatible diode-triggered silicon controlled rectifier (dlvtscr) as the core device. due to the availability of multiple trigger mechanisms and the dual current paths for bypassing the esd current, the newly proposed device is able to sink an esd current of over 10 a. in addition, the holding voltage is raised to 6.83 v and the trigger voltage is lowered to 10.8 v, which is well suited for most portable device applications.

key words: tvs, esd, lvtscr

received june 30, 2015; received in revised form march 12, 2016
corresponding author: shurong dong, institute of photonics and microelectronics, department of information sciences and electronic engineering, zhejiang university, hangzhou, china (e-mail: dongshurong@zju.edu.cn)

1. introduction

the integrated circuits (ics) used in modern mobile electronic devices are faster, more powerful, consume less power, and are much smaller than ever before. however, they are more vulnerable to reliability issues, not only due to the small device size and the use of ultrathin gate oxide, but also because their applications expose the devices more frequently to electrostatic discharge (esd) events produced by frequent human interfacing and by the repeated plugging and unplugging of usb devices and hdmi ports. on-chip protection is now of vital importance for system reliability. however, the conventional protection scheme is not only costly and bulky, but also degrades the system performance [1]-[5]. transient voltage suppressor (tvs) diodes have long been used to provide highly robust system-level esd protection [6]-[8]. under normal operating conditions, the tvs diode remains in a high impedance state. during a transient discharge event, the tvs breaks down electrically and provides a low impedance shunt path to bypass the transient current. a good tvs protection circuit must be able to divert the transient current and to clamp the transient voltage below the threshold value before the protected ic fails. a tvs structure includes a core device and some steer diodes. the clamping voltage usually depends on the core device, while the steer diodes divert the esd current to the core device and reduce the overall capacitance of the structure. however, to obtain a good tvs diode, some special processes, such as deep trench isolation or additional masking layers, are required. this work attempts to develop a cmos compatible tvs device. fig. 1 shows the conventional tvs based on a zener diode (a) and the newly proposed tvs structure (b).

fig. 1 conventional tvs structure based on zener diode and the proposed tvs structure based on the standard cmos process.

for protecting the interface for data line communications, a good tvs device must possess some special features. first, a low working voltage is crucial for safeguarding submicron integrated circuits. the maximum reverse working voltage, vrwm, which is the largest allowable dc voltage that can maintain the tvs in the non-conducting state, is the key parameter for specifying the working voltage. when the transient voltage exceeds vrwm, the tvs turns on quickly and a low impedance path is established to divert the transient current. hence, a low working voltage is essential for clamping the transient voltage to a level well below the threshold value. second, the equivalent capacitance of the tvs should be low enough to preserve signal integrity at the high-speed interface. if the capacitance of the tvs diodes is too high, it loads the circuit excessively, which results in signal distortion and data errors.

2. structure and performance

the schematic equivalent circuit and the cross-sectional view of the diode-triggered lvtscr structure are shown in fig. 2.
lvtscr, using the gated p-well structure, has been widely used as an esd protection device because of its suitable holding voltage and low trigger voltage. however, the gate structure also plays an important role in the reliability of the device. by adding an extra diode connecting the anode and cathode of the conventional lvtscr, the structure can be triggered by an esd event more easily. when an esd event occurs, the pn junction between the drain and the substrate of the ggnmos is first driven into avalanche breakdown, and the voltage drop across the diode increases as the avalanche current increases. meanwhile, the electrons in the n+ region (the one between the n-well and p substrate) diffuse into the n-well. when the voltage drop across the diode rises above 0.7 v, the bipolar transistor (q1) is turned on. this in turn switches on the scr, owing to the positive feedback between transistors q1 and q2.

this device has been taped out in a 0.18 μm cmos process. to study the characteristics of this new structure, transmission line pulse (tlp) measurements using pulses with a rise time of 10 ns and a pulse width of 100 ns were conducted. fig. 3 shows the comparison of tlp characteristics for the conventional and the diode-triggered lvtscr. compared with the conventional lvtscr, the new diode-triggered lvtscr structure exhibits a lower parasitic resistance (calculated as dv/di), because the current conduction path in the newly proposed structure is now formed by the ggnmos together with the scr. as shown in fig. 3, the trigger voltage decreases from 8.94 v to 7.82 v, whereas the holding voltage increases from 2.01 v to 3.21 v. in addition, the failure current, it2, also increases from 3.17 a to 4.05 a because of the availability of two current conduction paths.

fig. 2 cross-sectional view (a) and the schematic equivalent circuit (b) of diode-triggered lvtscr.

fig. 3 tlp results of conventional lvtscr and diode-triggered lvtscr.

as shown in fig. 4, when the value of d (the distance between the drain-side edge of the gate and the drain contact, see fig. 2(a)) increases from 0.85 μm to 2.35 μm, it2 increases from 2.05 a to 3.1 a. when the drain contact is close to the poly gate (when d is small), the heat produced at the drain junction spreads isotropically to the contact metal and results in a lower failure current level [8]. hence, a larger separation between the contact and the poly gate helps to increase the failure current level. on the other hand, this device behaves like a diode when a reverse voltage is applied to it.

after investigating the standalone dlvtscr, a tvs using the dlvtscr as the core device was realized, and the tlp test result is shown in fig. 5. taking i/o1 as an example, when an esd strike is applied from i/o1 to gnd, the esd current is released through the steer diode d1 and the dlvtscr to gnd. as shown in fig. 5, the tvs structure presents a higher holding voltage of about 6.83 v and an acceptable trigger voltage of about 10.8 v. these values should be acceptable for esd protection of high-speed digital interfaces such as usb2.0, hdmi, avi ports, etc., in portable equipment.

fig. 4 tlp characteristics of diode-triggered lvtscr as a function of device spacing (d) between the gate and drain contact of the ggnmos of the lvtscr.

fig. 5 comparison of tlp characteristics of a standalone dlvtscr and a tvs device embedded with a dlvtscr.

3. conclusion

this paper attempts to incorporate a diode-triggered low-voltage silicon controlled rectifier into a tvs.
the results show that the larger the distance between the gate edge and the drain contact, the higher the esd failure current (it2) that can be obtained. the tvs structure was further verified with the standard cmos process and good robustness was obtained. this structure can be used for system-level esd protection of high-speed digital interfaces such as usb2.0, hdmi, avi ports, and so on.

acknowledgement: this work was supported by the national natural science foundation of china (no. 61171038, 61204124). the authors thank the innovation platform for micro/nano device and system integration and cyrus tang centre for sensor materials and applications at zhejiang university.

references

[1] c. ito, k. banerjee, r. w. dutton, "analysis and design of distributed esd protection circuits for high-speed mixed-signal and rf ics", ieee trans. electron devices, vol. 49, pp. 1444-1454, 2002.
[2] j.-h. ko, k.-y. kim, j.-s. jeon, c.-h. jeon, c.-s. kim, k.-t. lee, h.-g. kim, "system-level esd on-chip protection for mobile display driver ic", in proc. of the sympo. of electrical overstress/electrostatic discharge, 2011, pp. 1-8.
[3] a. jahanzeb, l. lou, c. duvvury, c. torres, s. morrison, "tlp characterization for testing system level esd performance", in proc. of the sympo. electrical overstress/electrostatic discharge, 2010, pp. 1-8.
[4] k. shrier, t. truong, and j. felps, "transmission line pulse test methods, test techniques and characterization of low capacitance voltage suppression device for system level electrostatic discharge compliance", in proc. of the sympo. electrical overstress/electrostatic discharge, 2004, pp. 1-10.
[5] h. gossner, w. simbürger, m. stecher, "system esd robustness by co-design of on-chip and on-board protection measures", microelectronics reliability, vol. 50, no. 9-11, pp. 1359-1366, 2010.
[6] m. hove, t. o. sanya, a. j. snyders, i. r. jandrell and h. c. ferreira, "the effect of type of transient voltage suppressor on the signal response of a coupling circuit for power line communications", africon, 2011.
[7] s. s. choi, d. h. cho, k. h. shim, "development of transient voltage suppressor device with abrupt junctions embedded by epitaxial growth technology", electron. mater. lett., vol. 5, pp. 59-62, jun 2009.
[8] r. n. rountree and c. l. hutchins, "nmos protection circuitry", ieee trans. electron devices, vol. 32, pp. 910-917, 1985.

facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp. 721-732
doi: 10.2298/fuee1604721k

contribution to calculating the impedance of grounding electrodes using circuit equivalents

andrijana kuhar, leonid grcev
ss. cyril and methodius university, feit, skopje, macedonia

abstract. in this paper the dynamic behavior of grounding electrodes is investigated by calculating their impedance to ground. the calculations are performed by implementing popular modelling approaches, including the circuit, transmission line, and electromagnetic field (emf) models. the paper focuses on the circuit based method (cbm). the results from the rigorous emf model are used as a reference for determining the validity range of the other models. numerically obtained curves for the frequency-dependent impedance to ground are presented in several figures for various electrode lengths and soil characteristics.

key words: equivalent circuits, emf model, grounding electrode, impedance

1. introduction

equivalent circuits are frequently used in the analysis of grounding systems due to the simplicity of modelling they offer [1]-[5]. the impedance to ground is one of the most important characteristics of any grounding system. in this paper the circuit based method (cbm) from [5] is used to determine the grounding impedance of a perfect conductor placed in imperfect ground, using the thin wire approximation.
the obtained results for the harmonic impedance to ground of the conductor are compared to results obtained by implementation of a lumped r-l-c circuit, a transmission line (tl) model, and the reference emf model from [6]. a thorough verification of the cbm method implemented in this paper has previously been performed by the authors [7], [10].

received january 23, 2016; received in revised form march 9, 2016
corresponding author: andrijana kuhar, ss. cyril and methodius university, feit, skopje, macedonia (e-mail: kuhar@feit.ukim.edu.mk)
* an earlier version of this paper was presented at the 12th international conference on applied electromagnetics (пес 2015), august 31 - september 2, 2015, in niš, serbia [1].

2. methods for determination of the grounding impedance

2.1. cbm model

the cbm method was first implemented by otero, cidras and alemo in 1999 [5]. this method operates in the frequency domain and creates an equivalent circuit that takes into account all the inductive, capacitive, and conductive couplings between the conductor segments. propagation effects on the em fields are not considered in [5]. a newer, expanded version of the cbm method was proposed by visacro and silveira in 2004 [8], in which the authors introduced time propagation by multiplying the classical cbm equations by the term $e^{-\gamma r}$, where $\gamma$ is the propagation constant of the medium and $r$ is the distance between the point of interest and the source point. the enhanced cbm method from [8] was called the hybrid electromagnetic model (hem).

the system analyzed in this paper consists of a perfect conductor placed in imperfect ground. the length of the conductor is denoted by $\ell$ and its radius by $a$. the conductor is segmented for the needs of the cbm method, and one of the segments is presented in fig. 1.

fig. 1 geometry of the conductor

in fig. 1, σ, ε and μ are the conductivity, permittivity and permeability of the soil, while the corresponding characteristics of the air are σ0, ε0 and μ0. indexes j and m mark the number of the corresponding node, while index k marks the corresponding segment. the thin wire approximation requires the length of the conductor to be significantly larger than its radius ($\ell \gg a$). the solution for the potential distribution of the electrode is based on conventional nodal analysis, which is represented by the following matrix equations

$$[I_s] = [Y][V], \qquad [Y] = [Q]^{T}[G][Q] + [A]^{T}[Z]^{-1}[A] \tag{1}$$

where $[Y]$ is the admittance matrix, $[Z]$ is the impedance matrix, $[G]$ is the conductance matrix, $[Q]$ is a matrix containing relations between nodal and segment potentials and $[I_s]$ is the current source vector. solving equation (1) yields the potential distribution $[V]$ (of every node) along the conductor. the conductor is excited by a harmonic current source at the first node, and the impedance to ground is calculated as the ratio of the voltage and the current at the conductor’s first node. the presence of the air-earth interface is taken into account by implementing the quasi-dynamic image theory [9]. for the purpose of this research the cbm method is implemented in the classical form [5] and in the hybrid form [8]. more details on the extraction of the impedances and conductances from maxwell’s equations and on the approximated relation between nodal and segment potentials can be found in [7].

the validity of the cbm method applied to complex grounding networks has been thoroughly investigated by the authors [7], [10]. one such network is the e. balaidos ii substation grounding (located close to the city of vigo, spain).
the analyzed network has dimensions of 80 x 60 m and is constructed with 107 horizontal copper bars of 1.28 cm diameter buried at 80 cm depth and a set of 67 vertical copper clad steel rods of 1.4 cm diameter and 2.5 m length. a 3d view of the grounding grid and the profile on the surface of the ground used for calculations are presented in fig. 2. a small portion of the verification results is presented in fig. 3.

fig. 2 geometry of the balaidos grounding network and the position of the calculation profile.

the verification process included results for the potential along the profile when the frequency of the injected current is low (50 hz) and higher (100 khz) [10]. the current was injected in a corner of the grounding grid (point o1 in fig. 2). the resistivity of the soil was ρ = 50 Ωm. the obtained results are compared graphically in fig. 3.

fig. 3 potential along the profile.

2.2. lumped circuit (r-l-c) approach

the second method compared in this paper is the lumped circuit (r-l-c) approach. the main purpose of this method is to replace the grounding conductor by its input impedance, i.e. its impedance to remote neutral ground. at low frequencies, the input impedance is represented by a single resistor and at high frequencies by a lumped r-l-c circuit, fig. 4.

fig. 4 low- and high-frequency equivalent lumped circuit of the grounding conductor.

the expressions used for the circuit parameters of a vertical grounding rod are taken from reference [11].

$$R = \frac{1}{2\pi\sigma\ell}\log\frac{2\ell}{a}\ (\Omega), \qquad C = \frac{2\pi\varepsilon\ell}{\log(2\ell/a)}\ (\mathrm{F}), \qquad L = \frac{\mu_0\ell}{2\pi}\log\frac{2\ell}{a}\ (\mathrm{H}) \tag{2}$$

2.3. tl model

the third method compared in this paper is the tl, or distributed circuit parameters, method, fig. 5. this method assumes transverse-electromagnetic (tem) propagation on a perfect infinite conductor in a homogeneous medium and neglects the effects of the earth-air interface.

fig. 5 discrete approximation of the distributed circuit representation of the grounding conductor.
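as a rough numerical illustration of the lumped parameters in eq. (2), the following python sketch evaluates r, l and c for a vertical rod and the input impedance of the resulting circuit. it is only a sketch under stated assumptions: the log(2l/a) form of the expressions is an assumed reading of eq. (2), the high-frequency circuit is assumed to be the inductance in series with the parallel r-c branch, the relative permittivity of 10 in the example run is assumed for the vertical rod, and the function names are hypothetical.

```python
import cmath
import math

MU0 = 4e-7 * math.pi      # permeability of free space (H/m)
EPS0 = 8.854187817e-12    # permittivity of free space (F/m)


def lumped_rlc(length, radius, rho, eps_r):
    """Lumped R (ohm), L (H), C (F) of a vertical rod, all inputs in SI units.

    Assumed reading of eq. (2): every parameter shares the factor log(2*length/radius).
    """
    if length <= radius:
        raise ValueError("thin wire approximation needs length >> radius")
    k = math.log(2.0 * length / radius)              # common geometric factor
    r = rho * k / (2.0 * math.pi * length)           # R = rho/(2 pi l) * log(2l/a)
    l_ind = MU0 * length * k / (2.0 * math.pi)       # L = mu0 l/(2 pi) * log(2l/a)
    c = 2.0 * math.pi * eps_r * EPS0 * length / k    # C = 2 pi eps l / log(2l/a)
    return r, l_ind, c


def lumped_impedance(freq, r, l_ind, c):
    """Input impedance of the ASSUMED topology: L in series with (R parallel to C)."""
    jw = 1j * 2.0 * math.pi * freq
    return jw * l_ind + r / (1.0 + jw * r * c)


if __name__ == "__main__":
    # 3 m rod, 1.25 cm radius, 30 ohm*m soil; eps_r = 10 is an assumed value
    r, l_ind, c = lumped_rlc(length=3.0, radius=0.0125, rho=30.0, eps_r=10.0)
    for f in (50.0, 1e5, 1e6):
        z = lumped_impedance(f, r, l_ind, c)
        print(f"f = {f:9.0f} Hz  |Z| = {abs(z):.3f} ohm  "
              f"phase = {math.degrees(cmath.phase(z)):.1f} deg")
```

with these expressions the product r·c equals ρε and the product l·c equals μ0·ε·l², whatever the rod geometry, which gives a quick consistency check on any implementation.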
the distributed circuit parameters (per unit length) are obtained from the lumped circuit parameters (2) in the following manner

$$R' = \frac{1}{G'} = R\,\ell\ (\Omega\,\mathrm{m}), \qquad C' = \frac{C}{\ell}\ (\mathrm{F/m}), \qquad L' = \frac{L}{\ell}\ (\mathrm{H/m}) \tag{3}$$

fig. 5 presents a discrete approximation of the distributed-parameter circuit, where each segment of the grounding conductor is represented by an r-l-c section. identical parameters are used for each section. the impedance to ground of the grounding electrode is, in fact, the input impedance of the transmission line open at the lower end [12].

$$Z = Z_0 \coth(\gamma\ell), \qquad Z_0 = \sqrt{\frac{j\omega L'}{G' + j\omega C'}}, \qquad \gamma = \sqrt{j\omega L'\,(G' + j\omega C')} \tag{4}$$

2.4. emf approach

the emf method is used to investigate the validity range of the methods presented in the previous subsections. the reference results are obtained by rigorous method of moments calculations of the mixed potential integral equation [8], implemented on the identical system. for the perfect conductor from fig. 1, the above mentioned equation in the frequency domain expresses the z component (tangential) of the electric field as

$$E_z = -j\omega \int_{\ell} G_A\, I(z')\, dz' + \frac{1}{j\omega}\frac{d}{dz}\int_{\ell} \frac{dI(z')}{dz'}\, G_V\, dz' \tag{5}$$

the longitudinal current $I(z')$ is then expanded as a linear combination

$$I(z') = \sum_{n=1}^{N} F_n I_n \tag{6}$$

where $I_n$ are the unknown current values on every segment and $F_n$ are triangular basis functions. the following matrix equation yields the current distribution [5], [13].

$$[Z][I] = [-Z_s I_s] \tag{7}$$

where the array $[I]$ represents the currents to be determined, $[Z]$ is the impedance matrix of mutual impedances between each of the current elements, $[-Z_s I_s]$ represents the energization array, and $I_s$ is the injected (source) current. the elements of the matrix $[Z]$ are calculated between the observation segment m and the source segment n, as

$$Z_{nm} = \frac{V_{nm}}{I_n} = -\frac{1}{I_n}\int_{l_m} E_{nz}\, dl_m \tag{8}$$

more details of the mpie solution can be found in [13].

3. results

in the first part of this section, the curves of the impedance to ground obtained by the different methods are compared.

fig. 6 impedance to ground of a 3 m long vertical grounding rod, panels a) and b)

fig. 6 a) and b) present the impedance to ground of a 3 m long vertical grounding conductor, calculated by the cbm method and compared to curves obtained by the other methods from the literature [6]. the radius of the analyzed conductor is 1.25 cm. the soil resistivity (ρ = 1/σ) is a) 30 Ωm and b) 300 Ωm. the length of the conductor is increased 10 times in fig. 7. as shown in the figures, the cbm method is implemented in two ways: first without time propagation, and second with time propagation effects included in the equivalent circuit parameters. it is visible from figs. 6 and 7 that there is an excellent agreement between the results obtained by the cbm method including time propagation and the reference emf method. the differences between all the other impedance curves and the above mentioned method are clearly significantly larger, especially at high frequencies. the cbm method takes into account not only the self characteristics of each segment but also the coupling among different segments. that is the main reason for the high precision of cbm compared with other, less accurate circuit methods, some of which are tested in this paper.

the second part of the section provides a parametric analysis of the grounding impedance of a horizontal conductor buried in imperfect ground, in terms of soil resistivity and conductor length. classical and enhanced cbm are compared in the latter case.

fig. 7 impedance to ground of a 30 m long vertical grounding rod, panels a) and b)

fig. 8 impedance to ground of a 3 m long horizontal grounding conductor, panels a) and b)

the dependency of the grounding impedance on the specific soil resistivity is shown in the following figures. fig. 8 a) and b) present the magnitude and phase of the grounding impedance for a 3 m long horizontal grounding conductor, and fig. 9 a) and b) those for a 30 m long horizontal conductor. the radius of the analyzed conductor is 1.25 cm, the depth of burial is 80 cm and the relative permittivity of the soil is set to εr = 10.

fig. 9 impedance to ground of a 30 m long horizontal grounding conductor, panels a) and b)

it may be observed from figs. 8 and 9 that, as could be expected, the impedance is significantly higher for grounding conductors placed in highly resistive soil.

the third part of this section provides a parametric analysis of the grounding impedance of a horizontal conductor buried in imperfect ground, in terms of soil resistivity and relative permittivity. classical and enhanced cbm are implemented for horizontal and vertical grounding conductors. fig. 10 a) and b) present the magnitude of the grounding impedance for a horizontal grounding conductor of 30 m and 300 m length.

fig. 10 impedance to ground of a horizontal grounding conductor: a) l = 30 m, b) l = 300 m

fig. 11 a) and b) present the magnitude of the grounding impedance for a 30 m and a 300 m long vertical grounding conductor.

fig. 11 impedance to ground of a vertical grounding conductor: a) l = 30 m, b) l = 300 m

4. conclusions

it may be concluded from the presented research that the cbm method is a solid choice for determining the grounding impedance of buried single conductors, in terms of accuracy.
even the classical form of cbm, which does not include time propagation of em fields, provides much more accurate results than the other investigated methods when compared to the reference results. it may be observed in the presented figures that a very high agreement exists between the reference results and those obtained by the enhanced cbm method (hem). the higher accuracy of this method compared to other circuit methods is mainly due to the coupling of segments that cbm (and hem) take into account. the high precision of the cbm method, together with the relative simplicity and high speed of its application, shows that it may be one of the best choices in the analysis of grounding systems.

references

[1] a. kuhar and l. grcev, "calculating the impedance of grounding electrodes using circuit equivalents", in proc. of the 12th international conference on applied electromagnetics пес 2015, niš, serbia, 2015.
[2] r. velazquez and d. mukhedkar, "analytical modelling of grounding electrodes transient behavior", ieee trans. power apparatus and systems, vol. 103, pp. 1314-1322, 1984.
[3] f. napolitano, m. paolone, a. borghetti, c. a. nucci, f. rachidi, v. a. rakov, j. schoene and m. a. uman, "interaction between grounding systems and nearby lightning for the calculation of overvoltages in overhead distribution lines", ieee trondheim powertech, 2011.
[4] p. yutthagowith, a. ametani, "application of a hybrid electromagnetic circuit method to lightning surge analysis", ieee trondheim powertech, 2011.
[5] a. f. otero, j. cidras and j. l. alemo, "frequency-dependent grounding system calculation by means of a conventional nodal analysis technique", ieee transactions on power delivery, vol. 14, no. 3, pp. 873-878, 1999.
[6] l. grcev and m. popov, "on high-frequency circuit equivalents of a vertical ground rod", ieee transactions on power delivery, vol. 20, no. 2, pp. 1598-1603, 2005.
[7] r. jankoski, a. kuhar and l. grcev, "application of the electric circuit approach in the analysis of grounding conductors", in proc. of the 5th international symposium on applied electromagnetics - saem'2014, skopje, macedonia, 2014.
[8] s. visacro, f. h. silveira, "evaluation of current distribution along the lightning discharge channel by a hybrid electromagnetic model", journal of electrostatics, vol. 60, pp. 111-120, 2004.
[9] v. arnautovski-toseva, l. grcev, "electromagnetic analysis of horizontal wire in two-layered soil", journal of computational and applied mathematics, vol. 168, no. 1-2, pp. 21-29, 2004.
[10] a. kuhar, l. ololoska-gagoska and l. grcev, "numerical analysis of complex grounding systems using circuit based method", in proc. of the xii international conference - etai, ohrid, macedonia, 2015.
[11] r. rudenberg, electrical shock waves in power systems, cambridge, ma: harvard univ. press, 1968.
[12] s. bourg, b. sacepe and t. debu, "deep earth electrodes in highly resistive ground: frequency behavior", in proc. of the ieee int. symp. electromagnetic compatibility, 1995, pp. 584-589.
[13] l. grcev, b. markovski, v. arnautovski-toseva, a. kuhar, k. el khamlichi drissi, k. kerroum, "modeling of horizontal grounding electrodes for lightning studies", in proc. of the european electromagnetics, euroem 2012, toulouse, france, july 2-6, 2012.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 227-238
https://doi.org/10.2298/fuee2302227a
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

savonius micro wind turbine: a theoretical analysis *

elson avallone¹, paulo henrique palota¹, paulo césar mioralli¹, pablo sampaio gomes natividade¹, jonas rafael antonio¹, josé ferreira da costa¹, sílvio aparecido verdério junior²
¹federal institute of education, science and technology of sao paulo, catanduva-sp, brazil
²federal institute of education, science and technology of sao paulo, araraquara-sp, brazil

abstract. currently, the energy sector is the main source of carbon dioxide emissions into the atmosphere. therefore, to reverse this scenario, it is necessary to expand the use of renewable energy sources, such as wind energy. accordingly, the search for more efficient wind turbines that work with low-speed winds makes the savonius turbine an advantageous option because of its low construction cost. this study aims to theoretically analyze a single model of vertical axis micro wind turbine using artificial wind. the wind power for 2 stages in this project was 0.063 w, as the power variation in relation to rotation is not linear. another important factor to consider is that the overlap ratio of 30% contributes to a power reduction. using the mathematical models, some results were theoretically analyzed for the savonius turbine with a central axis. the literature indicates that the most efficient turbine is a two-stage turbine with helical blades and without a central axis.

key words: wind energy, mini wind turbines, savonius

1. introduction

this work is an extended version of the work presented in [1], where the operating principles of a low-cost anemometer were presented.
the world energy sector is mainly responsible for the increase of carbon dioxide in the earth's atmosphere; in 2007 it accounted for 25% of the total greenhouse gas emissions, owing to the burning of coal, natural gas and oil. thus, in order to favor economic growth, it is necessary to invest in alternative energy sources aimed at sustainable development, such as renewable energy [2]. renewable energies are those that are replenished naturally, that is, they are inexhaustible sources such as solar, tidal, geothermal and wind energy. according to [3], the production in the european union grew by 5.1% per year between 2007 and 2017, with wind being the second most produced energy source.

received september 16, 2022; revised october 25, 2022 and november 09, 2022; accepted november 25, 2022
corresponding author: elson avallone, federal institute of education, science and technology of sao paulo, catanduva-sp, brazil (e-mail: elson.avallone@ifsp.edu.br)
* an earlier version of this paper was presented at the 7th virtual international conference on science, technology and management in energy, energetics 2021, december 16-17, belgrade, serbia [1].

wind energy is the transformation of air movement into useful energy, converting mechanical energy into electricity, mainly by means of vertical and horizontal axis wind turbines [4]. this energy has been used for many centuries, for purposes such as ocean navigation, grain milling and water pumping [4]. persian windmills were used to transform wind energy into mechanical energy, and they were basically assembled on vertical axes [5]. the first applications in europe took place in the netherlands, with the same function of grinding grain, later spreading to the rest of the european continent, to countries such as france, germany, belgium and denmark. however, it was in holland that windmills took on the function of pumping water, with the change to the horizontal axis [5].
in contrast to europe, the exploitation of wind energy in brazil began in the 1990s. this development started with a mapping of the country's wind potential using sensors and special computers, with the first locations mapped including the states of ceará and pernambuco in northeastern brazil [6], [7] and [8]. the brazilian market only heated up in 2004 with the creation of the incentive program for alternative sources of electric energy (proinfa), which encouraged wind farm projects. even with this incentive, real growth occurred between 2009 and 2011, with the reduction of wind turbine prices and easier connection to the electricity grid [9]. brazil is among the 10 countries that most exploit wind energy, ranking sixth with 3% of the world's installed capacity [10]. fig. 1 presents the growth of the installed capacity of wind energy in brazil and its share of national energy generation up to the year 2020.
fig. 1 wind energy growth in brazil [9]
1.1. types of wind turbines
wind turbines are divided into horizontal axis wind turbines (hawts) and vertical axis wind turbines (vawts) [11], [12] and [13]. hawts are more widely used nowadays because they are more efficient than vertical axis turbines. however, vawts have proven to be viable options due to their low production cost, independence from wind direction and wide applicability. because their rotation axis is vertical, vawts work regardless of wind direction [14]. the two main models of vertical axis turbines are savonius and darrieus [15]. the darrieus turbine, developed in the 1930s, operates on the principle of lift and drag from a wing. its efficiency is similar to that of horizontal turbines, due to the presence of airfoils. when the moving air hits the blades, fixed to the ends of the deflector plates, a low pressure zone is generated.
as the blade is fixed, the force of the wind causes the rotational movement of the set [15]. a variation of this type of turbine is the h-type, with straight and helical blades; however, this variation of the darrieus turbine has a torque deficiency, requiring a starter motor [15-16]. mechanical systems produce constant and artificial exhaust winds, thus producing a constant rotation in the vawts, which provides a uniform generation of electrical energy [18]. one of the reasons vawts are not that expensive to build is that they do not need a yaw mechanism. this makes them ideal for small-scale applications in remote areas with electricity shortages. their shells do not require a mechanism to change their angle, as they work with any wind direction. vawts are less noisy than hawts, which facilitates application in urban environments; in addition, their reduced size provides greater safety for wildlife in rural areas [19].
the savonius turbine was developed and patented in 1929 in the united states of america and finland by the finn sigurd j. savonius. later it became one of the most widespread and well-known radial drag turbines in the world [18-19]. the savonius turbine works by the aerodynamic principle of drag; it has no airfoils and is formed by two opposite half channels supported by a vertical axis [15]. its movement is based on the difference in the drag force acting on the concave and convex parts of its shells [19]. for this equipment to have better efficiency, it is necessary to determine the aspect ratio (α1), which is the ratio between the height and diameter of the blades, where the most recommended value is around 4.0 [22]. according to menet [12], the savonius turbine, compared to other wind turbines, has greater resistance to fatigue and mechanical stress. aerodynamically, the savonius is simpler to design and build, which greatly reduces its cost compared to the airfoil blade designs of other vawts and hawts [19].
experimental studies show that the savonius performs well at low wind speeds and that the two-blade rotor performs better than the three-blade one, because more drag is wasted in the three-blade versions [23].
1.2. betz’s law
betz's law (1926) states that “the maximum energy utilization of a wind turbine is 59.3%, and that even if a mechanical system is ideal, it is still possible to extract at most about 40% of the kinetic energy of the winds” [13] and [24]. according to betz's law, the maximum power coefficient, cp, of a wind turbine is about 59%; despite the efforts of hawt and vawt designers to improve their turbines, it has been difficult to reach the betz limit ([24], page 27).
1.3. objectives
the low production cost, low noise and small size of the savonius turbine make it suitable for urban environments, because of its low noise, and for rural environments, because of its small size [19]. therefore, seeking to expand this approach, the aim of this work is to theoretically analyze a two-stage savonius micro turbine with central axis for future application in small equipment, in addition to producing a literature review of micro turbines.
2. savonius
the sizing of the wind micro turbines was based on a prototype developed at the federal institute of science and technology of são paulo, campus catanduva, brazil, which was reduced to a 1:10 scale compared with an existing project of the institution. the prototypes, printed on a 3d printer, have two stages, each 90 mm high, as increasing the number of stages increases inertia and reduces dependence on the wind direction for the start of rotation. on the other hand, an excess of stages would decrease the aspect ratio and static torques [22-23]. three “end plates”, or deflector plates, were used, each with a diameter of 55 mm and a thickness of 2 mm.
wind tunnel tests using 5 models of savonius turbines found that the "end plates", which form an angle with the shells, provide improvements in efficiency [27]. thus, the greater the number of deflector plates, the greater the number of fins, which detain the air, increasing the total drag force by up to 36% [28]. all turbines used have 2 blades, which increases the rotational speed but also reduces efficiency [15]. two models use conventional (straight) blades, which have an efficiency of approximately 10% if installed in a single stage. however, when the stages are overlapped the efficiency increases to 13%. two other models have helical-shaped blades with an efficiency close to 18% in a single stage [29]; these were not the object of this work. savonius wind turbines are designed with the central axis completely blocking the air passage through the cavity of thickness "e" (fig. 2). this generates better efficiency, while the turbine without the central shaft produces greater stability [30]. in the theoretical analysis of the savonius rotor (fig. 2), the equations of aerodynamic power (𝑃𝑎) and mechanical torque (𝑀) are used [12]. the smallest value of 𝐷𝑓 is 10% greater than the value of 𝐷 [31].
fig. 2 𝐷𝑓 = 0.055 m, 𝐷 = 0.04889 m, 𝑑 = 0.03 m, 𝐻 = 0.186014 m, 𝑒 = 0.009 m
2.1. aerodynamic power (pa)
the aerodynamic power (𝑃𝑎) is determined using equation (1), derived from the bernoulli equation, where 𝐶𝑝 is the aerodynamic power coefficient, 𝜌 the air density, 𝐴𝑝 the projected area of the rotor and 𝑉 the air speed [12].
$$P_a = C_p \cdot \frac{1}{2} \cdot \rho \cdot A_p \cdot V^3 \qquad (1)$$
the projected area (𝐴𝑝) is the product of the rotor diameter (𝐷) and the rotor height (𝐻), that is, 𝐴𝑝 = 𝐷 ∙ 𝐻.
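as a quick numerical check of equation (1), the short sketch below evaluates the aerodynamic power with the input values the paper reports in table 1 and in its list of input data. the function and variable names are illustrative and not taken from the paper, and the projected area is the value listed by the authors rather than one recomputed from the rotor dimensions.

```python
def aerodynamic_power(cp, rho, area, wind_speed):
    """Equation (1): P_a = C_p * (1/2) * rho * A_p * V^3, in watts."""
    return cp * 0.5 * rho * area * wind_speed ** 3


# Inputs as reported in the paper (Table 1 and its list of input data).
CP = 0.043        # aerodynamic power coefficient
RHO_AIR = 1.23    # air density [kg/m^3]
A_P = 0.009447    # projected area as listed by the authors [m^2]
V_AIR = 5.0       # air speed [m/s]

p_one_stage = aerodynamic_power(CP, RHO_AIR, A_P, V_AIR)
# Table 1 lists the two-stage value as "PE (x2)", i.e. twice the single-stage value.
p_two_stages = 2.0 * p_one_stage

print(f"one stage:  {p_one_stage:.4f} W")
print(f"two stages: {p_two_stages:.4f} W")
```

with these inputs the single-stage power comes out at about 0.031 w, in line with table 1, and doubling it gives roughly 0.062 w, close to the 0.063 w the paper reports for two stages.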
the number of blades affects the aerodynamic performance of the wind turbine, in terms of 𝜆 and 𝐶𝑝, as well as weight, cost, fatigue life and structural dynamics [24], [32] and [33]. the speed coefficient 𝜆 determines how fast the wind turbine will rotate. this coefficient depends on the specific wind turbine design with regard to the drag coefficient of the rotor ([34], page 510) and the number of shells. a high 𝜆 value can generate mechanical stress, noise and low energy absorption. therefore, it is important that wind turbines are designed to operate in a range of 𝜆, which relates angular velocity to wind speed, that extracts as much energy as possible from the airflow [35]. in fig. 3 it can be seen that the savonius turbine works best with low 𝜆. the savonius has a field of application similar to that of dutch mills and multi-blade axial turbines, but the advantage of the savonius is that it uses less structural material.
fig. 3 characteristic curves of cp as a function of 𝜆 for wind turbines [20] and [36]
it can be observed in fig. 4 that the savonius turbine has a higher torque coefficient than the other turbines, with the exception of the dutch mill [20].
fig. 4 ct characteristic curves as a function of 𝜆 for wind turbines [20] and [36]
the values of the aerodynamic power coefficient (𝐶𝑝) and the torque coefficient (𝐶𝑚) are obtained graphically from fig. 5, also studied by [37]. the value of 𝜆 is obtained through the relation 𝜆 = 𝑉𝑡𝑎𝑛𝑔 / 𝑉, where 𝑉𝑡𝑎𝑛𝑔 = 𝜔 ∙ 𝐷 / 2. the power coefficient has its maximum value when 𝜆 ≈ 1.
fig. 5 value of 𝐶𝑝 and 𝐶𝑚, as a function of 𝜆 [12-13]
2.2. torsion torque (m)
the torsion torque is defined by equation (2) [12].
$$M = C_m \cdot \frac{1}{4} \cdot \rho \cdot D \cdot A_p \cdot V^2 \qquad (2)$$
the aspect ratio is an important characteristic for rotor efficiency and is defined as 𝛼1 = 𝐻 / 𝐷. the best rotor power coefficient is obtained with 𝛼1 ≈ 4.0 [39].
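the relations for the speed coefficient, the torsion torque of equation (2) and the aspect ratio can be sketched in the same way. the numbers are the ones the paper reports (rotor dimensions from fig. 2, 𝜔 and 𝐶𝑚 from table 1); the function names are illustrative, and the conversion from rad/s to rpm is the standard one rather than a formula stated in the paper.

```python
import math


def tip_speed_ratio(omega, diameter, wind_speed):
    """lambda = V_tang / V, with V_tang = omega * D / 2."""
    return (omega * diameter / 2.0) / wind_speed


def torsion_torque(cm, rho, diameter, area, wind_speed):
    """Equation (2): M = C_m * (1/4) * rho * D * A_p * V^2, in N.m."""
    return cm * 0.25 * rho * diameter * area * wind_speed ** 2


def aspect_ratio(height, diameter):
    """alpha_1 = H / D (dimensionless)."""
    return height / diameter


# Values as reported in the paper (Fig. 2 dimensions and Table 1).
OMEGA = 23.47     # angular speed [rad/s]
D = 0.04889       # rotor diameter [m]
H = 0.186014      # rotor height [m]
V_AIR = 5.0       # air speed [m/s]
CM = 0.378        # torque coefficient
RHO_AIR = 1.23    # air density [kg/m^3]
A_P = 0.009447    # projected area as listed by the authors [m^2]

lam = tip_speed_ratio(OMEGA, D, V_AIR)
torque = torsion_torque(CM, RHO_AIR, D, A_P, V_AIR)
alpha_1 = aspect_ratio(H, D)
rpm = OMEGA * 60.0 / (2.0 * math.pi)  # standard rad/s -> rpm conversion

print(f"lambda  = {lam:.3f}")
print(f"M       = {torque:.5f} N.m")
print(f"alpha_1 = {alpha_1:.2f}")
print(f"n       = {rpm:.1f} rpm")
```

the computed speed coefficient (about 0.11), torque (about 0.0013 n.m) and rotation (about 224 rpm) are consistent with the rounded entries of table 1, and the aspect ratio of about 3.8 is what the paper rounds to 𝛼1 ≈ 4.0.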
the overlap ratio¹ is calculated by 𝛽 = 𝑒 / 𝑑, where the best efficiency is between 20 and 30% [16], [30], [31] and [37]. the value of 𝛽 used to calculate the rotor was 30%, as recommended by tahani, kothe and fujisawa [16], [30], [31] and [37]. the aerodynamic power coefficient relates the aerodynamic power to the power available in the wind, expressed as 𝐶𝑝 = 𝑃𝑎 / 𝑃𝑣 [41]. the generator's theoretical electric current (𝐼𝑔), for a voltage of 12 v, is defined by equation (3), where 𝜔 is the angular speed [31].
$$I_g = -0.0024 \cdot \omega^2 + 0.4138 \cdot \omega + 7.6 \qquad (3)$$
3. results
the turbines were designed in autodesk inventor software and printed on a 400 x 400 x 400 mm 3d printer (fig. 7 and fig. 8) with polylactic acid (pla) filament, which was chosen for better print quality and fewer material residues and burrs. the stages were built separately, with fittings to enable coupling. figs. 6, 7, 8 and 9 are based on the characteristics of fig. 2, with dimensions 𝐷𝑓 = 0.055 m, 𝐷 = 0.04889 m, 𝑑 = 0.03 m, 𝐻 = 0.186014 m and 𝑒 = 0.009 m. the calculations were performed with equations (1) and (2). the theoretical results for the two-stage savonius rotors applied to this work, obtained through equations (1), (2) and (3), are presented in table 1.
fig. 6 assembly of the structure of the savonius turbine with its two stages, straight blades and shaft, developed by authors
fig. 7 and 8 show the savonius turbine with the central shaft, disassembled and assembled, printed on a 3d printer.
¹ the overlap ratio 𝛽 in savonius turbines is the ratio between the overlap of the blades "e" and their chord length (𝑑). the overlap is beneficial for the wind turbine due to the increase in pressure caused in the concave region of the return blade. however, it also generates a pressure reduction in the concave region of the advance blade [25].
fig. 7 savonius turbine printed in 3d, developed by authors
fig. 8 savonius turbine printed in 3d and assembled, developed by authors
fig. 9 assembly of the structure of the savonius turbine with its two stages, straight blades and no shaft, developed by authors
to generate the results of table 1, the following input data were considered: specific mass 𝜌𝑎𝑖𝑟 = 1.23 kg/m³, dynamic air viscosity 𝜇𝑎𝑖𝑟 = 0.000015 kg/(m·s), projected area 𝐴𝑝 = 0.009447 m², aspect ratio 𝛼1 ≈ 4.0 and peripheral velocity 𝑉𝑝 = 5 m/s.
table 1 theoretical results for the savonius rotor
name              value
𝑅𝑒                20788.22
𝑉𝑎𝑖𝑟 [m/s]        5
𝑉𝑝𝑒𝑟𝑖𝑝ℎ [m/s]     5
𝜔 [rad/s]         23.47
𝜆                 0.12
𝐶𝑝                0.043
𝐶𝑚                0.378
𝑃𝐸 [w]            0.031
𝑃𝐸 (x2) [w]       0.063
𝑀 [n.m]           0.001
𝑛 [rpm]           224.11
𝛽 [%]             30
𝐼𝑔 [a]            0.79
table 1 presents the theoretical results obtained from the equations presented by menet [12]. the wind power for 2 stages in this project was 0.063 w, as the power variation in relation to rotation is not linear. another important factor to consider is that the overlap ratio is very large, thus causing a reduction in power, as already noted in the footnote in section 2.2. the other results in table 1 are derived from the dimensions of the designed savonius rotor.
4. conclusion
the present work sought to expand the application of vertical rotor wind turbines with the use of artificial winds, through the development and computer-assisted design analysis of a two-blade savonius micro wind turbine with axial shaft. using the literature review and theoretical results, the mechanical torque and aerodynamic power were obtained and analyzed for the proposed savonius turbine configuration. the analysis carried out in this work is fundamentally theoretical; to determine the viability of these types of turbines, other analyses are needed, such as the wind energy recovered by experimental models. several research fields need further in-depth study.
a proposal to install a savonius micro turbine printed on a 3d printer coupled to a micro electric generator may have satisfactory results through future studies. in this way, it will be possible to compare theoretical results with experimental data through electronic measurements using a datalogger system. another alternative for future work would be to apply the turbine along bus routes to large centers in order to capture wind energy from the passage of buses and transform it into electrical energy. it is also possible to monitor the capture of this energy through an intelligent mobile system, for example the one developed for monitoring microclimatic parameters [42]. the literature review is an important tool in this work. from it, we conclude that the savonius model with greater efficiency is the one with two stages without a central axis. through further experimental analysis, it will also be possible to verify the higher efficiency of the two-blade rotor with no central axis compared with the version with a shaft, as found in the current literature review.
acknowledgement: to the federal institute of education, science and technology of são paulo for the constant encouragement.
nomenclature
symbol    name                                       units
𝐴𝑃        rotor projected area                       [m²]
𝐶𝑃        aerodynamic power coefficient              [dimensionless]
𝐶𝑚        torque coefficient                         [dimensionless]
𝐷𝑓        end plate diameter                         [m]
𝐼𝑔        generator current                          [a]
𝑃𝐴        aerodynamic power                          [w]
𝑃𝑉        wind power                                 [w]
𝑉𝑎𝑖𝑟      air speed                                  [m/s]
𝑉𝑝𝑒𝑟𝑖𝑝ℎ   peripheral speed                           [m/s]
𝑉𝑡𝑎𝑛𝑔     tangential speed                           [m/s]
𝛼1        aspect ratio                               [dimensionless]
𝐷         rotor diameter                             [m]
𝐻         rotor height                               [m]
𝑀         torsion moment                             [n.m]
𝑅𝑒        reynolds number                            [dimensionless]
𝑉         air speed                                  [m/s]
𝑑         diameter of rotor half cylinder            [m]
𝑒         spacing between the two half cylinders     [m]
𝑛         rotation                                   [rpm]
𝛽         overlap ratio                              [dimensionless]
𝜆         speed coefficient                          [dimensionless]
𝜌         air density                                [kg/m³]
𝜔         angular speed                              [rad/s]
references
[1] e. b.
da s cabral et al., "theoretical analysis of vertical micro wind turbines", in proceedings of the 7th virtual international conference on science technology and management in energy, niš, serbia, 2022, pp. 407-411.
[2] h. b. o. arruda, "mapeamento das emissões de gases de efeito estufa em uma empresa do setor energético", conexoes, vol. 12, no. 3, p. 108, 2018.
[3] eurostat, "renewable energy on the rise", 2019. https://ec.europa.eu/eurostat/web/products-eurostat-news//ddn-20220126-1 (accessed 31 august 2022).
[4] u. lisboa, energia eólica: kit científico de energias renováveis, vol. 1, 1 vols. lisboa-portugal: museu de ciências da universidade de lisboa, 2009.
[5] r. f. m. santos, "detecção de mudança da característica de produção de parques eólicos", master thesis, porto, porto portugal, 2008. [online]. available at: https://repositorio-aberto.up.pt/bitstream/10216/58768/2/texto%20integral.pdf
[6] g. de a. nunes and a. a. magalhães, "energia eólica no brasil: uma alternativa inteligente frente às demandas elétricas atuais", bolsista de valor: revista de divulgação do projeto universidade petrobras e if fluminense, vol. 1, pp. 163-167, 2010.
[7] b. marcele medeiros monteiro and v. s. de q. varella, "fontes de energia renováveis", rio de janeiro. [online]. available at: http://www.solar.coppe.ufrj.br/eolica/eol_txt.htm
[8] o. c. amarante, j. zack and m. brower, "atlas do potencial eólico brasileiro", brazil, 2021. [online]. available at: http://www.cresesb.cepel.br/publicacoes/download/atlas_eolico/atlas%20do%20potencial%20eolico%20brasileiro.pdf
[9] m. a. b. mroz, atlas eólico do estado de são paulo, vol. 1, 1 vols. são paulo: governo do estado de são paulo, secretaria da energia, 2012. [online]. available at: https://dadosenergeticos.energia.sp.gov.br/portalcev2/intranet/bibliovirtual/renovaveis/atlas_eolico.pdf
[10] gwec, "global wind report", global wind energy council, brussels, belgium, 2019. [online].
available at: https://gwec.net/wp-content/uploads/2020/08/annual-wind-report_2019_digital_final_2r.pdf
[11] c. d. ôlo, "projecto de uma turbina savonius com utilização de componentes em fim-de-vida", master thesis, faculdade de ciência e tecnologia universidade nova de lisboa, lisboa-portugal, 2012. [online]. available at: https://run.unl.pt/bitstream/10362/8876/1/olo_2012.pdf
[12] j.-l. menet, "a double-step savonius rotor for local production of electricity: a design study", renew. energy, vol. 29, no. 11, pp. 1843-1862, sept. 2004.
[13] a. m. biadgo, a. simonovic, d. komarov and s. stupar, "numerical and analytical investigation of vertical axis wind turbine", fme, vol. 41, no. 1, pp. 49-58, 2013.
[14] d. m. prabowoputra, a. r. prabowo, a. bahatmaka and s. hadi, "analytical review of material criteria as supporting factors in horizontal axis wind turbines: effect to structural responses", procedia struct. integ., vol. 27, pp. 155-162, 2020.
[15] m. t. tolmasquim, energia renovável: hidráulica, biomassa, eólica, solar, oceânica, 1o ed, vol. 1, 1 vols. rio de janeiro: empresa de pesquisa energética (epe), 2016. [online]. available at: https://www.epe.gov.br/sites-pt/publicacoes-dados-abertos/publicacoes/publicacoesarquivos/publicacao172/energia%20renov%c3%a1vel%20-%20online%2016maio2016.pdf
[16] m. tahani, a. rabbani, a. kasaeian, m. mehrpooya and m. mirhosseini, "design and numerical investigation of savonius wind turbine with discharge flow directing capability", energy, vol. 130, pp. 327-338, july 2017.
[17] s. b. garcia, g. c. da s. simioni and j. a. v. alé, "aspectos de desenvolvimento da turbina eólica de eixo vertical", in proceedings of the conem 2016 iv congresso nacional de engenharia mecânica, recife pe brazil, 2006, vol. 1. [online].
[18] a. fazlizan, w. chong, s. yip, w. hew and s. poh, "design and experimental analysis of an exhaust air energy recovery wind turbine generator", energies, vol. 8, no. 7, pp. 6566-6584, june 2015.
[19] e.
aymane, "savonius vertical wind turbine: design, simulation and physical testing", specialization monograph, al akhawayn university, morocco, 2017. [online]. available at: http://www.aui.ma/ssecapstone-repository/pdf/spring-2017/savonius%20vertical%20wind%20turbine%20%20design%20simulation%20and%20physical%20testing.pdf
[20] j. v. akwa, "análise aerodinâmica de turbinas eólicas savonius empregando dinâmica dos fluidos computacional", master thesis, universidade federal do rio grande do sul, porto alegre, rs, 2010. [online]. available at: https://lume.ufrgs.br/bitstream/handle/10183/26532/000756688.pdf?sequence=1&isallowed=y
[21] s. j. savonius, "wind rotor us1766765a", us1766765a [online]. available at: https://patentimages.storage.googleapis.com/b4/a5/b0/c9503e83cfe1c0/us1766765.pdf
[22] u. k. saha, s. thotla and d. maity, "optimum design configuration of savonius rotor through wind tunnel experiments", j. wind eng. ind. aerodyn., vol. 96, no. 8-9, pp. 1359-1375, aug. 2008.
[23] m. h. ali, "experimental comparison study for savonius wind turbine of two & three blades at low wind speed", int. j. modern eng. res., vol. 3, no. 5, pp. 2978-2986, oct. 2013.
[24] h. kana, aerodynamics of wind turbines, university of london, london, uk, 2011.
[25] l. s. bianchin, d. beck and d. j. seidel, "influência do número de estágios no torque estático da turbina eólica savonius", revista thema, vol. 17, no. 2, pp. 309-317, june 2020.
[26] f. wenehenubun, a. saputra and h. sutanto, "an experimental study on the performance of savonius wind turbines related with the number of blades", energy procedia, vol. 68, pp. 297-304, apr. 2015.
[27] t. ogawa and h. yoshida, "the effects of a deflecting plate and rotor end plates on performances of savonius-type wind turbine", bulletin of jsme, vol. 29, no. 253, pp. 2115-2121, 1986.
[28] i. s. utomo, d. d. d. p. tjahjana and s.
hadi, "experimental studies of savonius wind turbines with variations sizes and fin numbers towards performance", in proceedings of the 1st international conference and exhibition on powder technology indonesia (icepti), jatinangor, indonesia, 2018, p. 030041.
[29] e. s. caser and g. da m. paiva, "projeto aerodinâmico de uma turbina eólica de eixo vertical (teev) para ambientes urbanos", degree project, universidade federal do espírito santo, vitória-es, 2016. [online]. available at: https://mecanica.ufes.br/sites/engenhariamecanica.ufes.br/files/field/anexo/2._pg_final__eduardo_caser_giuseppe_paiva.pdf
[30] l. b. kothe, "estudo comparativo experimental e numérico sobre o desempenho de turbinas savonius helicoidal e de duplo-estágio", master thesis, universidade federal do rio grande do sul, porto alegre, rs, 2016. [online]. available at: https://lume.ufrgs.br/bitstream/handle/10183/141901/000993090.pdf?sequence=1&isallowed=y
[31] n. fujisawa, "on the torque mechanism of savonius rotors", j. wind eng. ind. aerodyn., vol. 40, no. 3, pp. 277-292, 1992.
[32] m. zemamou, m. aggour and a. toumi, "review of savonius wind turbine design and performance", energy procedia, vol. 141, pp. 383-388, dec. 2017.
[33] k. r. abdelaziz, m. a. a. nawar, a. ramadan, y. a. attai and m. h. mohamed, "performance improvement of a savonius turbine by using auxiliary blades", energy, vol. 244, p. 122575, apr. 2022.
[34] b. munson, d. f. young, t. h. okiishi and w. w. huebsch, fundamentals of fluid mechanics. united states of america: john wiley & sons, 2009.
[35] k. horikiri, "aerodynamics of wind turbines", ph.d. thesis, queen mary, london, uk, 2011. [online]. available at: http://qmro.qmul.ac.uk/jspui/handle/123456789/1881
[36] j. v. akwa, h. a. vielmo and a. p. petry, "a review on the performance of savonius wind turbines", renew. sust. energy rev., vol. 16, no. 5, pp. 3054-3064, june 2012.
[37] l. b. kothe, s. v. möller and a. p.
petry, "numerical and experimental study of a helical savonius wind turbine and a comparison with a two-stage savonius turbine", renew. energy, vol. 148, pp. 627-638, apr. 2020.
[38] j.-l. menet and a. leiper, "prévision des performances aérodynamiques d’un nouveau type d’éolienne à axe vertical dérivée du rotor savonius", in proceedings of the 17ème congrès français de mécanique, france, sept. 2005, vol. 1, pp. 1-6.
[39] i. ushiyama and h. nagai, "optimum design configurations and performance of savonius rotors", wind eng., vol. 12, no. 1, pp. 59-75, 1988.
[40] b. g. newman, "measurement on a savonius rotor with variable gap", in wind energy, achievements and potential, symposium sherbrooke, canada, 1974, pp. 115-136.
[41] j. a. schetz and a. e. fuhs, handbook of fluid dynamics and fluid machinery: fundamentals of fluid dynamics. hoboken, nj, usa: john wiley & sons, inc., 1996.
[42] d. danković and m. djordjević, "a review of real time smart systems developed at university of nis", fu: elec. energ., vol. 33, no. 4, pp. 669-686, 2020.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 103-119 https://doi.org/10.2298/fuee2301103b © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
design and implementation of digital controller in delta domain for buck converter
arka biswas1, arindam mondal2, prasanta sarkar3
1department of aerospace engineering, iit kharagpur, west bengal, india
2department of electrical engineering, dr bc roy engineering college, durgapur, west bengal, india
3department of electrical engineering, nitttr kolkata, west bengal, india
abstract. this paper presents the design and implementation of a discrete-time controller for a dc-dc buck converter in the complex delta domain.
whenever a continuous-time system is sampled at a very high sampling rate to obtain the corresponding discrete-time system, the shift-operator parameterization of that system fails to provide meaningful information. an alternative discrete-time operator is the delta operator. in the delta-operator parameterized discrete-time system, the discrete-time and continuous-time results can be obtained hand in hand, rather than as two special cases, at a very high sampling rate. this superior property of the delta operator is exploited in this paper to design the proposed controller in the discrete domain. the proportional plus integral (pi) controller designed in the delta domain is used to maintain the output voltage of the buck converter at the load end under varying load and varying supply voltage conditions. the controller is designed and implemented using the ds1202 dspace board. the output voltage of the buck converter is scaled and fed to the onboard analogue-to-digital converter of the ds1202. under the different disturbances, the error between the desired output voltage and the actual output voltage is measured and the delta pi controller is used to adjust the duty cycle of the converter. the pulse width modulation (pwm) signal with this duty cycle is generated using the ds1202 board and is applied to the gate of the metal oxide semiconductor field-effect transistor (mosfet) via a suitable driver such that the output voltage of the buck converter remains at its desired value.
key words: buck converter, delta domain, digital controller, dspace board, pi controller
1. introduction
the sources of conventional energy are decreasing day by day and the supply-demand gap is therefore increasing. this leads to a growing demand for non-conventional sources of energy. the output of most renewable sources is a dc voltage, and it is not stabilized.
for stabilization and for conversion from one dc voltage level to another, one of the most important power-electronic circuits, the dc-dc converter, is used [1]. maximum power point tracking is a very important area for maximization of solar power and is done through an electronic circuit consisting of a dc-dc converter [2].
received july 08, 2022; revised august 21, 2022; accepted september 01, 2022
corresponding author: arindam mondal, department of electrical engineering, dr bc roy engineering college, durgapur, west bengal, india, e-mail: arininstru@gmail.com
there are two basic types of dc-dc converters: the buck converter and the boost converter. the buck converter is used to reduce the voltage level. it is widely used for dc motor drive control [3] and renewable systems [4], [5], [6], as it is one of the most interesting power electronics circuits, converting an uncontrolled dc input into a controlled dc output. whenever the supply voltage varies or the load changes, the output voltage of the buck converter may change, which calls for a proper choice of controller [7]. in [8], a robust adaptive control (rac) approach using system identification methodologies is illustrated for controlling a buck converter by pulse width modulation (pwm) in the presence of input voltage as well as load variations. pid controllers are used for controlling the output voltage of the buck converter [9] for low-power applications such as powering leds. as the buck converter itself is a nonlinear system, its voltage control performance can be improved through the use of fractional-order pid controllers [10]. in [11], rct digital robust control is used to overcome the instability issues caused by the negative resistance effect of a constant power load.
digital controllers based on the nonlinear least squares optimization method can be used for controlling the high-frequency buck converter [12]. through this approach, the performance of the controller is optimized through the pole-zero-cancellation (pzc) technique and the adverse effects of the undesired poles on the buck converter power stage are drastically reduced. a derivative-free nelder–mead (n–m) simplex method for designing a digital controller for a buck converter operating at high frequency is described in [13] for the improvement of rise time and settling time. the proportional-integral (pi) controller gives zero steady-state error and, being the simplest of all the controllers, is generally used in different dc-dc converters [14]. digital controllers offer advantages over analog controllers and are therefore used for controlling the dc-dc buck converter; with a digital control strategy, the algorithm or program can be easily altered. the digital pid controller can improve the performance of the buck converter by varying loop gain, cross-over frequency and phase margin [15]. the control algorithm developed through the shift operator parameterization runs into problems in high-frequency applications of the buck converter [16]. digital controller design using the delta domain is better than the controller designed using the shift operator, particularly when the sampling rate is very high. the advantages and applications of the delta operator in control theory are elaborated in [17], [18], and [19]. the delta operator has the versatile property of giving results in the digital domain that are equivalent to those in the continuous ‘s’ domain, particularly at high sampling frequency. the discrete ‘z’ transfer function approximation turns out to be very sensitive to even slight changes in the coefficient values, whereas the transfer function in the digital delta domain significantly improves the robustness of the estimate to parameter changes [20].
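the contrast between the shift and delta parameterizations at fast sampling can be illustrated with a small numerical sketch. it assumes the usual definition of the delta operator, δ = (q − 1)/Δ, where q is the forward shift operator and Δ the sampling period (this section of the paper does not state its own definition), and it uses a hypothetical first-order plant dx/dt = a·x + b·u with a zero-order hold; the plant values and function names are illustrative only.

```python
import math


def shift_form(a, b, dt):
    """Zero-order-hold model x[k+1] = phi * x[k] + gamma * u[k] (requires a != 0)."""
    phi = math.exp(a * dt)
    gamma = math.expm1(a * dt) / a * b
    return phi, gamma


def delta_form(a, b, dt):
    """Same model written as (x[k+1] - x[k]) / dt = a_d * x[k] + b_d * u[k]."""
    phi_minus_one = math.expm1(a * dt)  # accurate exp(a*dt) - 1 for small a*dt
    a_d = phi_minus_one / dt
    b_d = phi_minus_one / a * b / dt
    return a_d, b_d


# Hypothetical plant, not taken from the paper: dx/dt = A*x + B*u
A, B = -1000.0, 1000.0

for dt in (1e-3, 1e-5, 1e-7):
    phi, gamma = shift_form(A, B, dt)
    a_d, b_d = delta_form(A, B, dt)
    print(f"dt={dt:.0e}  shift: phi={phi:.6f} gamma={gamma:.2e}  "
          f"delta: a_d={a_d:.2f} b_d={b_d:.2f}")
```

as the sampling period shrinks, the shift-form coefficients crowd towards 1 and 0 regardless of the plant, whereas the delta-form coefficients approach the continuous-time values a and b, which is the behaviour the text attributes to the delta operator at high sampling rates.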
in [21], the delta operator is used to reduce the order of the model of a system, which helped to save some extra bits in a digital system. the superior property of the delta (δ) operator is used in fault detection and network control [22] and in kalman filter-based controller design for cyber-physical systems [23]. to check for packet losses in the sensor-to-controller link or the controller-to-actuator link, the delta operator is successfully applied for lyapunov-krasovskii functional design in the field of limited communication [24]. a delta domain-based pi controller is designed in [25] for indirect field-oriented control (ifoc) of an induction motor, and the superiority of the delta-parameterised discrete-time system is demonstrated. at a very high sampling frequency, the continuous-time results are the obvious outcome from discrete-time measurements. the selection of the sampling frequency is very important during discretization. the sampling frequency must be at least 10 times the maximum frequency of the system to suitably reproduce the signal. for the design of a pi controller in the discrete shift operator parameterization, the sampling rate cannot be made high, as the design becomes numerically ill-conditioned at the very high sampling limit; therefore, for high-frequency digitally controlled switching converters, delta domain pi controllers are most suitable [26]. the pi controller is used instead of the pid controller in certain types of work where the voltage has a smaller amount of ripple during a load change from lower load to higher load. in that case the drop in output voltage stays smaller than the required limit, and the same holds for the opposite load change; therefore, the pi controller alone is sufficient for regulating the process so that it remains stable.
the regulation achieved with the pi controller is satisfactory, and the controller is widely used in industry since it has a simple structure and is cost-efficient compared to the pid controller [27], [28]. for more precise results, a fractional-order controller can also be used instead of the traditional integer-order controller using the discrete delta operator [29]. the alpha-guided grey wolf optimization technique can be used for finding the parameters of a fractional-order controller in the delta domain [30]. the ds 1202 dspace board is a kind of embedded system where the controller can be designed and simulated using the simulink and dspace block sets. dspace has been successfully used for designing pid controllers for buck and boost converters [31], [32]. as the ds 1202 dspace board operates on a discrete-time platform, it can be used as a real-time controller for the buck converter to keep the output at the desired level irrespective of load and supply voltage variation. the hardware implementation of the buck converter along with the controller formulated in the delta domain using the ds 1202 dspace board is presented in this paper. the real-time analyses as well as the simulation results are obtained using matlab/simulink.

the significant contributions made in this paper are as given below. in earlier work, digital controllers for the buck converter have been designed using shift operator parameterization, but shift operator parameterization fails to provide meaningful information at a high sampling rate. the real-time implementation of the controller in the digital domain needs a very high sampling rate to get a better result. this is the motivation to work on the implementation of a digital controller for the buck converter using the delta operator parameterization.
the most crucial point is that, at a fast sampling limit, the discrete-domain results of the delta operator parameterized system resemble the continuous-time results. moreover, the discrete-time pi controller for the buck converter, designed in the delta domain, is implemented using the ds1202 dspace board, which acts as the real-time controller with a built-in adc whose resolution is much higher than that of typical microcontrollers. with the realized controller, the dc-dc buck converter provides a stable output voltage at the desired level. therefore, the digital design and implementation of a pi controller for the buck converter using delta operator parameterization is a newer concept and a new direction for further research.

this paper is organized in the following way. the basics of the buck converter are discussed in section 2. in section 3, the control algorithm based on the delta operator for the dc-dc buck converter is described. the simulation and practical result analysis are illustrated in section 4. finally, section 5 is devoted to the conclusion.

2. buck converter

2.1. topology

the buck converter topology is used to step down the input voltage to a lower level. it consists of a power mosfet switch, a filter inductor l, a filter capacitor c, a freewheeling diode d and a resistive load rl. it operates either in continuous conduction mode or in discontinuous conduction mode. figure 1(a) represents the topology of the buck converter under consideration.

2.2. operation

2.2.1. mode 1

when the gate pulse is applied to the mosfet, current flows through l, c, and rl, thus storing energy in the inductor. in this mode the diode remains reverse biased, the inductor current increases linearly and the load consumes energy from the source. fig. 1(b) shows the equivalent circuit for mode 1, when the switch is on. the voltage and current equations during this mode are as follows:

$$e_L = L\,\frac{di}{dt} \tag{1}$$

where $e_L$ is the inductor voltage.
let the inductor current increase from $i_1$ to $i_2$; kirchhoff’s voltage equation is then written as

$$E_{dc} - E_0 - L\,\frac{i_2 - i_1}{t_{on}} = 0 \tag{1}$$

where $E_{dc}$ is the supply voltage, $E_0$ is the output voltage and $t_{on}$ is the on-time of the switch. the peak-to-peak ripple current through the inductor l is defined as:

$$\Delta i = i_2 - i_1 \tag{2}$$

using (2), the voltage equation can be rewritten as

$$E_{dc} - E_0 = L\,\frac{\Delta i}{t_{on}} \tag{3}$$

2.2.2. mode 2

when the switch is off, the energy stored previously in the inductor acts as a source and current flows through c, rl and d. in this mode, the diode is forward biased and conducts. fig. 1(c) shows the equivalent circuit of mode 2, when the switch is off. during $t_{off}$, the inductor current falls linearly from $i_2$ to $i_1$ and therefore the output voltage is expressed as

$$-E_0 = -L\,\frac{\Delta i}{t_{off}} \tag{4}$$

where $t_{off}$ is the off-time of the switch. comparing $\Delta i$ from (3) and (4) and rearranging the variables, (5) is obtained.

$$E_0\,(t_{off} + t_{on}) = E_{dc}\,t_{on} \tag{5}$$

the time period is defined as $T = t_{on} + t_{off}$. therefore equation (5) can be rewritten as

$$E_0\,T = E_{dc}\,t_{on} \tag{6}$$

defining the duty ratio $t_{on}/T$ as $d$, the output equation can be expressed as

$$E_0 = d\,E_{dc} \tag{7}$$

fig. 1 (a) an ideal buck converter, (b) mode 1: switch on, (c) mode 2: switch off

the buck converter design parameters and the values of the components are detailed in table 1.

table 1 buck converter design parameters and values

parameter with symbol                      value   units
input voltage (edc)                        8       volt
load resistance (rl)                       100     kω
load inductance (ll)                       100     h
series inductor (l)                        100     h
esr of inductor (rl)                       10      mω
output capacitor (c)                       1000    µf
esr of capacitor (rc)                      30      mω
forward drop across diode (vd)             0.7     volt
esr of diode when conducting (rd)          0.01    ω
drain-source resistance of mosfet (rt)     8       mω
operating frequency                        5       khz

2.3. choice of sampling rate

although the nyquist sampling theorem requires the sampling frequency to be at least twice the maximum frequency contained in the signal, the rule of thumb is that the minimum sampling frequency has to be 10 times the maximum frequency of the system. therefore, the sample time will be 1/10th of the time constant. the transfer function of a buck converter, with $V_o(s)$ as the output voltage and $d(s)$ as the duty cycle, is given below:

$$G_{buck}(s) = \frac{V_o(s)}{d(s)} = \frac{V_{in}/(LC)}{s^2 + \dfrac{s}{RC} + \dfrac{1}{LC}} \tag{8}$$

the sampling time is related to the time constant; therefore the time constant is calculated before the sampling time is decided. considering the values of r and c as 100 kω and 1000 µf respectively, the time constant comes out as 0.005 sec. according to the rule of thumb above, the sampling time can be taken as 0.0005 sec or less. for the controller design in the delta domain, the sampling time is taken as 0.00001 sec to study the behavior of the controller at a high sampling rate as well as to establish the philosophy of the proposed controller design in the delta domain.

3. the control algorithm based on delta-operator for dc-dc buck converter

3.1. delta operator

the d/dt operator in the continuous domain is well known for modelling any dynamic system. it is defined as

$$\frac{dx}{dt} = \lim_{h \to 0} \frac{x_{t+h} - x_t}{h} \tag{9}$$

the urge for an operator which resembles this d/dt operator structurally as well as functionally in the discrete domain led to the development of the delta-operator ($\delta$), which is defined as

$$\delta\,(x_n) = \frac{x_{n+\Delta} - x_n}{\Delta} \tag{10}$$

where $\Delta$ is the sampling time. it is an incremental difference operator that works as a signal differentiator, unlike the shift operator, which only shifts the signal. the delta operator is a shifted and scaled version of the shift operator.
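the definition of the delta operator can be tried out on a known signal. the following is a minimal sketch under stated assumptions: the test signal sin(t), the evaluation point t = 1 and the three sampling times are chosen only for illustration, and `delta_op` and `shift_op` are hypothetical helper names. it shows that the delta operator output approaches the derivative as the sampling time shrinks, while the shift operator merely returns the next sample.

```python
import math


def delta_op(x, t, step):
    """Delta operator on signal x at time t: (x(t + step) - x(t)) / step."""
    return (x(t + step) - x(t)) / step


def shift_op(x, t, step):
    """Shift operator on signal x at time t: the next sample x(t + step)."""
    return x(t + step)


if __name__ == "__main__":
    t0 = 1.0
    exact = math.cos(t0)  # d/dt sin(t) evaluated at t0
    for step in (1e-1, 1e-3, 1e-5):
        approx = delta_op(math.sin, t0, step)
        print(f"step = {step:g}: delta-operator output = {approx:.6f}, "
              f"|error| = {abs(approx - exact):.2e}")
```

the two helpers are linked by delta_op = (shift_op − x(t)) / step, which is the "shifted and scaled" relation stated above.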
it can be shown easily that the response of the delta-operator converges to that of the continuous-time d/dt operator as the sampling time tends to zero. this property can be understood by comparing the stable zones of the continuous, shift and delta operators in the frequency domain. in the frequency domain, the d/dt operator is expressed by the laplace operator s, and the stable zone of this operator is widely known to be the entire left half of the s-plane. in the frequency domain, the shift operator is denoted by z and is related to the laplace operator s as

$$z = e^{s\Delta} \tag{11}$$

examining the positions of the poles, it is seen that the stable zone for the shift operator lies within a circle of radius 1 with its centre at the origin. in the frequency domain, the delta operator is defined as

$$\gamma = \frac{z - 1}{\Delta} \tag{12}$$

since it is a shifted and scaled version of the shift operator, the stable zone for the delta-operator is also shifted and scaled. the stable zone of the delta operator lies in a circle of radius $1/\Delta$ with its centre at $(-1/\Delta, 0)$. fig. 2 shows the stable zones of the three domains. it can be observed that as the sampling time reduces, the stable zone of the delta-operator tends to converge to the stable zone of the continuous domain. thus, the use of the delta-operator provides a unified approach to model, design, analyse, and implement the digital control scheme.

fig. 2 (a) stability zone: s domain, (b) stability zone: z domain, (c) stability zone: γ-domain

3.2. the digital controller design based on delta-operator

3.2.1. pi controller design

to control the dc-dc buck converter, a proportional and integral (pi) controller and a pwm generator are used. the pwm signal is required for the on/off operation of the mosfet of the buck converter. the pwm control technique is one of the popular control methods for switching devices.
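before the controller equations are written down, the stability zones of section 3.1 can be checked numerically. the sketch below is an illustration only: the pole −50 + j200 and the sampling times are assumed values, and the function names are hypothetical. it maps a continuous-time pole to the z and γ domains using the exponential and delta relations (11) and (12), tests it against the three stable zones, and shows that the γ-domain pole moves towards the s-domain pole as the sampling time is reduced.

```python
import cmath


def to_z(s, step):
    """Image of an s-plane pole in the shift (z) domain: z = exp(s * step)."""
    return cmath.exp(s * step)


def to_gamma(s, step):
    """Image of an s-plane pole in the delta (gamma) domain: (z - 1) / step."""
    return (to_z(s, step) - 1.0) / step


def stable_s(s):
    """Stable zone of the s domain: open left half plane."""
    return s.real < 0.0


def stable_z(z):
    """Stable zone of the z domain: inside the unit circle."""
    return abs(z) < 1.0


def stable_gamma(gamma, step):
    """Stable zone of the gamma domain: inside the circle of radius 1/step
    centred at (-1/step, 0)."""
    return abs(gamma + 1.0 / step) < 1.0 / step


if __name__ == "__main__":
    s_pole = complex(-50.0, 200.0)  # illustrative pole, not taken from the paper
    for step in (1e-2, 1e-3, 1e-5):
        z = to_z(s_pole, step)
        g = to_gamma(s_pole, step)
        print(f"step = {step:g}: z = {z:.6f} (stable: {stable_z(z)}), "
              f"gamma = {g:.3f} (stable: {stable_gamma(g, step)}), "
              f"|gamma - s| = {abs(g - s_pole):.3f}")
```

at small sampling times the z-domain pole crowds towards the point 1, while the γ-domain pole stays near the original s-domain location, which is the convergence property described in section 3.1.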
in this experiment, the pwm signal is generated digitally to trigger the mosfet of the circuit. the duty cycle of the pwm signal is controlled using the pi controller. the proposed pi controller is designed in the discrete delta domain and is simulated using matlab/simulink before being implemented through dspace. the mathematical equations for a pi controller in the continuous, shift and delta domains are as follows:

$$u(s) = \left(k_p + \frac{k_i}{s}\right) e(s) \tag{13}$$

$$u(z) = \left(k_p + \frac{k_i}{1 - z^{-1}}\right) e(z) \tag{14}$$

since no $\gamma^{-1}$ operator is available in matlab, the $z^{-1}$ module is used to represent $\gamma^{-1}$. the relation can be derived from equation (12) as

$$\gamma^{-1} = \frac{\Delta\,z^{-1}}{1 - z^{-1}}$$

therefore,

$$u(\gamma) = \left(k_p + k_i \cdot \frac{\Delta \cdot z^{-1}}{1 - z^{-1}}\right) e(\gamma) \tag{15}$$

where $k_p$ and $k_i$ are the proportional gain and the integral gain respectively, $e$ is the error and $u$ denotes the control signal. the representation of $\gamma^{-1}$ in simulink is shown in fig. 3.

fig. 3 representation of γ-1 in matlab simulink

the transfer function of the pi controller in the γ domain can be obtained from equation (14) by using equation (12). the simulations of the pi controller in the three stated domains are given in fig. 4.

3.2.2. ziegler-nichols approach for tuning of pi controller

the ziegler-nichols approach for tuning industrial controllers is the best known [33] and the one most favored by process control engineers in practice [34]. in this work, the ziegler-nichols approach is used to find the pi controller parameters for the buck converter in the continuous time domain. the integral gain (ki) of the controller is set to zero and the proportional gain is slowly increased till a sustained oscillation is observed. the value of the proportional gain (kp) for which the sustained oscillation is observed is called the critical gain and is denoted by kc. the frequency of the oscillations is measured and is called the critical frequency (fc).
the values of kp and ki are tabulated as per the guidelines of ziegler and nichols and are given in table 2.

table 2 setting of pi controller parameters using ziegler-nichols rule

controller   kp        ki
pi           0.45 kc   1.2 fc

the values of kp and ki are optimised through the guidelines of the ziegler-nichols chart. the optimised values of kp and ki obtained are 0.22 and 0.01 respectively. the continuous time transfer function of the pi controller is given by (16).

$$G_{PI}(s) = 0.22 + \frac{0.01}{s} \tag{16}$$

the corresponding γ-domain transfer function of the pi controller is expressed by (17), and the controller structure given in (17) is realized using matlab/simulink and the dspace board for the implementation of the pi controller in the delta domain.

$$G_{PI}(\gamma) = 0.22 + 0.005\,\Delta + 0.01\,\gamma^{-1} \tag{17}$$

fig. 4 (a) pi controller in the continuous domain, (b) pi controller in the discrete z domain, (c) pi controller in the discrete γ domain

3.2.3. mechanism for design of digital controller in delta domain

the complete mechanism for the design of the proposed pi controller for the buck converter in the discrete delta domain is illustrated with a flowchart as shown in fig. 5.

fig. 5 flowchart describing the complete mechanism of controller design in the delta domain

4. simulation and practical results

fig. 6 shows the schematic diagram of the proposed work.

fig. 6 schematic diagram of the proposed method

4.1. simulation

the experiment has been simulated first using matlab/simulink with the simelectronics module. the simulation of closed-loop control of the dc-dc buck converter with an r load, using the continuous-time pi controller and the pi controller in the delta domain, is depicted in fig. 7.

fig. 7 simulink model for closed-loop control of dc-dc buck converter using (a) continuous time pi controller with r load, (b) delta domain pi controller with r load

4.2. hardware implementation

in this work, the controller used for the control action is built with the dspace microlab board. in the year 2000, at bradley university, the dspace ds1102 was first used after developing the user’s manual and a workstation based on this board. after that, a newer dspace ds1103 board was developed. in this experiment, the latest version, the dspace ds1202, has been used. the design and simulation of the controller are done on a pc using matlab simulink and the dspace block sets, the matlab-to-dsp interface libraries, the real-time interface to simulink, and real-time workshop. the output from the ds1202 includes the pwm signal that triggers the gate of the mosfet of the dc-dc buck converter. the dspace ds1202 system used for the implementation of the control system is a mixed fpga/dsp digital controller with a powerful processor for floating-point computation. the key of the ds1202 is plugged into the pci slot of the host computer. the control system is automatically processed and run on the ds1202 after being developed using matlab/simulink. a graphical user interface (gui) has been built using dspace. it allows the real-time evaluation of the control system. the “control desk” is used for multiple services. it provides the interface through which the controller model designed in simulink can be downloaded onto the dsp. various measurements, viz. the regulated output (voltage and current), the duty cycle of the pwm signal and the error fed to the controller, can be displayed on the instrument panel of the control desk. the primary role of the ds1202 is to act as an interface between the external hardware portion of the overall system and the simulation.
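in software terms, the control law that the board evaluates once per sampling period can be sketched as below. this is not the simulink/dspace model used in the paper: the class name `DeltaPI`, the clamping of the output to a duty ratio between 0 and 1, and the constant 1 v test error are assumptions made for illustration, and anti-windup is left out. the integral is realised as the accumulator form of γ-1 used in (15), with the gains 0.22 and 0.01 and the sampling time 0.00001 sec quoted in the text.

```python
class DeltaPI:
    """PI controller whose integral term is gamma^-1, realised as the
    accumulator step * z^-1 / (1 - z^-1)."""

    def __init__(self, kp, ki, step, u_min=0.0, u_max=1.0):
        self.kp = kp
        self.ki = ki
        self.step = step
        self.u_min = u_min      # duty-ratio limits are an assumption of this sketch
        self.u_max = u_max
        self.integral = 0.0     # output of gamma^-1 (running integral of the error)

    def update(self, error):
        """Return the duty ratio for the current sample, then advance the integrator."""
        u = self.kp * error + self.ki * self.integral
        # gamma^-1: next integral = current integral + step * current error
        self.integral += self.step * error
        return min(self.u_max, max(self.u_min, u))


if __name__ == "__main__":
    pi = DeltaPI(kp=0.22, ki=0.01, step=1e-5)
    duty = 0.0
    for _ in range(100_000):    # one second of a constant 1 V error (test input)
        duty = pi.update(1.0)
    print(f"duty ratio after 1 s of constant error: {duty:.4f}")
```

because the integrator state grows by step × error each sample, its value tracks the continuous-time integral of the error regardless of how small the sampling time is made, which is the practical appeal of the delta form at high sampling rates.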
the ds1202 contains connectors for thirty-two (32) analog-to-digital inputs and sixteen (16) digital-to-analog outputs; there are forty-eight (48) other connectors that can be used for digital i/o, slave/dsp i/o, incremental encoder interfaces, can interface and serial interfaces. the adc that is used to feed back the output voltage has 16 bits, i.e., it represents values between 0 and 65535. therefore, the resolution of the adc is (8/65535) v = 0.122 mv. the converter regulates the voltage to 4 v, which needs to be represented by 32767.5. but decimals cannot be represented due to the finite word length effect, so the reference voltage is represented by 32767. the adc resolution error using dspace is much smaller than that of the 8-bit adc normally included in a microcontroller. the inherent error in the reference voltage is 0.122/2 = 0.061 mv. the adc resolution error can be as small as 0.122 mv. the pwm generator used here is an in-built pwm generator that takes the duty ratio as the control input and generates a pwm signal accordingly at a given frequency. the amplitude of the pwm signal can be set to 2.5 v, 3.5 v, or 5 v. fig. 8 shows the circuit implementation of the buck converter and fig. 9 shows the simulated delta domain pi controller with dspace i/o blocks.

fig. 8 the circuit implementation of the dc-dc buck converter

fig. 9 discrete pi controller in the delta-domain with dspace rti blocks

4.3. result analysis

4.3.1. simulation result

fig. 10 shows the response of the continuous-time pi controller and the response of the pi controller in the delta domain with low and high sample rates. the resemblance between the continuous-time response and the response of the pi controller designed in the delta domain at a high sampling rate is also shown here.

fig. 10 simulation result with 100 kω resistance (a) continuous domain pi controller response, (b) delta domain pi controller response with a sampling time of 0.5 sec, (c) delta domain pi controller response with a sampling time of 0.00001 sec

fig. 11 shows the simulation result of the complete closed loop system under different load conditions. at first, the load is taken to be purely resistive and varied over the range of 100 ω to 100 kω. subsequently, a 100 h inductor is added to test the behaviour of the system under inductive load. in each case, the controller output remains at a steady desired output voltage of 4 v.

fig. 11 simulation of the system under different load conditions. (a) with 100 ω resistance, (b) with 100 kω resistance, (c) with r-l load consisting of 100 kω resistance and 100 h inductor

in each case, the current variation is shown with the variation of the load. the output voltage regulation of the buck converter with the load variation is thus depicted in terms of current variation in fig. 11.

4.4. real-time experimental results

fig. 12 shows the complete hardware setup with the microlab dspace board, the designed buck converter, the cpu and the dso.

fig. 12 complete hardware setup with 4 volts as reference

the load is varied over the range from 100 ω to 100 kω. with the variation of load resistance, the output of the buck converter settles at almost 4 v, which is the desired set point. the variation of the output with the changes in load resistance is depicted in figure 13. the output of the buck converter at a load resistance of 100 ω is shown in figure 13(a), whereas figure 13(b) illustrates the output voltage with a load resistance of 100 kω. therefore, it is evident that the controller successfully regulates the output voltage of the buck converter under load variation.

fig. 13 regulation of output voltage for load resistance of (a) 100 ω, (b) 100 kω

the system is also tested with the variation of the source voltage, keeping the set point fixed at 4 volt. in fig. 14, the output voltage regulation for source voltages of 6 v, 8 v and 20 v is depicted. it is observed that with the different input voltage levels, the output of the buck converter is stable at the desired set point, which is 4.0 v in this experiment. thus, the controller works efficiently to keep its output level fixed even if there are changes in the source voltage.

fig. 14 regulation of output voltage with the variation of the source voltages of (a) 6 volt, (b) 8 volt, (c) 20 volt

pwm signals with different duty cycles are generated through the dspace board and are shown in fig. 15. the pwm signal is varied with the changes of the reference voltage.

fig. 15 pwm signal fed to the buck converter at (a) 75% duty cycle, (b) 50% duty cycle

5. conclusion

the work in this paper deals with the development of the digital pi controller in the delta domain and its implementation on the ds1202 dspace board for the dc-dc buck converter. the mathematical analysis, simulations and experiments are conducted using a pi-compensated buck converter. it is found that the performance of the chosen dc-dc buck converter is satisfactory under variation of supply voltage as well as load. the output voltage changes are only 0.25% when the load and supply voltage are varied, as shown in fig. 11, fig. 13 and fig. 14 respectively. by varying the duty cycle of the pwm using the designed controller in dspace, the output voltage is adjusted proportionately as shown in fig. 15. from the simulation result as given in fig.
10, it is found that when the sampling time (Δ) is reduced to 0.00001 sec, the desired result is obtained, which is almost the same as the output obtained by using the continuous-time controller. this result shows that at a fast sampling limit the delta parameterised system provides meaningful information. the mathematical derivations, simulations, and experiments performed in this paper lead to the conclusion that delta operator parameterized discrete-time controllers exhibit certain numerical advantages, and that at a high sampling rate the results of the continuous-time controllers are almost the same as the results obtained by using the controller designed in the delta domain. this leads to the development of a unified approach for digital controller design for the buck converter using the delta operator.

references

[1] n. mohan, t. m. undeland, & w. p. robbins, power electronics: converters, applications, and design, john wiley & sons, new york, 2002.
[2] g. dileep & s. n. singh, "selection of non-isolated dc-dc converters for solar photo voltaic system", renewable and sustainable energy reviews, vol. 76, pp. 1230-1247, 2017.
[3] a. bhaumik, y. kumar, s. srivastava, & m. islam, "performance studies of a separately excited dc motor speed control fed by a buck converter using optimized piλdμ controller", in proceedings of the int. conf. circuit, power comput. technol. (iccpct), nagercoil, india, 2016.
[4] m. hassanalieragh, t. soyata, a. nadeau, & g. sharma, "ur-solarcap: an open-source intelligent auto-wakeup solar energy harvesting system for supercapacitor-based energy buffering", ieee access, vol. 4, pp. 542-557, 2016.
[5] q. xu, c. zhang, c. wen, & p. wang, "a novel composite nonlinear controller for stabilization of constant power load in dc microgrid", ieee trans. smart grid, vol. 10, no. 1, pp. 752-761, 2010.
[6] d. kumar, f. zare, & a. ghosh, "dc microgrid technology: system architectures, ac grid interfaces, grounding schemes, power quality, communication networks, applications, and standardizations aspects", ieee access, vol. 5, pp. 12230-12256, 2017.
[7] liu qingpeng, research on buck converter based on linear feedback control [d], northeast petroleum university, 2012.
[8] m. ghamari, h. mollaee, f. khavari, "robust self-tuning regressive adaptive controller design for a dc–dc buck converter", measurement, vol. 174, 109071, 2021.
[9] c. deekshitha & k. latha shenoy, "design and simulation of synchronous buck converter for led application", in proceedings of the 2nd ieee international conference on recent trends in electronics information & communication technology, bangalore, india, 2017, pp. 142-146.
[10] z. yichen, x. hejin & l. deming, "feedback control of fractional piλdμ for dc/dc buck converters", in proceedings of the international conference on industrial informatics - computing technology, intelligent technology, industrial information integration, wuhan, china, 2017, pp. 219-222.
[11] a. m. abdurraqueeb, a. a. al-shamma’a, a. alkhuhyali, a. m. noman, k. e. addoweesh, "rst digital robust control for dc/dc buck converter feeding constant power load", mathematics, vol. 10, id. 1782, p. 15, 2022.
[12] g. abbas, j. gu, u. farooq, m. irfan abid, a. raza, m. asad, v. e. balas and m. e. balas, "optimized digital controllers for switching-mode dc-dc step-down converter", electronics, vol. 7, no. 12, id. 412, p. 25, 2018.
[13] g. abbas, m. nazeer, v. balas, t.-c. lin, m. balas, m. asad, a. raza, m. shehzad, u. farooq and j. gu, "derivative-free direct search optimization method for enhancing performance of analytical design approach-based digital controller for switching regulator", energies, vol. 12, no. 11, id. 2183, p. 18, 2019.
[14] k. r. kumar & s. jeevananthan, "design of sliding mode control for negative output elementary super lift luo converter operated in continuous conduction mode", in proceedings of the communication control and computing technologies (icccct), ramanathapuram, india, 2010, pp. 138-148.
[15] k. sharma & d. k. palwalia, "design of digital pid controller for voltage mode control of dc-dc converters", in proceedings of the international conference on microelectronic devices, circuits and systems (icmdcs), vellore, india, 2017.
[16] l. h. guang, w. bo & l. g. you, delta operator control and its robust control theory basis (m), national defence industry press, beijing, 2005.
[17] r. h. middleton & g. c. goodwin, digital control and estimation: a unified approach, prentice-hall, englewood cliffs, new jersey, 1990.
[18] r. h. middleton & g. c. goodwin, "improved finite wordlength characteristics in digital control using delta operators", ieee transactions on automatic control, vol. 31, no. 11, pp. 1015-1021, 1986.
[19] j. cortes-romero, a. luviana-juarez & h. sira-ramirez, "a delta operator approach for the discrete-time active disturbance rejection control on induction motors", mathematical problems in engineering, vol. 2013, id. 572026, p. 9, 2013.
[20] g. maione, "high-speed digital realisation of fractional operators in delta domain", ieee transactions on automatic control, vol. 56, no. 3, pp. 697-702, 2011.
[21] s. ganguli, g. kaur & p. sarkar, "a hybrid intelligent technique for model order reduction in the delta domain: a unified approach", soft computing, springer nature, vol. 23, pp. 4801-4814, 2018.
[22] y. zhao & d. zhang, "h∞ fault detection for uncertain delta operator systems with packet dropout and limited communication", in proceedings of the american control conference, seattle, wa, usa, 2017, pp. 4772-4777.
[23] j. gao, s. chai, m. shuai, b. zhang & l. cui, "detecting false data injection attack on cyber-physical system based on delta operator", in proceedings of the 37th chinese control conference, wuhan, china, 2018.
[24] j. zhou, d. zhang, ieee access, multidisciplinary, vol. 7, id. 94448, 2019.
[25] a. mondal, p. sarkar, a. hazra, "a unified approach for pi controller design in delta domain for indirect field-oriented control of induction motor drive", journal of engineering research, vol. 8, no. 3, pp. 118-134, 2020.
[26] b. l. eidson, an experimental evaluation of delta operator in digital control, auburn, alabama, 2010.
[27] i. laoprom, s. tunyasrirut, "design of pi controller for voltage controller of four-phase interleaved boost converter using particle swarm optimization", journal of control science and engineering, id. 9515160, p. 13, 2020.
[28] k. s. rao, r. mishra, "comparative study of p, pi and pid controller for speed control of vsi-fed induction motor", international journal of engineering development and research, vol. 2, no. 2, pp. 2740-2744, 2014.
[29] l. a. quezada-téllez, l. franco-pérez, g. fernandez-anaya, "controlling chaos for a fractional-order discrete system", ieee open journal of circuits and systems, vol. 1, pp. 263-269, 2020.
[30] p. hu, s. chen, h. huang, g. zhang, l. liu, "improved alpha-guided grey wolf optimizer", ieee access, vol. 7, pp. 5421-5437, 2018.
[31] t. s. anandhi, k. muthukumar & s. p. natarajan, "dspace based implementation of pid controller for buck converter", in proceedings of the dspace user conference, 2012.
[32] v. r, r. g, k. k. b & a. k. g, "dspace based 12/24v closed loop boost converter for low power applications", in proceedings of the international conference on computation of power, energy, information and communication, chennai, india, 2014, pp. 213-217.
[33] k. ogata, modern control system, university of minnesota, prentice hall, 1987.
[34] n pillai, p. a. govender, particle swarm optimization approach for model independent tuning of pid control loop, ieee africon, ieee catalog: 04ch37590c, 2007.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 383-390 doi: 10.2298/fuee1703383v

exploration towards electrostatic integrity for sige on insulator (sg-oi) on junctionless channel transistor (jlct)

b vandana, jitendra kumar das, b shivaval patro, sushanta kumar mohapatra
school of electronics engineering, kiit university, bhubaneswar, odisha, india

abstract. in view of reduced electric field and avoiding source drain engineering, the work explores the strain effect in the junctionless channel transistor. to achieve scaled ioff and maintain ion, the sg-oi jlct device is proposed here. the study discusses higher switching action with mole fraction x = 0.25. the dependency of ϕm and nd is responsible for maintaining constant current for the overall analysis.

key words: sg-oi jlct, soi jlt, drift diffusion carrier mobility, on-off currents.

1. introduction

the development of the si-based semiconductor industry started in 1959 and still continues to follow moore’s law. technology scaling has introduced different leakage currents in the conventional metal oxide semiconductor (mos) field effect transistor (fet), which internally affect the device performance. this poses a challenge to device engineers.
a brief description of various leakage currents is given in [1], describing the issues at scaled channel length. to overcome these issues, various approaches such as high-κ gate oxide engineering, spacer engineering, and new materials and structural designs have been reported [2], [3], and new process technologies and device structures have been invented. new design architectures such as silicon on insulator (soi) [4], double gate mosfet (dgmosfet) [5], [6], tri gate mosfet (tmosfet) [7], gate all around (gaa-mosfet) [8], fin-fet [9], [10] etc. have been described. apart from these, a new device structure based on lilienfeld’s first transistor architecture [11] has been identified, followed by various structural design approaches such as the trigate architecture with no doping gradients given in [12]. accordingly, vertical gate stack soi and bulk planar junctionless transistors are reported in [13]–[15].

received september 29, 2016; received in revised form january 23, 2017
corresponding author: b vandana, school of electronics engineering, kiit university, bhubaneswar, odisha, india (e-mail: vandana.rao20@gmail.com)

junctionless nanowire (jn) transistors are uniformly doped with a heavy doping profile of $10^{19}$ to $10^{20}$ cm$^{-3}$ within a si device layer; the jn is usually an on resistor which does not require any metallurgical junctions across the channel edges. depending on the type of transistor, n+ or p+ doping is applied along the s/d and channel regions. this approach is simple and can be fabricated with standard cmos technology [16], [17]. the physics behind the architecture is given for lg < 20 nm. for an n-type mosfet, due to the n+ doping concentration, a high electric field is generated along the vertical direction, which makes the channel fully depleted below vth with vgs = 0 v, and above vth the field drops to zero.
therefore, owing to its specific merits, the jlt in its different structural variants is preferable for suppressing short channel effects under scaling. the paper proposes a sige on insulator (sg-oi) [18]–[21] junctionless channel transistor. a uniformly nd-doped sige device layer is used to evaluate the performance of the device with respect to ioff. the parameters listed in table 1 are used to investigate the electrostatic integrity of the device. the obtained results are verified against the soi-jlt and the conventional mosfet. fig. 2 shows that our simulation model is in good agreement with [13]. retaining the inherent features of the jlt, a si0.75ge0.25 material (mole fraction x = 0.25) is used along the s/d and channel regions with a uniformly high nd. in the conduction mechanism of the jlt, the difference ϕm − ϕs (the jlt conducts above a 5 ev work function) leads to a positive shift in vth; the bands become flat at vfb, which opens a path for conduction at positive vgs. the channel depletes completely at zero vgs. at high nd the mobility degrades perpendicular to the channel, whereas a low electric field enhances the mobility.

following this introduction, section 2 discusses the device structure, the physics behind the device, and the models activated for the simulations. section 3 describes the study of the electrostatic integrity of the sg-oi jlct. section 4 gives the conclusion and remarks on the proposed device.

2. silicon germanium on insulator junction-less channel transistor (sg-oi jlct)

the schematic diagram of the sg-oi jlct is shown in fig. 1. the architecture has no metallurgical junction in the lateral direction, hence the name jlct. the device has been designed according to the features and specifications listed in table 1, and the parameter specifications are taken from [4], [13], [14].

fig. 1 cross sectional view of sg-oi jlct
junctionless transistors are devices with no doping gradients across the source, channel and drain edges. usually, the device layer is doped with a high doping density. the gate metal has a high work function ϕm of 5.1 ev. sio2 is used for isolation. the spacers are made of high-κ hfo2 [22], [23]. a fully depleted sige layer is grown epitaxially on an insulator (fd sg-oi jlct), forming a conducting path across the device layer. strain-induced effects occur when a sige layer is grown epitaxially on a thin silicon substrate [24], [25]. here the device performance is assessed in a simple way by considering a relaxed sige layer and changing the mole fraction value.

the simulation uses the default drift-diffusion carrier transport model. the mobility model depends on the doping concentration and includes high field saturation and transverse field dependence. as sige is a compound material, a mole fraction dependent effective intrinsic density band gap narrowing model for sige is used for the device. the structure is assumed to be abrupt and is simulated at room temperature. the drift-diffusion equations are solved self-consistently. because of the high nd along the lateral direction, the oldslotboom band gap narrowing model and the shockley-read-hall (srh) mechanism are taken into account. the model calculates the intrinsic carriers for silicon material. it then improves the carrier mobility under high field saturation. all simulations are carried out using the sentaurus tcad 2d simulator [26], [27].

fig. 2 comparison of id,lin with respect to vgs for the sg-oi jlct at vds = 1 v and [13], simulated with lg = 20 nm, ϕm = 5.1 ev, nd = 1.5e19 cm$^{-3}$

table 1 parameters required for simulation [28], [13].
| parameters | sg-oi jlct | conventional mosfet |
|---|---|---|
| sige layer (tsi) for sg-oi jlct | 5 nm | 5 nm |
| donor doping (nd) | 1.5×10$^{19}$ cm$^{-3}$ | 10$^{18}$ cm$^{-3}$ |
| eot of gate dielectric (tox) | 1 nm | 1 nm |
| gate work function (ϕm) | 5.1 ev | 4.6 ev |
| well doping (na) | 5×10$^{18}$ cm$^{-3}$ | 10$^{15}$ cm$^{-3}$ |
| drain supply voltage (vdd) | 0.05 v, 1 v | 0.05 v, 1 v |
| channel length (lg) | 20 nm | 20 nm |

3. results and discussions

this section presents the results of the simulations, carried out with vds = 0.05 v for id,lin and vds = 1 v for id,sat, with vgs = 1 v. the paper focuses on the electrostatic integrity (ei) parameter, which determines the short channel effects (sce) and the dibl as given in equations (3) and (2). ei gives a qualitative measure of the control of the gate over the channel. in short channel devices the gate control of the channel is affected by the electric field lines running from source to drain. as the approach is fully depleted, the sg-oi ei is given by equation (1); most of the electric field lines propagate through the box to the channel, which can reduce sce. this has the drawback of increasing the junction capacitance and the body effect [30]. firstly, from fig. 4, our model is well suited for reducing ioff to $10^{-13}$ a while ion is maintained at $10^{-6}$ a, which is then compared with [13]. a comparative analysis is shown for the conventional mosfet, the soi jlt and the sg-oi jlct. the main intention behind the analysis is to scale ioff. the requirements for scaling ioff are [28]: (1) a thin channel region, (2) high-κ spacers, which improve ioff, and (3) a temperature- and doping-dependent channel.

$$EI = \left(1 + \frac{t_{si}^2}{L_{el}^2}\right)\frac{t_{ox}}{L_{el}}\,\frac{t_{si} + \lambda\, t_{box}}{L_{el}} \tag{1}$$

$$DIBL = 0.80\,\frac{\varepsilon_{si}}{\varepsilon_{ox}}\,EI\;V_{ds} \tag{2}$$

$$SCE = 0.64\,\frac{\varepsilon_{si}}{\varepsilon_{ox}}\,EI\;V_{bi} \tag{3}$$

fig. 3 comparative analysis of id,lin with respect to vgs for soi jlct and sg-oi jlct, with ϕm = 5.1 ev, nd = 1.5e19 cm$^{-3}$ and lg = 20 nm

fig. 3 compares id,lin with respect to vgs for the soi jlct and the sg-oi jlct. ion is improved in the case of the sg-oi jlct, while ioff is better in the soi jlct. this shows that at x = 0.25 the values are similar to those given in [13]. as sige is a compound material, its band gap can vary from 0.6 to 1.1 ev. this variation of the band gap is obtained by tuning the mole fraction value (x = 0.25, x = 0.5 and x = 0.75). at x = 0.25 the si content is high; therefore the device acquires the si material characteristics although the channel is a sige material. the band gap of si is 1.1 ev, and for sige at x = 0.25 the band gap is close to 1.1 ev. for a mole fraction of x = 0.75 the band gap is close to 0.6 ev, hence the channel behaves according to the ge material properties, as shown in fig. 4. ref. [31] reports advances in the electrical performance of ge mosfets, namely the switching activity and the mobility enhancement relative to a si mosfet [32]. fig. 5 shows the dibl as a function of ion for both the sg-oi jlct and the soi jlct. at x = 0.25 ion is improved in the case of the sg-oi jlct while the dibl remains equal for both devices; at x = 0.75 ion is found to be lower and the dibl is very high, which is not acceptable. in order to improve ion and dibl for x = 0.75, a proper tuning of nd and of the work function is suggested.

fig. 4 energy with respect to distance “x” along the channel for the sg-oi jlct, for a si1-xgex channel (x = 0.25, 0.5, 0.75), vdsat = 0.7 v and tsi = 5 nm

fig. 5 dibl with respect to ion for both the sg-oi jlct and the soi jlct, with vd,lin = 0.05 v, vd,sat = 1 v and x = 0.25, 0.5, 0.75

fig. 6 investigates the impact of ϕm on ion and ioff; as the jlt works at ϕm > 5 ev, the performance of the device is shown accordingly. it is clear that at ϕm > 5.2 ev ion and ioff start degrading.
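equations (1)–(3) can be evaluated directly once the layer thicknesses are known. the following python sketch is only an illustration under stated assumptions: tsi, tox and lg are taken from table 1 and the electrical channel length is assumed equal to lg, whereas the box thickness, the weighting factor λ, the built-in voltage and the permittivity values are hypothetical placeholders that are not given in the paper.

```python
def electrostatic_integrity(t_si, t_ox, t_box, l_el, lam):
    """Equation (1); all lengths must use the same unit (nm here)."""
    return (1.0 + (t_si / l_el) ** 2) * (t_ox / l_el) * ((t_si + lam * t_box) / l_el)


def dibl(ei, v_ds, eps_si=11.7, eps_ox=3.9):
    """Equation (2); result in volts. Permittivities are assumed textbook values."""
    return 0.80 * (eps_si / eps_ox) * ei * v_ds


def sce(ei, v_bi, eps_si=11.7, eps_ox=3.9):
    """Equation (3); result in volts. Permittivities are assumed textbook values."""
    return 0.64 * (eps_si / eps_ox) * ei * v_bi


# Values from Table 1 (L_el is assumed equal to the channel length lg).
T_SI_NM = 5.0
T_OX_NM = 1.0
L_EL_NM = 20.0

# Hypothetical values, NOT given in the paper.
T_BOX_NM = 10.0   # buried oxide thickness
LAMBDA = 1.0      # weighting factor of the box term
V_BI = 0.6        # built-in voltage in volts

ei = electrostatic_integrity(T_SI_NM, T_OX_NM, T_BOX_NM, L_EL_NM, LAMBDA)
print("EI   =", ei)
print("DIBL =", dibl(ei, v_ds=1.0), "V")
print("SCE  =", sce(ei, v_bi=V_BI), "V")
```

because ei scales with the thicknesses divided by the channel length, thinning the channel or the box in this sketch lowers both dibl and sce, which is the trend the text relies on.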
although the device takes on the si material properties, the ge concentration in the sige channel affects the electric field at low vth, which results in an ioff improvement. the si-channel jlt in [14] has ϕm = 5.5 ev. in the proposed work, fig. 3 compares id,lin as a function of vgs for the sg-oi jlct with x = 0.25. as the value of x increases, a drastic degradation of the device performance takes place, as shown in fig. 7 in terms of the ion/ioff ratio.

fig. 6 impact of metal work function ϕm on ion and ioff of the sg-oi jlct with x = 0.25, nd = 1.5e19 cm$^{-3}$, eot = 1 nm and lg = 20 nm

fig. 7 ion/ioff of the sg-oi jlct with different x composition (x = 0.25, 0.5, 0.75), ϕm = 5.1 ev, nd = 1.5e19 cm$^{-3}$, eot = 1 nm and lg = 20 nm

4. conclusion

the paper investigates an improvement in ioff while maintaining ion at $10^{-6}$ a. the conduction mechanism of the sg-oi jlct, based on the concept of relaxed sige on insulator, is explained. the id,lin and id,sat values at x = 0.25 and ϕm = 5.1 ev are used to estimate ioff. from the above results, the sg-oi jlct performs better at x = 0.25, with the drift-diffusion carrier mobility, the srh mechanism and the high field saturation mobility model activated in the sentaurus tcad 2d simulator. on this basis the electrostatic integrity of the device can be evaluated.

references
[1] k. roy, s. mukhopadhyay, and h. mahmoodi-meimand, “leakage current mechanisms and leakage reduction techniques in deep-submicrometer cmos circuits,” proc. ieee, vol. 91, no. 2, pp. 305–327, 2003.
[2] m. t. bohr, r. s. chau, t. ghani, and k. mistry, “the high-k solution: microprocessors entering production this year are the result of the biggest transistor redesign in 40 years,” ieee spectr., vol. 44, no. 10, pp. 23–29, 2007.
[3] s. das and s.
kundu, “simulation to study the effect of oxide thickness and high-dielectric on drain-induced barrier lowering in n-type mosfet,” ieee trans. nanotechnol., vol. 12, no. 6, pp. 945–947, 2013.
[4] j.-p. colinge, “soi materials,” in silicon-on-insulator technology: materials to vlsi, springer, 1997, pp. 7–65.
[5] f. balestra, s. cristoloveanu, m. benachir, j. brini, and t. elewa, “double-gate silicon-on-insulator transistor with volume inversion: a new device with greatly enhanced performance,” ieee electron device lett., vol. 8, no. 9, pp. 410–412, 1987.
[6] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “some device design considerations to enhance the performance of dg-mosfets,” trans. electr. electron. mater., vol. 14, no. 6, pp. 291–294, 2013.
[7] m. g. c. de andrade, j. a. martino, m. aoulaiche, n. collaert, e. simoen, and c. claeys, “behavior of triple-gate bulk finfets with and without dtmos operation,” solid. state. electron., vol. 71, pp. 63–68, 2012.
[8] j.-p. colinge, m. h. gao, a. romano-rodriguez, h. maes, and c. claeys, “silicon-on-insulator ‘gate-all-around device’,” in technical digest of the international electron devices meeting, 1990. iedm’90., 1990, pp. 595–598.
[9] b. ho, x. sun, c. shin, and t.-j. k. liu, “design optimization of multigate bulk mosfets,” ieee trans. electron devices, vol. 60, no. 1, pp. 28–33, 2013.
[10] e. a. cartier, b. j. greene, d. guo, g. wang, y. wang, and k. k. h. wong, “finfet structure and method to adjust threshold voltage in a finfet structure.” google patents, 2015.
[11] j. e. lilienfeld, “method and apparatus for controlling electric currents,” 1925.
[12] j.-p. colinge, i. ferain, a. kranti, c.-w. lee, n. d. akhavan, p. razavi, r. yan, and r. yu, “junctionless nanowire transistor: complementary metal-oxide-semiconductor without junctions,” sci. adv. mater., vol. 3, no. 3, pp. 477–482, 2011.
[13] j.-p. colinge, c.-w. lee, a. afzalian, n. d. akhavan, r. yan, i. ferain, p. razavi, b. o’neill, a. blake, m. white, a.-m. kelleher, b. mccarthy, and r. murphy, “nanowire transistors without junctions,” nat. nanotechnol., vol. 5, no. 3, pp. 225–229, 2010.
[14] a. kranti, r. yan, c. w. lee, i. ferain, r. yu, n. d. akhavan, p. razavi, and j. p. colinge, “junctionless nanowire transistor (jnt): properties and design guidelines,” in proc. of the essderc, 2010, pp. 357–360.
[15] s. gundapaneni, s. ganguly, and a. kottantharayil, “bulk planar junctionless transistor (bpjlt): an attractive device alternative for scaling,” ieee electron device lett., vol. 32, no. 3, pp. 261–263, 2011.
[16] c.-w. lee, i. ferain, a. afzalian, r. yan, n. d. akhavan, p. razavi, and j.-p. colinge, “performance estimation of junctionless multigate transistors,” solid. state. electron., vol. 54, no. 2, pp. 97–103, 2010.
[17] c.-w. lee, a. afzalian, n. d. akhavan, r. yan, i. ferain, and j.-p. colinge, “junctionless multigate field-effect transistor,” appl. phys. lett., vol. 94, no. 5, p. 53511, 2009.
[18] t. irisawa, t. numata, e. toyoda, n. hirashita, t. tezuka, n. sugiyama, and s. takagi, “physical understanding of strain-induced modulation of gate oxide reliability in mosfets,” ieee trans. electron devices, vol. 55, no. 11, pp. 3159–3166, 2008.
[19] m. a. hopcroft, w. d. nix, and t. w. kenny, “what is the young’s modulus of silicon?,” j. microelectromechanical syst., vol. 19, no. 2, pp. 229–238, 2010.
[20] k. p. pradhan, p. k. sahu, d. singh, l. artola, and s. k. mohapatra, “reliability analysis of charge plasma based double material gate oxide (dmgo) sige-on-insulator (sgoi) mosfet,” superlattices microstruct., vol. 85, pp. 149–155, 2015.
[21] c. k. maiti and g. a. armstrong, applications of silicon-germanium heterostructure devices. crc press, 2001.
[22] w. j. zhu, t. tamagawa, m. gibson, t. furukawa, and t. p. ma, “effect of al inclusion in hfo2 on the physical and electrical properties of the dielectrics,” ieee electron device lett., vol. 23, no. 11, pp. 649–651, 2002.
[23] m. wu, y. i. alivov, and h. morkoc, “high-κ dielectrics and advanced channel concepts for si mosfet,” j. mater. sci. mater. electron., vol. 19, no. 10, pp. 915–951, 2008.
[24] p. goyal, “design and simulation of strained-si/strained-sige dual channel hetero-structure mosfets,” 2007.
[25] y. sun, s. e. thompson, and t. nishida, “physics of strain effects in semiconductors and metal-oxide-semiconductor field-effect transistors,” j. appl. phys., vol. 101, no. 10, p. 104503, 2007.
[26] http://www.synopsys.com/, “sentaurus tcad user’s manual,” in synopsys sentaurus device, 2012.
[27] l-2016.03, “sentaurus tm device user,” september, 2014.
[28] s. gundapaneni, m. bajaj, r. k. pandey, k. v. r. m. murali, s. ganguly, and a. kottantharayil, “effect of band-to-band tunneling on junctionless transistors,” ieee trans. electron devices, vol. 59, no. 4, pp. 1023–1029, 2012.
[29] “the international technology roadmap for semiconductors,” 2015.
[30] j. p. colinge, “the new generation of soi mosfets,” rom. j. inf. sci. technol, vol. 11, no. 1, pp. 3–15, 2008.
[31] d. p. brunco, b. de jaeger, g. eneman, j. mitard, g. hellings, a. satta, v. terzieva, l. souriau, f. e. leys, g. pourtois, and others, “germanium mosfet devices: advances in materials understanding, process development, and electrical performance,” j. electrochem. soc., vol. 155, no. 7, pp. h552–h561, 2008.
[32] s. c. martin, l. m. hitt, and j. j. rosenberg, “p-channel germanium mosfets with high channel mobility,” ieee electron device lett., vol. 10, no. 7, pp. 325–326, 1989.

facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp.
27-38 doi: 10.2298/fuee1701027z

calculation model for the induced voltage in rectangular coils above conductive plates

siquan zhang 1, nathan ida 2
1 department of electrical and automation, shanghai maritime university, shanghai, 201306, china
2 department of electrical and computer engineering, the university of akron, akron, oh, 44325-3904, usa

abstract. electromagnetic ndt methods, and in particular eddy currents, play an important role in nondestructive testing of conducting materials. in testing conductive structures, rectangular coils are often more useful than circular coils. a particular configuration consists of two rectangular coils located above the conductive plates, one placed parallel to the plates serving as an excitation coil and the other perpendicular to the plates serving as a sensing coil. in this work we derive analytical expressions for the induced voltage variations in the pick-up coil. then the influences of the plate thickness, the exciting frequency and the moving speed of the conductor on the induced voltage variation are analyzed. the analytical calculation results are verified using the finite element method.

key words: eddy current testing, conductive plates, rectangular coil, induced voltage, finite element method.

1. introduction

eddy current testing (ect) techniques are widely used in testing of conductive structures, with the advantage of high sensitivity when testing for surface flaws [1-3]. in standard eddy current testing a circular coil carrying current is used to test the conductive specimen. the alternating current in the coil generates an alternating magnetic field, which interacts with the test specimen and generates eddy currents. however, rectangular coils are more useful than circular coils, because the rectangular coil is not axisymmetric, hence it affects the field inside the medium, resulting in higher sensitivity to sub-surface flaws [4].
in spite of these advantages, rectangular coils have seldom been discussed in the literature. in this paper, we analyse a model with two rectangular coils, one serving as the exciting coil and the other as the pick-up coil, both located above the conductive plates. the characteristics of the conductive materials or the parameters of flaws can be evaluated from the induced voltage variation in the pick-up coil. the validity of the theoretical analysis is confirmed by the finite element method (fem).

received august 17, 2016
corresponding author: nathan ida, department of electrical and computer engineering, the university of akron, akron, oh, 44325-3904, usa (e-mail: ida@uakron.edu)

2. theoretical analysis

2.1. analytical model

fig. 1 shows two rectangular single-turn coils located above multi-layer conductive plates. the exciting coil is parallel to the surface of the conductor, which coincides with the z = 0 plane. the exciting coil has dimensions 2a1, 2b1 and a lift-off z0. an ac harmonic current $I e^{j\omega t}$ flows in the coil. the pick-up coil is parallel to the yz plane and perpendicular to the conductor; it has dimensions 2a2, 2b2 and a lift-off z0+w2. the thickness, conductivity and permeability of the two-layer conductive plate are assumed to be di, σi and μi (i = 1, 2), and the conductive media are assumed to be linear, isotropic and homogeneous.

fig. 1 filamentary rectangular coils above a multi-layer conductor

to simplify the analysis, the solution region is divided into regions 0, 1 and 2. in region 0 (z > 0), the incident magnetic flux density $\mathbf{B}_i$ generated by the exciting current and the reflected magnetic flux density $\mathbf{B}_r$ generated by the induced eddy currents exist simultaneously.
the incident magnetic flux density $\mathbf{B}_i$ can be expressed by the vector potential $\mathbf{A}_i$ as:

$$\nabla^2 \mathbf{A}_i = -\mu_0 \mathbf{J} \tag{1}$$

$$\mathbf{B}_i = \nabla \times \mathbf{A}_i \tag{2}$$

the reflected magnetic flux density $\mathbf{B}_r$ satisfies the following:

$$\nabla \times \mathbf{B}_r = 0 \tag{3}$$

$$\nabla^2 \mathbf{B}_r = 0 \tag{4}$$

region 1 $(-d \le z \le 0)$ is the top conductive plate. the magnetic flux density $\mathbf{B}_1$ in this region satisfies the following:

$$\nabla^2 \mathbf{B}_1 - \mu_1\sigma_1 v \frac{\partial \mathbf{B}_1}{\partial y} - j\omega\mu_1\sigma_1 \mathbf{B}_1 = 0 \tag{5}$$

$$\nabla \cdot \mathbf{B}_1 = 0 \tag{6}$$

region 2 $(z \le -d)$ is the lower conductive plate. the magnetic flux density $\mathbf{B}_2$ in this region satisfies:

$$\nabla^2 \mathbf{B}_2 - \mu_2\sigma_2 v \frac{\partial \mathbf{B}_2}{\partial y} - j\omega\mu_2\sigma_2 \mathbf{B}_2 = 0 \tag{7}$$

$$\nabla \cdot \mathbf{B}_2 = 0 \tag{8}$$

to solve these equations, the double fourier transform and its inverse are introduced:

$$b(\xi,\eta,z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} B(x,y,z)\, e^{-j(\xi x+\eta y)}\, dx\, dy \tag{9}$$

$$B(x,y,z) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} b(\xi,\eta,z)\, e^{j(\xi x+\eta y)}\, d\xi\, d\eta \tag{10}$$

where ξ and η are the integration variables.

2.2. incident magnetic flux density

the single filamentary rectangular coil consists of four finite length wires, as shown in fig. 1. by solving (1), the vector potential generated at an arbitrary point $P(x,y,z)$ by a source point $(x',y',z')$ in the coil can be written as:

$$A(x,y,z) = \frac{\mu_0}{4\pi}\int_v \frac{J(x',y',z')}{R}\, dv' \tag{11}$$

where $J$ is the current density in the coil, $v$ is the coil segment carrying current, and $R$ is the distance from $P(x,y,z)$ to the source point $(x',y',z')$:

$$R = \sqrt{(x-x')^2+(y-y')^2+(z-z')^2} \tag{12}$$

performing the fourier transform on (11), the expression of the vector potential in the region z < z0 is obtained as:

$$a(\xi,\eta,z) = \frac{\mu_0}{4\pi}\int_v J(x',y',z')\left\{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{1}{R}\, e^{-j(\xi x+\eta y)}\, dx\, dy\right\} dv' = \frac{\mu_0}{2}\int_v J(x',y',z')\,\frac{1}{\sqrt{\xi^2+\eta^2}}\, e^{-j(\xi x'+\eta y')}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z')}\, dv' \tag{13}$$

similarly, the components of the incident magnetic flux density are obtained by performing the fourier transform on (2):

$$b_x = j\eta\, a_z - \frac{\partial a_y}{\partial z}, \qquad b_y = \frac{\partial a_x}{\partial z} - j\xi\, a_z, \qquad b_z = j\xi\, a_y - j\eta\, a_x \tag{14}$$

as shown in fig. 1, the wire parallel to the x axis satisfies $J(x',y',z') = I$, $y' = y_0$ and $z - z_0 < 0$.
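substituting these into (13), the x component of the vector potential becomes:

$$a_x = \frac{\mu_0}{2}\int_v J(x',y',z')\,\frac{1}{\sqrt{\xi^2+\eta^2}}\, e^{-j(\xi x'+\eta y')}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z')}\, dv' = \frac{\mu_0 I\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{2\sqrt{\xi^2+\eta^2}}\, e^{-j\eta y_0}\int_{-x_0}^{x_0} e^{-j\xi x'}\, dx' = \frac{\mu_0 I\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{2\sqrt{\xi^2+\eta^2}}\, e^{-j\eta y_0}\,\frac{2\sin(\xi x_0)}{\xi} = \frac{\mu_0 I \sin(\xi x_0)}{\xi\sqrt{\xi^2+\eta^2}}\, e^{-j\eta y_0}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)} \tag{15}$$

similarly, the wire parallel to the y axis satisfies $J(x',y',z') = I$, $x' = x_0$ and $z - z_0 < 0$. substituting into (13), the y component of the vector potential becomes:

$$a_y = \frac{\mu_0}{2}\int_v J(x',y',z')\,\frac{1}{\sqrt{\xi^2+\eta^2}}\, e^{-j(\xi x'+\eta y')}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z')}\, dv' = \frac{\mu_0 I\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{2\sqrt{\xi^2+\eta^2}}\, e^{-j\xi x_0}\int_{-y_0}^{y_0} e^{-j\eta y'}\, dy' = \frac{\mu_0 I\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{2\sqrt{\xi^2+\eta^2}}\, e^{-j\xi x_0}\,\frac{2\sin(\eta y_0)}{\eta} = \frac{\mu_0 I \sin(\eta y_0)}{\eta\sqrt{\xi^2+\eta^2}}\, e^{-j\xi x_0}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)} \tag{16}$$

the x component of the magnetic flux density can be obtained by substituting (15) and (16) into (14) as follows:

$$b_{ix} = -\frac{\partial a_y}{\partial z} = -\frac{\partial}{\partial z}\left\{a_y(a_1, b_1, z_0) - a_y(-a_1, b_1, z_0)\right\} = \frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)} \tag{17}$$

similarly, the y and z components of the magnetic flux density can be obtained as:

$$b_{iy} = \frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\xi}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)} \tag{18}$$

$$b_{iz} = \frac{2\mu_0 I \sqrt{\xi^2+\eta^2}\,\sin(\xi a_1)\sin(\eta b_1)}{\xi\eta}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)} \tag{19}$$

the general solution for the z component of the incident magnetic flux density in region 0 is:

$$b_{iz} = C_{iz}\, e^{\sqrt{\xi^2+\eta^2}\, z} \tag{20}$$

where the coefficient $C_{iz}$ is:

$$C_{iz} = \frac{2\mu_0 I \sqrt{\xi^2+\eta^2}\,\sin(\xi a_1)\sin(\eta b_1)}{\xi\eta}\, e^{-\sqrt{\xi^2+\eta^2}\, z_0} \tag{21}$$

2.3. reflected magnetic flux density

performing the fourier transform on (4), the reflected magnetic flux density in region 0 can be expressed as:

$$\frac{\partial^2 b_r}{\partial z^2} - (\xi^2+\eta^2)\, b_r = 0 \tag{22}$$

in similar fashion, performing the fourier transform on (5) and (7), the magnetic flux density in regions 1 and 2 can be expressed as:

$$\frac{\partial^2 b_1}{\partial z^2} - (\xi^2+\eta^2 + j\eta\mu_1\sigma_1 v + j\omega\mu_1\sigma_1)\, b_1 = 0 \tag{23}$$

$$\frac{\partial^2 b_2}{\partial z^2} - (\xi^2+\eta^2 + j\eta\mu_2\sigma_2 v + j\omega\mu_2\sigma_2)\, b_2 = 0 \tag{24}$$

the normal component of $\mathbf{B}$ and the tangential components of $\mathbf{H}$ must be continuous on the z = 0 and z = -d planes.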
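applying the continuity of $b_z$, we obtain

$$b_{iz} + b_{rz} = b_{1z} \quad (z = 0) \tag{25}$$

$$b_{1z} = b_{2z} \quad (z = -d) \tag{26}$$

applying the continuity of $H_x$, we obtain

$$\frac{b_{ix} + b_{rx}}{\mu_0} = \frac{b_{1x}}{\mu_1} \quad (z = 0) \tag{27}$$

$$\frac{b_{1x}}{\mu_1} = \frac{b_{2x}}{\mu_2} \quad (z = -d) \tag{28}$$

applying the continuity of $H_y$, we obtain

$$\frac{b_{iy} + b_{ry}}{\mu_0} = \frac{b_{1y}}{\mu_1} \quad (z = 0) \tag{29}$$

due to the fact that $\nabla \cdot \mathbf{J} = 0$, the current density $J_z$ does not exist in regions 1 and 2, and we get:

$$\xi\, b_{1y} = \eta\, b_{1x} \tag{30}$$

$$\xi\, b_{2y} = \eta\, b_{2x} \tag{31}$$

the following equations are obtained from (3):

$$j\eta\, b_{rz} = \frac{\partial b_{ry}}{\partial z} \tag{32}$$

$$j\xi\, b_{rz} = \frac{\partial b_{rx}}{\partial z} \tag{33}$$

following similar steps, the following equations are obtained from (6) and (8):

$$j\xi\, b_{1x} + j\eta\, b_{1y} + \frac{\partial b_{1z}}{\partial z} = 0 \tag{34}$$

$$j\xi\, b_{2x} + j\eta\, b_{2y} + \frac{\partial b_{2z}}{\partial z} = 0 \tag{35}$$

the coefficient of the reflected magnetic flux density is obtained by solving the above equations:

$$D_{rz} = \frac{(1-n) + (1+n)\, p\, e^{-2\lambda_1 d}}{(1+n) + (1-n)\, p\, e^{-2\lambda_1 d}}\; C_{iz} \tag{36}$$

where

$$\xi = \kappa\cos\varphi, \quad \eta = \kappa\sin\varphi, \quad \kappa = \sqrt{\xi^2+\eta^2}, \quad m = \frac{\mu_1\lambda_2}{\mu_2\lambda_1}, \quad p = \frac{1-m}{1+m}, \quad n = \frac{\mu_0\lambda_1}{\mu_1\kappa}, \quad \lambda_1 = \sqrt{\xi^2+\eta^2 + j\eta\mu_1\sigma_1 v + j\omega\mu_1\sigma_1} \tag{37}$$

and $\lambda_2$ follows from (24) in the same way as $\lambda_1$ follows from (23). let

$$\Gamma = \frac{(1-n) + (1+n)\, p\, e^{-2\lambda_1 d}}{(1+n) + (1-n)\, p\, e^{-2\lambda_1 d}} \tag{38}$$

the coefficient of the reflected magnetic flux density becomes:

$$D_{rz} = \Gamma\,\frac{2\mu_0 I \sqrt{\xi^2+\eta^2}\,\sin(\xi a_1)\sin(\eta b_1)}{\xi\eta}\, e^{-\sqrt{\xi^2+\eta^2}\, z_0} \tag{39}$$

$$D_{rx} = -\frac{j\xi}{\sqrt{\xi^2+\eta^2}}\, D_{rz} = -\Gamma\,\frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{-\sqrt{\xi^2+\eta^2}\, z_0} \tag{40}$$

the x component of the reflected magnetic flux density becomes:

$$b_{rx} = D_{rx}\, e^{-\sqrt{\xi^2+\eta^2}\, z} = -\Gamma\,\frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{-\sqrt{\xi^2+\eta^2}\,(z+z_0)} \tag{41}$$

the x component of the reflected magnetic flux density in region 0 is obtained by performing the inverse fourier transform on (41):

$$B_{rx} = -\frac{j\mu_0 I}{2\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Gamma\,\frac{\sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{-\kappa(z+z_0)}\, e^{j(\xi x+\eta y)}\, d\xi\, d\eta \tag{42}$$

fig. 2 shows two multi-turn rectangular coils obtained by extending the two single-turn coils shown in fig. 1 in width and length respectively. the coil parallel to the surface of the conductor is the excitation coil and the coil perpendicular to the conductor is the pick-up coil.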
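the reflection factor that multiplies the incident coefficient in (36) and (38) can be evaluated numerically for any pair (ξ, η). the python sketch below is an illustration, not the authors' implementation. it assumes the definitions m = μ1λ2/(μ2λ1), p = (1 − m)/(1 + m) and n = μ0λ1/(μ1κ), a semi-infinite lower layer, and a λ2 built from (24) in the same way as λ1; the function name and argument order are hypothetical.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def reflection_factor(xi, eta, omega, v, d, sigma1, mur1, sigma2, mur2):
    """Reflection factor of a top layer of thickness d on a semi-infinite layer.

    xi, eta : spatial frequencies (rad/m), not both zero
    omega   : angular excitation frequency (rad/s)
    v       : conductor speed along y (m/s)
    """
    kappa = math.hypot(xi, eta)
    mu1 = mur1 * MU0
    mu2 = mur2 * MU0
    # Decay constants of layers 1 and 2, following (23) and (24).
    lam1 = cmath.sqrt(kappa**2 + 1j * eta * mu1 * sigma1 * v + 1j * omega * mu1 * sigma1)
    lam2 = cmath.sqrt(kappa**2 + 1j * eta * mu2 * sigma2 * v + 1j * omega * mu2 * sigma2)
    n = MU0 * lam1 / (mu1 * kappa)
    m = mu1 * lam2 / (mu2 * lam1)
    p = (1 - m) / (1 + m)
    decay = cmath.exp(-2 * lam1 * d)
    return ((1 - n) + (1 + n) * p * decay) / ((1 + n) + (1 - n) * p * decay)


# Example with the plate data of the results section (200 um top layer, 2 kHz).
gamma = reflection_factor(xi=300.0, eta=400.0, omega=2 * math.pi * 2000.0,
                          v=0.0, d=200e-6, sigma1=3.8e7, mur1=1.0,
                          sigma2=5.8e7, mur2=1.0)
print(gamma)
```

two limiting cases serve as a sanity check of the assumed form: with zero conductivity and unit relative permeability the factor vanishes (no plate, no reflected field), and with two identical layers it no longer depends on the thickness d.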
the numbers of turns of the excitation and pick-up coils are n1 and n2 respectively. the lower surfaces of the two rectangular coils are level with each other.

fig. 2 configuration of two multi-turn rectangular coils

the reflected magnetic flux density generated by the multi-turn rectangular exciting coil shown in fig. 2 is obtained by integrating (42) with respect to the width and length as follows:

$$B_{rx}^{total} = \frac{n_1}{w_1 h_1}\int_{z_1}^{z_1+h_1} dz_0 \int_0^{w_1} B_{rx}\, dp = -\frac{j\mu_0 n_1 I}{2\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma}{\eta}\left\{\int_0^{w_1}\sin[\xi(a_1+p)]\sin[\eta(b_1+p)]\, dp\right\}\left\{\int_{z_1}^{z_1+h_1} e^{-\kappa z_0}\, dz_0\right\} e^{-\kappa z}\, e^{j(\xi x+\eta y)}\, d\xi\, d\eta = -\frac{j\mu_0 n_1 I}{2\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta\kappa}\left[e^{-\kappa(z+z_1)} - e^{-\kappa(z+z_1+h_1)}\right] e^{j(\xi x+\eta y)}\, d\xi\, d\eta \tag{43}$$

where

$$\int_{z_1}^{z_1+h_1} e^{-\kappa z_0}\, dz_0 = \frac{1}{\kappa}\left[e^{-\kappa z_1} - e^{-\kappa(z_1+h_1)}\right] \tag{44}$$

$$K_1 = \int_0^{w_1}\sin[\xi(a_1+p)]\sin[\eta(b_1+p)]\, dp = \frac{\sin[(\xi-\eta)w_1 + \xi a_1 - \eta b_1] - \sin(\xi a_1 - \eta b_1)}{2(\xi-\eta)} + \frac{\sin(\xi a_1 + \eta b_1) - \sin[(\xi+\eta)w_1 + \xi a_1 + \eta b_1]}{2(\xi+\eta)} \tag{45}$$

fig. 3 compares the variation of the x component of the reflected magnetic flux density as calculated from (43) and as simulated using maxwell 3d. the simulation results are obtained by subtracting the x component of the magnetic flux density without the conductor from the x component of the magnetic flux density with the conductor. the points shown belong to the line between (-16,0,5) and (16,0,5), which is located below the exciting coil and above the conductive plate. it can be seen that the analytical calculation results agree very well with the simulated results.

fig. 3 variations of the x component of the magnetic flux density calculated from the analytical method and fem simulation (horizontal axis: position along x axis (mm); vertical axis: ΔBx (gauss); curves: fem, fourier transform)

3. induced voltage in pickup coil

3.1.
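magnetic flux penetrating through the pick-up coil

to obtain the reflected magnetic flux penetrating through the multi-turn rectangular pickup coil shown in fig. 2, we first derive the reflected magnetic flux penetrating through a single-turn rectangular coil with side lengths 2a2, 2b2, assumed to be located at (c, 0, zc), where zc = z1 + w2 + a2. the reflected magnetic flux penetrating through the single-turn coil is obtained by integrating (43) over the area of the coil:

$$\Phi_r = \int_{z_c-a_2}^{z_c+a_2} dz \int_{-b_2}^{b_2} B_{rx}^{total}\Big|_{x=c}\, dy = -\frac{j\mu_0 n_1 I}{2\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta\kappa}\, e^{j\xi c}\left[e^{-\kappa z_1} - e^{-\kappa(z_1+h_1)}\right]\left\{\int_{z_c-a_2}^{z_c+a_2} e^{-\kappa z}\, dz\right\}\left\{\int_{-b_2}^{b_2} e^{j\eta y}\, dy\right\} d\xi\, d\eta = -\frac{j\mu_0 n_1 I}{\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta^2\kappa^2}\left[e^{-\kappa z_1} - e^{-\kappa(z_1+h_1)}\right] e^{-\kappa z_c}\, e^{j\xi c}\left[e^{\kappa a_2} - e^{-\kappa a_2}\right]\sin(\eta b_2)\, d\xi\, d\eta \tag{46}$$

then the reflected magnetic flux penetrating through the multi-turn rectangular pickup coil is obtained by integrating (46) with respect to the width and length of the pickup coil as follows:

$$\Phi_r^{total} = \frac{n_2}{w_2 h_2}\int_{c-h_2/2}^{c+h_2/2} dc' \int_0^{w_2}\Phi_r\, dp = -\frac{j\mu_0 n_1 n_2 I}{\pi^2 w_1 h_1 w_2 h_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta^2\kappa^2}\left[e^{-\kappa(z_1+z_c)} - e^{-\kappa(z_1+z_c+h_1)}\right]\left\{\int_{c-h_2/2}^{c+h_2/2} e^{j\xi c'}\, dc'\right\}\left\{\int_0^{w_2}\left[e^{\kappa(a_2+p)} - e^{-\kappa(a_2+p)}\right]\sin[\eta(b_2+p)]\, dp\right\} d\xi\, d\eta = -\frac{2j\mu_0 n_1 n_2 I}{\pi^2 w_1 h_1 w_2 h_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1 K_2}{\xi\eta^2\kappa^2}\sin\!\left(\frac{\xi h_2}{2}\right) e^{j\xi c}\left[e^{-\kappa(2z_1+w_2+a_2)} - e^{-\kappa(2z_1+w_2+a_2+h_1)}\right] d\xi\, d\eta \tag{47}$$

where

$$K_2 = \int_0^{w_2}\left[e^{\kappa(a_2+p)} - e^{-\kappa(a_2+p)}\right]\sin[\eta(b_2+p)]\, dp = \frac{\kappa\sin[\eta(b_2+w_2)]\left[e^{\kappa(w_2+a_2)} + e^{-\kappa(w_2+a_2)}\right]}{\kappa^2+\eta^2} - \frac{\eta\cos[\eta(b_2+w_2)]\left[e^{\kappa(w_2+a_2)} - e^{-\kappa(w_2+a_2)}\right]}{\kappa^2+\eta^2} + \frac{\eta\cos(\eta b_2)\left[e^{\kappa a_2} - e^{-\kappa a_2}\right]}{\kappa^2+\eta^2} - \frac{\kappa\sin(\eta b_2)\left[e^{\kappa a_2} + e^{-\kappa a_2}\right]}{\kappa^2+\eta^2} \tag{48}$$

3.2.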
induced voltage in the rectangular pickup coil

the relationship between the magnetic flux penetrating through the pickup coil and the induced voltage is:

$$V = -\frac{d\Phi}{dt} = -j\omega\,\Phi \tag{49}$$

therefore, the induced voltage can be derived as:

$$V = -\frac{2\omega\mu_0 I n_1 n_2}{\pi^2 w_1 h_1 w_2 h_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1 K_2}{\xi\eta^2\kappa^2}\sin\!\left(\frac{\xi h_2}{2}\right) e^{j\xi c}\left[e^{-\kappa(2z_1+w_2+a_2)} - e^{-\kappa(2z_1+w_2+a_2+h_1)}\right] d\xi\, d\eta \tag{50}$$

4. results

the induced voltage variation of the rectangular pick-up coil is now calculated for the different influencing factors, based on the expressions derived in the previous section. the parameters of the coils and the conductive plates are given in tables 1 and 2 respectively.

table 1 parameters of the rectangular coil

| exciting coil | | pick-up coil | |
|---|---|---|---|
| a1 (mm) | 12 | a2 (mm) | 3 |
| b1 (mm) | 12 | b2 (mm) | 5 |
| z1 (mm) | 1 | z1 (mm) | 1 |
| w1 (mm) | 2 | w2 (mm) | 5 |
| h1 (mm) | 8 | h2 (mm) | 2 |
| turns | 500 | c (mm) | 6 |
| | | turns | 300 |

table 2 parameters of the conductive plate

| layer | conductivity (s/m) | relative permeability |
|---|---|---|
| top layer | σ1 = 3.8×10$^{7}$ | μr1 = 1 |
| lower layer | σ2 = 5.8×10$^{7}$ | μr2 = 1 |

fig. 4 shows the induced voltage due to the conductive plates as a function of the excitation frequency. the thickness of the top-layer conductor is 200 μm, the lower layer is semi-infinite, and both conductors are stationary. c is the distance from the center of the pick-up coil to the z axis. it can be seen from fig. 4 that the variation of the induced voltage increases with frequency. at any given exciting frequency, the pick-up coil with the larger distance to the z axis has a higher induced voltage.

fig. 4 induced voltage in the pickup coil as a function of exciting frequency (horizontal axis: frequency (khz); vertical axis: real part of induced voltage (v); curves: c = 3 mm, c = 6 mm, c = 9 mm)

fig. 5 compares the induced voltage calculated from the analytical method and fem simulation.
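the step from flux to voltage in (49), and the magnitude that is plotted as the analytical curve, can be sketched in a few lines of python. the flux phasor below is a made-up number used only to exercise the functions; it is not a value from the paper, and the function names are hypothetical.

```python
import math


def induced_voltage(flux_phasor: complex, frequency_hz: float) -> complex:
    """Equation (49) for harmonic excitation: V = -j * omega * Phi."""
    omega = 2.0 * math.pi * frequency_hz
    return -1j * omega * flux_phasor


def voltage_magnitude(voltage: complex) -> float:
    """Square root of the sum of squares of the real and imaginary parts."""
    return math.sqrt(voltage.real ** 2 + voltage.imag ** 2)


# Hypothetical flux phasor in webers (illustrative only).
phi = complex(1.0e-4, -2.0e-4)
v = induced_voltage(phi, frequency_hz=2000.0)
print("real part:", v.real, "V")
print("imag part:", v.imag, "V")
print("magnitude:", voltage_magnitude(v), "V")
```

multiplying by -jω rotates the flux phasor by -90 degrees and scales it by ω, so for a fixed flux the voltage magnitude grows linearly with frequency.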
the analytical results are calculated as the square root of the sum of squares of the real and imaginary parts of the induced voltage. the fem results are the effective values of the induced voltage obtained in the pick-up coil, simulated with a time-dependent formulation.

fig. 5 comparison of the induced voltage variation in the rectangular pick-up coil from the analytical method and fem at different excitation frequencies (horizontal axis: frequency (khz); vertical axis: induced voltage in pick-up coil (v); curves: analytical method, fem)

the induced voltages in the coil for different thicknesses of the top-layer conductor are shown in fig. 6. the excitation frequencies are fixed at 0.5, 2 and 5 khz respectively, and the conductor is stationary. the distance from the center of the pick-up coil to the z axis is fixed at 9 mm. the induced voltage variation initially increases with the thickness, reaches a maximum at a specific thickness, and then decreases with increasing thickness. as can be seen from fig. 6, a higher excitation frequency produces a higher maximum at a smaller thickness, but the induced voltage decreases faster with increasing excitation frequency.

fig. 6 induced voltage in pickup coil as a function of top-layer conductor thickness (horizontal axis: thickness of top-layer conductor (mm); vertical axis: real part of induced voltage (v); curves: 0.5 khz, 2 khz, 5 khz)

the speed characteristics are shown in fig. 7. the induced voltage variations are calculated at speeds from v = 0 to 50 m/s. the excitation frequency is fixed at 2 khz. fig. 7 shows the differences of the coils’ induced voltage at different speeds of the conductor relative to the
the induced voltage variation of the rectangular coils keeps increasing with the moving speed of the conductor, and the maximum variation of the induced voltage is achieved with the top-layer conductor of thickness 200 μm.

fig. 7 induced voltage of the pickup coil at different speeds of the conductor (x-axis: moving speed of conductor (m/s); y-axis: real part of ΔV (mV); curves: top-layer thickness 50 μm, 100 μm, 200 μm, 1000 μm)

5. conclusion

a closed-form expression for the induced voltage between a pair of rectangular coils above a multi-layered conductive plate has been derived using a 2d fourier transform method. the excitation coil is parallel to the plates and the pickup coil is perpendicular to the conductor. we discussed the factors influencing the induced voltage, such as the excitation frequency, the thickness of the top-layer conductor and the speed of the conductor. the calculation model and results can be extended and used in the forward model of quantitative detection for eddy current testing of multi-layer conductive structures.

acknowledgment: the authors would like to thank shanghai maritime university and the national natural science foundation of china (51175321) for their financial support.

facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 571-584 doi: 10.2298/fuee1704571v

toward acoustic noise type detection based on qq plot statistics *

sanja vujnović, aleksandra marjanović, željko đurović, predrag tadić, goran kvaščev
university of belgrade, school of electrical engineering, belgrade, serbia

abstract. fault detection and state estimation using acoustic signals is a procedure highly affected by ambient noise. this is particularly pronounced in an industrial environment where noise pollution is especially strong. in this paper a noise detection algorithm is proposed and implemented. this algorithm can identify the times in which the recorded acoustic signal is influenced by different types of noise in the form of unwanted impulse disturbance or speech contamination.
the algorithm compares statistical parameters of the recordings by generating a series of qq plots and then using appropriate stochastic signal analysis tools such as hypothesis testing. the main purpose of this algorithm is to eliminate noisy signals and to collect a set of noise-free recordings which can then be used for state estimation. the application of these techniques in a real industrial environment is extremely complex because sound contamination usually tends to be intense and nonstationary. the solution described in this paper has been tested on a specific problem of acoustic signal isolation and noise detection of a coal grinding fan mill in a thermal power plant in the presence of intense contaminating sound disturbances, mainly impulse disturbance and speech contamination.

key words: acoustic signal, qq plot, noise detection, predictive maintenance

received november 16, 2016; received in revised form february 21, 2017
corresponding author: sanja vujnović, school of electrical engineering, university of belgrade, kralja aleksandra blvd. 73, 11120 belgrade, serbia (e-mail: svujnovic@etf.bg.ac.rs)
* an earlier version of this paper received the best section paper award at the automatics section of the 3rd international conference on electrical, electronic and computing engineering icetran 2016, zlatibor, serbia, 13-16 june, 2016 [1]

1. introduction

it is well known that the largest financial loss for modern industrial plants is due to inefficient or untimely maintenance [2]. this is especially true for power plants, which are designed to remain in operation for many decades after their construction. therefore, it is only logical that a significant amount of research is devoted to prolonging the working life of the plant, improving the quality of its operation [3] and reducing unnecessary losses [4]. with this in mind, the fact that predictive maintenance has become a very popular area of research is not so surprising. crucial aspects of predictive maintenance are fault detection and state estimation, i.e. the estimate of whether a fault has occurred somewhere within the system or whether certain components are worn and maintenance needs to be done in order to replace them.

accelerometers are the sensors most commonly used for implementing predictive maintenance algorithms on rotating machinery. the logic behind this is sound: as a fault occurs within the rotating element, or as the wear of some components becomes pronounced, the vibration of the machine is sure to change accordingly [5]. the sensors can measure this vibration, and algorithms can be constructed which, based on the change in the vibration signal, detect the amount of wear of certain components. these techniques are widely used in the industry with much success [6]; however, an alternative was presented in the early 1990s. this alternative proposes the use of acoustic signals for the same purpose. it has been shown that sound recordings can be as informative as vibration signals when it comes to state estimation of components [7], while acoustic sensors (microphones) are cheaper to obtain and are contactless, which is a very important feature for certain types of processes.

one major drawback of using microphones for predictive maintenance is the fact that they are very sensitive to ambient noise. this makes them less than ideal for use in an industrial environment, which is usually heavily polluted with contaminating noise. for this reason microphones are still rarely used for predictive maintenance in real industrial environments. one way to significantly increase the applicability of acoustic signals for this purpose is to develop an algorithm capable of filtering out the acoustic noise caused by the surrounding events. many preprocessing algorithms have been developed in recent years for the purpose of fault detection and state estimation.
using one of the standard frequency filters is usually not applicable because it is very difficult (if not impossible) to determine the frequencies on which the noise is dominant. even if that can be established, the useful part of the signal usually exists on the same frequencies as well, so filtering out the noise would significantly damage the informative part of the signal. impulse disturbance in the time domain, for example, is equally pronounced on all frequencies, so it cannot be filtered using traditional algorithms. taking this into consideration, one can easily conclude that standard frequency domain analysis is not reliable enough for noise detection in acoustic signals. therefore advanced procedures should be used for this purpose, such as statistical analysis of the signal. statistical parameters of the recorded signal can be very informative in this case because different statistical behavior is expected when the noise occurs and when the signal is in its nominal form. qq plots are one of the standard tools used for statistical comparison and analysis, and they are shown to be quite effective in this case [8].

the purpose of the algorithm proposed in this research is not removal, but rather detection of noise. the entire recording is separated into windowed signals, and each windowed segment is tested for noise. this is done by comparing the statistical distribution of the recorded signal against the statistical distribution of the signal in nominal working condition. the comparison is conducted using qq plots and the neyman-pearson hypothesis test. the noisy sequences are discarded and those which are classified as nominal are saved for the purpose of state estimation or some other predictive maintenance procedure.
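the windowing and comparison steps described above can be sketched as follows. this is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the synthetic test signal and the fixed `margin` are hypothetical, and the constant margin stands in for the boundary that the paper derives from the neyman-pearson test.

```python
import numpy as np

def split_into_windows(x, fs, win_sec=1.0, overlap=0.5):
    """Cut the 1-D signal x into windows of win_sec seconds with the given overlap."""
    n = int(round(win_sec * fs))
    step = max(1, int(round(n * (1.0 - overlap))))
    return [x[s:s + n] for s in range(0, len(x) - n + 1, step)]

def qq_points(window, nominal_ref):
    """Pair the sorted window samples with nominal quantiles at the same probabilities."""
    w = np.sort(window)
    probs = (np.arange(1, len(w) + 1) - 0.5) / len(w)
    return np.quantile(nominal_ref, probs), w

def is_noisy(window, nominal_ref, margin):
    """Flag the window if any QQ point lies more than `margin` above the line y = x.

    `margin` is a placeholder (assumption) for the boundary obtained from the
    hypothesis test described in the paper.
    """
    q_nom, q_win = qq_points(window, nominal_ref)
    return bool(np.any(q_win > q_nom + margin))

# Synthetic stand-in for an envelope signal (assumption, not plant data).
rng = np.random.default_rng(0)
fs = 1000
nominal_ref = np.abs(rng.normal(0.0, 0.1, 20 * fs))   # long nominal recording
clean = np.abs(rng.normal(0.0, 0.1, 4 * fs))          # 4 s without contamination
dirty = clean.copy()
dirty[2000:2050] += 2.0                               # short impulse-like burst

windows_clean = split_into_windows(clean, fs)
windows_dirty = split_into_windows(dirty, fs)
flags_clean = [is_noisy(w, nominal_ref, 0.3) for w in windows_clean]
flags_dirty = [is_noisy(w, nominal_ref, 0.3) for w in windows_dirty]

# Only the windows not flagged as noisy would be kept for state estimation.
kept = [w for w, noisy in zip(windows_dirty, flags_dirty) if not noisy]
```

with 1 s windows and 50% overlap, a short burst falls into two neighbouring windows, so both are discarded while the remaining windows are kept.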
the algorithm developed in this research is seen as a part of a larger system for state estimation and fault detection of rotating elements in thermal power plants based on acoustic signals. it has been tested on real recordings taken in thermal power plant kostolac a1 in serbia, on a specific fan mill which is a part of the coal grinding subsystem. it has been shown that state estimation of impact plates within a mill is possible using only recordings from a microphone placed in the vicinity of the mill [9]. however, it has also been shown that noise can significantly influence the classification results. the purpose of this algorithm is to conduct signal preprocessing, so that the noise-free samples of the acoustic signals can be used for state estimation of the impact plates of the mill.

this paper is structured as follows. section 2 contains a theoretical description of the algorithm used, mainly qq plots and the neyman-pearson method of hypothesis testing. section 3 contains the description of the real industrial coal grinding subsystem in thermal power plants on which this algorithm has been tested. in section 4 the detailed results of the algorithm are given. here, the algorithm has been tested on nominal and noisy signals. furthermore, the effect of changing certain parameters of the algorithm has been examined, as well as an upgrade of the algorithm which enables it to be used for classification and not just noise detection. finally, the conclusions are presented in section 5.

2. qq plot as a tool for noise detection

in nominal, stationary operation of the system it is assumed that the statistical parameters of the measured signals will remain constant. if, on the other hand, an event occurs that causes a deviation from the nominal state (e.g. nonstationary ambient noise), statistical parameters of the recorded signals are expected to change in a certain way.
therefore, the probability distribution of the recorded signal in the nominal regime is going to be different from the distribution of the signal which is polluted with noise. this change is going to depend on the duration and the type of noise, so the statistical parameters can be used not only for noise detection, but for noise classification as well.

2.1. qq plot

a very efficient graphical tool used to compare the expected and obtained probability distributions is the qq plot [10]. this graph is obtained by plotting quantiles of the measured signal against the quantiles of the expected probability distribution. if the two distributions are similar, all the points in the qq plot will approximately lie on the line $y = x$. figure 1 shows a qq plot of experimentally obtained zero mean unit variance gaussian samples against their theoretical expectation. this type of data inspection allows not only the comparison of two probability distributions, but also the identification of the distribution of the recorded data. for example, if outliers occur at the ends of the line, this means that the measured distribution has larger (or smaller) tails than the expected distribution. if all dots lie on a line, but the angle is not 45°, then the variance of the expected distribution is not the same as that of the measured signal.

fig. 1 experimentally obtained gaussian samples plotted against the theoretical distribution.

fig. 2 contaminated gaussian distribution in time domain (upper left) with the appropriate qq plot (upper right) and laplace distributed sample data in time domain (lower left) with its qq plot (lower right).

using these rules one can easily infer the shape of the probability distribution relative to the expected distribution. for example, a gaussian signal polluted with noise is expected to show large tails on the qq plot, as in fig. 2 (upper). on the other hand, if the distribution of the experimentally obtained signal is significantly different in nature from the expected distribution, one will expect a deviation from the line $y = x$ for both lower and higher values of quantiles. this is shown in fig. 2 (lower), where laplacian distributed experimental samples are plotted against the gaussian distribution. the graph indicates that the obtained samples have higher values than the gaussian distribution would indicate, and there is a curve for lower values as well.

if the measured samples form a distribution $F(x)$, an ordered nondecreasing sequence $x_{(1)}, x_{(2)}, \dots, x_{(n)}$ can be obtained, where $x_{(i)} \le x_{(j)}$ for $i \le j$. here, $n$ represents the number of samples taken. by observing the ordered sequence, the formula for conditional probability can be obtained [8] which calculates the probability that measurement $x$ will have the rank $i$ in the said sequence:

$$P(i \mid x) = \binom{n-1}{i-1} F(x)^{\,i-1} \left(1 - F(x)\right)^{n-i} \qquad (1)$$

2.2. hypothesis testing

qq plots in this research are used to represent the relationship between the measured signal distribution and the distribution of the signal in nominal working conditions. for this reason hypothesis testing is implemented in order to decide, based on the available data, whether the assumption of nominal working conditions is correct. if not, then the signal is considered polluted by noise and is discarded. the noise detection algorithm developed in this research relies heavily on eq. (1). in order to successfully implement it several initial calculations need to be performed.
first the expected probability distribution in the nominal regime (when there is no noise) needs to be established. then, after calculating the nominal probability density function $f_{nom}(x)$, the discriminant boundaries should be determined. if all the samples of the qq plot lie within these boundaries, then the recorded signal is in nominal working condition, i.e. there is no noise. if, on the other hand, points on the qq plot find themselves beyond the calculated boundaries, contamination is assumed to have occurred, and the recorded samples are dismissed.

there are two objectives which must be taken into account when establishing valid bounds on the qq plot. the first objective is maximization of the probability that the noise-free recordings will be classified as valid. the second objective is minimization of the probability that faulty recordings will be falsely classified as valid. therefore, a tradeoff needs to be made, and as a solution a variation of the neyman-pearson method [11,12] for hypothesis testing has been chosen. this means that the probability $P_0$ of the desired efficiency under nominal conditions has been fixed. in the literature this value is usually adopted in the range between and . in this paper the value has been taken. therefore, lower and upper boundaries ($x_l$ and $x_u$) are calculated so that the following condition is satisfied:

$$\int_{x_l}^{x_u} f(x \mid i)\,dx = P_0 \qquad (2)$$

where the probability density function $f(x \mid i)$ can be expressed using the bayesian formula:

$$f(x \mid i) = \frac{P(i \mid x)\, f_{nom}(x)}{P(i)} \qquad (3)$$

3. case study

coal fueled thermal power plants play a very important role in energy production worldwide and are the number one energy provider in serbia. for that reason an increase in the productivity and working life of an entire plant, as well as of its subsystems, is of great economic importance. the coal grinding subsystem is one of the key parts of a thermal power plant and is responsible for the pulverization of coal, so that it can be used in a burner system.
in thermal power plant kostolac a1 in serbia, the fan mills used for coal pulverization have ten impact plates which rotate around the center. pulverization occurs as a result of friction between the plates and the chunks of coal within the mill. when the coal has been ground into a fine powder it is transported into a burner system where it is used as a fuel. the particles which are not small enough return to the mill where they are additionally pulverized. after several weeks the impact plates within the mill become worn due to constant impact with coal chunks and rock, and the efficiency of the mill starts to decrease. this is when maintenance needs to be performed, or other more serious problems and malfunctions will occur.

algorithms which can detect the moment maintenance is needed based on the recorded acoustic signals have already been developed. they are, however, unable to perform their function when noisy measurements are provided, which often happens with acoustic signals in a real industrial environment. mills in thermal power plants produce high intensity noise and they are located in the vicinity of other mills of the coal grinding subsystem. therefore, the acoustic environment in which the recordings are measured is extremely complex. even with all this in mind, the frequency features of this noise are very informative for state estimation of impact plates within the mills. however, given that the area around the mill contains a large number of other actuators, valves, pipes and pumps, additional works such as welding, repairs, maintenance and the like are quite common. at the same time, the sound recording is being enriched by the sporadic impact of larger chunks of coal.
these occurrences contaminate the acoustic recording and, considering that their statistics are not included in the training sets used for impeller state estimation algorithms, they can cause the algorithm to make a wrong decision or, at the very least, cause a large time delay in making a correct decision. for this reason it is of great importance to develop techniques for detection and, if possible, classification of contamination in the acoustic recording.

acoustic signals used to demonstrate the results of the proposed algorithm are recorded in different acoustic surroundings of the mill. one part of these recordings is taken in nominal working conditions in which, other than the noise from the mills and other rotating elements, there are no other sources of contamination. the second group of recordings consists of nominal sound sources as well as the sound of people talking in the vicinity of the microphone. the third group of signals contains nominal sound as well as the sound produced during welding and repair of the steam lines near the mill.

the noise detection algorithm developed in this research has been tested on real acoustic signals recorded in thermal power plant kostolac a1 in serbia. the mill on which the noise detection algorithm has been tested has 10 impact plates, and the recorded signal has the sampling frequency of . the length of the obtained recording is approximately 20 minutes. this recording consists of intervals in which the system is in the nominal regime, as well as intervals in which artificially created noise has been used to pollute the recording.

4. results

the proposed algorithm is tested in several steps. first the learning part of the algorithm is conducted, in which the recordings in the nominal regime are analyzed. in this way the nominal probability density function $f_{nom}(x)$, as well as the discriminant boundary for nominal recordings, are obtained.
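the learning step can be illustrated with a short sketch. section 4.1 specifies that the envelope is obtained with the hilbert transform and that the nominal pdf is estimated with an epanechnikov kernel; everything else here is an assumption made for illustration: the function names, the fft-based way of computing the analytic signal, the synthetic nominal signal and the bandwidth value are hypothetical and are not taken from the paper.

```python
import numpy as np

def envelope(x):
    """Envelope of x: magnitude of the analytic signal (FFT-based Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0                       # keep the DC component
    if n % 2 == 0:
        gain[n // 2] = 1.0              # keep the Nyquist component
        gain[1:n // 2] = 2.0            # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def epanechnikov_pdf(samples, grid, bandwidth):
    """Kernel density estimate on `grid` with the Epanechnikov kernel 0.75 * (1 - u^2)."""
    u = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.where(np.abs(u) > 1.0, 0.0, 0.75 * (1.0 - u ** 2))
    return kernel.sum(axis=1) / (len(samples) * bandwidth)

# Sanity check of the envelope on a pure tone with an integer number of cycles.
fs = 1000
t = np.arange(fs) / fs
tone_env = envelope(np.cos(2 * np.pi * 50 * t))

# Synthetic nominal signal (assumption) and its estimated envelope pdf.
rng = np.random.default_rng(1)
nominal_env = envelope(rng.normal(0.0, 0.1, 2000))
bandwidth = 0.02                        # hypothetical value; the paper does not state one
grid = np.linspace(nominal_env.min() - bandwidth, nominal_env.max() + bandwidth, 400)
pdf_nom = epanechnikov_pdf(nominal_env, grid, bandwidth)
area = pdf_nom.sum() * (grid[1] - grid[0])   # should be close to 1 for a valid pdf
```

the bandwidth controls how smooth the estimate is; a value that is too small reproduces the roughness of a histogram, while a value that is too large blurs the tail that matters for the detection boundary.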
after that, the algorithm is tested on both contaminated and nominal samples in order to determine how prone it is to false classification. the effect of window length on the proposed algorithm is analyzed as well. finally, an attempt has been made to classify the obtained noise and to determine whether impulse disturbance or speech contamination has occurred.

4.1. nominal recordings

as stated earlier, the initial part of the algorithm is a learning process in which a sufficiently long signal in the nominal regime is used to approximate the nominal probability density function. the hilbert transform of the signal is performed in order to obtain an envelope of the signal. there are several ways to approximate the probability density function (pdf) of the obtained sequence. one is by observing the scaled histogram of the signal, and the other is using the method of kernel functions. the latter method is chosen in order to obtain a smoother version of the estimate without a significant increase in computational complexity. for pdf estimation an epanechnikov kernel function is used, which is the one most commonly applied in the literature because it minimizes the mean square error. as expected, the pdf estimate obtained in this way roughly resembles the shape inferred from the histogram.

after estimating the pdf of a noise-free signal, the next step is to determine the boundaries of a qq plot from eq. (2). since all the samples of the hilbert transform envelope of the signal are positive, and a noisy signal is expected to have a larger variance and a greater mean value (with respect to the noise-free parameters), a slight simplification of (2) can be implemented for the sake of easier numerical calculations:

$$\int_{0}^{x_u} f(x \mid i)\,dx = P_0 \qquad (4)$$

the lower classification boundary does not need to be determined because, when the noise occurs, the points on the qq plot are expected to drift above the line. therefore, eq. (4) is used for the purpose of noise detection and boundary calculation. the resulting qq plot of the samples in the nominal regime and the calculated boundary are shown in fig. 3.

fig. 3 qq plot of nominal recorded quantiles with respect to nominal expected quantiles, with the calculated boundary (x-axis: $F_{nom}^{-1}$; y-axis: nominal data quantiles).

4.2. noisy recordings

testing the algorithm as a tool for noise detection is conducted on a part of the signal which is 12 seconds long and whose hilbert transform is shown in fig. 4. this signal contains dominant sections of the nominal regime (blue), sections contaminated with speech (green) and samples which contain impulse disturbance (red). in this way all the aspects of the noise detection algorithm are tested. the hilbert transform is applied in order to obtain an envelope of the signal.

fig. 4 part of the recording on which the algorithm has been preliminarily tested. blue represents the nominal regime, green represents the part of the signal contaminated with speech, and red represents the part of the signal contaminated by impulse disturbance.

the testing recording has been separated into smaller pieces using a window of 1 s with an overlap of 50%. each window has been tested for noise, and the noisy recordings have been dismissed. all the windows which include only the nominal behavior without the noise have qq plots which resemble the shape shown in fig. 3. all the points of the plot are below the discriminant boundary and are therefore classified as noise-free samples.

the effect of speech contamination on the qq plot depends heavily on the percentage of contaminated signal which is enveloped within the window, as shown in fig. 5. when the windowed signal consists exclusively of speech contaminated samples (fig. 5, lower), its qq plot has quantiles which lie on an approximately straight line with an angle larger than 45°.
this indicates that the variance of the recorded signal, as well as its mean value, is larger than expected. also, most of the samples lie above the discriminant line, which means that the algorithm has detected the noise. the situation is not so clear when only part of the examined window contains speech contaminated samples. in that case the angle of the plot is lower and, depending on the amount of speech included in the window, sometimes all the quantiles lie below the discriminant line. this means that the contamination has not been detected (fig. 5, upper).

fig. 5 speech contaminated samples in time domain (left) and the appropriate qq plot (right). upper figures show the behavior of the plot when only a small part of the speech contamination is encompassed in the window. central figures show the behavior when about 50% of the window contains contamination, while lower figures show what happens when the contamination is present in the entire windowed signal.

with impulse disturbance the problem becomes much simpler and the algorithm manages to detect the contamination regardless of the percentage of noisy samples in the window. the nature of impulse disturbance is so abrupt that even a small number of samples encompassed within a window is enough to significantly change the statistical parameters. the appropriate qq plot is shown in fig. 6.

fig. 6 samples which contain impulse disturbance in time domain (left) and the appropriate qq plot (right).

the classification results of the algorithm are presented in table 1. when classifying the nominal samples and the samples which contained impulse disturbance the algorithm achieved an accuracy of 100%, while speech contamination has a lower percentage of detection. this is due to the fact that the statistical parameters of the windowed signal do not vary considerably with respect to the nominal regime when only a small part of the window contains speech contamination. this is precisely what happened in the 2 windowed parts of the signal which were wrongly classified.

table 1 results of the noise detection algorithm
                         nominal recordings    speech contamination    impulse disturbance
classified as nominal    13 (100%)             2 (25%)                 0
classified as noisy      0                     6 (75%)                 4 (100%)

4.3. length of the window adjustment

the previous analysis suggests that the proposed algorithm easily detects impulse disturbances, but speech contamination can be somewhat more elusive. in the given example, out of 8 windowed signals contaminated with speech, the algorithm cannot correctly classify two of them. the problematic windowed signals are at the beginning and the end of the speech sequence, and the incorrect classification is due to the fact that there is only a small percentage of contaminated samples inside the window. one way to correct this error is by changing the length of the window. the noise detection results as the length of the window is changed are given in table 2.
table 2 changeable length of the window tested on speech contaminated signals
window length    classified as nominal    classified as noisy    total number of windowed signals
1.5 s            1 (17%)                  5 (83%)                6
1 s              2 (25%)                  6 (75%)                8
0.5 s            2 (13%)                  13 (87%)               15
0.1 s            31 (37%)                 52 (63%)               83

one thing which is obvious from the results is that the number of speech contaminated windowed signals increases as the length of the window decreases. this is important for the statistical significance of the experiment. however, with a smaller number of samples inside the window, the qq plots are not as representative as they are for a larger number of samples. the table shows that for window sizes between 1.5 s and 0.5 s only one or two windowed signals are wrongly classified as nominal, and those correspond to the beginning or the end of the sequence, as discussed previously. therefore a smaller window length will yield statistically better results because a higher percentage of signals will be correctly classified as noisy.

by continuing to decrease the length of the window, however, the algorithm starts to behave inconsistently. for a window length of 0.1 s the percentage of misclassified signals drastically increases. this is due to several factors. first of all, the qq plots have fewer samples and are therefore less accurate. secondly, the dynamics of speech is such that the gaps between the words, and sometimes even within a single word, are usually larger than 0.1 s. therefore there is a significant number of windowed signals which do not contain any information about the speech. furthermore, while other window lengths correctly classify all nominal recordings and all impulse disturbance recordings, for the 0.1 s window misclassification occurs not only for speech contaminated signals, but for nominal signals as well.

4.4. noise detection and classification

from fig. 5 and 6 it is clear that two different types of noise present themselves quite differently on the qq plot.
with this in mind it might be possible to classify which type of noise has occurred when the algorithm detects the presence of contamination. this can be done by determining another classification line, as in eq. (4), but this time with respect to speech contaminated signals, rather than nominal recordings. in this way two classification lines are obtained, one which separates nominal recordings from the contaminated ones, and the other which detects whether contaminated recordings have impulse or speech disturbance, as shown in fig. 7. in the upper graph it can be seen that nominal recordings do not trigger any errors. speech contaminated recordings can be seen in the lower left part of the figure, and they fit ideally between the two classification lines. impulse disturbance, on the other hand, has quantiles above both discrimination lines, as can be seen in the lower right part of the figure.

this upgraded algorithm for noise detection and classification has been tested on the recording from fig. 4 and the results are shown in table 3. as can be seen, the impulse disturbance has been impeccably classified as such. nominal recordings have a high percentage of nominal classification as well. speech still has the lowest detection and classification percentage due to the facts discussed earlier.

table 3 results of the noise detection algorithm
                               nominal recordings    speech contamination    impulse disturbance
classified as nominal          100%                  25%                     0
classified as noisy            0%                    55%                     0
classified as impulse noise    0%                    20%                     100%

fig. 7 qq plot with 2 classification lines. when the samples of a qq plot go above the red line, the noise has been detected. however, if samples are above the black line, this means that impulse disturbance has occurred, and when they are between the red and blue classification lines the speech contamination has occurred.
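the two-line decision rule described for fig. 7 can be sketched as below. this is a hedged illustration only: the function name, the rule that a single quantile above a line is enough to trigger it, and the constant offsets used to build the two lines are assumptions made for the example; in the paper both lines come from eq. (4), applied to the nominal and the speech contaminated recordings respectively.

```python
import numpy as np

def classify_window(q_win, detect_line, impulse_line):
    """Label one window from its QQ points and the two classification lines.

    q_win        : sorted quantiles of the windowed envelope
    detect_line  : boundary separating nominal from contaminated (same length)
    impulse_line : boundary separating speech from impulse disturbance
    """
    q_win = np.asarray(q_win, dtype=float)
    if np.any(q_win > np.asarray(impulse_line)):
        return "impulse"
    if np.any(q_win > np.asarray(detect_line)):
        return "speech"
    return "nominal"

# Hypothetical nominal quantiles and classification lines (illustrative values).
q_nom = np.linspace(0.05, 0.6, 12)
detect_line = q_nom + 0.15
impulse_line = q_nom + 0.60

labels = [
    classify_window(q_nom, detect_line, impulse_line),                      # on the line y = x
    classify_window(1.4 * q_nom, detect_line, impulse_line),                # steeper slope
    classify_window(np.append(q_nom[:-1], 1.5), detect_line, impulse_line), # one large outlier
]
```

checking the upper line first matters: an impulse also exceeds the lower line, so testing in the opposite order would label every impulse as speech.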
[Fig. 7 panels: nominal data quantiles, speech data quantiles, impulse data quantiles; each panel has the axis label F_nom^-1]

5. Conclusion

In this paper an algorithm was presented which is capable of detecting the occurrence of noise in acoustic signals and of classifying this noise with a high percentage of accuracy. The main tool used for this purpose is a QQ plot with probability density function estimates and hypothesis testing algorithms. This research has been conducted with the purpose of making acoustic signals more broadly usable in industry as a tool for predictive maintenance and state estimation of machines. The algorithm has been tested in a real industrial environment, in the thermal power plant Kostolac A1 in Serbia, and is shown to be capable of detecting whether noise has occurred and of classifying whether impulse disturbance or speech contamination is in question. Furthermore, the influence of the window length on the efficiency of the algorithm has been tested as well.

The rate of successful detection and classification is much lower for speech signals than for impulse disturbance, because the intensity of the speech, as well as the words that are spoken, directly influence the amount of contamination of the nominal signal. Therefore, if someone speaks quietly or makes long pauses while speaking, the proposed algorithm is unlikely to detect all the polluted parts of the signal. Also, the percentage of contamination included in the analyzed window affects the detectability of the contamination, so the beginning and the end of a speech contaminated sequence may not always be detectable. This can be improved by increasing the overlap between the windows and decreasing the size of the window, but only up to a point.
The algorithm proposed in this paper is an introductory study of a preprocessing tool that should be capable of detecting and isolating acoustic noise in an industrial environment, with the purpose of making acoustic recordings more compelling for use in industrial predictive maintenance algorithms. Further research will include robustification of the algorithm and improvement of speech detection, possibly by using correlation analysis or similar tools. Also, a PDF estimation of noisy signals based on their QQ plots might yield more robust results as well.

Acknowledgement: This paper is a result of activities within the projects supported by Serbian Ministry of Education and Science III42007 and TR32038.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 35, No 3, September 2022, pp. 455-468
https://doi.org/10.2298/fuee2203455n
© 2022 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND
Original scientific paper

EXPERIMENTAL SHIELDING EFFECTIVENESS STUDY OF METAL ENCLOSURE WITH ELECTROMAGNETIC ABSORBER INSIDE

Nataša Nešić1, Nebojša Dončov2, Slavko Rupčić3, Vanja Mandrić-Radivojević3

1Department of Information and Communication Technologies, Academy of Applied Technical and Preschool Studies, Niš, Serbia
2Faculty of Electronic Engineering, University of Niš, Serbia
3Faculty of Electrical Engineering, University of Josip Juraj Strossmayer, Osijek, Croatia

Abstract. In this paper, the impact of an electromagnetic absorber inside a protective metal enclosure is analyzed. The absorber is put inside the enclosure in order to improve its shielding effectiveness, especially at the first resonant frequency. Different absorber sheet positions inside the enclosure are analyzed.
The absorber sheet dimensions are fitted to correspond to the enclosure's walls. The experimental procedure is conducted in a semi-anechoic room. Numerical TLM simulations of the EM field distribution inside the enclosure are conducted in order to consider the position of the absorber sheet on different walls.

Key words: absorber, enclosure, EMI absorber sheet, measurements, shielding effectiveness, TLM method.

Received March 1, 2022; revised March 24, 2022; accepted May 15, 2022
Corresponding author: Nataša Nešić, Department of Information and Communication Technologies, Academy of Applied Technical and Preschool Studies, Aleksandra Medvedeva 20, 18000 Niš, Serbia
E-mail: natasa.nesic@akademijanis.edu.rs
*An earlier version of this paper was presented at the 15th International Conference on Applied Electromagnetics ПЕС 2021, August 30th - September 1st, 2021, in Niš, Serbia [1].

1. Introduction

An increasing number of modern electronic devices has resulted in a rise of electromagnetic (EM) radiation. Hence, it is of considerable importance to conduct electromagnetic compatibility (EMC) analysis. The shielding properties of an enclosure can be quantified in terms of shielding effectiveness (SE). Commonly, the shielding characteristic of an enclosure is given as a ratio of the EM fields with and without the enclosure at some probe point, over a wide frequency range. The SE of an enclosure may be very low or even negative at the resonant frequencies in the observed frequency range. Negative SE values at the resonant frequencies can affect or even compromise the useful frequency range in which EM shielding of a device is provided.

A number of different methods, analytical, numerical and experimental, can be used to study the shielding characteristic of an enclosure.
The analytical methods [2]–[4] are usually based on problem simplification; thus they can be very fast, but with some inherent limitations. For efficient computational modelling of protective enclosures there are numerous numerical techniques. One among many is the transmission-line matrix (TLM) method [5], which is employed in this paper. Finally, in the experimental methods, an antenna is set inside the enclosure in order to measure its SE. Furthermore, the physical dimensions of an in-house monopole receiving antenna, which is often used in the experimental set-up for measuring the EM field level, can also affect the SE of the enclosure. This was numerically demonstrated in [6] and experimentally confirmed in [7].

Several techniques can be applied in order to improve the shielding properties of an enclosure over a frequency range. The SE of an enclosure was increased by using absorbers [8] or conductive foam [9], [10]. As damping techniques, composite materials based on nanotechnology [11] and metamaterial absorber structures [12] can be used. Furthermore, a frequency-selective surface [13] and polymer composites filled with carbonaceous particles, which are suitable for microwave absorption [14], can be employed. The enclosure can be coated with composite foam material or can be made of that material [15]. In [16], it was shown that placing small antenna elements, a dipole or loop antenna structure with loaded resistance, on the enclosure wall opposite to the enclosure aperture can improve the enclosure SE. The effective length of this small structure was chosen to match the first resonant frequency of the enclosure. In papers [17] and [18], the authors proposed to suppress the first resonant frequency in a metal enclosure by putting small antenna elements with loaded resistance inside it. It was shown that the EM shielding could be improved by placing a small printed dipole antenna structure on the inner enclosure wall.
The improvement was efficient, especially at the first resonant frequency in the observed frequency range. In [8], the influence of an electromagnetic interference (EMI) absorber inside the enclosure and the resulting improvement of the enclosure SE were experimentally studied. The absorbers were placed on the back wall of the enclosure, on the two side walls, and on the back and two side walls at the same time. In [1] the study from [8] was expanded by placing and combining absorbers on other enclosure walls, in order to see how these absorber positions affect the SE of the enclosure, especially at resonant frequencies.

In this paper, the experimental study of the impact of the absorber sheet position on the shielding effectiveness of the enclosure is systematized and supported by numerical analysis of the EM field distribution on the inner enclosure walls. For the numerical simulations, the TLM method is used, aiming to estimate where the absorber position inside the enclosure has the greatest impact. In such a way, the amount of absorber and its precise position can be determined in advance. In the experimental and numerical studies, the position of a thin EMI absorber sheet on one or more inner enclosure walls is considered, focusing on the SE behavior at the first resonance of the enclosure, although its placement might affect the SE at higher resonances as well.

The paper is organized as follows. Section 2 refers to the analytical calculation of the enclosure modes. In Section 3, the numerical TLM model of the enclosure is described. In Section 4, the experimental set-up and measurement procedure are described. Section 5 presents a physical model of the enclosure with the EMI absorber material and with a receiving antenna inside. Section 6 provides a discussion of the experimental results. Finally, Section 7 summarizes the work.

2. Analytical calculation of enclosure modes

In this section, a rectangular metal enclosure with one aperture on the frontal enclosure wall is described. The enclosure has dimensions of (300 x 300 x 120) mm3. A rectangular slot aperture with dimensions of (100 x 5) mm2 is positioned symmetrically around the centre of the frontal enclosure wall. The thickness of all enclosure walls is t = 1.5 mm. The enclosure is made of copper.

To start with, the metal enclosure can be analysed as a resonator cavity. A waveguide is a type of transmission line, and a resonator can be constructed from closed sections of waveguide [19]. When a waveguide is short-circuited at both ends, a closed metal box or cavity is obtained. Inside the cavity, electric and magnetic energy can be stored, and power can be dissipated in the metallic walls of the cavity [19]. Usually, coupling to the resonator is obtained by a small aperture (or apertures) and a small probe or a small loop. In this paper, an aperture on the frontal enclosure wall is used for coupling to the enclosure, while a small probe, a monopole antenna, is employed for measuring the distribution of the EM field inside the enclosure. The TE and TM modes which occur in the considered enclosure are calculated according to the analytical equation [19] and are given in Table 1.

Table 1 All the resonant TE and TM modes occurring in the considered enclosure, in the observed frequency range

Resonant frequency mode, f_{m,n,l}   GHz
f110                                 0.707
f101 = f011                          1.346
f111                                 1.436
f201 = f021                          1.601
f120 = f210                          1.118
f211 = f121                          1.677
f220                                 1.414
f221                                 1.887

3. Numerical model of enclosure

Before the experimental procedure is conducted, a numerical model of the considered enclosure is designed by using the TLM method as a numerical modelling technique [5]. It is created to resemble the experimental procedure. The TLM compact wire model is very suitable for modelling an antenna placed inside the enclosure in order to measure the EM field level and its distribution [6].
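The entries of Table 1 can be cross-checked with the standard resonant-frequency expression for an ideal closed rectangular cavity, f_mnl = (c/2) sqrt((m/a)^2 + (n/b)^2 + (l/d)^2). The short sketch below is an illustration rather than the authors' code: it assumes that the first two indices refer to the two 300 mm dimensions and the third to the 120 mm dimension, that the aperture and wall thickness are neglected, and that the speed of light is rounded to 3e8 m/s. The function name is hypothetical.

```python
import math

C0 = 3.0e8  # m/s, rounded speed of light (assumption made for this sketch)


def cavity_resonance_hz(m, n, l, a=0.300, b=0.300, d=0.120):
    """Resonant frequency of mode (m, n, l) in an ideal a x b x d cavity (metres)."""
    return (C0 / 2.0) * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (l / d) ** 2)


# Modes listed in Table 1, in the same order
modes = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (2, 0, 1),
         (1, 2, 0), (2, 1, 1), (2, 2, 0), (2, 2, 1)]
for mode in modes:
    print(mode, round(cavity_resonance_hz(*mode) / 1e9, 3), "GHz")
```

Because a = b, modes with the first two indices swapped (for example f101 and f011) are degenerate, which is why they share one row in Table 1.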
The compact wire model is based on a wire segment incorporated into the standard TLM symmetrical condensed node (SCN). The impedances of the additional wire network link and short-circuit stub lines depend on the space and time-step discretization used, and also on the per-unit-length wire capacitance and inductance [20], [21].

In the numerical model of the enclosure, denoted by D, a monopole antenna is employed inside. The TLM compact wire model is used to describe the monopole antenna as a wire conductor with a length of l = 60 mm and a radius of r = 0.1 mm, placed in the middle of the enclosure [6], as shown in Fig. 1. The antenna is also connected to the ground via a resistor R. The slot aperture on the front wall of the enclosure is described by several nodes across each cross-section dimension. The external EM field, represented as a vertically polarized incident plane wave, penetrates into the enclosure through the aperture and induces a current on the wire. Further, a voltage is generated on the resistor R, which is loaded at the wire base. This allows the level of the EM field inside the enclosure to be measured numerically [6], [18].

Table 2 presents the first three resonant frequencies (the excited TM modes) obtained by analytical calculation for a rectangular resonator (enclosure without aperture/slot) and by numerical simulations. The first three modes for the empty enclosure with the slot aperture and for the enclosure with the monopole antenna are obtained by TLM numerical calculations. The numerical SE results obtained for the empty enclosure and for the one with a monopole receiving antenna are presented in Fig. 2. It can be observed that the first resonant frequency is shifted toward lower frequencies in the presence of the monopole antenna. The analysis with different antenna radii and different antenna lengths is given in [6].
The frequency shift can be explained by perturbation theory: when a volume ∆V is put inside the resonator, the total interior volume decreases by ∆V, which affects the position of the resonant frequency in that enclosure [19], [6].

Fig. 1 Enclosure D with one rectangular aperture on the front wall, excited by a normally incident, vertically polarized plane wave [6]

Fig. 2 Comparison of the first resonant frequency peaks of enclosure D with and without the monopole antenna (TLM simulations) [6]

Table 2 The first three resonant frequencies in enclosure D

TM mode   Analytical calculation   Empty enclosure [6]   Enclosure with monopole r = 0.1 mm [6]
TM110     f110 = 707.107 MHz       fr1 = 703.059 MHz     fr1 = 688.496 MHz
TM120     f120 = 1118.03 MHz       fr2 = 1101 MHz        fr2 = 1099 MHz
TM130     f130 = 1581.138 MHz      fr3 = 1608 MHz        fr3 = 1444 MHz

4. Experimental procedure and setup

The experimental procedure for the equipment under test (EUT) is described in this section. The measurements are conducted in a semi-anechoic measuring place equipped with RF absorbers within an ordinary laboratory space, in the Laboratory for HF Measurements at the FERIT faculty in Osijek, Croatia.

In order to determine the SE of the enclosure, the measuring procedure has to be performed twice, without and with the enclosure. The SE of the considered enclosure is measured with a network analyzer by means of the S21 parameters. The transmission parameters of the measurements without and with the enclosure are denoted as S21n and S21e, respectively [8]. The SE can then be determined as follows:

$$SE\,[\mathrm{dB}] = S_{21n} - S_{21e}. \tag{2}$$

Figure 3 illustrates the measuring configuration used in the semi-anechoic room. The broadband dipole antenna, type Vivaldi, was used as the transmitting antenna in the experimental set-up.
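As an illustration of Eq. (2), the sketch below subtracts two S21 traces given in dB and locates the SE minimum inside a chosen frequency band, which is one simple way to read off a resonance. The trace values are synthetic and the function names are hypothetical; real traces would come from the two VNA sweeps, sampled on the same frequency grid.

```python
import numpy as np


def shielding_effectiveness_db(s21_without_db, s21_with_db):
    """Eq. (2): SE [dB] = S21n - S21e, with both traces already in dB."""
    return np.asarray(s21_without_db, dtype=float) - np.asarray(s21_with_db, dtype=float)


def se_minimum_in_band(freq_hz, se_db, f_lo_hz, f_hi_hz):
    """Return (frequency, SE) of the lowest SE value inside [f_lo_hz, f_hi_hz]."""
    freq_hz = np.asarray(freq_hz, dtype=float)
    se_db = np.asarray(se_db, dtype=float)
    band = (freq_hz >= f_lo_hz) & (freq_hz <= f_hi_hz)
    idx = np.flatnonzero(band)[np.argmin(se_db[band])]
    return freq_hz[idx], se_db[idx]


# Synthetic example traces (not measured data)
freq = np.array([650e6, 686e6, 720e6, 1100e6])
s21n = np.array([-40.0, -40.0, -40.0, -40.0])   # without enclosure
s21e = np.array([-50.0, -25.0, -48.0, -30.0])   # with enclosure
se = shielding_effectiveness_db(s21n, s21e)
print(se)
print(se_minimum_in_band(freq, se, 600e6, 800e6))
```

A negative SE value means that the received level with the enclosure is higher than without it, which is the situation reported at the enclosure resonances.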
The vector network analyzer (VNA), a Keysight FieldFox RF Analyzer N9914A 6.5 GHz, with a maximum power of 3 dBm and a resolution of 100 Hz, was employed as the measuring device. The Vivaldi antenna was connected to the VNA via a coax cable. Further, the VNA was connected to the receiving antenna via a coax cable. An in-house monopole antenna was employed as the receiving antenna and placed inside the tested enclosure D [17], [18].

In the measurement process, a vertically polarized Vivaldi antenna is used as the excitation source [21], as depicted in Fig. 3. The Vivaldi antenna has a frequency range of 600 MHz to 6 GHz, while the receiving antenna is a very thin in-house monopole. The monopole antenna is placed in the middle of the enclosure in order to measure the level of the EM field inside. All measurements are performed in the frequency range from 600 MHz to 2 GHz, in the far field. The enclosure used in the experiments is made of copper, with internal dimensions of (300 x 300 x 120) mm3. The in-house monopole antenna is also made of copper, with a length of l = 60 mm and a radius of r = 0.15 mm. Figure 4 presents a photograph of the measuring configuration used for obtaining the experimental results in the semi-anechoic room [21].

The SE results of the enclosure with the monopole antenna obtained by measurements and by numerical simulations are compared in Fig. 5. An excellent match between the measured and simulated curves can be observed.

Fig. 3 Sketch of the measuring set-up: transmitting antenna, VNA and EUT (enclosure under test D) [21]

Fig. 4 Photograph of the measuring configuration used in the semi-anechoic room: transmitting antenna, VNA and EUT (enclosure D with a receiving antenna)

Fig. 5 Comparison between measurement and numerical simulation results of the SE of the enclosure with the monopole antenna

5. EMI absorber

The 3M™ EMI Absorber AB7050 from the AB7000 series [1], [8], [22] is used in the measurements conducted in this paper. One side of the absorber sheet consists of a flexible polymer resin loaded with soft metal flakes, and the other side is covered by an acrylic pressure-sensitive adhesive which allows for easy application [8], [22]. This absorber is typically used for applications in a wide frequency range, from 50 Hz up to 10 GHz. It is a broadband EMI absorber designed to work in near-field applications inside and around electronic devices and assemblies [22]. The absorber is as thin as a sheet of paper, with a backing thickness of 0.5 mm and an adhesive thickness of 0.05 mm, so it does not occupy significant space inside the enclosure [22]. The EMI absorber used in the experimental analysis is cut to fit the inner sides of the enclosure.

The experimental procedure is conducted for eight cases (configurations). Firstly, the enclosure without the EMI absorber is measured and its SE characteristic is obtained. Secondly, the EMI absorber is placed on the lower wall of the enclosure, a case denoted by LW. Its dimensions correspond to the inner dimensions of the lower enclosure wall. The third case refers to absorbers on the two side walls of the enclosure (denoted by 2SW). The absorber is cut to fit the side walls of the enclosure, that is, to (297 x 120) mm2. In the fourth case, the absorber is placed on the wall opposite to the front wall with the aperture, the so-called back wall (denoted by BCW). In the fifth case, the absorbers are placed on the lower wall and on the two side walls at the same time (denoted by LW+2SW, as in Fig. 6). In the sixth case, the absorbers are put on the lower and back enclosure walls (case LW+BCW). In the seventh case, the absorbers are put on the lower and upper enclosure walls (case LW+UP).
Finally, the eighth case combines all the above-mentioned absorber positions. This case is called LW+2SW+BCW+UP.

Fig. 6 Photograph of the physical model of metal enclosure D with the EMI absorbers on the lower wall and both side walls of the enclosure (LW+2SW)

6. Discussion of results

The experimental results of the shielding characteristics of the considered enclosure with the EMI absorbers inside are presented in this section. To start with, the SE results are obtained from the measured transmission parameters without and with the enclosure.

In Fig. 7, and also in the further figures, the SE results for the enclosure without absorber (empty enclosure with only the receiving antenna inside) are given. In Fig. 7 they are compared to the configuration with the absorber on the lower enclosure wall. It can be observed that the presence of the absorber inside the enclosure gives a significant improvement over the case without it, especially at the resonant frequencies. Apart from the resonant frequencies, both SE curves are very similar in terms of shape and SE levels. In the presence of the absorber, all the peaks at the resonant frequencies are damped.

Table 3 presents the SE values at the first enclosure resonance for the different absorber configurations. It can be seen that the first resonant frequency of the empty enclosure occurs at 686 MHz, where the SE value is equal to -15.45 dB. A negative SE value might compromise the shielding property of the enclosure. By putting the EMI absorber on the lower wall of the enclosure, the first resonant frequency occurs at 696 MHz and a positive SE value of 9.87 dB is obtained. Comparing the enclosure with the LW absorber to the one without it, a frequency shift (Δfr1) of 10 MHz is obtained, as shown in Fig. 7.
Moreover, at the first resonant frequency a difference between the SE levels (ΔSE) of 25.32 dB is indicated. Therefore, a shift of the first resonant frequency toward higher frequencies can be observed in the presence of the EMI absorber.

Secondly, Fig. 8 compares the measurement results of the enclosure with the EMI absorbers placed on the two side walls to the empty enclosure case. Although the SE curves in Fig. 8 look similar and do not differ much in shape over the whole frequency range, the SE levels still differ at the resonant frequencies around 700 MHz, 1100 MHz and 1650 MHz. The absorber efficiency is very good at lower frequencies, while it is weaker at higher frequencies in the observed range. Additional TE and/or TM modes are not established inside the enclosure, since very thin EMI absorbers were employed inside it. Also, Table 3 shows that the frequency shift of the first resonance, Δfr1, is 8 MHz, while the difference between the SE levels is 24.15 dB.

For the fourth case (BCW), the absorber is placed on the back wall inside the enclosure. The results are depicted in Fig. 9. It can be seen that the difference between the SE levels (ΔSE) is 20.9 dB and that the frequency shift (Δfr1) of the first resonance without and with the absorber is 8 MHz, as shown in Fig. 9 and in Table 3.

Fig. 7 The SE measurement results for the enclosure without absorber and with the absorber placed on the lower wall (case LW)

Fig. 8 The measurement results for the SE of the enclosure with the absorber placed on two side walls (case 2SW)

Fig. 9 The measurement results for the SE of the enclosure with the absorber placed on the wall opposite to the frontal enclosure wall (case BCW)

In order to consider the effects on the shielding characteristic of absorbing material put in different positions inside the enclosure, TLM simulations are conducted. The EM field distribution is shown on different inner wall surfaces of the enclosure without absorber. Figure 10 presents the EM field distribution on the lower wall inside the enclosure. Further, Figs. 11 and 12 present the EM field distribution on the left-side wall and on the back wall inside the enclosure, respectively. It can be observed that the EM field distribution is not uniform and that placing the absorber on the lower wall might have the strongest influence on the SE characteristic among these three considered positions. Therefore, the absorber on the lower enclosure wall is included in all further considered cases.

Fig. 10 The EM field distribution on the lower wall inside the enclosure, obtained by the numerical model

Fig. 11 The EM field distribution on the left-side wall inside the enclosure, obtained by the numerical model

Fig. 12 The EM field distribution on the back wall inside the enclosure, obtained by the numerical model

In Fig. 13, the SE measurement results are compared for the first, the fifth, the sixth and the seventh configurations. For the fifth case (LW+2SW), it can be seen that the SE curve does not differ much in shape from that of the empty enclosure (the first case) over the whole frequency range, but the SE values differ at the resonant frequencies, especially above 1400 MHz. One can notice that the absorber efficiency is very good at lower frequencies, while it is a bit weaker at higher frequencies in the observed range.
For this configuration, the difference between the SE values is 30.54 dB, while the frequency shift Δfr1 is 10 MHz. The fifth, the sixth and the seventh configurations have the same frequency shift, see Table 3. Further, for the sixth configuration (LW+BCW), the difference between the SE levels (ΔSE) at the first resonance for this case and the case without absorbers is 29.2 dB. At higher frequencies, above 1700 MHz, the presence of absorbers in this case led to a decrease of the SE, as depicted in Fig. 13. For the seventh case (LW+UP), the difference between the SE levels (ΔSE) for this case and the case without absorbers is 28.96 dB at the first resonance. At higher frequencies, above 1700 MHz, the presence of absorbers led to an increase of the SE for this configuration, as depicted in Fig. 13. It can be observed that all compared cases have a similar shape of the characteristic; according to Table 3, case LW+2SW has the highest SE value at the first resonance, while case LW+UP has significantly higher SE levels at higher frequencies, above 1700 MHz.

Fig. 13 The comparison of SE measurement results for the enclosure without absorber, case LW+2SW, case LW+BCW and case LW+UP

Fig. 14 The comparison of the measured SE of the enclosure around the first resonant frequency: without absorber (empty), case LW+2SW, and case LW+UP+BCW+2SW

Finally, Fig. 14 compares the SE characteristics of the enclosure without absorber, of the fifth case (LW+2SW), and of the configuration where all inner wall surfaces are coated with absorbers, the eighth case (LW+UP+BCW+2SW). The eighth case has a considerably better SE level at the first resonant frequency, but it should be borne in mind that it requires a significant amount of absorber material.
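The Δfr1 and ΔSE figures quoted in this section are plain differences with respect to the empty-enclosure case. The following sketch reproduces that arithmetic from the first-resonance values listed in Table 3; the dictionary layout and the function name are chosen only for this illustration.

```python
# (fr1 in MHz, SE in dB) at the first resonance, copied from Table 3
TABLE_3 = {
    "empty": (686, -15.45),
    "LW": (696, 9.87),
    "2SW": (694, 8.70),
    "BCW": (694, 5.45),
    "LW+2SW": (696, 15.09),
    "LW+BCW": (696, 13.75),
    "LW+UP": (696, 13.51),
    "LW+2SW+BCW+UP": (696, 19.30),
}


def improvement(case, reference="empty"):
    """Return (frequency shift in MHz, SE difference in dB) of case vs. reference."""
    f_ref, se_ref = TABLE_3[reference]
    f_case, se_case = TABLE_3[case]
    return f_case - f_ref, round(se_case - se_ref, 2)


for name in TABLE_3:
    if name != "empty":
        print(name, improvement(name))

# Extra benefit of covering all walls compared with case LW+2SW
print(improvement("LW+2SW+BCW+UP", reference="LW+2SW"))
```

Comparing a case against LW+2SW instead of the empty enclosure shows how much additional SE is bought by the extra absorber material.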
In practical applications, the case LW+2SW is the most economical one, since significant absorber effects are achieved by using absorber material on only three walls.

Table 3 The SE values at the first enclosure resonance

EMI absorber position   fr1_meas [MHz]   SE_meas [dB]   Δfr_meas [MHz]   ΔSE_meas [dB]
Empty                   686              -15.45
LW                      696              9.87           10               25.32
2SW                     694              8.70           8                24.15
BCW                     694              5.45           8                20.9
LW+2SW                  696              15.09          10               30.54
LW+BCW                  696              13.75          10               29.2
LW+UP                   696              13.51          10               28.96
LW+2SW+BCW+UP           696              19.30          10               34.75

7. Conclusion

Eight configurations of the enclosure, without absorber and with different positions of absorbers inside, are considered in the experimental shielding effectiveness study, supported by numerical analysis. A significant SE level improvement of 30.54 dB at the first resonant frequency, compared to the empty enclosure case, is obtained for the case LW+2SW. The case LW+2SW+BCW+UP gives a further improvement of 4.21 dB with respect to case LW+2SW; however, it requires a significantly higher amount of absorber material. Overall, the technique of using a thin absorber can improve the shielding properties of an enclosure, but a numerical study would be beneficial to estimate its effects in different positions. Therefore, a numerical model of the EMI absorber will be the focus of future research. In addition, the presence of the EMI absorber may also influence the SE peaks at higher frequencies, which will be further explored as well.

Acknowledgement: This work has been supported by the EUROWEB+ project, by the COST IC 1407 and by the Ministry of Education, Science and Technological Development of Republic of Serbia (grant no. 451-03-68/2022-14/ 200102).

References

[1] N. Nešić, S. Rupčić, V. Mandrić-Radivojević and N. Dončov, "Experimental analysis of a metal enclosure shielding effectiveness improvement with EMI absorber", in Proceedings of the 15th International Online Conference on Applied Electromagnetics ПЕС 2021, Niš, 2021, pp. 98–101.
[2] C. Christopoulos, Principles and Techniques of Electromagnetic Compatibility, 2nd ed. CRC Press, 2007.
[3] H. A. Mendez, "Shielding theory of enclosures with apertures", IEEE Trans. Electromagn. Compat., vol. 20, no. 2, pp. 296–305, May 1978.
[4] P. M. Robinson, M. T. Benson, C. Christopoulos, F. J. Dawson, D. M. Ganley, C. A. Marvin, J. S. Porter and P. W. Thomas, "Analytical formulation for the shielding effectiveness of enclosures with apertures", IEEE Trans. Electromagn. Compat., vol. 40, no. 3, pp. 240–248, August 1998.
[5] C. Christopoulos, The Transmission-Line Modelling (TLM) Method. Piscataway, New Jersey: Wiley-IEEE Press in association with Oxford University Press, May 1995.
[6] N. J. Nešić and N. Dončov, "Shielding effectiveness estimation by using monopole-receiving antenna and comparison with dipole antenna", Frequenz, vol. 70, no. 5-6, pp. 191–201, April 2016.
[7] N. J. Nešić, Numerical and Experimental Analysis of Aperture Arrays Impact on the Shielding Effectiveness of Metal Enclosures in Microwave Frequency Range, doctoral thesis, in Serbian, Singidunum University, Belgrade, 2017.
[8] N. J. Nešić, S. Rupčić, V. Mandrić Radivojević and N. Dončov, "Experimental analysis of electromagnetic interferences absorber influence on metal enclosure immunity", in Proceedings of the 8th International Conference on Electrical, Electronic and Computing Engineering (IcETRAN), Bosnia and Herzegovina, 2021, pp. 383–386.
[9] X. Luo and D. D. L. Chung, "Electromagnetic interference shielding using continuous carbon-fiber carbon-matrix and polymer-matrix composites", Elsevier Science, Compos. B Eng., vol. 30, no. 3, pp. 227–231, April 1999.
[10] R. Kumar, S. R. Dhakate, P. Saini and R. B. Mathur, "Improved electromagnetic interference shielding effectiveness of light weight carbon foam by ferrocene accumulation", The Roy. Soc. of Chem. 2013: RSC Advances, vol. 3, pp. 4145–4151, January 2013.
[11] T. K. Gupta, B. P. Singh, R. B. Mathur and S. R. Dhakate, "Multi-walled carbon nanotube–graphene–polyaniline multiphase nanocomposite with superior electromagnetic shielding effectiveness", The Royal Society of Chemistry 2014: Nanoscale, vol. 6, pp. 842–851, 2014.
[12] F. Costa, S. Genovesi, A. Monorchio and G. Manara, "A circuit-based model for the interpretation of perfect metamaterial absorbers", IEEE Trans. Antennas and Propag., vol. 63, no. 3, pp. 1201–1209, March 2013.
[13] B. A. Munk, Frequency Selective Surfaces: Theory and Design, New York: John Wiley and Sons, Inc., 2000.
[14] F. Qin and C. Brosseau, "A review and analysis of microwave absorption in polymer composites filled with carbonaceous particles", J. Appl. Phys., vol. 111, p. 061301, March 2012.
[15] A. Ameli, P. U. Jung and C. B. Park, "Electrical properties and electromagnetic interference shielding effectiveness of polypropylene/carbon fiber composite foams", Elsevier: Carbon, vol. 60, pp. 379–391, August 2013.
[16] J. Paul, S. Greedy, H. Wakatsuchi and C. Christopoulos, "Measurements and simulations of enclosure damping using loaded antenna elements", in Proceedings of the IEEE 10th International Symposium on Electromagnetic Compatibility, York, 2011, pp. 676–679.
[17] N. Nešić, B. Milovanović, N. Dončov, V. Mandrić-Radivojević and S. Rupčić, "Improving shielding effectiveness of a rectangular metallic enclosure with aperture by using printed dog-bone dipole structure", in Proceedings of 52nd International Scientific Conference on Information, Communication and Energy Systems and Technologies (ICEST), Niš, 2017, pp. 97–100.
[18] N. J. Nešić, B. G. Milovanović, N. S. Dončov, S. M. Rupčić and V. Mandrić-Radivojević, "Improving shielding effectiveness of a metallic enclosure at resonant frequencies", in Proceedings of the IEEE 13th International Conference on Advanced Technologies, Systems and Services in Telecommunications (TELSIKS), Niš, 2017, pp. 42–45.
[19] D. M. Pozar, Microwave Engineering, 4th ed. Wiley, 2012, chapters 2-3, pp. 48–162.
[20] V. Trenkić, A. J. Wlodarczyk and R. Scaramuzza, "A modelling of coupling between transient electromagnetic field and complex wire structures", Int. Journal of Num. Modelling, vol. 12, no. 4, pp. 257–273, July/August 1999.
[21] N. J. Nešić, Slavko S. Rupčić, Nebojša S. Dončov and Vanja Mandrić-Radivojević, "Experimental shielding effectiveness analysis of metal plate influence inside an enclosure with aperture", in Proceedings of the IEEE 14th International Conference on Advanced Technologies, Systems and Services in Telecommunications (TELSIKS), Niš, 2019, pp. 190–193.
[22] https://multimedia.3m.com/mws/media/960654o/3m-emi-absorber-ab7000hf-series-halogen-free.pdf

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 36, No 1, March 2023, pp. 31-42
https://doi.org/10.2298/fuee2301031r
© 2023 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND
Original scientific paper

MACHINE LEARNING ASSISTED OPTIMIZATION AND ITS APPLICATION TO HYBRID DIELECTRIC RESONATOR ANTENNA DESIGN

Pinku Ranjan1, Harshit Gupta1, Swati Yadav2, Anand Sharma3

1ABV-Indian Institute of Information Technology and Management (IIITM), Gwalior, Madhya Pradesh, India
2Department of Electronics & Communication Engineering, College of Engineering Roorkee (COER), Roorkee, Uttarakhand, India
3Department of Electronics & Communication Engineering, Motilal Nehru National Institute of Technology Allahabad, India

Abstract. Machine learning assisted optimization (MLAO) has become very important for improving the antenna design process because it consumes much less time than the traditional methods. The accountability of these models can be checked by accuracy metrics, which indicate the correctness of the predicted result.
machine learning (ml) methods, such as gaussian process regression, artificial neural networks (anns), and support vector machines (svm), are used to simulate the antenna model in order to predict the reflection coefficient faster. this paper presents the optimization of a hybrid dielectric resonator antenna (dra) using machine learning models. several regression models are applied to the dataset for optimization, and the best results are obtained using a random forest regression model, with an accuracy of 97%. additionally, the effectiveness of machine learning based antenna design is demonstrated through comparison with conventional design methods.

key words: dielectric resonator antenna, machine learning, gaussian process regression, anns, svm

received may 18, 2022; revised july 27, 2022; accepted august 31, 2022
corresponding author: pinku ranjan, abv-indian institute of information technology and management (iiitm), gwalior, madhya pradesh, india
e-mail: pinkuranjan@iiitm.ac.in

1. introduction

antenna design optimization is a topic that has received a lot of attention in the past few years. this is because conventional antenna design methodologies are laborious and do not guarantee effective results, given the fabrication and performance requirements of the latest antennas [1]-[3]. although design automation via optimization complements the conventional approaches to antenna design, the optimization of antenna designs has many problems [3]-[5]. the significant issues concern the efficiency and optimization capacity of the available techniques when they have to address a wide range of antenna design problems, considering the growing specifications of current antennas. the methods presented in this paper may influence the future development of antennas for a wide range of applications.
at frequencies in the microwave range, measurements of current (i) and voltage (v) become very difficult [6]-[8]. at higher frequencies, current and voltage are not measured; it is preferred to measure power. at higher microwave frequencies it is also hard to realize short-circuit and open-circuit terminations for ac signals over a broad bandwidth. to overcome this problem, s-parameters are used in the microwave range. s-parameters are defined in terms of incident and reflected traveling waves, and they are easy to use in analysis. they can be measured simply with network analyzers, and the acceptance of such parameters has been growing rapidly [8]. s-parameters form a complex matrix that shows the reflection/transmission characteristics (amplitude/phase) in the frequency domain. various quantities are related to these parameters, such as frequency bandwidth, return loss and radiation pattern [9]-[11].

some researchers have concentrated on this issue and predicted antenna performance using various ml techniques in the open literature. sharma et al. suggested using lasso (least absolute shrinkage and selection operator), ann, and knn approaches to optimize a t-shaped monopole. to train the model at two separate frequency bands, 2.5 ghz and 5.5 ghz, 450 data set points were collected [12]. gao et al. optimized the yagi-uda, shaped printed antenna, and dual-mode printed antenna using the gaussian process and support vector machines [13]. j. p. jacobs suggested using ml techniques based on gaussian process regression to optimize a u-shaped slot-loaded microstrip antenna. the aforementioned antenna is compatible with both the 2.75 ghz and 3.75 ghz frequency bands [14].

this paper deals with hybrid antennas that combine passive and active architecture, and with their optimization using different machine learning techniques.
hybrid antennas are widely used, with important applications including radar display control systems for managing self-driving cars and automated equipment control systems using radar signal inputs.

2. background

2.1. artificial neural network

anns were introduced to the em field and microwave engineering during the 1990s [15]. artificial neural networks have found applications in antennas, radar circuit design, remote sensing, measurement problems and various other fields. neural networks are intended to model the manner in which the human brain performs a specific task. a general definition describes a neural network as massive. in the late twentieth century, anns were first introduced to mimo antennas. the ann was used to transform the design parameters, including the dielectric constant and the antenna's dimensions. recently, anns with many hidden layers, generally referred to as deep neural networks (dnns) or deep learning, have been introduced to solve antenna parameter and optimization problems. the output of an ann can be predicted as follows:

$$ y = x_{\mathrm{input}} \cdot \mathrm{weights} + b\ (\mathrm{bias}) \qquad (1) $$

$$ y_{\mathrm{final}} = (x_1 w_1) + (x_2 w_2) + \dots + (x_n w_n) \qquad (2) $$

2.2. support vector machine

the svm can handle classification as well as regression problems. in regression problems, the input space of the svm is mapped onto a high-dimensional space called the feature space, where regression can be performed accurately with linear functions [16]. in the antenna design field, the svm was introduced as an alternative to anns because of its better generalization capability. in practical applications, the training data sets generated by full-wave em simulations are mainly of finite size, which causes overfitting in certain artificial neural network applications.
also, the svm needs fewer training patterns to give precise results, which speeds up the training procedure.

2.3. gaussian process regression

recently, the gpr has received broad attention in the area of em design, including antenna design. unlike the other two ml techniques, the gpr can quantify the uncertainty of the predicted results at new design points, which helps the designer to explore global optima when only a few training points are available. the gpr was introduced to model antenna responses, including the reflection coefficient, gain performance and crosstalk level, for three different antenna models [17].

2.4. antenna architecture

for this study a hybrid dielectric resonator antenna is used. in the hybrid structure, every antenna is designed to radiate in its own separate band. the hybrid resonator can offer ultra-wideband operation if the different bands are sufficiently close. ultra-wideband operation can also be obtained in hybrid resonators by applying the bandwidth improvement techniques used in dras and in other antennas. fig. 1 displays the structural layout of the dra antenna, and table 1 shows the dimensions of the proposed hybrid cp radiator. in this radiator, the ring-shaped ceramic material is excited by a dual linked circular ring-shaped aperture.

table 1 dimensions of proposed hybrid cp radiator

symbols    dimensions (mm)    symbols    dimensions (mm)
ls = ws    50.0               h          13.0
t2         4.0                lf         31.0
hs         1.6                wf         3.0
r1         13.5               d3         10.0
r2         2.0                d4         4.0
t1         2.0                l1         12.0

fig. 1 structural layout of dra antenna: a) dual linked circular ring aperture; b) panoramic view

fig. 2 shows the fabricated prototype of the proposed cp antenna. alumina (relative permittivity of the ceramic material = 9.8; dielectric loss tangent = 0.002) is used to make the ring-shaped ceramic. the alumina is fixed over an fr-4 substrate (relative permittivity of the fr-4 material = 4.4; dielectric loss tangent = 0.02) with the assistance of a quick fix. the thickness of the substrate (hs) is 1.6 mm. the dual linked circular ring-shaped aperture and the t-shaped microstrip lines have been etched on the substrate.

fig. 2 fabricated prototype of proposed cp antenna: (a) bottom view; (b) top view

2.5. s-parameter

electrical systems describe the relationship between input and output in terms of their ports. for example, for two ports named port 1 and port 2, the power transferred from port 1 to port 2 is called s21, and s12 is the transfer of power from port 2 to port 1. when it comes to antennas, s11 describes the amount of power reflected from the antenna, which is called its reflection coefficient. if s11 = 0 db, 100% of the power is reflected from the antenna and nothing is radiated. if s11 = -10 db and 3 db of power is delivered to the antenna, then -7 db is the reflected power. the remaining power is accepted by the antenna. this accepted power is either radiated or consumed as losses inside the antenna itself. since antennas are commonly designed for low loss, ideally most of the power delivered to the antenna is radiated. fig. 3 shows the simulated and measured s11 parameter of the reference antenna.

fig. 3 variation of s-parameter with frequency

the reflection coefficient is a parameter which expresses how much of a wave is reflected by an impedance discontinuity in the transmission medium. it is equal to the ratio of the amplitude of the reflected wave to that of the incident wave, with each represented as a phasor.
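the decibel statements above follow from simple arithmetic on the reflection coefficient. the sketch below is a minimal illustration and not the procedure used in this work: it assumes a 50 Ω reference impedance and a hypothetical, purely resistive 75 Ω load, and the function names are chosen here for illustration only. in decibels, the reflected level is the incident level plus s11, so 3 db incident with s11 = -10 db gives -7 db reflected, which is 10% of the incident power on a linear scale.

```python
import math


def reflection_coefficient(z_load, z0=50.0):
    """Ratio of reflected to incident wave amplitude for a load z_load
    on a line with reference impedance z0 (assumed 50 ohm here)."""
    return (z_load - z0) / (z_load + z0)


def s11_db(gamma):
    """S11 in dB from the (possibly complex) reflection coefficient."""
    return 20.0 * math.log10(abs(gamma))


def reflected_fraction(s11_in_db):
    """Fraction of the incident power that is reflected, from S11 in dB."""
    return 10.0 ** (s11_in_db / 10.0)


# Hypothetical example: a 75-ohm resistive load on a 50-ohm line.
gamma = reflection_coefficient(75.0)
print(f"gamma = {gamma:.3f}, s11 = {s11_db(gamma):.2f} dB")

# The -10 dB case from the text: the reflected level is incident + S11.
incident_db = 3.0
print(f"reflected level = {incident_db + (-10.0):.1f} dB")
fraction = reflected_fraction(-10.0)
print(f"reflected = {fraction:.0%}, accepted by the antenna = {1.0 - fraction:.0%}")
```

a value of s11 close to 0 db therefore means that almost nothing is accepted by the antenna, while every further 10 db of reduction lowers the reflected share of the power by a factor of ten.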
for instance, it is used in optics to calculate the proportion of light reflected from a surface with a different index of refraction, such as glass, or in an electrical transmission line to calculate how much of the electromagnetic wave is reflected by an impedance discontinuity.

3. implementation and results

this section describes the steps of the methodology in more detail, along with the actual implementation details.

3.1. data collection

the data was first collected with the hfss software by altering various parameters of the hybrid dra antenna. the data exported from this software includes various parameters related to the antenna, such as height and frequency, as well as the corresponding s11 parameters. in the collected dataset the frequency varied from 2.00 ghz to 5.00 ghz, whereas the height varied discretely from 5 mm to 15 mm. the dataset contained the value of the s11 parameter for every pair of height and frequency.

3.2. data preprocessing and sampling

as can be seen from the sample dataset in fig. 6, the exported dataset contained some random, unrelated entries which needed to be removed. the exported dataset was also not in a proper format, so some rearranging of columns and rows was required. in this step, we mostly performed such operations on the exported dataset, and finally the dataset was dumped into a csv file via a bash script. this csv file serves as the input to our machine learning algorithms.

to build any machine learning model from a dataset, the first step is the sampling of data. a sampling procedure was applied to the dataset provided to the model as csv input, so as to separate it into various subsets, each with its own purpose. it is normally expected that more data for constructing a model will give better outcomes. typically, the dataset is divided into training, cross-validation and test subsets, each with its own purpose.
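the division into subsets can be sketched with the python standard library alone. this is a minimal illustration under stated assumptions, not the code used in this work: the 20% test share, the random seed and the function name `split_indices` are hypothetical, while the five folds match the fold value reported for this study.

```python
import random


def split_indices(n_rows, test_share=0.2, k=5, seed=0):
    """Shuffle the row indices, hold out a test set and cut the
    remaining rows into k cross-validation folds."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)

    n_test = int(round(n_rows * test_share))
    test, rest = indices[:n_test], indices[n_test:]

    # Fold j takes every k-th remaining index, so fold sizes differ by at most one.
    folds = [rest[j::k] for j in range(k)]
    return test, folds


test_idx, folds = split_indices(n_rows=100)
for j, validation in enumerate(folds):
    # Train on all folds except fold j, validate on fold j.
    training = [i for m, fold in enumerate(folds) if m != j for i in fold]
    print(f"fold {j}: {len(training)} training rows, {len(validation)} validation rows")
print(f"held-out test rows: {len(test_idx)}")
```

the test indices are never used while the model is tuned, which is what makes the final score an estimate of the performance on unknown data.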
as the name suggests, the training set is used to train the learning algorithm. fig. 4 depicts the different stages of the sampling performed on the dataset. the cross-validation set is used for the validation and optimization of the model. cross-validation is used to ensure that our model extracted the proper patterns from the data and did not pick up too much noise. here the fold value is 5. the model cannot be evaluated on the training set, because the results would be overly optimistic, as the model is built by using that set. to see how well the learning algorithm performs on unknown data, we used the test set.

fig. 4 sampling over the dataset

3.3. building ml model

building any ml model starts with loading the csv dataset into the python code for ml modeling. after that, different machine learning models are applied to the dataset according to the requirements of the programmer. the google colab platform is used for training and running the various ml models over the dataset.

3.4. unpacking of data

the dataset of height, frequency and s11 parameter obtained after preprocessing and filtering reaches our model implementation in .csv file format. a small python code, shown in fig. 5, reads the dataset from those files and loads it into a data frame for object serialization and fast, easy access. fig. 6 shows how the data is stored in the csv file.

fig. 5 python code for reading csv

fig. 6 sample data as read from csv

3.5. preparing final data for input

as can be seen in fig. 6, the data read from the csv file contains s11 values for the corresponding pairs of frequency and height (from 5 mm to 15 mm).
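a table of this kind, with one s11 column per height, can be rearranged into one record per (frequency, height, s11) triple. the sketch below is only an illustration of that rearrangement using the standard `csv` module: the column names, the two sample rows and the function name `wide_to_long` are assumptions made here, since the exact layout of the exported file is not reproduced in the text.

```python
import csv
import io

# Hypothetical excerpt of the exported table: one row per frequency,
# one S11 column per resonator height (names and values are made up).
WIDE_CSV = """freq_ghz,h_5mm,h_10mm,h_15mm
2.00,-3.1,-4.2,-5.0
2.01,-3.3,-4.6,-5.4
"""


def wide_to_long(text):
    """Return one record per (frequency, height) pair with its S11 value."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        frequency = float(row.pop("freq_ghz"))
        for column, value in row.items():
            # Column name "h_10mm" -> height 10.0
            height = float(column[len("h_"):-len("mm")])
            records.append(
                {"frequency_ghz": frequency, "height_mm": height, "s11_db": float(value)}
            )
    return records


records = wide_to_long(WIDE_CSV)
for record in records:
    print(record)
```

in this long format, frequency and height are the two feature columns and s11 is the response column, which is the shape that regression models expect.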
after that, a data frame is prepared which holds all three values (frequency, height and s11) in one row, so that it can be used as the actual data for our models, i.e., with features and responses defined clearly. the preparation of the final data frame is indicated in fig. 7.

fig. 7 preparing final data frame

3.6. data visualization

the final data frame created in the previous step was then visualized, as shown in fig. 8, to get insights from the data and to decide which models can be leveraged for such a dataset. frequency and height are treated as independent variables, and s11 as the dependent variable. the relation between these three variables is shown in fig. 9.

fig. 8 data visualization

fig. 9 3d visualization of dataset

3.7. multiple regression

the first model applied to the dataset was multiple regression. multiple regression was chosen because the dependent variable (s11) is predicted from more than one independent variable. in linear regression, the relationship between the dependent variable y and the independent variables x1, ..., xp is given by the equation:

$$ y = f(x) + \epsilon \qquad (3) $$

since there are multiple independent variables (height and frequency) and one dependent variable (s11) to predict, the multiple regression model is given by:

$$ f(x) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p \qquad (4) $$

where f(x) is the predicted dependent variable, β0, ..., βp are the coefficients of the model and x1, ..., xp are the independent variables. the coefficients are chosen to ensure maximal prediction of the dependent variable from the set of independent variables.

fig. 10 multiple regression predictions versus actual s11

this multiple regression model was not well aligned with the dataset, and we obtained an accuracy of only 23% on our test dataset, as can be seen in fig. 10.

3.8.
polynomial regression

polynomial regression is a type of linear regression in which the relationship between the dependent variable y and the independent variable x is modeled as an nth degree polynomial. it fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted by e(y|x). polynomial regression is used for several reasons:

1. all curvilinear relationships include polynomial terms.
2. inspection of residuals. if a linear model is fitted to curved data, a scatter plot of the residuals (y-axis) against the predictor (x-axis) has many positive residuals in the center; consequently, a linear model is not suitable in such cases.
3. an assumption of the usual multiple linear regression analysis is that the independent variables are independent of each other. in a polynomial regression model, this assumption is not fulfilled.

$$ y = a + b_1 x + b_2 x^2 + \dots + b_n x^n + \epsilon \qquad (5) $$

where y is the variable dependent on x, a is the y-intercept and ϵ is the error term. the response value y is computed using the least squares technique. another important point is that polynomial regression is very sensitive to outliers: the presence of even a few outliers in the data of a nonlinear analysis can change the results drastically.

fig. 11 polynomial regression predictions vs actual s11

starting with the 2nd degree polynomial, polynomial regression did not achieve the desired result. it is observed that the accuracy increases when the degree is increased; with this model the accuracy reaches up to 62%. the actual values and the values predicted using polynomial regression are displayed in fig. 11.

3.9. random forest regression

random forest is a supervised learning algorithm. the "forest" it builds is an ensemble of decision trees [18]. it is an ensemble method based on bagging.
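bagging can be illustrated in a few lines of standard-library python. the sketch below is not the random forest implementation used in this work: it replaces full decision trees with one-split regression trees ("stumps"), uses a single feature, and runs on a hypothetical step-shaped data set. it only shows the two ingredients of the method, bootstrap samples drawn with replacement and averaging of the individual predictions.

```python
import random
import statistics


def fit_stump(points):
    """Fit a one-split regression tree to (x, y) pairs by minimising the squared error."""
    points = sorted(points)
    best = None
    for i in range(1, len(points)):
        if points[i][0] == points[i - 1][0]:
            continue  # no threshold fits between two equal x values
        left = [y for _, y in points[:i]]
        right = [y for _, y in points[i:]]
        mean_l, mean_r = statistics.fmean(left), statistics.fmean(right)
        error = sum((y - mean_l) ** 2 for y in left) + sum((y - mean_r) ** 2 for y in right)
        if best is None or error < best[0]:
            threshold = (points[i - 1][0] + points[i][0]) / 2.0
            best = (error, threshold, mean_l, mean_r)
    if best is None:  # all x values equal: fall back to a constant prediction
        mean = statistics.fmean(y for _, y in points)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, mean_l, mean_r = stump
    return mean_l if x <= threshold else mean_r


def fit_bagged(points, n_estimators=25, seed=0):
    """Train every stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(points, k=len(points))) for _ in range(n_estimators)]


def predict_bagged(stumps, x):
    """For regression, the ensemble prediction is the average over all estimators."""
    return statistics.fmean(predict_stump(stump, x) for stump in stumps)


# Hypothetical data: the response drops from -5 to -20 at x = 3.
points = [(2.0 + 0.1 * i, -5.0 if i < 10 else -20.0) for i in range(21)]
stumps = fit_bagged(points)
print(predict_bagged(stumps, 2.2), predict_bagged(stumps, 3.8))
```

a full random forest additionally grows deep trees and considers a random subset of the features at each split, which this sketch leaves out.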
the classifier works in the following way:

1. given a data set d, the classifier first creates k bootstrap samples of d, each of which is denoted by di.
2. each di has the same number of tuples as d, sampled with replacement from d.
3. sampling with replacement means that some of the original tuples of d may not be included in di, whereas others may occur more than once.

the classifier then builds a decision tree based on each di. as a result, a "forest" that consists of k decision trees is formed. to classify an unknown tuple x, every tree returns its class prediction, which counts as a single vote. the final class of x is the one that receives the most votes. for regression, as in this work, the predictions of the individual trees are averaged instead. the working of random forest regression is portrayed in fig. 12.

fig. 12 random forest regression

for its tree induction this project uses the gini index. for d, the gini index is computed as:

$$ \mathrm{gini}(d) = 1 - \sum_{i=1}^{m} p_i^2 \qquad (6) $$

where pi is the probability that a tuple in d belongs to class ci. the gini index measures the impurity of d. the lower the index value, the better d was partitioned.

fig. 13 random forest regression predictions vs actual s11

random forest regression was tried with different numbers of estimators, and after multiple training runs the results were very good for the provided dataset, with an achieved accuracy of 97%. this model also has the best fit of all the models that were tried on the given dataset, as can be seen in fig. 13.

4. conclusion

the work presented in this paper is implemented in python. on analyzing the dataset, random forest regression gave the highest accuracy of 97%, and the polynomial regression algorithm 62%.
the multiple regression model was not a good fit for the dataset, and we obtained an accuracy of only 23% on our test dataset. multiple regression therefore has the worst results among all the models used in this paper. machine learning proves to be a good option for the optimization of antenna parameters and for predicting the variation of s11 with different values of height and frequency. it saves a lot of the time and material that are wasted in traditional design. with these ml models, predictions close to the real values (obtained from hfss) were made. the ml models are further supported by experimental findings. the proposed antenna operates between 2 and 5 ghz. the optimized design validates its suitability for application in hybrid dras by exhibiting stable radiation characteristics within the operating frequency range.

references

[1] q. wu, y. cao, h. wang and w. hong, "machine-learning-assisted optimization and its application to antenna designs: opportunities and challenges", china commun., vol. 17, pp. 152-164, 2020.
[2] g. k. uyanik and n. guler, "a study on multiple linear regression analysis", in proceedings of the 4th international conference on new horizons in education, 2013, pp. 1-6.
[3] k. c. lee, "application of neural network and its extension of derivative to scattering from a nonlinearly loaded antenna", ieee trans. antennas propag., vol. 55, pp. 990-993, 2007.
[4] k. c. lee and t. n. lin, "application of neural network to analyses of nonlinearly loaded antenna arrays including mutual coupling effects", ieee trans. antennas propag., vol. 53, pp. 1126-1132, 2005.
[5] y. rahmat-samii, j. m. kovitz and h. rajagopalan, "nature-inspired optimization techniques in communication antenna designs", proc. ieee, vol. 100, pp. 2132-2144, 2012.
[6] w.-q. wang, h. shao and j. cai, "mimo antenna array design with polynomial factorization", int. j. antennas propag., vol. 2013, p. 358413, 2013.
[7] a. sharma, g. das, s. gupta and r. k.
gangwar, "quad-band quad-sense circularly polarized dielectric resonator antenna for gps/cnss/wlan/wimax applications", ieee antennas propag. mag., vol. 60, pp. 57-65, 2018.
[8] a. gupta and r. k. gangwar, "hybrid rectangular dielectric resonator antenna for multiband applications", iete tech. rev., vol. 37, pp. 83-90, 2020.
[9] a. sharma, p. ranjan and sikandar, "dual band ring shaped dielectric resonator based radiator with left and right handed sense circularly polarized features", iete tech. rev., vol. 38, pp. 511-519, 2020.
[10] a. k. dwivedi, a. sharma and p. ranjan, "dual-band modified rectangular shaped dielectric resonator antenna with diversified polarization feature", int. j. circuit theory appl., vol. 49, pp. 3434-3442, 2021.
[11] t. suryakanthi, "evaluating the impact of gini index and information gain on classification using decision tree classifier algorithm", int. j. adv. comput. sci. appl., vol. 11, pp. 612-619, 2020.
[12] y. sharma, h. h. zhang and h. xin, "machine learning techniques for optimizing design of double t-shaped monopole antenna", ieee trans. antennas propag., vol. 68, pp. 5658-5663, 2020.
[13] j. gao, y. tian and x. chen, "antenna optimization based on co-training algorithm of gaussian process and support vector machine", ieee access, vol. 8, pp. 211380-211390, 2020.
[14] j. p. jacobs, "efficient resonant frequency modeling for dual-band microstrip antennas by gaussian process regression", ieee antennas wirel. propag. lett., vol. 14, pp. 337-341, 2014.
[15] p. burrascano, s. fiori and m. mongiardo, "a review of artificial neural networks applications in microwave computer-aided design", int. j. rf microw. c. e., invited article, vol. 9, pp. 158-174, 1999.
[16] g. min and y. feng, "calculation of the characteristic impedance of tem horn antenna using support vector machine", in proceedings of the international conference on microwave and millimeter wave technology, 2010, pp. 895-897.
[17] j.
gao, y. tian, x. zheng, x. chen and m. mrugalski, "resonant frequency modeling of microwave antennas using gaussian process based on semisupervised learning", complexity, vol. 2020, p. 3485469, 2020.
[18] p. ranjan, a. maurya, h. gupta, s. yadav and a. sharma, "ultra-wideband cpw fed band-notched monopole antenna optimization using machine learning", prog. electromagn. res. m, vol. 108, pp. 27-39, 2022.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 77-89 https://doi.org/10.2298/fuee2301077j © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

frequency analysis of the typical impulse voltage and current waveshapes of test generators

vesna javor

department of power engineering, faculty of electronic engineering, university of niš, serbia

abstract. frequency analysis of the impulse waveshapes of generators which are commonly used for testing of the equipment in high-voltage engineering is presented in this paper. some of the typical impulse waveshapes, such as 1.2/50 µs/µs, 10/350 µs/µs, 10/700 µs/µs, 10/1000 µs/µs, and 250/2500 µs/µs, are approximated by the double-exponential function (dexp) and by the terms of the multi-peaked analytically extended function (mp-aef). experimental setups for impulse signal generation are based on the desired outputs as given in the iec 60060-1 standard. damped oscillations are characteristic of the standardized 8/20 µs/µs waveshape. the positive part of the normalized sinc function with damped oscillations is also approximated by mp-aef terms. the corresponding frequency spectra of these aperiodic signals are obtained analytically by using the piecewise fourier transform (pwft). this paper presents the procedure to obtain fourier transforms of the functions with multiple and sharp peaks typical for the impulse current and voltage test generators' waveshapes.
key words: fourier transform, high-voltage technique, standard impulse waveshapes, test generators

received june 13, 2022; revised july 22, 2022; accepted august 08, 2022
corresponding author: vesna javor, department of power engineering, faculty of electronic engineering, university of niš
e-mail: vesna.javor@elfak.ni.ac.rs

1. introduction

for the testing of equipment in high-voltage technique, the generators have to produce the defined waveshapes as given in the relevant standards, and the testing waveshapes have to be repeatable within the tolerances which are also given in these standards. frequency analysis of such waveshapes is important for the study of their effects on the tested equipment. it is also important for the choice and use of measuring instruments and their characteristics in the frequency domain (their frequency response and bandwidth). such analysis is significant for the design of test generators [1].

the fourier transform (ft) of aperiodic signals results in continuous spectra over the frequency domain, whereas periodic signals have discrete spectra, and their amplitudes at each frequency represent the strength of the signal at that frequency. however, impulse voltage and current testing signals are also called energy signals, as they have finite energy in the time domain. they are characterized by the energy spectral density, and their energy is proportional to the square of the signal integrated over the time domain. according to parseval's relation, the same value is obtained if the square of the fourier transform amplitude is integrated over the frequency domain [2].

if an impulse function is approximated by e.g. the double-exponential function (dexp) [3]-[4], it may be further replaced by the terms of the multi-peaked analytically extended function (mp-aef) [5], and its fourier transform may be obtained analytically by using the piecewise fourier transform (pwft) [6].
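parseval's relation can be checked numerically for a double-exponential signal. the sketch below is an illustration only and does not use the pwft procedure of this paper: it takes the unnormalized signal f(t) = exp(-αt) - exp(-βt) for t ≥ 0, whose fourier transform is 1/(α + j2πf) - 1/(β + j2πf), and compares the closed-form time-domain energy with a trapezoidal integral of the squared spectrum. the values of α and β are those listed in table 1 for the 1.2/50 µs/µs waveshape; the frequency grid and all function names are assumptions made here.

```python
import math

# alpha and beta (1/s) of the 1.2/50 us/us row of table 1.
ALPHA = 14732.18
BETA = 2080312.7


def energy_time_domain(alpha, beta):
    """Closed-form integral of (exp(-alpha*t) - exp(-beta*t))**2 over t >= 0."""
    return 1.0 / (2.0 * alpha) - 2.0 / (alpha + beta) + 1.0 / (2.0 * beta)


def spectrum_squared(f, alpha, beta):
    """|F(f)|^2 for F(f) = 1/(alpha + j*2*pi*f) - 1/(beta + j*2*pi*f)."""
    jw = 1j * 2.0 * math.pi * f
    return abs(1.0 / (alpha + jw) - 1.0 / (beta + jw)) ** 2


def energy_frequency_domain(alpha, beta, f_max=1e10, points_per_decade=2000):
    """Trapezoidal integral of |F(f)|^2 over all frequencies.

    The grid is f = 0 followed by logarithmically spaced points from 0.01 Hz to f_max.
    """
    n_points = int((math.log10(f_max) + 2.0) * points_per_decade)
    grid = [0.0] + [10.0 ** (-2.0 + i / points_per_decade) for i in range(n_points + 1)]
    values = [spectrum_squared(f, alpha, beta) for f in grid]
    one_sided = sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )
    return 2.0 * one_sided  # |F(f)|^2 is even in f


e_time = energy_time_domain(ALPHA, BETA)
e_freq = energy_frequency_domain(ALPHA, BETA)
print(f"time domain: {e_time:.6e}, frequency domain: {e_freq:.6e}")
```

the logarithmic grid is chosen because the spectrum changes over many decades of frequency, so a uniform grid would need far more points for the same accuracy.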
there are various applications of exponential functions for the approximation of waveshapes, and it is shown in this paper that mp-aef may be used equally well without introducing any error. the analysis is carried out in the following sequence:

▪ the parameters of dexp are obtained so as to approximate the impulse waveshape by using the least squares method (lsqm) [3]-[4], or by using the marquardt method for least squares estimation of nonlinear mp-aef parameters (mlsm) [7];
▪ the frequency spectrum is determined analytically by using pwft [6] applied to the mp-aef terms;
▪ the frequency bandwidth of the signal is determined so as to use these results for further calculations or for the choice of measuring instruments, and possible peaks in the amplitude spectrum are analyzed so as to check the periodicity of the waveshape (if it has damped oscillations).

this procedure is applicable to impulse functions with multiple and sharp peaks which can be well approximated by the terms of mp-aef. some typical impulse waveshapes, such as 1.2/50 µs/µs, 10/350 µs/µs, 10/700 µs/µs, 10/1000 µs/µs, and 250/2500 µs/µs, are analyzed in this paper. the frequencies at which the fourier transform amplitudes of these signals decay to a certain percentage of the amplitude at the frequency f = 0 are also given in this paper. this is important for the analysis of induced effects on the equipment under test and for any further calculations in the frequency domain. other waveshapes can be analyzed by the presented procedure, including the 8/20 µs/µs waveshape [8] and oscillatory damped waveshapes of test generators, as they are suitable for approximation by the terms of mp-aef, but not suitable for approximation by dexp. to confirm the procedure on such functions, the results are given in this paper for the fourier transform of the sinc function approximated by a few terms of mp-aef.

2.
typical impulse voltages and currents of the test generators

impulse waveshapes are usually defined by several parameters, such as the peak value um for the voltage or im for the current at the instant tm, and the two time parameters t1 and t2. t1 is the front time, which is the time interval between the instant ta at the point a, when the signal is either 30% (fig. 1) or 10% of the peak value, and the instant tb at the point b, when the signal is 90% of the peak value. t2 is the time interval between the instant ta' when the signal starts at the virtual origin o1 and the instant tk when the signal is half of the peak value (fig. 2). p1 is the saddle point at t1 in the rising part (fig. 1), and p2 is the saddle point at t2 in the decaying part (fig. 2).

fig. 1 rising part of the voltage 1.2/50 µs/µs normalized to the peak value versus time t

fig. 2 decaying part of the voltage 1.2/50 µs/µs normalized to the peak value u(t)/um versus time t

the iec 60060-1:2010 standard [9] gives general definitions and test requirements for the high-voltage test techniques. the standard voltage testing waveshape in high-voltage engineering is defined by t1/t2 = 1.2/50 µs/µs, as given in figs. 1 and 2. if approximated by the dexp function, the parameters of some typical waveshapes are calculated using the least squares method for parameter estimation in [3], as given in table 1, and in [4] for the switching impulse 250/2500 µs/µs approximated by dexp.
table 1 dexp parameters obtained by using the least squares method [3], [4]

waveshape t1/t2     η           α (s^-1)     β (s^-1)
1.2/50 µs/µs        0.95847     14732.18     2080312.7
10/350 µs/µs        0.9511      2121.76      245303.6
10/700 µs/µs        0.97423     1028.39      257923.7
10/1000 µs/µs       0.98135     712.41       262026.6
250/2500 µs/µs      0.9057971   316.95721    16003.329

the dexp function approximating the impulse voltage is given by

$$u(t) = U\,(e^{-\alpha t} - e^{-\beta t}) = \frac{U_m}{\eta}\,(e^{-\alpha t} - e^{-\beta t}) \qquad (1)$$

for $U$ the voltage value that multiplied by the peak correction factor $\eta$ results in the peak voltage value $U_m$, whereas $\alpha$ and $\beta$ are the parameters of the dexp function. the peak correction factor is the function of $\alpha$ and $\beta$,

$$\eta = e^{-\alpha t_m} - e^{-\beta t_m} \qquad (2)$$

for $t_m$ the time instant of the maximum value $U_m$ and

$$t_m = \frac{1}{\beta - \alpha}\,\ln\frac{\beta}{\alpha} \qquad (3)$$

the waveshape 10/350 µs/µs is the lightning current waveshape of the first positive stroke as given in the standard iec 62305 [10]. this function is very important because the positive first stroke has the highest specific energy among lightning discharge types. other impulse lightning discharge currents, such as 0.25/100 µs/µs for the subsequent negative stroke and 1/200 µs/µs for the first negative stroke, are not typical for test generators, but they are used for the analysis of induced voltages in electric power systems due to their short rising times.

the waveshapes of electrostatic discharge (esd) generators are given in [11], and important characteristics of impulse generators are listed in [12]. a simplified scheme of the circuit to produce an esd impulse for the testing of devices is given in fig. 3, for cd the distributed capacitance which exists between the generator and its surroundings, cd + cs of the typical value of 150 pF, rd of the typical value of 330 Ω, and rc of the typical value between 50 MΩ and 100 MΩ, as given in [9].

fig. 3 simplified scheme of the circuit to produce esd impulse for testing according to [9]

experimental set-ups and realization of impulse generators are discussed in [13]-[15]. various realizations of impulse generators for testing in high voltage engineering are given in [16]-[18]. real-time conditions are the reason for defining intervals for typical parameters of the waveshapes in relevant standards, so that repeatability of testing and experiments can be achieved and compliance with the standardized values proved.

3. numerical examples

3.1. multi-peaked analytically extended function (mp-aef) and its fourier transform

the mp-aef function [5] term is given by

$$x(t) = c_1 + (c_2 - c_1)\left[\frac{t - t_b}{t_e - t_b}\exp\left(1 - \frac{t - t_b}{t_e - t_b}\right)\right]^a = c_1 + (c_2 - c_1)\left[\left(\frac{t}{t_e - t_b} - \frac{t_b}{t_e - t_b}\right)\exp\left(1 - \frac{t}{t_e - t_b} + \frac{t_b}{t_e - t_b}\right)\right]^a = c_1 + (c_2 - c_1)\left[(d_1 t + d_2)\exp(1 - d_1 t - d_2)\right]^a \qquad (4)$$

for the parameter $a$ and coefficients $c_1$, $c_2$, $d_1 = (t_e - t_b)^{-1}$ and $d_2 = -t_b (t_e - t_b)^{-1} = -t_b d_1$. $c_1$ is the function value at the beginning $t_b$ of the interval in which the approximation term is used, so that $x(t_b) = c_1$, and $c_2$ is the function value at the end $t_e$ of that interval, so that $x(t_e) = c_2$. the dexp function (1) may be replaced by four terms (4) using the transformation

$$x(t) = x_m[\exp(-\alpha t) - \exp(-\beta t)] = x_m[(\alpha t + 1)\exp(-\alpha t) - \alpha t \exp(-1)\exp(1 - \alpha t) - (\beta t + 1)\exp(-\beta t) + \beta t \exp(-1)\exp(1 - \beta t)] \qquad (5)$$

fourier transform of each term (4) is obtained analytically, for $c_1 \neq 0$, as

$$Y(p) = \frac{c_1}{p} + \frac{(c_2 - c_1)\exp(a + p\,d_2/d_1)}{d_1\,(a + p/d_1)^{a+1}}\;\Gamma[a+1, z_1, z_2] \qquad (6)$$

for the gamma function defined by

$$\Gamma[a+1, z_1, z_2] = \int_{z_1}^{z_2} t^{a} \exp(-t)\, dt \qquad (7)$$

with the arguments $a+1$, $z_1 = (d_1 t_1 + d_2)(a + p/d_1)$, and $z_2 = (d_1 t_2 + d_2)(a + p/d_1)$.

3.2.
fourier transform of the impulse voltage 1.2/50 µs/µs waveshape

the approximation of any impulse function with one peak may be also done by just two mp-aef terms, so that the impulse voltage 1.2/50 µs/µs may be approximated by

$$u(t) = \begin{cases} u_1(t) = U_m\left[\dfrac{t}{t_m}\exp\left(1 - \dfrac{t}{t_m}\right)\right]^a, & 0 \le t \le t_m \\[2ex] u_2(t) = U_m\left[\dfrac{t}{t_m}\exp\left(1 - \dfrac{t}{t_m}\right)\right]^b, & t_m \le t \end{cases} \qquad (8)$$

for $a$ and $b$ the parameters of the two mp-aef terms $u_1(t)$ and $u_2(t)$, respectively. the first term is the same as (4) for $c_1 = 0$, $c_2 = U_m$, $t_e = t_m$, $t_b = 0$, $d_1 = t_m^{-1}$ and $d_2 = 0$. this results in the piece-wise function (6) for $t_m$ = 1.9 µs of the peak value and parameters a = 4 and b = 0.03126. the rising part of the function $u_1(t)$ represents the rising part of $u(t)$ and is given by the blue line in fig. 4. the decaying part of $u_2(t)$ represents the decaying part of $u(t)$ and is given by the red line in fig. 4.

fig. 4 impulse voltage waveshape 1.2/50 µs/µs normalized to the peak value versus time t, approximated by the two mp-aef terms u1(t)/um (blue line) and u2(t)/um (red line).

the result for the fourier transform of the function 1.2/50 µs/µs is given in fig. 5. it can be noticed that the amplitude of the fourier transform of that waveshape is approximately constant up to the frequency of 1 kHz. at the frequency f1 ≈ 4 kHz the amplitude is approximately half of the value at low frequencies. at the frequency f2 ≈ 200 kHz the amplitude is approximately 1% of the value at low frequencies.

fig. 5 amplitudes of the fourier transform of the impulse 1.2/50 µs/µs versus frequency f.

3.3. fourier transforms of 10/350 µs/µs, 10/700 µs/µs and 10/1000 µs/µs waveshapes

standard lightning current impulse i(t)/im of the first positive stroke is defined by 10/350 µs/µs and approximated by mp-aef terms. its fourier transform is presented in fig. 6.
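as a rough cross-check of the 1.2/50 µs/µs figures quoted in section 3.2, the dexp form (1) with the table 1 parameters can be evaluated directly. the sketch below is not the pwft procedure of the paper: it uses the closed-form transform of a one-sided double exponential, whose amplitude relative to f = 0 is αβ / (|α + jω| |β + jω|). the function names and the bisection search are assumptions made for illustration.

```python
import math

# Table 1 DEXP parameters for the 1.2/50 us/us impulse (alpha and beta in 1/s).
ALPHA = 14732.18
BETA = 2080312.7


def peak_time(alpha, beta):
    """Time of the DEXP maximum, equation (3)."""
    return math.log(beta / alpha) / (beta - alpha)


def peak_correction(alpha, beta):
    """Peak correction factor eta, equation (2)."""
    tm = peak_time(alpha, beta)
    return math.exp(-alpha * tm) - math.exp(-beta * tm)


def relative_amplitude(f, alpha, beta):
    """|FT(f)| / |FT(0)| of exp(-alpha*t) - exp(-beta*t), t >= 0."""
    w = 2.0 * math.pi * f
    return alpha * beta / (math.hypot(alpha, w) * math.hypot(beta, w))


def frequency_at_level(level, alpha, beta, f_lo=1.0, f_hi=1e9):
    """Bisection on a log scale for the frequency where the relative
    amplitude (monotonically decreasing in f) equals `level`."""
    for _ in range(200):
        f_mid = math.sqrt(f_lo * f_hi)
        if relative_amplitude(f_mid, alpha, beta) > level:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)


eta = peak_correction(ALPHA, BETA)
f_half = frequency_at_level(0.5, ALPHA, BETA)
f_one_percent = frequency_at_level(0.01, ALPHA, BETA)
print(f"eta = {eta:.5f}")
print(f"half amplitude near {f_half / 1e3:.1f} kHz")
print(f"1% amplitude near {f_one_percent / 1e3:.0f} kHz")
```

under these assumptions the computed η agrees with the table 1 entry, and the two frequencies fall close to the values of about 4 kHz and 200 kHz read from fig. 5.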
the amplitudes of the fourier transform are approximately constant up to the frequency of 100 Hz. at f1 ≈ 200 Hz the amplitude is approximately half of the value at low frequencies, whereas at f2 ≈ 10 kHz the amplitude is approximately 1% of the value at low frequencies.

fourier transforms of the three typical impulse waveshapes are presented in fig. 7 and denoted by a) 10/350 µs/µs, b) 10/700 µs/µs, and c) 10/1000 µs/µs. these results show that the amplitudes of the fourier transforms are approximately constant up to the frequency of 100 Hz for all the three waveshapes. at f1a ≈ 200 Hz, f1b ≈ 300 Hz and f1c ≈ 600 Hz the amplitudes are approximately half of the value at low frequencies, whereas at f2a ≈ 10 kHz, f2b ≈ 15 kHz and f2c ≈ 25 kHz the amplitude is approximately 1% of the value at low frequencies. impulse 10/700 µs/µs represents an open-circuit voltage waveshape, whereas 10/1000 µs/µs may represent either an open-circuit voltage waveshape or a short-circuit current waveshape of the impulse generator, but in this paper all the three waveshapes are analyzed together in fig. 7, so as to notice the influence of the decaying time on the frequency spectrum of the waveshape.

fig. 6 amplitudes of the fourier transform of the impulse 10/350 µs/µs versus frequency f

fig. 7 amplitude spectra of the impulse waveshapes: a) 10/350 µs/µs, b) 10/700 µs/µs, c) 10/1000 µs/µs versus frequency f

results presented in figs. 5, 6 and 7 may be used to estimate the frequency bandwidth of the measuring instruments used in testing of the equipment according to the desired accuracy and relevant standards. for the computation of the induced effects of such signals in the frequency domain, it is enough to take into account frequencies up to 1 MHz.

3.4.
fourier transform of the switching voltage 250/2500 µs/µs waveshape

switching impulse waveshapes are slower in the rising part than the lightning impulse waveshapes, and have a longer total duration. for the impulse t1/t2 = 250/2500 µs/µs the fourier transform is given in fig. 8. the amplitude of the fourier transform is approximately constant up to the frequency of 10 Hz. at f1 ≈ 90 Hz the amplitude is approximately half of the value at low frequencies, whereas at f2 ≈ 3 kHz the amplitude is about 1% of the value at low frequencies.

fig. 8 amplitudes of the fourier transform of the switching impulse 250/2500 µs/µs versus frequency f

3.5. fourier transform of the damped oscillations waveshapes

the standard iec 61000-4-12 impulse current, also denoted as ring wave, is given in fig. 9. pk1 denotes the first peak, pk2 the second, pk3 the third, and pk4 the fourth. only pk1 is specified for the current waveform. t1 is the rise time and T the period of oscillations.

fig. 9 standard 61000-4-12 [8] impulse current waveshape

generation of the 8/20 µs/µs impulse current assumes damped oscillations with the rise time t1 = 8 µs ± 20% and the decaying time t2 = 20 µs ± 20% for the first peak pk1. the advantage of mp-aef over dexp is that it is suitable to approximate waveshapes with multiple peaks as given in fig. 9. the sinc function is also an example of the damped oscillations waveshapes. the normalized sinc function is defined, for $t \neq 0$, as

$$\mathrm{sinc}(t) = \frac{\sin(\pi t)}{\pi t} \qquad (9)$$

this function is presented in fig. 10, for $t \in (0, 6]$ s. it is approximated by six terms of mp-aef given by (4), but for $d_{1i} = t_{mi}^{-1}$ and $d_{2i} = -t_{mi}^{-1}\sum_{k=0}^{i-1} t_{mk}$, so that the terms are given by

$$x_i(t) = c_{1i} + (c_{2i} - c_{1i})\left[\frac{t - \sum_{k=0}^{i-1} t_{mk}}{t_{mi}}\exp\left(1 - \frac{t - \sum_{k=0}^{i-1} t_{mk}}{t_{mi}}\right)\right]^{a} \qquad (10)$$

for $t_{m0} = 0$. parameter a = 3 for all the terms and other parameters are given in table 2.
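equation (10) can be evaluated with the table 2 parameters to see how the six terms join at the extrema of the sinc function. the sketch below is a minimal illustration, not the parameter-estimation procedure of [19]; the function names are assumptions, and the value reached at the end of the first term is taken as negative (-0.21722) on the assumption that it marks the first minimum of sinc.

```python
import math

A = 3  # exponent a, the same for all six terms

# (c1, c2, tm) for each term, from table 2; the negative sign of 0.21722 is
# an assumption made here because the first extremum of sinc is a minimum.
TERMS = [
    (1.0, -0.21722, 1.431),
    (-0.21722, 0.12837, 1.028),
    (0.12837, -0.0913, 1.012),
    (-0.0913, 0.07091, 1.006),
    (0.07091, -0.05797, 1.004),
    (-0.05797, 0.04903, 1.003),
]


def mp_aef(t):
    """Piecewise approximation (10) for 0 <= t <= sum of all tm."""
    t_begin = 0.0
    for c1, c2, tm in TERMS:
        if t <= t_begin + tm:
            tau = (t - t_begin) / tm  # normalized time inside the term
            return c1 + (c2 - c1) * (tau * math.exp(1.0 - tau)) ** A
        t_begin += tm
    raise ValueError("t lies beyond the last term")


def sinc(t):
    """Normalized sinc function, equation (9), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


t_end = 0.0
for _, c2, tm in TERMS:
    t_end += tm
    print(f"t = {t_end:.3f} s  mp-aef = {mp_aef(t_end):+.5f}  sinc = {sinc(t_end):+.5f}")
```

at the end of each term the approximation equals the tabulated c2i by construction, and these values can be compared with the sinc function at the same instants; the agreement between the breakpoints depends on the exponent a and is not checked here.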
the complete procedure for obtaining function parameters is presented in [19]. the amplitude spectrum, i.e. the modulus of the fourier transform of this function, is presented in fig. 11 for f ∈ [0.01 Hz, 10 MHz], and in fig. 12 for f ∈ [0.1 Hz, 1 Hz], so as to notice the peak at f = 0.5 Hz due to T = 2 s (fig. 10). for the damped oscillations of the impulse current waveshape 8/20 µs/µs with the period of about T = 330 µs, the peak in its fourier transform appears at about f = 3 kHz.

fig. 10 sinc function approximated by six mp-aef terms, for t ∈ (0, 6] s

table 2 parameters of the six mp-aef terms

term i   c1i        c2i        tmi (s)
1        1          -0.21722   1.431
2        -0.21722   0.12837    1.028
3        0.12837    -0.0913    1.012
4        -0.0913    0.07091    1.006
5        0.07091    -0.05797   1.004
6        -0.05797   0.04903    1.003

fig. 11 amplitude spectrum of the function from fig. 10 versus frequency f ∈ [0.01 Hz, 10 MHz]

fig. 12 amplitude spectrum of the function from fig. 10 versus frequency f ∈ [0.1 Hz, 1 Hz]

4. conclusion

electrostatic discharge currents and lightning discharge currents are of impulse waveshapes. induced voltages and currents in electric circuits due to fast changing external electromagnetic fields are of impulse waveshapes. fast transients in electric circuits due to switching operations are also of impulse waveshapes. for all these reasons, the testing generators have to produce such waveshapes [20]-[22] in order to check the equipment according to the standards. fourier transforms of typical impulse waveshapes of test generators in high-voltage technique are obtained by using terms of mp-aef. the procedure is suitable for aperiodic functions because the fourier transform of mp-aef terms is obtained analytically by using gamma functions. frequency analysis of the waveshapes 1.2/50 µs/µs, 10/350 µs/µs, 10/700 µs/µs, 10/1000 µs/µs, and 250/2500 µs/µs shows the necessary bandwidths for frequency domain calculations and also for measurements.
the comparison of the three functions with the same rising time, 10/350 µs/µs, 10/700 µs/µs, and 10/1000 µs/µs, is also given in this paper. the procedure is also applied to an oscillatory damped function, as such waveshapes are important for testing of the equipment in high-voltage technique. in future research, other oscillatory damped functions such as 8/20 µs/µs, 4/16 µs/µs and 5/320 µs/µs will be analyzed by using terms of mp-aef to approximate the waveshapes, and afterwards pwft is going to be used to obtain their frequency spectra.

acknowledgement: the paper is a part of the research done within the project no. 451-03-68/2022-14 of the faculty of electronic engineering in niš, supported by the ministry of education, science and technological development of the republic of serbia.

references
[1] i. f. gonos, n. leontides, f. v. topalis, i. a. stathopulos, "analysis and design of an impulse current generator", wseas transactions on circuits and systems, vol. 1, pp. 38-43, january 2002.
[2] k. l. kaiser, electromagnetic compatibility handbook, crc press, 2015, chapter 12, pp. 165.
[3] d. lovrić, s. vujević, t. modrić, "least squares estimation of double-exponential function parameters," in proceedings of the 11th int. conf. пес 2013, niš, serbia, sept. 2013, pp. 1-4.
[4] k. schon, high impulse voltage and current measurement techniques, springer, 2013, chapter 2, pp. 5-9.
[5] v. javor, "multi-peaked functions for representation of lightning channel-base currents", in proceedings of the iclp 31st int. conference on lightning protection iclp 2012, sep. 2-7, 2012, ove vienna, austria, sep. 2012.
[6] v. javor, "piece-wise fourier transform of aperiodic functions," in proceedings of the 21st int. symp. infoteh-jahorina 2022, march 2022, pp. 1-4.
[7] k. lundengård, m. rančić, v. javor, s.
silvestrov, "estimation of parameters for the multipeaked aef current functions", in proceedings of the 16th applied stochastic models and data analysis international conference (asmda) and 4th demographics workshop, piraeus, greece, 30 june – 4 july 2015, pp. 623-636.
[8] iec 61000-4-12 standard, electromagnetic compatibility (emc) – part 4-12: testing and measurement techniques – ring wave immunity test, ed. 3.0, 2017, https://webstore.iec.ch/publication/29872
[9] iec 60060-1:2010 standard, high voltage test techniques – part 1: general definitions and test requirements, ed. 3.0, 2010, https://webstore.iec.ch/publication/300
[10] iec 62305-1 standard, protection against lightning – part 1: general principles, ed. 2.0, 2010-2012.
[11] emc – part 4-2: testing and measurement techniques – electrostatic discharge immunity test, iec int. standard 61000-4-2, ed. 2, 2008-2012.
[12] http://pes-spdc.org/sites/default/files/impulse_generatorsaddedrev3.pdf
[13] z. javid, k.-j. li, k. sun, and a. unbree, "cost effective design of high voltage impulse generator and modeling in matlab", j. electr. eng. technol., vol. 13, no. 3, pp. 1346-1354, 2018.
[14] v. c. vita, g. p. fotis, and l. ekonomou, "parameters' optimization methods for the electrostatic discharge current equation", int. journal of energy, vol. 11, pp. 1-6, 2017.
[15] g. p. fotis, f. e. asimakopoulou, i. f. gonos, and i. a. stathopulos, "applying genetic algorithms for the determination of the parameters of the electrostatic discharge current equation", meas. sci. technol., vol. 17, pp. 2819–2827, 2006.
[16] m. abdel-salam, high-voltage engineering: theory and practice, rev. and expanded. crc press, 2018.
[17] e. kuffel, w. s. zaengl, j. kuffel, high voltage engineering: fundamentals, newnes, 2000, pp. 48-74.
[18] s. mehta, p. basak, k. anelis, a.
paramane, "simulation of single and multistage impulse voltage generator using matlab simulink", in proceedings of the 2018 international conference on computing, power and communication technologies (gucon), ieee, 2018, pp. 641-646.
[19] k. lundengård, m. rančić, v. javor, s. silvestrov, "electrostatic discharge currents representation using the analytically extended function with p peaks by interpolation on a d-optimal design", facta universitatis, series electronics and energetics, vol. 32, no. 1, pp. 25-49, 2019.
[20] y. trotsenko, v. brzhezitsky, o. protsenko, y. haran, "simulation of impulse current generator for testing surge arresters using frequency dependent models", technology audit and production reserves, vol. 1, no. 1 (57), pp. 25-29, 2021.
[21] k. filik, g. karnas, g. masłowski, m. oleksy, r. oliwa, k. bulanda, "testing of conductive carbon fiber reinforced polymer composites using current impulses simulating lightning effects", energies, vol. 14, no. 7899, 2021.
[22] p. ghosh, b. chakraborty, s. dalai, s. chatterjee, "simulation and real-time generation of nonstandard lightning impulse voltage waveforms", in proceedings of the 2022 ieee int. conf. on distributed computing and electrical circuits and electronics (icdcece), 2022, pp. 1-5.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 557-570
https://doi.org/10.2298/fuee2204557s
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n

original scientific paper

event-triggered sliding mode control for constrained networked control systems*

andrej sarjaš, dušan gleich
university of maribor, faculty of electrical engineering and computer science, maribor, slovenia

abstract. the paper describes an event-triggered non-linear control (etnc) approach for constrained networked feedback control systems (nfcs).
the real-time controller execution is implemented based on the event-triggering paradigm. a nonlinear variable structure is used for the controller design. the nonlinear approach is based on the predefined sliding variable defined by the system states with a nonlinear switching function. the system's stability is analyzed with regard to the evolution of the sliding variable. the event-triggered operation of the nonlinear controller is based on the prescribed triggering rule. the stability boundary of the sliding variable is subject to the preselected triggering condition, whose selection is a tradeoff between system performance, network constraints and transmission capabilities. the main focus of the event-triggering approach is lowering the utilization of network resources in the steady-state behavior of the nfcs. the presented approach ensures a non-zero inter-event time of controller execution, which enables scheduling and optimization of the network operation with regard to the network constraints and real-time system performance. the efficiency of the presented method is demonstrated by a comparison with the classical time-triggering approach. real measurements support the results.

key words: event-triggering, networked control system, variable structure control, sliding mode control

1. introduction

networked feedback control systems have been researched extensively over the last two decades [1]. new communication technologies integrated into tiny devices with decent computational capability offer vast remote applications in distributed or decentralized structures. depending on the network structure and the number of connected devices, the implementation of the nfcs is critical. new methods are derived that improve network usage and ensure system performance according to the controller implementation and execution. the paper introduces the nonlinear control law with event-triggering execution.
received march 23, 2022; revised april 29, 2022; accepted june 15, 2022
corresponding author: andrej sarjaš, university of maribor, faculty of electrical engineering and computer science, koroška cesta 46, si-2000 maribor, slovenia
e-mail: andrej.sarjas@um.si
* an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]

sliding mode control (smc) is an effective approach to ensure the prescribed performance of a closed-loop system, despite external disturbances and system uncertainty [1]-[4]. depending on the controller structure, the sliding mode controller is straightforward to implement and does not require much computational time. all controllers in the real-time system are implemented in a discrete form, which results in a hybrid system where the continuous and discrete systems are interconnected [3]-[6]. the most commonly used approach for controller implementation is the sample-and-hold technique, or time-triggering approach (tt). time triggering means that the controller output is updated at equidistant time intervals, also known as the sampling time. such a tt closed-loop system is easier to design, due to the vast number of techniques and approaches developed for time-sampled systems. on the other hand, the tt system requires constant resource utilization and data transmission over the network. the event-triggering (et) approach of a closed-loop system offers an alternative to the tt [7]. in contrast to the tt, in the et system the closed-loop system is updated based on the evaluation of the trigger rule. in other words, the controller is updated when the system states violate the triggering rule, which means that the controller is no longer updated periodically with fixed time intervals.
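the difference between the two update policies can be illustrated on a single decaying signal. the sketch below uses a generic send-on-delta rule (transmit only when the value has moved by at least a threshold since the last transmission); this is an assumption made for illustration and is not the triggering rule derived later in the paper. the signal, the step and the threshold are likewise hypothetical.

```python
import math


def count_updates(samples, threshold):
    """Count transmissions when a sample is sent only if it differs from
    the last transmitted value by at least `threshold`."""
    last_sent = samples[0]
    updates = 1  # the first sample is always transmitted
    for value in samples[1:]:
        if abs(value - last_sent) >= threshold:
            last_sent = value
            updates += 1
    return updates


STEP = 0.01                                            # sampling step in seconds
samples = [math.exp(-k * STEP) for k in range(501)]    # 5 s of a decaying state

tt_updates = len(samples)                              # time-triggered: every sample
et_updates = count_updates(samples, threshold=0.05)    # event-triggered
print(f"time-triggered updates:  {tt_updates}")
print(f"event-triggered updates: {et_updates}")
```

once the signal settles, the event-triggered count stops growing while the time-triggered count keeps increasing with every sampling instant, which is the saving in network traffic that motivates the et approach.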
such an implementation of the controller is more efficient than the tt implementation and requires fewer computational resources, especially when the sliding manifold is reached. for this reason, et is beneficial for the networked control system (ncs), where the trigger mechanism reduces network transmission and is suitable for systems with data-rate constraints [8]-[10]. network constraints such as variable round trip time (rtt), limited data transmission, and package drops are unfavorable for the ncs [11], [12]. these network parameters reduce system performance considerably and can lead to unstable operation. the presented work introduces an smc controller design with an associated triggering rule, which ensures ncs stability and takes all the network parameters into account during the design procedure. the derived event-triggered sliding mode controller (et-smc) introduces triggering boundaries with regard to the admissible lower inter-event time value and network delay [13], [14]. the et-smc design for ncs is divided into two steps. the first step introduces an smc controller design with preselected system dynamics and parametrized sliding variables [15]-[17]. the second step involves the selection of the triggering boundary with regard to the system tracking performance and robustness to ncs uncertainty. in comparison to similar linear et paradigms, the presented approach still ensures the smc properties and effectively lowers the computational burden and network usage. the controller parameter selection can be presented as an optimization procedure. the optimal parameter selection can be evaluated as a tradeoff between network utilization with regard to ncs uncertainties and closed-loop performance, such as tracking capability, transient performance, network delay, etc.
the assessment of the admissible lower interevent time of the et shows the direct influence of the et-controller on the network utilization during the reaching and sliding phase of the sliding variable evolution. the efficiency of the proposed controller is evaluated on a real-time system. event-triggered sliding mode control for constrained networked control system 559 2. sliding mode controller design for the sliding mode controller (smc) synthesis, the given system is used, 1 2 2 2 , x x x bx gv d = = − + + (1) where 2 1 2 ( ) [ ( ) ( )] t x t x t x t=  is a state vector and ( )v t  is the control variable. the parameters :g → and :b → are system parameters, where :d → is a disturbance. for smc design, the boundary of the system parameters are given, max 0 b b  , min max g g g  , min max max 0/[ , , ]g g b   . for system tracking capability, new system states are introduced, 1 1d x x = − , 2 2d x x = − , where d x is the desired value with its derivative dx . the transformed system is given as, 1 2 2 2 ,b gv d     = = − − + (2) where 1 2[ , ] t   = , d d d d x bx= − + + and holds ( )0supt dd t     . the sliding variable is designed as s c= for 2c , where 1 1[ 1], 0c c c=  . differentiating of s c= with respect to time gives, 2 1 1 1 2 ( ) 0. s c c b gv d    = + = = − − +  (3) regarding (3), d   and the sliding property, which brings the sliding variable to the sliding manifold, , 0s s = the smc controller can select as, 1 1 2 (( ) ( )),v g c b sign s  − = − + (4) where  > d holds. after the smc controller design (4), the et mechanism will be introduced in the next section. the controller (4) contains a nonlinear term, the solution of the feedback system (2),(3) with controller (4) is understood in the filippov sense [18]. 3. event-triggered sliding mode control for ncs the event-triggered rule derivation is based on the analysis of the reaching phase stability of the sliding variable [2]. 
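before the triggering rule is introduced, the continuous controller (4) of section 2 can be tried out numerically on the error system (2). the sketch below is a plain time-triggered euler simulation, not the event-triggered scheme of the paper. the names `xi1`, `xi2` (tracking-error states) and `rho` (switching gain) are chosen here for illustration, and all numerical values (b, g, c1, the gain, the disturbance and the integration step) are assumptions.

```python
import math

# Illustrative values only (assumptions, not taken from the paper).
B, G = 1.0, 2.0   # plant parameters b and g
C1 = 3.0          # sliding-variable slope, c1 > 0
RHO = 1.0         # switching gain, larger than the disturbance bound
D_MAX = 0.2       # bound of the lumped disturbance
H = 1e-3          # Euler integration step in seconds


def sign(x):
    """Signum function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def smc(xi2, s):
    """Controller (4): v = ((c1 - b) * xi2 + rho * sign(s)) / g."""
    return ((C1 - B) * xi2 + RHO * sign(s)) / G


def simulate(t_end=10.0, xi1=1.0, xi2=0.0):
    """Euler simulation of the error system (2); returns final (xi1, xi2, s)."""
    for k in range(int(round(t_end / H))):
        s = C1 * xi1 + xi2                 # sliding variable s = c1*xi1 + xi2
        v = smc(xi2, s)
        d = D_MAX * math.sin(k * H)        # bounded disturbance
        dxi1 = xi2
        dxi2 = -B * xi2 - G * v + d
        xi1, xi2 = xi1 + H * dxi1, xi2 + H * dxi2
    return xi1, xi2, C1 * xi1 + xi2


xi1_end, xi2_end, s_end = simulate()
print(f"xi1 = {xi1_end:.5f}, s = {s_end:.5f}")
```

with these values the sliding variable obeys ds/dt = -rho*sign(s) + d, so it reaches a narrow band around s = 0 whose width is set by the integration step, after which the tracking error decays at the rate fixed by c1. this step-dependent band is the quasi-sliding behaviour discussed in the text that follows.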
it is worthy of mentioning that the discrete implementation of smc can not reach a sliding manifold completely. as a result, the quasisliding mode is obtained [16], [19], where the sliding variable is limited with boundary , where  it is subject to the sampling time, sliding parameter, and disturbance d . furthermore, the presented work is limited to the et approach, where the band  will be determined regarding the trigger mechanism and preselected inter-event time. the et-smc after two consecutive updates is given as ( ) ( ) ( ) ( )( )( )1 1 2 ,et n nv t g c b t sign s t  − = − + (5) where tn is the last update, t is the current time between two updates, and is t  [tn,tn+1). ,s +     560 a. sarjaš, d. gleich theorem 1: consider system (2) with the sliding manifold s = 0. the parameter  is given so that ( )1 1 2( ) ( ) ( ) ( ( )) ,et n nv t g c b t sign s t  − = − + (6) for all t > 0, where 2 2 2( ) ( ) ( )ne t t t = − . the event triggering is established if the controller gain is selected as d   +  (7) where holds 0  . proof: before continuing to prove, the remaining et error variables are introduced, e1(t) = 1(t) − 1(tn), and e(t) = (t) − (tn). for the stability test, the lyapunov function is presented v(t) = s(t)2/2 for the time interval t  [tn,tn+1), where 0n  . differentiation v with respect to time t the derivative v is given as 1 2 (( ) ). et v ss s c b gv d= = − − + (8) substituting the controller (5) in (8) gives 1 2 1 2 1 2 1 2 2 (( ) ( ) ( ) ( )) (( ) ( ) ( ) ( ) ( ( )) ( )) (( )( ( ) ( )) ( ( )) ( )) et n n n n n v s c b t gv t d t s c b t c b t sign s t d t s c b t t sign s t d t        = − − + = − − − − + = − − − + ( ) 2 1 2 2 (t) 1 2 ( ) ( ( ) ( )) ( ) ( ) , n d e d d d s c b t t s s s c b e t s s s s s s s           − − − +   − − +   − +   − − −   − where  > 0. 
concerning the condition (7) and assumption sign(s(tn)) = sign(s(t)), it is to be noted that the sliding variable is approaching the sliding manifold, where s = 0. the above is true if at the time of triggering t = tn holds e2(tn) = e2(t) = 0, then the sliding variable s is bounded with , where, 2 1 ( ) ( ) ( ) ( ) , ( ) n n s t s t c t c t c e c k e c k k c b     − = − =   = − (9) regarding 2e e and 2k e e= . the parameter k is defined as ( ) 2 2 1 2 1 c b k   − = + and  is an upper limit of the 0 1sup ( )t e t     . the boundary  is defined as  = { , }s c k   =  , where the triggering rule in (6) can be defined as, 1 2 1 ( ) ( )e t c b −  − , (10) which is the end of the proof. event-triggered sliding mode control for constrained networked control system 561 the stability of the remaining system in (2) with controller (5) needs to be assessed after the stability analysis of the sliding variable with triggering condition. regarding the reaching phase boundary (9), it can be derived 2 1 1s c = − , where 1 1 1s c = − holds. with the introduction of the lyapunov function 2 1 / 2v = , the stability can be assessed as, 1 1 1 1 1 1 1 1 1 ( ) 1 , v s c c s c       = = −   = − −    with respect to conditions (6),(9), the system is stable if it holds that 1 1 1 0c s − −  . thus, the closed-loop system is stable with respect to s, and the system trajectory 1 is bounded by 1 1 1 . ( ) k c c c b   − (11) 4. event-triggered sliding mode control for ncs the structure of the network control system is depicted in fig. 1. the controller algorithm is executed on the network computer, where the triggering rule is evaluated on the plant. we assume that the plant has a real-time system with computational ability and communication interfaces. the real-time system on the plant side is used for noncomplex computation such as triggering condition evaluation, signal conditioning, and communication capability. 
the user datagram protocol (udp) is used for the given et-smc implementation. the data have been transmitted over different network hops, where additional time delay and package loss may occur. the package loss in the network is modeled as a loss delay [12], [13], where the maximal allowed round trip time (rtt) of the network is used for package loss detection. the plant side uses a dedicated package-loss timer, and if the watchdog timer is expired, then the request for new data is demanded from the server. we assume that two consecutive losses can not be accrued for the package loss occurrence. network network smc plant server data flow fig. 1 networked controller structure with et-smc 562 a. sarjaš, d. gleich the controller feedback structure is presented in fig. 2, where the triggering condition determines the network usage. the controller (5) is implemented on the server, and the triggering mechanism is on the plant side. plantsmc u xexd triggering condition network network server fig. 2 et-smc feedback configuration the inter-event time of the et-smc is determined regarding the error analysis of the two consecutive sampled states, 1 1 1 2 2 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) n n t t td d d d e t e t t t tdt dt dt dt       −     = =    −    , (12) where ( ) 0 n t = , according to the last update. substitute (12) in (2), (5) which gives 0 1 0 0 0 ( ) ( ) ( ) ( ) 0 0 1 et n d e t t d t v t b gdt         + −      −      , 1 0 1 0 0 0 ( ) ( ) ( ( )) 0 0 1 1 cdc n bba t d t sign s t c         = + −      −      ( ( ) ( )) ( ( )) ( ) ( ) ( ) . 
c n c n d c c n c d d a e t t b sign s t b d t a e t a t b b     = + − +  + + +  the solution of the differential equations is ( )( ) ( ) ( 1)c n a t tc n c d d c a t b b e t e a   −+ +   − , (13) where the minimal inter-event time  = t − tn is determined as ( ) min 1 1 ln 1 ( ) ( ) c c c n c d d k a a c b a t b b        +   − + +    (14) it can be seen that the inter-event time depends on triggering condition  and selected controller parameters c1 and . regarding the uncertainty of the network, the delay n is introduced with the update time tn. the update sequence 0{ }n n nt   = + corresponds to the event-triggered sliding mode control for constrained networked control system 563 update time tn and means that the controller is not updated with the last states, wherein the inter-event time is extended by delay value n. hence the error (13) grows till the next update time tn+1. the triggering sequence is admissible, regarding if 1 0,n n nt t n+  +  and the triggering rules (6),(10) ensure system stability. the derivation of the delay boundary, where the triggering rule ensures the system stability, is similar to the derivation of the inter-event time in (13),(14). for a given derivation, we assumed that the controller (5) at the time t  [tn, tn + n) is not updated with the current state (tn), whereby the further updates are executed at t  [tn + n, tn+1 + n+1), and the analysis involves the controller structure with past value v(t) = g−1((c1 − b)2(tn−1)+ sign(s(tn−1))). the admissible interevent time is caused by the delay, which ensures that the system stability with triggering condition (10) is, ( ) ( ) ( )( ) n 1 1 1 ln 1 ( ) c c c n n c d d k a a c b a t t b b      −    = +  − + + +    (15) the system is stable, and the boundary (11) is preserved if n  n it holds. for proper parameter selection, it is necessary to assume the maximally allowed delay in the network. 
the delay boundary is given as 0supn n      . the network structure and the communication protocol are designed after the derivation of the crucial parameters for the event-triggering implementation. the focal point of the network system is a protocol that ensures simple transmission and minimal package loss with low rtt. all transmitted data must be transparent to the server and the client, and the measured data must not be ambiguous. the designed protocol enables package-loss detection and possible adaptation of the controller execution in a classical tt or et manner. the package-loss algorithm is essential for controller output recovery. if a package loss is detected or the rtt timer reaches the threshold, the controller output must be updated; otherwise, the closed-loop system runs in open loop. the update can be done with a new data transmission request from the server or with an internal model-based approach. resending the request is a straightforward way to update the controller, whereas the model-based approach is more complex and advanced. in the model-based approach, the system data are obtained from the model or from system approximation algorithms such as fuzzy sets and neural networks. the model-based approach requires more computational resources on the server or the client side, but it can ensure faster output recovery than sending a new transmission request. depending on the available computational resources, the model-based update can act as a redundant system in the case of irregularities on the network or system. the structure of the designed protocol for the client communication is presented in fig. 3.

fig. 3 the communication protocol of the client message (fields: ids, rtts, rttc, data1, data2, ..., datan, crc, #)

the ids represents the address of the server, which is the main system of the ncs. tags rtts and rttc are timing data of the network rtt, one on the server side and the other on the client side. 
both sides measure their own rtt: the server's rtts is the time from server send to server receive, and the client's rttc is measured analogously from client send to client receive. package loss and network irregularities can be detected by comparing rttc and rtts. tags data1,2,...,n are the transmitted states of the system. the estimation and detection of network irregularities through different measured parameters are not the main objective of the presented work and are not discussed further. all additional parameters of the protocol that are not directly involved in the ncs operation are starting points for further research on network quality and reliability assessment. the message is concluded with a cyclic redundancy check (crc) and the delimiter #. the response message from the server to the client is presented in fig. 4.

fig. 4 the communication protocol of the server message (fields: idc, rtts, rttc, cont1, cont2, ..., crc, #)

the idc represents the client address, whereas rtt, crc and # are the same parameters as in the client message presented in fig. 3. the tags cont1,2,... are the controller update values. all time values and data are represented in 4-byte float format. the id and crc are represented as 32-bit integer values. the length of the message is determined by the number of transmitted system states (data), whereas the id, rtts, rttc, crc, and # values are mandatory and are the control parameters of the protocol. regarding the employed protocol with rtt measurement on the server and client side, it is necessary to acknowledge the possible network uncertainty. the network uncertainty can be represented as a network delay, where the information takes time to travel from the sender to the receiver over different network hops. 
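the message layouts of figs. 3 and 4 can be illustrated with a minimal python sketch. it is not the authors' implementation: the little-endian byte order, the use of crc-32 from zlib, and the function names pack_message and unpack_message are assumptions made only for this example. the field sizes (32-bit integer id and crc, 4-byte floats for all times and data, trailing # delimiter) follow the description above.

```python
import struct
import zlib

DELIMITER = b"#"                 # message terminator, as in figs. 3 and 4
HEADER = struct.Struct("<I2f")   # id (uint32), rtts, rttc (float32); byte order assumed
CRC = struct.Struct("<I")        # crc as uint32; CRC-32 from zlib is an assumption


def pack_message(node_id, rtt_s, rtt_c, values):
    """Build id | rtts | rttc | value1..valueN | crc | '#'."""
    body = HEADER.pack(node_id, rtt_s, rtt_c)
    body += struct.pack(f"<{len(values)}f", *values)
    return body + CRC.pack(zlib.crc32(body)) + DELIMITER


def unpack_message(message):
    """Return (node_id, rtt_s, rtt_c, values); raise ValueError if damaged."""
    min_len = HEADER.size + CRC.size + len(DELIMITER)
    if len(message) < min_len or message[-1:] != DELIMITER:
        raise ValueError("missing delimiter or message too short")
    body, crc_bytes = message[:-5], message[-5:-1]
    payload = body[HEADER.size:]
    if len(payload) % 4 != 0:
        raise ValueError("payload is not a whole number of 4-byte floats")
    (crc,) = CRC.unpack(crc_bytes)
    if crc != zlib.crc32(body):
        raise ValueError("crc mismatch")
    node_id, rtt_s, rtt_c = HEADER.unpack(body[:HEADER.size])
    values = list(struct.unpack(f"<{len(payload) // 4}f", payload))
    return node_id, rtt_s, rtt_c, values
```

the same two functions cover both directions, since the client message carries states (data1..datan) and the server message carries controller updates (cont1, cont2, ...) in the same position. because a udp datagram preserves message boundaries, the sketch parses by length and checks the delimiter only as the last byte; a binary float may itself contain the byte value of #.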
the delay can cause unwanted effects on the feedback system, such as oscillation, slower response, deteriorated disturbance rejection capability, and even unstable operation. the delayed system needs special attention in the controller design. in the proposed approach, the delay is represented as an additional elapsed time after a new update is requested from the client side. the delay causes a longer operation in the unstable region given in (11),(13),(14): the inter-event time (14) is extended, and the permitted state boundary (13) is extended. such a time delay lowers the performance and the tracking capability of the closed-loop system. the system's stability is ensured with the proper selection of the controller gain given in (7). if the time delay is modeled as a parametric uncertainty with a prescribed bound,  then the controller gain selection can be lowered for the admissible delay boundary. d   +  +  (16) besides the network delay, package loss can occur in the network. unlike the network delay, package loss is generally described as information that never arrives at the destination. in the ncs approach, different types of package loss are known: never arrived, out of order, and multiple package arrivals. in the tt-ncs approach, a state observer with a controller on the server side is mainly used to recover the lost packages [8]. in the et technique, the package loss stability criteria can be analyzed with the lyapunov stability function of the reaching phase in the et-smc operation, where the package loss is modeled as the error, ep(t) = (t) − (t), for time t  [tk, tk+1), where 2    . the state (t) represents the last update after the package loss. the number of packages lost is equal to  − 1, so that  = 2 means one lost package. 
the proof of the stability is similar to the proof presented in (8), where the lyapunov function is equal to 2 ( , ) ( , ) / 2v t s t = , and its derivative is 1 2 1 2 2 ( ) 1 (( ) ( ) ( ) ( )) ( ) ( ( ) ( )) ( ( )) ( ) ( ) ( ) ( ) p et e t p d p d p d v s c b t gv t d t s c b t t sign s t d t s c b e t s s s s s s             = − − +    = − − − +      − − +   − +   − − −  regarding the assumption  > n it holds p  . after a consecutive package loss, the system is stable if the controller gain ensures the given condition,  > p + d (17) the 1 trajectory is bounded by 1 1 1 ( ) p p k c c c b   − , (18) where is ( ) 2 2 1 2 1 p p c b k   − = + . after solving the differential equation ( , ) d e t dt  given in (12), the minimal inter-event time is ( ) min 1 1 ln 1 ( ) ( )p p p c c c c d d k a a c b a t b b         +   − + +    (19) it is evident that package loss raises the boundary of the output trajectory 1. if the output boundary needs to be in the prescribed range (11), (18), the controller gain and inter-event time (14), (19) need to be selected at lower values. the closed-loop performance needs to be reduced to ensure higher robustness to the network uncertainties. the package loss can be detected with tts,c measurement on both sides of the network. with the proper selection of the maxtts,c, and delay parameter , the desirable performance of the closed-loop system can be ensured; otherwise, lowered closed-loop performance or unstable behavior can occur.

5. experimental results

the dual servo system is used for the validation of the presented et-ncs approach. the servo system is presented in fig. 5. the client is implemented on the arm® cortex®-m7 based stm32f7xx mcu with an lwip stack for transparent udp communication using the ncs protocol presented in figs. 3 and 4. 
the lwip stack on stm32f7 enables 100base-tx communication speed. the arm embedded system measures the current, velocity, and angle of the servo system and provides actuation to the motor drive with pulse width modulation (pwm) at a frequency of 10khz and a resolution of 4mv/duty. before transmission, all measurements are preprocessed with different signal processing algorithms to ensure the high fidelity and reliability of the measured data. the brushed motors shown in fig. 5 have a maximal velocity of 3500rpm at 24v and a maximal load current of 4a.

fig. 5 real-time system with network socket

the network is composed of an arm embedded system, a router and a pc server. the embedded system issues a request for the controller update, which is sent to the server. the request message structure is defined by the protocol presented in fig. 3. after receiving the client's message, the server calculates the new controller output and prepares the server message back to the client, fig. 4. the network is presented in fig. 6.

fig. 6 ncs-network configuration

the sliding mode controller is implemented with python 3.7. the main component of the python script is a udp server with additional timer interrupt threads for the tt implementation and the rtts measurements. the closed-loop performance for the tt and et implementations is evaluated with the given performance indices,

$rms_w = \sqrt{\frac{1}{n_s}\sum_{k=1}^{n_s} w_k^2}, \quad w \in \{x_1, s, v_{tt}, v_{et}\},$ (20)

1 1 2 1 1 0 2 1 0 for { ( ) } , , 1 for { ( ) } sn v i i i u t c a flag n n u t c a     − − − =   − = =   −  (21)

where ns and ni are the numbers of triggering events for controllers vtt and vet respectively. 
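the two indices can be illustrated with a short python sketch. it is an illustration only, not the script used in the experiment. the rms follows (20). for (21), the sketch assumes that the flag index is the percentage of evaluation periods in which a controller update was transmitted, which is consistent with the value of 100% reported for the tt execution. the function names are hypothetical.

```python
import math


def rms(samples):
    """Root mean square of a logged signal w, as in (20)."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(w * w for w in samples) / len(samples))


def update_flag_percent(update_flags):
    """Share of evaluation periods with a controller update, in percent.

    Assumed reading of (21): each entry is 1 (update sent) or 0 (no update).
    """
    if not update_flags:
        raise ValueError("at least one evaluation period is required")
    updates = sum(1 for flag in update_flags if flag)
    return 100.0 * updates / len(update_flags)


# Example: a TT run updates in every period, an ET run only when triggered.
tt_flags = [1, 1, 1, 1]
et_flags = [1, 0, 0, 1]
tt_share = update_flag_percent(tt_flags)
et_share = update_flag_percent(et_flags)
```

under this reading, the network usage reduction of the et execution relative to the tt execution is simply 100 minus the et flag percentage.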
the controller vtt stands for the tt execution of the controller algorithm presented in (4) as v. the controllers vtt and vet are tested under the same conditions, with equal reference values, a sampling time of 10ms for the tt execution, and a periodic triggering evaluation for the vet execution n  10ms (15). the parameters of the system presented in (1), (2) are b = 3.3, g = 0.897, d = 7.1. the selected controller parameters are c1 = 5.2,  = 16.2,  = 19.7, p = 19.7, tts,c = 11ms. the network performance is presented in fig. 7.

fig. 7 measured rtts and rttc values of the ncs network

the periodic triggering evaluation is selected properly regarding the measured rtt values for server and client trigger = 10ms. in each trigger period, only measured data are examined concerning the triggering boundary . figs. 8 and 9 present the ncs performance of the controller v = vtt.

fig. 8 tracking capability, rpm value, and vtt controller output of the ttncs

fig. 9 sliding variable and controller update flag of the ttncs

figs. 10 and 11 present the ncs performance of the controller vet.

fig. 10 tracking capability, rpm value, and vet controller output of the etncs

fig. 11 sliding variable and controller update flag of the etncs

the estimated values of the indices (20),(21) are presented in table 1.

table 1 performance indices of tt-ncs and et-ncs

figs. 8-11 show the implementation results of the tt-ncs and et-ncs strategies. the advantages of both approaches are shown clearly. the tt-ncs has better tracking performance according to the rmsx1 value in table 1. this result was expected, due to the constant controller update with a prescribed sampling time of 10ms. on the other hand, the tt approach uses constant network resources. for the given experiment, at least two messages are transmitted in each 10ms time frame. 
regarding rmsx1 of the et approach, the tracking capability is slightly deteriorated. the lower performance is the result of the nonlinear switching function of v, the unstable boundary region of the output variable x1 derived in (11), and the triggering condition. the network usage in the et strategy is reduced drastically, especially when the system reaches the sliding manifold. the average update time for et-ncs is 41ms, presented in column avg(ts/n) of table 1. the average update time is closely related to the preselected triggering boundary and the course of the reference value. the triggering boundary directly affects the tracking capability of the closed-loop system. the employment of the et-ncs system is a tradeoff between network resource usage and the accuracy of the system. in the given experiment, the network usage of the tracking system is reduced by almost 70%, and the output rmsv value is reduced drastically. the et approach can also be considered a chattering alleviation technique for sliding mode controllers with an explicit output signum function, which is studied extensively within different implementation techniques and adaptation algorithms [18]-[21].

6. conclusion

the paper presents an event-triggered nonlinear controller implementation for a networked control system. compared to the classic time-triggered implementation, the approach is beneficial for ncs with data rate constraints, where the network constraints can be considered during the controller design. the experimental results confirm the theoretical assumptions and derivation of the et-ncs. the network usage and embedded system utilization are reduced. the et technique can be a viable alternative for tt feedback systems, especially where the computational and network resources are limited or subject to optimization. the work is a good research starting point for multi-agent systems, distributed control, and task scheduling in embedded systems. 
the central supervised server system can share its computation capacity with other distributed systems and control multiple sub-plants remotely, where the number of network requests can be lowered significantly and estimated in advance.

acknowledgement: this research was funded by the slovenian research agency (arrs) grant number p2-0065.

table 1 values:
ncs   rmsx1   rmsv   rmss   avg(ts/n)   rtts     rttc     flag
vtt   83.2    4.56   57.2   10ms        8.23ms   3.21ms   100%
vet   85.7    1.82   58.4   41ms        8.43ms   2.78ms   28.7%

references [1] a. sarjaš and d. gleich, "nonlinear event-triggered networked feedback control system under data-rate constrains", in proceedings of the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks), 2021, pp. 376-379. [2] a. k. behera, b. bandyopadhyay and x. yu, "periodic event-triggered sliding mode", automatica, vol. 96, pp. 1916-1931, jan. 2018. [3] v. i. utkin, sliding modes in control and optimization. new york: springer-verlag, 1992. [4] c. edwards and s. k. spurgeon, sliding mode control: theory and applications taylor and francis, 1998. [5] i. furtat, y. orlov and a. fradkov, "finite-time sliding mode stabilization using dirty differentiation and disturbance compensation", int. j. robust nonlinear control, vol. 29, no. 3, pp. 793-809. [6] k. j. aström, "event based control" in a. astolfi and l. marconi (eds.), analysis and design of nonlinear control systems, pp. 127-147, berlin, heidelberg, springer, 2006. [7] k. j. åström and b. m. bernhardsson, "comparison of riemann and lebesgue sampling for first-order stochastic systems", in proceedings of the 41st ieee conference on decision and control (cdc), las vegas, nv, usa, 2002, pp. 2011-2016. [8] a. ferrara, g. p. incremona and v. stocchetti, "networked sliding mode control with chattering alleviation", in proceedings of the 53th ieee conference on decision control, los angeles, ca, usa, december 2014, pp. 5542-5547. [9] e. kofman and j. h. 
braslavsky, "level crossing sampling in feedback stabilization under data-rate constraints", in proceedings of the 45th ieee conference on decision control (cdc), san diego, ca, usa, dec. 2006, pp. 4423-4428. [10] j. ludwiger, m. steinberger, m. horn, g. kubin and a. ferrara, "discrete time sliding mode control strategies for buffered networked systems", in proceedings of the 57th ieee conference on decision control, miami beach, fl, usa, dec. 2018, pp. 6735-6740. [11] m. cucuzzella, g. p. incremona and a. ferrara, "event-triggered variable structure control", int. j. control, vol. 93, no. 2, pp. 252-260, jan. 2019. [12] j. ludwiger, m. steinberger and m. horn, "spatially distributed networked sliding mode control", ieee control syst. lett., vol. 3, no. 4, pp. 972-977, may 2019. [13] j. ludwiger, m. steinberger, m. horn, g. kubin and a. ferrara, "discrete time sliding mode control strategies for buffered networked systems", in proceedings of the 57th ieee conference on decision control, miami beach, fl, usa, dec. 2018, pp. 6735-6740. [14] a. k. behera and b. bandyopadhyay, "event-triggered sliding mode control for a class of nonlinear systems", int. j. control, vol. 89, no. 9, pp. 1916-1931, jan. 2016. [15] a. k. behera, b. bandyopadhyay and x. yu, "periodic event-triggered sliding mode", automatica, vol. 96, pp. 1916-1931, jan. 2018. [16] a. k. behera and b. bandyopadhyay, "robust sliding mode control: an event-triggering approach", ieee trans. circuits syst. ii: express briefs, vol. 64, no. 2, pp. 146-150, feb. 2017. [17] w. gao, y. wang and a. homaifa, "discrete-time variable structure control system", ieee trans. ind. electron., vol. 42, no. 2, pp. 117-122, april 1995. [18] s. koch and m. reichhartinger, "discrete-time equivalents of the super-twisting algorithm", automatica, vol. 107, pp. 190-199, 2019. [19] b. brogliato and a. polyakov, "digital implementation of sliding-mode control via the implicit method: a tutorial", int. j. robust nonlinear control, vol. 
31, no. 9, pp. 3528-3586, 2021. [20] v. utkin, "discussion aspects of high-order sliding mode control", ieee trans. automat. contr., vol. 61, pp. 829-833, 2016. [21] u. p. ventura and l. fridman, "design of super-twisting control gains: a describing function based methodology", automatica, vol. 99, pp. 175-180, 2019.

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 383-393 doi: 10.2298/fuee1603383k

smart outlier detection of wireless sensor network

sahar kamal 1, rabie a. ramadan 2, fawzy el-refai 3
1 department of electronics and electrical communications, higher institute of engineering, el-shorouk academy, el-shorouk city, egypt
2 computer engineering department, cairo university, egypt
3 department of system and computer engineering, el-azhar university, cairo, egypt

abstract. data sets collected from wireless sensor networks (wsn) are usually considered unreliable and subject to errors due to limited sensor capabilities and harsh environments, resulting in a subset of the sensor data called outlier data. this paper proposes a technique to detect outlier data based on the spatial-temporal similarity among data collected by geographically distributed sensors. the proposed technique is able to identify an abnormal subset of the data collected by a sensor node as outlier data. moreover, the proposed technique is able to classify this abnormal observation as either an error data set or an event-affected set. simulation results show that a high detection rate is achieved compared to conventional outlier detection techniques while preserving a low false positive alarm rate.

key words: wireless sensor network, outlier's detection, fuzzy logic, spatial and temporal similarity

1. introduction

wireless sensor networks are considered a promising solution for the monitoring and measurement of natural physical phenomena such as temperature, humidity, earthquakes, pressure, light, voltage, etc. 
a typical wsn consists of a large number of very small sensors deployed over an area of interest. these sensors are equipped with power sources (batteries, solar cells), measurement units, processing units and wireless tx/rx units. unfortunately, the data collected from sensor nodes are considered inaccurate and may even be unreliable due to measurement errors or noise superimposed on the received data packets [2]. duplicated measurements or even missing values are not uncommon in the data set collected by a wsn. a subset of data which appears to be inconsistent with the whole data set from which it is collected is called an outlier.

received august 30, 2015; received in revised form november 15, 2015
corresponding author: rabie a. ramadan, computer engineering department, cairo university, egypt (e-mail: rabie@rabieramadan.org)
*an earlier version of this paper was presented at the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, 2015 [1].

an outlier can be defined as in [2]: "an outlier is a subset of observations which appear to be inconsistent with other dataset". on the other hand, outliers can be defined as in [3] as "those measurements that are deviated from the consistence dataset". either of the two definitions can be used to declare the outliers in a data set. abrupt events such as sudden sensor failure, battery power depletion or even natural physical phenomena are also causes to which outlier data can be attributed. in order to boost the accuracy and reliability of the collected sensor data, outliers should be detected and, where possible, corrected. there are three sources of outliers, arising from environmental changes or from faulty sensors: (1) errors and noise, (2) events and (3) malicious attacks, the last one being related to network security [2]. 
noise or error refers to a noise-related measurement or data instance coming from a faulty sensor. outliers caused by errors may occur frequently, while outliers caused by events tend to have a smaller probability of occurrence. erroneous data are normally represented as an arbitrary change and are extremely different from the rest of the data. noisy data as well as erroneous data should be eliminated or corrected if possible. however, events may arise due to a sudden change in the real world, for example rainfall, forest fire, chemical spill, air pollution, etc. removing event outliers from the data set leads to a loss of important hidden information about the events [4]. outliers that are very close to random errors in terms of size can only be determined through the application of outlier tests. classifying an outlier as an event or an error is therefore an important matter. many studies treat outliers and events as similar conditions by considering events as some sort of outliers. because there are spatial-temporal similarities between the measurements of neighboring nodes, an outlier can be classified as either an event or an error. this relies on the fact that erroneous observations seem to be unrelated, while event observations seem to be spatially correlated [5]. the main approaches to determining outliers can be grouped into statistics-based, nearest neighbor-based, cluster-based and artificial intelligence techniques. newer approaches to outlier detection include artificial intelligence techniques such as neural networks and fuzzy logic. the latter was suggested in [6], where it is also used for outlier detection in geodetic networks. the main aim of outlier detection in wsn is to declare outliers with a high detection rate while decreasing the resource consumption of the network. 
our work is based on the observation that, in most wsn applications, the measurements of sensors in the environment tend to be highly correlated for sensors that are geographically close to each other (spatial similarity), and also highly correlated over a period of time (temporal similarity) [5]. using this observation, we take advantage of the spatial and temporal similarity in the sensor data. as a first study, we detect outliers in a univariate attribute in wsn. the main contribution of this paper is the use of euclidean distance and fuzzy logic to detect outliers in wireless sensor networks. in addition, spatial and temporal similarity are used to make it easy to distinguish between errors and events. if the probability output by the fuzzy logic is above a prefixed threshold, the observation is considered an outlier. the model is tested on a real data set from grand-st-bernard [7] and implemented using matlab. the approach achieves a high detection rate while keeping a low false positive alarm rate and low computational complexity. the rest of the paper is organized as follows: section (2) presents the necessary background related to outlier detection. the proposed algorithm is presented in section (3) along with the assumptions upon which the proposed technique is built. section (4) shows experimental results and the performance evaluation of the proposed technique using a realistic data set. finally, the paper is concluded in section (5).

2. related work

recently, there have been many studies on outlier detection in wsn that aim to improve the reliability and quality of sensor measurements. these studies use different techniques to detect outliers, such as statistical-based, nearest neighbor-based, clustering-based, classification-based, and spectral decomposition-based approaches. 
in general, these studies can be divided into those that do not use spatial or temporal correlation in the data set, those that are based on spatial or temporal correlation only, and those that are based on both. in 2006, the author of [8] used the spatial correlation that exists among neighboring sensor nodes to distinguish between outlying sensors and event boundaries. in this model, each node calculates the difference between its own measurement and the median of its neighbors' measurements. an outlying node is then declared when the absolute value of its measurement's deviation degree is greater than a pre-selected threshold. this technique suffers from a low detection rate because it ignores the temporal correlation between sensor data readings. the model in [9] uses a cluster-based technique to identify global outliers. first, each node clusters its readings and reports cluster summaries, and then transmits the raw sensor readings to its cluster head. the cluster head collects cluster summaries from all of its nodes before sending them to the sink. an outlier cluster is declared at the sink if the cluster's average inter-cluster distance is greater than a threshold value of the set of inter-cluster distances. however, these models suffer from the choice of the cluster width parameter. additionally, these techniques increase the computational complexity when computing the distance between data instances. in [10], the author uses distance similarity to identify global outliers in wsn. each node uses distance similarity to identify local outliers and then broadcasts the abnormal data instances to all neighboring nodes for verification. this procedure is repeated until all neighboring nodes agree on the global outliers. this technique increases the computational complexity and is not suited to large-scale networks. in 2007, the technique proposed in [11] used a one-class quarter-sphere based technique to detect outliers in wsn. 
this technique takes advantage of temporal correlation to identify local outliers at each node. a sensor measurement that lies outside the quarter sphere is considered an outlier. each node transmits only brief information to its parent for global outlier classification. this technique suffers from a low detection rate because it ignores the spatial correlation between neighboring nodes. in 2008, the author of [12] used a centered quarter-sphere support vector technique to detect local outliers in wsn. this technique takes advantage of the spatial correlations that exist in the sensor data of adjacent nodes to reduce the false alarm rate and to distinguish between events and errors, but it ignores temporal correlation and increases the computational complexity. in 2009, the author of [13] used an outlier detection technique to identify outliers in wsn data sets. this technique takes advantage of the spatial-temporal correlation that exists among sensor data readings. in 2011, the author of [14] proposed an outlier detection method for wireless sensor networks that distinguishes between events and errors. this technique classifies the sensor node data as a local outlier, a cluster outlier or a network outlier, and considers network and cluster outliers as events and local outliers as errors. this algorithm suffers from high computational complexity. in 2012, the author of [15] used temporal correlation only to detect outliers in wsn. however, this technique suffers from some computational complexity. this approach differs from ours in that our approach combines spatial-temporal similarity with fuzzy logic to detect outliers and identify errors and events with a high detection rate and a relatively low false positive rate in comparison with the results in [15]. 
in 2013, the author of [16] used temporal and spatial properties to identify outliers and distinguish between events and errors, but with a low detection rate and false positive rate in comparison with our approach.

3. the proposed stodm technique

sensor nodes are assumed to be densely deployed and synchronized in the wsn. a subset of sensors is considered to belong to the same cluster if the sensors fall within radio transmission range of each other. at any time interval, each node reads a data vector sij, where "i" is the time index of the data symbol and "j" is the node spatial id. the purpose of an outlier detection technique is to identify a subset xi of each sensor set si as outliers. a further advantage of a detection technique is the ability to classify a deviating data instance as an event or an error. in this section, the proposed approach is introduced in detail. many outlier detection techniques have been developed; however, they do not take interesting events into account. on the other hand, several recent studies are interested only in events and do not consider erroneous data. in this paper, a new distance-based approach that relies on spatial-temporal similarity combined with a fuzzy logic-based approach is proposed to classify outliers as error data or events. our methodology consists of the following steps. in the first step, the spatial and temporal similarities are calculated, and each is fed as an input (membership function) to the fuzzy logic to detect outliers at each node. the second step classifies the outlier as an event or an error.

3.1. spatial-temporal similarity

in our proposed algorithm, spatial-temporal similarity is calculated using a two-step process. in the first step, the temporal similarity of a given data set of a sensor node is calculated on a point-by-point basis and is given by the first-order difference |si2 − si1|. the absolute difference is compared to a pre-specified threshold, which is calculated according to the tolerance of the temperature sensor. 
a data point si2 is considered similar to the other points if the absolute first-order difference does not exceed the threshold. otherwise, dissimilarity is obtained and the data point may be an outlier. in the second step, spatial similarity is calculated based on the distance between neighboring nodes. we use the euclidean distance to calculate the similarity measure between two points x and y that are in the same transmission range and close in time, as given in eq. (1). euclidean distance is a popular choice for univariate and multivariate continuous attributes [17]. the data instance at point x is considered similar to the data point at y if the euclidean distance d(x, y) does not exceed a preselected threshold. the spatial link is defined as the number of neighbors to which a point is spatially similar, as in eq. (2). the spatial similarity threshold is calculated as the mean distance between all data points that are close in time.

$d(x, y) = \sqrt{(x - y)^2}$ (1)

$\text{spatial link}(x) = \sum_{j=1}^{n} \text{sim}(x, y_j)$ (2)

where n is the number of neighboring nodes and sim(x, y_j) equals 1 if d(x, y_j) does not exceed the spatial similarity threshold and 0 otherwise.

3.2. fuzzy logic model

recently, many approaches have been tested on decision making theories. some of the artificial intelligence techniques used in outlier analysis are neural networks, support vector machines and fuzzy logic [18]. our approach uses fuzzy logic as one of these techniques to detect outliers in wsn data sets. fuzzy logic is a logical model providing a general idea about the decision process in the analysis of the data set. the fuzzy logic suggested in [19] is essentially an approach that allows intermediate values to be defined between conventional values such as right/wrong, yes/no, high/low. the main purpose of the method is to bring certainty by assigning a membership degree to concepts that are hard to express or have a difficult meaning. a fuzzy logic system consists of three main parts: fuzzification, rule base and defuzzification. 
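before the three parts are described, the two similarity inputs of section 3.1 can be illustrated with a small python sketch for a univariate reading such as temperature. the paper's implementation is in matlab and is not reproduced here; the function names and the sample readings are hypothetical, and the spatial threshold is taken as the mean pairwise distance of the readings at one time instant, following the description of section 3.1.

```python
import math
from itertools import combinations


def temporal_similar(previous, current, tolerance):
    """First-order difference test: |current - previous| <= sensor tolerance."""
    return abs(current - previous) <= tolerance


def euclidean(x, y):
    """Euclidean distance between two univariate readings, as in eq. (1)."""
    return math.sqrt((x - y) ** 2)


def mean_pairwise_distance(readings):
    """Spatial similarity threshold: mean distance between all readings."""
    pairs = list(combinations(readings, 2))
    if not pairs:
        raise ValueError("at least two readings are required")
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)


def spatial_link(reading, neighbor_readings, threshold):
    """Number of neighbors whose reading is within the threshold, as in eq. (2)."""
    return sum(1 for nb in neighbor_readings if euclidean(reading, nb) <= threshold)


# Hypothetical readings of four neighboring nodes at the same time instant.
readings = [20.0, 20.5, 21.0, 35.0]
threshold = mean_pairwise_distance(readings)
links_of_suspect = spatial_link(35.0, [20.0, 20.5, 21.0], threshold)
links_of_normal = spatial_link(20.5, [20.0, 21.0, 35.0], threshold)
```

in this example the reading 35.0 has no spatial links while 20.5 is linked to two of its three neighbors, so only the former would enter the fuzzy stage with a low spatial link.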
firstly, fuzzification can be defined as a transfer between a definite system and a fuzzy system; it describes a property of an object in a certain fuzzy set. the objects can belong to the 'low, middle, high' property classes through membership functions, and each object is assigned a membership degree between 0 and 1. this technique uses temporal and spatial similarity as two inputs, or two membership functions, of the fuzzy system. these membership functions are chosen empirically and optimized using sample input/output data. the most common membership function shapes are the triangle, trapezoid, gauss curve and sigmoid. as the membership functions represent the fuzzy set, the selection of their shape and form directly affects the decision process.

secondly, the rule base combines the membership functions from the fuzzificator with rule handling data such as 'if, and, although, if not', which is based on the database and stored there. the if-then rules connect the antecedent to the consequent (i.e. input to output). these rules are given weights based on their criticality, as in [19]. with this approach, measurements can be classified according to their membership degrees, e.g.

- if spatial link (low) and temporal similarity (low) then outlier (high)
- if spatial link (low) and temporal similarity (med) then outlier (high)
- if spatial link (high) and temporal similarity (high) then outlier (low)
- if spatial link (med) and temporal similarity (med) then outlier (med)

thirdly, in the defuzzification unit, the rule results obtained from the rule handling unit are evaluated in the defuzzificator and turned into definite results, as in [19]. an outlier is declared according to the rule results. fig. 1 represents all three stages of fuzzy logic.

fig. 1 three stages of fuzzy logic

3.3. outlier classification

the third step is to classify the outlier value as an error or an event.
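the three fuzzy stages of section 3.2 can be sketched in python as follows. this is a simplified illustration, not the authors' matlab/simulink model: the inputs are assumed to be normalized to [0, 1], the piecewise-linear membership functions, the output levels and the weighted-average defuzzification are all assumptions, and only the four if-then rules are taken from the text.

```python
# Assumed piecewise-linear membership functions on inputs in [0, 1].
def low(x):
    return max(0.0, 1.0 - 2.0 * x)


def med(x):
    return max(0.0, 1.0 - abs(2.0 * x - 1.0))


def high(x):
    return max(0.0, 2.0 * x - 1.0)


# Assumed crisp level for each output class of "outlier".
LEVEL = {"low": 0.0, "med": 0.5, "high": 1.0}

# (spatial link, temporal similarity) -> outlier, as listed in the text.
RULES = [
    (low, low, "high"),
    (low, med, "high"),
    (high, high, "low"),
    (med, med, "med"),
]


def outlier_degree(spatial_link, temporal_similarity):
    """Fuzzify both inputs, fire the rules (AND = min) and defuzzify
    with a weighted average. Returns None if no rule fires."""
    weighted, total = 0.0, 0.0
    for spatial_mf, temporal_mf, outcome in RULES:
        strength = min(spatial_mf(spatial_link),
                       temporal_mf(temporal_similarity))
        weighted += strength * LEVEL[outcome]
        total += strength
    return weighted / total if total > 0 else None


print(outlier_degree(0.0, 0.0))  # isolated in space and time
print(outlier_degree(1.0, 1.0))  # similar to neighbors and to its past
```

a complete rule base would cover every combination of input classes; with only the four example rules, some input pairs fire no rule at all, which the sketch reports as None.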
in this step, we aim to find the source of the values labeled as outliers. there are two possible options: the outlier value is due either to an error, resulting from a low battery or network damage, or to an event or phenomenon in the surrounding environment. our idea is based on the following observation: "errors in the sensor data are likely to be spatially unrelated, while event measurements are likely to be spatially correlated". moreover, data instances tend to be correlated in both time and space. hence, we exploit this fact by using data from neighboring nodes to measure the spatial similarity, and the time stamps between readings to measure the temporal similarity. in other words, once the previous step declares a data instance an outlier, the technique checks whether the neighboring nodes produce similar values, or values larger than the outlier reading. if those neighboring readings also fall within the same time range, this indicates an interesting event in the physical world; otherwise, the reading is likely to be erroneous. in our work, we assume that a sensor node x is a neighbor of another node y if x is within y's communication range, and vice versa.

4. experimental result and performance evaluation

in this section, we investigate the effectiveness of our proposed approach when applied to the real data set from the st.-bernard wireless sensor network in [7]. we compare the accuracy of our algorithm with another detection method, the stgod method [15], which is based on spatial-temporal correlation among neighbor nodes. we evaluate the accuracy and the scalability of the proposed method against the stgod method on a real data set.

4.1. study area and data description

the proposed outlier detection described in section 3 is applied to a realistic data set collected from 23 sensor nodes.
these nodes are geographically distributed along the border between switzerland and italy and form two clusters. the small cluster, situated on the italian side of the border, contains the five sensor nodes from which the data set is obtained. fig. 2 illustrates the geographical distribution of these nodes over the area in which they are deployed.

fig. 2 a small cluster (five nodes) of the grand st.-bernard deployment and the corresponding metric coordinates (e-n).

the collected data represent temperature as the attribute of interest. temperature values were measured over the period 06:00–14:00 on 30th september, 2007. fig. 3 plots the temperature measurements of all nodes in the small cluster (node25, node28, node29, node31, node32). the measurement tolerance of the deployed sensors is about ±0.3°c.

fig. 3 data measurements of each sensor node.

4.2. results and performance evaluation

this section is devoted to evaluating the performance of the outlier detection technique proposed in section 3. two performance metrics are considered. the first is the detection rate (dr), defined as the ratio of the correctly detected outliers to the total number of outliers in a given data set. the second is the false positive alarm rate (fpr), defined as the ratio of normal data points incorrectly classified as outliers to the total number of normal data points. this section reports the outliers, the detection rate and the false positive rate for each node. evaluating the performance of outlier detection requires a reference data set. usually, labeling techniques are used to label sensor measurements and classify each data point as either normal or anomalous. the choice of the labeling technique strongly influences the evaluation of the outlier detection techniques.
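the two metrics defined above can be computed from a labeled reference set as in the following python sketch. the function name and the example labels are hypothetical; only the definitions of dr and fpr come from the text.

```python
def detection_metrics(is_outlier_true, is_outlier_pred):
    """Return (DR, FPR) as fractions.

    DR  = correctly detected outliers / all true outliers
    FPR = normal points flagged as outliers / all normal points
    Assumes the reference labels contain at least one outlier
    and at least one normal point.
    """
    pairs = list(zip(is_outlier_true, is_outlier_pred))
    true_pos = sum(1 for t, p in pairs if t and p)
    false_pos = sum(1 for t, p in pairs if not t and p)
    n_outliers = sum(1 for t, _ in pairs if t)
    n_normal = len(pairs) - n_outliers
    return true_pos / n_outliers, false_pos / n_normal


# Hypothetical reference labels and detector output (True = outlier).
reference = [True, True, False, False, False, False]
detected = [True, False, True, False, False, False]

dr, fpr = detection_metrics(reference, detected)
print(f"DR = {dr:.2%}, FPR = {fpr:.2%}")
```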
three labeling techniques are used in [15], i.e., running average-based, mahalanobis distance-based and density-based; our research used the first one, which fits the data set, as in [15]. in this research two software components are used, a statistical model and a fuzzy logic simulink model, both implemented in matlab. fig. 4 and fig. 5 show the spatial-temporal outliers in the univariate attribute (temperature) for node29 and node25, respectively. in node25 the detection rate is about 92% and the fpr is 10.4%, while in node29 the detection rate is 93.75% and the fpr is high, at 18.33%.

fig. 4 spatial temporal outliers in node29 detected by (stodm)

fig. 5 spatial temporal outliers in node25 detected by (stodm)

fig. 6, fig. 7 and fig. 8 show that node28, node31 and node32 have a high detection rate of 100% and an fpr of 9.16%, 10% and 4.5%, respectively.

fig. 6 spatial temporal outliers in node28 detected by (stodm)

fig. 7 spatial temporal outliers in node31 detected by (stodm)

fig. 8 spatial temporal outliers in node32 detected by (stodm)

fig. 9 shows the result of the accuracy assessment for the outliers detected using the pattern approach. the highest detection rate (100%) is at nodes 28, 31 and 32, while the lowest detection rate (92%) is at node 25. the lowest fpr is at node 32 (4.5%), while the highest is at node 29 (18.33%).

fig. 9 accuracy of the detected outliers at different nodes

extensive experiments on the collected data set show that both the detection rate and the fpr increase when the threshold is decreased. using a fixed threshold for temporal similarity and the mean euclidean distance of all nodes as the threshold for spatial similarity yields an average detection rate of 97.15% and an fpr of 10.472%.
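the reported averages can be checked against the per-node figures quoted above with a short python calculation. the per-node values are those given in the text (with 92% taken as exact for node25); the variable names are illustrative.

```python
# Per-node results quoted in the text, in percent.
detection_rate = {"node25": 92.0, "node28": 100.0, "node29": 93.75,
                  "node31": 100.0, "node32": 100.0}
false_positive_rate = {"node25": 10.4, "node28": 9.16, "node29": 18.33,
                       "node31": 10.0, "node32": 4.5}

mean_dr = sum(detection_rate.values()) / len(detection_rate)
mean_fpr = sum(false_positive_rate.values()) / len(false_positive_rate)

print(f"mean DR  = {mean_dr:.2f}%")
print(f"mean FPR = {mean_fpr:.3f}%")
```

the mean detection rate reproduces the reported 97.15%. the mean of the rounded per-node fpr values comes out at about 10.48%, slightly above the reported 10.472%, which presumably reflects rounding of the per-node figures.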
the relatively high fpr is a result of misclassification of some normal observations, while the high detection rate is a result of considering spatial-temporal similarity. table 1 compares the proposed stodm with the tsod and stgod techniques, under the most frequently used data labeling technique, in terms of the detection rate and false positive alarm rate achieved by each algorithm. it can be observed that the proposed algorithm outperforms these techniques in terms of detection rate. both reference models are applied to the same data set as our model. another advantage of the proposed technique is that it is able to distinguish between errors and events in a given data set obtained from a sensor node. the classification of the outlier source is reported in table 2.

table 1 comparison between our approach (stodm) and the tsod and stgod models, using the running average labeling of [15]

method | dr% | fpr%
stodm | 97.15 | 10.4
tsod | 23.4 | 1.7
stgod | 72.34 | 10.94

table 2 number of outliers and events detected at different nodes using stodm (our model)

nodes | no of outlier | no of event
node25 | 48 | 5
node28 | 23 | 4
node29 | 60 | 5
node31 | 25 | 5
node32 | 21 | 4

5. conclusions

the stodm algorithm proposed in this paper combines fuzzy logic theory and distance-based similarity to detect outliers, and is a new attempt in the area of outlier detection based on spatial-temporal similarity. the proposed technique is able to separate normal and outlier data. moreover, errors and events are also distinguished. a high detection rate is achieved compared to conventional techniques while keeping the false positive alarm rate low and reducing computational complexity, because the technique uses the euclidean distance to calculate spatial similarity among neighboring nodes. for future work, we plan to build an algorithm that detects outliers in multi-attribute data and considers dependencies among the attributes of the sensor data, as well as the spatial-temporal correlations that exist among the observations of neighboring sensor nodes.
references

[1] s. kamal, r. ramadan, f. el-refai, "smart outlier detection of wireless sensor network by fuzzy logic", in proceedings of the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, november 2015.
[2] y. zhang, m. nirvana, h. paul, "outlier detection techniques for wireless sensor networks: a survey", university of twente, p.o. box 217, 7500 ae enschede, the netherlands, 2010.
[3] v. chandola, a. banerjee, v. kumar, "outlier detection: a survey", technical report, university of minnesota, 2007.
[4] v. jha, o. veer singh y., "outlier detection techniques and cleaning of data for wireless sensor networks: a survey", international journal of computer science and technology, 2012.
[5] x. luo, m. dong, y. huang, "on distributed fault-tolerant detection in wireless sensor networks", ieee trans computer, vol. 55, no. 1, pp. 58-70, 2006.
[6] h. konak, a. dilaver, e. ozturk, "the effects of observation plan and precision on the duration of outlier detection and fuzzy logic: a real network application", survey review, vol. 38, no. 298, pp. 331-341, 2005.
[7] sensor scope system. http://sensorscope.ep.ch/index.php/main page
[8] s. subramaniam, t. palpanas, d. papadopoulos, v. kalogeraki, d. gunopulos, "online outlier detection in sensor data using nonparametric models", vldb, seoul, korea, 2006.
[9] s. rajasegarar, c. leckie, m. palaniswami, j. c. bezdek, "distributed anomaly detection in wireless sensor networks", uk: ieee, iccs, pp. 12-16, 2006.
[10] j. branch, b. szymanski, c. giannella, r. wolf, "in-network outlier detection in wireless sensor networks", in proceedings of ieee icdcs, 2006.
[11] s. rajasegarar, c. leckie, m. palaniswami, j. c. bezdek, "quarter sphere based distributed anomaly detection in wireless sensor networks", in proceedings of the ieee international conference on communications, pp. 3864-3869, 2007.
[12] y. zhang, n. meratnia, p.j.m. havinga, "an online outlier detection technique for wireless sensor networks", in proceedings of the third ieee european conference on smart sensing and context (eurossc), pp. 25-26, 2008.
[13] y. zhang, n. meratnia, p.j.m. havinga, "adaptive and online one-class support vector machine-based outlier detection techniques for wireless sensor networks", in proceedings of the ieee 23rd international conference on advanced information networking and applications workshops/symposia, pp. 990-995, 2009.
[14] m.s. mohamed, t. kavitha, "outlier detection using support vector machine in wireless sensor network real time data", int j soft comput eng, vol. 1, no. 2, 2011.
[15] y. zhang, n.a.s. hamm, n. meratnia, a. stein, m. van de voort, p.j.m. havinga, "statistics-based outlier detection for wireless sensor networks", international journal of geographical information science, 2012.
[16] a. amidi, n.a.s. hamm, n. meratnia, "wireless sensor networks and fusion of contextual information for weather outlier detection", international archives of the photogrammetry, remote sensing and spatial information sciences, vol. xl-1/w3, 2013.
[17] a. fawzy, h.m.o. mokhtar, o. hegazy, "outliers detection and classification in wireless sensor networks", egyptian informatics journal, vol. 14, pp. 157-164, 2013.
[18] s. syed, m.e. cannon, "fuzzy logic based-map matching algorithm for vehicle navigation system", in proceedings of the urban canyons, ion national technical meeting, san diego, ca, pp. 26-28, 2004.
[19] y. sisman, a. dilaver, s. bektas, "outlier detection in 3d coordinate transformation with fuzzy logic", acta montanistica slovaca, vol. 17, no. 1, pp. 1-8, 2012.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp.
267-284 doi: 10.2298/fuee1703267a

recent advances in ntc thick film thermistor properties and applications

obrad s. aleksić 1, pantelija m. nikolić 2
1 institute for multidisciplinary research, university of belgrade, serbia
2 serbian academy of sciences and arts, belgrade, serbia

abstract. a brief introduction to thermal sensors and thermistor materials is given. after that, novel electrical components such as thick film thermistors and thermal sensors based on them are described: custom designed ntc thermistor pastes based on nickel manganite nimn2o4 micro/nanostructured powder were composed, and new planar cell-based (segmented) constructions were printed on alumina. the thick film segmented thermistors were used in novel thermal sensors such as anemometers, water flow meters, a gradient temperature sensor of the ground, and other applications. the advances achieved are the consequence of previous improvements of thermistor materials based on nickel manganite and modified nickel manganite such as cu0.2ni0.5zn1.0mn1.3o4, and of the optimization of thick film thermistor geometries for sensor applications. the thermistor powders were produced by a solid state reaction of mnco3, nio, cuo and zno powders mixed in the proper weight ratio. after calcination the obtained thermistor materials were milled in planetary ball mills and agate mills, and finally sieved through a 400 mesh sieve. the powders were characterized by xrd and sem. the new thick film pastes were composed of the obtained powders, an organic vehicle and glass frit. the pastes were printed on alumina, dried and sintered, and characterized again by xrd, sem and electrical measurements. different thick film thermistor constructions such as rectangular, sandwich, interdigitated and segmented were printed from the new thermistor pastes. their properties, such as the electrical resistance of the thermistor samples, were mutually compared. the electrode effect was measured for all the mentioned constructions and the surface resistance was determined.
it was used for the modeling and realization of high, medium and low ohmic thermistors with different power dissipation and heat loss. finally, all the results obtained led to thermal sensors based on heat loss for measuring air flow, water flow, temperature gradient and heat transfer from the air to the ground.

key words: metal oxide thermistors, thick film thermistor geometries, thick film thermistor sensors and systems

received november 10, 2016
corresponding author: obrad s. aleksić, institute for multidisciplinary research, university of belgrade, kneza višeslava 1a, 11 000 belgrade, serbia (e-mail: obradal@yahoo.com)

1. introduction to temperature sensors

temperature measurement and control are widespread today: from homes and buildings, ground surface, water and air to power machines in industry, cars and trucks, electronic equipment, chemistry, medicine etc. [1-4]. the temperature measuring range is divided into several sub-ranges such as cryogenic temperatures, near-room temperatures, moderately elevated temperatures and high temperatures. to cover these temperature ranges, different temperature measuring methods and temperature sensor devices were developed, mainly based on thermocouples (thermoelectric effect), thermistors (thermo-resistive effect) and pyrometers (infrared to visible light radiation) [5-7]. the thermoelectric effect was discovered by seebeck in 1821. a thermocouple is formed of two wires of different metals joined at one point by welding [8]. the thermoelectric electromotive force (emf) depends on the metals used to form the thermocouple. there are various combinations of metals such as copper and iron, the metal alloys alumel (ni/mn/al/si), chromel (ni/cr), constantan (cu/ni), nicrosil (ni/cr/si) and nisil (ni/si/mn), the noble metals platinum and tungsten, and the noble-metal alloys platinum/rhodium and tungsten/rhenium [9].
the output of thermocouples increases as the temperature rises from low to moderate and high temperatures. the platinum rhodium-platinum thermocouple reaches 20 mv at 1600 °c, chromel-alumel 50 mv at 1200 °c and chromel-constantan 80 mv at 1000 °c, while at cryogenic temperatures near 0 k the emf is negative and reaches -6 mv for copper-constantan and -10 mv for chromel-aufe wires. the emf of a thermocouple is a nonlinear function of the temperature t [°c] which crosses zero at 0 °c. it can be approximated by the following equation:

$$e = a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + \dots$$

where a1, a2, a3, a4, ... are experimentally determined constants. the inaccuracy is about ±0.2% of the emf voltage, or ±2 °c at 1000 °c, for example [10].

at higher temperatures optical pyrometers are used to measure temperature using the stefan-boltzmann law [11]. the output signal of the photo detector is related to the thermal radiation or irradiance $j^{*} = \varepsilon \sigma t^{4}$, where σ is the stefan-boltzmann constant and ε is the emissivity of the object. the temperature t is measured from a distance (1-10 meters), and the measuring range of pyrometers is typically from 650 to 2500 °c. the pyrometer inaccuracy is high, typically about ±25 °c at the beginning of the measuring range and ±50 °c at its end [12].

thermistors are a new class of temperature sensors based on the thermo-resistive effect [13,14]. this class includes sintered metal oxides (electronic ceramics) exhibiting different values of the thermal coefficient of resistance in the near-room and moderate temperature range. generally, their resistance temperature coefficient can be positive or negative (ptc and ntc, respectively). ptc thermistors are used in heaters, temperature level controls and thermal switches, while ntc thermistors are used for temperature measurement in electronic equipment, air conditioning, cars, domestic appliances etc. [15].
the main advantages of this class of sensors compared to thermocouples are small dimensions, high sensitivity and low price: thermistors are mass produced as electronic components with one pair of leads (small disc shape), or with short leads (small cubic chips) like other smd components (figure 1) [16]. the nominal resistance is, for example, in the range from 1 kω to 10 kω, the application temperature range is from -50 to 150 °c and the sensitivity is 3-5 % per °c, which is much higher than that of thermocouples. the resistance temperature coefficient b has values around 4000, for example, and the resistance r decreases exponentially with increasing temperature t:

$$r = r_0 \exp\left[b\left(\frac{1}{t} - \frac{1}{t_0}\right)\right]$$

where r0 is the nominal thermistor resistance measured at t0 = 293 k (20 °c). this enables easy temperature measurement with an inaccuracy of ±0.1 °c or less [17]. thermistors can also be used for surge protection, delay in electronic circuits and temperature compensation in power electronic circuits. therefore there is a permanent interest in the innovation and development of thermistor materials for new applications.

fig. 1 disc and chip thermistors based on sintered metal oxides (top figures): leaded and leadless, left and right respectively, and typical resistance values r vs. temperature t for commercial nickel manganese ntc thermistors (bottom).

a comparison of typical values of the main characteristics of ntc thermistors, platinum resistors (rtd), thermocouples and semiconductor thermal sensors from the application aspect is given in table 1 [18-20].
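the exponential law above and the quoted sensitivity can be checked with a small python sketch. the function names and the 10 kω nominal resistance are illustrative assumptions; b = 4000 and t0 = 293 k are the example values from the text. the relative sensitivity (1/r) dr/dt = -b/t^2 follows from differentiating the exponential law.

```python
import math


def ntc_resistance(t_kelvin, r0, b, t0=293.0):
    """r = r0 * exp[b * (1/t - 1/t0)], temperatures in kelvin."""
    return r0 * math.exp(b * (1.0 / t_kelvin - 1.0 / t0))


def relative_sensitivity(t_kelvin, b):
    """(1/r) dr/dt = -b / t^2, in 1/K."""
    return -b / t_kelvin ** 2


R0 = 10e3   # assumed nominal resistance at t0, in ohms
B = 4000.0  # example value from the text, in kelvin

r_20 = ntc_resistance(293.0, R0, B)
r_30 = ntc_resistance(303.0, R0, B)
alpha = relative_sensitivity(293.0, B)

print(f"r(20 C) = {r_20:.0f} ohm, r(30 C) = {r_30:.0f} ohm")
print(f"sensitivity at 20 C = {100 * alpha:.2f} % per K")
```

for b = 4000 the sensitivity at 20 °c is a little under -4.7 % per kelvin, which falls inside the 3-5 % per °c range stated above.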
one way to reach new thermistor applications is the use of custom designed thick film thermistor pastes (modified micro/nano structure, partial substitution of metals, even doping with re) and new planar constructions adapted in shape and size to the required resistance and power. the objectives are the measurement of moderately elevated temperatures, of the flow of gases and liquids by heat loss thermistors (flowmeters), of heat transfer through a surface or heat radiation (bolometers), and of the temperature gradient in the ground, water and air.

table 1 comparison of modern temperature sensor devices

property | ntc thermistor | platinum rtd | thermocouple | semiconductor
sensor | ceramic (metal-oxide spinel) | platinum wire or metal film | thermoelectric | semiconductor junction
temperature range | -100 to +325 °c | -200 to +650 °c | -200 to +1750 °c | 70 to 150 °c
accuracy | 0.05 to 1.5 °c | 0.1 to 1.0 °c | 0.5 to 5.0 °c | 0.5 to 5.0 °c
stability at 100 °c | 0.2 °c/year (epoxy), 0.02 °c/year (glass) | 0.05 °c/year (film), 0.002 °c/year (wire) | variable, some types very prone to aging | >1 °c/year
output | ntc resistance, -4.4 %/°c typical | ptc resistance, 0.00385 ω/ω/°c | thermovoltage, 10 μv to 40 μv/°c | digital, various outputs
linearity | exponential | fairly linear | most types nonlinear | linear
response time | fast, 0.12 to 10 s | slow, 1 to 50 s | fast, 0.10 to 10 s | slow, 5 to 50 s
cost | low to moderate | wire high, film low | low | moderate

thick film technology enables miniaturization of thermistors, faster thermistor response, mass production, integration with electronics and the realization of novel geometries such as rows and matrices of thermistors, and segmented, multilayer and interdigitated geometries. these can be modified in shape to cover the heat source or to meet measuring requirements for differential measurement (at a defined point) or average measurement (over a defined surface).
this work deals with recent research on thermistor materials and their applications: from powder preparation, pressed and sintered samples and thick film pastes to thick film devices, including our contributions to the field over the last two decades. our intention was to control thermistor properties by finding correlations between their electronic properties and micro/nano structure and between thick film geometry and electrical properties, to optimize sensitivity, reliability, reproducibility, robustness, long term stability etc., and to answer the specific requirements of the new sensor applications mentioned above. the authors expect new sensor products based on thick film thermistors on the market in the near future.

2. metal oxide thermistors

thermistors were invented in 1930 and their name is a combination of the words thermal and resistor: thermal-resistors [22]. the first thermistors were made of mixtures of metal oxide powders, pressed and sintered at elevated temperatures. generally, they are classified into two types: positive temperature coefficient (ptc) thermistors, in which the resistance increases with increasing temperature, and negative temperature coefficient (ntc) thermistors, in which the resistance decreases with increasing temperature. moreover, they are also classified into two groups: thermistors for low temperature operation (-50 to 150 °c) and for high temperature operation (150 to 900 °c) [23,24]. a large number of metal oxides show a decrease of resistance with increasing temperature, and their behavior is defined as ntc, but they are not used as thermistor materials if their electrical resistance is too high. the most suitable thermistor resistances for temperature measurement and control in electronics, or for temperature sensors at room temperature, are moderate values (from 1 to 10 kω).
the materials that can be used in the synthesis of thermistors for the low temperature range of -50 to 150 °c are mixed oxides of mn, ni and co. they have a spinel structure, such as (nimn)3o4, (nimnco)3o4, (nimnfeco)3o4, or a cubic structure like (fe,ti)2o3, and can be doped with other metal oxides such as lio, ruo2, zno, cuo, coo etc. [25,26]. the bulk resistivity ρ25 of powder pressed and sintered samples, measured at room temperature, is correlated with the exponential coefficient b of the bulk resistance and compared in figure 2 [27]. a small addition of co to the spinel gives higher b values and higher values of resistance [28-30]. the resistance is the consequence of the initial powder properties, the sintering temperature/time profiles and the microstructure of the samples [30-36].

fig. 2 correlation between the bulk resistivity ρ and the exponential coefficient of resistivity b for different ntc thermistor materials: curve r (mn, ni, co), li doped; curve r1 spinel group (nimn)3o4, (nimnco)3o4, (nimnfeco)3o4; curve r2 (fe,ti)2o3.

the materials that can be used in the synthesis of thermistors for the high temperature range of 150 to 1100 °c are mixed oxides such as mg(al1−xcrx)2o4 [37], y-al-mn-fe-ni-cr-o [38], mgal2o4–lacr0.5mn0.5o3 [39], catio3 [40], sr7mn4o15 [41], ni1.0mn2-xzrxo4 [42], fe2tio5 [43], batio3 [44], al2o3-cr2o3-zro2 [45] etc. their properties, such as the operating temperature range, resistance and exponential factor b, are given in table 2.
table 2 high temperature ntc thermistor materials

material | operating range [°c] | t [°c] | resistivity [ω·cm] at t | b [k]
la2o3-al2o3/cr2o3,cuo | 50-600 | 50, 600 | 3·10^11, 5·10^4 | 10^4
bi2zr3o7 | 250-600 | - | - | 11-17·10^4
zro2/cao | 400-1000 | 400, 1000 | 5-6·10^5, 20 | -
coo-tio2/cr2o3 | 620-1100 | - | - | 22.8·10^3
coo-tio2/y2o3 | 620-1100 | - | - | 31.1·10^3
tio2-coo/y2o3 | 650-1100 | - | - | 28·10^3
mno-al2o3-na2o 1:2:4.8 | 20-600 | 20 | 9·10^8 | 14.4·10^3
ceo2/pro2 | 200-1200 | - | - | 10-15·10^3

high temperature thermistors are usually of disc and chip type, sealed in glass with pt wire terminations. the special ceramic type glass has more than twice the strength of traditional glass-coated products and has excellent durability against reducing gases such as hydrogen. their accuracy is lower than that of low temperature thermistors such as nickel manganite nimn2o4 thermistors. therefore they are used in applications that directly detect high temperatures in the regions to be heated: burner temperature control in gas ranges and soldering tools, oil heaters, detection of abnormal heating in combustion equipment, and industrial equipment, in place of platinum temperature detectors and thermocouples.

3. thick film thermistor pastes

thick film pastes are composed of fine powders of thermistor materials, an organic vehicle and glass frit as a binder to the ceramic substrate. the most often used thermistor pastes are based on nickel manganite nimn2o4 in which nickel is partly substituted with co [46,47], or partly substituted with zn and cu [48-50]. the base material nickel manganite can be doped with bi, la, sn, cr, ru and al oxides to improve the stability of the electrical characteristics in the operating temperature range and to adjust the exponential factor b [50-55]. moreover, our contribution to thick film thermistor layers includes not only the substitution of basic oxides with other oxides but also the development of pastes based on thermistor nanopowders [56].
the electronic properties of the sintered thermistors were measured by fir spectrometers [57,58], hall effect measurement [59], electrical measurement (activation energy) [60] and photoacoustic spectroscopy (pa). the thermal diffusivity of the thermistor material was also determined for the first time by pa [61-63]. the xrd of thermistor powders and the sem of recently developed low resistance thick film thermistor layers based on nimn2o4 partly substituted with cuo and zno are given below in figures 2 and 3. the ntc behavior of the thick films is given in figure 4, while the electronic properties are given in table 3 [64].

fig. 2 xrd of thermistor powder based on modified nickel manganite nimn2o4 (partly substituted with zno and cuo) and used for the preparation of thermistor pastes.

fig. 3 sem of ntc thick film thermistor layers sintered in air at 850 °c / 10 min: cu0.25ni0.5zn1.0mn1.25o4 and cu0.4ni0.5mn2.1o4: (a and b) sample surfaces, respectively, and (c and d) sample cross sections, respectively.

fig. 4 ntc behavior of micro/nanostructured thick film thermistors based on nimn2o4 partly substituted with cuo and zno.

table 3 electronic properties of ntc thermistor nanostructured thick films

thermistor composition | rsq [mω/sq] | b [k] | ea [ev]
cu0.2ni0.5zn1.0mn1.3o4 | 1.3 | 3356 | 0.294
cu0.25ni0.5zn1.0mn1.25o4 | 1.2 | 3294 | 0.288
cu0.4ni0.5mn2.1o4 | 0.39 | 2915 | 0.255

the electrical surface resistance rsq (sheet resistance) of the thick film thermistors was measured on a rectangular resistor geometry of 2.5 × 2.5 mm printed on a pdag electrode matrix. thick films printed from pure nimn2o4 thermistor paste composed of powder of about 0.9 micron have an electrical resistance more than 10 times higher. the resistances were measured at room temperature (20 °c). the exponential factor b of the thick film thermistors was determined from the resistances r20 and r30 measured at 20 and 30 °c in a climatic chamber.
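the two-point determination of b can be sketched in python by inverting the exponential law r = r0 exp[b(1/t - 1/t0)] for two measured resistances. the function names and the synthetic resistance values are illustrative assumptions; the conversion of b to an activation energy uses the relation ea = b·k with the boltzmann constant expressed in ev/k.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def b_factor(r1, t1_kelvin, r2, t2_kelvin):
    """Exponential factor b from two (resistance, temperature) points:
    b = ln(r1 / r2) / (1/t1 - 1/t2)."""
    return math.log(r1 / r2) / (1.0 / t1_kelvin - 1.0 / t2_kelvin)


def activation_energy_ev(b):
    """Activation energy ea = b * k, in electron volts."""
    return b * K_BOLTZMANN_EV


# Synthetic check: build r30 from an assumed b, then recover it.
T20, T30 = 293.15, 303.15          # 20 and 30 degrees Celsius in kelvin
B_ASSUMED = 3356.0                 # one of the b values listed in table 3
R20 = 1.0e6                        # hypothetical resistance at 20 C, ohms
R30 = R20 * math.exp(B_ASSUMED * (1.0 / T30 - 1.0 / T20))

b_recovered = b_factor(R20, T20, R30, T30)
print(f"b = {b_recovered:.1f} K")
print(f"ea = {activation_energy_ev(b_recovered):.3f} eV")
```

because only the ratio r20/r30 enters the formula, the result does not depend on the sample geometry, which is why b can be taken as a property of the paste rather than of a particular printed resistor.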
the activation energy is defined as $E_a = B \cdot k$, where $k$ is the boltzmann constant.

4. thick film thermistor geometries

the thermistor resistance $R$ is a complex function of sheet resistivity, geometry (shape and size), temperature $T$ and time $t$. the sheet resistivity $\rho(k)$ changes with the inter-electrode spacing $k$ due to the electrode effect, i.e. the diffusion of the conductor layer into the thermistor layer. far enough from the electrodes ($k$ of a few mm) there is practically no diffusion of the metal electrodes into the resistive layer at the sintering conditions of 850 °C / 10 min, so the sheet resistivity is constant and is denoted $\rho_{bulk}$. this is not, however, the nominal resistance of the thermistor paste: by convention, a rectangular resistor geometry of 2.5 × 2.5 mm (du pont test resistor) is used to determine the nominal resistance of a paste, electrode effect included. in addition, inhomogeneities in layer deposition (thickness deformations) occur during printing and can vary the resistance along the resistor [65]. the ideal resistance depends on the resistor length $l$, width $w$, thickness $d$ and number of layers $n$, and it can be easily calculated. the ntc behavior contributes an exponential factor $A\exp(B/T)$, where $B$ is the exponential temperature coefficient and $A$ is a constant. moreover, the thermistor resistance depends on time through $f(t)$: it is an increasing linear function for the first few seconds, and once heating exceeds cooling the ntc effect sets in and the resistance becomes a decreasing function of time. finally, the equation which describes the thermistor resistance is:

$$R = \rho(k)\, R(l,w,d,n)\, A e^{B/T} f(t) \qquad (1)$$

the most difficult part to model is $f(t)$, as the heat transfer from a thick film thermistor to air depends on the shape and size of the thermistor, which differ from one application to another. rectangular thermistors measure temperature at a defined point, while surface planar thermistor constructions can measure temperature radiation flux, the average temperature of a surface, heat loss in fluids, etc. for example, different planar thick film thermistor constructions (rectangular, sandwich, multilayer, segmented and interdigitated) are given in figure 5. their ideal resistance is modeled as $R(l,w,d,n)$ and their sheet resistivity as $\rho(k)$, where $k$ is the electrode spacing [66].

fig. 5 different thick film thermistor constructions (top view and cross section): rectangular, sandwich, multilayer, segmented and interdigitated, respectively. gray area: pdag conductive paste; black: ntc thermistor layer.

the sheet resistivity $\rho(k)$ changes with the inter-electrode spacing $k$ as given in figure 6. the highest electrode effect on the sheet resistivity $\rho(k)$ occurs in the sandwich, multilayer and segmented constructions. the electrode spacing in that case is only 30-33 microns (three sequentially printed and fired thermistor layers), while in the rectangular and interdigitated constructions the spacing is 1 mm or more. the sheet resistivity is $\rho$ = 20-32 Ωm for low $k$ and $\rho$ = 275-285 Ωm for $k$ > 1 mm, respectively.

fig. 6 the sheet resistivity $\rho$ [Ωm] of thick film thermistors versus inter-electrode spacing $k$ [mm]. the dashed line shows experimental data measured at room temperature and the solid line is the modeled (fitted) $\rho(k)$, determined by an exponential function.

the simple modeling of the sheet resistivity is done using the following equation:

$$\rho(k) = \rho_{bulk}\left(1 - e^{-(k/k_0)^2}\right) \qquad (2)$$

where for $k = 0$, $\rho = 0$; for $k \gg k_0$, $\rho = \rho_{bulk}$; and for $k = k_0 = 1$ mm, $\rho = \rho_{bulk}(1 - 1/e)$. the modeled data differ from the experimental data (figure 6) for thermistor layer thicknesses below 30 microns: the sheet resistivity never reaches zero, as the diffusion of the pdag conductor into the ntc thermistor layer is limited by the finite porosity of the thermistor microstructure (electrode effect saturation). the thick film segmented thermistor is a novel construction developed for heat loss sensor applications, as it has a temperature gradient along the fluid flow. its equivalent electrical scheme consists of serial and parallel resistances $R_s$ and $R_p$ and serial and parallel capacitances $C_s$ and $C_p$, arranged between the bottom and top electrodes and between neighbouring electrodes in the same row of electrodes (figure 7). different thick film segmented thermistors are given in figure 8.

fig. 7 equivalent electrical scheme of the segmented thick film thermistor construction: $R_s$: serial resistor, $C_s$: serial capacitance, $R_p$: parallel resistor, $C_p$: parallel capacitor.

fig. 8 different thick film segmented thermistors: 5 W (76.7 × 12.7 mm) and 2 W (51 × 6.35 mm) printed from nimn2o4 paste, and 1 W (25.4 × 6.35 mm) printed from the modified paste cu0.25ni0.5zn1.0mn1.25o4.

the total resistance $R$ of a thick film segmented thermistor in the dc regime is given by $R = 2n \cdot R_s = \rho \cdot 2n \cdot d / ((l/3) \cdot w)$, where $w$ is the electrode width, $l$ the electrode length, $l/3$ the electrode spacing and $n$ the number of top electrodes, as given in figure 5; the parallel resistance $R_p \gg R_s$ is neglected. in the dc regime only the resistances are active, while in the ac regime both resistances and capacitances are active and form a low-pass filter [67, 68]. in the segmented thermistor construction $R_p \gg R_s$ and $C_s \gg C_p$.
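the sheet resistivity model and the dc resistance of the segmented construction can be evaluated numerically as sketched below. the sketch assumes that equation (2) has the form ρ(k) = ρbulk·(1 - exp(-(k/k0)²)), which reproduces the three limiting cases listed in the text, and uses the dc expression R = ρ·2n·d / ((l/3)·w). all function names and sample dimensions are hypothetical.

```python
import math


def sheet_resistivity(k_mm, rho_bulk, k0_mm=1.0):
    """Modeled sheet resistivity rho(k) for inter-electrode spacing k [mm]."""
    return rho_bulk * (1.0 - math.exp(-((k_mm / k0_mm) ** 2)))


def segmented_dc_resistance(rho, n_top, d, l, w):
    """DC resistance R = 2n * Rs of a segmented thermistor.

    rho   -- sheet resistivity [Ohm*m]
    n_top -- number of top electrodes
    d     -- thermistor layer thickness [m]
    l, w  -- electrode length and width [m]; l/3 is the electrode spacing
    """
    return rho * 2 * n_top * d / ((l / 3.0) * w)


# Hypothetical numbers chosen only to exercise the formulas.
rho_at_k0 = sheet_resistivity(1.0, rho_bulk=280.0)
r_total = segmented_dc_resistance(rho=25.0, n_top=4, d=30e-6, l=3e-3, w=6e-3)
```

because the parallel branch is neglected (Rp much larger than Rs), the result is an estimate for the dc regime only.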
the voltage applied to a segmented thermistor is distributed over its segments (cells) in accordance with their $R_s$ values.

5. thick film thermistor sensors and systems

thermistor temperature sensors generate output signals in one of two ways:
1. through a change in output voltage (at constant current),
2. through a change in resistance of the sensor's electrical circuit.

there are two sensing methods, contact and non-contact. in the contact method the sensor is in direct physical contact with the object to be sensed, and it is used to monitor solids, liquids and gases over a wide range. the non-contact method interprets the radiant energy of a heat source, emitted in the electromagnetic spectrum, and is used to monitor non-reflective solids and liquids (thermistor bolometers). temperature is a scalar quantity that determines the direction of heat flow between two bodies. temperature measurement and control by thermistors relies on the steinhart-hart equation [69]:

$$R(T) = \exp\left(A + \frac{B}{T} + \frac{C}{T^2} + \frac{D}{T^3}\right) \quad \text{or} \quad \frac{1}{T} = A + B\ln R + C(\ln R)^3 \qquad (3)$$

where the constants $A$, $B$, $C$ (and $D$ in the first form) are determined experimentally. in the first approximation, two thermistor resistances, $R_0$ at $T_0 = 293.16$ K and $R_1$ at temperature $T_1$, are related by the following equation:

$$R_1 = R_0 \exp\left(\frac{B\,(T_0 - T_1)}{T_0 T_1}\right) \quad \text{or} \quad T_1 = \frac{B\,T_0}{B + T_0 \ln(R_1/R_0)} \qquad (4)$$

the unknown temperature $T_1$ is determined by measuring $R_1$ and using the ratio $R_1/R_0$ in equation (4). methods for calibration and linearization of ntc thermistors for high precision temperature measurements are given in the literature [70-72]. a temperature gradient (in air, water or ground) can be measured by segmented thick film thermistors using the inner electrodes (see figure 7): for resistances $R_1$, $R_2$, $R_3$ and $R_4$ measured on the segments, for example, the temperatures $T_1$, $T_2$, $T_3$ and $T_4$ are determined using equation (4), respectively. the first application of the segmented thermistor as a heat loss sensor was in air flow measurement, e.g. in anemometers [73-76].
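the conversion from a measured resistance to temperature, and its use on the four segments of a segmented thermistor, can be sketched as follows. the sketch assumes equation (4) in the form T1 = B·T0 / (B + T0·ln(R1/R0)), i.e. resistance falling with rising temperature for positive B; the function names, the reference resistance and the segment readings are hypothetical.

```python
import math

T0_KELVIN = 293.16  # reference temperature used in the text


def resistance_at(t1, r0, b, t0=T0_KELVIN):
    """Resistance at temperature t1 [K], given r0 at t0 and factor B [K]."""
    return r0 * math.exp(b * (t0 - t1) / (t0 * t1))


def temperature_from_resistance(r1, r0, b, t0=T0_KELVIN):
    """Temperature [K] from a measured resistance r1."""
    return b * t0 / (b + t0 * math.log(r1 / r0))


def segment_temperatures(resistances, r0, b, t0=T0_KELVIN):
    """Temperatures of all segments, e.g. R1..R4 read on the inner electrodes."""
    return [temperature_from_resistance(r, r0, b, t0) for r in resistances]


# Hypothetical readings: four segments, each 2 K warmer than the previous one.
true_temps = [293.16, 295.16, 297.16, 299.16]
readings = [resistance_at(t, r0=1000.0, b=3356.0) for t in true_temps]
recovered = segment_temperatures(readings, r0=1000.0, b=3356.0)
```

the single-B approximation is only a first-order model; for high precision the steinhart-hart form with experimentally fitted constants would be used instead.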
after that, a three dimensional anemometer comprising thick film segmented thermistors was formed using three uniaxial anemometers positioned under compasses to measure wind velocity as a three dimensional vector having projections {x, y, z} on the x, y and z axes, respectively. the modulus of the wind velocity vector $|v|$ was calculated from the projections as $|v| = (x^2 + y^2 + z^2)^{1/2}$, and the angles of the wind vector to the axes were calculated from arctg{(x/v), (y/v), (z/v)}. the three dimensional anemometer construction is given in figure 9, while the response of a uniaxial anemometer, i.e. the current $I_{th}$ of the self-heated thermistor as a function of wind velocity $v$, is given in figure 10 for air at room temperature.

fig. 9 three dimensional anemometer comprising thick film segmented thermistors: thick film segmented thermistors, top view (left); anemometer construction (middle): x, y, z: three uniaxial anemometers positioned under compasses, t: input air thermometer, v: humidity sensor with thermistors; cross section of a uniaxial anemometer (right): (+, −) power supply, u1, u2: inner electrodes, 1: air flow reductor, 2: sensor housing (tube), 3: segmented thermistor.

the segmented thermistor is self-heated at constant voltage, and the wind causes a heat loss on its surface, i.e. an increase of its resistance, which in turn lowers the self-heating current. the wind direction in uniaxial anemometers is determined from the voltage gradient, using the voltage difference ($U_1 - U_2$) on the inner electrodes for the two halves of the self-heated segmented thermistor (figure 9, right).

fig. 10 the self-heating current $I_{th}$ [mA] of the segmented thermistor in a uniaxial anemometer as a response to the input air velocity $v$ [m/s] (measured at room temperature in an aerodynamic tunnel; legend: u-27 v).
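combining the three uniaxial readings into a wind vector can be sketched as below. the modulus follows the formula in the text. for the angles, this sketch uses direction cosines (arccos of each projection divided by the modulus), which is an assumption swapped in for illustration and not necessarily the authors' exact procedure; the function name and sample values are hypothetical.

```python
import math


def wind_vector(x, y, z):
    """Return (speed, angles) for velocity projections x, y, z [m/s].

    angles are the angles in degrees between the wind vector and the
    x, y and z axes, computed from direction cosines (an assumption).
    """
    speed = math.sqrt(x * x + y * y + z * z)
    if speed == 0.0:
        # No wind: direction is undefined.
        return 0.0, (None, None, None)
    angles = tuple(math.degrees(math.acos(c / speed)) for c in (x, y, z))
    return speed, angles


# Hypothetical projections from the three uniaxial anemometers.
speed, (ax, ay, az) = wind_vector(3.0, 4.0, 0.0)
```

the sign of each projection, which the text obtains from the voltage difference on the inner electrodes, is what allows the angle to cover the full range from 0 to 180 degrees.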
the heat loss volume flow sensor for water (water flowmeter) was formed with two segmented thermistors, as a micro flowmeter and as a flowmeter for stationary flow [77-78]. the water flowmeter for non-stationary flow, intended for measuring the current volume flow in waterworks, was formed of segmented thermistors with reduced dimensions printed on alumina from cu0.25ni0.5zn1.0mn1.25o4 thermistor paste (figure 11). it also consists of two segmented thermistors: the first (r) measures the input water temperature from $R(T)$ using equation (4), and the second is a thermistor self-heated at constant voltage ($U$). the self-heating current $I$ changes with the water volume flow $Q$ as $I = f(Q, T)$, where the input water temperature $T$ is a parameter [79].

fig. 11 flowmeter for water based on heat loss of segmented thermistors: segmented thermistors with reduced dimensions (left) and cross section of the flowmeter (right): the cold thermistor (r) serves as a thermometer for measuring the input water temperature $T$, and the thermistor self-heated at constant voltage $U$, with self-heating current $I$, measures the volume water flow $Q$.

the response of the flowmeter to the current water flow $Q$ is given in figure 12: the left diagram shows the response of an ultrasonic flowmeter used as a reference, and the right diagram shows the thermal flowmeter for water with segmented thermistors. two impulses were generated, q = 0.15 l/s for 30 s, a 30 s pause, and q = 0.15 l/s for 30 s, by fast switching of the water flow using lever valves.

fig. 12 the electrical response of the thermal flowmeter to the current volume water flow: impulse water function input q = 0.15 l/s and q = 0.2 l/s, time duration 15 s. left diagram: measured on the reference ultrasonic flowmeter; right diagram: measured on the thermal flowmeter comprising a segmented thermistor with reduced dimensions.
their responses are given as the electrical current $I$ of the ultrasonic flowmeter (output amplifier) and the self-heating current $I$ of the ntc segmented thermistor, respectively. the input water temperature is t = 14.35 °C and the thermistor supply voltage is u = 14.7 V. the start and stop of the water flow are detected from the voltage gradient measured on the inner electrodes of the thermistor. moreover, thick film segmented thermistors can also be applied as gradient sensors for measuring heat transfer from air to ground. other possible applications of thick film thermistors are thick film thermistor bridges, hybrid circuits with thermistors, bolometers for measuring radiated heat, and thermistor arrays for measuring heat transfer and temperature or heat homogeneity, etc.

6. discussion and conclusion

metal oxide thermistors have higher sensitivity compared to other temperature dependent devices (table 1), higher accuracy and stability, shorter response time, smaller size, and a lower price/performance ratio. therefore they are more suitable for application in different types of electronics. both ptc and ntc thermistors have been widely produced as disc and chip shaped electrical components for many years, but ntc thermistors are more often applied in temperature measurements, as they have a moderate exponential dependence on temperature (figure 1). ntc thermistors cover a wide operating temperature range from 100 to 1200 °C using different metal oxides (table 2). thick film thermistors with rectangular geometry (or flip chip) have appeared recently as smd (surface mount device) commercial electrical components (tateyama kagaku device technology co., ltd.). the main advantages of thick film chip thermistors are laser trimming of resistance and faster response to temperature change, due to the thermal conductivity of the alumina used as a substrate for the thick film thermistor layer.
another advantage of thick film thermistors is their sensitivity, due to the low thickness of the thermistor layer (in microns) and the low heating power needed for a resistance change (a few mW). thick film thermistors are designed for application in microelectronics, e.g. for temperature compensation of other devices and for temperature sensing. their delay is as low as a few seconds (transition from the initial linear to the nonlinear regime), while sintered disc and chip thermistors are bulky and have a much longer delay time. nickel manganese and other ntc thermistors which operate near room temperature (low temperature range) are often used in electronics as leaded or leadless electrical components for temperature measurements. thick film thermistors are used much less as hybrid components: they are used mainly as custom designed hybrid planar components for thermal sensors. high temperature thick film thermistors are very rarely used for sensor applications above 300-400 °C. in practice, thick film thermistor pastes appear as sensor pastes or as resistive pastes: they are produced for low temperature range applications (-50 to 130 °C), with nominal square resistances of 1, 10 or 100 kΩ/□ at room temperature (esl, koartan, heraeus). the modified nickel manganese thermistor pastes developed recently and presented in this work also belong to the custom designed sensor pastes aimed at temperature sensors: they have a mesoporous structure and a moderate ntc slope (see figures 2 and 3), and enable a resistance around 10 times lower than that of pure nickel manganite thermistor paste (see table 3). different thick film thermistor devices (planar geometries) were analyzed and optimized to achieve a suitable resistance and power dissipation, and the faster response needed for heat loss sensors.
optimization of resistance included the influence of electrode shape, size and arrangement, of electrode spacing, and of the diffusion of the metal electrode into the thermistor layer, i.e. the electrode effect on sheet resistance (figures 5 and 6). mutual comparison of thick film thermistor geometries shows that the sandwich and multilayer geometries are “low ohmic” while the rectangular and interdigitated geometries are “high ohmic” (Ω and MΩ, respectively). a new geometry called the segmented thermistor appeared as a “moderate ohmic” geometry (kΩ), the most suitable for heat loss sensors. segmented thermistors were designed, realized and applied for heat loss from 1 to 5 W (different size and number of segments, as given in figure 8). equations (1) and (2), given above in part 4, are the basis for resistance calculation, modeling, design and simulation, e.g. for predicting the properties of new geometries before thick film thermistor printing, sintering and measuring. the three-axis anemometers and water flow sensors presented above are fully thermal devices based on self-heated thick film thermistors and the heat loss principle. compared to electromechanical or ultrasonic flowmeters they are simpler: they do not contain amplifiers or moving parts, and they are smaller and cheaper. the aim was to develop intelligent sensors as the second step of research and to introduce intelligent functions such as auto-range, auto-calibration, auto-correction of delay, auto-display of measured and calculated values, selection of continuous or switching operating mode, etc. finally, summing up the recent advances in thick film thermistors, three tendencies can be noticed: 1. new thermistor materials development, 2. new custom designed thermistor pastes development (micro/nano structured and doped with different oxides including rare earths) and 3.
new thick film geometries development (planar constructions) aimed for thermistor sensors and systems. all three tendencies combined lead to novel thick film thermistors e.g. to new applications which fit the customer requirements. the new applications such as thick film gradient temperature sensor, temperature sensor array, bolometer with high temperature thermistor are partly in realization and their appearance is expected very soon in near future. acknowledgement: the paper is a part of the research done within the project iii 45 007 financed by the ministry of education, science and technological development of serbia. references [1] p. r. n. childs, j. r. greenwood, c. a. long, “review of temperature measurement”, review scientific instrumentation, vol. 71, no. 8, pp. 2959-2965, 2000. [2] t. d. mc gee, principles and methods of temperature measurement, john wiley,1988, pp. 2-21 [3] n.g. lewis, m. randall, thermodynamics, 2 nd edition, mcgraw-hill, new york, 1961, pp. 378-379. [4] d. sherry, “thermoscopes, thermometers, and the foundations of measurement”, studies in history and philosophy of science, vol. 42, pp. 509–524, 2011. [5] p. coates, d. lowe, the fundamentals of radiation thermometers, chapter 1:the basis of temperature measurement, crc press 2016, pp. 10-30. [6] m. j. moran, h. n. shapiro, fundamentals of engineering thermodynamics, chapter 1, john wiley & sons, 2006, pp. 10-25. [7] j. g. webster, h. eren, measurement, instrumentation and sensor handbook, thermal and temperature measurement, chapter 7, 2 nd edition, crc press 2014, pp. 65-78. [8] b. l. hunt, “the early history of the thermocouple”, platinum metals rev., vol. 8, no. 1, pp. 23-28, 1964. [9] y.s. touloukian, d.p. dewitt p.d., thermophysical properties of matter, tprc series, vol. 7 thermal radiative properties, metallic element and alloys, ifi/plenum ny, 1970, pp. 159-168. [10] r.e. bentley, handbook of temperature measurement, vol. 3: theory and practice of thermoelectric thermometry. 
springer-verlag singapore pte. ltd., 1998, pp. 24-36.
[11] r.a. felice, “pyrometry for liquid metals”, advanced materials & processes, vol. 166 (7), asm international, pp. 31-33, 2008.
[12] r. p. benedict, fundamentals of temperature, pressure, and flow measurements, third edition, chapter 8. optical pyrometry, john wiley 1984, pp. 130-145.
[13] e.d. macklean, thermistors, electrochem. pub., glasgow, 1979, pp. 5-22.
[14] f. j. hyde, thermistors, first edition, published by iliffe, 1971, pp. 2-15.
[15] d. r. white, “temperature errors in linearizing resistance networks for thermistors”, international journal of thermophysics, vol. 36, no. 12, pp. 3404-3420, 2015.
[16] p. umadevi, c. l. nagendra, “preparation and characterization of transition metal oxide microthermistors and their application to immersed thermistor bolometer infrared detectors”, sensors and actuators a: physical, vol. 96, no. 2-3, pp. 114-124, 2002.
[17] h. zumbahlen, linear circuits design handbook, chapter 3-2 temperature sensors: thermistors, analog devices -newnes, 2008, pp. 231-240.
[18] w. kester, j. bryant, w. jung, sensor signal conditioning, temperature sensors, chapter 7, analog devices, pp. 1-38, 2000.
[19] d. d. pollock, thermocouples: theory and properties, crc press, 1991, pp. 181-195.
[20] b. gosselin jr, “ntc thermistors versus voltage output ic temperature sensors”, texas instruments ecn: 04/02/2013, pp. 1-3.
[21] t.
kuglestadt, “semiconductor temperature sensors challenge precision rtds and thermistors in building automation”, texas instruments: application report: snaa267–04 2015, pp. 2-10. [22] t.g. nanov, s.p. yordanov, ceramic sensors: technology and applications, chapter 5 thermistors, crc press, 1996, pp. 193-203. [23] c. ma, y. liu, y. lu, h. qian, “preparation and electrical properties of ni0.6mn2.4xtixo4 ntc ceramics”, journal of alloys and compounds, vol. 650, pp. 931-935, 2015. [24] j. park, “microstructural and electrical properties of y0.2al0.1mn0.27−xfe0.16ni0.27−x(cr2x)oy for ntc thermistors”, ceramics international, vol. 41, no. 5, pp. 6386-6390, 2015. [25] o. shpotyuk, a. kovalskiy, o. mrooz, l. shpotyuk, v. pechnyo, s. volkov, “technological modification of spinel-based cuxni1−x−yco2ymn2−yo4 ceramics”, journal of the european ceramic society, vol. 21, no. 1112, pp. 2067–2070, 2001. [26] r. metz, “electrical properties of n.t.c. thermistors made of manganite ceramics of general spinel structure: mn3−x−x′mxnx′o4 (0 ⩽ x + x′ ⩽ 1; m and n being ni, co or cu). aging phenomenon study”, journal of materials science, vol. 35, pp. 4705–4711, 2000. [27] r.c. buchanan, ceramic materials for electronics: processing, properties and applications (electrical engineering & electronics), marcel dekker inc; enlarged 2nd edition, 1986, pp. 125-162. [28] m. vakiv, o. shpotyuk, o. mrooz, i. hadzaman, “controlled thermistor effect in the system cuxni1-xyco2ymn2-yo4”, journal of the european ceramic society, vol. 21, pp. 1783–1785, 2001. [29] e. s. na, u. g. paik, s. c. choi, “the effect of a sintered microstructure on the electrical properties of a mn-co-ni-o thermistor”, journal of ceramic processing research, vol. 2 (1), pp. 3134, 2001. [30] h. zhang, a. chang, c. peng, “preparation and characterization of fe 3+ -doped ni0.9co0.8mn1.3-xfexo4 (0 < x < 0.7) negative temperature coefficient ceramic materials”, microelectronic engineering, vol. 88, no. 9, pp. 2934–2940, 2011. 
[31] m. l. m. sarrión, m. m. sánchez, “preparation and characterization of thermistors with negative temperature coefficient, nixmn3–xo4(14), requires a lot of processing time. consequently, the complete set of n-variable bent boolean functions is known only for n ≤ 4 [3]. the general number of bent functions is an open problem. note that the number of bent functions increases rapidly with increasing n. there are 8 bent functions in 2 variables, 896 bent functions in 4 variables, 5,425,430,528 bent functions in 6 variables, and 99,270,589,265,934,370,305,785,861,242,880 bent functions in 8 variables [5]. for example, a complete enumeration of the bent functions of 8 variables (approximately $8.57 \times 10^{-44}$ percent of all functions), using the maximal algebraic degree, took approximately 50 personal computers running for 3 months [6]. for this reason, identifying new characteristics can help in more efficient detection of bent functions.

one way to represent a boolean function is with a binary decision diagram (bdd). bdds were originally introduced as an efficient support for the procedures and operations required in the synthesis and verification of logic circuits [7]. they are popular data structures, widely used in various other areas where manipulation of and computation with boolean functions is required. a bdd is a representation of a boolean expression using a rooted directed acyclic graph that consists of terminal nodes (with constant values 0 or 1) and non-terminal nodes (marked with variables). a reduced ordered binary decision diagram (robdd), which is a widely used data structure in practice, is a bdd with a particular variable order in which redundant nodes are removed and isomorphic subtrees are shared. robdds are derived by the reduction of the corresponding binary decision tree (bdt). robdds provide a compact representation allowing one to process large boolean functions efficiently in terms of space and time [7].
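the share of bent functions among all boolean functions quoted above can be checked directly from the listed counts. the sketch below is only an arithmetic illustration, assuming the total number of n-variable boolean functions is 2^(2^n); the dictionary and function names are hypothetical.

```python
from fractions import Fraction

# Numbers of bent functions for n = 2, 4, 6, 8 as quoted in the text.
BENT_COUNTS = {
    2: 8,
    4: 896,
    6: 5_425_430_528,
    8: 99_270_589_265_934_370_305_785_861_242_880,
}


def bent_percentage(n):
    """Percentage of n-variable boolean functions that are bent."""
    total = 2 ** (2 ** n)  # number of all n-variable boolean functions
    return float(Fraction(BENT_COUNTS[n] * 100, total))


percentages = {n: bent_percentage(n) for n in BENT_COUNTS}
```

exact rational arithmetic is used before the final conversion to a float, so the very small value for 8 variables is not lost to rounding.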
the most important characteristic of an robdd is its size, i.e. the number of non-terminal nodes in the graph representation. this parameter is critical since the memory requirement during the construction of an robdd is directly proportional to the size [2]. besides the size, the following basic parameters are most often considered: the number of paths, the width, and the average path length. there is a direct correspondence between these characteristics and the basic characteristics of a logic network derived from an robdd. for example, the size of an robdd corresponds to the number of elementary modules in the corresponding realization of a logic network [1]. the path-related characteristics directly correspond to the interconnection complexity of the logic realization. therefore, this paper studies the basic robdd characteristics of bent functions.

previously, shafer [9] analyzed the robdd characteristics of disjoint quadratic bent functions, symmetric bent functions, and homogeneous bent functions of 6 variables. specifically, disjoint quadratic bent functions were found to have size 2n − 2 for functions of n variables, symmetric bent functions have size 4n − 8, and all homogeneous bent functions of 6 variables were shown to be p-equivalent. two functions are p-equivalent iff they have identical bdds for distinct variable orderings [9].

a study of binary decision diagram characteristics of bent boolean functions

in this paper, however, the complete set of bent functions of 4 variables is analyzed. for bent boolean functions of 6 or 8 variables, only appropriate subsets of bent functions are analyzed, and the analysis includes not only the size but also the number of paths, the width, and the average path length. a decision diagram experimental framework has been used to implement a program for the calculation of the robdd characteristics. for the discovery of bent functions, it uses the maximal algebraic degree as the search space boundary.
also, it uses the implementation of the discovery of bent functions using reed-muller (rm) subsets, which is described in [10]. experimental results show interesting robdd characteristics of bent functions. for each robdd characteristic, the range of values is determined. additionally, this paper also investigates the same robdd characteristics of non-bent functions of n variables having hamming weight equal to $2^{n-1} \pm 2^{n/2-1}$. these additional characteristics confirm that the experimental results can be used to create methods for discovering bent functions using robdds.

this paper is organized as follows: section 2 briefly introduces the theoretical background on bent functions and their discovery in the rm domain. section 3 discusses robdds and their characteristics. the experimental results are shown and discussed in section 4. the closing section 5 summarizes the results of the research reported in this paper.

2. background theory

if a function $f$ and its walsh spectrum $S_{f,W}$, in matrix notation, are represented by the vectors $F = [f_0, f_1, \ldots, f_{2^n-1}]^T$ and $S_{f,W} = [s_0, s_1, s_2, \ldots, s_{2^n-1}]^T$ respectively, the walsh transform is defined by the walsh matrix $W(n)$ [11]:

$$S_{f,W} = W(n)\,(-1)^F \qquad (1)$$

$$f(x_1, \ldots, x_n) = 2^{-n}\, X(n)\, S_{f,W} \qquad (2)$$

where

$$X(n) = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & 1 - 2x_i \end{bmatrix}, \qquad (-1)^F = [(-1)^{f_0}, (-1)^{f_1}, \ldots, (-1)^{f_{2^n-1}}]^T \qquad (3)$$

and

$$W(n) = \bigotimes_{i=1}^{n} W(1), \qquad W(1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad (4)$$

where $W(1)$ represents the basic walsh transform matrix and $\otimes$ is the kronecker product. the nonlinearity of an n-variable boolean function $f$ is the (hamming) distance of $f$ from the set of all n-variable affine functions [3]. boolean functions achieving maximal nonlinearity are called bent functions. every bent function has a hamming weight (number of times it takes the value 1) of $2^{n-1} \pm 2^{n/2-1}$. for example, bent functions of 4 variables have hamming weight 6 or 10, those of 6 variables have 28 or 36, and those of 8 variables have 120 or 136.
288 m.
radmanović

a boolean function $f$ in (1,−1) encoding is bent if all walsh spectral coefficients $S_{f,W}$ have the same absolute value $2^{n/2}$. the fast transform algorithm can be used to compute the coefficients of the walsh spectrum. the recursive definition of the walsh transform matrix, expressed in eq. (4), is the basis for the definition of the fast walsh transform algorithm, similar to the fast fourier transform (fft) algorithm. the computation consists of the repeated application of the same “butterfly” operations, whose structure is determined by the basic transform matrices [10], [11]. figure 1 shows the “butterfly” operation for the walsh transform matrix. the “butterfly” operations are performed in each step over a different subset of the data. figure 1 also shows the flow graph of the fast walsh transform algorithm of the cooley-tukey type for computation of the walsh spectrum of a 2-variable boolean function $f$ given by the truth vector $F = [f(0), f(1), f(2), f(3)]^T$. this algorithm is heavily used for testing the bentness of all boolean functions in some defined search space.

fig. 1 the flow graph of the fast walsh transform for a 2-variable boolean function

for example, bentness testing for a function of 4 variables $f(x_1,x_2,x_3,x_4)$, given by the truth vector $F = [1,0,1,1,0,1,0,0,0,1,0,0,0,1,0,0]^T$, with (1,−1) encoding, can be carried out as shown in figure 2. note that the absolute values of all walsh coefficients are equal to 4. a positive polarity reed-muller form is an exclusive-or of and product terms in which each variable appears uncomplemented.
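the bentness test described above (fast walsh transform of the (1,−1)-encoded truth vector, followed by a check that every coefficient has absolute value 2^(n/2)) can be sketched as follows. this is a minimal illustration, not the implementation used in the paper; the function names are hypothetical.

```python
def fast_walsh_transform(values):
    """In-order fast Walsh transform built from repeated butterfly steps."""
    s = list(values)
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for i in range(start, start + h):
                a, b = s[i], s[i + h]
                s[i], s[i + h] = a + b, a - b  # butterfly given by W(1)
        h *= 2
    return s


def is_bent(truth_vector):
    """True if all Walsh coefficients of (-1)^F have absolute value 2^(n/2)."""
    size = len(truth_vector)
    n = size.bit_length() - 1
    if size != 1 << n or n < 2 or n % 2:
        return False  # bent functions exist only for an even number of variables
    spectrum = fast_walsh_transform([1 - 2 * bit for bit in truth_vector])
    target = 1 << (n // 2)
    return all(abs(c) == target for c in spectrum)


# Truth vector of the 4-variable example from the text.
F = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
spectrum = fast_walsh_transform([1 - 2 * bit for bit in F])
```

the transform needs n·2^(n-1) butterfly operations, instead of the 2^(2n) multiplications of a direct matrix-vector product.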
any boolean function $f$ can be represented by the positive polarity rm form, in matrix notation defined as [11]:

$$S_{f,RM} = R(n)\,F \qquad (5)$$

$$f(x_1, \ldots, x_n) = X(n)\, S_{f,RM} \qquad (6)$$

where

$$X(n) = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i \end{bmatrix} \qquad (7)$$

and

$$R(n) = \bigotimes_{i=1}^{n} R(1), \qquad R(1) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad R(1) = (R(1))^{-1}, \qquad R(n) = (R(n))^{-1} \qquad (8)$$

where addition and multiplication are modulo 2, $R(n)$ is the positive reed-muller transform matrix of order $n$, and $R(1)$ is the basic positive reed-muller transform matrix.

fig. 2 example of the bentness testing for a 4-variable boolean function in the (1,−1) encoding using the fast walsh transform

the elements of $S_{f,RM} = [a_0, a_1, a_2, a_{12}, a_3, a_{13}, a_{23}, a_{123}, \ldots, a_{12\ldots n}]$ are the coefficients in the positive polarity reed-muller (pprm) expression of a boolean function [11]:

$$f(x) = a_0 \oplus \sum_{i=1}^{n} a_i x_i \oplus \sum_{1 \le i < j \le n} a_{ij} x_i x_j \oplus \cdots \oplus a_{12\ldots n}\, x_1 x_2 \cdots x_n \qquad (9)$$

where $\Sigma$ denotes modulo 2 summation. the algebraic degree, or the order of nonlinearity, of a boolean function $f$ is the maximum number of variables in a product term with a non-zero coefficient $a_K$, where $K$ is a subset of {1,2,3,...,n}. when $K$ is the empty set, the coefficient is denoted $a_0$ and is called the zero-order coefficient. the coefficients of order 1 are $a_1, a_2, \ldots, a_n$, the coefficients of order 2 are $a_{12}, a_{13}, \ldots, a_{(n-1)n}$, and the coefficient of order $n$ is $a_{12\ldots n}$. the number of all coefficients of order $i$ is $\binom{n}{i}$. the pprm coefficients are divided into order groupings according to the number of ones in the binary representation of their index in the spectrum.

the algebraic degree of bent functions is at most $n/2$ for $n \ge 4$ [8]. thus, the maximal number of non-zero pprm coefficients of a bent function is $\sum_{i=0}^{n/2} \binom{n}{i}$. since the order of bent functions is limited, the number of non-zero pprm coefficients is also limited and the positions of the coefficients in the pprm spectrum are restricted.
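the pprm spectrum and the bound on the number of admissible coefficient positions can be computed as sketched below. the sketch assumes the usual in-place butterfly realization of the transform given by R(1) = [[1, 0], [1, 1]] over gf(2); the function names are hypothetical, and the example vectors are the 4-variable truth vector and pprm spectrum quoted in the text.

```python
from math import comb


def reed_muller_transform(vector):
    """Fast positive polarity Reed-Muller transform over GF(2).

    The transform matrix is self-inverse, so the same routine maps a truth
    vector to its PPRM spectrum and a PPRM spectrum back to the truth vector.
    """
    v = list(vector)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for i in range(start, start + h):
                v[i + h] ^= v[i]  # butterfly given by R(1), addition modulo 2
        h *= 2
    return v


def max_nonzero_pprm_positions(n):
    """Number of PPRM positions of order at most n/2 (n even)."""
    return sum(comb(n, i) for i in range(n // 2 + 1))


F = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
S_RM = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
```

restricting the search to spectra whose non-zero entries lie only in these positions, and applying the inverse transform to each candidate, is what limits the search space in the rm domain.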
these restrictions are the main reason why discovery is possible, since they reduce the search space for discovery in the reed-muller domain. as the size of the boolean function increases, the possible search space increases too. for the pprm transform, we need an inverse transform to get back from the reed-muller domain. since the reed-muller transform matrix $R(n)$ is a self-inverse matrix over gf(2), the forward and inverse transforms are given by the same matrix. figure 3 shows the flow graph of the fast inverse reed-muller transform algorithm of the cooley-tukey type for computation of the boolean function $f$ with the truth vector $F = [f(0), f(1), f(2), f(3)]^T$ from a pprm spectrum. this algorithm is heavily used for the discovery of bent functions among all boolean functions in the rm domain.

fig. 3 the flow graph of the fast inverse rm transform for a 2-variable boolean function

for example, the discovery of the bent function of 4 variables $f(x_1,x_2,x_3,x_4)$ with truth vector $F = [1,0,1,1,0,1,0,0,0,1,0,0,0,1,0,0]^T$, using the fast inverse rm transform of its spectrum $S_{f,RM} = [1,1,0,1,1,0,0,0,1,0,0,0,1,0,0,0]$, is shown in figure 4. the bentness testing for this truth vector, with (1,−1) encoding, is performed as shown in figure 2. black dots on the flow graph on the left side of figure 4 indicate the 11 possible positions for non-zero pprm coefficients. the number of these possible positions is calculated according to the following formula:

$$\sum_{i=0}^{4/2} \binom{4}{i} = \binom{4}{0} + \binom{4}{1} + \binom{4}{2} = 1 + 4 + 6 = 11 \qquad (10)$$

fig. 4 example of the discovery of the bent function of 4 variables, using the fast inverse rm transform

this means that there is one possible position for the non-zero coefficient of order 0, 4 possible positions for non-zero coefficients of order 1 and 6 possible positions for non-zero coefficients of order 2. the pprm expression for the function from figure 4 is:

$$f(x_1,x_2,x_3,x_4) = 1 \oplus x_1 \oplus x_2 \oplus x_4 \oplus x_1 x_2 \oplus x_3 x_4 \qquad (11)$$

fast transform algorithms are heavily used for discovering bent functions in the rm domain.

3. robdd

a bdd is a directed acyclic graph that contains non-terminal nodes, two terminal nodes, and edges. an robdd is a reduced bdd in which the nodes at the same level are labelled with the same variable [12]. the reduction is performed by sharing the isomorphic subtrees and removing the redundant data in the bdt using appropriately defined reduction rules [6]. non-terminal nodes are labeled with variables $x_i$ and have two outgoing edges. the outgoing edges are labeled ‘0’ and ‘1’ according to the values of the variable $x_i$. terminal nodes contain the function values ‘0’ and ‘1’. each truth table entry of a boolean function corresponds to the edge labels along a path from the root node to the corresponding terminal node. an example of the robdd representation of the function defined by the truth vector $F = [0,1,1,1,1,0,0,0,1,0,0,0,1,0,0,0]^T$ using the ordering $(x_1,x_2,x_3,x_4)$ is shown in fig. 5. as characteristics of the robdd, the following basic parameters are most often considered: the size, the number of paths, the width, and the average path length (apl) [13]. the robdd in the above example is efficient because it represents a truth vector with a high level of redundancy in a compact form using non-terminal nodes, as long as the data is encoded in such a way that the redundancy is exposed. in an robdd for a logic function $f$, the size of the robdd is the number of non-terminal nodes needed to represent it.
in the memory representation of the robdd, each non-terminal node requires an index and two pointers to the succeeding nodes.

fig. 5 the robdd representation of the function defined by $F = [0111\,1000\,1000\,1000]^T$

the width of an robdd is defined as the maximum number of nodes per level. the delay in a logic network is directly proportional to the number of levels of the robdd, which together with the width determines the surface area of the logic network [7]. a path in an robdd is the sequence of nodes connected by edges leading from the root node to a terminal node. the number of paths is the total number of distinct paths to any of the terminals. the number of paths influences the robdd complexity. a minimized disjoint-sum-of-product representation can directly be extracted from an robdd and leads to a small logic network [8]. the path length is the number of non-terminal nodes on the path. the apl is the arithmetic mean of the lengths of all possible bdd paths [13]. minimization of the apl reduces the logic network evaluation time. the characteristics of the robdd representation shown in fig. 5 are:

$$\begin{aligned} size(ROBDD(f)) &= 6 \\ \#paths(ROBDD(f)) &= 9 \\ width(ROBDD(f)) &= 2 \\ APL(ROBDD(f)) &= 3.66666 \end{aligned} \qquad (12)$$

4. experimental results

this section presents the basic robdd characteristics of bent functions with the initial order of variables as shown in fig. 5. in these experiments, the order of the variables in the robdd was not changed. the complete set of all bent functions is analyzed only for functions of 4 variables. because finding bent functions of 6 and 8 variables is very time-consuming, the complete set of all bent functions is not analyzed for these cases. for bent functions of 6 and 8 variables, robdd characteristics were analyzed on a set of 1 million and 10,000 bent functions, respectively.
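for 4 variables the complete set can be enumerated directly in the rm domain. the following python sketch is only an illustration under stated assumptions, not the implementation used in the experiments: the function names are hypothetical, non-zero pprm coefficients are allowed only at positions of order at most n/2, each candidate spectrum is mapped back with the fast inverse rm transform, and bentness is tested with a fast walsh transform of the (1,−1)-encoded truth vector.

```python
def rm_transform(vec):
    """Fast Reed-Muller transform over GF(2); it is its own inverse."""
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                out[i + h] ^= out[i]          # butterfly: XOR upper into lower
        h *= 2
    return out


def walsh_spectrum(truth):
    """Fast Walsh transform of the truth vector in (1,-1) encoding."""
    out = [1 - 2 * bit for bit in truth]
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def is_bent(truth):
    """A function of n variables (n even) is bent if all |W| equal 2**(n/2)."""
    n = len(truth).bit_length() - 1
    return all(abs(w) == 1 << (n // 2) for w in walsh_spectrum(truth))


def discover_bent(n):
    """Exhaustive search over PPRM spectra of order <= n/2 (practical for n = 4)."""
    positions = [i for i in range(1 << n) if bin(i).count("1") <= n // 2]
    found = []
    for mask in range(1 << len(positions)):
        spectrum = [0] * (1 << n)
        for k, pos in enumerate(positions):
            spectrum[pos] = (mask >> k) & 1
        truth = rm_transform(spectrum)
        if is_bent(truth):
            found.append(truth)
    return positions, found


positions, bent = discover_bent(4)
print(len(positions), len(bent))  # allowed coefficient positions, bent functions found
```

for n = 4 the search visits 2^11 candidate spectra. for 6 and 8 variables an exhaustive loop of this kind is not practical, which is consistent with the sampling approach described above.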
the following robdd characteristics are included in the analysis: the size (the number of nodes), the number of paths, the width, and the average path length. for discovery of bent functions of 4 variables, the maximal algebraic degree is used as the search space boundary. the implementation of the discovery of bent functions of 6 and 8 variables uses the rm subsets described in [9]. this implementation discovers a single random bent function at a time. additionally, these experiments show the number of non-bent functions of n variables having hamming weight equal to $2^{n-1} \pm 2^{n/2-1}$. only functions with this predefined hamming weight are presented, because they represent the search space when creating a potential method for discovering bent functions using robdd characteristics.

the program for the analysis of bent and specific non-bent functions was created using a decision diagram experimental framework. it is implemented as an extension of an existing bdd package using the c++ programming language. the bdd package follows all basic recommendations for programming bdd packages (unique table, operation table, garbage collector, swapping levels, etc.) [14], [15]. the experiments were performed on a pc pentium iv running at 3.66 ghz with 8 gb of ram. the bdd package performs operations using shared bdds, but in these experiments only shared bdds with one output were used. the size of the unique table and of the operation table was limited to 262,139 entries. garbage collection was activated when available memory ran low.

for each robdd characteristic (parameter), the tables in this section present the total number of bent and specific non-bent functions that have that characteristic. table 1 shows the number of bent functions of 4 variables with respect to the robdd size, number of paths, width, and apl.
it can be noticed that almost 70% of all bent functions of 4 variables have size 7 or 8, about 63% have 9, 10 or 11 paths, about 58% have robdd width 2, and about 40% have robdd apl 3.33333 or 3.5. this table also shows the number of all non-bent functions of 4 variables having hamming weight equal to 6 or 10 with respect to the robdd size, width, number of paths, and apl. on average, about 20 functions with hamming weight 6 or 10 have to be checked to find one that is bent.

table 2 shows the number of bent functions of 6 variables on a sample of 1 million functions with respect to the robdd size, number of paths, width, and apl. the entire set of bent functions of 6 variables was not tested because of the long time needed to discover these functions. in this table, about 70% of the sampled bent functions of 6 variables have size 15, 16 or 17, about 50% have 22, 23, 24 or 25 paths, about 73% have robdd width 2, and about 11% have robdd apl 5.00007 or 5.125. similarly, this table also shows the number of non-bent functions of 6 variables having hamming weight equal to 28 or 36, in the sample collected while 1,000,000 bent functions were discovered, with respect to the robdd size, number of paths, width, and apl. the number of non-bent functions tracks the number of bent functions and is about 40 times larger.

table 3 shows the number of bent functions of 8 variables on a sample of 10,000 functions with respect to the robdd size, number of paths, width, and apl. the number of tested functions is reduced to 10,000 because of the very long computation time required for discovery of these functions. in this table, about 60% of the sampled bent functions of 8 variables have size 22, 23 or 24, about 70% have the number of paths 30, 31, or 32.
about 90% of the sampled bent functions have robdd width 2, and about 50% have robdd apl 6.12903 or 6.13333. this table also shows the number of non-bent functions of 8 variables having hamming weight equal to 120 or 136, in the sample collected while 10,000 bent functions were discovered. the number of non-bent functions tracks the number of bent functions and is about 80 times larger.

looking at all three tables for functions of 4, 6 and 8 variables, the characteristic values shared by the largest number of functions can be read off directly. the largest number of n-variable bent functions have size 4*n-8 (8, 16 and 24 for n = 4, 6 and 8). for the other robdd characteristics, no general rule for the most frequent value could be determined. it is interesting, however, that for bent functions of 4 and 6 variables the most frequent apl follows the formula 0.8333333*n.

table 1 the number of all bent functions and of all non-bent functions having hamming weight equal to 6 or 10, for 4 variables, with respect to the robdd parameters

size    #f (bent)    #f (non-bent)
4       7            169
5       44           667
6       153          2001
7       308          4079
8       318          5296
9       66           2630

width   #f (bent)    #f (non-bent)
1       132          1780
2       520          8531
3       216          4113
4       28           442

#paths  #f (bent)    #f (non-bent)
5       3            90
6       17           272
7       52           728
8       117          1471
9       193          2398
10      210          2985
11      157          3074
12      95           2294
13      42           1186
14      9            323
15      1            33

apl      #f (bent)   #f (non-bent)
2.4      3           61
2.66667  12          123
2.83333  5           92
2.85714  12          176
3        38          572
3.125    78          907
3.14286  2           30
3.22222  8           430
3.25     39          484
3.33333  172         1740
3.4      42          1468
3.44444  13          227
3.5      168         1529
3.54545  86          2405
3.63636  71          669
3.66667  86          2143
3.75     9           152
3.76923  42          1182
3.85714  9           323

table 2 the number of bent and non-bent functions of 6 variables having hamming weight equal to 28 or 36 with respect to the robdd parameters.
size  #f (bent)  #f (non-bent)
8  3  142
9  49  2168
10  369  15845
11  2078  84473
12  9182  271672
13  33303  1005621
14  94595  2559252
15  195267  7037689
16  273064  10165891
17  225355  8587546
18  121807  5018437
19  37694  1889635
20  6794  506954
21  440  35829

width  #f (bent)  #f (non-bent)
2  269858  8434452
3  590964  22425672
4  133894  4584176
5  5284  155319

#paths  #f (bent)  #f (non-bent)
11  3  194
12  26  1593
13  120  5926
14  397  18922
15  1586  68573
16  3643  151345
17  7840  302133
18  20466  772155
19  37087  1363662
20  61213  2197944
21  83862  3018102
22  116633  4201566
23  135603  5127539
24  136380  6056333
25  117712  5578164
26  97858  4783127
27  80974  4205942
28  46092  2589411
29  26284  1881356
30  16647  1204724
31  8095  712410
32  1467  166751
33  12  1478

apl  #f (bent)  #f (non-bent)
3.90909  3  122
3.91667  12  568
4.07692  24  799
4.08333  6  188
4.15385  32  1420
4.16667  8  521
4.21429  48  2278
4.23077  12  499
4.25  48  2568
4.26667  284  10891
4.28577  178  6995
4.30786  45  1591
4.3125  70  4575
4.33333  449  20481
4.35714  36  1665
4.38463  7  302
4.4  284  9895
4.41176  200  9678
4.42857  72  2889
4.4375  1254  59642
4.46667  282  9612
4.47059  96  4576
4.50019  823  47612
4.52941  660  28902
4.53333  14  569
4.55556  656  30187
4.56383  900  35671
4.57143  2  85
4.58916  3783  165761
4.60025  221  8647
4.61111  106  4854
4.62533  220  12587
4.63158  1266  52908
4.64706  1362  49762
4.65  6022  158712
4.66667  10588  458972
4.68421  7271  276134
4.6875  36  1589
4.7  1436  57125
4.7059  1276  45476
4.72228  6569  245564
4.73684  1609  47933
4.75  2227  78231
4.7619  7652  256443
4.76471  90  3554
4.77778  1040  39677
4.78947  19927  819556
4.80001  8330  122
4.80952  868  44599
4.8125  13  568
4.81818  4922  228955
4.82353  103  5601
4.83333  1128  57546
4.84214  4359  178221
4.85  5518  316647
4.85729  9472  376566
4.86431  12929  564982
4.88235  234  9677
4.89596  2046  74766
4.90063  34213  1956687
4.90478  15467  602886
4.90909  2191  80885
4.91304  35656  2012624
4.91667  18849  604202
4.9447  147  6457
4.95023  2078  74702
4.95238  25216  870164
4.95424  398  19465
4.95456  45164  1972354
4.95652  11110  575503
4.95833  639  26774
5.00007  68074  3089332
5.04167  5929  258466
5.04348  42467  1704665
5.04545  29139  956027
5.04762  172  5702
5.05  52  1795
5.0527  96  3899
5.07692  13845  507879
5.08  36418  1422644
5.08333  10918  403354
5.08696  6941  353002
5.09117  13059  680223
5.09532  574  31066
5.10649  8  346
5.10714  5499  285302
5.11111  39390  1896601
5.11538  23450  873321
5.12  3655  154667
5.125  45599  1564998
5.13043  31476  1502337
5.13636  234  13121
5.14815  3456  165209
5.15067  39  1466
5.15385  19933  698680
5.16  32140  1570665
5.16667  6967  257702
5.17241  2640  98321
5.17391  2862  97023
5.17857  20647  860785
5.18182  84  3677
5.18518  14512  798544
5.19231  757  34577
5.2  18960  870556
5.20833  13085  561017
5.21429  2091  79045
5.21739  24  885
5.22222  8080  387122
5.23077  22424  874322
5.23333  1849  70554
5.2381  9  349
5.24  4806  156855
5.24138  11841  590446
5.25  9424  519445
5.25806  5351  254023
5.25926  1928  85266
5.26087  38  1671
5.26667  9674  385661
5.26923  5140  202677
5.27586  2173  75331
5.28  4278  165006
5.28571  2793  95661
5.29167  24  1164
5.2963  12014  485664
5.30769  2616  106447
5.31034  3726  185433
5.31818  1  43
5.32143  3222  135886

table 3 the number of bent and non-bent functions of 8 variables having hamming weight equal to 120 or 136 with respect to the robdd parameters

size  #f (bent)  #f (non-bent)
16  9  1040
17  37  4074
18  110  12061
19  196  19866
20  438  41860
21  960  91649
22  1598  137048
23  2006  160217
24  2518  219748
25  1340  121521
26  740  67832
27  40  3821
28  8  836

width  #f (bent)  #f (non-bent)
2  9110  712085
3  890  83305

#paths  #f (bent)  #f (non-bent)
23  15  1661
24  98  9720
25  167  16465
26  405  34536
27  261  21638
28  527  41872
29  925  73051
30  1597  124734
31  3899  331211
32  1446  125449
33  624  63760
34  36  4603

apl  #f (bent)  #f (non-bent)
5.80769  300  40801
5.83333  30  2161
5.84  120  9212
5.91304  15  1408
5.91667  60  6304
5.92593  88  6602
5.96  16  1282
5.96154  44  3240
6  24  1960
6.0303  540  52486
6.03333  48  3595
6.03448  36  3342
6.03571  114  9354
6.03704  16  1750
6.03846  8  705
6.04  22  1886
6.04167  8  711
6.0625  216  17381
6.06452  54  3865
6.07143  24  1938
6.07407  44  3982
6.07692  10  764
6.09375  720  52980
6.10345  372  28586
6.11538  8  706
6.11765  36  3296
6.12  4  341
6.12903  3600  269403
6.13333  1329  124259
6.14286  136  9576
6.14815  38  2902
6.15152  36  2920
6.15385  22  2114
6.15625  18  1390
6.16  5  84
6.16667  72  5751
6.17241  402  28489
6.18182  48  4112
6.2  40  2971
6.21875  492  40242
6.22222  68  5509
6.22581  233  22063
6.23077  13  1071
6.24138  64  5356
6.25  215  23555
6.25806  12  860
6.26667  102  7354
6.32143  14  1166
6.33333  13  1385
6.34483  51  4902

5. conclusions and future work

one efficient way to represent boolean functions is with a reduced ordered binary decision diagram. the strength of robdds is that they can represent boolean function data with a high level of redundancy in a compact form. the quality of compactness is expressed by basic robdd parameters or characteristics: the size, the number of paths, the width, and the average path length. using these characteristics, bent functions can be analyzed to determine their properties better. this paper investigates the characteristics of bent functions with a focus on their basic robdd parameters. a decision diagram experimental framework has been used to implement a program for calculating these parameters. the complete set of all bent functions is analyzed for functions of 4 variables. because the discovery of bent functions of 6 and 8 variables is very time-consuming, robdd characteristics were analyzed on a set of 1 million and 10,000 bent functions, respectively. so that these experimental results can be used in future research, this paper also investigates the robdd characteristics of non-bent functions with n variables having hamming weight equal to $2^{n-1} \pm 2^{n/2-1}$, with focus on the same parameters.
the complete set of all non-bent functions of 4 variables is analyzed. the set of non-bent functions of 6 variables is analyzed on the sample collected while 1 million bent functions were discovered, and the set of non-bent functions of 8 variables on the sample collected while 10,000 bent functions were discovered.

from the experimental results, it is evident that for bent functions of 4 variables there is a small set of values of robdd characteristics that most bent functions have. for example, 70% of these functions have robdd size 7 or 8, 63% have 9, 10 or 11 paths, 58% have width 2, and 40% have average path length 3.33333 or 3.5. for bent functions of 6 variables, the values of the robdd characteristics shared by the largest number of bent functions can be determined again. the same applies to bent functions of 8 variables. it was also determined that the largest number of n-variable bent functions have size 4*n-8, and that the most frequent average path length of bent functions of 4 and 6 variables is very close to 0.8333333*n. unfortunately, it was not possible to confirm the same average path length formula for bent functions of 8 variables. perhaps the reason for this is the small set of functions that was tested.

from the experimental results for non-bent functions, it is evident that they follow the characteristics of bent functions. the number of non-bent functions is about 20 times larger than the number of bent functions for 4 variables, about 40 times larger for 6 variables and about 80 times larger for 8 variables. these values also represent the search space when creating a potential method for discovering bent functions using robdd characteristics.

the results presented in this paper are intended to be used to create methods for the construction of bent functions using the robdd as a data structure from which the bent functions can be discovered.
research in this direction can reduce the time for discovering random bent functions. in addition, the results in this work represent new boundaries within which bent functions can be detected. it was shown that a large percentage of bent functions of 4, 6 and 8 variables fall within a very small range of the robdd characteristics tested in this paper. also, based on individual robdd characteristics, new subsets of bent functions can be defined. bent function discovery can be performed within these robdd subsets that have a predefined hamming weight.

future work will address the study of pairs or larger groups of robdd parameters of bent functions. it can also be extended to additional robdd parameters of bent functions, as well as to the characteristics of other types of decision diagrams, such as functional decision diagrams, algebraic decision diagrams, kronecker decision diagrams, pseudo-kronecker decision diagrams, etc. [7], [8], [11].

references

[1] o. rothaus, "on bent functions", j. comb. theory ser. a, vol. 20, pp. 300-305, 1976.
[2] o. logachev, a. salnikov and v. yashchenko, boolean functions in coding theory and cryptography, american mathematical society, 2012.
[3] s. mesnager, bent functions, fundamentals and results, springer international publishing, 2016.
[4] n. tokareva, bent functions, results and applications to cryptography, academic press, 2015.
[5] m. stanković, c. moraga and r. stanković, "an improved spectral classification of boolean functions based on an extended set of invariant operations", fu: elect. energ., vol. 31, no. 2, pp. 189-205, 2018.
[6] p. langevin and g. leander, "counting all bent functions in dimension eight 99270589265934370305785861242880", designs, codes and cryptography, vol. 59, pp. 193-201, 2011.
[7] t. sasao and m. fujita, representations of discrete functions, kluwer academic publishers, boston, 1996.
[8] r. drechsler and b. becker, binary decision diagrams: theory and implementation, springer us, 2013.
[9] n. schafer, "the characteristics of the binary decision diagrams of bent functions", m.s. thesis, naval postgraduate school, monterey, ca, september 2009.
[10] m. radmanović, "efficient discovery of bent function using reed-muller subsets", in proceedings of the 55th int. scientific conference on information, communication and energy systems and technologies (icest 2020), pp. 7-10, 2020.
[11] m. g. karpovsky, r. s. stanković and j. t. astola, spectral logic and its applications for the design of digital devices, wiley, 2008.
[12] m. thornton, r. drechsler and d. miller, spectral techniques in vlsi cad, springer us, 2012.
[13] s. nagayama, a. mishchenko, t. sasao and j. t. butler, "minimization of average path length in bdds by variable reordering", in proceedings of the international workshop on logic and synthesis, 2003, pp. 207-213.
[14] k. brace, r. rudell and r. bryant, "efficient implementation of a bdd package", in proceedings of the 27th acm/ieee design automation conference, 1990, pp. 40-45.
[15] f. somenzi, "efficient manipulation of decision diagrams", software tools for technology transfer, vol. 3, no. 2, pp. 171-181, 2001.

facta universitatis
series: electronics and energetics vol. 29, no 2, june 2016, pp. 177-191
doi: 10.2298/fuee1602177m

artifical neural networks in rf mems switch modelling

zlatica marinković¹, vera marković¹, tomislav ćirić¹, larissa vietzorreck², olivera pronić-rančić¹
¹ university of niš, faculty of electronic engineering, niš, serbia
² tu münchen, lehrstuhl für hochfrequenztechnik, münchen, germany

abstract. the increased growth of the applications of rf mems switches in modern communication systems has created an increased need for their accurate and efficient models. artificial neural networks have appeared as a fast and efficient modelling tool providing accuracy similar to that of standard commercial simulation packages.
this paper gives an overview of the applications of artificial neural networks in modelling of rf mems switches, in particular of the capacitive shunt switches, proposed by the authors of the paper. models for the most important switch characteristics in the electrical and mechanical domains are considered, as well as the inverse models aimed at determining the switch bridge dimensions for specified requirements on the switch characteristics.

key words: actuation voltage, artificial neural networks, resonant frequency, rf mems, switch

1. introduction

modern communication systems rely to a great extent on new high performance rf and microwave devices and components that enable miniaturization, in line with the demand to integrate more and more functionalities while reducing the overall size of the system. rf mems (micro electro mechanical systems) are novel components which are able to meet the mentioned requirements [1]. rf mems components and devices exploit mechanically movable parts and thus enable a change of topology. one of the first examples, developed in 1995 [2, 3], was an electrostatically actuated rf mems shunt switch where the ground of the coplanar waveguide is connected by a very thin membrane. if a dc voltage is applied between ground and signal line, the membrane is pulled down by the electrostatic force and thus short-circuits the signal line. since these first developments, many different components based on mems switches have been introduced, like phase-shifters, reconfigurable antennas, matching networks, switch matrices, tunable filters, etc. [4-9].

received september 29, 2015
corresponding author: zlatica marinković, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: zlatica.marinkovic@elfak.ni.ac.rs)

rf mems switches allow multiband operation due to their ability to reconfigure their topology.
also, they have several advantages compared to their electronic counterparts, like pin diode or mesfet switches [10]-[12], such as low insertion loss, high isolation, small size, high linearity and excellent compatibility with microwave and mm-wave circuits. because of those significant advantages, rf mems switches are of growing interest for use in various communication systems, primarily in satellite and mobile communication systems.

current research on rf mems switches is mostly concentrated on new structures, new materials or processes in devices [13]-[16], while optimization analysis of mems devices has not been studied sufficiently. a standard approach to obtaining rf mems switch electrical characteristics is to use full-wave numerical methods in electromagnetic (em) simulators. however, as it is also necessary to determine mechanical characteristics, simulations in mechanical simulators should be included during the design and simulation as well. although these methods provide the necessary accuracy, they are generally limited to a single analysis for a specific structure, and their computational overhead (running time, memory) becomes extensive when a number of simulations with different mesh properties are needed [17].

an alternative approach to modelling and designing rf mems devices is based on artificial neural networks (anns). anns can be considered a powerful fitting tool, i.e. they have the ability to learn the dependence between two sets of data and to generalize, which means to give a correct response to inputs not used in the learning process. they give a response almost instantaneously, retaining the accuracy of the standard em and mechanical simulators. owing to these abilities, anns have found many applications in different fields, among others in rf and microwaves. this paper is devoted to applications of anns for modelling and design of rf mems switches.
as far as rf mems devices are concerned, anns have been applied as a modelling tool for about a decade [17-25]. they have mostly been applied for modelling the device membrane characteristics. several publications refer to neural modelling of rf mems switches [17, 20, 23-25]. in most of the referred applications, anns were exploited to model the dependence of the switch scattering (s-) parameters and/or switch resonant frequency on the dimensions of the membrane and on frequency. almost all of them refer to switches which have a simple rectangular membrane. in this paper a capacitive switch with a more complex membrane is considered.

the paper is organized as follows. after the introduction, a short description of neural networks is given in section ii. the capacitive rf mems switch modeled in this work is described in section iii. ann models of switch characteristics, as well as corresponding numerical results and discussions, are presented in section iv. section v contains a description of rf mems switch inverse ann models, the modelling results and the discussion. finally, the main concluding remarks are given in section vi.

2. artificial neural networks

all neural models presented in this work are based on multilayer perceptron (mlp) neural networks. an mlp ann consists of basic processing elements (neurons) grouped into layers: an input layer, an output layer, and several hidden layers [26]. each neuron is connected to all neurons from the adjacent layers. neurons from the same layer are not mutually connected. each neuron is characterized by a transfer function and each connection is weighted. the anns exploited in this work have a linear transfer function for neurons from the input and output layers and a sigmoid transfer function for the hidden neurons. an ann learns the relationship among sets of input-output data (training sets) by adjusting the network connection weights and thresholds of activation functions.
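the response of such an mlp can be written as a short forward pass. the python sketch below is illustrative only: `mlp_forward` and the weight values are hypothetical (they are not the trained models of the paper), hidden neurons use the logistic sigmoid, and input and output neurons are linear as described above.

```python
import math


def sigmoid(z):
    """Logistic sigmoid transfer function used for the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, layers):
    """Evaluate an MLP.

    x      -- list of input values (input neurons are linear, i.e. identity)
    layers -- list of (weights, biases) pairs, one per non-input layer;
              weights[j][i] connects neuron i of the previous layer to neuron j.
    Hidden layers apply the sigmoid, the last (output) layer is linear.
    """
    activations = list(x)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = sums if is_output else [sigmoid(s) for s in sums]
    return activations


# Hypothetical 2-2-1 network: two inputs, one hidden layer with two neurons, one output
hidden = ([[0.5, -0.2], [0.1, 0.3]], [0.0, -0.1])
output = ([[1.5, -0.7]], [0.2])
print(mlp_forward([0.35, 0.075], [hidden, output]))
```

training (for example with a backpropagation variant) would adjust the entries of `weights` and `biases`; the forward pass itself stays the same.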
there are a number of algorithms for training anns. the most frequently used are the backpropagation algorithm and its modifications, such as the levenberg-marquardt algorithm [26], used in the present work. once trained, the network provides a fast response for various input vectors without changes in its structure and without additional optimizations. the most important feature of anns is their generalization ability, i.e., the ability to generate the correct response even for input parameter values not included in the training set. the generalization ability has qualified anns to be used as an efficient tool for modelling in the field of rf and microwaves [26-36]. for example, anns can be used as an alternative to time-consuming electromagnetic simulations [26-28, 30] or as an alternative to the conventional modelling of microwave devices [25, 27, 30, 32, 35, 36].

in the present work, the accuracy of ann learning and generalization was tested by calculating the average test error (ate), the worst case error (wce) and the pearson product-moment correlation coefficient (r) [26]. since the optimal number of hidden neurons cannot be determined in advance, for each developed model anns with different numbers of hidden neurons in one or two hidden layers were trained. the network with the best test results was chosen as the final model. when reporting the ann structure of the final models, the following notation is used in this paper: an ann denoted n-h1-h2-m has n input neurons, h1 and h2 neurons in the first and second hidden layer, respectively, and m output neurons; an ann denoted n-h1-m has n input and m output neurons and only one hidden layer with h1 neurons.

3. modeled device

the considered device is a cpw (coplanar waveguide) based rf mems capacitive shunt switch (see fig. 1) fabricated at fbk in trento in an 8-layer silicon micromachining process [37]. the signal line below the bridge is made of a thin aluminum layer.
adjacent to the signal line, the dc actuation pads made of polysilicon are placed. the bridge is a thin membrane connecting both sides of the ground. the inductance of the bridge and the fixed capacitance between the signal line and the bridge form a resonant circuit to ground, whose resonance frequency can be changed by varying the length of the fingered part, lf, close to the anchors, and of the solid part, ls. at series resonance the circuit acts as a short circuit to ground. in a certain frequency band around the resonance frequency the transmission of the signal is suppressed. the bridge can be closed by applying an actuation voltage of around 45 v. the actuation voltage is the voltage applied to the dc pads at the instant when the bridge comes down and touches the cpw centerline, i.e. the pull-in voltage (vpi). it is strongly related to the switch features and mechanical/material properties, such as the dc pad size and location, the bridge spring constant and residual stress, the bridge shape or supports, etc. the finger parts (corresponding to lf) in fig. 1 serve to control vpi. if the finger parts are long compared to the other parts, the bridge becomes flexible and the switch is easily actuated by a low vpi. however, this increases the risk of self-actuation or rf hold-down when the switch delivers a high rf power. conversely, with short finger parts, the switch needs a high vpi to be actuated. therefore, the bridge part lengths (lf, ls) should be carefully determined considering the rf power to be delivered and a feasible dc voltage supply [1].

fig. 1 top view of the realized switch (a) and schematic of the cross-section with 8 layers in fbk technology (b) [37]

4. ann models of switch characteristics

as mentioned in the introductory section, simulations of rf mems switch characteristics in standard em and mechanical simulators are time consuming, which is especially important when simulations have to be repeated during the design and optimization of the switch characteristics. similarly to the approaches presented in the literature, the authors of this paper have developed neural models of switch electrical and mechanical characteristics, in particular neural models of the s-parameters, resonant frequency and actuation voltage, as shown in figs. 2 and 3.

an rf mems switch is a symmetric reciprocal device, i.e., s22 = s11 and s12 = s21, therefore only parameters s11 and s21 were modeled. the ann model of each modeled s-parameter consists of two anns, both having three inputs corresponding to the bridge lateral dimensions, ls and lf, and frequency, f, whereas the outputs correspond to the magnitude and phase of the modeled parameter, |sij| and ∠sij, respectively. in order to train the anns, it is necessary to simulate the s-parameters for several bridge sizes (i.e. for different values of the bridge lateral dimensions) in a full-wave em simulator. a properly trained ann gives responses which are very close to the response of the full-wave em simulator but in a shorter time, as the ann response is almost instantaneous. by using the developed model, analysis and optimization of the switch dimensions can be done much faster than in the standard way.

as far as the resonant frequency is concerned, the ann model consists of one ann with two inputs corresponding to ls and lf, and one output corresponding to the resonant frequency (see fig. 2b). the data for the ann training consists of several resonant frequencies corresponding to different bridge sizes, and can be acquired by determining the resonant frequency in a full-wave em simulator, or by using the neural model of the parameter s21.
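extracting the resonant frequency from an s21 response amounts to locating the minimum of |s21| over the frequency sweep. the python sketch below shows this under explicit assumptions: the function `s21_shunt_rlc` is a hypothetical stand-in for the trained s21 model or the em simulator (a series rlc branch shunting a 50 Ω line), and the r, l and c values are illustrative, not taken from the paper.

```python
import math

Z0 = 50.0  # reference impedance in ohms (assumed)


def s21_shunt_rlc(freq_hz, r=0.3, l=0.15e-9, c=1.0e-12):
    """|S21| in dB of a series RLC branch connected in shunt across a Z0 line.

    Illustrative stand-in for an S21 model; r, l, c are hypothetical values.
    """
    omega = 2.0 * math.pi * freq_hz
    z = complex(r, omega * l - 1.0 / (omega * c))   # shunt branch impedance
    s21 = 2.0 / (2.0 + Z0 / z)                      # shunt element in a Z0 system
    return 20.0 * math.log10(abs(s21))


def resonant_frequency(s21_db, freqs_hz):
    """Return the sweep frequency at which |S21| is minimal."""
    return min(freqs_hz, key=s21_db)


# Sweep on a 0.1 GHz grid up to 40 GHz (the DC point is skipped)
freqs = [k * 0.1e9 for k in range(1, 401)]
f_res = resonant_frequency(s21_shunt_rlc, freqs)
print(f_res / 1e9, "GHz")
```

repeating the sweep for several (ls, lf) pairs would give the kind of training set described above, with the grid step bounding the resolution of the extracted resonant frequency.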
like the above-mentioned model, this model enables a quick estimation of the switch resonant frequency and optimization of the dimensions to obtain the desired resonant frequency.

the model of the switch actuation voltage has the same structure as the resonant frequency model. namely, it has two inputs and one output, corresponding to the bridge lateral dimensions and the actuation voltage, respectively, as shown in fig. 3. as in the previous cases, the training data were obtained in a standard simulator able to calculate the switch mechanical properties. the gain in simulation time is the most significant in this case, as simulations in a commercial mechanical simulator took much more time than the simulations of the electrical parameters in a full-wave em simulator.

fig. 2 ann models of the switch electrical characteristics: (a) s-parameters; (b) resonant frequency

fig. 3 ann model of the switch actuation voltage

4.1. numerical results

all ann models described above were developed for the considered switch [38, 39]. for development of the models of s-parameters and resonant frequency, the s-parameters for several different combinations of the switch lateral dimensions were simulated in the full-wave em simulator ads momentum [40], and the corresponding resonant frequencies were determined. the data referring to 23 differently sized bridges were used for the model development, whereas the data referring to 17 bridges different from the training ones were used for validation of the models. the s-parameters used for the model development were simulated at 401 frequency points up to 40 ghz. for each ann model, anns with different numbers of hidden neurons were trained and the anns listed in table 1 were chosen as the final models.
table 1 final ann models for the switch electrical and mechanical characteristics with the number of training samples

parameter   ann model    number of training samples
|s11|       3-8-6-1      23 x 401
∠s11        3-10-10-1    23 x 401
|s22|       3-8-8-1      23 x 401
∠s22        3-10-10-1    23 x 401
fres        2-5-1        23
vpi         2-8-1        30

validation of the ann models has shown that they produce values that are very close to the values obtained by using the em simulator. as an illustration, fig. 4 shows the insertion loss (|s21| in db) and the return loss (|s11| in db) for the device having the bridge with lateral dimensions ls = 350 µm and lf = 75 µm. a very good agreement between the parameters generated by the developed ann models and the em simulations can be observed. this is especially important because the data referring to this device were not included in the training set, which shows that the anns achieved good generalization. as far as the resonant frequency is concerned, the maximum difference between the modeled and the reference values for the test devices is less than 1%, which can be considered very good. another illustration of the achieved accuracy of the resonant frequency ann model is the scatter plot given in fig. 5, showing very good agreement between the values obtained by the ann model and the reference values calculated in the em simulator for six considered test devices. more details about the development and validation of the ann models of the electrical characteristics can be found in [38, 39].

fig. 4 insertion and return losses for the tested device (ls = 350 µm and lf = 75 µm) (plot: s11 (db) and s21 (db) versus f (ghz), ann and em simulator)

fig. 5 resonant frequency scatter plot for six test devices (axes: fres (ghz) ann model, fres (ghz) em simulator)

the data used for training and validation of the neural model for the switch actuation voltage, shown in fig.
3, were obtained in the mechanical simulator comsol multiphysics [41]. in total, 39 data samples (pairs of lateral dimensions and the corresponding actuation voltages) were used, 30 of them for the ann training and 9 for the ann model validation. the best ann has one hidden layer with 8 neurons, as listed in table 1. the validation results shown in table 2 confirm that this model also has very good generalization abilities, as the maximum error for the test devices not used for the ann training is around or less than 1%, i.e., less than 0.5 v. more details about the development and validation of this model can be found in [42, 43].

table 2 actuation voltage for the test devices

ls (µm)   lf (µm)   vpi_target (v)   vpi_sim (v)   abs. error (v)   rel. error (%)
150       25        55.6             55.58         0.02             0.01
150       65        43               43.45         0.45             1.10
250       25        33.3             33.16         0.14             0.40
250       65        28.2             28.21         0.01             0.03
350       10        25.2             25.32         0.12             0.47
350       25        23.8             23.74         0.06             0.25
350       65        21.1             20.99         0.11             0.54
350       75        20.5             20.45         0.35             0.17
450       65        16.9             16.80         0.10             0.57

4.2. discussion

as already mentioned, the developed models of the rf mems switch characteristics give responses almost instantaneously. since their accuracy is close to that of the calculations in standard em and/or mechanical simulators, they are very convenient for further analyses and optimizations of the considered switch. the mathematical expressions describing the developed anns can be easily implemented within the standard simulators by means of blocks dealing with variables and expressions, or can be used separately in different (mathematical) software packages.
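the remark that a trained ann reduces to a mathematical expression can be illustrated with a minimal sketch. it assumes a feed-forward network with tanh hidden neurons and a linear output, which is a common choice but is not confirmed by the text. the weights below are arbitrary placeholders, not the trained values of any model from table 1.

```python
import math


def ann_output(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Single-hidden-layer network:
    y = b2 + sum_j w2[j] * tanh(b1[j] + sum_i w1[j][i] * x[i])
    """
    hidden = [
        math.tanh(bias + sum(w * x for w, x in zip(row, inputs)))
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


# Placeholder parameters for a 2-2-1 network (hypothetical, for illustration only).
w1 = [[0.5, -0.25], [0.1, 0.3]]
b1 = [0.0, 0.1]
w2 = [1.0, -2.0]
b2 = 0.5

# Inputs would normally be the scaled bridge dimensions (ls, lf).
y = ann_output([0.2, 0.4], w1, b1, w2, b2)
print(round(y, 6))
```

an expression of this form needs only additions, multiplications and one elementary function, which is why it can be typed into the equation blocks of a circuit simulator or into a general mathematical package.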
as an example, optimization of the bridge lateral dimensions for given s-parameter requirements in a desired frequency band takes less than a second when performed with the neural model implemented in the ads circuit simulator, whereas the same optimization in the full-wave simulator (ads momentum) takes around 2 hours [39]. this advantage is even more evident in the case of the mechanical characteristics modelling. namely, the calculation of the switch actuation voltage versus the bridge lateral dimensions (plotted in fig. 6) takes a few seconds in the matlab environment with the developed ann model, whereas the mechanical simulator requires several tens of minutes to determine the actuation voltage for a single combination of the bridge geometrical parameters. optimization of the switch bridge dimensions based on the ann model takes several seconds, unlike the optimizations in the mechanical simulators, which last for hours.

fig. 6 actuation voltage calculated by using the ann model [42] (axes: ls (µm), lf (µm), vpi (v))

the developed ann models can be efficiently used to study the behaviour of the device when the bridge size is changed, either intentionally, with the aim of optimizing the device characteristics, or due to deviations of the dimensions in the device fabrication process. the analyses done in [44] for the resonant frequency and in [45] for the actuation voltage show that when the dimension changes are within the fabrication tolerances (which are up to ±3 µm for the considered device), the changes in the actuation voltage and the resonant frequency can be considered acceptable. for instance, the maximum changes of the resonant frequency for several arbitrarily chosen devices, when both dimensions were changed in the range ±3 µm with a step of 1 µm, are shown in table 3 [44].
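the tolerance study summarized in table 3 amounts to a small grid sweep around each nominal design. the sketch below shows that procedure under stated assumptions: `toy_fres_ghz` is a made-up smooth function standing in for the trained resonant frequency ann, and the function names are hypothetical. only the ±3 µm range and the 1 µm step come from the text.

```python
def max_deviation(model, ls_um, lf_um, tol_um=3, step_um=1):
    """Largest absolute and relative change of model(ls, lf) when both
    dimensions move within +/- tol_um around the nominal point."""
    nominal = model(ls_um, lf_um)
    offsets = range(-tol_um, tol_um + 1, step_um)
    max_abs = max(
        abs(model(ls_um + d_ls, lf_um + d_lf) - nominal)
        for d_ls in offsets
        for d_lf in offsets
    )
    return max_abs, 100.0 * max_abs / abs(nominal)


def toy_fres_ghz(ls_um, lf_um):
    """Hypothetical direct model (not the paper's ANN): frequency falls as the bridge grows."""
    return 20.0 - 0.02 * ls_um - 0.01 * lf_um


d_abs, d_rel = max_deviation(toy_fres_ghz, 200, 20)
print(f"max |dfres| = {d_abs:.2f} GHz ({d_rel:.2f} %)")
```

with a step of 1 µm the sweep evaluates 49 dimension pairs per nominal device, which is negligible work for an ann but would mean 49 separate runs of a full-wave simulator.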
it can be seen that the maximum deviation of the resonant frequency is 1.5%, with the maximum absolute change of 0.24 ghz.

table 3 resonant frequency test results for simultaneous changes of ls and lf up to ±3 µm

ls (µm)   lf (µm)   max |Δfres| (ghz)   max |Δfres/fres| (%)
200       20        0.24                1.5
200       50        0.20                1.4
200       80        0.17                1.3
300       20        0.13                1.1
300       50        0.12                1.0
300       80        0.11                0.1
450       20        0.08                0.8
450       50        0.07                0.7
450       80        0.06                0.6

5. rf mems switch inverse ann models

as illustrated in the previous section, the developed neural models of the electrical or mechanical characteristics of rf mems switches can significantly speed up the analysis and design of these switches. however, the time needed for the optimization of the switch dimensions can be further reduced if inverse neural models of the switch characteristics versus dimensions are used. namely, it would be very useful to develop models that could predict both of the lateral dimensions of the switch bridge for the given resonant frequency or/and actuation voltage. however, this is not possible, as the inverse functions of the resonant frequency and actuation voltage dependence on the bridge dimensions are not unique, which means that several combinations of the lateral dimensions result in the same resonant frequency or actuation voltage. the authors of the paper therefore proposed inverse models where one of the dimensions is fixed and the other is determined by an ann, as shown in fig. 7 [39, 43, 46, 47].

fig. 7 inverse ann models for the switch electrical (or mechanical) characteristics: (a) ls (b) lf

namely, the proposed inverse ann models of the switch electrical (or mechanical) characteristics consist of anns with two input neurons: one corresponding to the fixed lateral dimension (lf in fig. 7a and ls in fig.
7b) and the other to fres in the case of the electrical inverse model, or to vpi in the case of the mechanical inverse model, and one output neuron corresponding to the dimension being determined (ls in fig. 7a and lf in fig. 7b).

however, during the design of an rf mems switch one may need to optimize the dimensions to meet the desired resonant frequency and actuation voltage simultaneously. that can be complex, as the em simulations and the simulations of the mechanical characteristics are performed in different software packages. therefore, the authors proposed inverse electro-mechanical models, which calculate one of the lateral dimensions for a given pair of resonant frequency and actuation voltage. for the same reasons as in the case of the separate electrical and mechanical inverse models, it is not possible to develop a model that would determine both dimensions at the same time. therefore, the exploited anns have three inputs and one output, as shown in fig. 8.

fig. 8 inverse electro-mechanical ann models: (a) ls (b) lf

for both types of inverse models, separate electrical and mechanical or electro-mechanical, the data for training the anns are obtained by calculating the resonant frequency or/and the actuation voltage for several combinations of the lateral dimensions. this can be done in standard simulators, or alternatively with the previously developed neural models that calculate the resonant frequency and actuation voltage for the given dimensions (let us call them the direct models). once the inverse models are trained, the desired dimension is determined directly, without optimization.

5.1. numerical results

the proposed inverse ann models were developed for the rf mems switch considered in this work.
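generating training data for an inverse model from a direct model, as described at the end of section 5, can be sketched as follows. the direct model here is a made-up monotonic function rather than the trained ann, and the grid limits and step sizes are illustrative assumptions. the point is only the rearrangement of (ls, lf) -> fres samples into (lf, fres) -> ls samples.

```python
def toy_fres_ghz(ls_um, lf_um):
    """Hypothetical direct model standing in for the trained fres ANN."""
    return 20.0 - 0.02 * ls_um - 0.01 * lf_um


def inverse_training_set(direct_model, ls_values, lf_values):
    """Build samples for an inverse model that predicts ls.

    Each sample is ((lf, fres), ls): the fixed dimension and the target
    characteristic form the input, the other dimension is the output.
    """
    samples = []
    for ls in ls_values:
        for lf in lf_values:
            samples.append(((lf, direct_model(ls, lf)), ls))
    return samples


ls_grid = range(150, 451, 10)   # assumed range and step, in micrometres
lf_grid = range(10, 81, 5)      # assumed range and step, in micrometres
data = inverse_training_set(toy_fres_ghz, ls_grid, lf_grid)
print(len(data), "samples; first:", data[0])
```

refining the grid only in the regions where the inverse characteristic changes quickly is one way to add samples where they are most needed, since every extra sample costs just one evaluation of the direct model.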
due to the behaviour of the inverse characteristics of the considered devices, the data used for the development of the direct models of the resonant frequency and actuation voltage turned out to be insufficient to train the inverse ann models with satisfactory accuracy, as the modelling error reached tens of percent in some parts of the input space [39, 43, 46]. therefore, to acquire more training data in these critical parts of the input space, the developed direct neural models were used for generating more training samples. the anns showing the best performance for each model are listed in table 4, together with the number of training samples. to illustrate the accuracy of the inverse modelling, fig. 9 compares the determined lf with its target value in the form of scatter plots. fig. 9a refers to the electrical inverse model and fig. 9b to the mechanical inverse model. it can be observed that the deviation of the lf value is within ±3 µm, indicating very good prediction abilities of the proposed model. similar results were obtained for the prediction of ls.

table 4 final ann models for the switch electrical and mechanical characteristics with the number of training and test samples

inverse model       ann model    number of training samples
electrical lf       2-15-15-1    814
electrical ls       2-15-15-1    814
mechanical lf       2-25-25-1    961
mechanical ls       2-4-6-1      961
electro.mech. lf    3-10-20-1    4131
electro.mech. ls    3-20-10-1    4131

fig.
9 inverse modelling of lf: (a) electrical inverse model, (b) mechanical inverse model

the inverse electro-mechanical models gave similar accuracy to the separate electrical and mechanical models, which can be seen from the following analysis, where the inverse electro-mechanical model for determining the fingered part length (shown in fig. 8b) is considered. the influence of the determined lf on the resonant frequency (desired value 12 ghz) and the actuation voltage (desired value 25 v) was calculated and is shown in tables 5 and 6, respectively [48]. namely, for the ls values from 280 to 340 µm, and the desired fres and vpi, the value of lf is calculated (lf_inv). further, the calculated lf value is used to determine the resonant frequency (table 5) or the actuation voltage (table 6) with the direct ann models for fres and vpi, respectively, and these values are compared with the desired values. the corresponding absolute errors (ae) and relative errors (re) are given in tables 5 and 6 as well. it can be seen that the relative errors do not exceed 2%, which can be considered good.

table 5 rf mems switch inverse modelling results: fres [48]

ls [µm]   fres [ghz]   vpi [v]   lf_inv [µm]   fres_dir [ghz]   fres ae [ghz]   fres re [%]
280       12           25        71.350        11.886           0.114           0.95
290       12           25        61.703        11.859           0.141           1.20
300       12           25        52.228        11.827           0.173           1.40
310       12           25        43.426        11.789           0.211           1.80
320       12           25        35.355        11.755           0.245           2.00
330       12           25        26.917        11.884           0.116           0.97
340       12           25        16.59         12.009           0.009           0.07

table 6 rf mems switch modelling results: vpi [48]

ls [µm]   fres [ghz]   vpi [v]   lf_inv [µm]   vpi_dir [v]   vpi ae [v]   vpi re [%]
280       12           25        71.350        25.169        0.169        0.68
290       12           25        61.703        25.143        0.143        0.57
300       12           25        52.228        25.120        0.120        0.48
310       12           25        43.426        25.082        0.082        0.33
320       12           25        35.355        25.056        0.056        0.22
330       12           25        26.917        25.129        0.129        0.52
340       12           25        16.590        25.380        0.380        1.50

5.2.
discussion

the results shown above confirm the accuracy of the determination of the lateral dimensions of the bridge for the given requirements related to the resonant frequency and/or the actuation voltage. the deviation in the dimension prediction is of the order of the fabrication tolerances, which also confirms the accuracy of the modelling. the developed inverse models provide a very fast and straightforward calculation of the bridge dimensions. unlike the direct models, which are valid over the whole range of dimensions used for the ann model development, the inverse models respond to any input falling between the minimum and maximum input values used for training, but are valid only in the ranges of input values that are physically meaningful. this means that before choosing an input combination for an inverse model, it should be checked whether the chosen combination is physically meaningful. this can be efficiently checked from two-dimensional plots of the input dimension versus the resonant frequency (and/or the actuation voltage, depending on the inverse model used), which can be plotted by using the direct ann models [49, 50].

another challenge in bridge dimension optimization is how to determine the bridge lateral dimensions when the total length of the bridge is given. since the desired dependence is not unique, as is the case for all the mentioned inverse models, such a model cannot be realized directly with anns. however, the developed ann based direct and inverse models can be used as a solution. interested readers can find more details in [49-51].

6. conclusion

rf mems switches have seen increasing application in the field of microwave control; therefore, the design of circuits containing rf mems switches requires reliable models.
artificial neural networks have appeared as an efficient alternative to standard commercial full-wave em simulators and mechanical simulators, providing similar accuracy at a significantly lower computational cost. this paper gives an overview of the neural models of capacitive shunt rf mems switches. although the development takes a certain time, as it is necessary to obtain the training data by using the standard simulation methods and to train the ann models (a few minutes per trained ann), their efficiency and speed of response make the ann models very convenient for modelling and optimization of the electrical and mechanical characteristics of rf mems switches.

acknowledgement: the authors would like to thank fbk trento, thales alenia italy, cnr rome and university of perugia, italy for providing rf mems data. this work was funded by the bilateral serbian-german project "smart modeling and optimization of 3d structured rf components" supported by the daad foundation and serbian ministry of education, science and technological development. the work was also supported by the projects tr32052 and iii-43012 of the serbian ministry of education, science and technological development.

references

[1] g. m. rebeiz, rf mems theory, design, and technology. new york: wiley, 2003.
[2] c.l. goldsmith, z. yao, s. eshelman, and d. denniston, "performance of low-loss rf mems capacitive switches," ieee microwave guided wave lett., vol. 8, pp. 269-271, august 1998.
[3] g. m. rebeiz, j. b. muldavin, "rf mems switches and switch circuits," ieee microw. mag., vol. 2, no. 4, pp. 59-71, december 2001.
[4] s. a. figur, e. meniconi, b. schoenlinner, u. prechtel, r. sorrentino, l. vietzorreck, v. ziegler, "design and characterization of a simplifed planar 16 x 8 rf mems switch matrix for a geostationary data relay", in proceedings of european microwave conference, 2012.
[5] s. montori, e. chiuppesi, p. farinelli, l. marcaccioli, r. v. gatti, r.
sorrentino, "w-band beamsteerable mems-based reflectarray", international journal of microwave and wireless technologies, vol. 3, no. 05, pp. 521-532, october 2011.
[6] g. m. rebeiz, k. entesari, i. reines, s. j. park, m. a. el-tanani, a. grichener, a. r. brown, "tuning in to rf mems", ieee microw. mag., vol. 10, no. 6, pp. 55-72, june 2009.
[7] m. daneshmand, r. r. mansour, "rf mems satellite switch matrices", ieee microw. mag., vol. 12, no. 5, pp. 92-109, may 2011.
[8] i. jokić, m. frantlović, z. đurić, m. dukić, "rf mems/nems resonators for wireless communication systems and adsorption-desorption phase noise", facta universitatis – series electronics and energetics, vol. 28, no. 3, pp. 345-381, 2015.
[9] a. napieralski, c. maj, m. szermer, p. zajac, w. zabierowski, m. napieralska, ł. starzak, m. zubert, r.kiełbik, p. amrozik, z. ciota, r. ritter, m. kamiński, r. kotas, p. marciniak, b. sakowicz, k. grabowski, w. sankowski, g. jabłoński, d. makowski, a. mielczarek, m. orlikowski, m. jankowski, p. perek, "recent research in vlsi, mems and power devices with practical application to the iter and dream projects", facta universitatis – series electronics and energetics, vol. 27, no. 4, pp. 561-588, 2014.
[10] m. lazic, m. skender, s. radosevic, "generating driving signals for three phases inverter by digital timing functions", facta universitatis – series electronics and energetics, vol. 13, no. 3, pp. 353-364, 2000.
[11] a. n. al-rabadi, "carbon nano tube (cnt) multiplexers for multiple-valued computing", facta universitatis – series electronics and energetics, vol. 20, no. 2, pp. 175-186, 2007.
[12] j. vobecký "the current status of power semiconductors", facta universitatis – series electronics and energetics, vol. 28, no. 2, pp. 193-203, 2015.
[13] m. lamhamdi, p. pons, u. zaghloul, l. boudou, f. coccetti, j. guastavino, y. segui, g. papaioannou, r.
plana "voltage and temperature effect on dielectric charging for rf mems capacitive switches reliability investigation" microel. reliab., vol. 48 pp. 1248-1252, sept. 2008.
[14] m. matmat, k. koukos, f. coccetti, t. idda, a. marty, c. escriba, j-y. fourniols, d. esteve, "life expectancy and characterization of capacitive rf mems switches", microelectron. reliab., vol. 50, no. 9–11, pp. 1692-1696, 2010.
[15] l. michalas, m. koutsoureli, e. papandreou, a. gantis, g. papaioannou "a mim capacitor study of dielectric charging for rf mems capacitive switches", facta universitatis – series electronics and energetics, vol. 28, no. 1, pp. 113-122, 2015.
[16] m. koutsoureli, l. michalas, g. papaioannou, "assessment of dielectric charging in micro-electro-mechanical system capacitive switches", facta universitatis – series electronics and energetics, vol. 26, no. 3, pp. 239-245, 2013.
[17] y. lee, d. s. filipovic, "combined full-wave/ann based modelling of mems switches for rf and microwave applications", in proceedings of the ieee antennas and propagation society international symposium, 2005, pp. 85-88.
[18] y. lee, y. park, f. niu, b. bachman, k. c. gupta, d. filipovic, "artificial neural network modelling of rf mems resonators", int. j. rf microw. c e, special issue: rf applications of mems and micromachining, vol. 14, no. 4, pp. 302–316, july 2004.
[19] v. litovski, m. andrejevic, m. zwolinski, "behavioural modelling, simulation, test and diagnosis of mems using anns," in proceedings of the ieee international symposium on circuits and systems iscas 2005, 2005, pp. 5182-5185.
[20] y. lee, d. s. filipovic, "ann based electromagnetic models for the design of rf mems switches", ieee microw. compon. lett., vol. 15, no. 11, pp. 823-825, november 2005.
[21] y. lee, y. park, f. niu, d. filipovic, "design and optimization of rf ics with embedded linear macromodels of multiport mems devices," int. j. rf microw c e, vol. 17, no. 2, pp. 196-209, march 2007.
[22] g. h. yang, q. wu, j.
h. fu, k. tang, j. x. he, "an efficient modelling technique for rf mems phase shifter based on rbf neural network," in proceedings of the international conference on microwave and millimeter wave technology icmmt 2008, 2008, pp. 475-478.
[23] y. mafinejad, a. z. kouzani, k. mafinezhad, "determining rf mems switch parameter by neural networks", in proceedings of the ieee region 10 conference tencon 2009, 2009, pp. 1-5.
[24] y. gong, f. zhao, h. xin, j. lin, q. bai, "simulation and optimal design for rf mems cantilevered beam switch", in proceedings of the international conference on future computer and communication fcc '09, 2009, pp. 84-87.
[25] s. suganthi, k. murugesan, s. raghavan, "neural network based realization and circuit analysis of lateral rf mems series switch," in proceedings of the international conference on computer, communication and electrical technology icccet 2011, 2011, pp. 260-265.
[26] q. j. zhang, k. c. gupta, neural networks for rf and microwave design, artech house, 2000.
[27] c. christodoulou, m. gerogiopoulos, applications of neural networks in electromagnetics, artech house, 2000.
[28] p. burrascano, s. fiori and m. mongiardo, "a review of artificial neural network applications in microwave computer-aided design", int j rf microw c e, vol. 9, no. 3, pp. 158-174, 1999.
[29] z. marinković, v. marković, "temperature dependent models of low-noise microwave transistors based on neural networks", int. j. rf microw. c e, vol. 15, no. 6, pp. 567-577, 2005.
[30] z. marinković, g. crupi, a. caddemi, and v. marković, "comparison between analytical and neural approaches for multibias small signal modelling of microwave scaled fets", microw. opt. techn. lett., vol. 52, no. 10, pp. 2238-2244, 2010.
[31] j. e. rayas-sanchez, "em-based optimization of microwave circuits using artificial neural networks: the state-of-the-art", ieee trans. microw. theory techn., vol. 52, no. 1, pp. 420–435, 2004.
[32] h. kabir, y. cao, and q.
zhang, "advances of neural network modelling methods for rf/microwave applications," applied computational electromagnetics society journal, vol. 25, no. 5, pp. 423-432, 2010.
[33] z. marinković, g. crupi, d. schreurs, a. caddemi, v. marković, "microwave finfet modelling based on artificial neural networks including lossy silicon substrate", microel. eng., vol. 88, no. 10, pp. 3158-3163, 2012.
[34] m. agatonović, z. marinković, v. marković, "application of anns in evaluation of microwave pyramidal absorber performance", applied computational electromagnetics society journal, vol. 27, no. 4, pp. 326-333, 2012.
[35] z. marinković, o. pronić-rančić, v. marković, "small-signal and noise modelling of class of hemts using knowledge-based artificial neural networks", int. j. rf microw. c e, vol. 23, no. 1, pp. 34-39, 2013.
[36] z. marinković, n. ivković, o. pronić-rančić, v. marković, a. caddemi, "analysis and validation of neural approach for extraction of small-signal models of microwave transistors", microelectron. reliab., vol. 53, no. 3, pp. 414–419, march 2013.
[37] s. di nardo, p. farinelli, f. giacomozzi, g. mannocchi, r. marcelli, b. margesin, p. mezzanotte, v. mulloni, p. russer, r. sorrentino, f. vitulli, l. vietzorreck, "broadband rf-mems based spdt", in proceedings of the european microwave conference, 2006.
[38] z. marinković, t. kim, v. marković, m. milijić, o. pronić-rančić, l. vietzorreck, "rf mems modelling with artificial neural networks", in proceedings of the memswave 2013, 2013.
[39] z. marinković, t. kim, v. marković, m. milijić, o.
pronić-rančić, l. vietzorreck, "artificial neural network based design of rf mems capacitive shunt switches", submitted to aces applied computational electromagnetics society journal
[40] advanced design system 2009, agilent technologies
[41] comsol multiphysics 4.3, comsol, inc.
[42] zlatica marinković, ana aleksić, tomislav ćirić, olivera pronić-rančić, vera marković, tomislav ćirić, "analysis of rf mems capacitive switches by using neural model of actuation voltage", 2nd international conference on electrical, electronic and computing engineering (icetran 2015), silver lake, serbia, june 8-11, 2015, pp. mti2.3.1-5.
[43] t. ćirić, z. marinković, t. kim, l. vietzorreck, o. pronić-rančić, m. milijić, v. marković, "ann approach for mechanical characteristics modelling of rf mems capacitive switches," submitted to journal of electrical engineering-elektrotechnicky casopis
[44] z. marinković, t. ćirić, v. đorđević, o. pronić-rančić, t. kim, m. milijić, v. marković, l. vietzorreck, "ann approach for the analysis of the resonant frequency behavior of rf mems capacitive switches", in proceedings of the first international conference on electrical, electronic and computing engineering icetran 2014, 2014, pp. mti2.1.1-5
[45] t. ćirić, z. marinković, o. pronić-rančić, v. marković, l. vietzorreck, "ann approach for analysis of actuation voltage behavior of rf mems capacitive switches", in proceedings of the 12th international conference on advanced technologies, systems and services in telecommunications telsiks 2015, 2015.
[46] z. marinković, t. ćirić, t. kim, l. vietzorreck, o. pronić-rančić, m. milijić, v. marković, "ann based inverse modelling of rf mems capacitive switches", in proceedings of the 11th conference on telecommunications in modern satellite, cable and broadcasting services telsiks 2013, 2013, pp. 366-369.
[47] l. vietzorreck, m. milijić, z. marinković, t. kim, v. marković, o.
pronić-rančić, "artificial neural networks for efficient rf mems modelling", in proceedings of the xxxi ursi general assembly and scientific symposium ursi gass, 2014, pp. 1-3.
[48] t. ćirić, z. marinković, t. kim, l. vietzorreck, o. pronić-rančić, m. milijić, v. marković, "ann based inverse electro-mechanical modelling of rf mems capacitive switches", in proceedings of the xlix scientific conference on information, communication and energy systems and technologies icest 2014, 2014, pp. 127-130.
[49] z. marinković, a. aleksić, o. pronić-rančić, v. marković, l. vietzorreck, "analysis of rf mems capacitive switches by using switch em ann models", accepted for telfor journal, in press
[50] z. marinković, a. aleksić, t. ćirić, o. pronić-rančić, v. marković, l. vietzorreck, "inverse electromechanical ann model of rf mems capacitive switches applicability evaluation", in proceedings of the xlx scientific conference on information, communication and energy systems and technologies icest 2015, 2015.
[51] z. marinković, a. aleksić, t. ćirić, o. pronić-rančić, v. marković, t. ćirić, "analysis of rf mems capacitive switches by using neural model of actuation voltage", in proceedings of the 2nd international conference on electrical, electronic and computing engineering icetran 2015, 2015, pp. mti2.3.1-5.

facta universitatis
series: electronics and energetics vol. 29, no 3, september 2016, pp. 357-365
doi: 10.2298/fuee1603357k

an architectural design for cloud of things

abhirup khanna
b.tech cse with specialization in cloud computing and virtualization technology
university of petroleum and energy studies (upes), dehradun, uttarakhand, india

abstract. in recent times the world has seen an exponential rise in the number of devices connected to the internet. this widespread expansion of the internet and growth in the number of interconnected devices has led to the rise of many new age technologies.
internet of things (iot), being one of them, allows devices that are connected through the internet to communicate with one another. it provides a new way of looking at pervasive computing wherein "things", be it sensors, embedded devices, actuators or humans, interact with one another. but currently iot is facing a number of challenges related to scalability, interoperability, storage capacity, processing power and security, which all act as a deterrent to its practical implementation. cloud computing, the buzzword of the it industry, is best suited to handle all these challenges, thus leading towards the integration of cloud and iot. in this paper, we present a layered architecture for cloud of things, i.e. the amalgamation of cloud computing and internet of things. the architecture provides a scalable approach for iot as it allows dynamic addition of n number of "things". moreover, the architecture allows the end users to host their applications on the cloud and access iot systems remotely. towards the end, the paper discusses a use case that demonstrates the applicability of the proposed architecture.

key words: cloud of things, internet of things, cloud computing, ubiquitous computing

1. introduction

in present day times the internet has become a key aspect of everyone's life. from shopping malls to banks, from e-health to military equipment, the internet has made its mark. with the advancement of the internet, more and more people located at distant places throughout the globe are able to connect with one another. this outburst of the internet has given birth to a new idea of having every object connected to one another. soon the number of things connected to the internet would surpass the number of people living on earth. according to an estimate given by cisco, 50 billion devices would be connected through the internet by the year 2020. the future will have things communicating to one
the future will have things communicating to one received june 30, 2015; received in revised form november 12, 2015 corresponding author: abhirup khanna university of petroleum and energy studies (upes), bidholi, via prem nagar, dehradun, uttarakhand 248007, india (e-mail: abhirupkhanna@yahoo.com) 358 a. khanna another rather than humans; in fact they would be talking on behalf of humans [1]. this rise in the outreach of the internet is gradually leading towards an era of internet of things (iot). wherein the objects (things) connected to the internet would be sharing information with other objects as well as with humans. new ways of communication would evolve allowing humans and things to communicate with one another. iot can be seen as a revolution in the field of computer science and would play a vital role in shaping the future of computing. the term internet of things was introduced way back in the year 1999 by kevin ashton. at that time many people thought it to be just an analogy for m2m communication but they never realized how big iot can become. it is true that the concept of iot follows the principles of m2m communication, but it cannot be considered as an analogy for it [2]. m2m communication finds its application in the late 1960s and early 1970s. it was a term used by the telecom industry to denote point to point communication. m2m communication was merely connecting embedded devices to one another through cellular or wired networks. whereas on the other hand internet of things is far more than this, having an ip based networking model along with the integration of sensors and embedded devices. iot allows various kinds of heterogeneous devices to connect to one another, collect data, exchange information and depict this information onto the real world with the help of actuators. iot facilitates the use of wireless sensor networks (wsn) in order to collect information from sensors present at remote locations. 
a wsn comprises n self powered sensing nodes connected through a wireless network. these nodes detect events, gather information and transmit this information to their base stations. to be precise, iot is not just about embedded devices connected to one another; rather, it consists of a large set of actors that lead to its proper functioning. the actors that constitute the entire system of internet of things include sensors, embedded devices (things), sensor networks, actuators and humans. sensors gather data that is transmitted to embedded devices through sensor networks. things process this data, generate information and exchange this information with one another or even with humans. specific actions are performed by the actuators or humans in accordance with the processed information. in any discussion of iot there is always a mention of data that is being exchanged, processed or depicted in the physical world. with the increase in the number of "things", the data being exchanged or processed by them will also increase, leading to an outburst of unstructured heterogeneous data. present day embedded devices lack the capabilities to store and process this huge amount of data, which motivates the integration of cloud computing and internet of things [3]. cloud computing needs no introduction as it is one of the major game changers in the field of computer science. nowadays, a lot is heard about cloud computing and how it is being implemented in every walk of life. cloud computing is a next generation computing model that allows users to have access to resources on a pay as you use basis. cloud is constructed on the foundation of virtualization, thus allowing its users to access a virtually unlimited amount of resources from remote locations.
dynamic resource allocation, platforms to host heterogeneous services and applications, virtualization of resources, and virtually unlimited storage and processing capabilities are what make cloud the buzzword of the it industry. the integration of cloud computing and iot will give rise to a new computing paradigm having the benefits of both iot and cloud. this new paradigm can be addressed as cloud of things (cot), wherein cloud acts as a central control and processing unit and things are the real world entities which collect data and represent information in the form of suitable actions. but in order to make cot a reality there is an urgent need for an architecture that could depict its internal and external working. the architecture would define various actors along with their functionalities required to construct an ecosystem for cot. the aim of this paper is to present such an architecture that represents the amalgamation of cloud and iot. in this paper we propose a layered architecture for cot that leverages the capabilities of cloud and explores the outreach of internet of things. the rest of the paper is organized as follows. section 2 talks about the challenges of iot and the benefits of its integration with cloud. section 3 discusses some of the architectures for internet of things. in section 4 we have the proposed architecture for cloud of things. in section 5 there is a use case to validate the proposed architecture. finally, section 6 provides a conclusion for the paper.

2. challenges for cloud in iot

till now we have discussed the benefits of iot and how its implementation could ease the way of living. but there are several challenges which we come across in the implementation of iot and which limit its potential to become the future of computing [4]. below are some of the prominent challenges which need to be addressed before implementing iot.
1) interoperability: it is said that iot is based on diversity and not interoperability, but it is essential for an iot driven system to foster both technical and semantic interoperability [5]. every system working on the guidelines of iot should allow various kinds of heterogeneous devices to connect to one another. devices should be allowed to communicate among themselves irrespective of their operating systems or hardware configurations. semantic interoperability also needs to be harnessed so that every device has a correct and consistent interpretation of the exchanged information.
2) data access and control: data sharing is an essential part of iot. it would be beneficial for all if various organizations could come together and share their data in order to gain useful insights. who can access and control this data is therefore a big question, as data ownership still remains a concern for iot.
3) security and privacy: data integrity and privacy are major concerns for iot as most of the data exchanged comprises users' personal information. issues such as protecting users' privacy and manufacturers' ip, and detecting and blocking malicious activity, come under security threats pertaining to iot [6]. implementing energy efficient data encryption schemes while maintaining a proper authentication mechanism is a challenge for iot.
4) storage capacity: embedded devices used in iot lack the storage capabilities that are needed to store huge volumes of data collected from various sensors. their inability to store large amounts of data makes the system inefficient and leads to the creation of incomplete data sets.
5) processing power: things involved in iot lack processing capabilities and thus are unable to process huge volumes of data. this lack of processing power leads to incomplete information which, when depicted, leads to incorrect actions.
6) power consumption: the devices used in internet of things, be they sensors or actuators, require power to run. new research needs to be done in promoting the use of low power devices that draw little battery power and can run for years.
7) reliability: iot systems need to be reliable in order to meet the industry standards. no single point of failure should hamper the working of the entire system. the system needs to be flexible, robust and fault tolerant in nature [7].
8) scalability: one major question related to internet of things is how big it can become, or to put it another way, how scalable iot is [8]. there are very few systems or architectures that fully explore the scalability of iot. a lot of work needs to be done in designing systems for iot that facilitate dynamic increase and decrease of things.

after going through all the above mentioned challenges, cloud seems to be the best solution for all of them. integration of cloud with iot will allow iot systems to have access to virtually unlimited storage and processing capabilities along with efficient security mechanisms. amalgamation with cloud will provide flexibility, scalability and robustness to the entire system. cloud will also act as a platform where service providers can host their services and monitor the working of the entire system. for end users cloud would act as an interface through which they can interact and communicate with their devices. the fusion of iot and cloud will also be beneficial for cloud providers as they will be able to extend the reach of their services to real world entities in a more dynamic and distributed manner.

3. related work

since the outburst of iot, many architectures have been proposed in order to implement it in a practical scenario. similarly, many frameworks have been proposed that exhibit the fusion of cloud and iot. in this section, we present some of the research works pertaining to this area.
• diat stands for distributed internet-like architecture for things. it is a layered architecture for iot that works on the principles of service oriented architecture (soa) and ensures minimum human involvement [9]. the architecture comprises three layers, namely, virtual object layer (vol), composite virtual object layer (cvop) and service layer. all three layers are clubbed together along with their functionalities into a stack called the iot daemon. this is the very daemon that forms the core of the entire architecture. talking of the different layers, the vol acts like an interface between the virtual and physical worlds and is responsible for the virtual representation of objects. the work of the cvop is to ensure communication and interaction between virtual objects present at the vol. last comes the service layer, whose work is to manage and monitor all kinds of services. it can also initiate service creation on its own in order to make the entire system automated.
• marm, also known as multi agent based rfid middleware, is software that is built on the principles of agent oriented software engineering [10]. it also has a layered architecture with three layers for device management, data management and user interface. there is another architecture proposed in [11] that makes use of a cell based structure in order to ease the traffic congestion between rfid readers and tags.
• next is an architecture which talks about the integration of cloud and iot. cloudthings is an architecture that aims at the integration of cloud computing and iot and interacts with all the three delivery models (iaas, paas, saas) of cloud [12]. the purpose of the architecture is to enhance the experience of application development and management through the use of cloud computing.
• when dealing with internet of things mobile devices play a major role.
with the advancements in smartphone technology, mobile devices are a natural match for what we call a "thing" in iot. mosden focuses on this aspect and provides a middleware between a mobile device and iot [13]. the middleware makes use of mobile devices as sensing units and transmits the sensed data to the backend systems. thus the work of developers is made easy by allowing them to code at the backend rather than on the mobile device itself.
• thin clients have always been used to propagate the principles of ubiquitous computing and now they are being implemented in designing systems for iot. the architecture proposed in [14] makes use of thin clients as thin servers which act as an interface for low level devices such as sensors and actuators. the architecture deploys communication protocols such as coap and http to facilitate communication between various applications and devices. the apps and thin servers make use of restful api calls to interact with the low end devices. the application model of the architecture works similarly to web mashups and enables developers to reuse their code in designing new services. for discovery of new nodes the architecture relies on metadata such as rfid tags, names, geospatial information, etc.

4. proposed architecture

a scalable and robust architecture is required to ensure proper working and implementation of a cot based ecosystem. the architecture must cope with the never ending requirements of the end user along with tackling the challenges mentioned in section 2. constructing an architecture is the first step towards a solution. in this section we propose an architecture for cloud of things which would act as a blueprint for the technology and describe the various components that constitute it. below is the detailed description of a layered architecture along with its various actors pertaining to cloud of things.
sensing layer: this layer comprises the various kinds of sensors present in the system. the work of the sensors is to gather information and transmit it to the subsequent network layer. sensors act as the eyes and ears of the system: they detect events and transmit the collected information. every sensor can be categorized on the basis of three parameters, namely, sensor type, methodology and sensing parameters. sensor type defines which type of sensor it is, i.e. whether it is a homogeneous or a heterogeneous sensor or if it is a single dimensional or multidimensional sensor. methodology describes the way in which the sensor gathers information. it can be either active or passive. active sensing means direct collection of data, e.g. from an mri, while passive sensing is inferring data (e.g. blood pressure) from the data collected by active sensing. sensing parameters are the number of parameters which a sensor is able to sense. a sensor might just sense one parameter like body temperature or many parameters like in the case of ecg. the sensing layer may also comprise rfid readers which gather information from rfid tags. these rfid tags can store large amounts of information and can be easily attached to any object, be it an animal, consumer product or a human being.

fig. 1 layered architecture for cloud of things

communication layer: it is also known as the network layer. the purpose of this layer is to maintain communication among various sensors, things and humans. the three broad categories of communication that take place are:
• sensor to thing.
• thing to thing.
• human to thing.
it is the communication layer which receives information from the sensing layer and forwards it to the control layer. the network layer comprises two gateways which act as collection points to combine information collected from various sensors and rfid readers.
these gateways combine all forms of unstructured information and transmit it to the subsequent control layer. the communication layer makes use of several networks in order to maintain interaction at various levels. wsn or wireless sensor networks form the core of the network layer. in the case of wsn, sensors are connected through a wireless network and transmit information to their respective hosts through wireless communication. another type of sensor network which the network layer uses is the body sensor network (bsn). it consists of sensing nodes that are implanted inside or placed on a patient's body. the work of the sensing nodes is to monitor and sense physiological parameters of a patient such as blood pressure and body temperature. the communication layer may also comprise nsg, i.e. net generation networks, which are a combination of body sensor networks and social networks. the communication layer works on the ip based networking model and provides a unique ip address to every node (sensor, thing, human) connected through the system. as the number of nodes in a system may increase at an exponential rate, the communication layer implements the ipv6 addressing scheme to map every node.

control layer: it is the most important layer of the entire architecture. it is the control layer which derives useful insights by performing computations over the data received from the communication layer. in technical terms, the control layer can be considered the cloud layer as it is where all the data is stored and processed. the control layer is also known as the service layer as it provides a platform for service providers to host their services. it also acts like a web portal for end users to add, delete and monitor their devices (things and sensors). any device can become a part of the system by registering itself. after successful registration every device is allotted a unique id and password.
the id is usually the ip address of that device and password is for secure authentication. once the device is added its entry is made in the cloud data base. it is the scalable and robust nature of cloud that facilitates dynamic addition and subtraction of nodes. the control layer receives data from the communication layer and stores it in the data bases. it then applies certain algorithms and performs the n number of computations on the stored data in order to find interesting patterns. the computations performed on the data are in accordance to the service to which it belongs. the results after processing are reverted back to the end user and are either depicted by the actuators or represented in the form of useful information (knowledge). actuation layer: the purpose of this layer is to represent information received from the control layer. it is the actuators which receive and represent useful insights coming from the control layer into the physical world. the actuation layer comprises of robotic arms, led screens, motors, pulleys, etc. the process of actuation can either be manual or automatic. in case of manual actuation, human intervention is involved and results are depicted by humans based upon the suggestions given by the control layer; whereas in automatic actuation the actuators work on their own in accordance with the information received from the control layer. 5. proof of concept over the past few years internet based technologies have found their way in numerous health care applications. with the advancements in sensor technologies iot is able to find its use in several medical applications [15]. iot aims at easing the life of people and it does the same when dealing with patients. iot based systems are able to provide convenience to both patients and doctors by offering services such as real time monitoring of the patient, health management, emergency management and patients information management. 
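the registration step described for the control layer in section 4 can be illustrated with a minimal python sketch. it is not the implementation used in the test bed: the class name `DeviceRegistry`, the in-memory dictionary standing in for the cloud database, and the randomly generated password are all assumptions made here for illustration. following the text, the device id is taken to be the (normalised) ipv6 address of the device.

```python
import ipaddress
import secrets
from dataclasses import dataclass, field


@dataclass
class DeviceRegistry:
    """Hypothetical stand-in for the cloud database of registered devices."""
    devices: dict = field(default_factory=dict)

    def register(self, ipv6_address, kind):
        """Add a device and return its (id, password) pair."""
        # The id is the normalised IPv6 address; ValueError is raised
        # by the ipaddress module if the string is not valid IPv6.
        device_id = str(ipaddress.IPv6Address(ipv6_address))
        if device_id in self.devices:
            raise ValueError(f"{device_id} is already registered")
        # Assumption: a random token serves as the password. A real
        # system would store a salted hash rather than the plain value.
        password = secrets.token_urlsafe(16)
        self.devices[device_id] = {"kind": kind, "password": password}
        return device_id, password

    def authenticate(self, device_id, password):
        """Check a device's credentials using a constant-time comparison."""
        entry = self.devices.get(device_id)
        if entry is None:
            return False
        return secrets.compare_digest(entry["password"], password)

    def remove(self, device_id):
        """Dynamic removal of a node; silently ignores unknown ids."""
        self.devices.pop(device_id, None)
```

a device would call `register` once, keep the returned pair, and present it on every later connection; `remove` corresponds to the dynamic subtraction of nodes mentioned in the text.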
in order to validate the proposed architecture we created a test bed for it. the use case used here is a health care monitoring system. the system monitors the physiological parameters of a patient on a real time basis and takes suitable actions if the value of a parameter goes out of range. the system also monitors the geospatial location of the patient with the help of a gps sensor. below is the working of the health care monitoring system.
• the system monitors three physiological parameters, namely, blood pressure, body temperature, and pulse rate of a patient. the sensors used for this purpose are a pressure sensor, a temperature sensor and a pulse sensor.
• all the sensors are connected to an arduino board. the board is also connected to a 2.4" tft lcd screen in order to display suitable information.
• the raw data is collected from the sensors and transmitted to an application running on the cloud. data is transferred using the internet protocol. the application stores the data in the cloud database and checks whether the parameters lie in the normal range or not.
• if any of the parameters goes beyond its normal range the application communicates with the arduino board.
• the arduino board takes suitable actions, such as displaying the name of a prescribed medicine or transmitting the coordinates of the patient to the application by communicating with the gps sensor.
• once the application has received the geospatial coordinates it can book an appointment with the doctor for a house visit or call an ambulance to that specific location.

6. conclusion

over the last decade, the internet has changed drastically, as have the needs of its users. with the growing popularity of the internet the number of users accessing it has also increased. in this paper, we propose an architecture for cloud of things which is an amalgamation of cloud computing and internet of things.
cot is a new age technology that copes with the ever increasing size of the internet as well as with the never ending requirements of the end users. the proposed architecture describes the various actors along with their functionalities that are required to set up a cloud integrated iot system. upcoming technologies such as fog computing or cloudlets will be more appropriate for iot than cloud, thus leading researchers to explore new avenues related to them. in the future, people from different walks of life can make use of this architecture to implement a cot system in a practical scenario.

references
[1] m. gomes, r. da rosa righi, c. da costa, “internet of things scalability: analyzing the bottlenecks and proposing alternatives”, in proceedings of the ieee 6th international congress on ultra modern telecommunications and control systems and workshops (icumt), 2014, pp. 269-276.
[2] c. doukas, l. capra, f. antonelli, e. jaupaj, a. tamilin, i. carreras, “providing generic support for iot and m2m for mobile devices”, in proceedings of the ieee rivf international conference on computing & communication technologies-research, innovation, and vision for the future (rivf), 2015, pp. 192-197.
[3] s. w. kum, j. moon, t. lim, j. i. park, “a novel design of iot cloud delegate framework to harmonize cloud-scale iot services”, in proceedings of the ieee international conference on consumer electronics (icce), 2015, pp. 247-248.
[4] v. gazis, m. goertz, m. huber, a. leonardi, k. mathioudakis, a. wiesmaier, f. zeiger, short paper: “iot: challenges, projects, architectures”, in proceedings of the 18th international conference on intelligence in next generation networks (icin), 2015, pp. 145-147.
[5] o. vermesan, p. friess, internet of things-global technological and societal trends from smart environments and spaces to green ict, 2011, river publishers.
[6] r. h.
weber, “internet of things–new security and privacy challenges” computer law & security review, vol. 26, no. 1, pp. 23-30, 2010. [7] h. d. ma, “internet of things: objectives and scientific challenges”, journal of computer science and technology, vol. 26, no. 6, pp. 919-924, 2011. [8] d. miorandi, s. sicari, f. de pellegrini, i. chlamtac, “internet of things: vision, applications and research challenges”, ad hoc networks, vol. 10, no. 7, pp. 1497-1516. 2012. [9] c. sarkar, a. uttama nambi sn, r. prasad, a. rahim, r. neisse, g. baldini, diat: a scalable distributed architecture for iot, 2012. [10] l. v. massawe, f. aghdasi, j. kinyua, “the development of a multi-agent based middleware for rfid asset management system using the passi methodology”, in proceedings of the sixth international conference on information technology: new generations, 2009. itng'09. pp. 1042-1048. [11] a. solanas, j. domingo-ferrer, a. martínez-ballesté, v. daza, “a distributed architecture for scalable private rfid tag identification” computer networks, vol. 51, no. 9, pp. 2268-2279, 2007. [12] j. zhou, t. leppanen, e. harjula, m. ylianttila, t. ojala, c. yu, l. t. yang, “cloudthings: a common architecture for integrating the internet of things with cloud computing”, in proceedings of the ieee 17th international conference on computer supported cooperative work in design (cscwd), 2013, pp. 651-657. [13] c. perera, p. p. jayaraman, a. zaslavsky, d. georgakopoulos, p. christen, “mosden: an internet of things middleware for resource constrained mobile devices” in proceedings of the 47th hawaii international conference on system sciences (hicss), 2014, pp. 1053-1062. [14] m. kovatsch, s. mayer, b. ostermaier, b. “moving application logic from the firmware to the cloud: towards the thin server architecture for the internet of things”, in proceedings of the sixth international conference on innovative mobile and internet services in ubiquitous computing (imis), 2012, pp. 751-756. [15] j. choi, m. 
ha, j. im, j. byun, k. kwon, w. yoon, d. kim, “the patient-centric mobile healthcare system enhancing sensor connectivity and data interoperability”, in proceedings of the international conference on recent advances in internet of things (riot), 2015, pp. 1-6.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 313-326 doi: 10.2298/fuee1703313h

circular test structures for determining the specific contact resistance of ohmic contacts

anthony s. holland, yue pan, mohammad saleh n. alnassar, stanley luong
school of engineering, rmit university, melbourne, victoria, australia

abstract. though the transport of charge carriers across a metal-semiconductor ohmic interface is a complex process in the realm of electron wave mechanics, such an interface is practically characterised by its specific contact resistance. error correction has been a major concern in regard to specific contact resistance test structures and investigations by finite element modeling demonstrate that test structures utilising circular contacts can be more reliable than those designed to have square shaped contacts as test contacts become necessarily smaller. finite element modeling software nastran can be used effectively for designing and modeling ohmic contact test structures and can be used to show that circular contacts are efficient in minimising error in determining specific contact resistance from such test structures. full semiconductor modeling software is expensive and for ohmic contact investigations is not required when the approach used is to investigate test structures considering the ohmic interface as effectively resistive.

key words: ohmic contact, specific contact resistance, contact resistance, test structure, circular transmission line model, transmission line model.

1. introduction

in practice, ohmic contacts are one of the least complex aspects of semiconductor devices.
received january 23, 2017. corresponding author: anthony s. holland, school of engineering, rmit university, melbourne, victoria, australia (e-mail: anthony.holland@rmit.edu.au)

the modelling of an ohmic contact requires only contact geometry, material resistivities and the specific contact resistances of all contact interfaces in the contact structure. if a contact is ohmic then its current-voltage behaviour is linear. an ohmic contact interface has a finite thickness defined by the alignment of the fermi levels of the two contacting materials at equilibrium and the thickness is really that of the 'disturbed' region (depletion layer) of the semiconductor. the 'undisturbed' region (undisturbed by the presence of the metal) of the semiconductor behaves resistively as intended due to whatever doping values it was fabricated with. though the transport of charge carriers across a metal-semiconductor ohmic interface is a complex process, such an interface is practically characterised by a characteristic specific contact resistance (scr). the evolution of semiconductor devices required the lowering of values of this parameter for ohmic contacts and the investigation of test structures to determine these small values has been a significant and important area of research. error correction has been a major concern in regard to scr test structures and investigations by finite element modelling demonstrate that test structures utilising circular contacts can be more reliable than square shaped contacts which are impractical to realise for small geometries. circular designed contacts will remain as circles when fabricated and hence their area can be accurately determined. test structures with circular contacts can be realised with an equipotential always resulting at the contact circumference and mathematical solutions are obtainable.
in this paper it is shown that the finite element modelling software nastran can be used effectively for designing and modelling ohmic contact test structures and that circular contacts are efficient in minimising error in determining scr from such test structures. nastran software solves for heat flow and gives the temperature contour distribution. it has been extensively used for solving problems for the analogous situation of electrical current flow and equipotential distribution. full semiconductor modelling software is expensive and is not required for ohmic contact investigation when the approach used is to investigate test structures considering the ohmic interface as purely resistive. more complex investigations that consider tunnelling probability and other aspects of charge transport across an interface will require software for full semiconductor physics solutions, but this is not necessary when an experimentalist wants to determine the effect of this physics, which is an interface's scr. there are two ways to look at the parameter specific contact resistance (represented by the symbol $\rho_c$, in units of Ω·cm²). first there is the rather academic or theoretical way of describing it as the inverse of the derivative of current density with respect to voltage (j-v, at the origin) for uniform current density, so that even a schottky contact has an scr value. there is much to be gained from this first approach in understanding the physics of current across a metal-semiconductor junction. semiconductor software tools are well suited to this first approach, but real test structures are not, as it is difficult to isolate a contact so that it is the only entity determining a j-v curve and have uniform current density at the same time. however, this approach is worth pursuing with computer modelling (and actual test structures if possible) of appropriate test structures to demonstrate the physics in scr equations [1].
the second, more practical investigation is to study scr in regard to practical ohmic contacts only, so that the derivative of a contact's j-v curve is the same at the origin as it is at practical voltage values, e.g. j-v being linear from -5 v to +5 v. this second approach need not consider the physics of current transfer but rather the effective resistance of a contact interface as it contributes to the total resistance of a source or drain contact of a mosfet, for example. in practical ohmic contacts the depletion layer of the metal-semiconductor interface is relatively small and for active layers of a practical contact structure, each layer-to-layer interface can be considered to have a unique scr value. having determined scr values for any layer-to-layer interface should enable accurate modelling of structures of any geometry involving such interfaces. so, the second approach is very much 'try and see', where the first enquiry is to determine whether a particular contact interface is ohmic and, if so, what its scr value is and whether this can be used to determine the effective resistance of a contact of a particular area. the reverse is also used, where the effective resistance of a two-layer contact can be used to determine the scr of the interface using appropriate analytical expressions relating scr and contact resistance. although the authors are not aware of any report on using computer modelling in this reverse way, it should be possible to use computer modelling in an iterative way to determine what scr realises a given (e.g. experimentally determined) contact resistance. the study of scr requires the use of test structures for measuring the voltage drop across the contact of interest and any parasitic resistance encountered. it is the parasitic resistance that causes most difficulty.
another difficulty is the effect of contact area: unless the area is small enough that uniform current distribution can be assured, the concept of transfer length has to be considered. it is in regard to area that circular contacts have an advantage (compared to square contacts) in that even though a circular contact realised after fabrication may not have the same diameter as designed, it will still be a circle and its diameter and area can be accurately determined. square designs have the disadvantage of ending up having rounded corners. several test structures have been developed using this advantage of circular contact designs [2-5]. the circular transmission line model (ctlm) ohmic contact test structure [6] was developed not with this advantage of circular contacts always being circular but with the advantage that no mesa etch or active area isolation is required, which is a significant advantage compared to linear transmission line model (tlm) ohmic contact test structures [7, 8]. the disadvantages of the latter include the active layer isolation process steps and active layer overlap of contacts where in theory there should be none. the cross kelvin resistor (ckr) test structure has the same disadvantages [9].

2. ohmic contact characterisation

ohmic contacts are fundamentally important: there are at least two contacts in every transistor and there are billions of transistors on the most complex semiconductor chips. ohmic contact research is crucial for the development of novel nanotechnology devices [1]. it is imperative to have low resistance contacts to these nanoscale devices. fundamental understanding of ohmic contact structures, materials properties and processing will result in better semiconductor device performance and enhanced power efficiency. scr is an extremely important parameter for quantifying a metal to semiconductor ohmic contact.
it is defined theoretically as the reciprocal of the derivative of current density with respect to voltage at V = 0 [8] (equation 1). a good ohmic contact requires a negligible value of scr to ensure that the linear i-v characteristic (between two such contacts) is mainly due to the resistance of the semiconductor.
$\rho_c = \left( \partial J / \partial V \right)^{-1} \big|_{V=0}$ (1)
note that equation (1) is the definition of scr, which is a theoretical quantity referring to the metal-semiconductor interface only. in practice, a more meaningful definition of the scr for a real metal-semiconductor ohmic contact is an electrical parameter which is determined from the measured contact resistance between a metal and a semiconductor. scr is a very useful term for characterising ohmic contacts because it is independent of contact area and is a convenient parameter when comparing contacts of various sizes. although in practice an experimentalist or device process engineer will want to know how many ohms an ohmic contact presents to current flow, a design engineer can utilise known scr values to better design and model a contact, provided the contact layers, interfaces and geometry parameters are all known, including the scr of each interface.
316 a. s. holland, y. pan, m. s. n. alnassar, s. luong
to ensure accurate semiconductor modelling, any scr determined experimentally and used in contact or device modelling should be in agreement with equation 1, if this is possible to demonstrate. ohmic contact characterisation is carried out by using test structures to investigate the electrical behaviour and a suite of materials analysis tools to investigate the materials which make up the ohmic contacts, e.g. a silicide (metal-silicon reaction product) layer. characterisation usually aims to attain the outcomes listed below for optimising contact properties.
1. use test structures to accurately quantify the resistance due to contact interfaces.
this resistive property is quantified using scr, and improved efficiency and speed in determining low scr values for ultra-small contacts is often a goal of particular research in this area.
2. understand the influence of mechanical, electrical and thermal materials parameters, in particular the influence of defects and stress formation at the contact interface, on scr values.
3. optimise test structures and demonstrate new ones to confirm a test structure's suitability for determining processing changes that contribute to reducing the scr of, for example, metal-silicide-silicon contacts.
4. hybridise analytical calculations and numerical computations of ohmic contact architectures to model the electrical behaviour of fabricated test structures.
item 4 above is an area that could be explored further. multilayer ohmic contact test structures will of course have an effective resistance to electrical current, and the resistance of such contact structures can be determined more accurately if the scr of each interface is included, the other parameters being layer resistivities and geometries (for ohmic contact structures only). scr is a parameter that has been reduced by several orders of magnitude throughout the semiconductor era (due, for example, to the introduction of silicides for silicon contacts [17]). reported values of scr for some ohmic contacts are listed in table 1. in the international technology roadmap for semiconductors (itrs) the scr values required for particular technology nodes have been given in many of its publications, showing significant reduction in the required value as technology generations progress. in 2017 the target is in the low 10⁻⁹ Ω·cm² range. determining the value of scr quantifies the interface for a particular processing technology and gives information about the quality of an ohmic contact fabrication process.
it also allows comparison of different two-layer ohmic contacts, or of different device processes using the same two layers for contacts. hence, determining scr allows optimisation of the process for forming an ohmic contact. determining accurate values of scr will aid better modelling of contact structures in order to minimise the contact resistance (rc). note that scr is the biggest contributor to rc for relatively small contacts. minimising rc in turn minimises the net resistance of a circuit and its overall power consumption, resulting in more power-efficient devices and circuits. unlike contact resistance rc, the scr value should not include contributions from the resistivity of the two contacting layers or topological effects due to the contact geometry design. if a reported value of scr does include these effects, it is regarded as an effective scr for a particular contact (including geometry), and such an scr value cannot be used in designing and modelling other geometries with the same contact layers (and processing steps). the units of scr (Ω·cm²) may be misleading to some researchers who have not specialised in this area. this parameter cannot be used directly to determine the resistance contribution of the interface in a two-layer contact unless one is confident that the current is uniformly distributed in the contact interface. if current can be assumed to be distributed uniformly in the interface of a two-layer contact, then for area a (cm²) the resistance of the interface is simply ρc/a (Ω). this assumption does not always hold (unless a is relatively small) because electrical current in most semiconductor devices (which are planar) has to turn 90° into a contact, and so is not always uniformly distributed across a contact interface.
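for the uniform-current case only, the relation between scr and interface resistance is simple enough to check numerically. the sketch below assumes a circular contact and a hypothetical diameter of 1 μm; the scr value of 1 × 10⁻⁸ Ω·cm² is of the order listed in table 1, and the function name interface_resistance is introduced here purely for illustration.

```python
import math


def interface_resistance(rho_c_ohm_cm2, diameter_cm):
    """interface resistance (ohm) of a circular contact, assuming uniform current.

    implements r = rho_c / a with a = pi * (d / 2)^2; this is not valid
    when current crowds at the contact edge.
    """
    area_cm2 = math.pi * (diameter_cm / 2.0) ** 2
    return rho_c_ohm_cm2 / area_cm2


# hypothetical example: rho_c = 1e-8 ohm.cm^2, diameter = 1 um = 1e-4 cm
resistance = interface_resistance(1e-8, 1e-4)
print(f"{resistance:.3f} ohm")  # prints 1.273 ohm
```

halving the diameter quadruples this resistance, which is why scr dominates rc for relatively small contacts.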
current can, for example, flow laterally under the gate region of a transistor and then turn upwards through the drain ohmic contact; the same applies to a contact in a test structure. the distribution of current in the drain contact area is dependent on the value of ρc, but is also influenced by other parameters such as the resistivity of any silicide used, the interconnect material and any liner used, and by the geometry of these materials. intuitive understanding in this case can be misleading, and only rigorous analytical and numerical modelling will portray the actual current distribution.

table 1 reported values of specific contact resistance (scr) for some ohmic contacts
ohmic contact layers    scr value (Ω·cm²)    ref.
al-si                   1 × 10⁻⁶             [11]
al-wsix-si              3 × 10⁻⁷             [12]
al-tisi2-si             1 × 10⁻⁸             [13]
al-tisi2                4 × 10⁻⁹             [5]
nisi-si                 5 × 10⁻⁹             [14]
nige-ge                 2.3 × 10⁻⁹           [15]
tisix-si                1.3 × 10⁻⁹           [16]

3. test structure modelling
the main test structures used for characterising ohmic contacts, and determining scr in particular, are the transmission line model (or transfer length method) (tlm) [7, 8], the cross kelvin resistor [9] and the circular transmission line model [6]. more recent test structures are the multi-ring ctlm [15], the refined tlm [16] and the two-circle electrode contacts [4]. one of the main issues with test structures based on the transmission line model is that they are essentially 2-d models and do not allow for vertical voltage drops. an estimate as to whether a 3d correction is applicable to a contact can be made by calculating the parameter η, where η = ρc/(ρb·t), ρb is the resistivity of the semiconductor layer and t is its thickness. this parameter was first used by berger [8] to estimate the influence of semiconductor depth and resistivity on the derivation of ρc using transmission line model test structures.
the parameter gives an indication of the ratio of the voltage drop across the contact interface to the voltage drop in the vertical direction occurring in the semiconductor material beneath the contact. when η < 1, 3d effects are significant, as the voltage drop in the vertical direction in the semiconductor layer is nominally greater than the voltage drop across the contact interface (scr = ρc). when η > 1 and increasing, the voltage drop in the vertical direction becomes less important (the contribution of this vertical voltage drop compared to the measured values becomes less significant) and 2d modelling will be sufficiently accurate. calculation of η requires some knowledge of ρc; however, an initial upper figure for η can be found using ρc determined from a 2d correction. this will give an indication of whether a 3d correction may be applicable. if η < 1 and the corrections are made using 2d data, then significant errors (overestimation) can be introduced in the derivation of ρc [18]. by using finite element analysis we can optimise the use of material and geometries of interconnect to minimise ohmic contact and interconnect via resistance [18]. the equation used to describe d.c. electrical conduction (equation 2) is analogous to that for thermal conduction (equation 3):
$J = -\sigma \, \partial V / \partial n$ (2)
where J = electrical current density, V = voltage, n = spatial coordinate in the direction of current flow and σ = material electrical conductivity;
$H = -k \, \partial T / \partial n$ (3)
where H = heat flux, T = temperature, n = spatial coordinate in the direction of heat flow and k = material thermal conductivity. equations 2 and 3 have the same form and can therefore be solved using the same finite element program. nastran is a finite element program developed by nasa for heat transfer analysis (and mechanical structural analysis). nathan et al. [19] reported on the use of this program for electrical analysis based on the analogy indicated by equations 2 and 3.
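the analogy between equations 2 and 3 can be illustrated with a one-dimensional sketch in which a single routine serves both problems. it assumes the usual gradient form (flux density equals minus conductivity times potential gradient) for a uniform series stack of layers; the function name potential_drop, the layer values and the treatment of an interface as an added resistance per unit area are assumptions made for illustration, and this is in no way a substitute for a finite element model.

```python
def potential_drop(flux, layers, interface_resistances=()):
    """potential drop across a 1-d series stack carrying a uniform flux density.

    layers: iterable of (thickness, conductivity) pairs
    interface_resistances: resistances per unit area of any interfaces
    electrical reading: flux in A/cm^2, conductivity in S/cm, interface
    terms are scr values in ohm.cm^2, result in volts.
    thermal reading: flux in W/cm^2, conductivity in W/(cm.K), interface
    terms in K.cm^2/W, result in kelvin.
    """
    bulk = sum(thickness / conductivity for thickness, conductivity in layers)
    return flux * (bulk + sum(interface_resistances))


# electrical use (hypothetical values): 1 um of semiconductor at 100 S/cm
# in series with an interface of scr = 1e-8 ohm.cm^2, carrying 1e4 A/cm^2
volts = potential_drop(1e4, [(1e-4, 100.0)], [1e-8])

# thermal use of the same routine (hypothetical values): 1 um layer with
# k = 1.5 W/(cm.K) carrying 1e3 W/cm^2, no interface term
kelvin = potential_drop(1e3, [(1e-4, 1.5)])

print(volts, kelvin)
```

only the units attached to the inputs change between the two calls, which is the property that allows a thermal finite element program to be used for electrical conduction.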
nastran has been used by the authors to design and model various ohmic contact test structures as well as interconnect vias [20], as shown in figure 1. figures 2(a) and (b) show an example of modelling a ckr test structure using nastran. in figure 2(b) the metal layer has been lifted up to show the equipotentials. the contact layers are typically separated by a thin oxide layer with the contact opening. vb is the value of the equipotential of the voltage tap of the (top) metal layer of the contact. the voltage measured on the tap (va) is used to determine the average voltage at the bottom of the contact interface. figure 3 shows an ideal ckr ohmic contact test structure. it can easily be appreciated that such a test structure is not possible to realise, and contact widths smaller than the current and voltage arms are required (stavitski et al. give an excellent report on using ckr test structures in [21]). this leads to parasitic error, which can be studied using software such as nastran. figure 4 shows a possible test structure for fast turnaround in ckr measurements using the technique described in [5]. again, nastran can readily be used to model such a test structure. the circular contacts used can be as small as possible as long as their diameters can be measured. this contrasts with square contacts, which will most likely have rounded corners (figure 5). extrapolation of scr' (scr plus parasitic resistance effect) for small contacts, where the effective scr' is determined for each d/w value using scr' = (va − vb) × area, gives the actual scr, as shown in [5]. again the use of circular contacts is more reliable, as contact area can be reliably determined using measured diameters. the series of ckr test structures demonstrated in figure 4 utilises the technique of the ckr and the accuracy of determining the area of circular contacts.
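the extrapolation to d/w = 0 can be sketched as follows. several things here are assumptions rather than the procedure of [5]: the effective scr' is computed from the kelvin voltage difference divided by the forced current (which reduces to the expression above for unit current), a straight-line least-squares fit is used for the extrapolation, and the three data points are invented solely to exercise the code.

```python
import math


def effective_scr(v_a, v_b, current, diameter_cm):
    """effective scr' (ohm.cm^2) from kelvin tap voltages and a measured diameter."""
    area_cm2 = math.pi * (diameter_cm / 2.0) ** 2
    return (v_a - v_b) / current * area_cm2


def extrapolate_to_zero(x_values, y_values):
    """intercept of the least-squares line y = a + b*x, i.e. y at x = 0."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    s_xx = sum((x - mean_x) ** 2 for x in x_values)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x


# invented illustration data: scr' at d/w = 0.1, 0.2 and 0.3
d_over_w = [0.1, 0.2, 0.3]
scr_prime = [1.2e-8, 1.4e-8, 1.6e-8]
print(f"extrapolated scr = {extrapolate_to_zero(d_over_w, scr_prime):.2e} ohm.cm^2")
```

if measured scr' values do not vary linearly with d/w, a different fitting function would be needed; the linear form is used here only because it is the simplest choice.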
the benefit of the series of ckr structures of figure 4 is that, as the contact becomes vanishingly small, the contact resistance dominates the ckr resistance measurement. the possible problems with tlm test structures can be demonstrated using nastran modelling. figure 6 shows the effect of vertical voltage drop in the semiconductor layer, which occurs when the semiconductor resistance (due to semiconductor resistivity and thickness) below the contact is comparable to that due to the contact interface (scr effect). investigation shows the relevance of the parameter η [17]. in fig. 6(a) there is the effect of both horizontal and vertical voltage drop, whereas in fig. 6(b) the semiconductor has only the horizontal resistance effect of sheet resistance and the tlm equations can be reliably applied. figure 7 shows a schematic with the inclusion of this tlm contact section in a test structure; the effect is to increase the value of rc determined. similar error contributions occur for the ckr test structure [10].

fig. 1 example of equipotential distribution in an interconnect via (for input current i) determined using the nastran finite element modeler. ρc1 is the specific contact resistance between metal1 and the via liner material [7].

fig. 2 (a) example of equipotentials in a cross kelvin resistor test structure for ohmic contact characterisation of a semiconductor layer (bottom layer) to a metal layer contact. (b) the distribution of equipotentials in the semiconductor layer's current input arm and the voltage (va) tap, shown more clearly. the metal layer is shown as having one equipotential (vb) for the scale used. (modelled using nastran.)

fig. 3 ideal ckr test structure, where the square contact area has the same width as the four arms.

[fig. 4 labels: input current i; tap voltages v1a, v1b, v2a, v2b, v3a, v3b; arm width w; contact diameter d; d/w = 0.1, 0.2, 0.3; vertical axis scr']

fig. 4 (a) schematic of a chain of ckr test structures with varying contact sizes to determine scr and allow quick electrical testing. v1a etc. are voltages measured on the respective ckr taps. (b) expected and observed trend for scr' determined for varying ckr contact geometry. the actual value of scr is obtained by extrapolating to d/w = 0. d is the contact diameter and w is the ckr arm width.

fig. 5 possible effects of fabrication steps in reducing the designed area of contacts of circular and square shapes.

fig. 6 examples of equipotential (volts) distributions for tlm models of metal to semiconductor contacts. in (a) the structure has η < 1 and in (b) it has η > 1, η being the parameter introduced by berger [17] to quantify the effect of semiconductor layer resistivity on tlm resistance measurements.

fig. 7 (a) schematic of a tlm test structure for determining contact resistance (rc) by measuring the resistance between two contacts. (b) the equipotential distribution where the vertical voltage drop is significant, as indicated by the curvature of the equipotentials. the tlm model does not include this contribution whereas measurements will; hence error results when the measured rc is used to determine the scr of the contact. (c) plan view of the tlm test structure. (i is input current, rsh is sheet resistance, l is distance between contacts, w is width of active layer.)

4. 2d circular specific contact resistance test structure
the circular transmission line model (ctlm) test structure can be demonstrated using nastran finite element modelling.
this test structure completely eliminates alignment error (as there is no alignment); error is mainly due to any inaccuracy in sheet resistance and, as with the tlm and ckr, to the finite resistivity of the semiconductor layer, which can cause a significant voltage drop in the semiconductor layer under the contact. yue et al. [4] reported a technique using the ctlm test structure shown in figure 8(a) and figure 9. the outer radius r1' is regarded as infinite in figure 9. here we will call this test structure the yue-2d. it consists of three electrode discs, and resistance measurements from these can relatively easily give the semiconductor sheet resistance and scr. the main error that can occur in the yue-2d will be due to the η factor [17]. figures 8(b)-(d) show images from examples of nastran finite element modelling of the yue-2d test structure. the perfect symmetry of each electrode means that only a small 'wedge' of each of the three electrodes (of fig. 9) needs to be modelled. the equipotentials shown in the semiconductor layer of figure 8(d) are similar to those in figure 6(b), where the vertical arrangement of the equipotentials indicates that there is little voltage drop in the vertical direction, and hence accurate determination of sheet resistance and scr should ensue. again, an advantage of the circular electrodes is that accurate contact geometry can be measured (the fabricated contacts will be circular) and the actual radii can be used in calculations to determine contact parameters.

fig. 8 (a) schematic of circular transmission line model (ctlm) test structure using two electrodes, for determining contact resistance (rc) and scr, (b) finite element mesh used to model a representative section of the ctlm, (c) nastran model result showing equipotential distribution for the two-electrode ctlm and (d) section of ctlm showing equipotentials in the semiconductor layer.
extremely small contacts can also be realised when the value of scr is small, and appropriate geometry for this is described in [4]. such a test structure with extremely small contacts will require more than one metal layer [22] in order to connect a probe to the electrode.

fig. 9 schematic of the yue-2d test structure for determining semiconductor layer sheet resistance and scr of the metal to semiconductor interface [4].

5. 3d circular specific contact resistance test structure
the scr of metal contacts to bulk semiconductor material is not usually reported, as the main interest for the semiconductor industry is in determining and reducing scr to shallow active layers. nevertheless, the authors consider the test structure shown in figure 10, hereafter called the yue-3d, to give the most reliable measurements [3]. unlike the test structures described previously in this paper, however, there is no analytical solution available relating resistance measurements and scr. solutions have to be obtained by computer modelling, with resistance measurements plotted as a function of varying semiconductor resistivity, scr and the two radii (see figure 11). unlike the yue-2d, the yue-3d only needs one resistance measurement (from one pair of electrodes). because of its accuracy, this test structure would be very suitable for studies of scr where a series of substrates with varying resistivity is available, and for investigating the effects of surface treatments on scr. the yue-3d can be used for investigating ohmic contacts to bulk semiconductors where the semiconductor has uniform resistivity to a depth of several times the inner radius (r1) of the outer electrode shown in figure 10(a). as in the yue-2d, the outer radius r1' can be infinite [2]. a scaling equation, similar to that reported by loh et al. [23] for ckr test structures, can be applied to this test structure:
$R_T(m r_0, m r_1, m r_2, m n \rho_b, m^2 n \rho_c) = n \, R_T(r_0, r_1, r_2, \rho_b, \rho_c)$ (2)
where m and n are scaling factors.

fig. 10 (a) schematic of the yue-3d test structure for determining scr of a metal to semiconductor contact interface for bulk semiconductor [3], (b) example of equipotential distribution in a section of the yue-3d test structure obtained from fem modelling using nastran.

fig. 11 example of fem (nastran) analysis results for total resistance rt between two electrodes (fig. 10) as a function of scr (ρc) with resistivity ρb varying from 0.001 Ω·cm to 0.01 Ω·cm. geometry is fixed; r0 = 3 μm, r1 = 5 μm, and r2 = 9 μm. note that this figure can be scaled using (3). [2]

6. conclusion
this paper has reviewed ohmic contact test structures investigated by the authors for ohmic contact characterisation between a metal and a semiconductor in both two-dimensional (2-d) and three-dimensional (3-d) circumstances. issues with regard to error correction, difficulty in analysing results and difficulty in fabrication led to the development of test structures with circular electrodes. these issues are (i) active layer definition, (ii) contact misalignment and overlap, (iii) the equipotential problem, (iv) complicated analytical expressions and (v) vertical voltage drop. when the semiconductor layer in a metal-to-semiconductor contact is neither truly 2-d nor truly 3-d, there will always be some error, and error correction is required. for the test structures presented here, accurate results can always be determined when the semiconductor layer can be regarded as truly 2-d or 3-d. in summary, all of the above issues with conventional test structures have been addressed and improved by the novel test structures (yue-2d and yue-3d) developed for ohmic contact characterisation in both 2-d and 3-d circumstances.
the corresponding methods for determining scr have also been presented and demonstrated using finite element modelling (fem). because ohmic contacts contribute only a resistance effect, a full semiconductor physics modelling program is not required. commercially available fem software for static thermal analysis, such as nastran, can be used for ohmic contact test structure investigation, considering the analogous equations for heat and electric current flow. the yue-2d set of three two-contact circular test structures does not require mesa isolation, and correction factors are unnecessary. furthermore, the analytical expressions are relatively simple compared to those of the conventional ctlm test structure. a 3d test structure (yue-3d) was demonstrated that should be the most accurate in determining specific contact resistance.

references
[1] hiep n. tran, tuan a. bui, aaron m. collins, and anthony s. holland, "consideration of the effect of barrier height on the variation of specific contact resistance with temperature", ieee trans. electron devices, vol. 64, no. 1, pp. 325, 2017.
[2] a. m. collins, y. pan, a. s. holland, "using a two-contact circular test structure to determine the specific contact resistivity of contacts to bulk semiconductors", facta universitatis, series electronics and energetics, vol. 28, no. 3, pp. 457-464, september 2015.
[3] y. pan, a. m. collins and a. s. holland, "determining specific contact resistivity to bulk semiconductor using a two-contact circular test structure", in proceedings of the ieee international conference on miel, may 2014, pp. 257-260.
[4] y. pan, g. k. reeves, p. w. leech, and a. s. holland, "analytical and finite-element modeling of a two-contact circular test structure for specific contact resistivity," ieee trans. electron devices, vol. 60, no. 3, pp. 1202-1207, mar. 2013.
[5] a. s. holland, g. k.
reeves, "new challenges to the modelling and electrical characterisation of ohmic contacts for ulsi devices", microelectronics reliability, vol. 40, pp. 965-971, 2000.
[6] g. k. reeves, "specific contact resistance using a circular transmission line model," solid state electron., vol. 23, no. 5, pp. 487-490, may 1980.
[7] w. shockley, "research and investigation of inverse epitaxial uhf power transistors", air force atomic laboratory, wright-patterson air force base, rep. no. al-tdr-64-207, sept. 1964.
[8] h. berger, "models for contacts to planar devices," solid state electronics, vol. 15, pp. 145-158, 1972.
[9] s. j. proctor and l. w. linholm, ieee electron device lett., edl-3 (10) 294 (1982).
[10] c. y. chang, y. k. fang, and s. m. sze, "specific contact resistance of metal-semiconductor barriers," solid-state electron., vol. 14, no. 7, pp. 541-550, jul. 1971.
[11] g. srinivasan, m. f. bain, s. bhattacharyya, p. baine, b. m. armstrong, h. s. gamble, d. w. mcneill, mat. sci. eng. b, 114-115, pp. 223-227, 2004.
[12] m. finetti, s. guerri, p. negrini, a. scorzoni, and i. suni, thin solid films, vol. 130, no. 37, 1985.
[13] majumdar et al., "stlm: a sidewall tlm structure for accurate extraction of ultralow specific contact resistivity", ieee trans. electron devices, vol. 34, no. 9, september 2013.
[14] miyoshi et al., "in-situ contact formation for ultra-low contact resistance nige using carrier activation enhancement (cae) techniques for ge cmos", in digest of technical papers symposium on vlsi technology, 2014.
[15] yu et al., "titanium silicide on si:p with precontact amorphization implantation treatment: contact resistivity approaching 1 × 10⁻⁹ ohm-cm²", ieee trans. electron devices, vol. 63, no. 12, september 2016.
[16] r. dormaier and s. e. mohney, "factors controlling the resistance of ohmic contacts to n-ingaas," j. vac. sci. technol. b, vol. 30, no. 3, pp. 031209-1-031209-10, may/jun. 2012.
[17] n.
stavitski, m. h. van dal, a. lauwers, c. vrancken, a. y. kovalgin, and r. m. wolters, "evaluation of transmission line model structures for silicide-to-silicon specific contact resistance extraction," ieee trans. electron devices, vol. 55, no. 5, pp. 1170-1176, may 2008.
[18] holland a. s. and reeves g. k., "new challenges to the modelling and electrical characterisation of ohmic contacts for ulsi devices", in proc. of the miel 2000 conference, vol. 2, pp. 461-464, nis, may 2000.
[19] m. nathan, s. purushothaman and r. dobrowolski, "geometrical effects in contact resistance measurements: finite element modelling and experimental results", j. appl. phys., vol. 53, no. 8, pp. 5776-5782, august 1982.
[20] anthony s. holland, geoffrey k. reeves, patrick w. leech, "finite element modelling of misalignment in interconnect vias", pp. 307-310, commad, brisbane 2004.
[21] n. stavitski, j. h. klootwijk, h. w. van zeijl, a. y. kovalgin, and r. a. m. wolters, "cross-bridge kelvin resistor structures for reliable measurement of low contact resistances and contact interface characterization," ieee trans. semicond. manuf., vol. 22, no. 1, pp. 146-152, feb. 2009.
[22] phd thesis, "versatile circular test structure for ohmic contact characterisation", dr pan yue, rmit university 2015.
[23] w. m. loh, s. e. swirhun, t. a. schreyer, r. m. swanson, and k. c. saraswat, "analysis and scaling of kelvin resistors for extraction of specific contact resistivity," ieee electron device lett., vol. edl-6, pp. 105-108, mar. 1985.

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. i i
editorial
as emphasized in the editorial for the second in the series of the anniversary issues, we will strive to attract the best submissions and publish the best papers from a very broad geographic area, thus making facta universitatis: series electronics and energetics a truly international journal.
we will also insist that all published papers are of high quality and practical value, thus leading to their worldwide citation, i.e. to the journal's placement on the sci list. whilst insisting that all published papers are of high quality and practical value, we wish to avoid creating a situation where the journal publishes by quantity rather than quality, and that is the reason why we have already started rigorous refereeing of all submitted papers. our new policy regarding publication of practical papers in facta universitatis: series electronics and energetics deserves to be elucidated now in more detail. we want to publish more practical papers, as badly as the readers want to see them, but they are hard to come by. it should be emphasized here that the acceptance rate for practical papers is considerably higher than that for theoretical ones, since we want to encourage the submission of practical papers. the main reason why you see so many theoretical papers and so few practical papers is that people from an academic environment get paid to produce hardware rather than to write papers about it, and both of them do their jobs reasonably well. the next time you complain about how only a few practical papers appear in this, or any similar journal in the field, please ask yourself the following question: "when was the last time that i, or someone from this division of my organization, submitted a practical paper to this journal?" if you have an idea for a practical paper, do not hesitate to contact me, and i will be pleased to discuss it with you. this, the third in the series of the anniversary issues, is a collection of 9 invited papers by well-known experts in the specific areas, most of them being members of our editorial team, who present and discuss state-of-the-art issues of practical interest in the field.
as a new editor-in-chief, i, along with our editorial team, promise to continue to develop and improve facta universitatis: series electronics and energetics in order to keep it at the forefront of science and technology.
ninoslav stojadinović
editor-in-chief

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 421-435
https://doi.org/10.2298/fuee2203421m
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
control of series impedance of power lines using power flow controller
aleksandar aco marković (1,2), slobodan vukosavić (2,3)
(1) university of banja luka, faculty of electrical engineering, banja luka, republic of srpska, bosnia and herzegovina
(2) university of belgrade, faculty of electrical engineering, belgrade, serbia
(3) serbian academy of sciences and arts, belgrade, serbia
abstract. in this paper, the ability of a unified power flow controller (upfc) to modulate both the series resistance r and the series reactance x of an overhead power line is discussed. the classical power flow control system of the upfc is modified so that the standard input reference signals (active and reactive powers) are replaced by reference signals for series resistance and reactance. using the procedure described in this work, the reference signals for active and reactive powers are generated indirectly. the operation of the upfc in the proposed operating mode is analyzed using computer simulation, based on a single machine infinite bus (smib) model with constant impedance loads and two parallel lines. the goal is to show that the upfc is capable of controlling both series line parameters (r and x) directly and independently by means of a simple control system without additional decoupling controllers. an additional task is to show that power flows can be indirectly controlled this way. the step response of series line resistance and reactance is used to validate the operation of the proposed control system.
the obtained results clearly show that all goals are fulfilled.
key words: unified power flow controller, impedance regulation, power system, power flows
received february 16, 2022; revised april 4, 2022; accepted may 25, 2022
corresponding author: aleksandar aco marković, university of banja luka, faculty of electrical engineering, 5 patre, banja luka, 78 000 banja luka, republic of srpska, bosnia and herzegovina, e-mail: aleksandar-aco.markovic@etf.unibl.org
1. introduction
with the introduction of variable sources and electronically controlled loads in ac grids, the integrity of the grid is challenged by reduced system inertia, limited support for transients from power electronics devices, and quite new and different static and dynamic properties of the sources and loads that interface with the ac grid through grid-side inverters. at the same time, the electric power required to run the internet and digitalization is rising steadily, while the decarbonization of transport by means of electrification requires a further increase in electric energy demand. these growing trends have a great impact on the power system, which must respond to increasingly complex requirements. some of these requirements are the response time of the system, the stability margin and the quality of electrical energy delivered to the consumer. to fulfil all of these requirements, flexible alternating current transmission system (facts) devices are introduced into the system. the most complex and the most substantial of all facts devices is the unified power flow controller (upfc). the main reason for the introduction of the upfc into the system is the need for independent control of active and reactive power flows in power systems [1]. currently, upfcs are mostly used in two different operating modes: either voltage and power flow control mode or active power oscillation reduction mode. many algorithms have been proposed for both operating modes.
control algorithms for voltage and power flow control are often based on proportional-integral (pi) controllers. the simplest control system is described in the dq reference frame and it generates the desired upfc voltage reference out of the acquired feedback signals [2]. the feedback signals are usually line currents as well as active and reactive powers. there are several similar control systems that differ only slightly in their parameter settings, chosen to achieve better performance or faster response [3-5]. however, some authors prefer using several feedback signals (up to three) to achieve better performance and faster stabilization [6],[7]. this way, several pi controllers are connected in cascade, which reduces the phase margin and has a negative impact on stability and robustness. these problems are not discussed in the literature. besides pi controllers, some novel approaches are discussed too, such as fuzzy controllers and neural networks. these types of controllers are suitable for nonlinear systems such as the power system. fuzzy controllers are used in a hybrid version, where only the p action of the pi controller is fuzzy-based and everything else is classical pi control [8]. a more complex approach uses a complete pi controller based on fuzzy logic [9-11]. it is noted that fuzzy based control schemes provide faster response, at the cost of a rather complex and involved selection of suitable membership function types and domains, mostly performed on a trial-and-error basis rather than using an exact mathematical procedure which would lead to predictable results. additionally, these algorithms can be computationally intensive. neural networks are also an option for upfc control. usually, simple algorithms based on radial basis neural networks using a single neuron in the hidden layer with a gaussian activation function are used [8]. there is also a hybrid version of the controller which uses classical pi control combined with a neural network.
the neural network, based on error back propagation, uses the deviation of the variables of interest to generate an output which is summed with the outputs of the pi controllers [12]. the latest approach is to use neural network based controllers to generate auxiliary signals for active power oscillation reduction [13]. this way, faster response and attenuation of active power oscillations can be achieved. however, these algorithms also have several shortcomings. their main disadvantage is that their stability cannot be mathematically proven [14] and they are rather complex for practical implementation. it can be seen that almost all algorithms in the relevant literature use active and reactive power and nominal bus voltage as reference signals. some modifications of these algorithms use d and q axis currents which are calculated using active and reactive power references. there are some attempts to use the upfc for reactance control [6], [15], [16]. however, in these works the upfc is used only as a shunt device, so it is not capable of controlling resistance in this operation mode. sometimes, the term “impedance control” is used to describe reactance control, as explained in [17]. the authors did not find any relevant literature dealing with the use of the upfc for independent control of series line resistance and reactance. in this paper, a control solution is proposed where series resistance r and series reactance x are treated as reference signals, while the upfc performs complete emulation of the series impedance. this means that the upfc generates the appropriate voltages to maintain series resistance and reactance at the desired levels, thus exploiting the whole potential of the upfc hardware.

2. topology and mathematical model of upfc

in this section the topology of a standard upfc system is discussed.
additionally, a mathematical model of the upfc suitable for series power line impedance modulation is derived based on the classical power flow upfc model. all mathematical equations are derived directly from the proposed equivalent schemes.

2.1. upfc topology

the topology of a upfc device is shown in fig. 1.

fig. 1 topology of upfc

in fig. 1, the upfc is connected to bus k, and it can control the power flow between buses k and k+1, along the power line with impedance $Z_T$. the device consists of two power converters (pc1 and pc2), whose operation is based on power electronics switching devices. the first upfc, installed in the usa, used gate turn-off (gto) thyristors as switching devices, which operated at grid frequency. the latest upfcs, installed in china, use insulated gate bipolar transistors (igbt) as switching devices, combined in modular multilevel converters (mmc), and they operate at frequencies near 1[khz] [18],[19]. in the upfc topology, two power transformers are obligatory (tr1 and tr2, fig. 1). the shunt transformer (tr1) is a classical power transformer. the series transformer (tr2) has a much more complicated construction since it has to withstand the line current and sometimes even short circuit currents for a small fraction of time. auxiliary transformers (atr1 and atr2, fig. 1) are not always necessary. they are usually used when gtos are used in the power converters, to create the appropriate phase shift. the series transformer (tr2) and series converter (pc2) form the series part of the upfc, which is called the static synchronous series compensator (sssc). the shunt transformer (tr1) and shunt converter (pc1) together form the shunt part of the upfc, which is called the static compensator (statcom). these two devices can operate separately from each other. however, when the dc switch (dcs) is closed, the shunt and series parts share the same dc link and that configuration is called upfc.
in this configuration, it is possible to achieve more complex control tasks than when using the statcom and sssc independently.

2.2. upfc mathematical model

to describe the upfc more precisely, the equivalent scheme shown in fig. 2a can be observed.

fig. 2 a - upfc equivalent scheme, b - phasor diagram

variables $U_k$ and $U_{k+1}$ represent the complex voltages on busbars k and k+1, respectively. complex voltage $U_{se}$ denotes the series voltage inserted into the power line through the series power transformer. complex voltage $U_{sh}$ is generated using the shunt transformer. the modified voltage phasor on the sending end $U'_k$ represents the vector sum of voltages $U_k$ and $U_{se}$. impedance $Z_{sh}$ describes the shunt impedance of the upfc, while $Z_T$ is the series impedance of the transmission line. line current $I$ flows through the series transformer and current $I_{sh}$ flows through the shunt part of the upfc, supplying the dc link with the appropriate energy. in order to see how the upfc generated voltages $U_{se}$ and $U_{sh}$ influence power system operation, the apparent power on the sending end can be observed (1).

$$S_k = U'_k I^* = U'_k \left( \frac{U'_k - U_{k+1}}{Z_T} \right)^* \qquad (1)$$

according to the phasor diagram (fig. 2b), voltages can be expressed using their effective values and phases (2).

$$U_k = U_k e^{j\delta_k}, \quad U_{k+1} = U_{k+1} e^{j\delta_{k+1}}, \quad U_{se} = U_{se} e^{j\delta_{se}} \qquad (2)$$

substituting (2) into (1), and using the previously explained condition $U'_k = U_k e^{j\delta_k} + U_{se} e^{j\delta_{se}}$, equation (1) becomes (3).

$$S_k = \frac{U_k'^2}{R_T - jX_T} - \frac{U_k U_{k+1} e^{j(\delta_k - \delta_{k+1})} + U_{k+1} U_{se} e^{j(\delta_{se} - \delta_{k+1})}}{R_T - jX_T} \qquad (3)$$

the real and imaginary parts of (3) are given by (4) and (5), respectively. for simplicity, the resistance $R_T$ is neglected because the ratio r/x for high voltage power lines is approximately 1/11 for 400[kv] power lines. further, the appropriate phase differences are expressed as $\delta = \delta_k - \delta_{k+1}$, $\delta' = \delta_{se} - \delta_{k+1}$.
$$P = \frac{U_k U_{k+1}}{X_T} \sin(\delta) + \frac{U_{se} U_{k+1}}{X_T} \sin(\delta') = f(U_{se}, \delta_{se}) \qquad (4)$$

$$Q = \frac{U_k'^2}{X_T} - \frac{U_k U_{k+1}}{X_T} \cos(\delta) - \frac{U_{se} U_{k+1}}{X_T} \cos(\delta') = f(U_{se}, \delta_{se}) \qquad (5)$$

equations (4) and (5) represent the active and reactive powers on the sending end, respectively. it can be noted that these equations are functions of the effective value of the series voltage $U_{se}$ and its phase $\delta_{se}$. active power p can be dominantly controlled by generating the appropriate phase $\delta_{se}$, while reactive power q is controlled by generating an adequate series voltage amplitude. the importance of the upfc lies in the fact that the effective value of the series voltage $U_{se}$ can be changed from zero to its maximal value $U_{se,m}$ and the series voltage phase $\delta_{se}$ can be changed from 0 to 2π. this is possible only because the two power converters share the same dc link. in the power control mode of operation, the shunt part of the upfc is used for delivering energy to the series part. the active power exchanged between the two converters is denoted as $P_{ex}$. it should be pointed out that reactive power cannot be transferred through the dc link, so each converter has to generate or absorb reactive power locally. the shunt part is also used for keeping the voltage amplitude of bus k at the desired level, which is done by absorbing or injecting reactive power. additionally, this part of the upfc is used for controlling the dc link voltage by controlling the exchanged active power $P_{ex}$. the apparent power generated or absorbed by the shunt part $S_{sh}$ can be expressed by (6).

$$S_{sh} = U_k I_{sh}^* = U_k \left( \frac{U_k - U_{sh}}{Z_{sh}} \right)^* \qquad (6)$$

the real part of (6) represents the shunt active power $P_{sh}$ and the imaginary part is the shunt reactive power $Q_{sh}$. the model of the dc link can be described by (7).

$$P_{ex} = P_{sh} - P_{se} = i_C u_{DC} = u_{DC} C \frac{du_{DC}}{dt} \qquad (7)$$

in (7), $P_{se}$ represents the active power generated by the series part of the upfc, $i_C$ is the current flowing through the dc link capacitor, $u_{DC}$ is the dc link voltage and $C$ is the capacitance of the capacitor. traditionally, control of the upfc is done by generating appropriate series $U_{se}$ and shunt $U_{sh}$ voltages.
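as a quick numerical check of (4) and (5), the following python sketch evaluates the sending end power both from the complex expression (1), with the line resistance set to zero, and from the closed forms (4) and (5). the function names and the operating point (voltage magnitudes, angles and line reactance) are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math


def power_from_phasors(u_k, u_k1, u_se, z_t):
    """Sending-end apparent power from equation (1)."""
    u_k_mod = u_k + u_se                 # modified sending-end phasor U'_k
    i_line = (u_k_mod - u_k1) / z_t      # line current phasor I
    return u_k_mod * i_line.conjugate()


def power_lossless(uk, dk, uk1, dk1, use, dse, xt):
    """Active and reactive power from equations (4) and (5), R_T neglected."""
    delta = dk - dk1                     # delta  = delta_k  - delta_{k+1}
    delta_p = dse - dk1                  # delta' = delta_se - delta_{k+1}
    uk_mod = abs(cmath.rect(uk, dk) + cmath.rect(use, dse))  # |U'_k|
    p = (uk * uk1 * math.sin(delta) + use * uk1 * math.sin(delta_p)) / xt
    q = (uk_mod ** 2 - uk * uk1 * math.cos(delta)
         - use * uk1 * math.cos(delta_p)) / xt
    return p, q


# Hypothetical operating point in per unit (illustrative values only).
UK, DK = 1.0, math.radians(10.0)
UK1, DK1 = 1.0, 0.0
USE, DSE = 0.05, math.radians(60.0)
XT = 0.2

s_exact = power_from_phasors(cmath.rect(UK, DK), cmath.rect(UK1, DK1),
                             cmath.rect(USE, DSE), complex(0.0, XT))
p_closed, q_closed = power_lossless(UK, DK, UK1, DK1, USE, DSE, XT)

print(f"P from (1): {s_exact.real:.6f}, P from (4): {p_closed:.6f}")
print(f"Q from (1): {s_exact.imag:.6f}, Q from (5): {q_closed:.6f}")
```

for a lossless line the two results should agree to rounding error. repeating the check with a nonzero resistance in `z_t` gives an indication of how much the simplification costs at a given operating point.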
the series and shunt voltages are generated by the control system (fig. 1), whose goal is to regulate the active and reactive powers as well as the nominal voltage on the k-th busbar.

3. proposed control scheme

the main idea of the control system is to use the desired values of line resistance and reactance as reference signals. these signals are then used to calculate the appropriate references for active and reactive powers. to accomplish this, the control system of the series part of the upfc should be modified, while the control system of the shunt part can be kept the same as in the standard upfc control systems used in the power flow control mode of operation.

3.1. upfc series part control scheme

unlike the previously described classical control schemes, the upfc can also be used in an impedance control operation mode. to formulate the control law, the equivalent scheme shown in fig. 3 can be observed.

fig. 3 upfc equivalent scheme for impedance control operation mode

in this case, the series part of the converter can be regarded as a variable impedance $Z$, unlike the classical study where the series part is represented by a voltage source (fig. 2a). line current $I$ should remain the same, independently of the equivalent scheme (fig. 2a or fig. 3). the line current from fig. 2a can be expressed by (8).

$$I = \frac{U_k + U_{se} - U_{k+1}}{Z_T} \qquad (8)$$

in this case, the voltage vector $U_{se}$ can be varied, while $Z_T$ is constant. the line current calculated using the equivalent scheme from fig. 3 is given by (9).

$$I = \frac{U_k - U_{k+1}}{Z_e} \qquad (9)$$

in (9), $Z_e$ is the equivalent line impedance, expressed as the sum of the variable part of the impedance $Z$ and the fixed impedance $Z_T$. the currents expressed by (8) and (9) should be equal. indeed, (9) gives $U_k - U_{k+1} = I (Z + Z_T)$, and substituting this into (8) yields $I Z_T = I (Z + Z_T) + U_{se}$. from this equality, the expression for the variable part of the impedance can be easily obtained (10).

$$Z = -\frac{U_{se}}{I} \qquad (10)$$

the variable impedance $Z$ is expressed using the series injection voltage $U_{se}$ and the line current $I$, which can be measured in a real power system. the apparent power on the power line, according to fig.
3, is expressed by (11).

$$S_{k,ref} = U_k \left( \frac{U_k - U_{k+1}}{Z_{e,ref}} \right)^* = P_{ref} + jQ_{ref} \qquad (11)$$

equation (11) shows that the reference values for active and reactive power, $P_{ref}$ and $Q_{ref}$, respectively, can be set indirectly by assigning reference values for the equivalent impedance $Z_e$. the calculated power references $P_{ref}$ and $Q_{ref}$ are compared with the measured active and reactive powers given by (4) and (5). the active power error signal is the input of a pi controller (pi1, fig. 4a), whose output is the imaginary part $U_{seq}$ of the complex voltage vector $U_{se}$. the reactive power error signal feeds another pi controller (pi2, fig. 4a), whose output represents the real part $U_{sed}$ of the complex voltage vector $U_{se}$. the control scheme of the series part of the upfc is shown in fig. 4a.

fig. 4 a. upfc series part control scheme, b. upfc shunt part control scheme

3.2. upfc shunt part control scheme

for the proposed control scheme, based on impedance control, the shunt part can be controlled classically. that means that the shunt part complex voltage is generated using two pi controllers. the complete control scheme of the shunt part of the upfc is shown in fig. 4b. the first pi regulator (pi3, fig. 4b) is used to generate the real part $U_{shd}$ of the complex voltage $U_{sh}$. this regulator is fed by the error signal generated as the difference between the reference dc link voltage $u_{DC,ref}$ and the measured dc link voltage $u_{DC}$, which is obtained using (7). the imaginary part $U_{shq}$ of the complex voltage vector $U_{sh}$ is generated using a pi controller (pi4, fig. 4b), whose input signal is the difference between the reference (usually nominal) voltage on bus k, $U_{k,ref}$, and the measured voltage $U_k$. the controllers used in the control schemes (fig. 4) are discrete pi controllers in positional form with an anti-windup mechanism (fig. 5).

fig. 5 discrete type pi controller with anti-windup mechanism

in fig. 5, signals f and y represent the input and output signals, respectively.
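a discrete pi controller in positional form with anti-windup, of the kind used for pi1 to pi4, can be sketched in python as follows. the internal structure of fig. 5 is not reproduced here: the conditional-integration (clamping) anti-windup, the rectangular integration with step ki·t, the output limits and the sampling time are assumptions made for illustration, and the class name is hypothetical.

```python
class DiscretePI:
    """Positional-form discrete PI controller with clamping anti-windup."""

    def __init__(self, kp, ki, ts, y_min, y_max):
        self.kp = kp          # proportional gain
        self.ki = ki          # integral gain (assumed continuous-time gain)
        self.ts = ts          # sampling time in seconds
        self.y_min = y_min    # lower output limit
        self.y_max = y_max    # upper output limit
        self.integral = 0.0   # integrator state

    def step(self, error):
        """Return the limited output for one sample of the error signal."""
        candidate = self.integral + self.ki * self.ts * error
        y_unsat = self.kp * error + candidate
        y = min(max(y_unsat, self.y_min), self.y_max)
        saturated_high = y_unsat > self.y_max
        saturated_low = y_unsat < self.y_min
        # Accept the integrator update unless the output is saturated and the
        # error would push it further into saturation.
        if not (saturated_high and error > 0) and not (saturated_low and error < 0):
            self.integral = candidate
        return y


# Gains of PI1 from appendix A; sampling time and limits are assumed values.
pi1 = DiscretePI(kp=0.1, ki=1.0, ts=1e-3, y_min=-0.1, y_max=0.1)
outputs = [pi1.step(0.5) for _ in range(1000)]   # sustained positive error
print(f"output after saturation: {outputs[-1]:.3f}")
print(f"integrator state: {pi1.integral:.4f}")
```

with this clamping rule the integrator stops growing once the output reaches its limit, so the controller leaves saturation as soon as the error changes sign. a back-calculation scheme would be an equally plausible reading of an anti-windup block and would need an additional gain.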
in the pi controllers of fig. 5, parameters kp and ki are the proportional and integral gains, respectively, while parameter kc is calculated as the ratio ki/kp. the sampling time is denoted as t. all control parameters are given in appendix a.

4. test system model

the operation of the upfc in impedance control mode is tested by means of computer simulation on a simple power system, shown in fig. 6. the system is a classical single-machine infinite bus system with parallel lines. this type of system is widely used for demonstrating upfc performance in power regulation and active power oscillation suppression [20-23].

fig. 6 test system model

the model of the test power system (fig. 6) consists of four buses. buses 1 and 4 are generator buses, where bus 1 is the slack bus. buses 1 and 2 are connected by means of power lines having impedances zt1 and zt4, respectively. buses 2 and 3 are connected by means of parallel lines with impedances zt2 and zt3. constant impedance loads are connected to buses 2, 3 and 4, and their impedances are denoted as zl1, zl2 and zl3, respectively. the unified power flow controller is connected to bus 2, in series with the power line whose impedance is zt2. thus, the upfc will be used for controlling the impedance of this power line and simultaneously for controlling the bus 2 voltage amplitude. power generator g1 (fig. 6) is the slack generator, so it is modeled as a constant voltage source with nominal voltage. a detailed model of generator g2 is given in [24]. it consists of models of the electrical and mechanical subsystems suitable for observing transient and steady state periods. the excitation system of this generator is modeled as a standard ieee type 1 excitation system. the system frequency controller is an integral type controller, while the turbine controller is modeled as a widely used first order system with droop characteristics. power lines are described by their series impedances, where the shunt parts of the power lines are neglected.
all loads are modeled as constant impedance loads. the parameters of the test system model are given in appendix b and they are expressed in the per unit system with respect to a base power of 100[mva] and a base voltage of 220[kv].

5. simulation results

in order to explore the ability of the upfc to control the series line impedance, a computer simulation is created in matlab simulink. the simulation is prepared according to the test system model (fig. 6) and the upfc mathematical model described in section 3. the simulation is divided into nine time segments (t1 - t9), each lasting 5[s]. the first time interval t1 starts at the time t1 = 10[s] and lasts until the time t2 = 15[s], and the last one, t9, starts at the time t9 = 50[s] and lasts until the end of the simulation, which is 55[s]. all time intervals are shown in fig. 7. the simulation results are observed from the time t1 = 10[s] in order to get clearer results and to skip the initial transient period. the aim of this simulation is to show the ability of the upfc to independently regulate line resistance and reactance. in order to investigate the great majority of possible outcomes, different references of x and r are generated in every time interval. these are represented by step changes. the step responses of the measured resistance (black) and reactance (blue) of line 2 are given in fig. 7. dashed traces in fig. 7 represent the nominal line parameters, when no compensation is done, that is re,ref=rt2=0.03[p.u] and xe,ref=xt2=0.2[p.u].

fig. 7 step change of equivalent line impedance

the goal is to generate higher and lower values of resistance and reactance compared to the uncompensated line parameters, to investigate whether the upfc is capable of independently compensating both line parameters. the step responses of x and r presented in fig. 7 show that the measured equivalent resistance and reactance follow the reference signals without steady state error.
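the per unit bases and the timing of the simulation segments described above can be turned into concrete numbers with a short python sketch. it assumes that the 220[kv] base is a line-to-line voltage and the 100[mva] base is three-phase power, so that the base impedance is the square of the base voltage divided by the base power; the function names are hypothetical.

```python
S_BASE_MVA = 100.0     # base power from the text
V_BASE_KV = 220.0      # base voltage from the text (assumed line-to-line)
T_START_S = 10.0       # start of the first observed segment
T_SEGMENT_S = 5.0      # duration of each segment
N_SEGMENTS = 9


def base_impedance_ohm(v_base_kv, s_base_mva):
    """Base impedance in ohms: kV squared over MVA."""
    return v_base_kv ** 2 / s_base_mva


def segment_bounds(index):
    """Start and end time in seconds of segment T_index (1-based)."""
    if not 1 <= index <= N_SEGMENTS:
        raise ValueError("segment index must be between 1 and 9")
    start = T_START_S + (index - 1) * T_SEGMENT_S
    return start, start + T_SEGMENT_S


z_base = base_impedance_ohm(V_BASE_KV, S_BASE_MVA)
z_t2_pu = complex(0.03, 0.2)          # uncompensated line 2 impedance in p.u.
z_t2_ohm = z_t2_pu * z_base

print(f"base impedance: {z_base:.1f} ohm")
print(f"line 2 impedance: {z_t2_ohm.real:.2f} + j{z_t2_ohm.imag:.2f} ohm")
for k in (1, 9):
    print(f"T{k}: {segment_bounds(k)}")
```

under these assumptions the base impedance is 484 ohm, so the nominal line 2 impedance of 0.03+j0.2[p.u] corresponds to 14.52+j96.8 ohm, and the ninth segment runs from 50[s] to 55[s].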
the step responses in fig. 7 are almost aperiodic. when the reference of one of the parameters (x or r) is changed while the other parameter is kept constant, an undershoot or overshoot occurs in the response of the parameter which is kept constant. this can be observed in the transition from time period t3 to t4, when r=0.05[p.u] is kept constant and greater than nominal (uncompensated) and x is changed to 0.15[p.u], which is lower than nominal. in this case a disturbance in the measured resistance occurs, in the form of an overshoot. however, this disturbance is evidently negligible, and it happens due to the so-called coupling between active and reactive powers. similar disturbances can be seen in the transition from time period t2 to t3, when an overshoot occurs in the time response of the measured reactance, and in the transition from time period t6 to t7. the summarized results of the simulation for fig. 7 are given in table 1. in table 1, a brief description of time periods t1 to t9 is given using symbols describing the direction of change of x and r relative to the previous time period: “-” means no change in x or r, “↘” lower x or r and “↗” higher x or r.

table 1 summarized results of step responses of equivalent line x and r
columns: time periods t1 to t9, each with one entry for x and one for r
dir. rows: − (8 entries), ↘ (5 entries), ↗ (5 entries)
step rows: ap.1 (12 entries), over.2 (3 entries), under.3 (2 entries)
1aperiodic 2overshoot 3undershoot

the step responses of equivalent x and r are briefly described by the type of step response, which can be aperiodic, with overshoot or with undershoot. the results in table 1 also show that the time response is mostly aperiodic. time responses of the variable resistance (blue) and reactance (red) are shown in fig. 8. when no compensation is done (time periods t1 and t8), the variable r and x are zero, which is in accordance with the theoretical discussion.
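the relation between the equivalent impedance reference, the variable impedance and the injected series voltage in (8) to (10) can be illustrated with a small python sketch. the line 2 impedance and the reference r=0.05[p.u], x=0.15[p.u] are taken from the text; the bus voltage phasors and the function names are illustrative assumptions.

```python
import cmath
import math

Z_T2 = complex(0.03, 0.2)        # uncompensated line 2 impedance, p.u.
Z_E_REF = complex(0.05, 0.15)    # equivalent impedance reference, p.u.


def variable_impedance(z_e_ref, z_t):
    """Variable part Z of the equivalent impedance Z_e = Z + Z_T."""
    return z_e_ref - z_t


def required_series_voltage(z_var, i_line):
    """Series voltage that emulates z_var, from equation (10): Z = -U_se / I."""
    return -z_var * i_line


# Hypothetical bus voltage phasors in per unit (not taken from the paper).
u_k = cmath.rect(1.0, math.radians(5.0))
u_k1 = cmath.rect(1.0, 0.0)

z_var = variable_impedance(Z_E_REF, Z_T2)
i_impedance_model = (u_k - u_k1) / Z_E_REF                 # equation (9)
u_se = required_series_voltage(z_var, i_impedance_model)
i_voltage_model = (u_k + u_se - u_k1) / Z_T2               # equation (8)
s_ref = u_k * i_impedance_model.conjugate()                # equation (11)

mismatch = abs(i_voltage_model - i_impedance_model)
print(f"variable impedance Z = {z_var:.3f} p.u.")
print(f"|U_se| = {abs(u_se):.4f} p.u.")
print(f"current mismatch between (8) and (9): {mismatch:.2e}")
print(f"P_ref = {s_ref.real:.4f} p.u., Q_ref = {s_ref.imag:.4f} p.u.")
```

for this reference the variable part is 0.02-j0.05[p.u], and the voltage source model (8) and the impedance model (9) give the same line current. a resistance reference below the 0.03[p.u] of the line itself gives a negative real part of z, which the series converter can only emulate by exchanging active power through the dc link.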
the step responses in fig. 8 have the same shape as the step responses of equivalent x and r (fig. 7), since the equivalent values are the sum of these variable signals and the constant, uncompensated values of x and r. it is interesting to notice that the variable x and r generated by the upfc can be both positive and negative. especially interesting is the possibility of generating negative resistance.

fig. 8 step change of variable reactance and resistance

the change of line resistance and reactance influences the active (red trace) and reactive (blue trace) powers in line 2, shown in fig. 9. dashed traces in fig. 9 represent the active and reactive powers when no compensation is done. the step responses of active and reactive powers are almost aperiodic. an overshoot in the active power step response occurs in the transition from time period t3 to t4 (1.2%) and in the transition from time period t5 to t6 (2.9%). however, these overshoots are under 5%, which is considered acceptable. it is important to notice that no oscillations are present in the active power response. comparing the results in fig. 7 with the results in fig. 9, it can be concluded that the step change in line reactance has the greatest impact on power changes, which is in accordance with the theoretical discussion.

fig. 9 active and reactive power change

to investigate the step response of the equivalent line resistance (black trace) in more detail, fig. 10 can be observed. the traces shown in fig. 10 are the same as the traces from fig. 7, only enlarged.

fig. 10 the step response of equivalent line resistance

the step responses of the equivalent resistance are mostly aperiodic, as previously stated. the step response is quite fast and it reaches the steady state within 4[s]. the enlarged parts in fig. 10 show the exact time responses of the equivalent resistance in the transition from time period t6 to t7, when an overshoot of 4% occurs, and in the transition from time period t8 to t9, when an overshoot of 3% occurs. these are acceptable values. however, greater disturbances evidently occur in the transitions from t3 to t4, t4 to t5 and t6 to t7. these disturbances can be lowered by designing an appropriate decoupling controller. further, the time responses of the upfc and bus 2 voltages are observed (fig. 11). the main purpose of the upfc is to insert a series voltage into the line in order to generate the reference equivalent resistance and reactance. the step responses of the d (red trace) and q (blue trace) components of the upfc series voltage are shown in fig. 11a. when no compensation is done (time periods t1 and t8), the series voltage is equal to zero, which means that the series part of the upfc is inactive. in the other time periods the series voltage changes in the appropriate manner to meet the regulation goals. the time responses are aperiodic, with a very fast response and a time constant below 1[s].

fig. 11 a. upfc series voltage, b. upfc shunt voltage, c. bus 2 voltage amplitude, d. dc link voltage

the amplitude of the series injected voltage is within the range of 0.1[p.u], which is the typical maximal value of the inserted series voltage in practical implementations [18]. the main task of the upfc shunt part is to keep the bus 2 and dc link voltages at the nominal level. fig. 11c and fig. 11d show that this task is successfully accomplished, since the observed voltages are kept constant during all time periods and no disturbances are noted. the reason for this is the upfc shunt voltage, whose d (red trace) and q (blue trace) components are shown in fig. 11b, and which is also kept constant during all time periods thanks to the shunt part control system.

6.
conclusion

the paper discusses the possibility of supporting the integrity of ac grids by introducing a unified power flow controller (upfc) which, enabled by the proposed controller, is capable of modulating both the series resistance r and the series reactance x of an overhead power line. the proposed controller is simple to set and straightforward to use. the proposed operation mode of the upfc is tested on a single-machine infinite bus system consisting of four buses with a detailed generator model. the results show that the upfc is very efficient in compensating the equivalent line resistance and reactance. the step responses are aperiodic with zero steady state error and small settling time. decoupling controllers are not required, as the disturbances that take place during step changes of the reference signals are quite insignificant. this way, active and reactive powers on the line are controlled indirectly, by changing the line impedance. no oscillations in the active power step response are noted. the described capability of the upfc has the potential of being used for attenuation of power angle deviations and power oscillations in large scale power systems experiencing significant power disturbances. however, this possibility is yet to be proven.

7. appendix a

parameters of the four pi regulators used, numbered as in fig. 4, are: kp1=0.1, ki1=1, kc1=10; kp2=0.1, ki2=1.4, kc2=14; kp3=2, ki3=10, kc3=5; kp4=5, ki4=10, kc4=2.

8. appendix b

parameters of the test power system are as follows:
▪ generator g2: xd=1.2[p.u], x'd=0.3[p.u], xq=1[p.u], t'd0=5[s], h=6[s], k=0.02;
▪ generator's g2 voltage regulator: ka=20, ta=0.2[s];
▪ generator's g2 turbine: tch=0.4[s];
▪ turbine's regulator: tsv=0.2[s];
▪ system frequency regulator: tf=1[s];
▪ power lines: zt1=0.01+j0.1[p.u], zt2=0.03+j0.2[p.u], zt3=0.03+j0.4[p.u], zt4=0.01+j0.2[p.u];
▪ loads: zl1=2+j1[p.u], zl2=0.8+j0.6[p.u], zl3=0.8+j0.6[p.u];
▪ upfc parameters: zsh=0.001+j0.08[p.u], c=0.5[p.u].

references

[1] l.
gyugyi, "unified power flow control concept for flexible ac transmission systems", ieeе proceedings, vol. 139, no. 4, pp. 323–331, july 1992. [2] s. d. round, q. yu, l. e. norum and t. m. undeland, "performance of a unified power flow controller using a d-q control system", in proceedings of the sixth international conference on ac and dc power transmission, london, 1996, pp. 357–362 [3] k. r. padiyar and a. m. kulkarni, "control design and simulation of unified power flow controller", ieee trans. power deliv., vol. 13, no. 4, pp. 1348–1354, oct. 1998. [4] i. papic, p. zunko, d. povh and m. weinhold, "basic control of unified power flow controller", ieee trans. power syst., vol. 12, no. 4, pp. 581–588, nov. 1997. [5] h. fujita, y. watanabe and h. akagi, "control and analysis of a unified power flow controller", ieee trans. power electron., vol. 14, no. 6, pp. 1021–1027, nov. 1999. [6] l. liu, y. zhang, p. zhu, y. kang and j. chen, "control scheme and implement of a unified power flow controller", in proceedings of the international conference on electrical machines and systems, nanjing, 2005, pp. 1170–1175. [7] l. liu, p. zhu, y. kang and j. chen, "power-flow control performance analysis of a unified power-flow controller in a novel control scheme", ieee trans. power deliv., vol. 22, no. 3, pp. 1613–1619, july 2007. [8] p. k. dash, s. mishra and g. panda, "damping multimodal power system oscillation using a hybrid fuzzy controller for series connected facts devices", ieee trans. power syst., vol. 15, no. 4, pp. 1360–1366, nov. 2000. [9] f. m. albatsh, s. mekhilef, s. ahmad, h. mokhlis, "fuzzy logic based upfc and laboratory prototype validation for dynamic power flow control in transmission lines", ieee trans. ind. electron., vol. 64, no. 12, pp. 9538–9548, dec. 2017. [10] m. khaksar, a. rezvani and m. h. moradi, "simulation of novel hybrid method to improve dynamic responses with pss and upfc by fuzzy logic controller", neural comput. appl., vol. 29, pp. 
837–853, feb. 2018 [11] n. narayana and r. k. mallick, "enhancement of small signal stability of power system using upfc based damping controller with novel optimized fuzzy pid controller", j. intell. fuzzy syst., vol. 35, no. 1, pp. 501–512, july 2018. [12] h. c. tsai, j. h. liu and c. c. chu, "integrations of neural networks and transient energy functions for designing supplementary damping control of upfc", ieee trans. ind. appl., vol. 55, no. 6, pp. 6438–6450, dec. 2019. [13] h. c. tsai and c. c. chu, "upfc supplementary damping control synthesis: a forward neural networks approximated energy function approach", in proceedings of the ieee industry applications society annual meeting (ias), 2018, pp. 1–8. [14] m. januszewski, j. machowski and j. w. bialek, "application of the direct lyapunov method to improve damping of power swings by control of upfc", iet proceedings – gener. transm. distrib., vol. 151, no. 2, pp. 252–260, april 2004. [15] m. a. sayed and t. takeshita, "line loss minimization in isolated substations and multiple loop distribution systems using the upfc", ieee trans. power electron., vol. 29, no. 11, pp. 5813–5822, nov. 2014. [16] k. k. sen and m. l. sen, introduction to facts controllers: theory, modeling and applications, john willey & sons, new jersey, 2009, chapter 2, pp. 58–62. [17] m. h. haque, "application of upfc to enhance transient stability limit", in proceedings of the ieee power engineering society general meeting, 2007, pp. 1–6. [18] x. yang, w. wang, h. cai, p. song and z. xu, "installation, system-level control strategy and commissioning of the nanjing upfc project", in proceedings of the ieee power and energy society general meeting, 2017, pp. 1–5. [19] y. cui, y. yu, w. bao, y. feng, q. guo, w. xie and m. jin, "analysis of application effect of 220 kv upfc demonstration project in shanghai grid", dianli xitong baohu yu kongzhi/power system protection and control, vol. 46, pp. 136–142, 2018. [20] s. k. samal, p. c. 
panda, "damping of power system oscillations by using unified power flow controller with pod and pid controllers", in proceedings of the international conference on circuits, power and computing technologies (iccpct-2014), 2014, pp. 662–667.
[21] a. m. shotorbani, a. ajami, m. p. aghababa and s. h. hosseini, "direct lyapunov theory-based method for power oscillation damping by robust finite-time control of unified power flow controller", iet gener. transm. distrib., vol. 6, no. 9, pp. 822–830, nov. 2012.
[22] h. huang, l. zhang, o. oghorada and m. mao, "analysis and control of a modular multilevel cascaded converter-based unified power flow controller", ieee trans. ind. appl., vol. 57, no. 3, pp. 3202–3213, june 2021.
[23] m. khaksar, a. rezvani and m. h. moradi, "simulation of novel hybrid method to improve dynamic responses with pss and upfc by fuzzy logic controller", neural comput. and appl., vol. 29, pp. 837–853, feb. 2018.
[24] p. w. sauer, m. a. pai and j. h. chow, power system dynamics, and stability: with synchrophasor measurement and power system toolbox, 2nd edition, wiley-ieee press, 2017, chapter 4, pp. 53–70.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 1-16
https://doi.org/10.2298/fuee2301001l
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

performance analysis of finfet based inverter, nand and nor circuits at 10 nm, 7 nm and 5 nm node technologies

abdelaziz lazzaz1, khaled bousbahi2, mustapha ghamnia3
1,3laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie.
2ecole supérieure du génie electrique et energétique d’oran, (esgeeo), algérie

abstract. advancement in the semiconductor industry has transformed modern society.
the miniaturization of the silicon transistor continues to follow moore’s empirical law. the planar metal-oxide semiconductor field effect transistor (mosfet) structure has reached its limit in terms of technological node reduction. to ensure the continuation of cmos scaling and to overcome the short channel effect (sce) issues, a new mos structure known as the fin field-effect transistor (finfet) has been introduced and has led to significant performance enhancements. this paper presents a comparative study of cmos gates designed with finfet 10 nm, 7 nm and 5 nm technology nodes. electrical parameters such as the maximum switching current ion, the leakage current ioff, and the performance ratio ion/ioff for n and p finfets at the different nodes are presented in this simulation. the aim and the novelty of this paper is to extract the operating frequency of cmos circuits using quantum and stress effects implemented in the spice parameters of the latest microwind software. the simulation results fit the experimental data for n and p finfet 10 nm structures when the quantum correction is used. finally, we have demonstrated that the 5 nm finfet can reach a minimum time delay of td=1.4 ps for the cmos not gate and td=1 ps for the cmos nor gate, to improve integrated circuits (ic).

key words: finfet, quantum effect, cmos not gate, cmos nor gate, cmos nand gate, microwind

received april 17, 2022; revised may 22, 2022, june 05, 2022 and june 16, 2022; accepted july 16, 2022
corresponding author: abdelaziz lazzaz, laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie
e-mail: lazzaz.abdelaziz@gmail.com

1. introduction

the rapid development of nanoelectronics technology is closely related to solving the problem of minimum layout dimensions. the efficient miniaturization of the transistor has been one of the most important topics for integrating a greater number of electronic components in a single chip. the finfet is one of the best alternatives for replacing the mosfet, which encounters sce problems such as drain induced barrier lowering (dibl) and an increase of leakage current when the channel length is reduced below 32 nm. researchers around the world have tried to improve the performance of finfets by introducing high k dielectric materials and strained silicon technology [1]. since the conventional mosfet has reached its limit, the multi gate finfet has become one of the most promising devices for cmos technology, and analytical studies of the finfet are a current topic of research in large foundries such as tsmc [6], samsung and intel, which aim to create the most efficient cmos circuits. shiqi liu et al. in 2021 [21] have simulated an ultra-thin si finfet with a width of 0.8 nm by using ab initio quantum transport simulations. the results of their simulation confirm that even with the gate length down to 5 nm, the on-state current, delay time, power dissipation, and energy-delay product of the optimized ultra-thin si finfet still meet the requirements of high-performance applications. dhananjaya tripathy et al. in 2022 [22] have examined the impact of variation in the thickness of the oxide (sio2) layer on the performance parameters of a finfet. the results confirm that a rise in sio2 thickness improves the energy and power dissipation of the finfet. lazzaz et al. in 2022 [23] have simulated a theoretical model based on the bohm quantum potential (bqp) theory and compared it with experimental data. the theory fits the experiment after optimization and correction using the right values of the geometric parameters. bourahla et al.
in 2021 [24] have demonstrated that ta2o5 as a high-permittivity gate material (k = 27) yields better values of performance parameters such as vth, ss, ion, ioff, the ion/ioff ratio, gm and the electric field (e) than other dielectrics such as sio2, sno2 and zro2, which improves the performance of the device. lazzaz et al. in 2021 [2] have demonstrated the impact of the metal gate work function on the performance of the dg finfet 10 nm with silvaco tcad tools. uttam kumar das et al. in 2021 [25] have carried out a comparative study of the silicon finfet against carbon nanotube and 2d-fets for advanced node cmos logic applications. the results of this simulation confirm that the finfet delivers more than three times higher drive current, as well as five times better energy-delay performance. rajeev ratna vallabhuni et al. in 2020 [26] have simulated a 2-bit comparator designed with 18 nm finfet technology. the simulation evaluates the cmos comparator in terms of power and delay using the cadence virtuoso tool. the results of this simulation confirm that finfets can be used where a fast switching rate is required, to improve the efficiency of control devices and to make compact devices. j. jena et al. in 2022 [27] have simulated finfet-based inverter design and optimization for the 7 nm technology node. the results of their simulation confirm that, depending on the sidewall orientation (<100> or <110>), the mobility enhancement is more than 100% for electrons and less than 25% for holes. c. auth et al. in 2017 [31] have presented an industry-leading 10 nm cmos technology node with excellent finfet transistor and interconnect performance and aggressive design rule scaling. their results show a high-performance, high-density sram featuring a 0.0312 µm² cell size fabricated using all 10 nm process features. s. panchanan et al.
in 2021 [35] have simulated an analytical model of the tri-gate metal-oxide-semiconductor field effect transistor (tg mosfet) for short channel lengths below 10 nm using tcad software. the model is examined by varying channel length, oxide thickness, gate voltage, drain voltage and doping concentration. the results of their simulation confirm that, to obtain identical surface potentials, the oxide thickness of hfo2 must be larger than that of sio2. unlike sio2, the minima of the surface potential remain constant with channel length for hfo2. b. vandana et al. in 2018 [36] have explored the analog analysis and higher order derivatives of the drain current (id) with respect to the gate source voltage (vgs), by introducing a channel engineering technique for 3d conventional and wavy junctionless finfets (jlt) with silicon germanium (si1-0.25ge0.25) as the device layer. the results of their simulation confirm that better gate control over the channel is observed for wavy structures and that a high id is induced as lg scales down. n. p. maity et al. in 2019 [37] have simulated a double-gate (dg) heterojunction tunnel finfet structure with a source overlap region to optimize its performance, and have validated the technology computer-aided design (tcad) simulation results by modeling the surface potential, electric field, and threshold voltage. suparna panchanan et al. in 2021 [38] have proposed an analytical model for the surface potential and threshold voltage of an undoped (or lightly doped) tri-gate fin field effect transistor (tg-finfet) and validated it using technology computer-aided design (tcad) simulation. suparna panchanan et al. in 2022 [39] have studied a lambert w function-based drain current model of a lightly doped short channel tri-gate fin fashioned field effect transistor (tg-finfet).
their results confirm that a precise drain current is obtained by adding the quantum mechanical effect (qme), which also improves the efficiency of the model. shaheen saleh et al. in 2018 [41] have demonstrated the roles and impacts of various effects and aging mechanisms on finfet transistors compared to planar transistors, using the physics of failure mechanisms to build a comprehensive aging model.
the above literature survey indicates the importance of using high-k dielectrics in finfet devices and the importance of the multi-gate finfet for overcoming the sce and improving channel control. in this paper, we present a comparative study of different cmos gates (not, nand and nor gates) based on the 10 nm, 7 nm and 5 nm technology nodes in order to extract the optimal geometric parameters for an operational finfet device for future applications such as sram circuits.

2. device structure and simulation
tri-gate (tg) finfet technology is based on a vertical fin characterized by the fin length (l), fin height (hfin) and fin width (wfin), as shown in figure 1. finfet devices have been used in a variety of innovative digital and analog circuit designs. the tg structure has been developed recently, and its ability to control three sides of the channel has been used to reduce the circuit area, its capacitance and the variation of the threshold voltage. over the last few years, cmos scaling and improvements in processing technologies have led to a continuous increase in circuit speed due to the miniaturization of the finfet device.
the main difference between the bulk finfet and the soi finfet is the buried oxide (box), which isolates the body from the substrate, minimizes the leakage current due to quantum effects, and reduces the parasitic junction capacitance and the source/drain capacitance.
despite the improvements that soi finfet technology brings to the device, one of its drawbacks is the self-heating effect, because the thin active body sits on silicon oxide, which is a good thermal insulator. during operation, the power consumed by the active region cannot be dissipated easily; therefore, the temperature of the thin body rises, which decreases the mobility and the current of the device [32].
in this work, the finfet structure has been simulated with microwind 3.8 software using the parameters provided in table 1. the right panel of figure 1 shows the 3d schematic of the simulated finfet 10 nm and the left panel shows the design layout of the device.
fig. 1 n finfet 10 nm
table 1 different parameters of the simulated device [6] [7] [12]
| notation | description | finfet 10 nm | finfet 7 nm | finfet 5 nm |
|---|---|---|---|---|
| ls, ld | length of drain/source | 22 nm | 16 nm | 12 nm |
| lg | gate length | 18 nm | 16 nm | 14 nm |
| tox | oxide thickness | 1 nm | 0.9 nm | 0.9 nm |
| hfin | fin height | 46 nm | 46 nm | 46 nm |
| wfin | fin width | 7 nm | 6 nm | 5 nm |
table 1 shows the design parameters that we have employed for the circuit simulations in the present work. the primary obstacles to the scaling of cmos gate lengths to 10 nm and beyond are the short channel effect and the leakage current, which lead to low yield. the finfet offers better control over the sce and hence overcomes these scaling obstacles.
the circuit simulation is done using microwind 3.8, which we have used to simulate electrical circuits in the transient domain. the microwind tool facilitates circuit-level performance analysis of integrated circuits. the predictive technology model (ptm) integrated in microwind provides accurate, customizable, and predictive model files for future transistor and interconnect technologies [28][29].
we have simulated different logic circuits, namely the not gate and the 2-input nand and 2-input nor gates, for leakage power dissipation, delay time and power delay product (pdp) at the 10 nm, 7 nm and 5 nm technology nodes, and a comparison is made to check the technology scaling.
the threshold voltage can be represented by the following equation [11]:
$$V_{th} = \phi_{ms} + 2\phi_f + \frac{Q_d}{C_{ox}} + \frac{Q_{ss}}{C_{ox}} + V_{in} \qquad (1)$$
$\phi_{ms}$: work function difference between gate and fin, $Q_{ss}$: charge in the gate dielectric, $C_{ox}$: oxide capacitance, $Q_d$: depletion charge, $\phi_f$: fermi potential, $V_{in}$: input voltage.
power dissipation plays a crucial role in the overall performance of circuits in the sub-10 nm regime and it represents an important performance metric to check the effectiveness of the proposed technique.
the time delay is a performance metric used to evaluate the switching speed of the circuit. it is calculated by the following equation [9]:
$$t_d = \frac{t_{phl} + t_{plh}}{2} \qquad (2)$$
$t_{phl}$: high to low transition delay; $t_{plh}$: low to high transition delay.
leakage power dissipation is also an important parameter for research designers because it affects the performance and reliability of the electronic device. the leakage power dissipation is calculated using the following equation [8]:
$$P_{leakage} = V_{dd}\, I_{leakage} \qquad (3)$$
where $V_{dd}$ is the supply voltage and $I_{leakage}$ is the leakage current.
scaling is a very important step in the design of the finfet structure, where the scaling factor is given by the following equation [6]:
$$\text{scaling factor} = W_{fin} + 2\,t_{ox} \qquad (4)$$
$W_{fin}$: fin width; $t_{ox}$: oxide thickness.
the pdp (power delay product) is an essential measure of circuit performance. technology scaling increases the power dissipation and delay values; therefore, the lowest value of pdp indicates the best performance at the scaled technology nodes.
pdp is given by the following equation [17]:
$$PDP = \text{power dissipation} \times \text{delay} \qquad (5)$$
the following equation represents the drain current in the sub-threshold mode used in this simulation:
$$I_{ds} = I_{ds}(V_{on}, V_{ds})\, \exp\!\left(\frac{q\,(V_{gs} - V_{on})}{n k T}\right) \qquad (6)$$
$V_{gs}$: gate source voltage, $n$: body coefficient, $k$: boltzmann constant, $T$: temperature, $q$: electron charge.
$$I_{ds} = \frac{W_{eff}}{L_{eff}}\, \mu_{eff}\, \frac{\varepsilon_0 \varepsilon_r}{t_{oxe}}\, V_{gsteff} \left(1 - \frac{A_{bulk} V_{dseff}}{2 V_{gsteff} + 4 v_t}\right) \frac{V_{dseff}}{1 + \dfrac{V_{dseff}}{\varepsilon_{sat} L_{eff}}} \qquad (7)$$
$W_{eff}$: effective width, $L_{eff}$: effective length, $\varepsilon_0$: vacuum permittivity, $\varepsilon_r$: relative permittivity, $t_{oxe}$: oxide thickness, $V_{gsteff}$: gate source effective voltage, $V_{dseff}$: drain source effective voltage, $\varepsilon_{sat}$: saturation permittivity, $v_t$: carrier velocity.
in 3d nanochannel devices, the sce modifies the drain current expression by a correction factor cf for the post-threshold voltage regime:
$$cf = \frac{\lambda}{\lambda + L} \qquad (8)$$
$\lambda$: mean free path, $L$: channel length; cf is also called the transition coefficient.
figure 2 represents the transfer characteristics of the n finfet 10 nm and illustrates a comparison between the theoretical and experimental transfer characteristics in the subthreshold regime. the gate voltage is swept from 0 v to 0.8 v for drain voltages of 0.05 v, 0.1 v and 0.2 v. the maximum value of the drain current represents the on current, obtained when vgs = vdd = 0.8 v, and its value is 10.5 µa. the leakage current is 2.75 na and it represents the value of the current when vgs = 0. to fit the experimental results, the drain current is modified by the correction factor cf given in equation (8). this coefficient represents the transport mode transition factor. the transport in the channel is quasi-ballistic. this transition coefficient takes into consideration the type of charge carrier, n or p, so the correction value differs between the two structures.
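the two closed-form expressions above lend themselves to a quick numerical check. the sketch below is a minimal illustration, not the microwind implementation: it assumes equation (8) has the usual transmission form cf = λ/(λ + l) and that equation (6) is the exponential extrapolation of the current at von. the function names and the 10 nm mean free path are hypothetical; only the 18 nm gate length comes from table 1.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # k
ELECTRON_CHARGE_C = 1.602176634e-19    # q


def correction_factor(mean_free_path_nm: float, channel_length_nm: float) -> float:
    """Transition coefficient cf = lambda / (lambda + L), the assumed form of Eq. (8)."""
    if mean_free_path_nm < 0 or channel_length_nm <= 0:
        raise ValueError("mean free path must be >= 0 and channel length > 0")
    return mean_free_path_nm / (mean_free_path_nm + channel_length_nm)


def subthreshold_current(i_at_von: float, v_gs: float, v_on: float,
                         n: float, temperature_k: float = 300.0) -> float:
    """Assumed form of Eq. (6): Ids = Ids(Von, Vds) * exp(q (Vgs - Von) / (n k T))."""
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELECTRON_CHARGE_C
    return i_at_von * math.exp((v_gs - v_on) / (n * thermal_voltage))


# Hypothetical mean free path of 10 nm combined with the 18 nm gate length of Table 1.
cf_10nm = correction_factor(10.0, 18.0)

# Gate-voltage step that changes the subthreshold current by one decade (n = 1, 300 K).
decade_step_v = (BOLTZMANN_J_PER_K * 300.0 / ELECTRON_CHARGE_C) * math.log(10.0)

if __name__ == "__main__":
    print(f"cf for lambda = 10 nm, L = 18 nm: {cf_10nm:.3f}")
    print(f"ideal subthreshold step per decade at 300 K: {decade_step_v * 1e3:.1f} mV")
```

the limits behave as expected for a transport-mode transition factor: cf tends to 1 when the mean free path is much longer than the channel (ballistic transport) and to 0 when it is much shorter (diffusive transport).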
the fit of the simulated results to the experimental data is due to the quantum correction, which gives good convergence between the two curves. the calculated parameters used to compute the mean free path in equation (8) include the effective mobility in the n channel, the diffusion coefficient and the unidirectional thermal velocity. the electron carrier mobility used in this simulation of the n finfet 10 nm is 350 cm²/v.s. the average mobility value has been extracted from the berkeley spice model for finfet 10 nm [28]. it is noted that for the gate voltages 0.1 v and 0.2 v the simulation curves fit the experimental ones very well [31]; there is therefore good convergence between the theoretical model and the experimental points at these gate voltages. the discrepancy at 0.4 v and 0.5 v can be explained by the presence of complex scattering phenomena which are very difficult to model.
fig. 2 transfer characteristics of n finfet 10 nm [31]
figure 3 represents the transfer characteristics of the p finfet 10 nm and illustrates a comparison between the theoretical and experimental transfer characteristics. we note that the on current is 50 µa and the leakage current is 44.76 na. the threshold voltage in this simulation is 0.20 v, and the decrease of the threshold voltage is due to the increase of the quasi-fermi level. the values of the drain voltage have been chosen to calculate the threshold voltage and to fit the curve to the experimental data [31].
fig. 3 transfer characteristics of p finfet 10 nm
figure 4 represents the transfer characteristics of the n finfet 7 nm. we note that the on current is 0.306 ma and the leakage current ioff is 82.536 na. low static power technologies need a higher threshold voltage, but the miniaturization of integrated circuits and of the channel length decreases the threshold voltage. the threshold voltage in this simulation is 0.22 v [33].
the leakage current in this simulation of the n finfet 7 nm is lower than that calculated by suyog gupta et al. [4].
fig. 4 transfer characteristics of n finfet 7 nm
figure 5 represents the transfer characteristics of the p finfet 7 nm. we note that ion is 0.250 ma and the leakage current is ioff = 221.571 na. the on current in this simulation of the p finfet 7 nm is higher than that calculated by t. dash et al. [18] and the leakage current is lower than that calculated by suyog gupta et al. [4].
fig. 5 transfer characteristics of p finfet 7 nm
figure 6 represents the transfer characteristics of the n finfet 5 nm. we note that the on current is 0.240 ma and the leakage current is 81.694 na. the threshold voltage is 0.23 v for this simulation; the increase of its value is due to the fermi level, and to obtain a better threshold voltage we need to increase the fin height [3][14]. the on current in this simulation is higher than that calculated by n. p. maity et al. [5]. we can control and minimize the leakage current in this structure for different channel lengths by optimizing the geometric parameters in order to obtain optimal results.
fig. 6 transfer characteristics of n finfet 5 nm
figure 7 represents the transfer characteristics of the p finfet 5 nm. we note that the maximum current ion is 0.199 ma and the leakage current is 219.31 na. we think that the increase of the leakage current is caused by the leaked quantum confinement and by the choice of geometric parameters such as the gate oxide, which raises the conduction band, so that more potential is needed to create an inversion layer [13].
fig. 7 transfer characteristics of p finfet 5 nm
table 2 lists the performance ratio ion/ioff, the threshold voltage vth and the dibl calculated for the different finfet 10 nm, 7 nm and 5 nm structures [10].
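the performance ratio is simply the quotient of the on and off currents quoted for figs. 2 to 7. the short sketch below recomputes it from those values; the dictionary layout and the function name are illustrative choices, not part of the original simulation flow.

```python
# (device type, node) -> (Ion in A, Ioff in A), taken from the values quoted in the text.
CURRENTS = {
    ("n", "10 nm"): (10.5e-6, 2.75e-9),
    ("p", "10 nm"): (50e-6, 44.76e-9),
    ("n", "7 nm"): (0.306e-3, 82.536e-9),
    ("p", "7 nm"): (0.250e-3, 221.571e-9),
    ("n", "5 nm"): (0.240e-3, 81.694e-9),
    ("p", "5 nm"): (0.199e-3, 219.31e-9),
}


def on_off_ratio(i_on: float, i_off: float) -> float:
    """Performance ratio Ion/Ioff (dimensionless)."""
    if i_off <= 0:
        raise ValueError("leakage current must be positive")
    return i_on / i_off


RATIOS = {key: on_off_ratio(*pair) for key, pair in CURRENTS.items()}

if __name__ == "__main__":
    for (kind, node), ratio in RATIOS.items():
        print(f"{kind} finfet {node}: Ion/Ioff = {ratio:.2f}")
```

because both currents are taken at the same supply voltage, the ratio is independent of the current unit as long as ion and ioff are expressed consistently.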
the table presents a comparative study with the international roadmap for devices and systems (irds) results [30]. the supply voltage is 0.8 v for finfet 10 nm and 7 nm and 0.65 v for finfet 5 nm. these parameters are extracted from the berkeley spice model [28].
table 2 performance ratio of finfet 10 nm, 7 nm and 5 nm
| device | finfet 10 nm | finfet 7 nm | finfet 5 nm |
|---|---|---|---|
| ion/ioff values for n structure | 3818.18 | 3707.36 | 2937.79 |
| ion/ioff values for p structure | 1117.6 | 1128.30 | 907.36 |
| vth (v) for n structure | 0.24 | 0.22 | 0.23 |
| vth (v) for p structure | 0.20 | 0.20 | 0.22 |
| ion/ioff for n structure (irds) [30] | 950 | 930 | 840 |
| dibl n finfet (mv/v) | 49.5 | 45.5 | 40.5 |
| dibl p finfet (mv/v) | 50.5 | 46.5 | 41.5 |
we note that the better performance ratio of the n finfet is for finfet 7 nm due to the leakage current, and the higher performance ratio of the p finfet is for finfet 10 nm due to the minimum strain effect on the on current.

3. cmos gates designs
this paper considers three design styles for digital logic circuit structures using finfets. the circuit diagrams of the finfet-based not, nand and nor gate designs, along with the ordinary cmos, are shown in figure 8.
fig. 8 (1): not gate, (2): cmos nand, (3) cmos nor [8]
the three cmos circuits (nand, nor and inverter) based on finfets have been analyzed using the microwind 3.8 tool. the first step is to implement the finfet-based circuits in order to create the layout styles [16]. the design rules must be checked before applying the inputs. the design rule used in this simulation is the lambda-based design rule. the value of lambda is fixed to 8 nm [6] [15].
figure 9.a represents the layout design of the cmos not gate with finfet 5 nm using microwind 3.8 and figure 9.b represents the 3d structure of the cmos not gate with finfet 5 nm [19].
fig. 9 (a) design layout of the cmos inverter, (b) cmos inverter 3d structure
figure 10.a represents the design layout of the cmos nand gate with finfet 5 nm using microwind 3.8 and figure 10.b represents the 3d structure of the cmos nand gate with finfet 5 nm.
fig. 10 (a) design layout of the cmos nand gate, (b) cmos nand 3d structure
figure 11.a represents the design layout of the cmos nor gate with finfet 5 nm using microwind 3.8 and figure 11.b represents the 3d structure of the cmos nor gate with finfet 5 nm.
fig. 11 (a) design layout of the cmos nor gate, (b) cmos nor gate 3d structure
figure 12 represents the vtc curves of the different cmos circuits:
fig. 12 (a) vtc curves of cmos not gate, (b) vtc curves of cmos nor gate, (c) vtc curves of cmos nand gate
the noise margin is a measure of design margins that ensures circuit operation within specified conditions, and it is closely related to the dc transfer curve [40]. this parameter determines the allowable noise voltage on the input of a gate such that the output will not be corrupted. the specification most commonly used to describe the noise margin (or noise immunity) uses two parameters: the low noise margin nml and the high noise margin nmh [8]. table 3 lists the parameters calculated from the vtc (voltage transfer curve).
figure 13 represents the values of the power delay product (pdp) for the different cmos gates. we note that the best pdp value for the not gate is obtained with finfet 5 nm, and for the cmos nand and nor gates with finfet 7 nm.
table 3 calculated parameters of different cmos finfet gates
| parameter | not 10 nm | not 7 nm | not 5 nm | nand 10 nm | nand 7 nm | nand 5 nm | nor 10 nm | nor 7 nm | nor 5 nm |
|---|---|---|---|---|---|---|---|---|---|
| vdd (v) | 0.80 | 0.80 | 0.65 | 0.80 | 0.80 | 0.65 | 0.80 | 0.80 | 0.65 |
| vsp (v) | 0.385 | 0.385 | 0.386 | 0.397 | 0.391 | 0.3999 | 0.379 | 0.373 | 0.371 |
| vil (v) | 0.3243 | 0.3243 | 0.3189 | 0.3445 | 0.3445 | 0.3351 | 0.3148 | 0.3189 | 0.3202 |
| voh (v) | 0.7750 | 0.7687 | 0.7656 | 0.7509 | 0.7562 | 0.7562 | 0.7187 | 0.7718 | 0.7562 |
| vih (v) | 0.4378 | 0.4391 | 0.4418 | 0.4513 | 0.4472 | 0.4472 | 0.4189 | 0.4216 | 0.4437 |
| vol (v) | 0.0531 | 0.0406 | 0.0406 | 0.0375 | 0.0437 | 0.0406 | 0.0343 | 0.05 | 0.0437 |
| nml (v) | 0.2712 | 0.2836 | 0.2782 | 0.3070 | 0.3007 | 0.2944 | 0.2804 | 0.2689 | 0.2764 |
| nmh (v) | 0.3372 | 0.3296 | 0.3238 | 0.2996 | 0.3090 | 0.3090 | 0.2998 | 0.3502 | 0.3125 |
| td (ps) | 1.6 | 1.5 | 1.4 | 2.20 | 2.20 | 2.10 | 1.10 | 1.10 | 1.0 |
| p (µw) | 0.446 | 0.357 | 0.460 | 0.475 | 0.686 | 0.5950 | 0.325 | 0.416 | 0.401 |
| pdp (10^-18 w.s) | 0.7136 | 0.5355 | 0.6440 | 1.0450 | 1.5092 | 1.2495 | 0.3575 | 0.4576 | 0.4010 |
p: power dissipation in static cmos, pdp: power delay product, td: time delay, vol: maximum low output voltage, voh: minimum high output voltage, vil: maximum low input voltage, vih: minimum high input voltage, vsp: switching point voltage.
fig. 13 power delay product (pdp) for different cmos gates
figure 14 represents the time delay of the different cmos gates. we note that the optimal device is finfet 5 nm due to its low time delay. the results obtained for each of the digital applications at 10 nm, 7 nm and 5 nm show a system tradeoff. we note that as we scale down the device from 10 nm to 5 nm, the time delay decreases because the supply voltage has been decreased [34].
fig. 14 time delay for different cmos gates
the fluctuation in the power delay product (pdp) is due to the fluctuation of the static power dissipation, and it is a minor issue because the system reliability has improved [20].
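the derived rows of table 3 can be recomputed from the measured ones. the sketch below does this for the not gate, using the standard definitions nml = vil - vol and nmh = voh - vih together with equation (5). it assumes the power column is in microwatts, which is the unit that makes power × delay reproduce the tabulated pdp in 10^-18 w.s; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateMetrics:
    """Measured VTC and timing values for one gate at one technology node."""
    vil: float    # maximum low input voltage (V)
    voh: float    # minimum high output voltage (V)
    vih: float    # minimum high input voltage (V)
    vol: float    # maximum low output voltage (V)
    td_ps: float  # time delay (ps)
    p_uw: float   # power dissipation (assumed to be in microwatts)

    @property
    def nml(self) -> float:
        """Low noise margin NML = VIL - VOL."""
        return self.vil - self.vol

    @property
    def nmh(self) -> float:
        """High noise margin NMH = VOH - VIH."""
        return self.voh - self.vih

    @property
    def pdp_e18(self) -> float:
        """Power delay product in units of 1e-18 W.s (1 ps * 1 uW = 1e-18 W.s)."""
        return self.td_ps * self.p_uw


# NOT gate columns of Table 3.
NOT_GATE = {
    "10 nm": GateMetrics(0.3243, 0.7750, 0.4378, 0.0531, 1.6, 0.446),
    "7 nm": GateMetrics(0.3243, 0.7687, 0.4391, 0.0406, 1.5, 0.357),
    "5 nm": GateMetrics(0.3189, 0.7656, 0.4418, 0.0406, 1.4, 0.460),
}

if __name__ == "__main__":
    for node, m in NOT_GATE.items():
        print(f"not gate {node}: NML={m.nml:.4f} V, NMH={m.nmh:.4f} V, "
              f"PDP={m.pdp_e18:.4f}e-18 W.s")
```

the recomputed noise margins agree with the tabulated ones to within one unit in the last printed digit, which is consistent with rounding of the underlying voltages.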
conclusion
as ultra large scale integration (ulsi) advances, new challenges have arisen, such as the sce, which are generated by the scaling of the transistors. from the simulation results, it has been observed that leakage power dissipation is a major issue in the modern semiconductor industry and that finfet devices have the advantages needed to overcome it. the simulation results for finfet-based digital applications in the nanometer regime at the 10 nm, 7 nm and 5 nm technology nodes have been studied with the educational tool microwind, and a comparative analysis of the different finfet technology nodes has been carried out. from the simulation results, one can conclude that the time delay and the power delay product of finfet-based cmos devices are crucial parameters for improving the performance of cmos circuits. we confirm in this study that significant progress has been made by introducing a new generation of 5 nm finfet devices, which improves the switching performance and decreases the time delay of cmos circuits compared with the 10 nm and 7 nm nodes. the results of this simulation confirm that the proper selection of the supply voltage and the geometric parameters is important for obtaining high-speed and stable cmos circuits.
acknowledgement: the authors wish to thank pr etienne sicard and mr vinay sharma for their helpful suggestions in this work.
references
[1] b. yu, l. chang and s. ahmed, "finfet scaling to 10 nm gate length", in proceedings of the ieee digest. international electron devices meeting, 2002, pp. 251-254.
[2] a. lazzaz, k. bousbahi and m. ghamnia, "modeling and simulation of dg soi n finfet 10 nm using hafnium oxide", in proceedings of the 21st ieee international conference on nanotechnology (nano), 2021, pp. 177-180.
[3] x. zhang, d. connelly and p.
zheng, "analysis of 7/8-nm bulk-si finfet technologies for 6t-sram scaling", ieee trans. electron devices, vol. 63, no. 4, pp. 1502-1507, 2016.
[4] s. gupta, v. moroz and l. smith, "7-nm finfet cmos design enabled by stress engineering using si, ge, and sn", ieee trans. electron devices, vol. 61, no. 5, pp. 1222-1230, 2014.
[5] n. maity, r. maity and s. maity, "comparative analysis of the quantum finfet and trigate finfet based on modeling and simulation", j. comput. electron., vol. 18, no. 2, pp. 492-499, 2019.
[6] e. sicard and l. trojman, "introducing 5-nm finfet technology in microwind", hal open science, hal0325444, 2021.
[7] n. bourahla, a. bourahla and b. hadri, "comparative performance of the ultra-short channel technology for the dg-finfet characteristics using different high-k dielectric materials", indian j. phys., vol. 95, pp. 1977-1984, 2020.
[8] n. weste and d. harris, cmos vlsi design: a circuits and systems perspective, pearson education india, 2015.
[9] j. baker, cmos circuit, design, layout and simulation, ieee press series on microelectronic systems, pp. 332-375, 2010.
[10] y. eng, l. hu, t. chang, s. hsu, c. chiou, t. wang and c. yang, "importance of $\Delta V_{\text{DIBLSS}}/(I_{\text{on}}/I_{\text{off}})$ in evaluating the performance of n-channel bulk finfet devices", ieee j. electron devices soc., pp. 207-213, 2018.
[11] m. lundstrom, fundamentals of nanotransistors, world scientific publishing company, vol. 6, 2017, pp. 100-300.
[12] n. collaert, high mobility materials for cmos applications, woodhead publishing, 2018, pp. 115-280.
[13] y. chauhan, d. lu and s. venugopalan, finfet modeling for ic simulation and design: using the bsimcmg standard, academic press, 2015, pp. 72-200.
[14] m. tang, f. pregaldiny and c. lallement, "quantum compact model for ultra-narrow body finfet", in proceedings of the 10th international ieee conference on ultimate integration of silicon, 2009, pp. 293-296.
[15] e. sicard, "introducing 20 nm technology in microwind", hal open science, hal-03324322, pp. 3-20, 2011.
[16] e. sicard and s. dhia, "microwind & dsch: version 3", insa, pp. 1-90, 2004.
[17] r. sharma and s. verma, "comparitive analysis of static and dynamic cmos logic design", in proceedings of the ieee international conference on computing and communication technologies, 2011, pp. 231-234.
[18] t. dash, s. dey and s. das, "performance comparison of strained-sige and bulk-si channel finfets at 7 nm technology node", j. micromech. microeng., vol. 29, no. 10, p. 104001, 2019.
[19] l. artola, g. hubert and m. alioto, "comparative soft error evaluation of layout cells in finfet technology", microelectron. reliab., vol. 54, no. 9-10, pp. 2300-2305, 2014.
[20] v. vashishtha and l. clark, "comparing bulk-si finfet and gate-all-around fets for the 5 nm technology node", microelectron. j., vol. 107, p. 104942, 2021.
[21] s. liu, j. yang and l. xu, "can ultra-thin si finfets work well in the sub-10 nm gate-length region?", nanoscale, vol. 13, no. 10, pp. 5536-5544, 2021.
[22] d. tripathy, d. acharya and p. rout, "influence of oxide thickness variation on analog and rf performances of soi finfet", fu: elec. energ., vol. 35, no. 1, pp. 001-011, 2022.
[23] a. lazzaz, k. bousbahi and m. ghamnia, "optimized mathematical model of experimental characteristics of 14 nm tg n finfet", micro and nanostructures, p. 207210, 2022.
[24] n. bourahla, b. hadri and n. boukortt, "impact of high-k dielectric material on ultra-short-dg-finfet performance", in proceedings of the 15th international ieee conference on advanced technologies, systems and services in telecommunications (telsiks), 2021, pp. 78-81.
[25] u. das, m. hussain, "benchmarking silicon finfet with the carbon nanotube and 2d-fets for advanced node cmos logic application", ieee trans. electron devices, vol. 68, no. 7, pp. 3643-3648, 2021.
[26] r. vallabhuni, d. sravya and m. shalini, "design of comparator using 18nm finfet technology for analog to digital converters", in proceedings of the 7th international ieee conference on smart structures and systems (icsss), 2020, pp. 1-6.
[27] j. jena, d. jena and e. mohapatra, "finfet-based inverter design and optimization at 7 nm technology node", silicon, vol. 14, pp. 10781-10794, 2022.
[28] s. sinha, g. yeric and v. chandra, "exploring sub-20nm finfet design with predictive technology models", in proceedings of the ieee dac design automation conference, 2012, pp. 283-288.
[29] e. sicard and l. trojman, "introducing 5-nm finfet technology in microwind", hal open science, hal0325444, 2021.
[30] international roadmap for devices and systems. available at: https://irds.ieee.org/ (2018 edition).
[31] c. auth, a. aliyarukunju and m. asoro, "a 10nm high performance and low-power cmos technology featuring 3rd generation finfet transistors, self-aligned quad patterning, contact over active gate and cobalt local interconnects", in proceedings of the ieee international electron devices meeting (iedm), 2017, pp. 29.1.1-29.1.4.
[32] p. vora and r. lad, "a review paper on cmos, soi and finfet technology", design and reuse industry articles, pp. 1-10, 2017.
[33] m. tang, f. prégaldiny and c. lallement, "explicit compact model for ultranarrow body finfets", ieee trans. electron devices, vol. 56, no. 7, pp. 1543-1547, 2009.
[34] j. hu and x. yu, "near-threshold full adders for ultra low-power applications", in proceedings of the second ieee pacific-asia conference on circuits, communications and system, 2010, pp. 300-303.
[35] s. panchanan, r. maity and s. baishya, "a surface potential model for tri-gate metal oxide semiconductor field effect transistor: analysis below 10 nm channel length", eng. sci. technol. int. j., vol. 24, no. 4, pp. 879-889, 2021.
[36] b. vandana, d. kumar and s.
mohapatra, "impact of channel engineering (si1-0.25 ge0.25) technique on gm (transconductance) and its higher order derivatives of 3d conventional and wavy junctionless finfets (jlt)", facta universitatis, series electronics and energetics, vol. 31, no. 2, pp. 257-265, 2018. [37] n. maity, r. maity and s.baishya, "an analytical model for the surface potential and threshold voltage of a double-gate heterojunction tunnel finfet", j. comput. electron., vol. 18, no 1, pp. 65-75, 2019. [38] s. panchanan, r. maity, "modeling, simulation and analysis of surface potential and threshold voltage: application to high-k material hfo2 based finfet", silicon, vol. 13, no. 10, pp. 3271-3289, 2021. [39] s. panchanan, r. maity and s. baishya, "modeling, simulation and performance analysis of drain current for below 10 nm channel length based tri-gate finfet", silicon, vol. 14, pp. 11519-11530, 2022. [40] l. wang, y. chang and k. cheng, electronic design automation: synthesis, verification, and test, morgan kaufmann (ed), 2009. [41] s. shaheen, g. golan, m. azoulay, "a comparative study of reliability for finfet", facta universitatis, series electronics and energetics, vol. 31, no 3, pp. 343-366, 2018. instruction facta universitatis series: electronics and energetics vol. 27, n o 1, march 2014, pp. i i editorial as emphasized in the editorial for the first in the series of the anniversary issues, over the past quarter of century facta universitatis: series electronics and energetics has become one of the most widely read and cited journals in the field in the region of the west balkans. unfortunately, the journal has yet not gained worldwide recognition, and it will be the main goal, consistent with our coverage and focused aims, in the near future. in order to meet this goal we will strive to attract best submissions and publish best papers from a very broad geographic area, thus making facta universitatis: series electronics and energetics a truly international journal. 
we will also insist that all published papers are of high quality and practical value, thus leading to their worldwide citation, i.e. to the journal’s placement onto sci list. whilst insisting that all published papers are of high quality and practical value, we wish to avoid creating a situation where the journal publishes by quantity rather than quality, and that is the reason why we already started with rigorous refereeing of all submitted papers. however, recent submissions have surpassed our expectations in quantity, quality and practical values, so we are pleased to announce that journal will be published quarterly since this year. this one, second in the series of the anniversary issues, is a collection of 9 invited papers by well-known experts for the specific areas, most of them members of our editorial team, who present and discuss the state-of-the-art issues of practical interest in the field. as a new editor-in-chief, i, along with our editorial team, promise to continue developing and improving facta universitatis: series electronics and energetics in order to keep it at the forefront of science and technology. ninoslav stojadinović editor-in-chief instruction facta universitatis series: electronics and energetics vol. 30, n o 1, march 2017, pp. 81 91 doi: 10.2298/fuee1701081v on the numerical computation of cylindrical conductor internal impedance for complex arguments of large magnitude * slavko vujević, dino lovrić university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia abstract. in this paper a numerical algorithm for computation of per-unit-length internal impedance of cylindrical conductors under complex arguments of large magnitude is presented. the presented algorithm either numerically solves the scaled exact formula for internal impedance or employs asymptotic approximations of modified bessel functions when applicable. 
the formulas presented can be used for computation of per-unit-length internal impedance of solid cylindrical conductors as well as tubular cylindrical conductors.

key words: internal impedance, modified bessel functions, large function arguments, scaling.

1. introduction

internal impedance per-unit-length (pul) or surface impedance of cylindrical conductors is required in the analysis of numerous electromagnetic problems [1-5]. this pul internal impedance can be computed using various formulas which contain special functions such as bessel functions and modified bessel functions [6]. whichever formula is employed, the results are reliable only for smaller function arguments, whereas for larger function arguments stability issues often occur. these issues are directly connected with computing special functions (bessel functions and modified bessel functions) for large arguments, which in some cases yield extremely large values and in other cases extremely small values. in addition, these extreme values are multiplied, divided, subtracted and added, which makes things considerably worse. in this paper an algorithm is presented which circumvents the mentioned issues by first scaling the employed formulas to avoid overflow/underflow issues and then evaluating the scaled modified bessel functions in one of two ways: either by numerical integration or by using asymptotic approximations when applicable [7].

received february 25, 2016; received in revised form april 7, 2016
corresponding author: slavko vujević, university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia (e-mail: vujevic@fesb.hr)
* an earlier version of this paper was presented at the 12th international conference on applied electromagnetics (пес 2015), august 31 - september 2, 2015, in niš, serbia [1].

the formulas presented in the paper are applicable to solid and tubular cylindrical conductors.
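the idea outlined in the introduction (scale first, then either integrate numerically or use an asymptotic series) can be illustrated with a minimal python sketch. it is not the authors' fortran implementation: it assumes the scaling $I_0^s(\gamma r) = \exp(-\gamma r)\,I_0(\gamma r)$ with $\gamma r = \alpha r\,(1+j)$, uses a fixed-step composite simpson rule in place of the adaptive simpson rule used in the paper, and the function names and the number of panels are hypothetical choices made only for this illustration.

```python
import cmath
import math


def scaled_i0_integral(alpha_r, panels=20000):
    """exp(-z) * I0(z) for z = alpha_r * (1 + 1j), by composite Simpson.

    Uses I0(z) = (2/pi) * integral_0^{pi/2} cosh(z sin(theta)) d(theta);
    the scaling factor exp(-z) is folded into the integrand.
    """
    z = alpha_r * (1 + 1j)
    h = (math.pi / 2) / panels  # panels must be even for Simpson's rule

    def integrand(theta):
        s = math.sin(theta)
        # Both exponents have a non-positive real part, so nothing overflows.
        return cmath.exp(z * (s - 1.0)) + cmath.exp(-z * (s + 1.0))

    total = integrand(0.0) + integrand(math.pi / 2)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / math.pi


def scaled_i0_asymptotic(alpha_r, terms=12):
    """exp(-z) * I0(z) from the large-argument series, truncated after `terms`."""
    z = alpha_r * (1 + 1j)
    total = 1.0 + 0j
    term = 1.0 + 0j
    for m in range(1, terms + 1):
        # Ratio of consecutive series terms for order zero.
        term *= (2 * m - 1) ** 2 / (8 * m * z)
        total += term
    return total / cmath.sqrt(2 * math.pi * z)


if __name__ == "__main__":
    print(scaled_i0_integral(30.0))
    print(scaled_i0_asymptotic(30.0))
    # The unscaled I0 is of order exp(1000) here, beyond double range,
    # while the scaled value stays finite.
    print(scaled_i0_asymptotic(1.0e3, terms=5))
```

for $\alpha r = 30$ the two routes should agree to many digits, which mirrors the hand-over between numerical integration and asymptotic approximation described in the paper; for $\alpha r = 1000$ only the scaled quantity is representable in double precision.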
all presented formulas are for a tubular cylindrical conductor, but by setting the internal radius of the tubular cylindrical conductor to zero, the pul internal impedance of a solid cylindrical conductor can be obtained. this model for computing pul internal impedance of single-layer tubular conductors represents a basis for a more general model, currently in development, which will be able to compute pul internal impedance of a multilayered tubular conductor.

2. formula for computation of tubular cylindrical conductor internal impedance

computation of pul internal impedance of tubular cylindrical conductors (fig. 1), which takes the skin effect into account but ignores the proximity effect, can be performed using various formulas based on different special functions. it has been concluded in the previous work of the authors of this paper that, from the numerical stability standpoint, the most suitable formula for computation of pul internal impedance of tubular conductors is based on modified bessel functions of the first and second kind [7]:

$$Z = \frac{\gamma}{2\pi\sigma r_e}\cdot\frac{K_1(\gamma r_i)\,I_0(\gamma r_e) + K_0(\gamma r_e)\,I_1(\gamma r_i)}{K_1(\gamma r_i)\,I_1(\gamma r_e) - K_1(\gamma r_e)\,I_1(\gamma r_i)} \tag{1}$$

$$\gamma = \sqrt{\omega\mu\sigma}\,\exp\!\left(j\,\frac{\pi}{4}\right) = \alpha\,(1+j) \tag{2}$$

where $\sigma$ is the electrical conductivity of the conductor material, $r_e$ is the external radius of the conductor, $r_i$ is the internal radius of the conductor, $I_0$ and $I_1$ are complex-valued modified bessel functions of the first kind of order zero and one, $K_0$ and $K_1$ are complex-valued modified bessel functions of the second kind of order zero and one (also called kelvin functions), $\gamma$ is the complex wave propagation constant, $\alpha$ is the attenuation constant, $\mu$ is the permeability of the conductor material, $\omega$ is the circular frequency and $j$ is the imaginary unit. from (2) it follows that $\alpha = \sqrt{\omega\mu\sigma/2}$.

fig.
1 cross-section of a tubular cylindrical conductor

as has been shown in [7], by rearranging formula (1) and scaling it by an appropriate factor, the following formula for pul internal impedance of tubular conductors can be obtained:

$$Z = \frac{\gamma}{2\pi\sigma r_e}\cdot\frac{I_0^s(\gamma r_e)}{I_1^s(\gamma r_e)}\cdot\frac{1 + \dfrac{K_0^s(\gamma r_e)}{K_1^s(\gamma r_i)}\,\exp[-2\gamma(r_e - r_i)]\,\dfrac{I_1^s(\gamma r_i)}{I_0^s(\gamma r_e)}}{1 - \dfrac{K_1^s(\gamma r_e)}{K_1^s(\gamma r_i)}\,\exp[-2\gamma(r_e - r_i)]\,\dfrac{I_1^s(\gamma r_i)}{I_1^s(\gamma r_e)}} \tag{3}$$

where the scaled modified bessel functions are:

$$I_n^s(\gamma r) = \exp(-\gamma r)\,I_n(\gamma r) \tag{4}$$

$$K_n^s(\gamma r) = \exp(\gamma r)\,K_n(\gamma r) \tag{5}$$

modified bessel functions of the first kind are scaled down $\exp(\gamma r)$ times whereas modified bessel functions of the second kind are scaled up $\exp(\gamma r)$ times. in this way quantities of similar magnitudes are obtained, which enables more stable computation. the computation of internal impedance $Z$ can be further simplified depending on the magnitude of $\alpha(r_e - r_i)$. numerical analysis has shown that for $\alpha(r_e - r_i) \le 19$ computation of $Z$ must be performed using (3) in order to maintain high accuracy. however, for larger magnitudes of $\alpha(r_e - r_i)$ simplifications of formula (3) can be performed without loss of accuracy. the following relation presents these simplifications and their interval of applicability:

$$Z = \begin{cases} \dfrac{\gamma}{2\pi\sigma r_e}\cdot\dfrac{I_0^s(\gamma r_e)}{I_1^s(\gamma r_e)}\,; & 19 < \alpha(r_e - r_i) \le 10^{15} \\[2ex] \dfrac{\gamma}{2\pi\sigma r_e}\,; & \alpha(r_e - r_i) > 10^{15} \end{cases} \tag{6}$$

as can be seen from (3) and (6), it is imperative to compute scaled modified bessel functions of the first and second kind as accurately as possible. the proposed numerical procedure for achieving this is addressed in the following section of the paper.

3.
computation of scaled modified bessel functions

in the developed algorithm, for function parameters α∙r ≤ 25 the integral representation of scaled modified bessel functions of the first and second kind is used. the integral representation of modified bessel functions is more suitable than the infinite sum representation because the scaling factors given in (4-5) can be easily included in the integral representation, which is not the case for the infinite sum representation. the integrals that occur in modified bessel functions of the first and second kind are solved numerically using the adaptive simpson rule. on the other hand, for function parameters α∙r > 25 computation of scaled modified bessel functions of the first and second kind is performed using asymptotic approximations. through extensive numerical analysis it has been found that for function parameter values larger than 25, asymptotic approximations of modified bessel functions produce results as accurate as the numerical solution of the integral representation, but in less computation time.

3.1.
computation of scaled modified bessel functions of the first kind for α∙r ≤ 25

the modified bessel function of the first kind of order zero in its integral form can be expressed by the following equation [8]:

$$I_0(\gamma r) = \frac{2}{\pi}\int_0^{\pi/2}\cos(j\,\gamma r\sin\theta)\,d\theta = \exp(\gamma r)\,I_0^s(\gamma r) \tag{7}$$

further simplification of the previous expression and separation of real and imaginary parts yields the following relation for the scaled modified bessel function of the first kind of order zero:

$$I_0^s(\gamma r) = \frac{1}{\pi}\left\{\int_0^{\pi/2}\left[\exp(a)\cos a + \exp(b)\cos b\right]d\theta + j\int_0^{\pi/2}\left[\exp(a)\sin a + \exp(b)\sin b\right]d\theta\right\} \tag{8}$$

where $a$ and $b$ are given by:

$$a = \alpha r\,(\sin\theta - 1) \tag{9}$$

$$b = -\alpha r\,(\sin\theta + 1) \tag{10}$$

the separation of the real and imaginary parts is performed because these integrals are solved separately using adaptive simpson numerical integration. numerical integration yields highly accurate results because the separated functions are simple to integrate, as can be seen from fig. 2 and fig. 3, which depict how the real and imaginary parts of equation (8) behave on the integration interval for various values of parameter α∙r.

fig. 2 real part of scaled modified bessel function of the first kind of order zero for various values of parameter α∙r

fig.
3 imaginary part of scaled modified bessel function of the first kind of order zero for various values of parameter α∙r

the integral representation of the modified bessel function of the first kind of order one can be expressed by the following equation [8]:

$$I_1(\gamma r) = -j\,\frac{2}{\pi}\int_0^{\pi/2}\sin(j\,\gamma r\sin\theta)\,\sin\theta\,d\theta = \exp(\gamma r)\,I_1^s(\gamma r) \tag{11}$$

as before, by simplification of expression (11) and separation of real and imaginary parts, the following relation for the scaled modified bessel function of the first kind of order one can be obtained:

$$I_1^s(\gamma r) = \frac{1}{\pi}\left\{\int_0^{\pi/2}\left[\exp(a)\cos a - \exp(b)\cos b\right]\sin\theta\,d\theta + j\int_0^{\pi/2}\left[\exp(a)\sin a - \exp(b)\sin b\right]\sin\theta\,d\theta\right\} \tag{12}$$

the two integrals present in equation (12) are again solved numerically using the adaptive simpson rule. fig. 4 and fig. 5 depict how the real and imaginary parts of equation (12) behave on the integration interval for various values of parameter α∙r.

fig. 4 real part of scaled modified bessel function of the first kind of order one for various values of parameter α∙r

fig. 5 imaginary part of scaled modified bessel function of the first kind of order one for various values of parameter α∙r

3.2. computation of scaled modified bessel functions of the first kind for α∙r > 25

the asymptotic approximation of the scaled modified bessel function of the first kind can be expressed by [8]:

$$I_n^s(\gamma r) \sim \frac{1}{\sqrt{2\pi\gamma r}}\left\{1 + \sum_{m=1}^{\infty}(-1)^m\,\frac{\prod_{t=1}^{m}\left[4n^2 - (2t-1)^2\right]}{m!\,(8\gamma r)^m}\right\};\quad n = 0, 1 \tag{13}$$

from the previous expression, asymptotic approximations of scaled modified bessel functions of the first kind of orders zero and one can easily be deduced:

$$I_0^s(\gamma r) \sim \frac{1}{\sqrt{2\pi\gamma r}}\left\{1 + \sum_{m=1}^{n_a}\frac{c_m}{(\gamma r)^m}\right\} \tag{14}$$

$$I_1^s(\gamma r) \sim \frac{1}{\sqrt{2\pi\gamma r}}\left\{1 + \sum_{m=1}^{n_a}\frac{d_m}{(\gamma r)^m}\right\} \tag{15}$$

where:

$$n_a = \begin{cases} 12 & \text{for } 25 < \alpha r \le 50 \\ 9 & \text{for } 50 < \alpha r \le 100 \\ 7 & \text{for } 100 < \alpha r \le 300 \\ 5 & \text{for } 300 < \alpha r \le 10000 \\ 3 & \text{for } \alpha r > 10000 \end{cases} \tag{16}$$

$$c_m = \frac{(-1)^m}{8^m\,m!}\prod_{t=1}^{m}\left[-(2t-1)^2\right] \tag{17}$$

$$d_m = \frac{(-1)^m}{8^m\,m!}\prod_{t=1}^{m}\left[4 - (2t-1)^2\right] \tag{18}$$

the expressions for $c_m$ and $d_m$ are deduced from (13) and are also used for asymptotic approximations of modified bessel functions of the second kind. values of $n_a$ have been determined through numerical analysis.

3.3. computation of scaled modified bessel functions of the second kind for α∙r ≤ 25

the integral present in the expression for the modified bessel function of the second kind of order zero has an infinite upper limit [8]. fortunately, the integrand rapidly tends to zero as the integration variable increases, so the infinite limit can be substituted with a finite limit $t_{m0}$ without loss of accuracy:

$$K_0(\gamma r) = \int_0^{\infty}\exp(-\gamma r\cosh t)\,dt \approx \int_0^{t_{m0}}\exp(-\gamma r\cosh t)\,dt \tag{19}$$

$$t_{m0} = \cosh^{-1}\!\left(1 + \frac{65}{\alpha r}\right) \tag{20}$$

now the scaled modified bessel function of the second kind of order zero can be deduced from (19):

$$K_0^s(\gamma r) = \int_0^{t_{m0}}\exp(-d)\cos d\;dt - j\int_0^{t_{m0}}\exp(-d)\sin d\;dt \tag{21}$$

$$d = \alpha r\,(\cosh t - 1) \tag{22}$$

the two integrals present in equation (21) are again solved numerically using the adaptive simpson rule with high accuracy. fig. 6 and fig.
7 depict how the real and imaginary parts of equation (21) behave on the integration interval for various values of parameter α∙r.

fig. 6 real part of scaled modified bessel function of the second kind of order zero for various values of parameter α∙r

fig. 7 imaginary part of scaled modified bessel function of the second kind of order zero for various values of parameter α∙r

similarly as for the modified bessel function of the second kind of order zero, the infinite upper limit of the integral in the expression for the modified bessel function of the second kind of order one [8] can be replaced with a finite limit $t_{m1}$:

$$K_1(\gamma r) = \int_0^{\infty}\exp(-\gamma r\cosh t)\cosh t\,dt \approx \int_0^{t_{m1}}\exp(-\gamma r\cosh t)\cosh t\,dt \tag{23}$$

$$t_{m1} = t_{m0} + 0.25 \tag{24}$$

simplification of expression (23) yields the following expression for the scaled modified bessel function of the second kind of order one:

$$K_1^s(\gamma r) = \int_0^{t_{m1}}\exp(-d)\cos d\,\cosh t\;dt - j\int_0^{t_{m1}}\exp(-d)\sin d\,\cosh t\;dt \tag{25}$$

as before, the two integrals present in equation (25) are solved numerically using the adaptive simpson rule with high accuracy. fig. 8 and fig. 9 depict how the real and imaginary parts of equation (25) behave on the integration interval for various values of parameter α∙r.

fig. 8 real part of scaled modified bessel function of the second kind of order one for various values of parameter α∙r

fig. 9 imaginary part of scaled modified bessel function of the second kind of order one for various values of parameter α∙r

3.4. computation of scaled modified bessel functions of the second kind for α∙r > 25

the asymptotic approximation of scaled modified bessel functions of the second kind is given by the following expression [8]:

$$K_n^s(\gamma r) \sim \sqrt{\frac{\pi}{2\gamma r}}\left\{1 + \sum_{m=1}^{\infty}\frac{\prod_{t=1}^{m}\left[4n^2 - (2t-1)^2\right]}{m!\,(8\gamma r)^m}\right\};\quad n = 0, 1 \tag{26}$$

from the previous expression, asymptotic approximations of scaled modified bessel functions of the second kind of orders zero and one can be deduced:

$$K_0^s(\gamma r) \sim \sqrt{\frac{\pi}{2\gamma r}}\left\{1 + \sum_{m=1}^{n_a}\frac{(-1)^m\,c_m}{(\gamma r)^m}\right\} \tag{27}$$

$$K_1^s(\gamma r) \sim \sqrt{\frac{\pi}{2\gamma r}}\left\{1 + \sum_{m=1}^{n_a}\frac{(-1)^m\,d_m}{(\gamma r)^m}\right\} \tag{28}$$

where $n_a$ is given by (16) whereas the coefficients $c_m$ and $d_m$ are computed from (17) and (18).

4. numerical examples

the presented model for computation of pul internal impedance of tubular conductors was implemented in a fortran program. in order to ascertain the accuracy of the obtained results and the numerical stability of the model itself, a comparison is made with matlab, which is used to compute pul internal impedance using the initial formula (1). both fortran and matlab employ double precision computing. it is important to note here that a program package which can employ more decimal places would give more robust results, but at the expense of execution time. in the numerical example, magnitudes and phase angles of $Z$ for a thin tubular copper conductor (internal radius $r_i$ = 3.8 mm and external radius $r_e$ = 4 mm) are computed. the results of the comparison are presented in table 1 and table 2.

table 1 comparison of magnitudes of tubular cylindrical conductor internal impedance.

α∙re      |Z| (Ω), proposed      |Z| (Ω), matlab
10^-2     0.003643657122067      0.003643657122067
10^-1     0.003643657122745      0.003643657122745
10^0      0.003643663902873      0.003643663902873
10^1      0.003710702668820      0.003710702668820
10^2      0.025181394368712      0.025181394368712
10^3      0.251267138203603      nan
10^5      25.12049572965153      nan
10^10     2512043.292911872      nan
10^15     251204329284.9072      nan

table 2 comparison of phase angles of tubular cylindrical conductor internal impedance.
α∙re      φ (°), proposed        φ (°), matlab
10^-2     9.30814638898·10^-6    9.30814663686·10^-6
10^-1     9.30814668336·10^-4    9.30814668521·10^-4
10^0      9.30813198101·10^-2    9.30813198100·10^-2
10^1      9.164530090507745      9.164530090507741
10^2      44.85885196305934      44.85885196305934
10^3      44.98566888986672      nan
10^5      44.99985675983501      nan
10^10     44.99999999856761      nan
10^15     45.00000000000000      nan

as can be seen from the results in table 1 and table 2, when computing formula (1) using matlab an underflow/overflow stability issue occurs for larger function parameters. these numerical instabilities are a direct consequence of the denominator consisting of a subtraction of two products. when these products become identical up to the last decimal place that the program package can compute, the denominator becomes equal to zero, thus resulting in a not-a-number (nan) value. the proposed numerical procedure successfully circumvents these issues, as can be seen from the results of the analysis.

5. conclusion

in this paper an algorithm for computation of pul internal impedance of cylindrical conductors under large complex function arguments is presented. the high accuracy and stability of the algorithm were achieved by selecting a formula for pul internal impedance which does not lead to undefined values for relatively small function arguments and by scaling the modified bessel functions present in this formula by an appropriate scaling factor. the developed algorithm represents a basis for computation of pul internal impedance of multilayered tubular cylindrical conductors, which is in development.

references

[1] s. vujević, d. lovrić, "on the numerical computation of cylindrical conductor internal impedance for complex arguments of large magnitude", in proceedings of the extended abstracts of the 12th international conference on applied electromagnetics (пес 2015), niš, serbia, 2015, pp. (p1_1) 1-4.
[2] p. sarajčev, s.
vujević, "grounding grid analysis: historical background and classification of methods", international review of electrical engineering, vol. 4, pp. 670-683, 2009.
[3] h. w. dommel, "emtp theory book, 2nd edition", microtran power system analysis corporation, 1992.
[4] f. p. dawalibi, r. d. southey, "analysis of electrical interference from power lines to gas pipelines part i: computation methods", ieee transactions on power delivery, vol. 4, no. 3, pp. 1840-1846, 1989.
[5] j. moore, r. pizer, "moment methods in electromagnetics techniques and applications", john wiley and sons, 2007.
[6] j. a. stratton, "electromagnetic theory", john wiley & sons, 2007.
[7] s. vujević, d. lovrić, v. boras, "high-accurate numerical computation of internal impedance of cylindrical conductors for complex arguments of arbitrary magnitude", ieee transactions on electromagnetic compatibility, vol. 56, pp. 1431-1438, 2014.
[8] m. abramowitz, i. a. stegun, "handbook of mathematical functions with formulas, graphs, and mathematical tables", dover publications, 1964.

instruction facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 297-308 doi: 10.2298/fuee1602297m

centralized detection of pre-alarm state in telephone network of electric power utility

dragan mitić, vladimir matić, aleksandar lebl, mihailo stanić, žarko markov
iritel a.d., belgrade, serbia

abstract. in this paper we consider the mixed telephone network of electric power utility consisting of ip, isdn and power line carrier links. a very important requirement in the network is high availability. the central detector of ip and isdn link failure (pre-alarm) is presented. the detector function is based on the prolonged response time of the network in the case of ip and isdn link failure. we define undesirable events in the detector operation: false pre-alarm and miss detection, and we derive the expressions for their probability calculation.
it is indicated that centralization of this detector is a merit, since it facilitates testing of the whole network from one location.

key words: centralized detector, electric power utility, mixed telephone network, pre-alarm state

1. introduction

the main requirement for the telephone network of electric power utility (epu) is very high availability, and all possible resources are used to achieve it. in order to meet this requirement, different technologies (optical cables, metal cables, radio) and non-hierarchical network architecture (alternate routing) are used in the epu telephone network. the use of different technologies increases the availability [1], [2], but the problem is the conversion of different signalling systems (cas, isdn, ip) and speech signal forms (analog, digital, packet) in signalling and media gateways. in this paper we present how the mixed network of epu, which uses new and old techniques, can (besides the problem of interworking) use the different signalling systems, i.e. the different durations of post-dialling delay, for monitoring the proper operation of the parts of the mixed network. different methods can be used to detect a faulty link in the telephone network of epu. a few approaches based on telephone traffic characteristics are presented in [3-8].

received may 21, 2015; received in revised form august 31, 2015
corresponding author: dragan mitić, iritel a.d., 11080 belgrade, batajnički put 23, serbia (e-mail: mita@iritel.com)

this paper deals with a novel method of finding a faulty isdn or ip link (link of the first choice) by measuring the post-dialling delay to the beginning of the ring-back tone. the method implementation is based on the fact that there are one or more links of the second choice (power line carrier (plc) links) with considerably slower dialling speed than the links of the first choice.
if there is a fault in some part of the network, the slower link will be activated on that part of the network and the dialling speed will be decreased. by proper choice of dialling numbers, it is possible to detect the network section with faulty links of the first choice. the main advantage of the method is that the testing of the whole epu telephone network can be realized from one central place. the testing can be realized manually, without any equipment, only by adequately choosing subscriber dialling numbers, or using relatively simple equipment to generate dialling. the contribution of the paper is that it develops the method for testing and that it calculates the main characteristics of the system: the miss probability and the probability of false pre-alarm.

2. model, designations and assumptions

the mixed network of epu consists of telephone exchanges (te) and transmission systems, which can be ip, isdn and plc systems. the old network was based on plcs. (plc is the technique of creating telephone channels over high voltage power lines. sometimes this transmission is called voice over high voltage power line. plcs exist in the new mixed network in order to increase availability. in [2] plcs are referred to as e&m analog lines.) the main characteristics of plcs in the epu telephone network are the use of slow e&m signalling with pulse digit transfer [9] and lower quality of speech signal transfer. let us consider the connection through the mixed epu telephone network (see fig. 1.a)). from this connection let us consider only two nodes on the connection route (see fig. 1.b) and fig. 1.c)). the offered traffic to the group of links is designated as $A$. the number of channels on the isdn link, or the greatest number of connections using the ip link, is $N$. telephone exchanges $TE_k$ and $TE_{k+1}$ are connected by isdn or ip link and from the earlier network are still connected by plc.
the connections between exchanges $TE_k$ and $TE_{k+1}$ are established by the selection rule (sr) such that isdn channels (see fig. 1.b)) or the ip link (see fig. 1.c)) are selected first, and if they are not available, plc is selected. this sr results from the faster connection establishment and the better speech signal quality when digital connections are used than when plcs are used. (it is clear that selections in different directions on isdn links will be made in such a way that collision probability is minimized.)

normal operation (state) is the state when all links between exchanges are faultless. the alarm state is the state when it is not possible to establish the connection between exchanges $TE_k$ and $TE_{k+1}$ because all links between exchanges are faulty. the pre-alarm state is defined as the state when it is not possible to establish the connection by the route of first choice, i.e. by the isdn or ip link, because these links are faulty. the connection can still be established using plc. it is important that some connections, for example dispatcher connections, can be established in this state.

post-dialling delay (pdd, or post selection delay) is defined as the time interval from the last dialled digit until the start of the called side answer, i.e. until the beginning of the ringing (busy) tone. let us suppose that a 5-digit numbering plan is used in the network and that the transfer of all digits is equally probable (uniform distribution). the aim of this paper is to present the operation of the pre-alarm state detector (pre-alarm being the state when isdn or ip links are faulty). the operation of this detector is based on the difference in pdd values in the case of using digital links and plcs.

fig. 1 model of connection through mixed epu network

the main components of pdd are the time intervals used for processing and sending the information about the dialled number between adjacent network nodes.
that is why it is necessary to know the characteristics of the transfer time intervals between nodes in the case of digital links and plcs.

3. time delay of successful transfer of address information (dialled number) between exchanges

the time of successful transfer of signalling information about the dialled number between the network nodes is the most important component of pdd. this time depends on the signalling type, transmission method, and traffic load of the links and nodes. that is why it is a random variable. in order to satisfy the main request that pdd has a sufficiently short duration, recommendations about the allowed duration of transfer of signalling information (concerning the dialled number) between network nodes have been introduced. these recommendations are different for different techniques.

3.1. isdn technique

recommendations for the greatest allowed time of exchange operation are presented in [10], sections 2.3 (delay probability - non-isdn or mixed (isdn - non-isdn) environment) and 2.4 (delay probability - isdn environment). among all recommended values, we shall select the most stringent ones (the longest time intervals), which deal with isdn technique, the message carrying address information and en-bloc signalling (en-bloc signalling means that signalling transmission on one link starts when complete address information from the previous link is collected in the node preceding the considered link). these greatest allowed time intervals are defined in the following sections of [10]: 2.3.2.3 local exchange call request delay, 2.3.3.2.3 exchange call set-up delay for originating outgoing traffic connections, 2.4.3.1 call set up delay, 2.4.5 incoming call indication sending delay, which recommend that the longest allowed mean time for the activity of one route section and one network node is 600ms (load a) and 800ms (load b).
the longest recommended time for the activity in the case of 95% of connections is 800ms (load a) and 1200ms (load b). the reason for taking the longest time intervals from [10] is that in that case it is most probable to make an error (i.e. to mistake dialling over an isdn or ip link for dialling over a plc link or vice versa), thus making one of two possible false detections in the decision algorithm (false pre-alarm or miss detection). in [11], for the cross-office transfer time of signalling ccs no 7 messages in the most difficult conditions (complex message content, i.e. processing intensive, and load increased by 30%), the longest mean time (450ms) and the longest time for forwarding at least 95% of messages (900ms) are recommended. the probability distribution of the time necessary for forwarding the address information is exponential, and its main component is the waiting time for the (signalling) processor service [12]. in [12] it is indicated that the time of processor service can be constant or exponentially distributed. here we suppose that the signalling processor service time is exponentially distributed. there are two reasons for this: the first one is that the service time of the signalling processor is different for different messages, and the second one is that the results for the exponential distribution are more reliable (conservative, on the safe side). the probability density function of the time needed for the address signalling message transfer across the isdn link is presented by the function f(t) (see fig. 2). it is clear that in this case t is a continuous random variable. the mean value of this time is denoted as tmisdn.

3.2. ip technique

the parts of the telephone network which are realized using ip techniques use sip for connection setup [13].
the address information for connection setup is carried in the message (method) invite, and after sending the message invite, an acknowledgement in the form of one of the provisional or final responses from the groups 1xx or 2xx is expected. the message invite can also be transmitted using an unreliable protocol (udp), and in this case preventive retransmission must be used. in [13] it is stated that the first retransmission is sent after 500ms. let us suppose that in a private network, as is the case of the epu network, the time interval of 500ms is enough to receive the response to 95% of invite requests. we can suppose that for address transmission between two network nodes using ip techniques the same recommendations are valid as for the transmission time across the isdn link. the only (positive) difference is that in this case the time intervals are shorter. as the conclusion, it can be said that the longest allowed time for address information transfer between two network nodes over digital links is that which is valid for 95% of all connections with traffic load b, i.e. 1200ms.

3.3. plc

in this technique the dialled digits are forwarded in pulse form without acknowledgement. that is why we shall consider that address information transfer between exchanges is finished after the selected number is completely transmitted. the time of address information transfer between exchanges in ip or isdn technique depends on processor load and link load (i.e. signalling equipment load), and does not depend on signalling message duration. on the contrary, in the case of plc the time for address information transfer between exchanges depends on signalling information transfer, i.e. on the number of dial pulses. the time for address information transfer using a plc link is a random variable which has discrete values.

example 1: if we use 5-digit numbering plan, i.e.
there are more than 10000 users in the network, then the time for address information transfer can be calculated as $t_{plc} = 4\cdot t_p + 100\cdot n$ (ms), where $t_p$ is the interdigit pause (350ms), and $n$ is the number of dial pulses, n = 5, 6, ..., 49, 50. the probability distribution of this time duration is presented symbolically and denoted as p(t) (see fig. 2). the time distribution p(t) for the five-digit numbering and plc link takes discrete values (see fig. 3) (every fifth value presented bold). it is obvious that this is a discrete random variable and that the probability differs from 0 only for the values t = 1400 + 100·n (ms), where n is an integer, i.e. only for values t = 1900, 2000, 2100, ..., 6300, 6400 (ms).

fig. 2 probability density function (full line) in the case of isdn link and probability distribution (dashed line) of address information transfer time in the case of plc link

fig. 3 probability distribution of the time for digit transfer on plc link for 5-digit numbering plan

the value tmplc is the mean value of the time needed for address information transfer over the plc link (see fig. 2). the main conclusion of this section is that the time of address information transfer between two adjacent nodes of the epu network differs by several seconds between the case of isdn or ip link (tisdn) and the case of plc link (tplc). in the case of the 5-digit numbering plan, the mean value of this difference δtm is about 4s (see fig. 2).

4. basic idea for pre-alarm state detector

the main idea of the detector is that it generates test telephone calls in the network, compares pdd with the usual values and, in the case of a great difference, declares the pre-alarm state. the difference in the time delay of address information transfer on the link which is in pre-alarm state (δt) is carried over to the total pdd time. let us present the main idea of the detector (see fig. 4 and fig. 5). in normal state, i.e.
when all isdn or ip links are correct, these links are used for the whole connection setup (see fig. 4.a). when there is one faulty isdn (ip) link on one section of the call route, it is replaced by plc (see fig. 4.b); the established connection is shown by a bold line. the pdd values differ between a correct and a faulty section on the call route (see fig. 5). the moment at which the test signal is sent is denoted td (see fig. 5). the time interval from sending the signal until the answer is received from the called side, when all isdn and ip links on the route towards the called user are correct (see fig. 4.a), is denoted pdd1 (see fig. 5). the response time from the receiving side when some isdn or ip link is faulty (see fig. 4.b) is denoted pdd2 (see fig. 5). the time interval pddt is chosen in advance as the threshold time value. if pdd > pddt, the pre-alarm state is declared. the guard interval is pddt - pdd1.

fig. 4 basics of pre-alarm state detector

fig. 5 the pdd values in the case of correct and faulty section on the call route

because the detector function is based on the analysis of random variables, two undesired outcomes are possible: the false pre-alarm and the missed detection. a false pre-alarm occurs when all links are correct but the detector declares the pre-alarm state. a detector miss is the reverse situation: a failure exists on an isdn or ip link, but the detector does not detect it. a false pre-alarm is possible under increased traffic load, when all links are correct but the connection is realized over plc. a miss in pre-alarm state detection is possible if the value of pddt is chosen too high.

5. calculation of probability for false pre-alarm and for miss detection

let us consider two network nodes in the epu network (see fig. 1). these two nodes belong to one connection (see fig. 4).
the central detector of pre-alarm state is turned on, and for this case the threshold value for the answer time delay of the called side (pddt) is defined (see fig. 5). the pre-alarm state is declared if pdd > pddt. a false pre-alarm can occur in two cases:
- if the telephone traffic is high and the isdn or ip link is faultless but occupied by previous calls, so that the next call is served by plc;
- if the signalling traffic between the network nodes that form the connection is heavy, so that the time for sending address information becomes too long and the total time until the answer from the called side exceeds the threshold, pdd > pddt.

the probability of a false pre-alarm caused by heavy traffic, i.e. the probability of a false pre-alarm of the first kind, is obviously:
$$P_{FPA1} = B = E(A, N) \tag{1}$$
where $E(A, N)$ is the well-known erlang loss formula for a group of $N$ channels with offered traffic $A$ [14]. the probability of a false pre-alarm caused by excessive signalling traffic (the probability of a false pre-alarm of the second kind) can be calculated as follows. let us consider the distribution of the time for sending address information between network nodes on an isdn or ip link. in the subsection iii.1. it was pointed out that this distribution is negative exponential (see fig. 6).

fig. 6 distribution of the time for address information sending between network nodes

the probability density function of the exponential distribution is
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0 \tag{2}$$
while the cumulative distribution function, i.e. the probability that $t \le x$ (in other words $P(t \le x) = F(x)$), is:
$$F(x) = 1 - e^{-\lambda x} \tag{3}$$
the probability of a false pre-alarm of the second kind (see fig.
6) can be expressed as:
$$P_{FPA2} = P(t > PDD_t) = 1 - F(PDD_t) \tag{4}$$
the total probability of a false pre-alarm is:
$$P_{FPA} = 1 - (1 - P_{FPA1})(1 - P_{FPA2}) \approx P_{FPA1} + P_{FPA2} \tag{5}$$
because $P_{FPA1} \ll 1$ and $P_{FPA2} \ll 1$.

example 2: let us consider the primary group of isdn channels ($N = 30$) with an offered load of 20 e; then $P_{FPA1} = 0.00846$. using the most stringent requirement from the subsection iii.1., that the waiting time for 95% of calls must be less than 1200 ms, we find the value of $\lambda$:
$$F(1200\ \mathrm{ms}) = 1 - e^{-\lambda \cdot 1.2\ \mathrm{s}} = 0.95 \;\Rightarrow\; \lambda \approx 2.5\ \mathrm{s}^{-1} \tag{6}$$
taking the value pddt = 1.5 s, we have $P_{FPA2} = 0.0235$. the total probability of a false pre-alarm in this example is $P_{FPA} = 0.032$.

there is no possibility of a missed detection ($P_{miss} = 0$) if the time threshold (for address information sending when the isdn (ip) links are faultless) can be set below the minimum time of address information sending over plc. in that case the possible time intervals do not overlap: the time for address information sending when all isdn (ip) links are faultless is certainly shorter than the time for address information sending over plc. the situation changes when these time intervals overlap. the probability of a missed detection exists if the time threshold (pddt) is greater than the lower limit of the transmission time of address information over plc (tplcmin), pddt > tplcmin (see fig. 3). in this case, if the value of pdd is tplcmin tplcmin, we can come to the situation when is pmiss > 0.

7. how central detector functions in epu network

the central detector contains the numbering plan of the whole epu network. when all links are correct, it generates test calls (directed towards test ports) and determines the standard value of pdd (pdd1) for each node in the network (see fig. 5).
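the figures quoted in example 2 of section 5 can be checked with a few lines of code. the sketch below is only an illustration, not part of the original method: the function names `erlang_b` and `rate_from_quantile` are hypothetical, the erlang loss value is obtained with the standard recursion, and $\lambda$ is left unrounded, so $P_{FPA2}$ comes out marginally higher than with the rounded value 2.5 s⁻¹.

```python
import math


def erlang_b(offered_traffic: float, channels: int) -> float:
    """Erlang loss probability E(A, N) via the standard recursion on N."""
    blocking = 1.0  # E(A, 0) = 1
    for n in range(1, channels + 1):
        blocking = offered_traffic * blocking / (n + offered_traffic * blocking)
    return blocking


def rate_from_quantile(time_s: float, probability: float) -> float:
    """Solve 1 - exp(-lambda * t) = probability for lambda (1/s), as in eq. (6)."""
    return -math.log(1.0 - probability) / time_s


p_fpa1 = erlang_b(20.0, 30)                    # eq. (1): N = 30 channels, A = 20 E
lam = rate_from_quantile(1.2, 0.95)            # eq. (6): 95% of calls within 1200 ms
p_fpa2 = math.exp(-lam * 1.5)                  # eq. (4) with pddt = 1.5 s
p_fpa = 1.0 - (1.0 - p_fpa1) * (1.0 - p_fpa2)  # eq. (5), exact form

print(f"P_FPA1 = {p_fpa1:.5f}, lambda = {lam:.3f} 1/s")
print(f"P_FPA2 = {p_fpa2:.4f}, P_FPA = {p_fpa:.4f}")
```

the exact form of (5) is used rather than the sum of the two probabilities; for values this small the two agree to within the rounding of the quoted result.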
these data are stored in the detector for comparison with later measured values of pdd. in addition, the threshold pddt is determined for each network node according to the dialled number and the standard value of pdd. testing is performed so that the ports of the farthest nodes are called first. if pdd > pddt for a distant node, it is necessary to determine on which route section the pre-alarm state exists (see fig. 7). let us suppose that we dial the subscriber number of tsd in the exchange ted (far network node) from telephone tsa in the exchange tea, where the centralized detector is situated. the response time differs from the standard value by more than the threshold allows. this means that a pre-alarm state has appeared on one of the route sections tea – teb, teb – tec, tec – ted. standard values of the pdd exist for the connections tsa – tsc and tsa – tsb. the route section on which the pre-alarm state appeared can be found by successively dialling the telephone numbers tsc and tsb. if the fault is on the link between teb and tec, the pdd value when dialling tsc will be greater than the pre-defined threshold, and the pdd value when dialling tsb will be smaller than the pre-defined threshold.

the flow-chart of the detector algorithm is presented in fig. 8. in this flow-chart m is the number of directions to be tested, i is the direction currently being tested, ni is the number of nodes in direction i, and j is the node currently being tested in direction i. dni,j is the test dial number in the node determined by i and j. as already pointed out, testing of direction i starts from the last node in the direction (j = ni). the test number is dialled (dnlast i,j) and, if pdd is less than pddt for node i,j (pddti,j), testing is finished for direction i. if it is not the last direction to be tested (i < m), testing continues with the next direction (i = i + 1).
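the section-by-section localisation described above for the route tea – teb – tec – ted can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' flow-chart: the function name `locate_prealarm_section`, the callback `measure_pdd` and the numeric values are hypothetical, and a single faulty section per route is assumed.

```python
from typing import Callable, Mapping, Optional, Sequence, Tuple


def locate_prealarm_section(
    route: Sequence[str],
    measure_pdd: Callable[[str], float],
    thresholds: Mapping[str, float],
) -> Optional[Tuple[str, str]]:
    """Return the (near, far) exchanges bounding the suspect section,
    or None when the farthest node answers within its threshold.

    route lists the exchanges from the detector outwards; route[0]
    hosts the detector and is never dialled.
    """
    far = route[-1]
    # The farthest node is dialled first; a normal pdd clears the whole route.
    if measure_pdd(far) <= thresholds[far]:
        return None
    # Otherwise dial successively nearer nodes until one answers in time.
    for near in reversed(route[1:-1]):
        if measure_pdd(near) <= thresholds[near]:
            return (near, far)
        far = near
    # Even the nearest remote node is slow: the first section is suspect.
    return (route[0], far)


# Hypothetical thresholds and measurements in seconds (fault on teb - tec).
route = ["tea", "teb", "tec", "ted"]
thresholds = {"teb": 1.5, "tec": 2.0, "ted": 2.5}
simulated_pdd = {"teb": 0.9, "tec": 5.1, "ted": 5.6}
section = locate_prealarm_section(route, simulated_pdd.__getitem__, thresholds)
print(section)
```

in a real detector `measure_pdd` would place a test call to the test port of the given exchange and time the answer; here a dictionary lookup stands in for it.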
once all directions have been tested, testing starts again from the first direction. in the case that test pdd> i0, yields:
$$\ln(I) = \ln(I_0) + \frac{V}{n\,v_{th}} . \tag{3}$$
therefore, the plot of ln(i) vs v is a straight line whose slope, $1/(n v_{th})$, and intercept at $V = 0$, $\ln(I_0)$, yield at room temperature $n = 1.03$ and $I_0 = 0.55$ nA, respectively.

fig. 2 measurement (symbols) and simulation (lines) of a silicon diode i-v characteristics at room temperature in linear and logarithmic scales. simulations are done using (3), which assumes $I \gg I_0$ and results in a straight line in logarithmic scale

2.2. single-exponential diode model with series resistance

figure 3 presents the lumped parameter equivalent circuit model of a diode with parasitic series resistance. because of the parasitic series resistance $R_s$, the terminal current of this equivalent circuit is described by an implicit equation:
$$I = I_0\left[\exp\left(\frac{V - R_s I}{n\,v_{th}}\right) - 1\right] . \tag{4}$$

60 a. ortiz-conde, f.j. garcía-sánchez, j. muci, a. sucre-gonzález

fig. 3 diode equivalent circuit with a parasitic series resistance

the terminal voltage can be solved from the previous equation as an explicit function of the terminal current:
$$V = R_s I + n\,v_{th} \ln\left(1 + \frac{I}{I_0}\right) . \tag{5}$$
the implicit terminal current equation (4) can be solved explicitly in terms of the terminal voltage by means of the special lambert w function [29], [30]:
$$I = \frac{n\,v_{th}}{R_s}\, W_0\left[\frac{I_0 R_s}{n\,v_{th}} \exp\left(\frac{V + I_0 R_s}{n\,v_{th}}\right)\right] - I_0 , \tag{6}$$
where $W_0$ represents the principal branch of the lambert w function [31], a special function defined as the solution to the equation $W(x)\exp(W(x)) = x$. the lambert w function has already proved its usefulness in numerous physics applications [32], [33]. figure 4 presents aim-spice [34] simulations of a diode with several values of the series resistance.
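as an illustration of the explicit solution (6), the sketch below evaluates the principal branch of the lambert w function with a plain halley iteration and checks the result against the implicit equation (4) and the explicit voltage (5). it is a minimal sketch, not the aim-spice model: the function names are hypothetical, the iteration is only intended for non-negative arguments, and the parameter values ($I_0$ = 0.58 nA, $n$ = 1.05, $R_s$ = 33.4 Ω, $v_{th}$ = 0.0259 V) are taken from the silicon diode example quoted in this review.

```python
import math


def lambert_w0(x: float, tol: float = 1e-14, max_iter: int = 100) -> float:
    """Principal branch W0(x) for x >= 0, by Halley iteration on w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    if x == 0.0:
        return 0.0
    w = math.log1p(x)  # starting guess; W0(x) <= ln(1 + x) for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - x
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def diode_current(v: float, i0: float, n: float, rs: float, vth: float = 0.0259) -> float:
    """Terminal current from eq. (6) for a diode with series resistance rs > 0."""
    nv = n * vth
    arg = (i0 * rs / nv) * math.exp((v + i0 * rs) / nv)
    return (nv / rs) * lambert_w0(arg) - i0


I0, N, RS, VTH = 0.58e-9, 1.05, 33.4, 0.0259
for v in (0.2, 0.5, 0.8):
    i = diode_current(v, I0, N, RS, VTH)
    v_back = RS * i + N * VTH * math.log1p(i / I0)  # eq. (5)
    print(f"V = {v:.1f} V, I = {i:.4e} A, V from (5) = {v_back:.6f} V")
```

for very large forward voltages the exponential in the argument of $W_0$ overflows in double precision; a production implementation would evaluate the function in logarithmic form, which is beyond this sketch.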
it is important to observe that the effect of $R_s$ is significant in the high-voltage region and that the region where ln(i) is proportional to v shrinks as $R_s$ increases.

2.2.1. vertical and lateral optimization methods

the three parameters ($n$, $I_0$ and $R_s$) that fully describe the diode in terms of this lumped parameter equivalent circuit model can be extracted by fitting the device's measured data to any of the model's defining equations. equations (4), (5) or (6) can be applied directly for fitting. vertical or lateral optimization can be used for fitting by minimizing either the current quadratic error or the voltage quadratic error, respectively. in the present case, equation (5) combined with lateral optimization [25], [35] is the most convenient computationally, since this equation is not implicit, as (4) is, and does not contain special functions, as (6) does. figure 5 presents measurements of a silicon diode from motorola [25] and the simulated i-v characteristic, in linear and logarithmic scales, using the parameters extracted by lateral optimization [25].

a review of parameter extraction in diodes and solar cells... 61

fig. 4 aim-spice simulations of a diode with several values of series resistance

fig. 5 measured i-v characteristics of a silicon diode and its simulation using the parameter values extracted by lateral optimization [25]

2.2.2. integration method to extract series resistance and ideality factor

following the idea of araujo and sánchez about the use of integration for parameter extraction [36], the diode current may be integrated by parts in combination with (5):
$$\int_0^V I\,dV = V I - \int_0^I V\,dI = \frac{R_s}{2} I^2 + n\,v_{th}\, I - I_0 (V - R_s I) . \tag{7}$$
assuming that $I \gg I_0$, the last term in the above equation can be neglected and we obtain [37], [38]:
$$\int_0^V I\,dV \approx \frac{R_s}{2} I^2 + n\,v_{th}\, I . \tag{8}$$
therefore, the numerical integral of the measured current with respect to voltage is represented by an explicit quadratic function of $I$, which requires a much simpler fitting procedure than the original implicit equation. kaminski et al later generalized this method [39] by allowing an arbitrary lower integration limit $(V_i, I_i)$ instead of the origin, so that (8) becomes:
$$\frac{1}{I - I_i}\int_{V_i}^{V} I\,dV = \frac{R_s}{2}(I + I_i) + n\,v_{th} . \tag{9}$$

2.2.3. the integral difference function concept and the g method

we proposed a different approach [22], [23] that does not start with the extraction of the parasitic series resistance value. instead, it does just the opposite. the proposed method is based on calculating an auxiliary function, or rather an operator, whose purpose is to eliminate the effect of the parasitic series resistance, retaining only the intrinsic model parameters. this new function was originally called "integral difference function"; it is denoted "function d" and is defined as:
$$D(V, I) \equiv \int_0^I V\,dI - \int_0^V I\,dV = I V - 2\int_0^V I\,dV = 2\int_0^I V\,dI - I V , \tag{10}$$
where $D$ has units of "power." the integrals with respect to $I$ and $V$ are the device's "content" and "co-content", respectively, as shown in fig. 6. for simplicity's sake and without loss of generality, the lower limit of integration in (10) is taken at the origin, but it may equally be placed at any arbitrary point of interest along the device's characteristics. notice that adding the content and co-content, instead of subtracting them as in (10), yields the device's total power. it can be proved that in any given lumped parameter equivalent circuit model only nonlinear branches produce non-zero terms, and thus they are the only elements that contribute to the total $D$ seen at the terminals. this property embodies the essence of function d's ability to eliminate parasitic resistances (linear elements) from device models.

fig. 6 schematic illustration of the content (c) and co-content (cc) of a simple case of nonlinear function

it is important to point out that function d may be understood as a representation or measure of the device's amount of nonlinearity, which for a linear element is obviously equal to zero. this description of function d in terms of linearity led us to refer to it as the "integral non linearity function" (inlf) [40], [41], and to use it to quantify the non-linear behavior of devices and circuits in terms of distortion. applying function d to the case of a single-exponential diode model with series resistance, and restricting the analysis to the region of the measured forward characteristics where $I \gg I_0$, the substitution of (5) into (10) yields [22], [23]:
$$D = I\, n\,v_{th}\left[\ln\left(\frac{I}{I_0}\right) - 2\right] , \tag{11}$$
which does not contain $R_s$. dividing this equation by the current yields an auxiliary function, which we call $G$, defined by:
$$G \equiv \frac{D}{I} = n\,v_{th}\left[\ln(I) - \ln(I_0) - 2\right] . \tag{12}$$
since this function $G$ is calculated from function $D$, it requires a numerical integration of the experimental data. when $G$ is plotted against $\ln(I)$, according to (12) the resulting curve is a straight line whose intercept and slope allow the immediate extraction of the values of $I_0$ and $n$, respectively, as shown in fig. 7. the extracted values of $n = 1.03$ and $I_0 = 0.55$ nA are very close to those previously obtained by lateral optimization.

fig. 7 function g as a function of the logarithm of the current calculated from the measured i-v characteristics of a silicon diode (symbols) and a linear fit of its quasi linear portion (solid line)

2.2.4. norde's method

this method [42] contains clever mathematical ideas and was developed for schottky diodes with $n = 1$. the following notation is adapted to conventional p-n junctions. norde defined the following function, which we denominate by his name:
$$\mathrm{Norde} \equiv \frac{V}{2} - v_{th} \ln\left(\frac{I}{I_x}\right) , \tag{13}$$
where $I_x$ represents an arbitrary value of the current. norde's function has a minimum at $(V_{min}, I_{min})$ whose location is independent of the selected value of $I_x$. the location of this minimum is obtained by differentiating the above equation and equating it to zero:
$$\frac{d\,\mathrm{Norde}}{dV} = \frac{1}{2} - \frac{v_{th}}{I}\frac{dI}{dV} = 0 . \tag{14}$$
the derivative of $V$ with respect to $I$ is obtained from (5), and using $n = 1$ yields:
$$\frac{dV}{dI} = R_s + \frac{v_{th}}{I} . \tag{15}$$
combining and solving the two previous equations at $I = I_{min}$ yields the series resistance:
$$R_s = \frac{v_{th}}{I_{min}} . \tag{16}$$
using (4) with $n = 1$, the reverse current parameter $I_0$ is obtained:
$$I_0 = \frac{I_{min}}{\exp\left(\dfrac{V_{min} - I_{min} R_s}{v_{th}}\right) - 1} , \tag{17}$$
where $V_{min}$ is the value of the voltage at the minimum of norde's function. there are two main disadvantages of norde's method: 1) the ideality factor $n$ needs to be assumed equal to unity, and 2) the parameters are extracted from only a few data points near the minimum of norde's function. nevertheless, this is a clever transition-type extraction method, which extracts the parameters from a region where both the diode and the resistance effects are significant.

to test norde's method, we will use the same previous experimental data [25], whose parameters previously were $I_0 = 0.580$ nA, $n = 1.05$ and $R_s = 33.4$ Ω. since $n = 1.05$ and for the present method it should be unity, we will let $v_{th} = 1.05 \times 0.0259$ V. the extracted values are $R_s = 40$ Ω and $I_0 = 0.76$ nA for the three selected values of $I_x$, as illustrated in figure 8. figure 9 presents measured and simulated i-v characteristics, using the parameters extracted by norde's method. we observe that the simulations agree very well with the experimental data for values close to $V_{min} = 0.4$ V.

fig.
8 norde's function as a function of the voltage calculated from the i-v characteristics of the silicon diode for three values of ix, showing the minimum that defines the value of the series resistance

fig. 9 measured (symbols) and simulated (solid lines) i-v characteristics of the silicon diode using the parameters extracted by norde's method

one of the limitations of norde's method, namely that of having to fix $n = 1$, has been removed by various authors [43]-[45]. for example, the following generalized norde's function has been proposed [43], [44]:
$$\mathrm{Norde} \equiv \frac{V}{\gamma} - v_{th} \ln\left(\frac{I}{I_x}\right) , \tag{18}$$
where $\gamma$ is a new parameter, which for the particular case of $\gamma = 2$ yields the original norde's equation. this function has a minimum at $(V_{min}, I_{min})$ whose location is also independent of the selected value of $I_x$. the location of the minimum is obtained as before by differentiating (18) and equating it to zero:
$$\frac{d\,\mathrm{Norde}}{dV} = \frac{1}{\gamma} - \frac{v_{th}}{I}\frac{dI}{dV} = 0 . \tag{19}$$
the derivative of $V$ with respect to $I$ is obtained from (5):
$$\frac{dV}{dI} = R_s + \frac{n\,v_{th}}{I} . \tag{20}$$
combining and solving the two previous equations at $I = I_{min}$ yields:
$$R_s = \frac{(\gamma - n)\,v_{th}}{I_{min}} . \tag{21}$$
using (4), the reverse current parameter $I_0$ is obtained:
$$I_0 = \frac{I_{min}}{\exp\left(\dfrac{V_{min} - I_{min} R_s}{n\,v_{th}}\right) - 1} , \tag{22}$$
where $V_{min}$ is the value of the voltage at the minimum of norde's function. because there are only two equations available, (21) and (22), and we need to extract three parameters ($n$, $I_0$ and $R_s$), at least two norde's plots with different values of $\gamma$ are needed.

it is interesting to compare the generalized norde's function with the previous g function. if we make $\gamma$ tend to infinity and let $I_x = I_0$, the generalized norde's function is closely related to the g function by:
$$G \approx -n\,\mathrm{Norde}\big|_{I_x = I_0} ; \tag{23}$$
according to (12), the two sides differ only by the constant $-2 n\,v_{th}$.

2.2.5. cheung's method

cheung et al [45] proposed the following procedure to extract the ideality factor and the series resistance.
using the identity:
$$\frac{dV}{d\ln(I)} = I\,\frac{dV}{dI} = \frac{I}{dI/dV} \tag{24}$$
in combination with equation (20) yields:
$$\frac{I}{dI/dV} = R_s I + n\,v_{th} . \tag{25}$$
therefore, when the ratio of the current to the conductance, $I/(dI/dV)$, is plotted against the current it should produce a straight line, as shown in figure 10, whose slope yields the series resistance and whose intercept is $n\,v_{th}$, implying in the example shown that $R_s = 33.3$ Ω and $n = 1.17$.

fig. 10 ratio of the current to the conductance ( i/(di/dv) ) as a function of the current showing a straight line behaviour

cheung et al proposed the following variation of norde's function [45]:
$$\mathrm{Cheung} \equiv V - n\,v_{th} \ln\left(\frac{I}{I_x}\right) , \tag{26}$$
where $I_x$ is an arbitrary value of the current. rewriting (5) with the assumption $I \gg I_0$ yields:
$$V = R_s I + n\,v_{th} \ln\left(\frac{I}{I_0}\right) . \tag{27}$$
combining the two previous equations yields:
$$\mathrm{Cheung} = R_s I + n\,v_{th} \ln\left(\frac{I_x}{I_0}\right) . \tag{28}$$
therefore, when cheung's function is plotted against the current it should produce a straight line, as shown in figure 11, whose slope yields the series resistance and whose intercept yields the reverse current, implying in the present example that $R_s = 33.4$ Ω and $I_0 = 2.6$ nA.

fig. 11 cheung's function as a function of the current showing a straight line behaviour

2.3. single-exponential diode model with series and parallel resistances

figure 12 presents the lumped parameter equivalent circuit model of a diode with a series parasitic resistance and two parallel parasitic conductances, one at the junction ($G_{p1}$) and the other at the periphery ($G_{p2}$). the mathematical description of the terminal current of this equivalent circuit is given by the implicit equation:
$$I = I_0\left[\exp\left(\frac{V(1 + R_s G_{p2}) - I R_s}{n\,v_{th}}\right) - 1\right] + (V - I R_s)\,G_{p1} + V G_{p2}\,(1 + R_s G_{p1}) . \tag{29}$$
the above equation has the following solution for the terminal current as a function of the terminal voltage [46], [47]:
$$I = \frac{n\,v_{th}}{R_s}\, W_0\left[\frac{I_0 R_s}{n\,v_{th}(1 + R_s G_{p1})} \exp\left(\frac{V + I_0 R_s}{n\,v_{th}(1 + R_s G_{p1})}\right)\right] + V G_{p2} + \frac{V G_{p1} - I_0}{1 + R_s G_{p1}} , \tag{30}$$
and for the terminal voltage as a function of the terminal current the solution is:
$$V = -n\,v_{th}\, d_2\, W_0\left[\frac{I_0\, r_{12}}{n\,v_{th}\, d_2} \exp\left(\frac{r_{12}\,(I d_2 + I_0)}{n\,v_{th}\, d_2}\right)\right] + I d_2\,(R_s + r_{12}) + I_0\, r_{12} , \tag{31}$$
where $W_0$ represents the principal branch of the lambert w function, and
$$d_1 \equiv \frac{1}{1 + R_s G_{p1}} , \tag{32}$$
$$d_2 \equiv \frac{1}{1 + R_s G_{p2}} , \tag{33}$$
and
$$r_{12} \equiv \frac{1}{G_{p1} + G_{p2} + G_{p1} G_{p2} R_s} . \tag{34}$$

fig. 12 lumped parameter equivalent circuit model with a parasitic series resistance and two parallel parasitic conductances, representing two possible shunt current losses, one at the junction (gp1) and another at the device's periphery (gp2)

2.3.1. bidimensional fit of function d

this integration-based procedure, developed in 2005 [47], can be summarized as follows. first, for convenience, function d in (10) is rewritten as:
$$D(V, I) = 2\int_0^I V\,dI - I V . \tag{35}$$
secondly, the terminal voltage given by (31) and its integral with respect to $I$ are substituted into (35), which results in a long expression that contains lambert w functions and the variables $V$ and $I$.
thirdly, substituting all the terms that contain lambert w functions using equation (31), and after some algebraic manipulations, we arrive at a form of function $D(I, V)$ that is conveniently expressed as the following purely algebraic bivariate equation:
$$D(I, V) = D_{V1} V + D_{I1} I + D_{V1I1} V I + D_{V2} V^2 + D_{I2} I^2 , \tag{36}$$
where the five coefficients are given by:
$$D_{I1} = -2 R_s I_0 - 2 n\,v_{th}\,(1 + G_{p1} R_s) , \tag{37}$$
$$D_{V1} = 2 R_s\, n\,v_{th}\, G_{p1} G_{p2} + 2 I_0 (R_s G_{p2} + 1) + 2 n\,v_{th}\,(G_{p1} + G_{p2}) , \tag{38}$$
$$D_{I2} = -R_s (1 + G_{p1} R_s) , \tag{39}$$
$$D_{V2} = -G_{p2} - G_{p1} - R_s^2 G_{p1} G_{p2}^2 - R_s G_{p2}^2 - 2 R_s G_{p1} G_{p2} , \tag{40}$$
and the fifth coefficient is dependent upon the others:
$$D_{I1V1} = \sqrt{1 + 4 D_{I2} D_{V2}} . \tag{41}$$
as can be seen, there are actually four independent coefficients, (37)-(40), and therefore only four unknowns may be extracted uniquely. the general solution for $n$, $I_0$, $G_{p1}$ and $G_{p2}$ in terms of $R_s$, $D_{I1}$, $D_{V1}$, $D_{I2}$ and $D_{V2}$ is:
$$G_{p1} = -\frac{D_{I2} + R_s}{R_s^2} , \tag{42}$$
$$n = -\frac{D_{V1} R_s + D_{I1} + D_{I1} R_s G_{p2}}{2\,v_{th}} , \tag{43}$$
$$G_{p2} = \frac{-2 R_s G_{p1} - 1 + \sqrt{1 - 4 D_{V2} R_s - 4 R_s^2 D_{V2} G_{p1}}}{2 R_s (1 + R_s G_{p1})} , \tag{44}$$
and
$$I_0 = \frac{D_{V1} + G_{p1} D_{I1} + G_{p2} D_{I1} + G_{p1} R_s D_{V1} + G_{p1} G_{p2} R_s D_{I1}}{2} . \tag{45}$$
it is important to notice that a set of values of $D_{I1}$, $D_{V1}$, $D_{I2}$ and $D_{V2}$ defines a unique i-v characteristic, which can be generated with various combinations of $R_s$, $n$, $I_0$, $G_{p1}$ and $G_{p2}$. particular cases, which do not simultaneously include both conductances $G_{p1}$ and $G_{p2}$, have specific solutions, as presented in table 1. the parameter extraction procedure consists of fitting the algebraic equation (36) to the $D(I, V)$ function as numerically calculated from the experimental data with (35). this bidimensional (bivariate) fitting process produces the values of the equation coefficients $D_{V1}$, $D_{I1}$, $D_{V2}$, $D_{I2}$.
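a quick way to gain confidence in the algebra of (37)-(45) is a round trip: compute the coefficients from a known parameter set and recover the parameters from them. the sketch below does only that; it does not perform the bivariate fit itself. the function names and the numerical values are hypothetical, and $R_s$ is assumed known, as in the general solution above.

```python
import math


def d_coefficients(rs, n, i0, gp1, gp2, vth=0.0259):
    """Coefficients of eq. (36) from the model parameters, eqs. (37)-(41)."""
    nv = n * vth
    d_i1 = -2.0 * rs * i0 - 2.0 * nv * (1.0 + gp1 * rs)                      # (37)
    d_v1 = (2.0 * rs * nv * gp1 * gp2 + 2.0 * i0 * (rs * gp2 + 1.0)
            + 2.0 * nv * (gp1 + gp2))                                         # (38)
    d_i2 = -rs * (1.0 + gp1 * rs)                                             # (39)
    d_v2 = (-gp2 - gp1 - rs ** 2 * gp1 * gp2 ** 2 - rs * gp2 ** 2
            - 2.0 * rs * gp1 * gp2)                                           # (40)
    d_i1v1 = math.sqrt(1.0 + 4.0 * d_i2 * d_v2)                               # (41)
    return d_v1, d_i1, d_i1v1, d_v2, d_i2


def parameters_from_coefficients(rs, d_v1, d_i1, d_v2, d_i2, vth=0.0259):
    """Recover n, I0, Gp1 and Gp2 for a known rs, eqs. (42)-(45)."""
    gp1 = -(d_i2 + rs) / rs ** 2                                              # (42)
    root = math.sqrt(1.0 - 4.0 * d_v2 * rs - 4.0 * rs ** 2 * d_v2 * gp1)
    gp2 = (root - 1.0 - 2.0 * rs * gp1) / (2.0 * rs * (1.0 + rs * gp1))       # (44)
    n = -(d_v1 * rs + d_i1 + d_i1 * rs * gp2) / (2.0 * vth)                   # (43)
    i0 = 0.5 * (d_v1 + gp1 * d_i1 + gp2 * d_i1
                + gp1 * rs * d_v1 + gp1 * gp2 * rs * d_i1)                    # (45)
    return n, i0, gp1, gp2


# Hypothetical test values: Rs = 10 ohm, n = 1.5, I0 = 1 nA, Gp1 = 0.1 mS, Gp2 = 0.2 mS.
RS, N, I0, GP1, GP2 = 10.0, 1.5, 1e-9, 1e-4, 2e-4
d_v1, d_i1, d_i1v1, d_v2, d_i2 = d_coefficients(RS, N, I0, GP1, GP2)
n_out, i0_out, gp1_out, gp2_out = parameters_from_coefficients(RS, d_v1, d_i1, d_v2, d_i2)
print(n_out, i0_out, gp1_out, gp2_out)
```

note that $I_0$ is recovered as a small difference of much larger terms, so in practice its accuracy depends strongly on how well the coefficients are determined by the fit.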
these resulting values are then used to calculate the diode model parameters ($G_{p1}$ or $G_{p2}$, $R_s$, $n$ and $I_0$), as presented in table 1 for the particular cases.

to illustrate this extraction method, it was applied to simulated i-v characteristics for the case of series resistance and only peripheral shunt loss, using parameter values of $I_0$ = 1 pA, $n$ = 1.5, $G_{p1}$ = 0 and various combinations of $R_s$ and $G_{p2}$, as shown in figure 13. the symbols used in this figure are not data points but serve to identify the several cases. the ideal case of $R_s$ = 0 and $G_{p2}$ = 0, identified by large hollow squares, is a straight line. the case when only $R_s$ is significant ($R_s$ = 1 kΩ and $G_{p2}$ = 0), identified by small solid squares, produces a straight line at low voltage that bends down at high voltage (i.e. the effects of $R_s$ become important at high voltage). the case when only $G_{p2}$ is significant ($G_{p2}$ = 1 µS and $R_s$ = 0) is identified by small solid circles; it is a straight line at high voltage and bends up at low voltage (i.e. the effects of $G_{p2}$ are important at low voltage). the case when $R_s$ and $G_{p2}$ are both significant ($R_s$ = 1 kΩ and $G_{p2}$ = 1 µS) is identified by large hollow circles. it is important to notice that the plot in this extreme case does not exhibit any region from which the intrinsic parameters could be obtained, because the overlapping effects of $R_s$ and $G_{p2}$ totally conceal the intrinsic characteristics everywhere. consequently, the intrinsic parameters of this extreme case could not be directly extracted by any traditional method from any portion of its i-v characteristics.

table 1 particular cases of a single-exponential diode model
quantity | rs only (gp1 = gp2 = 0) | gp1 only (rs = 0, gp2 = 0) | rs and gp1 (gp2 = 0) | rs and gp2 (gp1 = 0)
$D_{V1}$ | $2 I_0$ | $2 I_0 + 2 n v_{th} G_{p1}$ | $2 I_0 + 2 n v_{th} G_{p1}$ | $2 I_0 (R_s G_{p2} + 1) + 2 n v_{th} G_{p2}$
$D_{I1}$ | $-2 R_s I_0 - 2 n v_{th}$ | $-2 n v_{th}$ | $-2 R_s I_0 - 2 n v_{th} (1 + G_{p1} R_s)$ | $-2 R_s I_0 - 2 n v_{th}$
$D_{V2}$ | $0$ | $-G_{p1}$ | $-G_{p1}$ | $-G_{p2} (1 + R_s G_{p2})$
$D_{I2}$ | $-R_s$ | $0$ | $-R_s (1 + G_{p1} R_s)$ | $-R_s$
$G_{p1}$ | $0$ | $-D_{V2}$ | $-D_{V2}$ | $0$
$G_{p2}$ | $0$ | $0$ | $0$ | $\dfrac{-1 + \sqrt{1 - 4 R_s D_{V2}}}{2 R_s}$
$R_s$ | $-D_{I2}$ | $0$ | $\dfrac{-1 + \sqrt{1 - 4 G_{p1} D_{I2}}}{2 G_{p1}}$ | $-D_{I2}$
$n$ | $-\dfrac{D_{I1} + 2 R_s I_0}{2 v_{th}}$ | $-\dfrac{D_{I1}}{2 v_{th}}$ | $-\dfrac{D_{I1} + D_{V1} R_s}{2 v_{th}}$ | $-\dfrac{2 R_s I_0 + D_{I1}}{2 v_{th}}$
$I_0$ | $\dfrac{D_{V1}}{2}$ | $\dfrac{D_{V1} - 2 n v_{th} G_{p1}}{2}$ | $\dfrac{D_{V1} - 2 n v_{th} G_{p1}}{2}$ | $\dfrac{G_{p2} D_{I1} + D_{V1}}{2}$

the previously described combinations, as well as several additional cases, were simulated, and the quadratic equation of $D$ as a function of current and voltage, defined in (36), was then used to extract the simulated parameters. in all cases the extraction procedure succeeded in producing the exact original parameters, within computational accuracy. this means that the errors between the original and the extracted parameters depend only on the computational precision and accuracy of the fitting algorithms used. it must be pointed out that, in order to obtain reasonably accurate results, it is advisable that measurements use as small a voltage step as possible (typically at most 10 mV). additionally, it is of paramount importance to use a suitable algorithm for numerical integration, that is, one that will not introduce significant error, such as a closed newton-cotes formula with 7 points, as illustrated in the appendix of [41].

fig. 13 illustrative synthetic i-v characteristics for various cases with several series resistance values and several peripheral shunt loss values. symbols are used to identify the several cases and do not represent data points

2.3.2. iterative g function method

for the particular case of $G_{p1} = 0$, an iterative procedure was proposed in 2000 [48], which is based on the g function described in section 2.2.3.
by estimating the value of $G_{p2}$ (denoted $G_{p2e}$) we can calculate the current in the diode branch:
$$I_D = I - G_{p2e} V . \tag{46}$$
then, function $G$ is calculated from the measured i-v data and is plotted as a function of $\ln(I_D)$ for different estimated values of $G_{p2e}$. selecting the plot that best fits a straight line determines the correct value, $G_{p2e} = G_{p2}$. to illustrate the approach, we use simulated data with parameter values $I_0$ = 1 pA, $n$ = 1.5, $R_s$ = 1 kΩ and $G_{p2}$ = 1 µS. figure 14 presents several plots of the calculated function $G$, using the $I_D$ defined in eq. (46), for several estimated values of $G_{p2e}$. the best straight line of the function $G$ with respect to $\ln(I_D)$ defines the correct value of $G_{p2e}$.

fig. 14 function g vs the logarithm of the id estimated using (46). the plots tend to a straight line (solid line) when the estimated value of gp2e approaches the actual value of gp2 = 1 µS

3. multiple-exponential diode model

when modeling real junctions, a single-exponential equation is usually not enough to adequately represent the several conduction phenomena that frequently make relevant contributions to the total current of a particular junction. in such cases junctions need to be represented by lumped multi-diode equivalent circuits.

3.1. double-exponential diode model with series resistance

the first single-exponential model for a p-n junction with a unity ideality factor and a series resistance was proposed by shockley in 1949 [15]. in 1957, sah et al [18] presented the first double-exponential model for a p-n junction with series parasitic resistance and diode quality factors of $n_2 = 2 n_1$ and $n_1 = 1$. the lumped parameter equivalent circuit is illustrated in fig. 15.
the mathematical description of this circuit is given by the following implicit equation:
$$I = I_{01}\left[\exp\left(\frac{V - I R_s}{v_{th}}\right) - 1\right] + I_{02}\left[\exp\left(\frac{V - I R_s}{2\,v_{th}}\right) - 1\right] . \tag{47}$$
the above implicit equation does not have an explicit solution for the terminal current, but it does have a solution for the terminal voltage as an explicit function of the terminal current [49]:
$$V = R_s I + 2\,v_{th} \ln\left[\sqrt{\left(\frac{I_{02}}{2 I_{01}}\right)^2 + \frac{I + I_{02}}{I_{01}} + 1} - \frac{I_{02}}{2 I_{01}}\right] . \tag{48}$$
a global lateral fitting procedure based on (48) has been proposed to directly extract the diode's model parameters [49]. figure 16 presents the i-v characteristics of an experimental silicon pin lateral diode fabricated at the université catholique de louvain [49], measured at two temperatures. the model playback i-v characteristics calculated using the parameter values extracted with this global lateral fitting procedure are also shown in fig. 16.

fig. 15 a double-exponential model with series resistance

fig. 16 measured and simulated i-v characteristics of an experimental silicon lateral pin diode at two temperatures. the playback is calculated using the double-exponential model, with diode quality factors of n2=2 and n1=1, and the rest of the parameter values extracted by a direct global lateral fitting of (48) to the data

it is important to point out that this lateral fitting procedure may be used in general when the value of one diode quality factor can be assumed to be roughly twice the value of the other ($n_2 \approx 2 n_1$), even if $n_1 \neq 1$. it is also worth mentioning here that a double-exponential model parameter extraction method, based on area error minimization between measured and modeled i-v characteristics, was recently proposed by yadir et al [50]. the essence of that method is closely related to integration-based extraction methods [23], [24].

3.2. functions a and b

another possible situation worth considering is represented by a double-exponential model where the values of the ideality factors are arbitrary, and all series resistances and shunt conductances are negligible, as illustrated by the equivalent circuit shown in fig. 17. the mathematical description of the terminal current of such a circuit is given by the following explicit function of the terminal voltage:
$$I = I_{01}\left[\exp\left(\frac{V}{n_1 v_{th}}\right) - 1\right] + I_{02}\left[\exp\left(\frac{V}{n_2 v_{th}}\right) - 1\right] . \tag{49}$$
if we additionally assume that diode 2 ($n_2$, $I_{02}$) is dominant at low voltage, the current in that region may be approximated by:
$$I \approx I_{02}\left[\exp\left(\frac{V}{n_2 v_{th}}\right) - 1\right] . \tag{50}$$

fig. 17 ideal double-exponential model with arbitrary ideality factors and without parasitic resistances

substituting (50) into the following operators $A$ and $B$ yields [51]:
$$A \equiv \frac{CC}{I} = \frac{\int_0^V I\,dV}{I} = n_2 v_{th} - I_{02}\,\frac{V}{I} \tag{51}$$
and
$$B \equiv \frac{CC}{V} = \frac{\int_0^V I\,dV}{V} = n_2 v_{th}\,\frac{I}{V} - I_{02} . \tag{52}$$
therefore, the application of either one of these two operators, (51) or (52), to measured i-v characteristics produces linear equations in the ratio $V/I$ or in its reciprocal $I/V$, from whose slopes and intercepts the values of the ideality factor $n_2$ and the reverse saturation current $I_{02}$ may be directly extracted.

figure 18 presents measurements of the base current as a function of forward base-emitter voltage of a power bjt measured at $T$ = 298 K with $V_{BC}$ = 0. figure 19 shows plots of operators $A$ and $B$ applied to these measurements. the slope of $A$ gives an extracted value of $I_{02}$ = 215 pA, and its ordinate-axis intercept gives an extracted value of $n_2$ = 2. the slope of function $B$ gives an extracted value of $n_2$ = 1.98, and its ordinate-axis intercept gives an extracted value of $I_{02}$ = 210 pA. it is worth mentioning that a vertical optimization method could also be used for this case, as illustrated in fig. 20.
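the use of operators (51) and (52) can be illustrated on synthetic data. the sketch below is only an illustration under stated assumptions: it generates a single-exponential characteristic from (50) with hypothetical values $n_2$ = 2 and $I_{02}$ = 210 pA, takes $v_{th}$ = 0.02568 V as the thermal voltage at 298 K, integrates the co-content with the trapezoidal rule on a 10 mV grid, and fits straight lines by ordinary least squares. the helper names are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def co_content(voltages, currents):
    """Running trapezoidal integral of I dV, starting at the first sample."""
    cc = [0.0]
    for k in range(1, len(voltages)):
        dv = voltages[k] - voltages[k - 1]
        cc.append(cc[-1] + 0.5 * (currents[k] + currents[k - 1]) * dv)
    return cc


VTH = 0.02568            # assumed thermal voltage kT/q at 298 K, in volts
N2, I02 = 2.0, 210e-12   # hypothetical low-voltage diode parameters

voltages = [0.01 * k for k in range(41)]                        # 0 to 0.40 V
currents = [I02 * math.expm1(v / (N2 * VTH)) for v in voltages]  # eq. (50)
cc = co_content(voltages, currents)

# Skip the first sample (V = 0, I = 0) to avoid dividing by zero.
v, i, c = voltages[1:], currents[1:], cc[1:]

# Operator A, eq. (51): A = n2*vth - I02*(V/I), linear in V/I.
slope_a, intercept_a = linear_fit([vk / ik for vk, ik in zip(v, i)],
                                  [ck / ik for ck, ik in zip(c, i)])
i02_from_a, n2_from_a = -slope_a, intercept_a / VTH

# Operator B, eq. (52): B = n2*vth*(I/V) - I02, linear in I/V.
slope_b, intercept_b = linear_fit([ik / vk for vk, ik in zip(v, i)],
                                  [ck / vk for ck, vk in zip(c, v)])
n2_from_b, i02_from_b = slope_b / VTH, -intercept_b

print(f"A: n2 = {n2_from_a:.3f}, I02 = {i02_from_a:.3e} A")
print(f"B: n2 = {n2_from_b:.3f}, I02 = {i02_from_b:.3e} A")
```

on this noise-free data the trapezoidal rule slightly overestimates the co-content of an exponential, so the recovered $n_2$ is a fraction of a percent high; a finer voltage step or a higher-order integration rule reduces that bias.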
fig. 18 measured characteristics (symbols) of the base current as a function of forward base-emitter voltage of a power bjt measured at t = 298 k, with vbc = 0 and 10 mv voltage steps. also shown is the model playback simulated with (49) (solid line). the low and high voltage asymptotes (dashed lines) are also shown

fig. 19 plots of operators a (51) and b (52) applied to the measured power bjt characteristics shown in fig. 18

fig. 20 measured characteristics (symbols) of the power bjt shown in fig. 18, and the model playback simulated with (49) (solid line) using the parameter values extracted by vertical optimization

3.3. regional approach for a double diode with series and parallel resistance

this method is based on the idea that some components of the diode model dominate at a given voltage region [39]. let us assume a double-exponential model with arbitrary values of ideality factors and with parallel and series resistance, as illustrated in figure 21. the mathematical description of this circuit is given by the following implicit equation:

$$I = I_{01}\left[\exp\left(\frac{V - I R_s}{n_1 v_{th}}\right) - 1\right] + I_{02}\left[\exp\left(\frac{V - I R_s}{n_2 v_{th}}\right) - 1\right] + G_p (V - I R_s).$$ (53)

fig. 21 double-exponential model with arbitrary ideality factors and parasitic series and parallel resistances

figure 22 presents a particular simulation using (53) with specific parameter values, in which we observe that for low voltage, diode 2 and $G_p$ are dominant. thus, equation (53) may be simplified for low voltages to the following explicit equation:

$$I \approx I_{02}\left[\exp\left(\frac{V}{n_2 v_{th}}\right) - 1\right] + G_p V,$$ (54)

fig.
22 synthetic i-v characteristics simulated by (53) with the parameter values indicated inside the figure, together with the components dominant at low and high voltage, as calculated with the parameter values locally extracted using (54) and (56) (also indicated inside the figure)

similarly, for high voltage, diode 1 and $R_s$ are dominant, thus equation (53) may be simplified for high voltages to the following implicit equation:

$$I \approx I_{01}\left[\exp\left(\frac{V - I R_s}{n_1 v_{th}}\right) - 1\right],$$ (55)

although (55) is implicit, it has the following explicit solution for the terminal voltage:

$$V = R_s I + n_1 v_{th} \ln\left(\frac{I}{I_{01}} + 1\right).$$ (56)

therefore, the parameters can be extracted locally from two regions: 1) $G_p$, $n_2$ and $I_{02}$ by vertical optimization of the low voltage region, fitting equation (54) to the measured current; and 2) $R_s$, $n_1$ and $I_{01}$ by lateral optimization of the high voltage region, fitting equation (56) to the measured voltage. figure 22 also includes the original and the parameter values extracted by this method.

3.4. alternative multi-exponential model with parasitic resistances

figure 23 illustrates a multi-diode equivalent circuit. accordingly, the total current has been traditionally described by the following conventional implicit equation:

$$I = \sum_{k=1}^{N} I_{0k}\left[\exp\left(\frac{V - R_s I}{n_k v_{th}}\right) - 1\right] + G_p (V - R_s I).$$ (57)

fig. 23 a conventional equivalent circuit of a real junction with multiple diodes

in order to circumvent the explicit insolvability of the previous equation, we proposed [52] the use of the equivalent circuit presented in figure 24. by solving each branch separately and adding the solutions, this model's i-v characteristics may be expressed by the following explicit equation for the terminal current:

$$I = \sum_{k=1}^{N} \left\{ \frac{n_{ka} v_{th}}{R_{ska}} W_0\left[ \frac{R_{ska} I_{0ka}}{n_{ka} v_{th}} \exp\left( \frac{V + R_{ska} I_{0ka}}{n_{ka} v_{th}} \right) \right] - I_{0ka} \right\} + G_{pa} V.$$
(58)

where, as before, $W_0$ represents the principal branch of the lambert w function [31], $G_{pa} = 1/R_{pa}$ is the alternative outer shunt conductance and the rest of the parameters are defined as before. notice that the single global series resistance, $R_s$, present in the conventional model, has been substituted in this alternative model by individual series resistances, $R_{ska}$, placed in each of the k-th parallel current paths associated with the k-th conduction mechanism.

fig. 24 alternative equivalent circuit with multiple diodes, resistances in series with each diode, and an outer shunt resistance

figure 25 presents the i-v characteristics of a lateral pin diode at four temperatures from 300 to 390 k. model parameters were extracted, for both conventional and alternative double-exponential models, by globally fitting the logarithm of each model to the experimental data. the left figure also includes the corresponding alternative model playbacks, while the right figure includes the corresponding conventional model playbacks. additional calculations of the playback errors relative to the original measured data indicate that the alternative model produces a more accurate representation of this device's forward conduction behavior at the four temperatures considered here.

fig. 25 measured (red symbols), alternative and conventional model playbacks (black solid lines) forward i–v characteristics of an experimental lateral pin diode at four temperatures

3.5. lateral optimization using an approximate analytical expression for the voltage in multi-exponential diode models

whenever the conductance $G_p$ can be neglected in the model presented in fig. 23, the total current is described by the following conventional implicit equation:

$$I = \sum_{k=1}^{N} I_{0k}\left[\exp\left(\frac{V - R_s I}{n_k v_{th}}\right) - 1\right].$$
(59)

we recently proposed [53] an approximate solution of the above transcendental equation for the terminal voltage as an explicit function of the terminal current, valid for arbitrary $N$, $I_{0k}$, and $n_k$. this approximate solution is [53]:

$$V \approx R_s I - m\, v_{th} \ln\left[\sum_{k=1}^{N} \left(\frac{I}{I_{0k}} + 1\right)^{-n_k/m}\right],$$ (60)

where $m$ represents an empirical dimensionless joining factor. it is important to note in (60) that at any particular bias point (i, v) at which only one of the conduction mechanisms represented by one of the diodes in the model is dominant, the summation in (60) reduces to only one term. for the particular case of a model with just two parallel diodes ($N=2$) with arbitrary values of $n_1$ and $n_2$, the explicit approximate terminal voltage solution simplifies to:

$$V \approx R_s I - m\, v_{th} \ln\left[\left(\frac{I}{I_{01}} + 1\right)^{-n_1/m} + \left(\frac{I}{I_{02}} + 1\right)^{-n_2/m}\right].$$ (61)

to illustrate the applicability of this approximate model, we applied it to experimental i–v characteristics of lateral thin-film soi pin diodes. figure 26 presents the measured i–v characteristics of a device where parameter extraction was performed using lateral optimization by minimizing voltage errors at a given current. the extracted parameters and the joining factor are indicated in fig. 26, together with the lateral voltage error with respect to measured data.

fig. 26 (upper pane) measured (red dotted lines) and model playback (black solid lines) of a lateral thin-film soi pin diode at 150 k in linear and logarithmic scales; and (lower pane) absolute lateral error of model playback with respect to the measured data

4. single-exponential solar cell model

4.1.
single-exponential model without any resistance

consider an idealized solar cell without any parasitic resistance, whose i-v characteristics under illumination may be described by superposition of two currents: a voltage independent photo-generated current source and the current of a single exponential-type junction, as shown in fig. 27.

fig. 27 idealized solar cell equivalent circuit without parasitic resistances

the terminal current of this lumped parameter equivalent circuit model is mathematically described by the following explicit equation of the terminal voltage:

$$I = I_0\left[\exp\left(\frac{V}{n v_{th}}\right) - 1\right] - I_{ph},$$ (62)

where the magnitude of the photo-generated current $I_{ph}$ depends only on the illumination intensity. alternatively, the terminal voltage may be expressed as an explicit function of the terminal current:

$$V = n v_{th} \ln\left(\frac{I + I_{ph}}{I_0} + 1\right).$$ (63)

figure 28 shows simulated i-v characteristics of an idealized solar cell, in linear and logarithmic scales, under illumination.

fig. 28 simulated dark and illuminated i-v characteristic of an idealized solar cell in linear and logarithmic scales

the short circuit current ($I_{sc}$) and open circuit voltage ($V_{oc}$) can be found by evaluating (62) at $V=0$ and (63) at $I=0$, respectively, as:

$$I_{sc} \equiv I\big|_{V=0} = -I_{ph},$$ (64)

and

$$V_{oc} \equiv V\big|_{I=0} = n v_{th} \ln\left(\frac{I_{ph}}{I_0} + 1\right).$$ (65)

the output power is given by the $VI$ product. using (62) yields:

$$P = V I = V\left\{I_0\left[\exp\left(\frac{V}{n v_{th}}\right) - 1\right] - I_{ph}\right\}.$$ (66)

maximum output power will be delivered when the magnitude of (66) becomes maximum. differentiating (66) with respect to voltage and equating to zero yields the value of the voltage ($V_{mpp}$) at the maximum power point (mpp):

$$V_{mpp} = n v_{th}\left\{W_0\left[\frac{2.718\,(I_{ph} + I_0)}{I_0}\right] - 1\right\} \approx n v_{th}\left\{W_0\left[\frac{2.718\, I_{ph}}{I_0}\right] - 1\right\},$$ (67)

where $W_0$ stands for the principal branch of the lambert w function [31].
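as a numerical illustration of (67), the following sketch evaluates $V_{mpp}$ for an idealized cell. it is a minimal sketch under stated assumptions: the parameter values are hypothetical, the thermal voltage is an assumed constant, and `lambert_w0` is a small newton iteration written for this example (valid only for non-negative arguments), not a library routine.

```python
import math

VTH = 0.02585  # assumed thermal voltage, in volts


def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch W0(x) for x >= 0, by Newton iteration on w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting guess that lies at or above the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def cell_current(v, n, i0, iph):
    """Terminal current of the idealized cell, as in (62)."""
    return i0 * (math.exp(v / (n * VTH)) - 1.0) - iph


def v_mpp(n, i0, iph):
    """Maximum-power-point voltage, exact form of (67)."""
    return n * VTH * (lambert_w0(math.e * (iph + i0) / i0) - 1.0)


# hypothetical cell parameters
n, i0, iph = 1.0, 1e-9, 30e-3
vm = v_mpp(n, i0, iph)
voc = n * VTH * math.log(iph / i0 + 1.0)  # open circuit voltage, (65)
print(f"Vmpp = {vm:.4f} V, Voc = {voc:.4f} V, "
      f"P(Vmpp) = {vm * cell_current(vm, n, i0, iph):.4e} W")
```

with the sign convention of (62) the delivered power is negative, so the maximum power point corresponds to the most negative value of the $VI$ product.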
the corresponding current ($I_{mpp}$) at the mpp is found by evaluating (62) at $V_{mpp}$ using (67).

4.2. single-exponential model with series resistance

figure 29 presents the lumped parameter equivalent circuit model of a solar cell with parasitic series resistance.

fig. 29 solar cell equivalent circuit with a parasitic series resistance

as a consequence of the presence of the parasitic series resistance $R_s$, the terminal current of this equivalent circuit is mathematically described by an implicit equation:

$$I = I_0\left[\exp\left(\frac{V - R_s I}{n v_{th}}\right) - 1\right] - I_{ph}.$$ (68)

the terminal voltage can be mathematically solved from (68), resulting in an explicit function of the terminal current:

$$V = R_s I + n v_{th} \ln\left(\frac{I + I_{ph}}{I_0} + 1\right).$$ (69)

the implicit terminal current equation given by (68) can be solved explicitly in terms of the terminal voltage if we introduce the use of the special lambert w function [54]:

$$I = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}} \exp\left(\frac{V + R_s (I_0 + I_{ph})}{n v_{th}}\right)\right] - (I_0 + I_{ph}).$$ (70)

another consequence of the presence of the parasitic series resistance $R_s$ is that it prevents finding an exact analytical solution for the maximum power point, since equating the derivative of the $VI$ product to zero does not allow either $V_{mpp}$ or $I_{mpp}$ to be solved for analytically. figure 30 illustrates the effect of series resistance $R_s$ on linear and semilogarithmic scale i-v characteristics simulated under illumination with three values of $R_s$.

fig. 30 simulated i-v characteristic of a solar cell at different values of parasitic series resistance, in linear and logarithmic scale

the open circuit voltage $V_{oc}$ does not depend on $R_s$, since its effect, given by $I R_s$, becomes zero when the current goes to zero (open circuit). thus, the value of $V_{oc}$ is given by the same equation (65).
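the practical difference between the explicit form (69) and the implicit form (68) can be illustrated with a short round trip. this is a sketch under assumptions: the parameter values are hypothetical, the thermal voltage is an assumed constant, and the implicit equation is solved here by plain bisection rather than by the lambert w expression (70).

```python
import math

VTH = 0.02585  # assumed thermal voltage, in volts

# hypothetical cell parameters
PARAMS = dict(n=1.2, i0=1e-9, rs=0.5, iph=0.03)


def voltage_from_current(i, n, i0, rs, iph):
    """Explicit terminal voltage, as in (69): a single closed-form evaluation."""
    return rs * i + n * VTH * math.log((i + iph) / i0 + 1.0)


def current_from_voltage(v, n, i0, rs, iph, iterations=200):
    """Solve the implicit equation (68) for I by bisection."""
    def residual(i):
        # right-hand side of (68) minus I; strictly decreasing in I
        return i0 * (math.exp((v - rs * i) / (n * VTH)) - 1.0) - iph - i

    lo = -(iph + i0)  # residual is positive at this current
    hi = max(0.0, i0 * (math.exp(v / (n * VTH)) - 1.0) - iph)  # residual <= 0 here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for i in (-0.029, -0.02, 0.0, 0.01):
    v = voltage_from_current(i, **PARAMS)
    i_back = current_from_voltage(v, **PARAMS)
    print(f"I = {i:+.4f} A -> V = {v:.4f} V -> I = {i_back:+.4f} A")
```

evaluating (69) costs one logarithm per point, whereas recovering the current from (68) needs an iterative solver or a lambert w routine, which is why a fit of voltage as a function of current is computationally convenient for this model.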
on the other hand, the short circuit current $I_{sc}$ can be found by evaluating (70) at $V=0$, as:

$$I_{sc} \equiv I\big|_{V=0} = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}} \exp\left(\frac{R_s (I_0 + I_{ph})}{n v_{th}}\right)\right] - (I_0 + I_{ph}).$$ (71)

the four parameters ($n$, $I_0$, $R_s$ and $I_{ph}$) that fully describe the solar cell in terms of this lumped parameter equivalent circuit model can be extracted by fitting the cell's measured data to any of the model's defining equations. equations (68), (69) or (70) can be applied directly for fitting. vertical or lateral optimization could be used for fitting by minimizing either the current quadratic error or the voltage quadratic error, respectively. in the present case the use of equation (69) in combination with lateral optimization affords the best computational convenience, since this equation is not implicit, as (68) is, and does not contain special functions, as (70) does.

4.2.1. first integration method to extract the series resistance of solar cells

to the best of our knowledge, araujo and sánchez were the first to propose, back in 1982, the use of integration for parameter extraction in solar cells [36]. they used the integral of (69), assuming that $I \gg I_0$ and $I_{ph} \gg I_0$, to obtain the relation:

$$\int_0^I V\,dI = \frac{R_s}{2} I^2 - n v_{th} I + n v_{th}(I + I_{ph}) \ln\left(\frac{I + I_{ph}}{I_0}\right) - n v_{th} I_{ph} \ln\left(\frac{I_{ph}}{I_0}\right).$$ (72)

evaluating (72) at an upper limit of integration $I = I_{sc}$, the series resistance $R_s$ can be evaluated as:

$$R_s = \frac{2\int_0^{I_{sc}} V\,dI}{I_{sc}^2} + \frac{2 n v_{th}}{I_{sc}} - \frac{2 V_{oc}}{I_{sc}}.$$ (73)

4.3. single-exponential model with parallel resistance

figure 31 presents the lumped parameter equivalent circuit model of a solar cell with parasitic parallel conductance.

fig.
31 solar cell lumped parameter equivalent circuit model with parasitic parallel conductance

the mathematical description of the terminal current of this equivalent circuit is given in terms of the terminal voltage by the explicit equation:

$$I = I_0\left[\exp\left(\frac{V}{n v_{th}}\right) - 1\right] + G_p V - I_{ph}.$$ (74)

the terminal voltage can be solved from the above equation as an explicit function of the terminal current if we use the special lambert w function:

$$V = \frac{I + I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_{ph} + I_0}{n v_{th} G_p}\right)\right].$$ (75)

as a consequence of the presence of the parallel conductance $G_p$, an exact analytical solution for the maximum power point is not possible, since equating the derivative of the $VI$ product to zero does not allow either $V_{mpp}$ or $I_{mpp}$ to be solved for analytically. figure 32 illustrates the effect of parallel conductance $G_p$ on linear and semilogarithmic scale i-v characteristics simulated under illumination with three values of $G_p$.

fig. 32 simulated i-v characteristic of a hypothetical solar cell for three values of parallel conductance in linear and logarithmic scales

for this particular case we find the short circuit current by evaluating (74) at $V=0$, yielding $I_{sc} = -I_{ph}$, which is independent of the value of $G_p$. the open circuit voltage $V_{oc}$ is obtained by evaluating (75) at $I=0$, yielding:

$$V_{oc} = \frac{I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I_{ph} + I_0}{n v_{th} G_p}\right)\right].$$ (76)

the four parameters ($n$, $I_0$, $G_p$ and $I_{ph}$) that fully describe the solar cell in terms of this lumped parameter equivalent circuit model can be extracted by fitting the cell's measured data to any of the model's defining equations. equations (74) or (75) can be applied directly for fitting. vertical or lateral optimization could be used for fitting by minimizing either the current quadratic error or the voltage quadratic error, respectively.
in the present case the use of equation (74) in combination with vertical optimization affords the best computational convenience, since this equation is explicit and does not contain special functions, as (75) does.

4.4. single-exponential model with series and parallel resistances

figure 33 presents the lumped parameter equivalent circuit model of a solar cell with series resistance and parallel conductance. the mathematical description of the terminal current of this equivalent circuit is given by the implicit equation:

$$I = I_0\left[\exp\left(\frac{V - I R_s}{n v_{th}}\right) - 1\right] + (V - I R_s)\, G_p - I_{ph}.$$ (77)

fig. 33 solar cell equivalent circuit with parasitic series and parallel resistances

the use of the special lambert w function allows the above equation to be explicitly solved [54] for the terminal current as a function of the terminal voltage:

$$I = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}(1 + R_s G_p)} \exp\left(\frac{V + R_s (I_0 + I_{ph})}{n v_{th}(1 + R_s G_p)}\right)\right] + \frac{V G_p - (I_0 + I_{ph})}{1 + R_s G_p}$$ (78)

and for the terminal voltage as a function of the terminal current:

$$V = I\left(R_s + \frac{1}{G_p}\right) + \frac{I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_{ph} + I_0}{n v_{th} G_p}\right)\right].$$ (79)

an exact analytical solution for the maximum power point is not possible in this case either, because equating the derivative of the $VI$ product to zero does not allow either $V_{mpp}$ or $I_{mpp}$ to be solved for analytically. figure 34 illustrates the effect of series resistance $R_s$ and parallel conductance $G_p$ on linear and semilogarithmic scale i-v characteristics simulated under illumination with different values of $R_s$ and $G_p$.

fig.
34 simulated i-v characteristic of a hypothetical solar cell with different values of series and parallel resistance in linear and logarithmic scales

the short circuit current is found by evaluating (78) at $V=0$, yielding:

$$I_{sc} = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}(1 + R_s G_p)} \exp\left(\frac{R_s (I_0 + I_{ph})}{n v_{th}(1 + R_s G_p)}\right)\right] - \frac{I_0 + I_{ph}}{1 + R_s G_p},$$ (80)

and the open circuit voltage $V_{oc}$ is obtained by evaluating (79) at $I=0$, yielding:

$$V_{oc} = \frac{I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I_{ph} + I_0}{n v_{th} G_p}\right)\right].$$ (81)

4.4.1. vertical optimization

the implicit terminal current equation (77) could be directly fitted to the experimental data to extract the model parameters. however, a more convenient way [55], [56] would be to use instead the explicit equation (78) for the terminal current as a function of the terminal voltage. of course, this implies having a lambert w function calculation algorithm implemented within the data fitting software. del pozo et al. [55] propose following this route by using matlab's non-linear curve fitting routine "lsqcurvefit." this vertical optimization procedure (minimizing the current quadratic error) allows the extraction of all the parameters at the same time, but it frequently requires using good initial estimates of the parameters.

4.4.2. extraction from the co-content function

model parameters can be extracted from the integrals of the illuminated i-v characteristics. the integral with respect to the voltage is known as the co-content $CC(I,V)$. for an illuminated solar cell it is defined as [23], [24]:

$$CC(I,V) = \int_0^V (I - I_{sc})\, dV.$$ (82)

the lower limit of integration in the above equation is defined at the point $V=0$, $I=I_{sc}$. substitution of (78) into (82) and integrating with respect to $V$ results in a long expression that contains lambert w functions and both variables $V$ and $I$.
replacing the terms that contain lambert w functions of $V$, using equation (78), and after some algebraic manipulations, the function $CC(I,V)$ may be conveniently expressed for the solar cell as a purely algebraic equation of the form:

$$CC(I,V) = C_{V1} V + C_{I1}(I - I_{sc}) + C_{I1V1} V (I - I_{sc}) + C_{V2} V^2 + C_{I2}(I - I_{sc})^2,$$ (83)

where the five coefficients are given in terms of the model parameters by:

$$C_{I1} = R_s (I_{ph} + I_{sc} + I_0) + n v_{th}(1 + G_p R_s) + I_{sc} R_s^2 G_p,$$ (84)

$$C_{V1} = -(I_{ph} + I_{sc} + I_0) - n v_{th} G_p - I_{sc} R_s G_p,$$ (85)

$$C_{I2} = \frac{R_s (1 + R_s G_p)}{2},$$ (86)

$$C_{V2} = \frac{G_p}{2},$$ (87)

and the fifth is a coefficient that is dependent on the others:

$$C_{I1V1} = \frac{1 - \sqrt{1 + 16\, C_{I2} C_{V2}}}{2}.$$ (88)

as can be realized from (88), there are actually only four independent coefficients, (84)-(87), and therefore only four unknowns may be extracted uniquely. however, all the model parameters may be extracted. the extraction procedure consists of performing bivariate fitting of algebraic equation (83) to the co-content function $CC$ as numerically calculated from the experimental data using (82). this bivariate fitting process yields the values of the four equation coefficients $C_{V1}$, $C_{I1}$, $C_{V2}$, $C_{I2}$, which are then used to calculate the solar cell's model parameters $G_p$, $R_s$, $I_{ph}$, $n$ and $I_0$ as follows. the value of the shunt loss is calculated directly from (87):

$$G_p = 2\, C_{V2}.$$ (89)

the value of the series resistance is calculated by substituting (89) into (86) and solving the resulting quadratic equation:

$$R_s = \frac{\sqrt{1 + 16\, C_{V2} C_{I2}} - 1}{4\, C_{V2}}.$$ (90)

the value of the junction quality factor is calculated by substituting (89) and (90) into (84) and (85) and solving the two equations to yield:

$$n = \frac{C_{V1}\left(\sqrt{1 + 16\, C_{V2} C_{I2}} - 1\right) + 4\, C_{I1} C_{V2}}{4\, v_{th}\, C_{V2}}.$$ (91)

the value of the photo-generated current is obtained assuming i0<1. the graph plotted below in fig.
2(a) is for set 1, and the point of contact between the tltlm and the dual layer is taken as point 0. the length of the tltlm contact region is 10 µm and that of the dual layered region is 20 µm. it shows the flow of current starting from the point at which current exits one tltlm contact and enters the dual layer. here, i1 (solid line) is the current flowing in layer b, i2 (dotted line) is the current in layer a, and i3 (fine dotted line) represents the total current in the tltlm contact. the graph illustrates that i1 distributes itself and tends to remain constant at a ratio of rsa:rsu (i.e. i1:i2=4:6) in the dual layered structure. however, the flow is disturbed near the point where current enters the tltlm structure from the dual layer. the fem result, shown in fig. 2(b), confirms this: the voltage contour is uniform at the middle of the dual layer, and the contour is disturbed near the tltlm contact region.

fig. 2 (a) current distribution in one tltlm contact and dual layer contact regions for a test structure with rsa = 40 Ω/sq, rsu = 60 Ω/sq, ρca = 8e-7 Ω·cm², ρcu = 1.6e-6 Ω·cm², with a dual active layer of 20 µm (b) distribution of corresponding voltage contours, determined by fem for the test structure

fig. 3(a) shows results for set 4, in which all the parameters are the same as in set 1 except that the length between the contacts is reduced to 5.2 µm. because of this, the current flows with ratio rsa:rsu over a shorter length. this leads to greater variation in total resistance. the higher error value of ~5% between rtot(std) and rtot(fem) shows that the total resistance between contacts is not given by rsh*l/w.

fig.
3 (a) current distribution in a tltlm contact and dual layer contact regions for a test structure with rsa = 40 Ω/sq, rsu = 60 Ω/sq, ρca = 8e-7 Ω·cm², ρcu = 1.6e-6 Ω·cm², with a dual active layer of 5.2 µm in length (b) distribution of corresponding voltage contours, determined by fem for the test structure

4. conclusion

two variations of transmission line model networks, namely the tri-layer tlm and a dual layer network, for modelling current in semiconductor contact regions, were combined to model a test structure with multiple layers. a comparison of the mathematical analysis and a two-dimensional finite element model of the test structure with two metal contacts to a dual-active layer shows that the combination of tltlm and the dual-layer network expressions provides accurate analysis for these test structures. the limitations on the accuracy of expressions have been presented in terms of the  parameter. the distribution of current through the dual-layer and tltlm contact region is discussed in detail to understand its influence on the total resistance of the test structure. this distribution is accurately represented by the combined tltlm-dual-active layer model investigated, which is an improvement on models where the current distribution and sheet resistance are considered uniform between contacts.

references

[1] a. m. collins, y. pan, a. s. holland, “using a two-contact circular test structure to determine the specific contact resistivity of contacts to bulk semiconductors”, facta universitatis, electronics and energetics, vol. 28, no. 3, september 2015, pp. 457-464.

[2] y. pan, a. m. collins and a. s. holland, "determining specific contact resistivity to bulk semiconductor using a two-contact circular test structure", in proceedings of the ieee international conference on miel, may 2014, pp. 257-260.

[3] v. gudmundsson, p. hellstrom, and m.
ostling, “error propagation in contact resistivity extraction using cross-bridge kelvin resistors,” ieee trans. electron devices, vol. 59, no. 6, pp. 1585–1591, june 2012.

[4] a. s. holland, g. k. reeves, “new challenges to the modelling and electrical characterisation of ohmic contacts for ulsi devices”, in proc. of the 22nd international conference on microelectronics (miel 2000), vol. 2, niš, serbia, pp. 461-464, 2000.

[5] w. shockley, “research and investigation of inverse epitaxial uhf power transistors”, air force atomic laboratory, wright-patterson air force base, rep. no. al-tdr-64-207, sept. 1964.

[6] g. k. reeves, h. b. harrison, “an analytical model for alloyed ohmic contacts using a tri-layer transmission line model”, ieee trans. electron devices, vol. 42, no. 8, p. 1536, 1995.

[7] g. k. reeves, a. s. holland, p. w. leech, “influence of via liner properties on the current density and resistance of vias”, in proc. of the 23rd international conference on microelectronics (miel 2002), vol. 2, niš, yugoslavia, pp. 535-538, 2002.

[8] y. li, g. k. reeves, h. b. harrison, “correcting separating errors related to contact resistance measurement”, microelectronics journal, vol. 29, 1996.

[9] g. k. reeves, h. b. harrison, “using tlm principles to determine mosfet contact and parasitic resistance”, solid-state electronics, vol. 41, no. 8, 1997.

[10] y. shiraishi, n. furuhata, a. okhamoto, “influence of metal/n-inas/interlayer/n-gaas structure on nonalloyed ohmic contact resistance”, journal of applied physics, vol. 76, p. 5099, 1994.

[11] h. h. berger, “models for contacts to planar devices”, solid state electronics, vol. 12, 1972.

appendix

the expressions for the tltlm √* { √( ( )⁄ )} + (a1) √ { √( ( )⁄ )} (a2) (a3)

matlab code

% inputs expected in the workspace: rsa, rsu, pca, pcu, f1, i0, l, w, d, x, y
% D and K1 are written in upper case to keep them distinct from the
% contact length d and from k1 (matlab is case-sensitive)

% calculation of f factor
rs=rsa+rsu;
rsh=(rsu*rsa)/(rsu+rsa);
al=sqrt(rs/pcu);
c=(rs/pcu)+(rsa/pca);
z=sqrt((c*c)-(4*rsu*rsa/(pcu*pca)));
a=sqrt((c-z)/2);
b=sqrt((c+z)/2);
D=al*pcu*coth(al*l);
e=al*pcu*(rsa*cosh(al*l)+(f1*rs-rsa))/(rs*sinh(al*l));
h=((b*((rsu-(pcu*a*a))*tanh(a*d)))-(a*((rsu-(pcu*b*b))*tanh(b*d))))/((b*b-a*a)*tanh(b*d)*tanh(a*d));
g=rsa*(b*tanh(a*d)-a*tanh(b*d))/((b*b-a*a)*tanh(b*d)*tanh(a*d));
f2=(e+g)/(D+h+g);

% current flow dual active layer
a=rsh/rsu;
y1=sqrt(pcu/(rsu+rsa));
i11=i0*(a+((f2-a)*sinh((l-x)/y1)/sinh(l/y1))+((f1-a)*sinh(x/y1)/sinh(l/y1)));
i21=i0-i11;

% current flow tltlm contact
c=((rsa+rsu)/pcu)+(rsa/pca);
z=(c*c)-(4*rsu*rsa)/(pcu*pca);
a=sqrt((c-sqrt(z))/2);
b=sqrt((c+sqrt(z))/2);
p=f2*(rsu-pcu*a*a)-(1-f2)*rsa;
q=f2*(rsu-pcu*b*b)-(1-f2)*rsa;
i12=(i0/(pcu*(b*b-a*a)))*((p*sinh(b*(d+y))/sinh(b*d))-(q*sinh(a*(d+y))/sinh(a*d)));
i23=(i0/(rsa*pcu*(b*b-a*a)))*((p*(rsu-pcu*b*b)*sinh(b*(d+y))/sinh(b*d))-(q*(rsu-pcu*a*a)*sinh(a*(d+y))/sinh(a*d)));
itot=i0-(i12+i23);
plot(x,i11); hold on;
plot(y,i12);
plot(x,i21);
plot(y,i23);
plot(y,itot);

% contact resistance
c1=((rsa+rsu)./pcu)+(rsa./pca);
z1=(c1.*c1)-((4.*rsu.*rsa)./(pcu.*pca));
a1=sqrt((c1-sqrt(z1))/2);
b1=sqrt((c1+sqrt(z1))/2);
K1=rsu./(pcu.*w.*(b1.*b1-a1.*a1));
x1=tanh(b1*d);
y1=tanh(a1*d);
p1=f2.*(rsu-(pcu.*a1.*a1));
q1=f2.*(rsu-(pcu.*b1.*b1));
k1=(1-f2).*rsa;
rc=K1.*(((p1-k1)./(b1.*x1))-((q1-k1)./(a1.*y1)));

% total resistance using standard formula
rtot=2*rc+(rsh*l/w)

% total resistance using yao et al. formula when dual layer is longer
beta=2*((((f1+f2)*rsu)/(2*rsh))-1);
bcor=(rsh*beta)/(w*al);
rtotli=2*(rc+(bcor/2))+(rsh*l)/w;

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp.
393-403 https://doi.org/10.2298/fuee2203393p © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

solar energy potential in freiburg, graz, maribor, banja luka, niš, and athens

milica preradović

university of banjaluka, faculty of mechanical engineering, banjaluka, republic of srpska, bosnia and herzegovina

abstract. this paper presents a comparative analysis of solar energy potential for six different cities, in six different countries in europe: freiburg (germany), graz (austria), maribor (slovenia), banja luka (bosnia and herzegovina), niš (serbia), and athens (greece). data processed in this work are accessed from the photovoltaic geographical information system (pvgis). the photovoltaic technology is crystalline silicon, and the installed peak photovoltaic power is 5 kwp. the aim of the work is to find out whether there are statistically significant differences among the cities in monthly energy production for different types of photovoltaic system (fixed – free standing, fixed – building integrated, inclined, and two axis solar power plants). the work is based on four hypotheses. the estimation of solar energy production in different regions is very important for identifying regions suitable for generation of renewable and sustainable energy.

key words: solar panels, photovoltaic technology, crystalline silicon, pvgis

1. introduction

different factors have an impact on the amount of solar radiation reaching the earth. the most important factors are: geographical latitude, part of the year and day, atmosphere condition, cloud status, surface disposition, and orientation. this information is important for planning and installing photovoltaic systems [1]. in this paper, solar energy potential for six different locations in europe (freiburg, graz, maribor, banja luka, niš, and athens) has been compared.
those six cities were selected in order to see the differences in the amount of electricity produced by photovoltaic systems. cities like freiburg, graz, and maribor have developed pv systems for electricity generation, while banja luka, niš, and athens are on the ascending path in regard to application and use of solar energy. different types of photovoltaic systems were used for this comparison: fixed – free standing, fixed – building integrated, inclined, and two-axis solar power plants.

received january 10, 2022; revised march 19, 2022; accepted march 23, 2022
corresponding author: milica preradović, university of banjaluka, faculty of mechanical engineering, 71 vojvode stepe stepanovića, 78000 banjaluka, republic of srpska, bosnia and herzegovina
e-mail: milica.preradovic@student.mf.unibl.org

freiburg and graz have been green model cities since the late 1980s. both cities are midsized, with less than 500 000 inhabitants, and both cities are administrative centers of their regions. freiburg was named ‘germany’s environmental capital’ in 1992, for its ecological accomplishments. in 2010, freiburg received another award, ‘federal capital of climate protection’, and in 2012, ‘most sustainable large city of germany’. graz has been awarded many times for its achievements in the field of ecology and sustainability (‘greenpeace climate protection award’ in 1993 and the ‘sustainable energy europe award’ in 2008). in 1996, graz received, as the first city in europe, the ‘international sustainable city’ award from the european union [2]. freiburg is also called ‘europe’s solar city’. vauban is a neighborhood in freiburg that is one of the most sustainable city neighborhoods worldwide. in this city district, the majority of houses have on-site solar energy generation (mostly from rooftop pv panels). the surplus electricity is sold to the municipal grid [3].
the international leadership of freiburg in urban sustainability began in the 1970s, after successful anti-nuclear protests in the city [4]. the federal state government intended to build a nuclear power plant in the rural area north of the city. because of the strong resistance of the city’s citizens, the government plans were not realized and, therefore, freiburg is called the ‘birthplace’ of the environmental movement [5]. freiburg is also one of the sunniest locations in germany. the city has brought together many branches (community, business, energy, scientific community, education, construction, tourism) and civil society, with help from local and national levels, to become a world leader in solar energy [4]. in graz, in the first half of the 1990s, many environmental proposals and projects were arranged (‘ecocity 2000’, ‘municipal energy and climate concept’, ‘eco-profit’, and ‘eco-drive’). at the same time, graz became the first austrian representative of the ‘climate alliance of european cities’, with the aim to reduce greenhouse gas emissions by 50 per cent by 2010 (with 1987 as the baseline). graz also embraced energy constricting plans for the renovation of buildings and the transition to district heating or renewable fuels. the city has also set in motion a ‘solar initiative’ that supports the feeding-in of solar thermal energy into the district heating system during summer [2]. in graz, the first smart city community is being developed. in this district, new energy technologies for energy self-sufficient cities are established. the smart city graz project is examining innovations like solar modules, solar cooling systems, solar power generation in urban areas, mini-chp-facilities (combined heating and power), integrated façade technologies and smart heat grids, with their application in demonstration buildings [6].
in maribor, the faculty of energy of the university of maribor is an important institution in the disciplines of thermo-energetics, hydropower, nuclear power, and renewable and alternative energy sources. the emphasis of the research is on pv systems. the institute of energy technology possesses a park of renewable energy sources, which comprises nine tracking pv systems. this renewable resources park is used to study various networking systems and to examine new elements that are components of a smart grid. the pv systems in the park are coupled to the distribution grid [7]. another paper, by seme et al. [8], presented an overview of a performance study of pv systems in slovenia. a total of 91% of the pv systems in slovenia have a peak power of 50 kwp or less. this is because of the energy law that prevents installations of higher power [8]. however, in recent years the feed-in tariff in slovenia has driven the growth of the pv market, which has led to lower prices of pv technologies [9]. dravske elektrarne maribor is the major producer of renewable electricity in slovenia. it has obtained a permit for segment five of the zlatoličje solar power plant. this segment of the solar power plant will be installed on the left bank of the outflow canal of zlatoličje, the biggest slovenian hydro power plant. the 5 820 pv modules, with a total power of 2.7 mwp, are planned to produce 3 gwh per year [10].

the republic of srpska holds a huge potential for electricity production with pv systems. the promotion of renewable energy is secured by the renewable energy legislation of may 2013, together with the decision of the regulatory commission for energy of the republic of srpska on the charge level and premium prices. the republic of srpska gives priority to grid connection for operators of renewable energy sources and offers incentives for external investors.
the solar energy laboratory of the academy of sciences and arts of the republic of srpska was established in 2012, as an outcome of scientific research projects on renewable energy sources, particularly solar energy. in october 2012, a fixed on-grid solar power plant (power 2.08 kwp, monocrystalline silicon solar cells) was installed on one rooftop. the solar power plant is equipped with accompanying equipment for monitoring, data acquisition, and measurement. with the help of this pv power plant, the effects of solar radiation intensity, air temperature, wind speed, and air humidity on the energy efficiency of the pv solar power plant in the banja luka region can be observed continuously. two years later, in 2014, another solar system was added to the solar energy laboratory: the solar box, which comprises a metallic base with five pv solar modules made of polycrystalline silicon, each with a power of 50 w. three solar modules are placed vertically and face east, south, and west, respectively. the fourth solar module is placed horizontally, and the fifth is inclined at an angle of 33° towards the south. additionally, in october 2017, a two-axis tracking pv system was installed on the roof of the academy of sciences and arts of the republic of srpska. this system comprises electronic, mechanical, and measuring subsystems. in 2020, 42 electricity producers in the republic of srpska used pv systems of up to 250 kw [11]. papers [12,13,14,15,16] contain a large amount of material on the solar potential for generating electricity from pv solar plants in the republic of srpska.

serbia’s solar centers are located in niš, zrenjanin, and novi sad. the faculty of sciences and mathematics (fsm) in niš has a solar energy laboratory that studies the physical features of flat-plate thermal and hybrid solar radiation collectors, solar cells and pv solar power plants.
also in niš, the faculty of electronic engineering possesses a contemporary laboratory for the electronic investigation of rotational pv systems for optimum solar radiation incidence. the faculty of technical sciences in novi sad has a renewable and distributed energy sources laboratory devoted to research in the field of renewable energy, mostly wind and solar energy conversion and energy storage. in zrenjanin, the faculty of technical sciences m. pupin has a solar energy laboratory that focuses on flat-plate thermal and pv modules [17]. studies [17 – 21] contain relevant information on solar energy in serbia.

greece is considered a very attractive country for investment in solar photovoltaics [22]. the solar thermal market in greece is described in detail in [23]. starting in 2011, there were many policy attempts to promote solar investment. within just three years, those efforts placed greece at the top of global rankings for the share of solar power in electricity production. however, in the period from 2014 to 2017 the domestic pv market shrank to 1% of its 2013 level. this widespread collapse of the solar market was directly related to the regulatory response to the economic effects of the policy agenda: very generous twenty-year feed-in tariffs had been provided for large-scale developments and remained at high levels even though costs had dropped. policy makers were forced to apply retroactive tariff cuts. however, the collapse can also fairly be related to the energy-linked effects of political and economic insecurity, such as the construction of new conventional power plants and persistent economic stagnation. another barrier to the further development of solar power in greece may be the current immaturity of the economy, in terms of strategy and trade models, in motivating consumers to generate and store clean energy locally [22].
currently, greece exploits solar irradiation mainly with flat-plate collectors for low-temperature heating applications and with pv [24].

1.1. general information on selected cities

geographical information on freiburg, graz, maribor, banja luka, niš, and athens is given in table 1. among the selected cities, athens is both the southernmost and the easternmost, while freiburg is both the northernmost and the westernmost.

table 1 information on selected cities [29]

| parameter | freiburg | graz | maribor | banja luka | niš | athens |
|---|---|---|---|---|---|---|
| geographical latitude (°) | 48.0005 | 47.071 | 46.5621 | 44.772 | 43.3187 | 37.982 |
| geographical longitude (°) | 7.832 | 15.438 | 15.65 | 17.188 | 21.893 | 23.727 |
| optimal angle for fixed solar power plants (°) | 36 | 37 | 36 | 34 | fs: 34*, bi: 33* | fs: 32*, bi: 31* |
| optimal angle for inclined axis (°) | 38 | 39 | 38 | 36 | 36 | 34 |
| elevation (m) | 263 | 364 | 275 | 167 | 198 | 84 |

* fs – free-standing solar power plants, bi – building-integrated solar power plants. only niš and athens have different optimal angles for fixed fs and bi solar power plants; all the other cities have the same optimal angle for both. the given elevation was taken from pvgis and relates to free-standing solar power plants.

solar energy capacity and production of the selected countries are presented in table 2.

table 2 solar energy capacities and solar energy production in germany, austria, slovenia, bosnia and herzegovina, serbia, and greece in 2019 [25]

| country | solar energy capacity (mw) | solar energy production (gwh) |
|---|---|---|
| germany | 49 047 | 46 392 |
| austria | 1 702 | 1 702 |
| slovenia | 264 | 303 |
| bosnia and herzegovina | 22 | 30 |
| serbia | 23 | 14 |
| greece | 2 834 | 4 429 |

as can be seen from this table, germany has the greatest solar energy capacity and the greatest solar energy production, whereas serbia and bosnia and herzegovina have the lowest solar energy capacity and the lowest solar energy production.
in general, a greater solar energy capacity of a country means a larger solar energy production.

2. goals, materials and methods

the goal of this work is to analyze the differences in the projected solar energy production (kwh) between six cities. the payback time for the installation of a photovoltaic system (5 kw) is also calculated for all six cities. the authors of [1] studied a solar radiation atlas for banja luka and concluded that there are no significant deviations in the energy of global and direct solar radiation falling on the horizontal and the optimally positioned surface. in this work, differences in solar energy potential were statistically analyzed between the following cities: freiburg, graz, maribor, banja luka, niš, and athens.

pvgis was established at the joint research centre (jrc) of the european commission, within its renewable energies unit, as a geographical information system (gis) tool for evaluating the performance of solar pv systems in different geographical regions. it supplies data for the technical, environmental, and socio-economic analysis of solar pv electricity generation [26,27]. the pvgis database [28] consists of satellite data from four different meteorological sources: the climate monitoring satellite application facility (pvgis-cmsaf), the surface solar radiation data set – heliosat (pvgis-sarah), data produced by the european centre for medium-range weather forecasts (pvgis-era5), and the consortium for small-scale modelling (pvgis-cosmo). the cmsaf data are used in this work. the cmsaf solar surface irradiance retrieval is built on radiative transfer calculations, in which satellite-derived parameters are used as input. cmsaf is part of the ground segment of the european organization for the exploitation of meteorological satellites (eumetsat) and of the eumetsat network of satellite application facilities.
pvgis-cmsaf aims to generate climate data records, which are time series of sufficient length, stability and quality to reveal climate variability and change. the available data cover the period between 2007 and 2016 [29].

2.1. pvgis method – explanation

as described by the european commission’s science and knowledge service, the first stage in the calculation of solar radiation from satellite data is the analysis of satellite images in order to determine the effect of clouds on solar radiation: clouds reflect the incoming sunlight and thus reduce the radiation that reaches the earth’s surface. cloud reflectivity can be estimated when the same satellite image pixel is observed at the identical time every day in a month. the darkest value of the pixel during the month denotes the clearest sky, that is, a state without clouds. the cloud reflectivity on other days is estimated relative to this clear-sky day. the same is done for all hours of the day. in this way, the effective cloud albedo can be estimated [30]. the second step is the calculation of the solar radiation under clear-sky conditions with the help of the theory of radiative transfer in the atmosphere, together with information on the quantity of atmospheric aerosols and the concentrations of water vapor and ozone, because water vapor and ozone absorb radiation at certain wavelengths. the overall solar radiation is then estimated from the cloud albedo and the clear-sky irradiance. this method achieves good results, but it may fail on some occasions, for example when snow covers the ground. snow can be mistaken for clouds, in which case the method returns a very low irradiance. the aerosol data used in the method are averaged over a longer period of time, so sudden changes in aerosols (due to volcanic eruptions or dust storms) are not taken into account [30]. the method described above computes global and beam irradiance on a horizontal plane.
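the two steps described in this subsection can be sketched in a few lines of code. this is a minimal illustration and not the pvgis or cmsaf implementation: the function names, the parameter `rho_cloud` (an assumed reflectance of a fully overcast pixel) and the linear relation "clear-sky index = 1 − cloud albedo" are assumptions introduced here for illustration, and the reflectance values are hypothetical.

```python
def effective_cloud_albedo(monthly_reflectances, rho_cloud):
    """Effective cloud albedo of one pixel, observed at the same hour on
    each day of a month.

    The darkest observation is taken as the clear-sky state (albedo 0).
    rho_cloud is an ASSUMED reflectance of a fully overcast pixel.
    """
    rho_clear = min(monthly_reflectances)  # darkest pixel = clearest sky
    if rho_cloud <= rho_clear:
        raise ValueError("rho_cloud must exceed the clear-sky reflectance")
    span = rho_cloud - rho_clear
    return [(rho - rho_clear) / span for rho in monthly_reflectances]


def all_sky_irradiance(cloud_albedo, clear_sky_irradiance):
    """Combine cloud albedo and clear-sky irradiance (W/m2).

    ASSUMPTION: the clear-sky index is 1 - cloud_albedo, clipped to [0, 1].
    """
    clear_sky_index = min(1.0, max(0.0, 1.0 - cloud_albedo))
    return clear_sky_index * clear_sky_irradiance


if __name__ == "__main__":
    # hypothetical reflectances of one pixel at noon on five days
    reflectances = [0.10, 0.25, 0.55, 0.10, 0.40]
    for n in effective_cloud_albedo(reflectances, rho_cloud=0.85):
        print(round(n, 2), round(all_sky_irradiance(n, 800.0), 1))
```

the sketch also makes the snow problem visible: a bright snow-covered pixel raises the observed reflectance, so it is treated as cloud and the resulting irradiance is too low.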
however, pv modules and systems are mounted at an inclined angle with respect to the horizontal plane, or on tracking systems, in order to maximize the incoming in-plane irradiance. in this case, the satellite-based values are not representative of the solar radiation received at the module surface, and it is crucial to evaluate the in-plane irradiance. to estimate the beam and diffuse components on sloped planes, the values of the global irradiance and of the diffuse and/or beam components on the horizontal plane are needed. the sum of the beam and diffuse components gives the in-plane global irradiance on a sloped surface. the beam irradiance originates directly from the solar disc, and its value on a sloped surface can be obtained from the value on the horizontal plane when the position of the sun in the sky and the precise orientation of the inclined surface are known. the diffuse irradiance on sloped surfaces is harder to estimate, because it is scattered by the atmosphere and arrives from the whole sky. models for the diffuse component are classified into two categories, isotropic and anisotropic. the first category assumes an equal distribution of diffuse irradiance over the sky. the diffuse irradiance on a sloped surface is then the value on the horizontal plane scaled by a factor that depends only on the surface inclination β and represents the portion of the sky that can be seen from the plane’s surface, namely (1 + cos β)/2. however, the diffuse irradiance is almost never isotropic. the estimation model used in pvgis is an anisotropic model with two components; it distinguishes between clear and cloud-covered sky states and between sunlit and shaded surfaces [30,31].

2.2. statistical tests used for the calculations

data and results are shown in tables and graphs. the analytical-statistical tool spss, version 24, was used for the analysis of the data.
the kruskal-wallis test was applied to determine whether three or more samples originate from the same population. pairwise statistically significant differences were then identified with the mann-whitney test, which determines whether two samples originate from the same population [32]. the wilcoxon rank-sum test, also known as the mann-whitney u test, analyses the differences in population means when the populations are not normally distributed. the first necessary assumption is that the populations are continuous, and the second is that their probability density functions have the same shape and size [33]. the mann-whitney u test calculates the statistic u for each group. mathematically, the mann-whitney u statistic for each group is expressed by the following equations [34]:

$$U_x = n_x n_y + \frac{n_x (n_x + 1)}{2} - R_x \tag{1}$$

$$U_y = n_x n_y + \frac{n_y (n_y + 1)}{2} - R_y \tag{2}$$

where $n_x$ is the number of observations (participants) in the first group, $n_y$ is the number of observations (participants) in the second group, $R_x$ is the sum of the ranks of the first group, and $R_y$ is the sum of the ranks of the second group. equations (1) and (2) can be interpreted as the number of times an observation in one sample precedes or follows an observation in the other sample, after all the scores are placed in ascending order. the null hypothesis is either rejected or accepted after the u value is compared with the appropriate statistical threshold ($\alpha$) [34].

the kruskal-wallis test is a nonparametric statistical test that assesses the differences among three or more independent groups on a single, not normally distributed variable [35]. the starting assumption is that we have $k$ independent samples of sizes $n_1, n_2, \ldots, n_k$, so that $n_1 + n_2 + \ldots + n_k = n$. after all observations are ranked, the sums of the ranks ($R_1, R_2, \ldots, R_k$) of the samples are obtained.
the test statistic is given by the following equation (eq. 3) [36]:

$$R = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n + 1) \tag{3}$$

this work is built on the following four null hypotheses:

h01: there is no statistically significant difference in monthly solar energy production of the fixed solar panels (free-standing and building-integrated) between the cities;
h02: there is no statistically significant difference in monthly solar energy production of the inclined photovoltaic system between the cities;
h03: there is no statistically significant difference in monthly solar energy production of the two-axis solar power plant between the cities; and
h04: there is no statistically significant difference in monthly solar energy production when all types of solar power plants are compared with each other among the cities.

the aim of the test is to reject one hypothesis and to accept the other. p stands for probability: it gives the probability that the observed difference between the groups is due to chance. the p value lies between 0 and 1 [37]. a small p value provides stronger evidence against h0, and we are more certain that h0 is not true. when the p value is large, h0 becomes more plausible, but we cannot be confident that h0 is true. h0 should be rejected when p ≤ 0.05 [33].

3. results

before the results of the statistical analysis, table 3 presents the yearly solar energy production (kwh) in the six selected cities. athens has the greatest yearly solar energy production among the selected cities, and freiburg has the lowest. more details are provided in table 3.
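the rank statistics of section 2.2 (eqs. 1–3) can be reproduced with a few lines of code. the sketch below is only an illustration of the formulas: the function names are hypothetical, the sample values are made up, tied values receive the mean of their ranks (an assumption, since the text does not discuss ties), and no p values are computed, which in this work was done with spss.

```python
def average_ranks(values):
    """Rank values in ascending order (1-based); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sums(groups):
    """Pool all groups, rank the pooled values, return one rank sum per group."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    sums, start = [], 0
    for group in groups:
        sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)
    return sums


def mann_whitney_u(x, y):
    """U statistics of both groups, eqs. (1) and (2)."""
    nx, ny = len(x), len(y)
    rx, ry = rank_sums([x, y])
    ux = nx * ny + nx * (nx + 1) / 2 - rx
    uy = nx * ny + ny * (ny + 1) / 2 - ry
    return ux, uy


def kruskal_wallis_statistic(groups):
    """Test statistic of eq. (3), without any correction for ties."""
    n = sum(len(group) for group in groups)
    sums = rank_sums(groups)
    weighted = sum(r * r / len(g) for r, g in zip(sums, groups))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


if __name__ == "__main__":
    # made-up monthly values for two locations, for illustration only
    low, high = [1, 2, 3], [4, 5, 6]
    print(mann_whitney_u(low, high))
    print(kruskal_wallis_statistic([low, high]))
```

a useful check is that the two u values always add up to the product of the two sample sizes, which follows directly from eqs. (1) and (2).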
table 3 yearly solar energy production (kwh)

| type of the pv technology | freiburg (fr) | graz (gr) | maribor (mb) | banja luka (bl) | niš (ni) | athens (at) |
|---|---|---|---|---|---|---|
| fixed free standing | 5316.05 | 5722.54 | 5851.62 | 5575.21 | 6302.62 | 8282.53 |
| fixed building integrated | 5128.36 | 5514.94 | 5640.08 | 5366.03 | 6051.62 | 7952.83 |
| inclined | 6661.49 | 7246.43 | 7541.29 | 7240.20 | 8216.33 | 11224.55 |
| two-axis | 6813.87 | 7421.63 | 7725.29 | 7417.43 | 8415.07 | 11550.93 |

in testing the first hypothesis (h01), statistically significant differences between the cities were found for fixed free-standing photovoltaic systems (p = .044) and for fixed building-integrated photovoltaic systems (p = .043). for both types of fixed photovoltaic systems, high statistically significant differences in monthly solar energy production were obtained between freiburg and athens (p = .009), between banja luka and athens (p = .009), between maribor and athens (p = .021), and between graz and athens (p = .018). a high statistically significant difference was also obtained between niš and athens (p = .0496 for the fixed free-standing solar power plant, p = .043 for the fixed building-integrated solar power plant). for the inclined photovoltaic systems (h02), statistically significant differences were obtained between maribor and athens (p = .028), between freiburg and athens (p = .011), between graz and athens (p = .021), and between banja luka and athens (p = .018). in testing the third hypothesis (h03), a high statistically significant difference in monthly solar energy production was found between freiburg and athens (p = .009). statistically significant differences were obtained between maribor and athens (p = .028), between graz and athens (p = .021), and between banja luka and athens (p = .015). the results of testing h03 are presented in table 4.
table 4 results of testing of third hypothesis, monthly solar energy production by two-axis solar power plant between the cities

| cities | fixed – free standing | fixed – building integrated | inclined | two-axis |
|---|---|---|---|---|
| all | .044† | .043† | .064† | .062† |
| mb & fr | .273‡ | .273‡ | .299‡ | .299‡ |
| mb & gr | .644‡ | .644‡ | .644‡ | .644‡ |
| mb & bl | .773‡ | .773‡ | .817‡ | .817‡ |
| mb & ni | .564‡ | .603‡ | .644‡ | .603‡ |
| mb & at | .021‡ | .021‡ | .028‡ | .028‡ |
| fr & gr | .326‡ | .326‡ | .419‡ | .419‡ |
| fr & bl | .686‡ | .686‡ | .525‡ | .564‡ |
| fr & ni | .248‡ | .248‡ | .225‡ | .273‡ |
| fr & at | .009‡ | .009‡ | .011‡ | .009‡ |
| gr & bl | .954‡ | .954‡ | 1.000‡ | .954‡ |
| gr & ni | .488‡ | .488‡ | .525‡ | .488‡ |
| gr & at | .018‡ | .018‡ | .021‡ | .021‡ |
| bl & ni | .386‡ | .386‡ | .419‡ | .453‡ |
| bl & at | .009‡ | .009‡ | .018‡ | .015‡ |
| ni & at | .0496‡ | .043‡ | .065‡ | .065‡ |

† kruskal-wallis test, ‡ mann-whitney test

finally, for the fourth hypothesis (h04), a high statistically significant difference (p = .000) was obtained when fixed building-integrated, inclined, and two-axis solar power plants were compared with each other. only in athens is there a statistically significant difference (p = .029) in monthly solar energy production between fixed building-integrated, inclined, and two-axis solar power plants. in all the other cities, there is no statistically significant difference when those three systems are compared with each other. the results of testing the fourth hypothesis are presented in table 5.

table 5 monthly energy production comparison between all types of installed solar power plants

| location | fixed – free standing & fixed – building integrated | inclined & two-axis | fixed – building integrated & inclined & two-axis |
|---|---|---|---|
| all | .415‡ | .655‡ | .000† |
| fr | .488‡ | .603‡ | .140† |
| gr | .419‡ | .686‡ | .135† |
| mb | .525‡ | .644‡ | .150† |
| bl | .644‡ | .686‡ | .143† |
| ni | .644‡ | .729‡ | .166† |
| at | .564‡ | .686‡ | .029† |

† kruskal-wallis test, ‡ mann-whitney test

in the following, the payback time for an installed fixed building-integrated photovoltaic system (5 kwp) is calculated.
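the payback calculation used in this work is a single division: the investment cost is divided by the amount that a typical household pays for electricity in one year. a minimal sketch is given below. the installation price (1 000 €/kwp), the system size (5 kw), the yearly demand (6 000 kwh) and the country-average electricity prices for march 2021 are the values quoted with table 6; the function and variable names are hypothetical.

```python
SYSTEM_SIZE_KWP = 5            # installed pv power
INSTALLATION_EUR_PER_KWP = 1000  # approximate 'key in hand' price
YEARLY_DEMAND_KWH = 6000       # typical household with four members

# country-average household electricity prices, march 2021 (EUR/kWh)
ELECTRICITY_PRICE = {
    "freiburg": 0.32,
    "graz": 0.21,
    "maribor": 0.18,
    "banja luka": 0.092,
    "nis": 0.080,
    "athens": 0.190,
}


def payback_years(price_eur_per_kwh):
    """Investment cost divided by the yearly electricity bill."""
    investment = SYSTEM_SIZE_KWP * INSTALLATION_EUR_PER_KWP
    yearly_bill = YEARLY_DEMAND_KWH * price_eur_per_kwh
    return investment / yearly_bill


if __name__ == "__main__":
    for city, price in ELECTRICITY_PRICE.items():
        print(f"{city}: {payback_years(price):.2f} years")
```

the computed values agree with the payback times listed in table 6 to within 0.01 years. note that this simple estimate depends only on the electricity price and not on the specific electricity production of the location, which is why freiburg, with the lowest yield but the highest price, has the shortest payback time.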
information about the annual incident solar energy (at the optimal angle), the specific yearly electricity production, the price of the photovoltaic installation, and the electricity price for a typical household (four members, yearly electricity demand of 6 000 kwh) is also shown in table 6.

table 6 calculation of payback time for installed photovoltaic system, 5 kw, for one typical household with annual electricity demand of 6 000 kwh

| location | yearly incident solar energy under optimal angle (kwh/m2) [27] | specific yearly electricity production (kwh/kwp) | electricity cost that one household pays in one year (€) (4 members, demand 6 000 kwh), country’s average for march 2021* | payback time for installed photovoltaic system, with power 5 kw (years) |
|---|---|---|---|---|
| freiburg | 1331.72 | 992 | 1 920 | 2.60 |
| graz | 1442.69 | 1145 | 1 260 | 3.97 |
| maribor | 1472.76 | 1167 | 1 080 | 4.63 |
| banja luka | 1433.46 | 1096 | 552 | 9.06 |
| niš | 1662.62 | 1239 | 480 | 10.41 |
| athens | 2108.31 | 1557 | 1 140 | 4.38 |

installation prices for a ‘key in hand’ photovoltaic system are approximately the same in the selected cities (1 000 €/kwp), because of the bounded components. this relates to systems with a power of up to 10 kw, which are mostly used in households for their own energy consumption.
* country’s average electricity price as of march 2021, according to [38]: germany 0.32 €/kwh, austria 0.21 €/kwh, slovenia 0.18 €/kwh, bosnia and herzegovina 0.092 €/kwh, serbia 0.080 €/kwh, and greece 0.190 €/kwh.

the investment payback time is the shortest in the countries where the electricity price is the highest. the payback time is calculated by dividing the investment costs by the amount that one household pays for electricity in one year.

4. conclusion

based on the presented research, the following conclusions can be made:
i. germany has the largest solar energy capacity and solar energy production;
ii. between freiburg and athens, banja luka and athens, maribor and athens, graz and athens, and niš and athens, there is a high statistically significant difference in the energy production of fixed free-standing and fixed building-integrated photovoltaic systems;
iii. for the inclined photovoltaic system, statistically significant differences were obtained between the following cities: maribor and athens, freiburg and athens, graz and athens, and banja luka and athens;
iv. for the amount of energy produced by the two-axis solar power plant, the following results were obtained: a high statistically significant difference between freiburg and athens, and statistically significant differences between maribor and athens, graz and athens, and banja luka and athens;
v. in athens, there is a statistically significant difference in monthly solar energy production between three types of solar power plants (fixed building-integrated, inclined, and two-axis solar power plants), and
vi. germany has the highest electricity price, and serbia the lowest. accordingly, the payback time for an installed photovoltaic system of 5 kw is the shortest in germany and the longest in serbia.

references

[1] t. m. pavlović, d. lj. mirjanić, i. s. radonjić, l. s. pantić and g. i. sazhko, "solar radiation atlas in banja luka in the republic of srpska", contemporary materials, vol. 12, no. 1, pp. 39–49, 2021.
[2] h. rohracher and p. späth, "the interplay of urban energy policy and socio-technical transitions: the eco-cities of graz and freiburg in retrospect", urban studies, vol. 51, no. 7, pp. 1415–1431, 2014.
[3] green city: freiburg, germany. (n.d.). https://www.greencitytimes.com/freiburg/, visited on february 19, 2022.
[4] a. thomas, freiburg solar region. https://wwf.panda.org/wwf_news/?204419/freiburg-green-city, visited on february 19, 2022.
[5] s. fastenrath and b. braun, "sustainability transition pathways in the building sector: energy-efficient building in freiburg (germany)", applied geography, vol. 90, no. 1, pp. 339–349, 2018.
[6] j. fälchle and photolia de. n.d. ‘energy innovation austria 4/2016’, 16.
[7] s. seme, k. sredenšek and z. praunseis, "smart grids and net metering for photovoltaic systems", in proceedings of the ieee international conference on modern electrical and energy systems (mees), kremenchuk, 2017, pp. 188–191.
[8] s. seme, k. sredenšek, b. štumberger and m. hadžiselimović, "analysis of the performance of photovoltaic systems in slovenia", solar energy, vol. 180, pp. 550–558, 2019.
[9] p. virtič and r. kovačič lukman, "a photovoltaic net metering system and its environmental performance: a case study from slovenia", j. clean. prod., vol. 212, pp. 334–342, 2019.
[10] "dravske elektrarne maribor obtains building permit for first part of solar park on canals of the zlatoličje and formin hydro power plants", hse. retrieved 19 february 2022 (https://www.hse.si/en/dravske-elektrarne-maribor-obtains-building-permit-for-first-part-of-solar-park-on-canals-of-the-zlatolicje-and-formin-hydro-power-plants/).
[11] t. pavlović and d. lj. mirjanić, solar energy and lighting in the republic of srpska. in the sun and photovoltaic technologies (pp. 383–411). springer international publishing.
[12] t. m. pavlović, d. d. milosavljević, d. mirjanić, l. s. pantić, i. s. radonjić and d. pirsl, "assessments and perspectives of pv solar power engineering in the republic of srpska (bosnia and herzegovina)", renew. sust. energy rev., vol. 18, pp. 119–133, 2013.
[13] energy strategy of republic of srpska up to 2030, banja luka, https://www.vladars.net/eng/vlada/ministries/miem/documents/energy%20strategy%20of%20the%20republic%20of%20srpska%20up%20to%202030_459254634.pdf, visited on february 9, 2022.
[14] t. pavlović, i. radonjić, d. milosavljević, l. pantić and d. pirsl, "assessment and potential use of concentrating solar power plants in serbia and republic of srpska", thermal sci., vol. 16, no. 3, pp. 931–945, 2012.
[15] t. pavlović, d. milosavljević, d. mirjanić, l. pantić and d. pirsl, "assesment of the possibilities of building integrated pv systems of 1 kw electricity generation in banja luka", contemporary materials, vol. 2, no. 3, pp. 167–176, 2013.
[16] d. d. milosavljević, t. m. pavlović, d. lj. mirjanić and d. divnić, "photovoltaic solar plants in the republic of srpska current state and perspectives", renew. sust. energy rev., vol. 62, pp. 546–560, 2016.
[17] t. m. pavlović, y. tripanagnostopoulos, d. lj. mirjanić and d. d. milosavljević, "solar energy in serbia, greece and the republic of srpska", academy of sciences and arts of the republic of srpska, 2015.
[18] m. golusin, z. tesić, and a. ostojić, "the analysis of the renewable energy production sector in serbia", renew. sust. energy rev., vol. 14, no. 5, pp. 1477–1483, 2010.
[19] l. pantić, t. pavlović and d. milosavljević, "a practical field study of performances of solar modules at various positions in serbia", thermal sci., vol. 19, pp. 511–523, 2015.
[20] t. pavlović, d. milosavljević, m. lambić, v. stefanović, d. mančić and d. piršl, "solar energy in serbia", contemporary materials, vol. 2, no. 2, pp. 204–20, 2011.
[21] s. prvulović, d. tolmac, m. matić, lj. radovanović, and m. lambić, "some aspects of the use of solar energy in serbia", energy sources, part b: econ. plan. policy, vol. 13, no. 4, pp. 237–245.
[22] a. nikas, v. stavrakas, a. arsenopoulos, h. doukas, m. antosiewicz, j. witajewski-baltvilks and a. flamos, "barriers to and consequences of a solar-based energy transition in greece", environ. innov. soc. transit., vol. 35, pp. 383–399, 2020.
[23] a. a. argiriou and s. mirasgedis, "the solar thermal market in greece – review and perspectives", renew. sust. energy rev., vol. 7, no. 5, pp. 397–418, 2003.
[24] e. bellos and c. tzivanidis, "solar concentrating systems and applications in greece – a critical review", j. clean. prod., vol. 272, p. 122855, 2020.
[25] irena, renewable energy statistics, the international renewable energy agency, abu dhabi, (2021) 43.
[26] l. pantić, t. pavlović, d. milosavljević, d. mirjanić, i. radonjić and m. radovic, "electrical energy generation with differently oriented photovoltaic modules as façade elements", thermal sci., vol. 20, no. 4, pp. 1377–1386, 2016.
[27] t. pavlović, d. milosavljević and d. pirsl, "simulation of photovoltaic systems electricity generation using homer software in specific locations in serbia", thermal sci., vol. 17, no. 2, pp. 333–347, 2013.
[28] photovoltaic geographical information system, https://re.jrc.ec.europa.eu/pvg_tools/en/tools.html, visited on december 10, 2021.
[29] k. cieslak and p. dragan, "comparison of the existing photovoltaic power plant performance simulation in terms of different sources of meteorological data", edited by l. lichołai, b. dębska, p. miąsik, j. szyszka, j. krasoń, and a. szalacha. e3s web of conferences, 2018, vol. 49, 00015.
[30] european commission, eu science hub pvgis data sources and calculation methods, https://joint-research-centre.ec.europa.eu/pvgis-photovoltaic-geographical-information-system/getting-started-pvgis/pvgis-data-sources-calculation-methods_en, visited on march 1, 2022.
[31] t. muneer, "solar radiation model for europe", build. serv. eng. res. technol., vol. 11, no. 4, pp. 153–163, 1990.
[32] s. jakšić and s. maksimović, verovatnoća i statistika: teorijske osnove i rešeni primeri, arhitektonsko-građevinsko-geodetski fakultet, banja luka, 2020.
[33] w. navidi, statistics for engineers and scientists. new york: mcgraw-hill, 2011.
[34] n. nachar, "the mann-whitney u: a test for assessing whether two independent samples come from the same distribution", tutor. quant. methods psychol., vol. 4, no. 1, pp. 13–20, 2008.
[35] p. e. mckight and j. najab, "kruskal-wallis test" in the corsini encyclopedia of psychology, edited by i. b. weiner and w. e. craighead. hoboken, nj, usa: john wiley & sons, inc.
[36] m. lovrić, j. komić and s. stević, statistička analiza: metodi i primjena, 2. izmijenjeno i dopunjeno izdanje. narodna i univerzitetska biblioteka republike srpske, banja luka, 2017.
[37] t. dahiru, "p-value, a true test of statistical significance? a cautionary note", annals of ibadan postgraduate medicine, vol. 6, no. 1, pp. 21–26, 2011.
[38] global petrol prices, https://www.globalpetrolprices.com/electricity_prices/, visited on december 20, 2021.

facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 611-625 doi: 10.2298/fuee1704611v

automatic optimized document skew pre-processor for character segmentation algorithm

vladan vučković, boban arizanović
university of niš, faculty of electronic engineering, niš, serbia

abstract. in this paper, as a part of a character segmentation algorithm, an automatic optimized document skew correction approach based on the hough transform is presented. the importance of skew correction in document image analysis lies in the fact that further processing is impossible if the document image is skewed. the proposed approach is based on a fast implementation of the standard hough transform, which is followed by a highly optimized low-level machine code implementation of the image rotation.
To achieve high computational performance, a linear image representation is used. The results of the proposed approach are analyzed in terms of time complexity and skew estimation accuracy and compared with existing skew correction approaches. The proposed approach gives better results than the analogous approach used in related work, but worse results than the optimized version which exploits a BAG algorithm. The results show a significant improvement over the standard Hough transform implementation.

Key words: skew correction, Hough transform, character segmentation, spatial transformations, rotation, fast algorithm

Received December 18, 2016; received in revised form February 2, 2017
Corresponding author: Vladan Vučković, University of Niš, Faculty of Electronic Engineering, Computer Department, P.O. Box 73, 18000 Niš, Serbia (e-mail: vladanvuckovic24@gmail.com)

1. Introduction

As one of the pre-processing techniques used in document image processing, skew correction takes place at a very early stage, since further processing is highly affected by document orientation. In general, image processing methods [1] which are suitable for information extraction assume a de-skewed image as input. In the concrete case of character segmentation, skew correction is an essential part of document image pre-processing, and further processing would be impossible without it.

A variety of approaches for document image skew estimation and correction has been presented in the past. Horizontal projection profiles, distribution of feature locations, the Hough transform, and distribution of responses from local directionally sensitive masks are four broad classes of document skew estimation [2], [3]. Methods for the evaluation of document skew estimation techniques are of particular interest to researchers [4], [5]. The Hough transform is the most frequently used technique for skew estimation [6]-[8]. Most skew estimation and correction approaches are based on modifications of the Hough transform [9]-[11]. One recent work proposes a method based on Hough space derivatives in order to identify directions with sudden changes in their projection profiles [12]. Some approaches based on the Hough transform do not use a voting scheme, or use modifications of it [13], [14]. A Hough transform which exploits the cross-correlation property between the pixels in vertical lines is proposed in [15]. Many recent works based on the Hough transform focus on improving the processing time [16], [17]. One approach for fast skew correction applies the block adjacency graph (BAG) algorithm before the Hough transform [18]. Modification of the Hough transform voting scheme can also be efficient [19]. Approaches for improving the processing time of the Hough transform are usually based on determining bounding boxes which represent the edges of important image elements. In the case of document images, if characters do not have the same height, this idea is not applicable, and other approaches based on the bounding boxes of connected components are used [14].

Besides the Hough transform, other approaches to skew correction are exploited. Mathematical morphology has proved to be useful for skew correction and can operate on both binary and grayscale images [20]. Other skew correction methods are based on correlation functions and geometric text-line models [21], [22]. A Radon transform based projection profile technique is used for skew correction of handwritten words [23]. Another work is based on robust borderlines which are extracted using a run-length based method [24]. A straight-line fitting based method can also be used for skew detection and correction [25].
Unlike the previously mentioned approaches, which work in the spatial domain, a frequency-domain approach based on the Fourier transform and KNN clustering can also be exploited [26]. The Hough transform is also used for other purposes, such as video processing [27], [28] and face recognition [29].

The proposed approach for automatic skew correction of document images is intended for use in the pre-processing stage of a character segmentation system. The character segmentation system was initially designed for machine-typed documents, but can be used for machine-printed documents as well. The focus of this approach is a fast implementation of the standard Hough transform using pointer arithmetic, followed by a highly optimized low-level machine code implementation of the image rotation, which exploits an efficient generalized architecture for geometrical image transformations. To achieve a fast implementation, a linear image representation is used. The proposed approach is compared with the classical implementation of the Hough transform and with the approach which uses the BAG algorithm as a pre-processing stage of the Hough transform [18]. In terms of time complexity, the proposed approach uses a rotation algorithm which gives better results than all rotation algorithms presented in [18]. For the whole skew correction approach, the proposed approach performs approximately 2.5 times faster than the classical implementation. Compared with the version which exploits the BAG algorithm, the proposed approach gives worse results, but the proposed ultra-fast architecture, in combination with pointer arithmetic and the highly optimized low-level machine code implementation, could also be combined with the BAG algorithm to provide even better results than the existing ones. In terms of skew estimation accuracy, the proposed approach gives results as good as the existing ones.
To show the importance of the proposed approach for the character segmentation system, and how skew correction affects the subsequent character segmentation process, Nikola Tesla’s documents from the “Nikola Tesla Museum” in Belgrade are used [30].

This paper is organized as follows. In Section 2 the theoretical background of the Hough transform is provided. In Section 3 the flowchart of the complete character segmentation system is given and the ultra-fast generalized image transformation approach used in the proposed approach is presented. Section 4 offers implementation details for the main parts of the proposed approach, including the ultra-fast architecture which is used. Section 5 provides experimental results for the proposed approach and compares it with existing skew correction approaches. This section also shows how the skew correction pre-processing algorithm affects the character segmentation results. In Section 6 a summary of the performance of the proposed approach is given.

2. Theoretical Background

The central task of the proposed approach for document image skew correction is the determination of the rotation angle, in order to fix the document skew and make the document suitable for the character segmentation process. For this purpose, the standard Hough transform is used [6]-[8]. This part of the skew correction approach detects the rotation angle in the range from -90 to 90 degrees. The Hough transform was primarily intended for straight line detection, and was later generalized for the detection of arbitrary shapes [7]. Our focus is on line detection, more precisely on detection of the angle between the x-axis and the line which passes through the most pixels in the image.
Taking document images into consideration, this angle represents the rotation angle of the document. The key to the Hough transform is the representation of a line. In two-dimensional Euclidean space, lines can be represented using the slope and intercept as follows:

$$y = kx + n \tag{1}$$

However, a line is not defined by x and y coordinates. Complete information about a line is given by the parameters k and n, so a line can be represented by the pair (k, n). The problem with this representation is that it cannot represent vertical lines, since a vertical line makes an angle of 90 degrees with the x-axis and tan 90° is infinite. For that reason, an alternative line representation using the parameters r and θ is used. In this representation, r is the vector from the origin perpendicular to the given line, whose length is the distance between the origin and the line, and θ is the angle between the vector r and the x-axis. The relation between these parameters is described by the following equations:

$$r = x\cos(\theta) + y\sin(\theta) \tag{2}$$

$$y = -\frac{\cos(\theta)}{\sin(\theta)}\,x + \frac{r}{\sin(\theta)} \tag{3}$$

for θ ∈ [0, 180] and r ∈ ℝ. Therefore, the Hough transform uses the Hough space, where each line is represented as a point by a pair (r, θ). The mapping from the Euclidean to the Hough space is shown in Fig. 1 (a) and (b).

Fig. 1 Mapping of a line in Euclidean space to a point in Hough space: (a) line in Euclidean space and (b) corresponding point in Hough space

The basic idea of the Hough transform is to represent, for each black pixel in the binary image, all lines that can pass through that pixel. As shown above, each line is represented as a point in the Hough space; therefore, all lines passing through a given pixel are represented by a sine curve. An illustration of this process is shown in Fig. 2, where the spectrum of lines passing through a single point in the Euclidean space is shown in (a), and line spectrums for different points in the Hough space are shown in (b). All points from different curves represent lines which pass through at least one black pixel in the binary image. If points from different curves have the same values of r and θ, those points represent the same line, and the number of pixels that lie on that line is equal to the number of points with these coordinates. As the output of the algorithm, the θ coordinate of the line with the most black pixels is taken. This value of θ represents the angle of rotation and is used for the image rotation.

Fig. 2 Representation of the line spectrum: (a) for one point in the Euclidean space and (b) for different points in the Hough space

An important part of the Hough transform is the voting matrix, which is used to count the lines with the same parameters (r, θ). Another term for the voting matrix is the accumulator. In the general case, the number of dimensions of the voting matrix depends on the shape we want to detect and is equal to the number of unknown parameters. In the case of lines, the number of unknown parameters is 2, therefore a matrix is used. The dimensions of the voting matrix may vary and they define the precision of the calculations. In our case, the precision is 1 pixel for r and 1 degree for θ. An illustration of the voting matrix is shown in Fig. 3.

Fig. 3 Representation of the voting matrix

The pseudo-code for the Hough transform is as follows:

    Input: image f
    Output: angle a

    rmax = sqrt(imagewidth*imagewidth + imageheight*imageheight) + 1
    theta = 180
    fill voting matrix v[rmax][theta + 1] with zeros
    for each black pixel f(x, y) do
        for k = 0 to theta do
            angle = k * pi / 180
            r = trunc(x*sin(angle) + y*cos(angle))
            if r ≥ 0 and r < rmax then
                v[r][k] = v[r][k] + 1
            end if
        end for
    end for
    a = find-max-pos-y(v)
    a = a - 90
    return a

3. Proposed Approach

The proposed optimized automatic skew correction approach extends the character segmentation algorithm for machine-typed documents. The algorithm is extended with an optimized skew correction pre-processing step in order to fix potential document skew. The skew correction approach used here exploits a fast implementation of the standard Hough transform, used for determining the angle of document skew, and a highly optimized machine code implementation of the image rotation. Efficient image rotation is achieved using the ultra-fast generalized architecture for geometrical image transformations. To achieve a fast implementation of the proposed approach, a linear image representation is used. The flowchart of the complete character segmentation system is shown in Fig. 4.

Fig. 4 Flowchart of the complete character segmentation system

The rotation transformation is achieved using the ultra-fast architecture for geometrical image transformations. This architecture is generalized and can also be used for other spatial transformations. The architecture scheme is shown in Fig. 5.

Fig. 5 Ultra-fast architecture for image transformation

The architecture used for image rotation is based on mapping offsets which represent the transformation matrix for the chosen transformation.
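To make the idea of mapping offsets concrete, the following is a minimal Python sketch, not the authors' Pascal and assembly implementation. It precomputes, for every destination index of a linear (one-dimensional) image, the linear offset of the source pixel, and then applies the table in a single pass. The function names, the rotation about the image center, the use of rounding, and the background value 255 are assumptions made for illustration only.

```python
import math

WHITE = 255  # assumed background value for destinations with no source pixel


def build_rotation_offsets(width, height, angle_deg):
    """Precompute one source offset per destination pixel (-1 if none)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    # Assumption: rotate about the geometric center of the image.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    offsets = []
    for row in range(height):
        for col in range(width):
            dx, dy = col - cx, row - cy
            src_col = int(round(dx * c + dy * s + cx))
            src_row = int(round(-dx * s + dy * c + cy))
            if 0 <= src_row < height and 0 <= src_col < width:
                # Linear offset relative to the first element of the image.
                offsets.append(src_row * width + src_col)
            else:
                offsets.append(-1)
    return offsets


def apply_offsets(image, offsets, background=WHITE):
    """Apply a precomputed offset table to a linear image."""
    return [image[o] if o >= 0 else background for o in offsets]


if __name__ == "__main__":
    img = list(range(9))  # a 3x3 image stored as a 1-D list
    table = build_rotation_offsets(3, 3, 90)
    print(apply_offsets(img, table))
```

Once the table is built, applying the transformation costs one lookup per pixel regardless of how expensive the coordinate calculation was, which is the property the architecture relies on. The price is one table entry per pixel for every precomputed angle.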
The transformation matrix is a matrix of offsets, where each offset is given relative to the first element of the input matrix. Using the mapping offsets, each input position is mapped to a specific output position. In practice, this architecture ensures that each pixel in the input image can be mapped to a pre-computed position in the output image. The mapping offsets for the chosen transformation with specific parameters are calculated at the start, and whenever the transformation needs to be applied, the already calculated mapping offsets are used. In this case, the mapping offsets are calculated using the standard rotation transformation pair:

$$s_x = x\cos(\theta) + y\sin(\theta) \tag{4}$$

$$s_y = -x\sin(\theta) + y\cos(\theta) \tag{5}$$

where the angle θ is the previously estimated document skew angle. Since a set of mapping offsets is valid only for a given rotation angle, mapping offsets have to be calculated for a number of different angles, so that different mapping offsets can be combined to achieve rotation by the desired angle. This approach proved to be very efficient and does not depend on the type of image transformation. The drawback of the presented ultra-fast architecture for geometrical image transformations is its memory usage. In order to achieve the fastest possible computational performance, the mapping offsets and other support lookup tables are loaded at the start and kept in memory all the time.

4. Fast Implementation

The focus of this paper is the fast implementation of the automatic optimized skew correction approach. To achieve a fast implementation, the image is represented as a one-dimensional array. This linear image representation is shown in Fig. 6. It provides direct memory access to the pixel intensity values using pointer arithmetic. Furthermore, the image rotation is implemented in highly optimized low-level machine code.

Fig. 6 Linear image representation

The implementation of both parts of the skew correction approach exploits support lookup tables for the values of trigonometric functions. This is a common way to avoid repeated calculations with the same parameters inside big loops. The Pascal implementation of the ultra-fast image transformation architecture adapted for image rotation is shown in the following listing:

    for j := 0 to n do {number of transformation arrays}
    begin
      setlength(r[j], count); {set the length of each transformation array to the number of pixels}
      s := dptr^; {get value for sin from lookup table}
      inc(dptr); {increment pointer}
      c := dptr^; {get value for cos from lookup table}
      inc(dptr); {increment pointer}
      rptr := @r[j, 0]; {pointer to current transformation array}
      x := trunc(htemp * c + wtemp * s); {determine the x coordinate of the image center after rotation}
      y := trunc(-htemp * s + wtemp * c); {determine the y coordinate of the image center after rotation}
      offsetx := wtemp - y; {determine the offset from image center for x coordinate}
      offsety := htemp - x; {determine the offset from image center for y coordinate}
      imagestartptr := @image[0]; {source image pointer}
      ptr1 := @posmap[0]; {pointer to support lookup table}
      ptr2 := @posmap[1]; {pointer to support lookup table}
      for i := 0 to count - 1 do
      begin
        x := trunc(ptr1^ * c + ptr2^ * s) + offsety; {determine the x coordinate after rotation}
        y := trunc(-ptr1^ * s + ptr2^ * c) + offsetx; {determine the y coordinate after rotation}
        if (x >= 0) and (x < height) and (y >= 0) and (y < width) then {if coordinates are valid}
          rptr^ := x * width + y {store offset to transformation array}
        else
          rptr^ := -1; {store -1}
        inc(rptr); {increment pointer}
        inc(ptr1, 2); {increment pointer}
        inc(ptr2, 2); {increment pointer}
      end;
    end;

The following subsections provide implementation details for the main parts of the proposed approach.

4.1. Hough transform implementation

To achieve a fast implementation of the Hough transform, pointer arithmetic is used. For both the linear and the matrix representation of the image, the classical implementation is not suitable due to slow indexed access to the array elements. Using the linear image representation combined with pointer arithmetic, direct memory access is achieved. The following code is the Pascal implementation of the standard Hough transform based on pointer arithmetic, which is capable of determining skew angles in the range from -90 to 90 degrees. It should be mentioned that, by using different parameters in the following implementation, it is possible to estimate angles in different ranges, which is exploited for obtaining the experimental results.

    dmax := trunc(sqrt(width * width + height * height)) + 1; {maximal allowed line length}
    teta := 90; {angle which defines the range of estimation angle}
    ptr1 := @imagebinary[0]; {pointer to binary image which is being processed}
    ptr2 := @posmap[0]; {pointer to support lookup table}
    ptr3 := @votingmatrix[0]; {pointer to voting matrix}
    for i := 0 to count - 1 do {main loop}
    begin
      if ptr1^ = 1 then {is it black pixel?}
      begin
        j := ptr2^; {get x value from linear offset lookup table}
        inc(ptr2); {increment pointer}
        l := ptr2^; {get y value from linear offset lookup table}
        inc(ptr2); {increment pointer}
        ptr5 := @sincos[0]; {pointer to lookup table of trigonometric values sin}
        ptr6 := @sincos[1]; {pointer to lookup table of trigonometric values cos}
        for k := 0 to teta do {loop through all angles}
        begin
          d := trunc(j * ptr5^ + l * ptr6^); {determine the line length for current parameters}
          if (d >= 0) and (d < dmax) then {is it in range?}
          begin
            ptr4 := ptr3; {get the starting pointer of voting matrix}
            inc(ptr4, d * teta + k); {increment the pointer by determined offset}
            ptr4^ := ptr4^ + 1; {increment the voting matrix value}
          end;
          inc(ptr5); {increment pointer}
          inc(ptr6); {increment pointer}
        end;
      end
      else
        inc(ptr2, 2); {increment pointer}
      inc(ptr1); {increment pointer}
    end;
    result := maxind(votingmatrix) - 90; {final estimated angle}

4.2. Rotation implementation

Image rotation is performed using the previously described ultra-fast architecture for geometrical image transformations. This approach proved to be very efficient and performs almost 50 times faster than the standard approach for image rotation. In order to achieve the highest computational performance, a highly optimized low-level machine code implementation of the image rotation is used. The following listing shows the machine routine for image rotation:

    asm
      pushad                 {push all registers to stack}
      mov ecx,count          {number of pixels to process}
      mov esi,rptr           {pointer to r transformation array}
      mov ebx,imagesrcptr    {source image pointer}
      mov edi,imagedstptr    {destination image pointer}
    @main:                   {main loop}
      lodsd                  {load current offset from r transformation array}
      mov edx,eax            {save current offset}
      or eax,eax             {is it -1?}
      js @init               {if true, jump to label init}
      shl edx,2              {offset * 4}
      mov eax,[edx+ebx]      {calculate final offset and load value from source to eax}
      stosd                  {store loaded value from eax to destination}
      dec ecx                {decrement counter}
      jnz @main              {if not zero, loop again through ecx}
      jmp @ex                {else, jump to ex label}
    @init:                   {label init}
      mov eax,white_color    {store white color definition to eax}
      stosd                  {store value from eax to destination}
      dec ecx                {decrement counter}
      jnz @main              {if not zero, loop again through ecx}
    @ex:                     {label ex}
      popad                  {pop all registers from stack}
    end;

5. Experiments

Testing of the proposed optimized skew correction approach is performed on a PC with an AMD quad-core processor running at 3.1 GHz and 4 GB of RAM. The performance of the proposed approach in terms of time complexity is analyzed and compared with related work.
Besides the time complexity, skew estimation accuracy results are also provided. To demonstrate the performance of the proposed approach and the importance of skew correction for the character segmentation process, Nikola Tesla’s documents from the “Nikola Tesla Museum” in Belgrade are used. The performance of the proposed approach is compared with the results provided in [18]. Table 1 shows a comparison of rotation processing time for the proposed approach and the algorithms analyzed in [18].

Table 1 Comparison of processing time for image rotation (processing time in ms; for the algorithms of C. Singh et al. [18], each cell lists the two values reported for that algorithm)

| Image dimensions (px) | Float rotation [18] | Integer rotation [18] | Fast implementation [18] | Bresenham’s line-like algorithm [18] | Proposed approach |
| 1249x1249 | 15 / 172 | 15 / 78 | 15 / 78 | 16 / 31 | 6 |
| 4148x4068 | 218 / 1875 | 156 / 844 | 157 / 841 | 235 / 313 | 64 |

The results show the efficiency of the rotation algorithm proposed as a part of the skew correction approach. C. Singh et al. provided results for different algorithms, including results for forward and inverse rotation for each analyzed rotation algorithm. An important fact is that the proposed rotation approach does not depend on the complexity of the calculations, so both forward and inverse rotation are performed with the same processing time. Taking this into consideration, the results show that the proposed approach gives better results than any of the rotation algorithms analyzed in [18].

Besides the processing time of the rotation algorithm, other important aspects of skew correction approaches are the Hough transform processing time and the skew estimation accuracy. C. Singh et al. used the BAG algorithm in the process of skew estimation and compared the overall processing time with the standard approach which does not use the BAG algorithm. Table 2 shows the comparison of the proposed approach results with results provided in [18].
The results given in Table 2 show that, in terms of time complexity, the proposed approach gives better results than the classical implementation of the Hough transform and worse results than the implementation which exploits the BAG algorithm. Considering that the proposed approach performs approximately 2.5 times faster than the classical implementation, this is a significant improvement. Also, since the proposed approach exploits the ultra-fast architecture for image transformation, the BAG algorithm could be used in combination with the proposed approach to provide even faster results than those already presented. The results in Table 1 and Table 2 are obtained using document images with the same characteristics as the images used in related work [18]. As for the importance of skew correction for the character segmentation system, the character segmentation system behind the proposed skew correction approach proved to be very sensitive to document skew. Document images skewed by random angles are used for obtaining the results. They show that even a small skew angle above 2° considerably lowers the percentage of successfully segmented characters.

Table 2 Comparison of overall processing time and skew estimation accuracy

| Image dimensions (px) | % of black pixels | Skew angle | C. Singh et al. processing time (ms), BAG | C. Singh et al. processing time (ms), no BAG | C. Singh et al. difference in skew (BAG) | C. Singh et al. difference in skew (no BAG) | Proposed approach processing time (ms) | Proposed approach difference in skew |
| 4168x4088 | 7.16 | 0 | 593 | 3875 | 0.00 | 0.000 | 1548 | 0.00 |
| 4308x4231 | 6.6 | 2 | 610 | 3969 | 0.06 | -0.855 | 1589 | -0.35 |
| 4508x4436 | 6.1 | 5 | 704 | 3968 | 0.079 | 0.079 | 1585 | -0.15 |
| 4815x4750 | 5.3 | 10 | 797 | 4025 | 0.08 | 0.08 | 1612 | -0.10 |
| 4981x4921 | 4.9 | 13 | 860 | 4079 | -0.48 | -0.65 | 1631 | -0.25 |
| 5272x5222 | 4.42 | 19 | 953 | 4188 | -0.19 | 0.195 | 1654 | 0.12 |
| 5654x5624 | 3.8 | 30 | 1046 | 4234 | 0.02 | -0.26 | 1687 | 0.20 |
| 5838x5838 | 3.57 | 45 | 1234 | 4484 | 0.00 | 0.00 | 1711 | 0.00 |
| 4088x4168 | 7.1 | 90 | 522 | 3781 | 0.00 | 0.00 | 1536 | 0.00 |
| 5267x5315 | 4.3 | 110 | 985 | 4134 | -0.43 | -0.43 | 1644 | -0.47 |
| 5739x5759 | 3.6 | 150 | 1374 | 4655 | -0.32 | -0.19 | 1720 | -0.20 |
| 1904x2588 | 13.88 | 0 | 280 | 2034 | 0.20 | 0.00 | 478 | 0.00 |
| 1993x2653 | 14.07 | 2 | 250 | 2281 | -0.73 | -1.11 | 515 | -0.88 |
| 2122x2744 | 11.69 | 5 | 235 | 2094 | -0.43 | -1.19 | 487 | -0.65 |
| 2324x2879 | 9.47 | 10 | 281 | 2124 | 0.49 | -1.16 | 492 | -0.45 |
| 2437x2950 | 9.47 | 13 | 327 | 2125 | -0.70 | -0.48 | 489 | -0.32 |
| 2643x3067 | 8.4 | 19 | 328 | 2140 | -1.14 | -0.57 | 493 | -0.60 |
| 2943x3193 | 7.24 | 30 | 375 | 2172 | -0.95 | -0.95 | 500 | -0.95 |
| 3176x3176 | 6.75 | 45 | 407 | 2187 | -1.14 | -0.64 | 508 | -0.42 |
| 2588x1904 | 13.88 | 90 | 171 | 2078 | 0.00 | 0.00 | 481 | 0.00 |
| 3083x2674 | 8.25 | 110 | 281 | 2141 | -0.65 | -0.43 | 496 | -0.71 |
| 2943x3193 | 7.24 | 150 | 375 | 2172 | 0.96 | 0.96 | 504 | 0.90 |

This sensitivity is due to the nature of the character segmentation algorithm. The graph showing the dependency of the successful character segmentation percentage on the skew angle is given in Fig. 7.

Fig. 7 Character segmentation results as a function of document skew angle

This graph shows that even a small document skew is a big problem for the character segmentation algorithm. Visual results of the extended character segmentation algorithm are provided using Nikola Tesla’s documents from the “Nikola Tesla Museum” in Belgrade. These results are shown in Fig. 8 a), b), c), and d).

Fig. 8 Extended character segmentation results for skewed and de-skewed original Nikola Tesla’s documents: a) first skewed document, b) first de-skewed document, c) second skewed document, d) second de-skewed document

The character segmentation results shown in Fig. 8 confirm the previous conclusion based on the graph. Based on the scanned document images from the “Nikola Tesla Museum”, it is clear that further character segmentation is impossible without skew correction performed at a very early stage of the character segmentation process.

6. Conclusions

In this paper, an optimized Hough transform based approach for skew correction is presented as an essential pre-processing part of a character segmentation system. The character segmentation system was initially designed for machine-typed documents, but can be used for machine-printed documents as well. In Section 2 the theoretical background of the Hough transform is provided. In Section 3 the flowchart of the complete character segmentation algorithm is shown and a description of the proposed skew correction approach is provided. The proposed approach uses the ultra-fast generalized image transformation architecture to achieve high computational performance. To achieve a fast implementation, a linear image representation is used. The ultra-fast image transformation architecture is used for the image rotation algorithm, which is implemented in highly optimized low-level machine code. The standard Hough transform is implemented using pointer arithmetic. For both implementations, support lookup tables are used. In Section 5, the experimental results for the proposed approach in terms of time complexity and estimation accuracy are given and compared with existing approaches. The results which show how skew correction affects the character segmentation process are also given.
based on the results, the proposed approach performs approximately 2.5 faster than the classical implementation used in [18]. also, the proposed approach gives worse results than skew correction approach which exploits a bag algorithm. although the proposed approach does not give the best results, it could be used in combination with a bag algorithm to provide even better results than existing ones. the estimation accuracy results show that the proposed approach gives results as good as results provided in the related work. on the other side, it is clear that character segmentation of skewed documents is impossible and gives bad results without a skew correction. since character segmentation system is initially designed for needs of the “nikola tesla museum”, namely for conversion of nikola tesla’s scanned documents to electronic form, the original nikola tesla’s documents from the “nikola tesla museum” are used for testing of the complete character segmentation algorithm performances. also, the official evaluation of the complete character segmentation system performances will be performed at the “nikola tesla museum”. our future work will be focused on the automatization of the character segmentation algorithm manual parts, improving its performances, the optimization of the complete algorithm including the proposed skew correction approach, and integration of the character segmentation system into the complete real-time ocr system. acknowledgments: this paper is supported by the ministry of education, science and technological development of the republic of serbia (project iii44006-10), mathematical institute of serbian academy of science and arts (sanu) and museum of nikola tesla (providing original typewritten documents of nikola tesla). 624 v. vuĉković, b. arizanović references [1] s. s. cvetković, s. v. nikolić and s. 
ilić, “effective combining of color and texture descriptors for indoor-outdoor image classification”, facta universitatis: electronics and energetics, vol. 27, no. 3, pp. 399-410, 2014. [2] j. j. hull, “document image skew detection: survey and annotated bibliography”, series in machine perception and artificial intelligence, vol. 29, pp. 40-66, 1998. [3] h. s. baird, “the skew angle of printed documents”, document image analysis, pp. 204-208, 1995. [4] a. papandreou et al., “icdar2013 document image skew estimation contest (disec’13)”, in proceedings of the 12th international conference on document analysis and recognition (icdar), 2013. [5] a. d. bagdanov and j. kanai, “evaluation of document image skew estimation techniques”, in spie proceedings 2660: document recognition iii, 1996, pp. 343-354. [6] p. mukhopadhyay and b. b. chaudhuri, “a survey of hough transform”, pattern recognition, vol. 48, no. 3, pp. 993-1010, 2015. [7] r. o. duda and p. e. hart, “use of the hough transformation to detect lines and curves in pictures”, in proceedings of the communications of the acm, vol. 15, no. 1, pp. 11-15, 1972. [8] s. n. srihari and v. govindaraju, “analysis of textual images using the hough transform”, machine vision and applications, vol. 2, no. 3, pp. 141-153, 1989. [9] o. g. okun, “geometrical approach to skew detection for documents containing the latin/cyrillic characters”, in proceedings of the spie, vol. 3811: vision geometry viii, 1999, pp. 357-365. [10] a. boukharouba, “a new algorithm for skew correction and baseline detection based on the randomized hough transform”, journal of king saud university computer and information sciences, vol. 29, no. 1, pp. 29-38, 2016. [11] d. kumar and d. singh, “modified approach of hough transform for skew detection and correction in documented images”, international journal of research in computer science, vol. 2, no. 3, pp. 37-40, 2012. [12] f. stahlberg and s. 
vogel, “document skew detection based on hough space derivatives”, in proceedings of the 13th international conference on document analysis and recognition, 2015. [13] v. shapiro, “accuracy of the straight line hough transform: the non-voting approach”, computer vision and image understanding, vol. 103, no. 1, pp. 1-21, 2006. [14] s. guo et al., “an improved hough transform voting scheme utilizing surround suppression”, pattern recognition letters, vol. 30, no. 13, pp. 1241-1252, 2009. [15] b. gatos, n. papamarkos and c. chamzas, “skew detection and text line position determination in digitized documents”, pattern recognition, vol. 30, no. 9, pp. 1505-1519, 1997. [16] u. pal and b. b. chaudhuri, “an improved document skew angle estimation technique”, pattern recognition letters, vol. 17, no. 8, pp. 899-904, 1996. [17] a. amin et al., “fast algorithm for skew detection”, in proceedings of the spie 2661: real-time imaging, 1996, pp. 65-77. [18] c. singh, n. bhatia and a. kaur, “hough transform based fast skew detection and accurate skew correction methods”, pattern recognition, vol. 41, no. 12, pp. 3528-3546, 2008. [19] l. a. f. fernandes and m. m. oliveira, “real-time line detection through an improved hough transform voting scheme”, pattern recognition, vol. 41, no. 1, pp. 299-314, 2008. [20] l. a. najman, “using mathematical morphology for document skew estimation”, in proceedings of the spie 5296: document recognition and retrieval xi, 2003, pp. 182-192. [21] g. bessho, k. ejiriand j. f. cullen, “fast and accurate skew detection algorithm for a text document or a document with straight lines”, in proceedings of the spie 2181: document recognition, 1994, pp. 133-141. [22] j. van beusekomand t. m. breuel, “resolution independent skew and orientation detection for document images”, in proceedings of the spie 7247: document recognition and retrieval xvi, 2009, pp. 72470k-72470k-8. [23] r. kapoor, d. bagai and t. s. 
kamal, “a new algorithm for skew detection and correction”, pattern recognition letters, vol. 25, no. 11, pp. 1215-1229, 2004. [24] h. liu et al., “skew detection for complex document images using robust borderlines in both text and non-text regions”, pattern recognition letters, vol. 29, no. 13, pp. 1893-1900, 2008. [25] y. cao, s. wang and h. li, “skew detection and correction in document images based on straight-line fitting”, pattern recognition letters, vol. 24, no. 12, pp. 1871-1879, 2003. [26] j. fabrizio, “a precise skew estimation algorithm for document images using knn clustering and fourier transform”, in proceedings of the ieee international conference on image processing (icip), 2014. [27] a. chan-hon-tong, c. achard and l. lucat, “simultaneous segmentation and classification of human actions in video streams using deeply optimized hough transform”, pattern recognition, vol. 47, no. 12, pp. 3807-3818, 2014. [28] c. tu et al., “vehicle position monitoring using hough transform”, in proceedings of the international conference on electronic engineering and computer science, vol. 4, pp. 316-322. [29] r. varun et al., “face recognition using hough transform based feature extraction”, procedia computer science, vol. 46, pp. 1491-1500, 2015. [30] v. vučković and s. 
spasić, “3-d stereoscopic modeling of the tesla’s long island”, facta universitatis: electronics and energetics, vol. 29, no. 1, pp. 113-126, 2016.

facta universitatis series: electronics and energetics vol. 28, no 2, june 2015, pp. 223-236 doi: 10.2298/fuee1502223m

oscillation-based testing method for detecting switch faults in high-q sc biquad filters

miljana milić, vančo litovski
faculty of electronic engineering, university of niš, serbia

abstract. testing switched capacitor circuits is a challenge due to the diversity of the possible faults. a special problem is the synthesis of a test signal that will control the fault effect and make it observable at the test point. the oscillation-based method adopted for testing in this paper resolves that issue by its very nature. here we discuss the properties of the method and the conditions to be fulfilled in order to implement it in the right way. to achieve that, we have resolved the problem of synthesis of the positive feed-back circuit and the choice of a proper model of the operational amplifier. in that way, a realistic foundation for the testing process was generated. a second order notch cell was chosen as a case study. fault dictionaries were developed for the catastrophic faults of the switches used within the cell. the results reported here are a continuation of our previous work and are complementary to other results already published.

key words: obt method, sc filters, switch faults, fault dictionary.

1. introduction

the synthesis of the test signal is one of the essential problems in analog circuit testing. choices among many possibilities have to be made. first, one should select an analog test domain [1]. testing can be done by analyzing dc signals [2, 3, 4], signals in the frequency domain [5, 6, 7], as well as signals in the time domain [8]. it is often necessary to use test signals from several domains simultaneously. 
if we use dc signals, then we search for fault effects related to the nonlinearities and quiescent conditions. a number of methods consider the selection of time domain test signals [9, 10, 11]. in the frequency domain, one has to determine the most appropriate spectrum of the testing signal in order to achieve maximal fault effects. in the time domain one searches for one or more signal waveforms that will enable the fastest and the cheapest testing, in order to optimize the production and decrease the price of the product. a technique that does not require solving such problems, since it needs no test signal, is the oscillation-based test (obt) [12]. to implement this powerful method one has to create a feed-back during testing. by measuring the frequency of the created oscillator and comparing it with the fault-free frequency, one can detect defective circuits.

received may 16, 2014; received in revised form january 19, 2015. corresponding author: miljana milić, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: miljana.milic@elfak.ni.ac.rs)

structural testing is a concept where test signals that detect one or more faults are created. to detect all or most of the possible faults, many tests are needed. in obt all possible defects are targeted with only one measurement since they all affect the oscillation frequency. here comes the major difference in oscillator design. regular oscillators are created to be insensitive to parameter variations. obt oscillators, on the other hand, should have an oscillation frequency that is as sensitive to parameter variations as possible. unfortunately, the obt technique cannot be automated, since each analog circuit is unique. that is the biggest difficulty in implementing the method. 
unlike other testing approaches, where numerous and most appropriate testing points have to be selected and observed and the captured signals processed [13], the number of obt test points is one, i.e. the oscillator's output. nevertheless, the problem of measurement is not solved, since one has to decide which parameters of the response should be measured and extracted. there are also other problems related to obt implementation. first, bringing the oscillations into a stable state can slow down the testing process. second, in rare situations observing just one testing point (for example, the output voltage) cannot show the fault effect, so additional measurements are needed, such as iddq [14, 15], additional voltage waveforms [16], or even mixing domains, including physical redesign of the original circuit to create access points for measurement [17]. the problems of the needed test (measurement) points and of the most appropriate quantities to observe for determining the state of the circuit should be solved [18, 19]. the oscillation based testing (obt) method [12, 20, 21] has been drawing our attention for a relatively long period. the main reason for that was the fundamental discrepancy between the theoretical developments reported by the original authors and the practical implementation of the method. namely, as elaborated in [22, 23, 24], the original method is based on the presumption that the operational amplifiers (oa) (or active amplifying elements) within the circuit under test (cut) perform ideally, as if the frequency were equal to zero. in practice, that is not the case. in fact, at the oscillation frequency the modulus and the phase of the gain of the operational amplifier(s) are so degraded that the theoretically developed expressions become not only impractical but also misleading. the oscillation frequencies of both the fault-free (ff) and the faulty circuits (fc) obtained by the closed form expression derived on the basis of the original method are far from the real ones. 
later implementations of the obt concept [25, 26, 27, 28, 29, 30] suffer from the same drawback. namely, if one generates a fault dictionary using closed form formulae (or even by simulation) based on an ideal model of the operational amplifier, and then verifies the results by (repeated) simulation based on the same models, one does not notice the fundamental problem: the gain and the phase distortions of the operational amplifier are not negligible and the fault dictionaries are far from the realistic ones. it is worth mentioning that, admittedly, the need to include the operational amplifier's phase shift in the evaluation of the oscillation frequency when applying the obt method to continuous time analog filters was mentioned earlier in the literature [31]. the idea was, however, implemented in the frequency domain and led to conclusions quite different from the ones we reported in [22]. we find the implementation of the obt method for continuous time filters reported in [32] to be the proper one. there, of course, due to the complexity of the cut, developments in the frequency domain were simply not feasible. based on those considerations we concluded that a different, non-analytical approach to the extraction of these two quantities, based on realistic (dynamic) models of the operational amplifiers and time (not frequency) domain simulation, is to be implemented. of course, that introduces an additional problem related to the overall time needed to get the fault dictionary (for as many faults as needed), since one needs to extract the oscillation frequency from a time domain signal which in turn has to reach a steady state. the problems introduced in practical implementation of the obt, however, are broadly compensated by the sole fact that obt resolves the main problem in the test generation per se. 
namely, in general, and especially for analog circuits, the synthesis of the testing signal is the foremost problem. obt needs no test signal and, if the circuit is testable, it exposes any fault present in it. that enables simple application of the structural concept [33] to test signal generation, by which only selected faults (the most probable ones) in the system are targeted. the obt method has been applied to several types of circuits (as listed in the references above), among them switched capacitor (sc) filters [34, 35, 36, 37, 38]. the problem of the non-idealities was considered in [37] where, due to the difference between the conclusions obtained from the z-domain and from the time domain, much attention was paid to time domain simulation for generation of the fault dictionaries. there, a simple cmos transconductance amplifier was implemented in the schematic of the sc filter and a conclusion was drawn that the difference between the time and frequency domain analysis is to be attributed to the feed-back circuit. we believe that a purely resistive model of the transistors within the op-amp was used. it is well known, however, [39] that the op-amp implemented in sc circuits has to fulfill stringent requirements not only in the frequency domain (including the nominal gain and the cut-off frequency) but also in the dc domain (low offset) and in the transient domain (high slew-rate). in addition, low noise requirements are usually imposed. for example, in [40] the authors recommend the lt1055 op-amp which, in fact, has a jfet at the input in order to reduce the noise. this is why, we think, it is the main circuit, not the feed-back, that imposes the need for a much better model of the operational amplifier. that was illustrated in more detail in [22]. 
considering the obt method a very important and powerful one, and having in mind the need for a more realistic implementation, we started to resolve the corresponding issues of implementation of obt to the testing of sc filter cells, one by one. the nature of the faults in sc circuits was studied in [41]. namely, within a circuit one may encounter parametric and catastrophic faults. parametric faults here are related only to the capacitance values. catastrophic faults may belong to the following categories: faults related to the connection lines, faults related to the capacitors, faults related to the switches (transistors), and faults related to the operational amplifiers. to generate a fault dictionary, however, one has to resolve the synthesis of the oscillator and the fault insertion method first. we first attacked the problem of synthesis of the feed-back loop that enables oscillation. the success was demonstrated on the simplest situation, that is, fault dictionary creation for parametric or soft faults [42]. large changes were attributed to all capacitance values. introduction of catastrophic faults into a circuit that has active elements and feed-back loops is a challenging task since the fault may have several very dramatic consequences. firstly, a catastrophic fault may change the quiescent working conditions of the active elements, eventually bringing them into saturation or into cut-off. that fundamentally changes the circuit behavior and makes the simulation settings much more complicated. one should not forget that the simulation of an oscillator is a specific problem related to the conflict between the requirements for a stable numerical integration rule and simulation of an unstable electronic circuit. on the other side, a catastrophic fault may break an existing feed-back loop or establish a new one, which, again, leads to a totally new circuit with unknown properties and behavior. 
for that reason, when speaking of catastrophic faults, we first considered the capacitors in an sc notch cell [43]. after getting good results and after considerable experience was accumulated, we are here attacking the second element type of the sc filters, the switches. the results reported here are complementary to the ones reported in [43] (which are not repeated here) and we consider the present paper and the report given in [43] as the completion of a single task. to our knowledge the fault dictionaries reported here are the first and unique base for testing a notch sc cell. the paper is structured in the following way. in the next section the analysis and design of the high q sc notch cell is described. then, in section 3, the obt is discussed and its application described. there follow, in section 4, the main results related to the method of creation of the fault dictionary and the dictionary itself. the discussion of the results is given there too.

2. high q sc notch filter

a switched capacitor (sc) is an integrated electronic element used in discrete time signal processing systems. the main idea is to use capacitors and switches to emulate resistors, thereby avoiding the drawbacks of integrated resistors, which have poor accuracy and poor temperature dependence properties. in that way discrete time systems are obtained from continuous time originals. the circuits so obtained use non-overlapping signals to control the switches, often termed break-before-make switching, so that all switches are open for a very short time during the switching transitions. filters implemented with these elements are termed 'switched-capacitor filters'. the switching frequency may be used to control the response of the filters since the equivalent resistances are directly dependent on it. from the implementation point of view the sc filters are in between the analog and the digital ones. 
namely, while the signals are sampled they are not quantized, so that their advantage over digital filters is the potential to achieve a high dynamic range. at the same time the need for analog-to-digital (ad) and digital-to-analog (da) conversion as well as digital signal processing (dsp) hardware is avoided. the analog output signal is simply restored by a low-pass filter. on the other side, besides the potential to be integrated in silicon, which is not the case for continuous time rc active filters, and unlike continuous time filters (which have to be constructed with resistors, capacitors and sometimes inductors whose values are accurately known), switched capacitor filters depend only on the ratios between capacitances and on the switching frequency. for all these reasons sc filters are an important class of integrated circuits and testing of sc filters is an important issue in electronic design. the physical realization of sc filters is very frequently performed by cascading second-order cells. that concept will be followed here, too. a topology of a universal second order sc filter cell is depicted in fig. 1 [44, 45]. this is the well known fleischer-laker active sc filter [46]. by proper choice of the parameter values of the cell one may produce all four variants needed for complete filter design: low-pass (lp), band-pass (bp), band-stop (notch), and high-pass (hp). in this paper (as in our previous research related to the obt method) the notch cell will be elaborated, mostly because it may be regarded as the most complex one. namely, it has to suppress part of the frequency band and has two pass-bands. in addition, when creating filters with transmission zeroes on the axis of real angular frequencies, this cell is used as many times as there are transmission zeroes (usually n/2-1, n being the order of the filter). 
practically only one additional cell of the type lp, bp or hp is enough to complete the filter realization. finally, one should not forget that the use of a universal topology drastically simplifies the layout design since it allows for 'programming' the layout on the chip.

fig. 1 high q sc notch filter cell

the transfer function of the cell, $T(s)$, obtained under the presumption that the operational amplifiers have infinite (frequency-independent) gain, is given by

$$T(s) = \frac{V_{out}(s)}{V_{in}(s)} = \frac{k_2 s^2 + k_1 s + k_0}{s^2 + \frac{\omega_0}{Q} s + \omega_0^2}. \qquad (1)$$

the element values of fig. 1 may be related to the coefficients of (1) in the following way [44]:

$$\alpha_1 = k_0 T / \omega_0 \qquad (2)$$
$$\alpha_2 = \alpha_5 = \omega_0 T \qquad (3)$$
$$\alpha_3 = k_1 / \omega_0 \qquad (4)$$
$$\alpha_4 = 1 / Q \qquad (5)$$
$$\alpha_6 = k_2, \qquad (6)$$

where $k_0$, $k_1$, $k_2$ are constants determining the position of the passband on the frequency axis (e.g. low-pass, band-pass, etc.), while $\omega_0$ (notch frequency), $Q$ (quality factor), and $T$ (non-overlapping clock period) are design parameters. in the specific case of a notch transfer function one should choose $k_0 = (3 \cdot \omega_0)^2$, $k_1 = 0$, and $k_2 = 9$. in the above expressions $T$ stands for the sampling period, and $\omega_0 = 2\pi f_0$ is equal to the modulus of the pole of the cell which is, at the same time, equal to the notch frequency.

the cell is usually designed using (1). as an example for this study we choose the following: $f_0$ = 1 kHz, $Q$ = 10, $T$ = 10 μs (the frequency of the two-phase, non-overlapping switching is 100 kHz), and capacitances $C_1 = C_2$ = 20 pF. the selected value of $Q$, which is recognized as the quality factor of the cell, is considered large, hence the name of the cell. after substitution of these values in (2)-(6) we obtain: $\omega_0$ = 6283.18 rad/s, $\alpha_1$ = 0.063, $\alpha_2$ = 0.063, $\alpha_3$ = 0, $\alpha_4$ = 0.1, $\alpha_5$ = 0.063, and $\alpha_6$ = 1. 
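the substitution above can be checked with a short script. this is only a sketch: it reads relations (2)-(6) as α1 = k0·T/ω0, α2 = α5 = ω0·T, α3 = k1/ω0, α4 = 1/Q and α6 = k2, and it assumes a unity-gain notch numerator (k0 = ω0², k1 = 0, k2 = 1), a choice made here only because it reproduces the listed α values; the function name `alpha_coefficients` is hypothetical.

```python
import math


def alpha_coefficients(f0, q, t, k0, k1, k2):
    """Capacitor ratios of the cell, following design relations (2)-(6)."""
    w0 = 2.0 * math.pi * f0           # notch (pole) angular frequency in rad/s
    return {
        "w0": w0,
        "alpha1": k0 * t / w0,        # relation (2)
        "alpha2": w0 * t,             # relation (3)
        "alpha3": k1 / w0,            # relation (4)
        "alpha4": 1.0 / q,            # relation (5)
        "alpha5": w0 * t,             # relation (3), alpha5 equals alpha2
        "alpha6": k2,                 # relation (6)
    }


# Design values quoted in the text: f0 = 1 kHz, Q = 10, T = 10 us.
F0, Q, T = 1e3, 10.0, 10e-6
W0 = 2.0 * math.pi * F0

# ASSUMPTION: unity-gain notch numerator (k0 = w0^2, k1 = 0, k2 = 1).
coeffs = alpha_coefficients(F0, Q, T, k0=W0 ** 2, k1=0.0, k2=1.0)

for name, value in coeffs.items():
    print(f"{name} = {value:.3f}")
```

only α1, α3 and α6 depend on the numerator constants; α2, α4 and α5 follow from f0, Q and T alone.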
the amplitude- and phase-frequency response of this filter cell, produced under the presumption of infinite gain operational amplifiers, is depicted in fig. 2. note the usual spice [47] presentation of the phase, which is shown as if the phasor were jumping by 180 degrees.

fig. 2 amplitude (full line) and phase (dotted) frequency response of the “ideal” filter

3. basic concepts of the obt

in this paper, by testing we understand the creation of a fault dictionary. it is a look-up table containing the effect of every fault conceived in advance. by using it we practically implement the simulation-before-test approach [33]. the information stored in it tells the test engineer whether the selected fault is testable from both points of view: controllability and observability. the main problem hidden behind this table is the selection of a test signal that will activate the fault effect and propagate it to the output in the shortest time (to reduce the overall testing time in volume production). this problem is especially difficult to solve for analog circuits [48] since three domains are to be taken into account: dc, frequency and time. there exists, however, a technique that needs no test signal. it is known as the obt [12, 20, 21]. the basic idea behind this powerful method is to create a redundant feedback loop that is to be activated during testing only. by measuring the output signal of the fc and comparing it with the response of the ff circuit, one may conclude whether there are defects in the circuit or not. the method is simpler still since usually only one testing point is needed (the output) and frequently only one quantity is to be observed: the oscillation frequency. when the ff circuit is set to oscillate, the fc may be revealed either by absence of oscillation or by a different value of the frequency of the signal measured at the output. 
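the decision rule just described (no oscillation, or an oscillation frequency different from the fault-free one) can be sketched as follows. the function name `obt_verdict`, the 2 % tolerance and the example frequencies are assumptions introduced for illustration only; in practice the tolerance would have to cover the normal parameter spread of fault-free circuits.

```python
def obt_verdict(measured_hz, fault_free_hz, tolerance=0.02):
    """Classify a circuit from its measured oscillation frequency.

    measured_hz   -- measured frequency, or None if no oscillation was observed
    fault_free_hz -- oscillation frequency of the fault-free (FF) circuit
    tolerance     -- ASSUMED acceptable relative deviation (2 % by default)
    """
    if measured_hz is None or measured_hz <= 0.0:
        return "faulty: no oscillation"
    deviation = abs(measured_hz - fault_free_hz) / fault_free_hz
    if deviation > tolerance:
        return "faulty: frequency shift"
    return "pass"


# Hypothetical measurements against a hypothetical 1 kHz fault-free oscillator.
print(obt_verdict(None, 1000.0))     # no oscillation observed
print(obt_verdict(1100.0, 1000.0))   # 10 % deviation
print(obt_verdict(1005.0, 1000.0))   # 0.5 % deviation
```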
fig. 3 simplified obt

for the implementation of this method it is assumed that the system is so structured that an external controlling signal is capable of (1) isolating a part of it (the cut) and (2) introducing positive feed-back that will make it oscillate. fig. 3 represents the local arrangement. as can be seen, a switch is introduced that, under control of the 'mode select' signal, when activated, simultaneously isolates the cut from the rest of the circuit and connects the positive feed-back branch. this arrangement allows for implementation of the design for testability (dft) concept of integrated circuit (ic) design, which is depicted in fig. 4 in its simplified version [22, 43]. the development of the schematic proposed here is based on [49]. it is reminiscent of the one published in [50], where more details concerning the digital circuitry may be found.

fig. 4 obt at the system level

in this configuration additional digital control logic is provided in order to set the system in testing mode and to allow for the analog cuts to be isolated and tested one by one. since no test signal is needed, in testing mode only the output of the cells is to be connected to the system output. note that every analog block is to be designed with a structure as depicted in fig. 3. when first reported, the obt method was based on two fundamental presumptions: the active elements are ideal with infinite gain and, consequently, the system is linear. neither of these presumptions is valid. 
namely, the operational amplifiers which are necessary for implementation have real properties such as, among many others, a finite and frequency dependent modulus of the gain and a phase shift that is different from zero and also frequency dependent. ignoring these properties leads to wrong expressions for calculation of the oscillation frequency and consequently to wrong values for the outcome. unfortunately, even if the simplest more realistic model of the oa (single pole roll-off) were implemented, it would not be possible to obtain closed form expressions, due to the complexity of the node equations of the system. in fact, nonlinear expressions are obtained. on the other side, there are no linear active circuits as such. furthermore, one is to be aware that if one designs an oscillator, one must drive the working point of the active element into saturation in order to limit the rise of the amplitude forced by the positive feed-back. so, in part of the period, the circuit must be nonlinear. how long the active element will stay in saturation will depend on the quality of the feed-back loop, i.e. on the value of the modulus of the loop-gain. again, we come to the conclusion that no closed form expressions may be derived for calculation of the oscillation frequency. here we will allow ourselves to make a small digression. the fact that there are nonlinearities within the feedback loop does not disqualify the obt method as such. on the contrary! the abundance of harmonics in the output signal may be effectively used as additional information (besides the oscillation frequency) for both testing and diagnostic purposes [23, 24]. the use of harmonic analysis of the output signal (which may be done off-line) may drastically reduce the additional efforts, redesigns, and silicon area needed in order to get the supply current as additional information for testing, as was done in [51]. 
to summarize, if simulation with proper models of the active elements is used instead of closed form expressions, and if the problem of the additional circuitry needed to create the positive feed-back loop is resolved, the obt method becomes a powerful means for testing and diagnosis of not only analog but also mixed-signal systems [20, 23, 24].

4. simulations and testing

the creation of the fault dictionary goes as follows. a list of faults is assembled first. it is normally shorter than the list of all possible faults for several reasons, one of them being the tractability of the testing process, while another is the lower probability of occurrence of some specific faults. in this paper, as explained in the introduction, we consider the faults related to all transistors used as switches. two types of faults will be taken into account: stuck-at-open and stuck-at-closed (stuck-at-short). for the circuit of fig. 1, where 10 switches are used, one is to create a fault dictionary with 21 rows, the first one being allocated for the ff circuit. in the next step a fault is to be inserted in the circuit [52]. in our case we have to model an open circuit (stuck-at-open fault) and a short circuit (stuck-at-short fault). for the former we use two variants. in the first we consider the open to be an infinite resistance (ideal open), while in the second we choose a more realistic model: the open has a finite resistance of 1 MΩ (representing the leakage between the source and the drain). similarly, for the modeling of the stuck-at-short we use either an ideal short circuit, i.e. zero ohms, or the more realistic 0.01 Ω (representing a physical short circuit). note that, to ensure numerical accuracy, one is to use a relatively small span of resistance values between the open and the closed case. in spice one uses 1 GΩ and 1 Ω as defaults. what we use is more realistic and spans a narrower range. 
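the fault list described above can be enumerated mechanically, which also confirms the count of 21 dictionary rows. this is a minimal sketch: the switch labels S1 to S10 and the helper names are assumptions made for illustration, while the resistance values are the ones quoted in the text for the ideal and the more realistic fault models.

```python
import math

# ASSUMED labels for the 10 switches of the cell.
SWITCHES = [f"S{i}" for i in range(1, 11)]
FAULT_KINDS = ("open", "short")

# Resistance inserted in place of the faulty switch, in ohms.
FAULT_RESISTANCE = {
    "ideal": {"open": math.inf, "short": 0.0},
    "realistic": {"open": 1e6, "short": 0.01},
}


def dictionary_rows():
    """Row labels of the fault dictionary: the FF circuit first, then every fault."""
    rows = ["FF"]
    for switch in SWITCHES:
        for kind in FAULT_KINDS:
            rows.append(f"{switch} {kind}")
    return rows


def fault_resistance(row, model):
    """Resistance modelling the fault of a given row, or None for the FF circuit."""
    if row == "FF":
        return None
    kind = row.split()[1]
    return FAULT_RESISTANCE[model][kind]


rows = dictionary_rows()
print(len(rows))                                 # 1 + 10 switches * 2 fault kinds
print(fault_resistance("S7 open", "realistic"))
print(fault_resistance("S2 short", "ideal"))
```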
accordingly, in the sequel we will present two fault dictionaries, one for ideal and the other for more realistic models of the faults. based on these, we will conclude whether we can rely on the ideal switch models or not. to get the oscillation frequency of the ff circuit, the circuit has to be extended by a positive feed-back loop. having in mind the value of the gain of the notch cell alone, we concluded that additional gain is to be added to the loop for the oscillations to be enabled. furthermore, the additional gain is to be positive and small enough to avoid excessive harmonic distortions. in fact, additional gain of 6 dB was added to the loop-gain. the resulting configuration is depicted in fig. 5, where the schematic is copied from the schematic input to the spice simulator. before proceeding to simulation we had to solve two additional issues. the first one is related to the choice of the integration rule for the differential equation solver within the simulator. namely, if one is to simulate an electronic circuit, one usually asks for an integration rule (or derivative approximation formula) that is stable and, if possible, a-stable. that, however, as shown in [53], may lead to a signal with a decaying amplitude which eventually vanishes due to the stability requirement imposed. for that reason, for simulation of oscillator circuits, the so-called trapezoidal integration rule [53] is to be implemented. finally, a realistic model of the operational amplifier is to be chosen and implemented. since no model (nor a schematic) is normally given with the design kits delivered together with the technology file for ic design for the academic licenses we use, we were forced to use a model that is built into the simulator. that was the model of the ltc6078 [54] op-amp whose schematic is built into the ltspice simulator [47]. 
we published the schematic and the parameters of the model in [22]. with all conditions set, after the simulation of the ff circuit we obtained oscillation with frequency fosc = 960 Hz. the fast fourier transform (fft) analysis result for the obtained output signal is shown in fig. 6. that is the information from which the oscillation frequency was extracted. in order to get the fault dictionary, as can be seen, for every fault we have to simulate the oscillator and perform an fft. altogether we needed 41 simulations and fft analyses. the results are given in table i and table ii. the first one uses ideal models of the faulty switches, while the second one uses more realistic models of the faulty switches. the test dictionaries expressed by table i and table ii contain the following data for the ff and for every fc: 1. the oscillation frequency; 2. the deviation (in percent) of the oscillation frequency from that of the ff circuit; 3. and 4. the amplitude and the phase of the first harmonic; 5. the dc value of the output; and 6. the thd of the output signal.

fig. 5 ltspice schematic of the notch filter

fig. 6 fft of the obt oscillator output signal

we note, at the beginning, the difference between the notch frequency and the oscillation frequency of the ff circuit. for the idealized case it is 1 kHz, while for the case when realistic models of the op-amps are implemented it becomes 960 Hz. these would be equal if ideal operational amplifiers were implemented. one should not forget that 1 kHz is a very low frequency and this effect is to be expected to have much more severe consequences if the working frequency of the circuit is raised. we also expect that a higher gain in the additional circuit will be needed if oscillations at higher frequencies are to be created. as can be seen from the fosc column of both tables, oscillations are not established in all fcs. 
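the six dictionary quantities can be extracted from a sampled steady-state waveform with a discrete fourier transform. the following sketch uses a direct dft from the standard library rather than the simulator's own fft, and it runs on a synthetic test signal, not on the simulated oscillator output. the function names, the sampling parameters, the phase reference (cosine, start of the record) and the thd definition (harmonics 2 to 4 relative to the fundamental) are all assumptions of this example.

```python
import cmath
import math


def dft_bin(samples, k):
    """Complex DFT coefficient of bin k, evaluated directly (no FFT library)."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))


def dictionary_entry(samples, fs, f_ff, n_harmonics=4):
    """Frequency, deviation from FF, first-harmonic amplitude and phase, DC value, THD."""
    n = len(samples)
    half = n // 2
    dc = sum(samples) / n
    spectrum = [dft_bin(samples, k) for k in range(1, half)]    # bins 1 .. n/2 - 1
    amps = [2.0 * abs(c) / n for c in spectrum]                 # single-sided amplitudes
    k1 = max(range(len(amps)), key=amps.__getitem__) + 1        # bin of the fundamental
    a1 = amps[k1 - 1]
    f_osc = k1 * fs / n
    harmonics = [amps[h * k1 - 1] for h in range(2, n_harmonics + 1) if h * k1 < half]
    return {
        "f_osc_hz": f_osc,
        "delta_f_percent": 100.0 * abs(f_osc - f_ff) / f_ff,
        "amplitude": a1,
        "phase_deg": math.degrees(cmath.phase(spectrum[k1 - 1])),
        "dc": dc,
        "thd_percent": 100.0 * math.sqrt(sum(a * a for a in harmonics)) / a1,
    }


# Synthetic waveform (ASSUMED): 890 Hz fundamental, 1 % second harmonic, small DC offset,
# sampled coherently so that every component falls exactly on a DFT bin (10 Hz spacing).
FS, N = 9600.0, 960
signal = [
    0.05
    + math.sin(2.0 * math.pi * 890.0 * i / FS)
    + 0.01 * math.sin(2.0 * math.pi * 1780.0 * i / FS)
    for i in range(N)
]

entry = dictionary_entry(signal, FS, f_ff=960.0)
for name, value in entry.items():
    print(f"{name} = {value:.3f}")
```

coherent sampling is what keeps this example free of spectral leakage; for a simulated waveform whose frequency is not known in advance, a longer record or windowing would be needed to reach comparable accuracy.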
in cases where no oscillations are established, one simply concludes that the fault is testable. when, however, oscillations are established in the fc, we may distinguish three situations. in the first one, such as the cases s1-open, s5-open, and s7-short in table i, there is a clear difference in the oscillation frequency. that is enough to conclude that the fault is testable. in the second, we may have oscillation with a frequency near that of the ff but with a clearly different amplitude of the first harmonic. this is practically always the case in table i and table ii. there is one case in table ii which deserves some additional attention. namely, when s2-short is present, to reach an absolutely firm conclusion about the presence of the fault one may take into account not only the frequency (difference is 8.3%) and the amplitude of the first harmonic (difference is 29%), but the harmonic distortion, too. in all other cases there is no practical need to use the phase shift, the dc value and the distortion as information about the testability of the circuit. the third is a special situation where no sinusoidal oscillations are observed at the output. these cases are marked by 100k in the oscillation frequency column. instead, as a consequence of clock feed-through, a trapezoidal waveform having the frequency of the clock is obtained at the output. note that since such a signal is far above the passband of the lowpass filter used at the output of the sc cell, if one wants to diagnose this effect, one has to measure directly at the sc output; otherwise, the filter will suppress this fault effect.

table 1 simulation with ideal switch model – fault dictionary

| defect | fosc [hz] | δfosc [%] | ampl. 1st har. [mv] | phase 1st har. [deg] | dc val. [mv] | thd [%] | comments |
|---|---|---|---|---|---|---|---|
| ff | 960 | | 266.6 | -125.45 | -0.218 | 0.353 | |
| s1 open | 360 | 62.5 | 139.6 | -40.12 | -0.073 | 1.161 | |
| s1 short | 100k | >500 | 0.216 | 80.62 | -2999.87 | 209.071 | no sin. osc. |
| s2 open | 100k | >500 | 0.023 | -117.67 | -2999.91 | 171.753 | no sin. osc. |
| s2 short | 1040 | 8.3 | 374.1 | 45.05 | -1.147 | 0.914 | |
| s3 open | 100k | >500 | 0.091 | -93.06 | -2999.53 | 266.543 | no sin. osc. |
| s3 short | | 100 | 0. | | | | no oscillations |
| s4 open | 100k | >500 | 0.018 | -92.74 | -2999.91 | 270.629 | no sin. osc. |
| s4 short | 100k | >500 | 59.52 | -6.31 | 398.077 | 45.526 | no sin. osc. |
| s5 open | 890 | 7.3 | 3.259 | 74.36 | 0.003 | 2.918 | |
| s5 short | 100k | >500 | 43.05 | 71.75 | -16.101 | 109.583 | no sin. osc. |
| s6 open | 890 | 7.3 | 20.76 | 81.64 | -0.036 | 1.339 | |
| s6 short | 100k | >500 | 45.95 | 68.30 | 18.849 | 107.222 | no sin. osc. |
| s7 open | | 100 | 0. | | | | no oscillations |
| s7 short | 310 | 67.7 | 476.3 | 105.40 | -12.934 | 134.391 | |
| s8 open | 100k | >500 | 0.033 | -90.37 | -2999.91 | 281.203 | no sin. osc. |
| s8 short | 100k | >500 | 0.051 | -91.73 | -2999.9 | 284.390 | no sin. osc. |
| s9 open | 100k | >500 | 0.092 | -93.11 | -2999.51 | 266.616 | no sin. osc. |
| s9 short | 100k | >500 | 0.155 | -91.80 | -2999.75 | 210.601 | no sin. osc. |
| s10 open | 100k | >500 | 0.008 | -71.64 | -2999.91 | 384.349 | no sin. osc. |
| s10 short | 100k | >500 | 0.430 | -97.77 | -2170.95 | 263.349 | no sin. osc. |

table 2 simulation with real switch model – fault dictionary

| defect | fosc [hz] | δfosc [%] | ampl. 1st har. [mv] | phase 1st har. [deg] | dc val. [mv] | thd [%] | comments |
|---|---|---|---|---|---|---|---|
| ff | 960 | | 266.6 | -125.45 | -0.218 | 0.353 | |
| s1 open | 727.3 | 24.2 | 52.67 | -152.95 | -0.195 | 0.541 | |
| s1 short | 100k | >500 | 0.291 | 82.61 | -2999.84 | 221.73 | no sin. osc. |
| s2 open | 670 | 30.2 | 6.073 | 7.47 | 0.078 | 1.386 | |
| s2 short | 1028 | 7.1 | 374 | 133.07 | 1.966 | 1.754 | |
| s3 open | 680 | 29.2 | 305.3 | 101.60 | -0.509 | 0.29 | |
| s3 short | | 100 | 0 | | | | no oscillations |
| s4 open | 680 | 29.2 | 109.7 | -166.00 | -0.399 | 0.300 | |
| s4 short | 100k | >500 | 59.65 | -6.35 | 398.039 | 45.352 | no sin. osc. |
| s5 open | 930 | 3.1 | 120.2 | 47.64 | -0.105 | 0.857 | |
| s5 short | 100k | >500 | 43.16 | 71.59 | -16.221 | 109.513 | no sin. osc. |
| s6 open | 930 | 3.1 | 95.39 | 45.16 | -0.070 | 0.872 | |
| s6 short | 100k | >500 | 45.84 | 68.44 | 18.737 | 107.305 | no sin. osc. |
| s7 open | 733 | 23.6 | 0.019 | 1.00 | -0.007 | 4.535 | |
| s7 short | 310 | 67.7 | 478.5 | 107.40 | -12.276 | 133.594 | |
| s8 open | 670 | 30.2 | 79.09 | -158.37 | -0.421 | 0.554 | |
| s8 short | 100k | >500 | 0.042 | -91.93 | -2999.9 | 273.155 | no sin. osc. |
| s9 open | 680 | 29.2 | 318.3 | 22.08 | -0.370 | 1.067 | |
| s9 short | 100k | >500 | 0.513 | -92.00 | -2998.75 | 227.005 | no sin. osc. |
| s10 open | 680 | 29.2 | 258.1 | -165.77 | -0.063 | 1.139 | |
| s10 short | 100k | >500 | 0.421 | -97.95 | -2171.09 | 263.529 | no sin. osc. |

by comparison of table i and table ii we may get a notion of the quality of the model of the switch used. the main difference between table i and table ii is in the number of feed-through fault effects. in addition, as can be seen for the case s7-open, the change of the circuit functionality due to the ideal model leads to a wrong conclusion about the fault effects, although both models cover the fault. the realistic fault model mostly suppresses this effect, and this is why we recommend it for this application. note its simplicity. by inspection of table i and table ii we may conclude that there are no untestable faults. since the fault effects are different, all faults may be recognized at the output of the cell, making obt a successful concept for testing this kind of cell while using extremely simple additional circuitry for the synthesis of the oscillator circuit. we want to stress here again that only one testing point was used and only one measurement was undertaken: the output voltage waveform was measured. the additional processing (fft) is unavoidable in order to get the oscillation frequency, so the other numbers given in table i and table ii are obtained with no additional cost and effort.

5. conclusion

implementation of the obt is a challenging issue. this comes from the fact that the method was originally proposed on the presumption that the active elements exhibit ideal performance. that is not the case. in this paper we demonstrate the proper implementation of obt for the case of a second-order notch cell.
this cell may be considered the best representative (among second-order cells) for the task we undertook, since it is the most complicated and the most frequently used one. it was synthesized to be implemented as an integrated circuit with switched capacitors. since the number and the variety of the possible faults are large, we are attacking the problem in several phases, one of which is reported here. only catastrophic faults of the switches were modeled and the corresponding fault dictionary was created. it was shown that full coverage of the selected faults may be achieved if proper modeling of the operational amplifiers is used and a proper feedback circuit is synthesized. the results reported here are part of a longer-running project in which we started with continuous-time analog filter cells and end here with a switched-capacitor filter cell.

acknowledgement. this research was partly funded by the ministry of education and science of republic of serbia under contract no. tr32004.

references
[1] m. soma, "automatic test generation algorithms for analogue circuits", iee proc. circuit, devices and systems, vol. 143, no. 6, december 1996, pp. 366-373.
[2] c. dufaza and h. ihs, "a bist-dft technique for dc test of analog modules", journal of electronic testing: theory and applications, vol. 9, no. 1-2, 1996, pp. 117-133.
[3] m. marlett and j. abraham, "dc-iatp: an iterative analog circuit test generation program for generating dc single pattern tests", international test conference, 1988, pp. 839-845.
[4] l. milor and v. viswanathan, "detection of catastrophic faults in analog integrated circuits", ieee transactions on computer aided design, vol. 8, 1989, pp. 114-130.
[5] m. slamani and b. kaminska, "multifrequency analysis of faults in analog circuits", ieee design & test of computers, vol. 12, no. 2, 1995, pp. 70-80.
[6] c. wang, y. yun, h. liang, j. he, m. chan, "multi-frequency test for analog circuits," electron devices and solid-state circuits (edssc), ieee international conference, june 2013, pp. 1, 2, 3-5.
[7] s. huynh, s. kim, m. soma and j. zhang, "automatic analog test signal generation using multifrequency analysis", ieee transactions on circuits and systems ii: analog and digital signal processing, vol. 46, no. 5, may 1999, pp. 565-576.
[8] b. burdiek, "generation of optimum test stimuli for nonlinear analog circuits using nonlinear programming and time-domain sensitivities," proc. of design, automation and test in europe, conference and exhibition, 2001, pp. 603-608.
[9] z. guo and j. savir, "algorithm-based fault detection of analog linear time-invariant circuits", proc. of ieee instrumentation and measurement technology conference, budapest, hungary, may 2001, pp. 49-54.
[10] s. cherubal and a. chatterjee, "parametric fault diagnosis for analog system using functional mapping", proc. of ieee date, nice, france, 1999, pp. 195-200.
[11] v. prasannamoorthy and n. devarajan, "time domain technique for fault diagnosis of analog circuits with flexible accuracy algorithm", eur. journal of scientific research, vol. 51, no. 2, 2011, pp. 211-221.
[12] k. arabi and b. kaminska, "oscillation-test strategy for analog and mixed-signal integrated circuits", proc. of the 14th ieee vlsi test symposium (vts'96), princeton, new jersey, april/may 1996, pp. 476-482.
[13] a. halder and a. chatterjee, "automated test generation and test point selection for specification test of analog circuits," proc.
of 5th international symposium on quality electronic design, 2004, pp. 401-406.
[14] g. hu, h. wang, m. hu and s. yang, "oscillation test strategy for analog filters by monitoring output voltage and supply current", tsinghua science and technology, vol. 12, no. s1, 2007, pp. 78-82.
[15] p. alli, testing a cmos operational amplifier circuit using a combination of oscillation and iddq test methods, m.sc. thesis, louisiana state university, usa, 2004.
[16] m. wong and k. ko, "fault diagnostic improvement method for otm-based testing", proc. of 17th ieee instrumentation and measurement technology conf., 2000, baltimore, md, usa, pp. 1118-1123.
[17] s. yellampalli, a. srivastava, and v. pulendra, "a combined oscillation, power supply current and iddq testing methodology for fault detection in floating gate input cmos operational amplifier", proc. of the 48th midwest symp. on circuits and systems, covington, ky, aug. 2005, pp. 503-506.
[18] v. c. prasad and n. s. c. babu, "selection of test nodes for analog fault diagnosis in dictionary approach", ieee transactions on instrumentation and measurement, vol. 49, no. 6, dec. 2000, pp. 1289-1297.
[19] c. yang, s. tian, and b. long, "application of heuristic graph search to test-point selection for analog fault dictionary techniques", ieee trans. on instrumentation and measurement, vol. 58, no. 7, 2009, pp. 2145-2158.
[20] k. arabi and b. kaminska, "efficient and accurate testing of analog-to-digital converters using oscillation-test method", proc. of the european design and test conference (ed&tc 97), paris, france, march 1997, pp. 384-352.
[21] k. arabi and b. kaminska, "oscillation-test methodology for low-cost testing of active filters", ieee trans. on instrumentation and measurements, vol. 48, no. 4, august 1999, pp. 798-806.
[22] m. milić, m. stošović and v.
litovski, "oscillation based analog testing – a case study", in proceedings of the 34th international conference on information and communication technology, electronics and microelectronics mipro 2011, opatija, croatia, 2011, vol. 1, pp. 118-123.
[23] m. stošović, m. milić and v. litovski, "analog filter diagnosis using the oscillation based method", journal of electrical engineering, issn 1335-3632, vol. 63, no. 6, 2012, pp. 349-356.
[24] m. stošović, m. milić, m. zwolinski and v. litovski, "oscillation-based analog diagnosis using artificial neural networks based inference mechanism", computers and electrical engineering, vol. 39, 2013, pp. 190-201.
[25] a. chaehoi, y. bertrand, l. latorre and p. nouet, "improving the efficiency of the oscillation-based test methodology for parametric faults", latw'03, 4th ieee latin american test workshop, natal, brazil, 2003.
[26] a. raghunatan, h. shin and j. a. abraham, "prediction of analog performance parameters using oscillation based test", proc. 22nd ieee vlsi test symp., apr. 2004, pp. 377-382.
[27] a. raghunatan, j. h. chun, j. a. abraham and a. chatterjee, "quasi-oscillation based test for improved prediction of analog performance parameters", proc. of the itc'04, international test conference 2004, pp. 252-261.
[28] k. suenaga, e. isern, r. picos, s. bota, m. roca and e. garcía-moreno, "application of predictive oscillation-based test to a cmos opamp", ieee transactions on instrumentation and measurement, vol. 59, issue 8, 2010, pp. 2076-2082.
[29] e. romero, m. costamagna, g. peretti and c. marques, "a performance evaluation of oscillation based test in continuous time filters", international journal of mechanical, industrial science and engineering, vol. 8, no. 1, 2014, pp. 196-201.
[30] m. s. sankari and p. sathish kumar, "oscillation test methodology for built-in analog circuits", international journal of computational engineering research, ijcer, vol. 2, issue no. 3, 2012, pp. 868-877.
[31] m. s. zarnik, f. novak and s. macek, "design of oscillation-based test structures of active rc filters", iee proceedings, circuits, devices and systems, 2000, vol. 147, no. 5, pp. 297-302.
[32] m. wong, "on the issues of oscillation test methodology", ieee transactions on instrumentation and measurement, 2000, vol. 49, no. 2, pp. 240-245.
[33] s. hurst, vlsi testing: digital and mixed analogue/digital techniques, institution of engineering and technology (iet), uk, 1999.
[34] m. s. zarnik, f. novak and s. macek, "efficient go no-go test of active rc filters", international journal of circuit theory and applications, 1998, vol. 26, no. 5, pp. 523-529.
[35] u. kač and f. novak, "all-pass sc biquad reconfiguration scheme for oscillation based analog bist", proc. of the 9th european test symposium, ajaccio, france, 2004, pp. 133-138.
[36] u. kač and f. novak, "oscillation test scheme of sc biquad filters based on internal reconfiguration", journal of electron. test, vol. 23, no. 6, pp. 485-495, december 2007.
[37] u. kač and f. novak, "reconfiguration schemes of sc biquad filters for oscillation based test", information technology and control, vol. 42, no. 1, pp. 38-47, 2013.
[38] g. huertas, d. vazquez, e. j. peralias, a. rueda and h. l.
huertas, "practical oscillation-based test of integrated filters", ieee design & test of computers, vol. 19, no. 6, 2002, pp. 64-72.
[39] k. martin and a. sedra, "effect of the opamp finite gain & bandwidth on the performance of switched-capacitor filters", ieee trans. circuits syst., vol. cas-28, no. 8, pp. 822-829, aug 1981.
[40] j. náhlík, j. hospodka, p. sovka and b. pšenička, "implementation of a two-channel maximally decimated filter bank using switched capacitor circuits", radioengineering, vol. 22, no. 1, april 2013, pp. 167-173.
[41] m. robson and g. russell, "a digital method for testing embedded switched capacitor filters", in proceedings of the conference on european design automation, euro-dac '96/euro-vhdl '96, pp. 239-244.
[42] m. milić and v. litovski, "soft defects testing in notch sc filters using the oscillation method", in proc. of the lvii etran conf., zlatibor, serbia, 2013, pp. el 2.3.
[43] m. milić and v. litovski, "testing capacitors' hard defects in notch sc filters using the oscillation method", in proc. of the 5th small system simulation symposium, ssss 2014, niš, serbia, pp. 30-36.
[44] p. e. allen and d. r. holberg, cmos analog circuit design, 2nd ed., oxford university press, new york, usa, 2002.
[45] f. h. ironns, active filters for integrated circuits applications, artech house, norwood, ma, usa, 2005.
[46] p. e. fleischer and k. r. laker, "a family of active switched capacitor biquad building blocks", bell system technical journal, no. 58, december 1979, pp. 2235-2269.
[47] -, lt spice user manual, http://www.intactaudio.com/forum/viewtopic.php?t=596.
[48] c. chalk, m. zwolinski, and b. r. wilkins, "test stimulus generation for steady-state analysis of analogue and mixed-signal circuits", proc. of the 3rd ieee international mixed signal testing workshop, 1997, pp. 85-92.
[49] p. kabisatpathy, a. barua and s. sinha, fault diagnosis of analog integrated circuits, springer, dordrecht, the netherlands, 2005.
[50] s.
mosin, "a built-in self-test circuitry based on reconfiguration for analog and mixed-signal ic", information technology and control, 2011, vol. 40, no. 3, pp. 260-264.
[51] g. hu, h. wang, m. hu and s. yang, "oscillation test strategy for analog filters by monitoring output voltage and supply current," tsinghua science and technology, vol. 12, no. s1, july 2007, pp. 78-82.
[52] b. kaminska, "analog and mixed signal test", in: eda for ic system design, verification, and testing, crc press, taylor&francis group, london, 2006.
[53] v. litovski and m. zwolinski, vlsi circuit simulation and optimization, chapman and hall, london, 1997.
[54] -, linear technology, http://www.linear.com/designtools/software/?gclid=ck_dzsaknl4cfqbmtaod2akarg#ltspice

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 327-350 doi: 10.2298/fuee1703327g

modelling solar cell s-shaped i-v characteristics with dc lumped-parameter equivalent circuits - a review

francisco j. garcía-sánchez 1,2, beatriz romero 1, denise c. lugo-muñoz 2,3, gonzalo del pozo 1, belén arredondo 1, juin j. liou 3, adelmo ortiz-conde 2

1 superior school of experimental science and technology (escet), rey juan carlos university (urjc), móstoles, madrid 28933, spain
2 solid state electronics laboratory (lees), simón bolívar university (usb), caracas 1080a, venezuela
3 emoat, llc, 1933 ayrshier place., oviedo, fl 32765, usa

abstract. this article reviews and appraises the dc lumped-parameter equivalent circuit models that have been proposed so far for representing some types of solar cells that can exhibit, under certain circumstances, a detrimental s-shaped concave deformation within the energy-producing fourth quadrant of their illuminated i-v characteristics.
we first present a very succinct recollection of lumped-parameter equivalent circuits that are commonly used to model conventional solar cells in general. we then chronologically present and discuss lumped-parameter equivalent sub-circuits that, combined with conventional solar cell equivalent circuits, are used to specifically represent the undesired s-shaped behaviour. the mathematically descriptive equations of each complete equivalent circuit are also examined, and closed form solutions for the terminal current and voltage as explicit functions of each other are presented and discussed whenever available. while comparing the most salient features and explaining the practical advantages and disadvantages of such equivalent circuit models, we offer some comments on possible directions for further improvement.

key words: solar cell lumped-parameter equivalent circuit modelling, solar cell concentrated-element equivalent circuit models, s-shaped current-voltage characteristics, s-shape kink, organic solar cells, lambert w function

received february 19, 2017
corresponding author: francisco j. garcía-sánchez, solid state electronics laboratory (lees), simón bolívar university (usb), caracas 1080a, venezuela (e-mail: fgarcia@ieee.org)

1. introduction

the process of designing practical photovoltaic applications calls for the availability of dc lumped-parameter equivalent circuit models as simple as possible to compactly describe the solar cells' electric behaviour represented by their terminals' current-voltage (i-v) characteristics, measured in the dark and under standard illumination conditions.
solar cell lumped-parameter (or concentrated-element) equivalent circuit models ignore the spatial distribution of the electrical mechanisms present, and instead assume that they are concentrated and represented by certain idealized passive and active linear and nonlinear electrical components, typically resistors, capacitors, inductors, diodes, and current and voltage sources, located at certain positions in an electrical network. under steady state (dc) conditions neither inductors nor capacitors are used. such simple equivalent circuits constitute essential tools for photovoltaic systems simulation, as well as for the important task of advancing basic and applied research and technological development of emerging solar cells' materials, structures, and fabrication techniques.

most well-established conventional solar cells under illumination exhibit the type of generic i-v characteristics that can be satisfactorily represented under steady state by the equivalent electrical behaviour of some of the conventional dc lumped-parameter circuit models that are shown in fig. 1 [1]. however, there are innovative developmental or still experimental solar cells, such as some of those based on binary and ternary compound semiconductors, non-crystalline hetero-junctions [2], novel silicon quantum dot solar cells [3], others based on perovskite semiconductors [4-6], and most notably organic semiconductor-based solar cells [7-9], that might exhibit under certain circumstances undesirable deformations of their illuminated i-v characteristics that impair their energy conversion capacity. the main feature of this apparent anomaly becomes evident when the solar cell's i-v characteristics under illumination present a peculiar concave shape within the fourth quadrant (the power generating quadrant), instead of exhibiting the normally expected, so-called "j" type conventional convex shape.
this deformation of the illuminated i-v curve is commonly referred to as the s-shape "kink" of the i-v characteristics [10]. the presence of such a bend seriously reduces the solar cell's fill factor by depressing the location of the maximum power point, and thus represents a serious impairment for the cell's power conversion efficiency that must be avoided, minimised or suppressed [7, 11]. in the sections that follow we offer a chronological perspective view of the most relevant dc lumped-parameter equivalent circuit models that have been proposed to date for specifically describing in a compact way this adverse s-shaped behaviour observed in the illuminated i-v characteristics of some otherwise promising solar cells.

2. solar cell equivalent circuit models

the simplest possible mathematical description of the i-v characteristics at the terminals of any conventional solar cell measured under illumination conditions consists of adding a photo-generated current to the well known shockley's ideal diode current equation [12]. the equation resulting from adding these two terms is an explicit compact model of the terminal current expressed as an exponential function of the terminal voltage. this simplest mathematical description of a solar cell's electrical behaviour under illumination represents a corresponding dc lumped-parameter equivalent circuit model that consists of the parallel combination of a diode and an illumination-dependent current source, as portrayed in fig. 1(a). such descriptive mathematical representation is practically appealing, not only because of its compactness, but also because its explicit nature allows it to be easily inverted and numerically calculated.
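the explicit and easily inverted nature of this simplest model can be illustrated with a short sketch. it assumes the sign convention of eq. (7) further on (photo-current entering with a negative sign, so that the power-producing region lies in the fourth quadrant); the function names, the default ideality factor n = 1 and all parameter values are illustrative assumptions, not data from the reviewed works.

```python
import math

VTH = 0.02585  # assumed thermal voltage kbt/q near 300 K, in volts


def current(v, i0, iph, n=1.0, vth=VTH):
    """Terminal current of the diode-plus-current-source model of fig. 1(a)."""
    return i0 * (math.exp(v / (n * vth)) - 1.0) - iph


def voltage(i, i0, iph, n=1.0, vth=VTH):
    """Explicit inverse of current(): terminal voltage for a given current."""
    return n * vth * math.log((i + iph) / i0 + 1.0)


# Hypothetical parameter values, chosen only to exercise the formulas.
I0 = 1e-9    # reverse saturation current, A
IPH = 5e-3   # photo-generated current, A

isc = current(0.0, I0, IPH)   # short-circuit current (equals -iph here)
voc = voltage(0.0, I0, IPH)   # open-circuit voltage
print(isc, voc)
```

because both directions are closed-form, no iteration is needed; this is the property that is lost once parasitic resistances are added.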
unfortunately, this simplest model usually falls short of adequately describing all the relevant electrical phenomena that must be considered for solar cell development and photovoltaic system simulation and design. thus, the corresponding very basic dc lumped-parameter equivalent circuit model shown in fig. 1(a) is often deemed not accurate enough to be of practical use.

2.1. conventional dc lumped-parameter circuit models

the basic equivalent circuit model is modified to offer a more realistic representation by including other elements, especially parasitic resistors added as lumped elements connected in series and/or in parallel to account for the possible presence of significant ohmic losses, as indicated in figs. 1(b), (c) and (d). similarly, in order to better account for the possible presence of more than one significant junction conduction mechanism, the equivalent circuit might also need to include more than one diode connected in parallel with the photo-current source, as presented in fig. 1(e). relevant lumped parameters potentially introduced in these more complex equivalent circuit models, in addition to the value(s) of the series rs and shunt rp = 1/gp resistors, are the magnitudes i01, i02, ... of the reverse saturation current(s) and the corresponding value(s) of the junction ideality factor(s) n1, n2, ... of the possible multiple diodes needed.

fig. 1 typical generic solar cell dc lumped-parameter equivalent circuit models showing the photo-generated current source with: (a) a single ideal diode in parallel; (b) plus a series resistance; (c) plus a parallel conductance; (d) plus both series resistance and parallel conductance; and (e) several ideal diodes plus a series resistance.
the parameters that may be used are supposed to bear direct associations to relevant fundamental microscopic physical features and phenomena actually present in the real solar cell to be modelled. the various circuit elements added to the basic dc equivalent circuit model of fig. 1(a) undoubtedly improve the model's descriptive fidelity. however, their presence, as shown in figs. 1(b), (c), (d) and (e), also fundamentally complicates the mathematical handling of the resulting descriptive i-v equations. because of them the basic equation ceases to be explicit and becomes an implicit transcendental equation. from the point of view of photovoltaic system simulation and solar cell model parameter extraction through curve fitting, being transcendental is an undesirable trait of the descriptive equations. except in very few specific cases, they cannot be explicitly solved for the terminal current as a function of the terminal voltage, and vice versa, using only elementary functions.

2.2. conventional models' mathematically descriptive equations and their solutions

luckily, there is the lambert w function, which we will refer to here as w for short. this function comes in very handy for explicitly solving equations which are made up of both linear and exponential terms, such as those equations that describe the circuits of figs. 1(b), (c), and (d). this special function w, whose utility was ignored to a large extent until not long ago, may be succinctly defined as the solution to the generic linear-exponential equation $z = w(z)\,e^{w(z)}$, where z is any complex number [13, 14]. around the turn of this century w started to become an accepted and increasingly ubiquitous tool for solving various important problems of physics [15, 16].
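the defining relation z = w(z) e^{w(z)} lends itself to a direct numerical solution. the following is a minimal sketch of the principal branch for real z >= 0 only (the range needed by the diode equations of this section), using halley's iteration; the function name, starting guess, tolerance and iteration limit are assumptions of this sketch, and production code would normally call an existing library routine instead.

```python
import math


def lambert_w0(z, tol=1e-14, max_iter=60):
    """Principal branch w0(z) for real z >= 0, solving w * exp(w) = z."""
    if z < 0.0:
        raise ValueError("this sketch only covers real z >= 0")
    w = math.log1p(z)  # starting guess; it is an upper bound of w0(z) for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - z  # residual of the defining equation
        # Halley update: Newton step with a second-derivative correction.
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


w_one = lambert_w0(1.0)
print(w_one, w_one * math.exp(w_one))
```

the printed pair is the computed value and the result of substituting it back into the defining equation, which should reproduce the argument.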
the problems newly solved by using w prominently include important areas related to semiconductor physics, such as electronic devices and circuits, where linear-exponential equations abound since they play essential roles in describing the underlying phenomenology. numerical calculation of w is relatively transparent nowadays, since various methods exist to quickly compute the principal branch w0(z) and the other branches of w. additionally, efficient algorithms are routinely implemented in most major mathematical software packages, physics and device modelling tools, and circuit simulation systems.

two seminal works dealing with the use of w in the field of electronic circuit problems were published in the year 2000. one was an exact w-based analytical solution, proposed by banwell and jayakumar [17], of the terminal current i as an explicit function of the terminal voltage v for shockley's modified equation [12]. it describes a circuit consisting of the series combination of a single diode and a lone resistor rs (similar to the circuit of fig. 1(b) less the current source), which is expressed as:

$$ i = i_0 \left[ \exp\left( \frac{v - i r_s}{n v_{th}} \right) - 1 \right], \tag{1} $$

where i0 is the reverse saturation current of the diode, vth = kbt/q is the thermal voltage and n is the so-called diode ideality factor, which describes how much the diode's junction carrier transport mechanisms deviate from supposedly "ideal" behaviour (n=1). the exact w-based analytical solution of the terminal current as an explicit function of the terminal voltage presented by banwell and jayakumar is [17]:

$$ i = \frac{n v_{th}}{r_s}\, w_0\!\left[ \frac{i_0 r_s}{n v_{th}} \exp\left( \frac{v + i_0 r_s}{n v_{th}} \right) \right] - i_0 . \tag{2} $$

because only a series-connected resistor was assumed to be involved in this problem, just the terminal current i needs to be explicitly solved using w, whereas the terminal voltage v can be directly expressed as an explicit function of the terminal current using the natural logarithm elementary function:

$$ v = n v_{th} \ln\left( \frac{i + i_0}{i_0} \right) + i r_s . \tag{3} $$

the other seminal work about the use of w in the field of electronic circuit problems was published by ortiz-conde et al, also in 2000 [18]. it was more comprehensive in the sense that it also contemplated the presence of significant shunt conductance gp = 1/rp. this work presented the derivation of the two exact w-based analytical solutions for both the terminal current and the terminal voltage, as explicit functions of each other, of the transcendental equation corresponding to the circuit composed of a single diode and both series- and shunt-connected resistors, rs and rp, respectively (similar to the circuit of fig. 1(d) less the current source). shockley's modified terminal current equation in this case has an extra implicit term that accounts for the additional shunt conductance:

$$ i = i_0 \left[ \exp\left( \frac{v - i r_s}{n v_{th}} \right) - 1 \right] + \frac{v - i r_s}{r_p} . \tag{4} $$

the w-based closed form analytic solutions for both i and v as explicit functions of each other, as presented by ortiz-conde et al [18], are, for the current:

$$ i = \frac{n v_{th}}{r_s}\, w_0\!\left\{ \frac{i_0 r_s}{n v_{th} (r_s g_p + 1)} \exp\left[ \frac{v + i_0 r_s}{n v_{th} (r_s g_p + 1)} \right] \right\} + \frac{v}{r_s} - \frac{v + i_0 r_s}{r_s (r_s g_p + 1)}, \tag{5a} $$

or

$$ i = \frac{v}{r_s} - \frac{n v_{th}}{r_s} \ln\left\{ \frac{n v_{th} (r_s g_p + 1)}{i_0 r_s}\, w_0\!\left[ \frac{i_0 r_s}{n v_{th} (r_s g_p + 1)} \exp\left( \frac{v + i_0 r_s}{n v_{th} (r_s g_p + 1)} \right) \right] \right\}; \tag{5b} $$

and for the voltage:

$$ v = i r_s - n v_{th}\, w_0\!\left[ \frac{i_0}{n v_{th} g_p} \exp\left( \frac{i + i_0}{n v_{th} g_p} \right) \right] + \frac{i + i_0}{g_p}, \tag{6a} $$

or

$$ v = i r_s + n v_{th} \ln\left\{ \frac{n v_{th} g_p}{i_0}\, w_0\!\left[ \frac{i_0}{n v_{th} g_p} \exp\left( \frac{i + i_0}{n v_{th} g_p} \right) \right] \right\} . \tag{6b} $$

it might be noticed that eliminating the shunt conductance loss (letting gp = 1/rp → 0) does revert (5) back into (2), as it should, but does not allow (6) to be directly converted into (3).

four years after the publication of these two seminal works about how to use w to derive exact explicit solutions for both i and v of a circuit consisting of a (dark) diode with series and parallel resistors, jain and kapoor illuminated in 2004 the previously dark circuit model by adding, in parallel with the diode, a current source of photo-generated intensity iph [19]. by so doing, the previously dark circuit became the illuminated solar cell equivalent circuit model shown in fig. 1(d). the new constant current concisely represents the photo-current iph generated by the transport and collection of separated charge carriers that are photo-generated within the cell's body by the absorption of sufficiently energetic incoming photons that penetrate through the diode's illuminated surface (now a photovoltaic diode). this photo-current must be inserted as an additional constant current term into the descriptive equation (4), so that now under illumination (4) becomes:

$$ i = i_0 \left[ \exp\left( \frac{v - i r_s}{n v_{th}} \right) - 1 \right] + \frac{v - i r_s}{r_p} - i_{ph} . \tag{7} $$

the addition of the constant to transcendental eq. (4) does not alter the manner in which eq. (7) is solved, which remains the same as it was for eq. (4) [18]. the resulting exact w-based analytical solutions for both the terminal current and the terminal voltage, as explicit functions of each other, published in 2004 by jain and kapoor [19], are similar to eqs. (5) and (6), except for the presence of the added iph term.
for the current:

$$ i = \frac{n v_{th}}{r_s}\, w_0\!\left\{ \frac{i_0 r_s}{n v_{th} (r_s g_p + 1)} \exp\left[ \frac{v + (i_0 + i_{ph}) r_s}{n v_{th} (r_s g_p + 1)} \right] \right\} + \frac{v}{r_s} - \frac{v + (i_0 + i_{ph}) r_s}{r_s (r_s g_p + 1)}, \tag{8a} $$

or

$$ i = \frac{v}{r_s} - \frac{n v_{th}}{r_s} \ln\left\{ \frac{n v_{th} (r_s g_p + 1)}{i_0 r_s}\, w_0\!\left[ \frac{i_0 r_s}{n v_{th} (r_s g_p + 1)} \exp\left( \frac{v + (i_0 + i_{ph}) r_s}{n v_{th} (r_s g_p + 1)} \right) \right] \right\}; \tag{8b} $$

and for the voltage,

$$ v = i r_s - n v_{th}\, w_0\!\left[ \frac{i_0}{n v_{th} g_p} \exp\left( \frac{i + i_0 + i_{ph}}{n v_{th} g_p} \right) \right] + \frac{i + i_0 + i_{ph}}{g_p}, \tag{9a} $$

or

$$ v = i r_s + n v_{th} \ln\left\{ \frac{n v_{th} g_p}{i_0}\, w_0\!\left[ \frac{i_0}{n v_{th} g_p} \exp\left( \frac{i + i_0 + i_{ph}}{n v_{th} g_p} \right) \right] \right\} . \tag{9b} $$

therefore, having inserted the additional photocurrent term iph into eqs. (5) and (6), they have become the w-based solutions (8) and (9), which explicitly describe the electric behaviour of illuminated solar cells with significant series and shunt parasitic resistances. it is interesting to check that turning the light off (by letting iph → 0) reverts eqs. (8) and (9), as they should, back into the original eqs. (5) and (6), respectively. one year later, in 2005, jain and kapoor directly used these same w function-based solutions, corresponding to the conventional solar cell lumped-parameter equivalent circuit model of fig. 1(d), to study organic solar cells [20].

in addition to a significant presence of both series and shunt parasitic resistances, sometimes the presence of more than one significant conduction mechanism is evident in the measured i-v characteristics of the solar cell. in such cases multiple-diode equivalent circuit models are called for. they contain more than just one diode in parallel with the photocurrent source, as shown in fig. 1(e). consequently, the corresponding equations turn out to be of a multi-exponential nature and, thus, are in general more difficult, or even impossible, to solve exactly in an explicit form.
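as a numerical check of the single-diode explicit solution (8a) discussed above, the current it returns can be substituted back into the implicit equation (7); the residual should vanish to within rounding. the sketch below does this under stated assumptions: the parameter values are hypothetical, the function names are illustrative, and the small lambert w routine (halley iteration, real non-negative arguments only) is a stand-in for a library implementation.

```python
import math


def lambert_w0(z, tol=1e-14, max_iter=60):
    """Principal branch of w for real z >= 0 (Halley iteration)."""
    if z < 0.0:
        raise ValueError("this sketch only covers real z >= 0")
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - z
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def current_8a(v, i0, iph, n, vth, rs, gp):
    """Terminal current as an explicit function of voltage, eq. (8a)."""
    a = n * vth * (rs * gp + 1.0)
    arg = (i0 * rs / a) * math.exp((v + (i0 + iph) * rs) / a)
    return ((n * vth / rs) * lambert_w0(arg)
            + v / rs
            - (v + (i0 + iph) * rs) / (rs * (rs * gp + 1.0)))


def residual_7(v, i, i0, iph, n, vth, rs, gp):
    """Right-hand side minus left-hand side of the implicit eq. (7)."""
    vd = v - i * rs  # voltage across the diode and the shunt branch
    return i0 * (math.exp(vd / (n * vth)) - 1.0) + gp * vd - iph - i


# Hypothetical parameter set, chosen only to exercise the formulas.
P = dict(i0=1e-9, iph=5e-3, n=1.5, vth=0.02585, rs=10.0, gp=1e-3)

for v in (-0.2, 0.0, 0.3, 0.6):
    i = current_8a(v, **P)
    print(v, i, residual_7(v, i, **P))
```

setting iph to zero in the same functions gives the dark-circuit solution (5a), which offers a second consistency check: the current at v = 0 must then be zero.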
regardless of the difficulty, such multiple-diode equivalent circuits must be used whenever the presence of multiple junction conduction mechanisms must be adequately described because their relative significance so demands [21, 22]. for a complete review of the existing literature about generic solar cell dc lumped-parameter equivalent circuit models and their corresponding equations, solutions, and methods for numerically extracting their parameters, see refs. [1, 23-26] and the references cited therein.

although most solar cells can be adequately described by one of the generic lumped-parameter equivalent circuit models just mentioned, some researchers still prefer to use other models that are specifically intended for particular types of solar cells. for example, j. w. jin et al recently published a universal compact model for organic solar cells, which consists of individually describing three different regimes of operation and then combining their mathematical descriptions into a single equation [27]. unfortunately, even that universal model for organic solar cells is not capable of describing the concave s-shaped behaviour occasionally exhibited by the illuminated i-v characteristics of organic solar cells.

in fact, none of the conventional dc lumped-parameter equivalent circuit models just described seems to be capable by itself of properly modelling the undesirable s-shaped behaviour observed in the illuminated i-v characteristics of several types of solar cells [28, 29]. consequently, more suitable specialised circuit models need to be introduced to specifically represent the s-shaped kink.
in the following sections we will analyse how best to describe, through other dc lumped-parameter equivalent circuit models, the harmful s-shaped deformation of the illuminated i–v characteristics, whose presence might seriously spoil the energy conversion performance of solar cells, especially but not exclusively that of organic solar cells [7].

3. the s-shape kink

as already mentioned above, some promising types of solar cells can, and do, under certain circumstances exhibit the undesirable s-shaped concave deformation of their illuminated i-v characteristics. the s-shape kink is most evident in the fourth quadrant, where it can seriously reduce the fill factor and thus impair the solar energy conversion efficiency of the device. therefore, when this is the case, corrective or palliative measures must be adopted to avoid or suppress the emergence of this detrimental kink.

many researchers have proposed explanations of the probable causes of this s-shaped concavity, but its origins are still not totally clear. materials-related charge transport restrictions and charge accumulation-related interface phenomena, which alter the distribution of the solar cell’s internal electric field, are generally regarded to be mainly responsible for the occurrence of the s-shaped kink [5, 7, 11, 30-33]. in organic solar cells, misaligned metal work functions and selective blocking contacts can produce injection barriers, and insulating interfacial layers between the metal and the active layers can produce extraction barriers, both of which might produce the dreaded s-shape [34].

similar more or less pronounced s-shapes can also be observed in the measured characteristics of many types of experimental and developmental photovoltaic devices.
this is the case, for example, of transient forward, or even reverse, i−v sweeps of perovskite semiconductor-based solar cells, where both ion migration and mobile charge trapping seem to cause undesirable scan direction-dependent hysteresis in their illuminated i-v curves [4, 5]. the same kind of s-shaped kink has also been observed in a-si/c-si hetero-junction solar cells at certain temperature and illumination levels [2]. other, more exotic types of emerging photovoltaic devices also display this type of detrimental behaviour. that is the case, for example, of the novel experimental ultra-thin photovoltaic cells made with van der waals force-bonded hetero-structures containing atomically thin layers of semiconducting transition metal dichalcogenides (such as mos2, ws2 and wse2) [35]. their illuminated i-v curves also exhibit detrimental s-shaped deformations that need to be suppressed for this attractive type of device to ever achieve usable energy conversion efficiency levels.

it is, therefore, essential to identify the possible origins of the s-shape kink if we intend to avoid or diminish it. thus, identifying and understanding the origin(s), as well as quantifying their influence on the s-shape kink’s emergence, growth or suppression, becomes a crucial task for optimising the design of such solar cells. this goal may be conveniently achieved through the introduction of additional lumped elements into an existing conventional solar cell equivalent circuit model, to modify it so that it may electrically account for the full range of illuminated i-v characteristics, especially including the s-shaped kink behaviour. this was the kind of analysis followed by l. zuo et al, among others, for investigating the origin of the s-shaped kink in the i-v characteristics of organic solar cells [36], who used an equivalent circuit model approach [37] as a tool for analysis.
the object of study then becomes the evolution of the solar cell’s i-v characteristics experimentally measured under different operating conditions (illumination intensity, temperature, etc.), or in response to adjustments in material composition, morphology, structural design, fabrication specifications, etc. the analysis of how the parameter values of the equivalent circuit’s lumped elements (as extracted by fitting the model’s equations to the measured data) change in response to modifications of the conditions can be used as a powerful tool to scrutinise and understand the causes of the s-shape kink and, thus, to learn how it may be best suppressed.

4. equivalent circuit modelling of the s-shape kink

many solar cells unfortunately exhibit this undesirable s-shaped “kink” visibly in the fourth quadrant of their illuminated i-v characteristics. since the kink cannot be described using only the conventional dc lumped-parameter equivalent circuit models shown in fig. 1 and discussed in the preceding sections, ancillary circuits have been proposed for over a decade [28, 29]. the additional lumped elements must be incorporated together with a conventional dc lumped-parameter equivalent circuit model to offer an overall description of the illuminated i-v characteristics.

4.1. model by b. mazhari (2006)

as early as 2006 mazhari already understood the inability of stand-alone conventional dc lumped-parameter equivalent circuit models to properly describe the measured i-v characteristics of some illuminated organic solar cells [38].
he suggested that the commonly held hypothesis that the photo-current of organic solar cells remains essentially constant throughout the whole fourth quadrant (0<v<voc) [...] (i>0; v>voc), where the current yielded by this model is needlessly forced to level off, following a general trend imposed to a great extent by the i-v locus of rp2 (see dashed blue lines in fig. 6). the reason is that in this model, shown in fig. 3, the first quadrant is primarily dominated, for large forward voltages >>voc, by the linear parallel resistor rp2, and thus the i–v curve turns out to be quasi-linear. instead, what seems to actually happen in real cells that exhibit these s-shape kinks is that their i-v characteristic, when measured under illumination in the first quadrant beyond the open circuit voltage (i>0; v>voc), at some point starts to describe an upward turn and continues to grow in what appears to be an exponential-like fashion [7, 30, 31, 33].

this circuit has been successfully applied in different experiments that involve s-shape removal with annealing [43] and uv soaking [46], and it has been validated with impedance measurements and ac modelling [47].

fig. 4 dc lumped-parameter equivalent circuit model proposed by gaur and kumar [29] to describe the i-v characteristics of polymer solar cells under dark conditions. it looks almost like a conventional double-diode model with series and shunt resistances, except that the diodes have opposite polarities.

4.3. model by gaur and kumar (2013)

in 2013 kumar and gaur [28, 29] proposed an improved equivalent circuit model to represent the behaviour of polymer solar cells under different environmental conditions. the proposed model does not result in a single compact formulation. the actual equivalent circuit turns out to be very complex and contains many circuit elements.
the main reason for this is that the model itself treats the dark and illuminated characteristics separately, and even the forward and reverse characteristics are dealt with separately. the proposed dark equivalent circuit is a parallel combination of a shunt resistor and two diodes connected with opposite polarity, all connected in series with a series resistor, as presented in fig. 4. the dc equivalent circuit proposed to represent the i–v characteristics under illumination is based on the chief assumption that the photo-current is not constant but varies with applied voltage. it contains the dark circuit elements plus a zener diode, up to four more diodes, two photo-generated current sources, and additional unconventional resistors. it is shown in fig. 1(b) of [28] and fig. 7 of [29], but it is too complex to be usefully reproduced here. although this model allowed kumar and gaur to understand the phenomenology of degraded p3ht:pcbm polymer solar cells, its complexity is such that it is not suggested as a practical compact equivalent circuit model to efficiently represent in general the s-shaped concave deformations that are observed under illumination in the i-v characteristics of some types of solar cells.

4.4. model by f. j. garcía-sánchez et al (2013)

to deal with the inability of the model by araujo de castro et al [10], shown in fig. 3, to faithfully model real measured i-v data far beyond voc, a minor but crucial modification was introduced in 2013 by garcía-sánchez et al [48]. the improvement affects the sub-circuit #2i part of the proposed equivalent circuit model diagram presented in fig. 5.

fig. 5 the solar cell dc lumped-parameter equivalent circuit model proposed by garcía-sánchez et al [48] to allow describing the s-shaped kink.
it includes the same conventional single ideal diode with series and shunt resistances as before (sub-circuit #1), but the series-connected sub-circuit #2 has been modified to replace the previous resistor rp2 by a third diode connected with reversed polarity, the same as that of the diode of sub-circuit #1.

as in the model by araujo de castro et al [10], to reproduce the s-shaped concave region up to about the open circuit voltage (i<0; 0<v<voc) [...] (i>0; v>voc). the substitution of the parallel resistor rp2 by the third diode modifies the current through sub-circuit #2i, which now is:

$$ i = -i_{02}\left[\exp\left(-\frac{v_2}{n_2 v_{th}}\right) - 1\right] + i_{03}\left[\exp\left(\frac{v_2}{n_3 v_{th}}\right) - 1\right] . \qquad (18) $$

as before, we can write the solution for this equivalent circuit’s terminal voltage v as a function of the terminal current i by adding the voltage drop i rs across the series resistor rs and the voltages v1 and v2 across the terminals of each of its two sub-circuits, as in (13), which is repeated here:

$$ v = i r_s + v_1 + v_2 . \qquad (19) $$

however, now v2 must be obtained by solving (18), and that solution is not explicit in general. it has closed-form w-based explicit solutions in some particular cases: when both ideality factors n2 and n3 are equal, or when one is twice as large as the other [48]. otherwise, in general (18) would have to be solved numerically or approximately for the voltage v2. this is, of course, the price that must be paid for having a model with two diodes connected in parallel.

to illustrate the difference that the addition of diode 3 makes regarding the description of the observed upturn of the illuminated i–v curve for i>0; v>voc, we present in fig. 6, in linear and semi-logarithmic scales, the synthetic i-v characteristics of a hypothetical solar cell under illumination, as generated by numerical calculation using the two dc lumped-parameter equivalent circuit models (depicted in figs.
3 and 5), with suitable parameter values indicated in the inset of fig. 6(b). notice that in this particular example the two ideality factors n2 and n3 were chosen to have equal values, so that the solution for v2 turns out to be explicit [48]. therefore the terminal voltage v calculated from (19) is also an explicit function of the terminal current. as can be seen in fig. 6, the equivalent circuit models of figs. 3 and 5 adequately describe, as expected, the s-shaped kink in the fourth quadrant of the illuminated i–v characteristics (i<0; 0<v<voc) [...] (i>0; v>voc).

fig. 6 comparison of the s-shaped kinks in two synthetic illuminated i-v characteristics, presented in linear (a) and semi-logarithmic (b) scales, as generated by the dc lumped-parameter equivalent circuit models shown in figs. 3 and 5, using the parameter values indicated within the lower pane (dotted black lines = sub-circuit #1, dashed blue lines = first model by araujo de castro et al [10], continuous red lines = model by garcía-sánchez et al [48]).

the outstanding difference between the two equivalent circuit models is easily visualised by comparing the dashed blue and continuous red curves presented in fig. 6, which correspond to i-v characteristics calculated with the model by araujo de castro et al [10] of fig. 3 and with the model by garcía-sánchez et al [48] of fig. 5, respectively. as an example of a real exponential-like upward bend in the first quadrant of the i-v curve beyond the open circuit voltage (i>0; v>voc), data points corresponding to an experimental organic solar cell with s-shaped kink, measured under arbitrary illumination and described in [48], are presented in fig. 7.

fig.
7 illuminated i-v characteristics of an experimental organic solar cell with s-shaped kink (dashed black straight line = series resistance i-v curve, dotted black line = sub-circuit #1 i-v curve, continuous dashed blue lines = sub-circuit #2 i-v curve and total model playback of previous model, red lines = sub-circuit #2i i-v curve and total model playback of newer model, green circles = measured i-v data).

also shown in fig. 7 is the playback calculated using the model by garcía-sánchez et al [48] with the parameter values indicated in the figure, which had been previously extracted by curve fitting of the model’s equation to the originally measured data. notice that in this case the relation n3=2n2 was imposed as a fitting condition, so that the solution for v2 also turns out to be explicit [48], and the terminal voltage v calculated from (19) is also an explicit function of the terminal current. whenever the model’s equations can be solved explicitly, we can avoid numerical iteration and thus ease the necessary curve fitting to experimental data when extracting the cell’s model parameters. such explicit equations are also desirable because they may be operated on analytically, which facilitates the derivation of other analytic expressions, such as the temperature dependence of the open-circuit voltage.

for assessment purposes, fig. 7 includes three separate i-v curves. two correspond to the conventional solar cell equivalent circuit model (sub-circuit #1): one is the rs curve (dashed black straight line), and the other (dotted black curve) corresponds to the parallel combination of the constant photo-current source, the first diode, and its companion shunt resistor rp1. the third curve (continuous red line) corresponds to the s-shape-generating sub-circuit #2i, which is made up of the parallel combination of the second and third diodes with opposite polarities.
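the way the three series-connected parts add up along the voltage axis can be sketched numerically. the example below is only an illustration under stated assumptions, not the procedure of [48]: it takes the anti-parallel diode pair with equal ideality factors, in which case the current through the pair becomes a quadratic in exp(v2/(n vth)) and v2 follows from a logarithm; sub-circuit #1 is evaluated with a w-based expression of the same form as (9a) without the series term; and all function names and parameter values are hypothetical.

```python
import math


def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch of the Lambert W function for x >= 0 (Halley iteration)."""
    if x <= 0.0:
        return 0.0 if x == 0.0 else float("nan")
    w = math.log1p(x) if x < math.e else math.log(x) - math.log(math.log(x))
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - x
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def v1_subcircuit1(i, i01, n1_vth, gp1, iph):
    """Voltage across sub-circuit #1 (photo-current source, diode 1, shunt gp1)."""
    d = (i + i01 + iph) / gp1
    return d - n1_vth * lambert_w0(i01 / (n1_vth * gp1) * math.exp(d / n1_vth))


def v2_subcircuit2i(i, i02, i03, n_vth):
    """Voltage across the anti-parallel diode pair, assuming n2 = n3 = n."""
    # With z = exp(v2 / n_vth) the current equation becomes
    # i03*z**2 - (i + i03 - i02)*z - i02 = 0; take the positive root.
    b = i + i03 - i02
    root = math.sqrt(b * b + 4.0 * i02 * i03)
    # Two algebraically equal forms, chosen to avoid cancellation.
    z = (b + root) / (2.0 * i03) if b >= 0.0 else 2.0 * i02 / (root - b)
    return n_vth * math.log(z)


def terminal_voltage(i, rs, i01, n1_vth, gp1, iph, i02, i03, n23_vth):
    """Sum of the three series voltages: v = i*rs + v1 + v2."""
    return (i * rs
            + v1_subcircuit1(i, i01, n1_vth, gp1, iph)
            + v2_subcircuit2i(i, i02, i03, n23_vth))


# Hypothetical parameter values (amperes, volts, ohms, siemens), for illustration only.
cell = dict(rs=1.0, i01=1e-9, n1_vth=0.04, gp1=1e-3, iph=0.01,
            i02=2e-3, i03=1e-4, n23_vth=0.05)

if __name__ == "__main__":
    for k in range(-8, 9, 2):
        i = k * 1e-3
        print(f"i = {i:+.3f} A  v = {terminal_voltage(i, **cell):+.4f} V")
```

because every term is an explicit function of the terminal current, the curve is generated without iteration on the circuit equations, which is the practical advantage of the explicit cases mentioned above.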
a quick look at the shapes of the curves in fig. 7, in light of the terminal voltage equation (19), visually indicates how the s-shape kink is formed in the total model’s i-v curve (shown in fig. 7 on top of the green circles that represent the data points). in fact, the simple graphical addition of the three curves along the voltage axis, equivalent to adding the voltages across the three series-connected parts of the model (rs, sub-circuit #1 and sub-circuit #2i), confirms that the result is indeed a sort of double s shape, by virtue of the upward turn at its highest voltage end. additionally, we may notice that the inflexion point of the total i-v curve shown is located at voc, as expected from the fact that the i-v characteristic of sub-circuit #2i, which is the anti-parallel combination of the two extra diodes, has its inflexion point at zero voltage.

it is worth mentioning at this point that the proposal that a conventional solar cell equivalent circuit model be connected in series to an additional circuit with a configuration such as that of sub-circuit #2i is not the result of an arbitrary attempt to empirically reproduce the observed upturn beyond voc. rather, it is based on a certain understanding of how to best generalise the possible mechanisms that might be present in the different types of solar cells that exhibit this upturn beyond voc. therefore, a configuration such as that of sub-circuit #2i is most probably justifiable as a reasonable circuital representation of specific underlying physical phenomena taking place near the interfaces of solar cells that display the s-shaped kink.

4.5. model by l. zuo et al (2014)

l. zuo et al [36] proposed in 2014 an improved equivalent circuit model for organic solar cells.
they explain their proposal saying: “in view of the previous studies, an improved equivalent circuit is proposed to interpret the origin of s-shaped i–v curve and its effect on device performance.” their equivalent circuit model contains, like those proposed before, a sub-circuit with a rectifying junction connected in series with the conventional single-diode photovoltaic equivalent circuit, which is considered the essential reason for the s-shape curve. however, no mention is made in [36] of the previous models already proposed by araujo de castro et al [10] and garcía-sánchez et al [48] in 2010 and 2013, respectively. this sub-circuit is shown within the complete equivalent circuit model diagram presented in fig. 8.

notice that there are two remarkable differences with respect to the original model by araujo de castro et al [10] (see fig. 1(b) of [36]). the first and foremost is that the second diode in sub-circuit #2 is connected with the same forward polarity as the diode in the photovoltaic sub-circuit #1, whereas in the model proposed by araujo de castro et al [10] the second diode in sub-circuit #2 was connected with reverse polarity, opposite to that of the diode in the photovoltaic sub-circuit #1 (see fig. 3). in this case the second diode is supposed to be a schottky barrier junction introduced to represent the anode interface current caused by the rectifying properties induced by interfacial dipoles, unbalanced charge transport, etc. nonetheless, it is noted that this model is not limited to the use of schottky barriers, and thus any rectifying junction or non-ohmic contact would do.

we must draw attention here to the fact that if an ideal diode with the same forward polarity as the photovoltaic diode were connected by itself (without rp2) in series with sub-circuit #1 of fig. 8, it would certainly modify the shape of the illuminated i-v curve
in the first quadrant, but at the same time it would suppress almost completely the current in the fourth quadrant. therefore, if a significant (reverse) current is to flow through sub-circuit #2 under illumination, there must be substantial shunt current going around that second diode. that means that the value of the resistor rp2 that shunts the second forward diode in sub-circuit #2 (fig. 8) must be small. we would like to mention in passing that a series combination of ideal diodes with equal polarity was used in the past to model amorphous silicon junctions [9].

fig. 8 organic solar cell dc lumped-parameter equivalent circuit model proposed by zuo et al [36] to describe the s-shaped kink. it includes the conventional single ideal diode with series and shunt resistances (sub-circuit #1), and a series-connected sub-circuit #2 with a single schottky diode, a parallel resistor and an additional series resistor.

the second difference introduced in this model is a minor one. it refers to the fact that now there is a second series resistor rs2, in addition to rs1, the series resistor already present in the model proposed by araujo de castro et al [10] (compare figs. 3 and 8). although this second series resistor rs2 seems redundant, because from a circuits point of view it can be absorbed by rs1, according to the explanation given in [36] this resistor “rs2 stands for the series resistance of each layer and interface resistance.”

4.6. second model by f. araujo de castro et al (2016)

seeking further generalisation, araujo de castro et al published in 2016 [41] a modification of the previous model by garcía-sánchez et al [48]. in fact, what they proposed is a generalised 3-diode model (shown in fig. 9) aimed to, in these authors’ own words, “gain insight into the modelling and parametrisation of organic solar cell current voltage curves” [41]. the modification introduced by araujo de castro et al consists of two specific changes that are made to the previous model.
fig. 9 second solar cell dc lumped-parameter equivalent circuit model, proposed as an improvement by araujo de castro et al [41]. it is said to improve on the previous equivalent circuit model by garcía-sánchez et al [48]. the previously present series resistor has been eliminated. the previously eliminated shunt resistor rp2 has been added again in parallel, but now with the two diodes in the series-connected sub-circuit #2.

the first change made is the restoration of the shunt resistor rp2, originally connected in parallel with the single diode of sub-circuit #2 in the first model by araujo de castro et al [10] shown in fig. 3, which was later eliminated by garcía-sánchez et al [48] to be replaced by a third, forward-polarity diode in anti-parallel connection with the second diode. that shunt resistor rp2 of araujo de castro et al [10] was restored, but it is now connected in parallel with the two diodes of sub-circuit #2i in the model by garcía-sánchez et al [48] shown in fig. 5. the second change made was to eliminate the series resistor rs, which had been present in earlier equivalent circuit models.

the first change in [41], that is, the restoration of the shunt resistor rp2, seems to be a perfectly reasonable and necessary decision from a phenomenological point of view. the presence of that resistor rp2, which was present in the first model by araujo de castro et al [10] and was then eliminated and replaced by a diode in the model by garcía-sánchez et al [48], seems to be crucial for properly modelling bulk transport within the body of the solar cell. from a graphical point of view, resistor rp2 controls the slope of the i-v curve of sub-circuit #2i around the origin.
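the graphical role of rp2 can be checked with a short numerical sketch. it assumes that the current through the modified sub-circuit #2 is the sum of the currents through the forward diode 3, the reversed diode 2 and the shunt resistor rp2, all sharing the same voltage v2; the bisection solver is a generic choice made here for illustration, not the method of [41], and the function names and parameter values are hypothetical.

```python
import math


def i_subcircuit2(v2, i02, i03, n2_vth, n3_vth, rp2):
    """Current through diode 3 (forward), diode 2 (reversed) and rp2 in parallel."""
    return (i03 * math.expm1(v2 / n3_vth)
            - i02 * math.expm1(-v2 / n2_vth)
            + v2 / rp2)


def slope_at_origin(i02, i03, n2_vth, n3_vth, rp2):
    """Conductance di/dv2 at v2 = 0, from differentiating the expression above."""
    return i02 / n2_vth + i03 / n3_vth + 1.0 / rp2


def solve_v2(i, i02, i03, n2_vth, n3_vth, rp2, lo=-5.0, hi=5.0, tol=1e-12):
    """Invert i_subcircuit2 by bisection; valid because it is monotonic in v2."""
    args = (i02, i03, n2_vth, n3_vth, rp2)
    if not i_subcircuit2(lo, *args) <= i <= i_subcircuit2(hi, *args):
        raise ValueError("current outside the bracketed voltage range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if i_subcircuit2(mid, *args) < i:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameter values (amperes, volts, ohms), for illustration only.
sub2 = dict(i02=2e-3, i03=1e-4, n2_vth=0.05, n3_vth=0.1, rp2=50.0)

if __name__ == "__main__":
    for rp2 in (1e6, 200.0, 50.0, 10.0):
        p = dict(sub2, rp2=rp2)
        print(f"rp2 = {rp2:9.1f} ohm  slope at origin = {slope_at_origin(**p):.4f} S  "
              f"v2 at i = -5 mA: {solve_v2(-5e-3, **p):+.4f} V")
```

lowering rp2 increases the conductance at the origin without changing the exponential branches at larger |v2|, which is the behaviour described in the text.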
at the same time, the presence within sub-circuit #2i of the third diode in anti-parallel connection with diode 2, introduced by garcía-sánchez et al [48], also seems to be necessary in order to produce the upward bend observed in the i-v curve beyond voc. therefore, the decision adopted in [41] of keeping both elements, the original shunt resistor rp2 and the third diode introduced in [48], seems to be the best way to address two physical phenomenon-related circuital issues that are not likely to be mutually exclusive, since one seems to come mainly from the bulk while the other is probably of interfacial origin.

on the other hand, the second change made in [41], the elimination of the series resistor rs, which had been present in all earlier solar cell lumped-parameter equivalent circuit models, does not seem to be a convenient decision. in fact, from a methodological point of view, that series resistor rs should not even be considered as part of the s-shape-generating sub-circuit #2, but as part of the conventional solar cell circuit model. therefore, there does not seem to be a good reason why rs should be substantially altered when adding an s-shape-generating sub-circuit to the total model.

to write the solution for this equivalent circuit’s terminal voltage v as a function of the terminal current i, only the voltages v1 and v2 across the terminals of each of its two sub-circuits need be added, since rs has been eliminated. therefore, (19) becomes simply:

$$ v = v_1 + v_2 . \qquad (20) $$

however, now v2 must be obtained by solving:

$$ i = -i_{02}\left[\exp\left(-\frac{v_2}{n_2 v_{th}}\right) - 1\right] + i_{03}\left[\exp\left(\frac{v_2}{n_3 v_{th}}\right) - 1\right] + \frac{v_2}{r_{p2}} . \qquad (21) $$

the solution of (21) is not explicit in general, so (21) would have to be solved numerically or approximately for the voltage v2.

4.7. model by p. j. roland et al (2016)

regardless of the model development methodology used, the fact is that rs is an indispensable lumped element of any solar cell equivalent circuit model, needed to describe the omnipresent parasitic resistance at the contacts and other collecting electrode resistance. therefore, it is important to keep this series resistor rs in place in any solar cell dc lumped-parameter equivalent circuit model.

the improved solar cell dc lumped-parameter equivalent circuit model shown in fig. 10, as suggested by p. j. roland et al [50], is another step forward in the sequence of models previously proposed in [10, 41, 48]. the series resistor has been restored as an indispensable lumped element needed to describe the ubiquitous parasitic series resistances present in all solar cells. as before, we would write the solution for this improved equivalent circuit’s voltage v as a function of i by adding the voltage drop i rs across the now restored series resistor rs and the voltages v1 and v2 across the terminals of each of its two sub-circuits. that means using (19) instead of (20), and obtaining v2 by solving (21) through numerical or approximate means.

sub-circuit #2 contains the parallel combination of shunt resistor rp2 and the anti-parallel pair of diodes 2 and 3. p. j. roland et al [50] used spice simulations of this equivalent circuit model to reproduce i-v plots with s-shaped deformation for studying the influence of changing the values of the circuit elements’ parameters on the shape of the resulting i-v curve.

fig. 10 an improved solar cell dc lumped-parameter equivalent circuit model, proposed by p. j. roland et al [50] as a further improvement to the models proposed by araujo de castro et al [41] and by garcía-sánchez et al [48].
the series resistor has been restored as a necessary element to describe the parasitic series resistance.

a comparison between experimentally measured data of cds/cdte/fes2 nc/au photovoltaic devices at 200 k, which exhibit s-shape kinks in the measured i-v characteristics, and their corresponding simulated s-shaped curve was also carried out, trying to correlate the model’s parameters with the physical features that determine current flow through the device.

5. conclusion

we have presented a brief chronological review and appraisal of dc lumped-parameter equivalent circuits that have been proposed to date for modelling the effect of the s-shaped “kink” which shows up in the fourth quadrant, and eventually in the first, of the i-v characteristics measured under illumination of certain types of organic solar cells, as well as of some other types of photovoltaic devices. in doing so, we have analysed the defining mathematical equations of the available equivalent circuits, and we have provided and discussed their possible solutions. critical analyses have been included and some recommendations were offered when relevant. we hope that the unifying approach and generic nature of this succinct review can provide extra insight and be of practical help to photovoltaic engineers and solar cell scientists who must deal with the important issue of the s-shape i-v curve deformation and its modelling through lumped-parameter equivalent circuits.

acknowledgement: parts of this work were financially sponsored by the madrid autonomous community and urjc, under projects nos. s2009/esp-1781 and urjc-cm-2010-cet-5173, respectively. partial financial assistance was also received through an institutional grant from usb’s decanato de investigación y desarrollo.

references

[1] a. ortiz-conde, f. j. garcía-sánchez, j. muci, a.
facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 41-56
doi: 10.2298/fuee1401041w

topology, analysis, and cmos implementation of switched-capacitor dc-dc converters

oi-ying wong 1, hei wong 1, wing-shan tam 2, chi-wah kok 2
1 department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong
2 canaan semiconductor ltd., fotan, nt, hong kong

abstract. this review highlights various design and realization aspects of three commonly used charge pump topologies, namely, the linear, exponential, and the fibonacci type of charge pumps. we shall outline the new methods developed recently for analyzing the steady and dynamic performances of these circuits. some practical issues for the cmos implementation of these charge pump structures will be critically discussed. finally, some conventional voltage regulation methods for maintaining a stable output under a large range of loading current and supply voltage fluctuations will be proposed.
key words: switched-capacitor dc-dc converters, charge pump, steady-state analysis, dynamic analysis, voltage regulation

1. introduction

a switched-capacitor dc-dc converter (or sc dc-dc converter, in short) is a voltage converter that realizes dc-to-dc voltage conversion using capacitors as the only energy storage elements. unlike conventional inductor-based dc-dc converters, sc dc-dc converters use no inductor, which gives them lower emi emission, a more compact size, and easier system integration. compared with low-dropout regulators (ldos), which can provide step-down conversion only, sc dc-dc converters have the advantage of being able to generate a voltage higher than the supply. however, the conversion efficiency of a sc dc-dc converter is usually poorer than that of inductor-based converters, and its silicon area is much larger than that of a ldo. nevertheless, sc dc-dc converters have been widely used for voltage generation in flash memory systems [1]-[3] and lcd driver circuits, where dc voltages higher than the supply voltage are required [4], [5]. sc dc-dc converters are also used in energy harvesting systems and self-powered systems such as biomedical implant devices, rfid, and wireless sensor networks [6]-[11], where the available source voltages are too low to operate any electronic devices. the step-up capability and the ease of cmos implementation of sc converters can also help to minimize the power consumption of some electronic systems [12], [13].

received december 27, 2013
corresponding author: hei wong, department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong (e-mail: eehwong@cityu.edu.hk)
a sc dc-dc converter consists of an output capacitor (a capacitor connected between the output node and ground) and some coupling capacitors, which are connected to different nodes in the circuit during the two system clock phases through power switches. (some other structures use more than two clock signals. these structures are more complex and their analysis is much more complicated. in this work, we shall focus on the two-phase configuration only.) once the clock signals are applied, the coupling capacitors in the converter are charged and discharged alternately during the charging and discharging phases. during these processes, energy is temporarily stored in the coupling capacitors and then transferred to other capacitors via the charge sharing nodes. in this way, energy is transferred from the input side to the output side via the coupling capacitors, and the desired power conversion is achieved. the output voltage of a sc dc-dc converter is governed by its switch-capacitor network, or basic topology. a higher conversion ratio, defined as the ideal output voltage divided by the supply voltage, can be achieved by cascading several units of the basic topology.

in this paper, we shall first review some commonly used topologies in section 2. because of charge sharing effects and the switching loss due to the finite "on"-resistance of the cmos switches, the performance of a sc converter is always poorer than the ideal one. in this connection, we shall look at some circuit analysis methods that take the non-ideal conditions into consideration, and we shall compare the performances of various sc converters using these methods. these ideas, together with some typical results, will be presented in section 3. in addition, the process requirements for cmos realization differ among the sc dc-dc converter topologies.
the practical issues of cmos implementation will be discussed in section 4. finally, the implementation of the voltage regulating building block, i.e., the output stage of the converter, will be discussed briefly in section 5.

2. sc dc-dc converter topologies

a sc dc-dc converter can be constructed with several different topologies. different conversion ratios can be achieved by cascading different numbers of stages, n. the linear, fibonacci and exponential topologies, shown respectively in figs. 1, 2 and 3, are the most commonly used topologies for stepping up a supply voltage [14]-[16]. for the linear topology shown in fig. 1 [14], which is also known as the dickson charge pump, the voltage across the coupling capacitor in each stage is stepped up by a value equal to the supply voltage, vdd, during the clock phase φ1 or φ2. therefore, by cascading n repeating units, an output voltage equal to (n+1)vdd can be achieved in the ideal case, i.e. the conversion ratio m is equal to (n+1). for the fibonacci topology in fig. 2 [15], the coupling capacitor in the k-th stage is charged to f(k+1)vdd in φ1 and φ2 for odd and even k values, respectively. here f(x) is the x-th member of the fibonacci series 1, 1, 2, 3, 5, 8, 13, 21, ···. the conversion ratio of an n-stage fibonacci converter is given by f(n+2). for the exponential topology given in fig. 3 [16], the stepped-up voltage at the output of each stage becomes the input voltage of the next stage. hence the ideal output voltage of an n-stage exponential converter is 2^n vdd, i.e. its conversion ratio is 2^n. comparing these three converter topologies, the fibonacci and the exponential topologies can achieve a higher conversion ratio with a smaller number of stages and thus fewer components, whereas the linear topology has the advantage of a smaller voltage stress across the switches.
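the ideal conversion ratios quoted above, (n+1), f(n+2) and 2^n, can be checked with a short python sketch. the function names `fib`, `ideal_ratio` and `stages_needed` are illustrative and do not come from the paper, and the code evaluates the ideal (lossless) ratios only.

```python
def fib(x):
    """Return F(x), the x-th member of the series 1, 1, 2, 3, 5, 8, ... (1-indexed)."""
    a, b = 1, 1
    for _ in range(x - 1):
        a, b = b, a + b
    return a


def ideal_ratio(topology, n):
    """Ideal step-up conversion ratio M of an n-stage converter."""
    if topology == "linear":
        return n + 1          # Dickson charge pump
    if topology == "fibonacci":
        return fib(n + 2)
    if topology == "exponential":
        return 2 ** n
    raise ValueError(f"unknown topology: {topology}")


def stages_needed(topology, target_ratio):
    """Smallest number of stages whose ideal ratio reaches target_ratio."""
    n = 1
    while ideal_ratio(topology, n) < target_ratio:
        n += 1
    return n


for name in ("linear", "fibonacci", "exponential"):
    print(name, stages_needed(name, 8))
```

for a target ratio of 8, the sketch returns 7, 4 and 3 stages for the linear, fibonacci and exponential topologies, which agrees with the n column of table 1. the step-down ratios are simply the reciprocals of these values.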
note that the aforementioned topologies are not limited to step-up operation. by swapping the roles of the input and output nodes of the step-up converter topologies, step-down operation is also possible. in step-down operation, the corresponding conversion ratios for the linear, fibonacci and exponential topologies are, respectively, 1/(n+1), 1/f(n+2) and 1/2^n.

fig. 1 topology of a dickson charge pump or the linear step-up sc dc-dc converter
fig. 2 topology of a fibonacci step-up sc dc-dc converter
fig. 3 an exponential step-up sc dc-dc converter topology

3. performance analysis

the performance of a practical sc dc-dc converter varies with its design parameters and differs from the ideal one. the major design parameters include the number of stages (n), the supply voltage (vdd), the operation frequency (f), the unit value of the coupling capacitance (c), the "on"-resistance (ron), the clock duty cycle (d), the output capacitance (co), and the top- and bottom-plate parasitic factors (α and β), which are defined as the ratios of the parasitic capacitances at the top and bottom plates of a capacitor to its capacitance value. the mathematical relationships between these parameters and the performance of the converters have been analyzed in many previous publications. this section highlights these results.

3.1. output voltage

fig. 4 equivalent circuit of a sc dc-dc converter [19]

the output voltage of a sc dc-dc converter (vo) is always smaller than the ideal value suggested by the conversion ratio m, and it also depends on the loading current io. this behavior can be modelled by taking an equivalent output resistance into consideration (see fig. 4). the value of the resistance depends on the charging status, or operation mode, of the converter [17]-[20]. the charging status can be modelled with ψ = 1/(2fcron) [17], [18], [20].
when ψ > 1, the converter operates in the complete charge transfer mode. the charge transfer among the capacitors in the converter is complete, so that the current in each path of the converter drops close to zero at the end of each clock phase. the equivalent output resistance of the converter mainly depends on the 1/(fc) factor. when ψ < 1, the converter operates in the non-complete charge transfer mode. the charge transfer among the capacitors is far from complete, so that the current in each path of the converter is almost constant during each clock phase. the equivalent output resistance of the converter mainly depends on ron. when ψ ≈ 1, the converter operates between the non-complete and the complete charge transfer modes, i.e. in the partial charge transfer mode, in which the equivalent output resistance depends on both 1/(fc) and ron. the two limiting cases are the slow switching limit (ssl), for which ψ theoretically approaches infinity, and the fast switching limit (fsl), for which ψ approaches zero. models for the equivalent output resistance of the dickson charge pump at the ssl [21]-[25] and fsl [26], [27] are often reported. it was also suggested that the variation of the equivalent output resistance of the dickson charge pump across different operation modes can be modeled by the coth(x) function [28], [29]. the equivalent output resistances of the dickson and the fibonacci converters at the ssl, with the parasitic capacitance factors taken into account, are compared in ref. [30]. a generalized method for finding the equivalent output resistances of different converter topologies at the ssl and fsl has been proposed [19].
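as a minimal sketch of this classification, the python snippet below evaluates ψ = 1/(2fcron) and maps it to one of the three operation modes. the component values (c = 100 nf, ron = 50 Ω, f from 25 khz to 200 khz) are those quoted in section 4 for the 4× exponential converter; the width of the band treated as "ψ ≈ 1" is an arbitrary assumption made here, since the text gives no numerical threshold, and the function names are illustrative.

```python
def charging_status(f, c, r_on):
    """Charging status psi = 1 / (2 * f * C * Ron)."""
    return 1.0 / (2.0 * f * c * r_on)


def operation_mode(psi, band=0.2):
    """Classify the operation mode from psi.

    `band` is a hypothetical tolerance around psi = 1 (not from the paper).
    """
    if psi > 1.0 + band:
        return "complete charge transfer"
    if psi < 1.0 - band:
        return "non-complete charge transfer"
    return "partial charge transfer"


C = 100e-9    # coupling capacitance in farads
R_ON = 50.0   # switch "on"-resistance in ohms

for f in (25e3, 50e3, 100e3, 200e3):
    psi = charging_status(f, C, R_ON)
    print(f"f = {f / 1e3:.0f} kHz: psi = {psi:.2f}, {operation_mode(psi)}")
```

with these values ψ evaluates to about 4, 2, 1 and 0.5, the same values stated in section 4.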
for the ssl case, the equivalent output resistance of a given converter can be approximated by [19]:

$$r_{ssl} = \sum_{k} \frac{(a_k)^2}{f c_k}, \qquad (1)$$

where a_k is the charge multiplier of the k-th capacitor c_k. the charge multiplier models the amount of charge flowing into the component, normalized by the amount of charge flowing to the output. for the fsl case, the equivalent output resistance of a given converter topology can be approximated by [19]:

$$r_{fsl} = 2 \sum_{k} r_{on,k} (a_k)^2, \qquad (2)$$

where r_{on,k} is the "on"-resistance of the k-th switch and a_k is the corresponding charge multiplier.

3.2. power efficiency

power efficiency is an important figure of merit of the dickson charge pump and is often analyzed or modelled by taking the parasitic losses into consideration. without considering the parasitic capacitances, the finite switch "on"-resistances, and the switching losses of the gate capacitances of the power switches and the clock drivers, the power efficiency of a converter is simply given by vo/(mvdd) [31]. by considering the switching losses of the parasitic capacitances, more accurate solutions were developed [21], [22], [32]. a model for the power efficiency of some practical converters in terms of transistor parameters can also be found [33]. this model took the parasitic resistances, the gate capacitances and the top- and bottom-plate parasitic factors into consideration. in general, the power efficiency is given by:

$$p_{eff} = \frac{p_{out}}{p_{out} + p_{r,eq} + p_{dyn}} \times 100\%, \qquad (3)$$

where p_out is the output power, p_r,eq is the power loss in the equivalent output resistance and p_dyn is the total switching loss due to the gate capacitances, the top- and bottom-plate parasitic capacitances, and the clock driver. thus, p_r,eq increases with the equivalent output resistance, and p_dyn increases with the parasitic factors α and β, the clock frequency, the supply voltage, and the size of the power switches.
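the three expressions of this section, read as r_ssl = Σ (a_k)²/(f c_k), r_fsl = 2 Σ r_on,k (a_k)² and p_eff = p_out/(p_out + p_r,eq + p_dyn) × 100%, can be evaluated with the following python sketch. the function names are illustrative. the example assumes a linear converter with 7 coupling capacitors and 22 switches that all carry a unit charge multiplier; this is an assumption made here for illustration, although it does reproduce the coefficients 7 and 44 listed for the linear converter in table 1. the power values in the last line are made-up numbers, not measurements.

```python
def r_ssl(charge_multipliers, capacitances, f):
    """Slow-switching-limit output resistance: sum of a_k^2 / (f * C_k)."""
    return sum(a * a / (f * c) for a, c in zip(charge_multipliers, capacitances))


def r_fsl(charge_multipliers, on_resistances):
    """Fast-switching-limit output resistance: 2 * sum of Ron_k * a_k^2."""
    return 2.0 * sum(r * a * a for a, r in zip(charge_multipliers, on_resistances))


def power_efficiency(p_out, p_r_eq, p_dyn):
    """Power efficiency in percent."""
    return 100.0 * p_out / (p_out + p_r_eq + p_dyn)


# Assumed example: unit charge multipliers, C = 100 nF, Ron = 50 ohm, f = 25 kHz.
f, c, r_on = 25e3, 100e-9, 50.0
print(r_ssl([1] * 7, [c] * 7, f))         # about 7 / (f * C) = 2800 ohm
print(r_fsl([1] * 22, [r_on] * 22))       # 44 * Ron = 2200 ohm
print(power_efficiency(1e-3, 0.2e-3, 0.05e-3))  # illustrative powers in watts
```

because r_fsl contains no frequency term, the fsl output resistance is independent of f, whereas r_ssl falls as f or c_k increases.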
an increase in either pdyn or pr,eq gives rise to a lower power efficiency. notice that the power efficiency of a converter varies with the loading current, and the maximum achievable power efficiency of a given design can only be determined for a given loading current and output voltage. by assuming that the conversion ratio and the equivalent output resistance are independent of the parasitic factors, the condition for maximum power efficiency can be determined accordingly [34]. it was found that as the parasitic factors increase, the maximum power efficiency occurs at a larger loading current.

3.3. output voltage ripple

the behavior of the output voltage ripple in a sc dc-dc converter can be understood as follows. consider a converter (see figs. 1 to 3 for example) with an output capacitor co and a loading current io. as the output capacitor is not charged by the converter during φ1, the loading current is supplied by the charge stored in the output capacitor only. hence the smallest value of the output voltage ripple is io/(2fco) when d = 0.5. the largest value of the ripple is io/(fco), which corresponds to a total amount of charge equal to io/f. this occurs when the charge consumed by the load in a single clock cycle is transferred instantaneously to the output capacitor at the beginning of φ2 in each clock cycle. hence, when a converter operates in the complete charge transfer mode, its output voltage ripple is more likely to be larger than the minimum value. on the other hand, the output voltage ripple is close to the minimum value when the converter operates in the non-complete charge transfer mode. a detailed discussion on the output voltage ripple under different operation modes can be found in ref. [20].

3.4. start-up time

if the capacitors in figs. 1, 2 and 3 are not charged at the beginning, it takes several clock cycles for the coupling capacitors to transfer charge to the output capacitor co.
that is, a converter takes a certain time to reach its steady output voltage. the duration of this transient period is known as the start-up time. during this period, the charges flowing into and out of the capacitors are not the same for some clock cycles. finding the start-up time requires a dynamic analysis of the converter. with the aid of dynamic analyses, closed-form solutions for the start-up times of some linear converters with different numbers of stages and parasitic factors were obtained [34]-[40]. a generalized method which can evaluate the start-up behavior of any form of converter topology was proposed [41]. it involves the formulation of a given converter topology into some matrices, from which the output voltage of the converter can be evaluated over time using the matrix equations. in the analyses, we assume: (a) the converter is operated at the ssl mode with capacitive loads only; (b) all the capacitors in the converter are initially uncharged; (c) the parasitic capacitance effects can be neglected. based on these assumptions, the number of cycles, m, for achieving 95% of the final output value can be determined. figure 5 plots the number of cycles as a function of the coupling-to-output capacitance ratio (defined by c/co) for conversion ratios equal to 8, 13, 16 and 21 [41]. it can be found that for a converter to have a short start-up time, the output capacitance should not be larger than 2 times the coupling capacitance.

fig. 5 plots of the required number of clock cycles for achieving 95% of final values for linear, fibonacci and the exponential converters as a function of coupling-to-output capacitance ratio for the case of the conversion ratio equal to: (a) 8; (b) 13; (c) 16; and (d) 21 [41]

3.5. performance comparison of different topologies

this section concludes with the performance comparison as given in table 1.
table 1 lists the performances of the 8× linear, fibonacci and exponential converters, i.e. converters with the same conversion ratio. note that the equivalent output resistance of the linear converter at the fsl mode is smaller than those of the fibonacci and the exponential ones. in addition, the voltage stress across the transistors in the linear converter is smaller regardless of the large number of cascaded stages. however, the linear converter requires a larger number of components and has a longer start-up time. further detailed comparisons of the performances of these kinds of converters can be found in refs. [42], [43].

table 1 comparison of the linear, fibonacci and exponential topologies with m = 8

| topology | n | no. of switches | no. of capacitors | max. blocking voltage (v) | req,out at ssl (× 1/(cf)) | req,out at fsl (× ron) | start-up time (no. of clock cycles, m, for c/co = 1) |
|---|---|---|---|---|---|---|---|
| linear | 7 | 22 | 7 | 2vdd | 7 | 44 | 75 |
| fibonacci | 4 | 13 | 4 | 5vdd | 7 | 52 | 36 |
| exponential | 3 | 12 | 5 | 4vdd | 10 | 56 | 30 |

4. cmos implementation of sc dc-dc converters

the sc dc-dc converter topologies given in section 2 can be implemented in standard cmos technology by realizing the switches with n-type or p-type cmos switches (also called charge transfer switches, or cts's). to minimize the reverse current and the output voltage drop, these switches should be biased in the cut-off region when they are turned off and in the triode region when they are turned on. the reverse current can be further reduced by applying non-overlapping clock signals [44]. the body effect, i.e. the non-zero source-to-bulk biasing voltage, in these switches can lead to larger threshold voltages and make the output voltage saturate at a lower value [45]. hence, for a step-up sc dc-dc converter, we may encounter some unexpected voltage drops and power losses. in addition, as the node voltages in a step-up sc dc-dc converter are higher than vdd, several factors need to be considered.
the issues that need to be taken care of are listed below:

• in the step-up voltage converters, the node voltages (and therefore the drain and the source voltages of the transistors) are higher than vdd. thus, a p-type cts can be turned on easily by applying 0v (or any voltage lower than its source/drain voltage by a threshold) at the gate. however, a gate voltage higher than vdd is usually required to shut down the cts completely, and such a voltage may not be available in the circuit. on the other hand, an n-type cts can be readily shut down by applying 0v (or any voltage lower than its source/drain voltage plus a threshold) at the gate. however, to turn on the cts completely, the gate voltage should be higher than vdd. the bodies (or the n-wells if the converter is implemented in an ordinary n-well process) of the p-type cts's usually have to be biased at a voltage higher than vdd such that the p-n junctions in the transistors are always reverse biased. otherwise, substrate leakage current exists and the converter has poor efficiency. on the other hand, if the bodies of the n-type cts's are biased at 0v (which is the usual case for circuits implemented in an ordinary n-well process), they suffer from the body effect and the threshold voltages of the cts's become larger. the node voltages in the linear topology increase linearly from the input side to the output, while those in the other topologies can rise exponentially. thus, it is easier to design an efficient converter using the linear topology.

• in the step-down voltage converters, the node voltages (and therefore the drain and the source voltages of the transistors) are between 0v and vdd. thus, the body and the gate terminals of both n- and p-type cts's can be biased properly without any difficulty. however, the use of n-type cts's can usually save more silicon area due to their higher transconductance. to pass a voltage in the range of 0v to vdd, transmission gates can be used.

fig. 6 illustration of the three gate-biasing techniques: (a) the dynamic biasing; (b) gate-boosting; and (c) the cross-coupled techniques [46]-[49]

fig. 7 illustration of the three body-biasing approaches: the (a) floating-well; (b) adaptive body-biasing; and the (c) body-source junction diode approaches [50]-[52]

to achieve higher power efficiency, several different gate- and body-biasing techniques have been proposed to control the cts's in the dickson charge pump. figure 6 shows the dynamic biasing [46], gate boosting [47], [48] and cross-coupled techniques [49] for gate biasing. figure 7 shows some techniques, including the floating-well [50], adaptive body-biasing [51], and body-source junction diode approaches [52], that alleviate the body effect of the cts's. the advantages and disadvantages of these techniques have been discussed in detail in ref. [53]. in short, small output voltage drops can be found in the converters using the gate boosting and the cross-coupled techniques, but the gate-boosting technique consumes larger dynamic power and the cross-coupled technique requires a costly triple-well process. advanced converter circuits are usually constructed by making use of more than one of these techniques [54], [55]. figures 8 and 9 present the measured, simulated, and theoretical results on the loading characteristics and the power efficiencies of the cmos 4× exponential converter at four different frequencies ranging from 25 khz to 200 khz [56], [57]. all the switches in this converter are turned on and off properly with additional dynamic inverters.
the circuits were designed for operation at vdd = 1.5 v and c = 100 nf, assuming that ron of the switches is 50 Ω. for f = 25 khz, 50 khz, 100 khz and 200 khz, the corresponding ψ values are equal to 4, 2, 1 and 0.5, respectively. thus, it is expected that the designed converter is more likely to operate at the ssl mode, with req,out given by eq. (1), for f = 25 khz and 50 khz. for f = 100 khz and 200 khz, it operates at the fsl mode and req,out is given by eq. (2). the simulation results shown in fig. 8 agree well with this conjecture. in fig. 8, the loading characteristics at f = 100 khz and 200 khz are more or less the same. this further supports that the designed circuit works at the fsl when f = 100 khz and 200 khz, because the equivalent output resistance of a converter at the fsl should be independent of the operation frequency according to eq. (2). the difference between the theoretical and the measurement results in fig. 8(a)-(d) should be due to the equivalent series resistance (esr) of the externally connected capacitors, the interconnections, and the parasitic capacitances at each node of the real circuit. this difference is less than 10% [57]. in fig. 9, it can be further observed that when the frequency is increased from 25 khz to 200 khz, the measured power efficiency drops from 80% to 40% when the loading current is small. this agrees well with what is suggested by eq. (3). in eq. (3), the dynamic power loss, pdyn, dominates when the output power is small, and pdyn increases with the operation frequency [57].

fig. 8 theoretical (both the ssl and fsl cases), simulated, and measured loading characteristics of the 4× cmos exponential converter proposed in ref. [57] with vdd = 1.5 v, at four different frequencies: (a) 25 khz; (b) 50 khz; (c) 100 khz; and (d) 200 khz

fig. 9 theoretical and measured power efficiencies versus loading current of the 4× cmos exponential converter proposed in ref. [57] with vdd = 1.5 v, at four different frequencies: (a) 25 khz; (b) 50 khz; (c) 100 khz; and (d) 200 khz

5. regulations in sc dc-dc converters

as shown in fig. 8, the output voltages of the sc dc-dc converters drop as the loading currents increase. to maintain a constant output voltage under different loads, a regulation method using a closed-loop structure is required for the sc dc-dc converters. on the other hand, the supply voltage, such as that of the battery used in some portable devices, may also drop significantly during operation. this causes the output voltage of the dc-dc converter to drop continuously, as the output voltage of a sc dc-dc converter is directly proportional to the supply voltage. hence, we also need to maintain the output voltage over a wide supply voltage range. these are the main tasks of the output stage of the converters. in this section, a review of some previously proposed regulation techniques will be given.

5.1. regulation for loading current fluctuation

the pulse-skipping modulation (psm) technique was found to be well suited for sc dc-dc converter applications. fig. 10 illustrates the schematic of this method. by skipping some of the charge transferring periods for the output according to the loading condition, a better regulation can be achieved. as shown in the figure, when the loading current decreases and the sensed output voltage, vfb, becomes larger than the reference voltage, vref, the clock signals controlling the converter are shut down. the converter is then effectively disconnected from the output and stops delivering power to it. when vfb drops below vref due to the discharging loading current, the output capacitor is recharged by the converter again.
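the pulse-skipping behavior described above can be illustrated with a small behavioral simulation in python. this is a sketch under several assumptions that are not taken from the paper: the converter is replaced by an ideal source v_ideal in series with an equivalent output resistance r_out (in the spirit of the equivalent circuit of fig. 4), the output voltage is updated once per clock cycle with a simple euler step, the feedback voltage equals the output voltage, and the comparator has a symmetric hysteresis window around vref. all numerical values and the function name `simulate_psm` are illustrative only.

```python
def simulate_psm(v_ref, hysteresis, v_ideal, r_out, c_out, i_load, f_clk, n_cycles):
    """Behavioral pulse-skipping model; returns (output trace, skipped cycles)."""
    dt = 1.0 / f_clk
    v_out = v_ref
    pumping = True
    trace = []
    skipped = 0
    for _ in range(n_cycles):
        # Hysteresis comparator: stop the clock above the window,
        # restart it below the window, otherwise keep the previous state.
        if v_out > v_ref + hysteresis / 2:
            pumping = False
        elif v_out < v_ref - hysteresis / 2:
            pumping = True

        if pumping:
            i_pump = (v_ideal - v_out) / r_out  # charge delivered via r_out
        else:
            i_pump = 0.0                        # clock pulse skipped
            skipped += 1

        # The load always discharges the output capacitor.
        v_out += (i_pump - i_load) * dt / c_out
        trace.append(v_out)
    return trace, skipped


trace, skipped = simulate_psm(
    v_ref=5.0, hysteresis=0.05, v_ideal=6.0, r_out=100.0,
    c_out=10e-6, i_load=1e-3, f_clk=100e3, n_cycles=1000,
)
print(min(trace), max(trace), skipped)
```

in this toy setup the output stays within a narrow band around vref while most clock cycles are skipped, because the assumed load current is much smaller than the current the converter can deliver. the residual band is the output ripple, which is set by the hysteresis window and the per-cycle charge step.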
in this way, the charge transferring period for the output (or the average power delivered to the output) can be adjusted such that a constant vo is kept under different io. a hysteresis comparator is used here, so output ripple is present. this method has the advantages of fast response and good stability. moreover, the switching loss can be reduced and the power efficiency can be improved, especially when io is small. however, variable operation frequency and large output voltage ripple are the main drawbacks of this method [58]. alternatively, as illustrated in fig. 11, the output voltage can be regulated by adjusting the charging current of the coupling capacitors according to the loading condition. in this linear control method, some transistors are used as voltage-controlled current sources and are included in the charging paths of the converter. the error amplifier in the negative feedback loop then controls the current sources, and thus the charging currents to the coupling capacitors, according to the loading condition. unlike the psm method, this control method produces a smaller output voltage ripple and uses a fixed operation frequency. unfortunately, the switching loss is comparatively large when the loading current is small, which leads to poorer power efficiency [31], [59]. other regulation methods proposed in the literature include frequency modulation and segmented components [60]-[65]. in the frequency modulation method, the output is regulated by adjusting the operation frequency of the converter. it uses a voltage-controlled oscillator (vco) in the feedback loop [60], [61]. as the equivalent output resistance of a converter is related to the component sizing (see eq. (1) and (2)), the output voltage of a sc dc-dc converter can also be regulated using segmented devices, like segmented capacitors [62], [63] and segmented cts's [64], [65]. regulated converters using more than one of the above techniques can also be found in some reports.
for example, bayer and schmeller [66] used both the linear control and the psm methods to regulate the output voltage. patounakis, li and shepard proposed a hybrid regulator that combines the sc dc-dc converter and the low dropout regulator (ldo) to achieve high efficiency and to reduce the voltage ripple [67].
fig. 10 illustration of the pulse-skipping modulation method used for charge pump voltage regulation
fig. 11 illustration of the linear control method used for charge pump voltage regulation
5.2. regulation on supply voltage variation
as mentioned, sc dc-dc converters are often used in systems with varying supply voltage. it was realized that regulating a sc dc-dc converter over a wide supply voltage range with simple feedback loops, like the ones shown in fig. 10 and 11, often results in poor power efficiency, because the maximum achievable power efficiency of a sc dc-dc converter is given by vo/(mvdd). hence, to maintain a certain output voltage, the power efficiency drops as the supply voltage rises if the switch-and-capacitor-network configuration is not altered (i.e. m is fixed). in this connection, reconfigurable sc dc-dc converters, in which the switch-and-capacitor-network configuration can be altered, i.e. m is a variable, were proposed. this method improves the power efficiency over a wide range of supply voltage. the idea can be demonstrated by considering the case in which a fixed output voltage of 1.8 v is required and the supply voltage may vary in the range of 3-5 v. in this case, the theoretical maximum power efficiencies which can be achieved by some sc dc-dc converters with different available m values under different supply voltages are plotted in fig. 12. clearly, higher overall power efficiency can be achieved with a converter having more available conversion ratios. fig. 13 illustrates the control of the reconfigurable sc dc-dc converter.
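the 1.8 v example above can be worked through numerically. the sketch below uses only the bound vo/(mvdd) quoted in the text and assumes that the controller picks the smallest available m for which m·vdd still reaches the required output. the sets of conversion ratios are hypothetical and are not the ones plotted in fig. 12.

```python
def best_ratio(v_dd, v_out, ratios):
    """Smallest conversion ratio m with m * v_dd >= v_out, or None."""
    feasible = [m for m in ratios if m * v_dd >= v_out]
    return min(feasible) if feasible else None


def max_efficiency(v_dd, v_out, ratios):
    """Theoretical upper bound v_out / (m * v_dd) for the chosen ratio."""
    m = best_ratio(v_dd, v_out, ratios)
    return None if m is None else v_out / (m * v_dd)


if __name__ == "__main__":
    v_out = 1.8
    fixed = (1.0,)                      # assumed single-ratio converter
    reconfigurable = (0.5, 2 / 3, 1.0)  # assumed set of available ratios
    for v_dd in (3.0, 3.5, 4.0, 4.5, 5.0):
        eta_fixed = max_efficiency(v_dd, v_out, fixed)
        eta_reconf = max_efficiency(v_dd, v_out, reconfigurable)
        print(f"vdd={v_dd:.1f} V  fixed m: {eta_fixed:.2f}  "
              f"reconfigurable m: {eta_reconf:.2f}")
```

with these assumed ratios the bound at vdd = 5 v rises from 1.8/5 = 0.36 for a fixed m = 1 to 1.8/2.5 = 0.72 when m = 1/2 is available, which is the effect the reconfigurable converter exploits.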
additional circuitry is added to determine the m value of the converter based on the vdd value. this minimizes the dropout voltage at different supply voltages [68]-[71]. more complicated techniques, such as the gain hopping technique [72]-[74], were proposed to determine the m value based also on the loading condition, so as to further improve the overall power efficiency.
fig. 12 plot of the theoretical maximum power efficiencies at different vdd for a reconfigurable converter using different m values for constant vo of 1.8 v
fig. 13 illustration of the control of a reconfigurable sc dc-dc converter
6. conclusion
an overview of two-phase switched-capacitor dc-dc converters is given. the characteristics, including equivalent output resistances, power efficiencies, voltage ripples, and start-up times, of three commonly used topologies, i.e. the linear, fibonacci, and exponential topologies, are compared. some practical issues on the implementation of these converters using cmos technology are discussed. finally, revised voltage regulation schemes, being able to accommodate a wider range of loading current fluctuation, are proposed for high-efficiency voltage conversion.
acknowledgement: this work is supported by an applied research grant of city university of hong kong under project number 9667083.
references
[1] t. tanzawa, t. tanaka, k. takeuchi and h. nakamura, "circuit techniques for a 1.8-v only nand flash memory", ieee j. solid-state circuit, vol. 37, pp. 84-98, 2002. [2] j. m. baek, j. h. chun and k. w. kwon, "a power-efficient voltage up converter for embedded eeprom application", ieee trans. circuits syst. ii: express briefs, vol. 57, pp. 435-439, 2010. [3] n. derhacobian, s. c. hollmer, n. gilbert and m. n. kozicki, "power and energy perspectives of nonvolatile memory technologies", proc. ieee, vol. 98, pp. 283-29, 2010.
[4] c.-h. wu and c.-l. chen, "multiphase charge pump generating positive and negative voltages for tft-lcd gate driving", in ieee int. symp. electronic design, test and appl., pp. 179-183, 2008. [5] f. su and w.-h. ki, "component-efficient multiphase switched-capacitor dc-dc converter with configurable conversion ratios for lcd driver applications", ieee trans. circuits syst. ii: express briefs, vol. 5, pp. 753-757, 2008. [6] w. zhao, k. choi, s. baumana, z. dilli, t. salter and m. peckerar, "a radio-frequency energy harvesting scheme for use in low power ad hoc distributed network", ieee trans. circuits syst. ii: express briefs, vol. 59, pp. 573-577, 2012. [7] t. salter, k. choi, m. peckerar, g. metze and n. goldsman, "rf energy scavenging system utilizing switched capacitor dc-dc converter", electron. lett., vol. 45, pp. 374-376, 2009. [8] k. eguchi, s. pongswatd, h. zhu, k. tirasesth, h. sasaki and t. inoue, "a multiple-input sc dc-dc converter with battery charge process", in int. conf. intelligent networks and intelligent systems, pp. 697-700, 2009. [9] j. kim, j. m. kim and c. kim, "wide input range hybrid dc-dc conversion system for solar energy harvesting", electronics letters, vol. 48, pp. 39-40, 2012. [10] x. zhang, d. shang, f. xia, h. s. low and a. yakovlev, "a hybrid power delivery method for asynchronous loads in energy harvesting systems", in ieee int. new circuits and syst.conf., pp. 413-416, 2012. [11] m. r. sarker, s. h. m. ali, m. othman and m. s. islam, "designing a low voltage energy harvesting circuits for rectified storage voltage using vibrating piezoelectric", in ieee student conf. research and development, pp. 343-346, 2011. [12] v. gutnik and a. p. chandrakasan, "embedded power supply for low-power dsp", ieee trans. vlsi syst., vol. 5, pp. 425-435, 1997. [13] t. burd, t. pering, a. stratakos and r. brodersen, "a dynamic voltage scaled microprocessor system", ieee j. solid-state circuit, vol. 35, pp. 1571-1580, 2000. [14] j. f. 
dickson, "on-chip high-voltage generation in mnos integrated circuits using an improved voltage multiplier techniques", ieee j. solid-state circuit, vol. 11, pp. 374-378, 1976. [15] f. ueno, t. inoue, i. oota and i. harada, "emergency power supply for small computer systems", ieee int. symp. circuits and syst., pp. 1065-1068, 1991. [16] j. a. starzyk, y.-w. jan and f. qiu, "a dc-dc charge pump design based on voltage doublers", ieee trans. circuits syst. i: fundam. theory appl., vol. 48, pp. 350-359, 2001. [17] s. ben-yaakov, "on the influence of switch resistances on switched-capacitor converter losses", ieee trans. ind. electron, vol. 59, pp. 638-640, 2012. [18] s. ben-yaakov, "behavioral average modeling and equivalent circuit simulation of switched capacitors converters", ieee trans. power electron, vol. 27, pp. 632-636, 2012. [19] m. d. seeman and s. r. sanders, "analysis and optimization of switched-capacitor dc-dc converters", ieee trans. power electron, vol. 23, pp. 841-851, 2008. [20] w.-c. wu and r. m. bass, "analysis of charge pumps using charge balance", in ieee 31st annu.. power electron. specialists conf., pp. 1491-1496, 2000. [21] p. favrat, p. deval and m. j. declereq, "a high-efficiency cmos voltage doubler", ieee j. solid-state circuit, vol. 33, pp. 410-416, 1998. [22] g. palumbo, d. pappalardo and m. gaibotti, "charge-pump circuits: power-consumption optimization", ieee trans. circuits syst. i: fundam. theory appl., vol. 49, pp. 1535-1542, 2002. [23] a. cabrini, l. gobbi and g. torelli, "theoretical and experimental analysis of dickson charge pump output resistance", in proc. int. symp. circuits syst., pp. 2749-2752, 2006. [24] j. s. witters, g. groeseneken and h. e. maes, "analysis and modeling of on-chip high-voltage generator circuits for use in eeprom circuits", ieee j. solid-state circuit, vol. 24, pp. 1372-1380, 1989. [25] c.-h. hu and l.-k. 
chang, "analysis and modeling of on-chip charge pump designs based on pumping gain increase circuits with a resistive load", ieee trans. power electron, vol. 23, pp. 2187-2194, 2008. [26] c.-c. wang and j. wu, "efficiency improvement in charge pump circuits", ieee j. solid-state circuit, vol. 32, pp. 852-860, 1997. [27] i. oota, n. hara and f. ueno, "a general method for deriving output resistances of serial fixed type switched-capacitor power supplies", in ieee int. symp. circuits and syst., pp. 503-506, 2000. [28] g. van steenwijk, k. hoen and h. wallinga, "analysis and design of a charge pump circuit for high output current applications", in ieee nineteenth european solid-state circuits conf., pp. 118-121, 1993. [29] j. w. kimball, p. t. krein and k. r. cahill, "modeling of capacitor impedance in switching converters", ieee power electron. lett., vol. 3, pp. 136-140, 2005. [30] y. allasasmeh and s. gregori, "a performance comparison of dickson and fibonacci charge pumps", in european conf. circuit theory and design, pp. 599-602, 2009. [31] b. r. gregoire, "a compact switched-capacitor regulated charge pump power supply", ieee j. solid-state circuit, vol. 41, pp. 1944-1953, 2006. [32] d. baderna, a. cabrini, g. torelli and m. pasotti, "efficiency comparison between doubler and dickson charge pumps", in ieee int. symp. circuits and syst., vol. 2, pp. 1891-1894, 2005. [33] c.-p. hsu and h. lin, "analytical models of output voltages and power efficiencies for multistage charge pumps", ieee trans. power electron, vol. 25, pp. 1375-1385, 2010. [34] a. cabrini, l. gobbi and g. torelli, "a theoretical discussion on performance limits of cmos charge pumps", in proc. european conf. circuit theory and design, vol. 2, pp. 35-38, 2005. [35] f. h. khan, l. m. tolbert and w. e. webb, "start-up and dynamic modeling of the multilevel modular capacitor-clamped converter", ieee trans.
power electron., vol. 25, no. 2, pp. 519-531, 2010. [36] g. di cataldo and g. palumbo, "double and triple charge pump for power ic: dynamical models which takes parasitic effects into account", ieee trans. circuits syst. i: fundam. theory appl., vol. 40, no. 2, pp. 92-101, 1993. [37] g. di cataldo and g. palumbo, "design of an nth order dickson voltage multiplier", ieee trans. circuits syst. i: fundam. theory appl., vol. 43, no. 5, pp. 414-418, 1996. [38] t. tanzawa and t. tanaka, "a dynamic analysis of the dickson charge pump circuit,'' ieee j. solid-state circuits, vol. 32, no. 8, pp. 1231-1240, 1997. [39] g. palumbo and d. pappalardo, "charge pump circuits with only capacitive loads: optimized design", ieee trans. circuits syst. ii: express briefs, vol. 53, no. 2, pp. 128-132, 2006. [40] m. zhang and n. llaser, "optimization design of the dickson charge pump circuit with a resistive load", in proc. int. symp. circuits syst., vol. 5, pp. 840-843, 2004. [41] o.-y. wong, h. wong, c.-w. kok and w.-s. tam, "dynamic analysis of two-phase switched-capacitor dc-dc converters", ieee trans. power electron., vol. 29, no. 1, pp. 302-317, 2014. [42] t. tanzawa, "on two-phase switched capacitor multipliers with minimum circuit area", ieee trans. circuits syst. i: regular papers, vol. 57, no. 10, pp. 2602-2608, 2010. [43] w.-h. ki, y. lu, f. su and c.-y. tsui, "design and analysis of on-chip charge pumps for micro-power energy harvesting applications", ieee/ifip 19th int. conf. vlsi system-on-chip, pp. 374-379, 2011. [44] a. fantini, a. cabrini and g. torelli, "impact of control signal non-idealties on two-phase charge pumps", in proc. int. symp. circuits syst., pp. 1549-1552, 2007. [45] j. c. chen, t. h. kuo, l. e. cleveland, c. k. chung, n. leong, y. k.kim, t. akaogi and y. kasa, "a 2.7 v only 8mb × 16 nor flash memory", in ieee symp. vlsi circuits dig. tech. papers, pp. 172-173, 1996. [46] j. t. wu and k. l. chang, "mos charge pumps for low-voltage operation", ieee j. 
solid-state circuit, vol. 33, pp. 592-597, 1998. [47] a. umezawa, s. atsumi, m. kuriyama, h. banba, k. imamiya, k. naruke, s. yamada, e. obi, m. oshikiri, t. suzuki and s. tanaka, "a 5-v-only operation 0.6-μm flash eeprom with row decoder scheme in triple-well structure", ieee j. solid-state circuit, vol. 27, pp. 1540-1546, 1992. [48] g. van steenwijk, k. hoen and h. wallinga, "analysis and design of a charge pump circuit for high output current applications", in nineteenth european solid-state circuits conf., pp. 118-121, 1993. [49] r. pelliconi, d. iezzi, a. baroni, m. pasotti and p. l. rolandi, "power efficient charge pump in deep submicron standard cmos technology", ieee j. solid-state circuit, vol. 38, pp. 1068-1071, 2003. [50] k. h. choi, j. m. park, j. k. kim, t. s. jung, and k. d. suh, "floating-well charge pump circuits for sub-2.0v single power supply flash memories", in ieee symp. vlsi circuits dig. tech. papers, pp. 61-62, 1997. [51] j. shin, i. y. chung, y. j. park, and h. s. min, "a new charge pump without degradation in threshold voltage due to body effect [memory applications]", ieee j. solid-state circuit, vol. 35, pp. 1227-1230, 2000. [52] o. khouri, s. gregori, a. cabrini, r. micheloni and g. torelli, "improved charge pump for flash memory applications in triple well cmos technology", in ieee int. symp. ind. electronics, pp. 1322-1326, 2002. [53] o.-y. wong, h. wong, c.-w. kok and w.-s. tam, "a comparative study of charge pumping circuits for flash memory applications", microelectron. reliab., vol. 52, pp. 670-687, 2012. [54] a. cabrini, l. gobbi, and g. torelli, "enhanced charge pump for ultra-low-voltage applications", electron. lett., vol. 42, pp. 512-514, 2006. [55] o.-y. wong, w.-s. tam, c.-w. kok and h. wong, "a low-voltage charge pump with wide current driving capability", ieee int. conf. electron devices and solid-state circuits, pp. 1-4, 2010. [56] o.-y. wong, w.-s. tam, c.-w. kok and h. 
wong, "area efficient 2^n× switched capacitor charge pump", in proc. ieee int. symp. circuits syst., pp. 820-823, 2009. [57] o.-y. wong, h. wong, c.-w. kok and w.-s. tam, "a dynamic-biasing 4× charge pump based on exponential topology", int. j. circuit theory applications, in press. [58] t. tanzawa and s. atsumi, "optimization of word-line booster circuits for low-voltage flash memories", ieee j. solid-state circuit, vol. 34, pp. 1091-1098, 1999. [59] h. chung, b. o and a. ioinovici, "switched-capacitor-based dc-to-dc converter with improved input current waveform", in ieee int. symp. circuits and syst., pp. 541-544, 1996. [60] l. aaltonen and k. halonen, "on-chip charge pump with continuous frequency regulation for precision high voltage generation", in ph.d. research in microelectronics and electronics, pp. 68-71, 2009. [61] s.-c. tan, s. kiratipongvoot and s. bronstein, "adaptive mixed on-time and switching frequency control of a system of interleaved switched-capacitor converters", ieee trans. power electron., vol. 26, pp. 364-380, 2011. [62] l. su and d. ma, "monolithic reconfigurable sc power converter with adaptive gain control and on-chip capacitor sizing", in ieee energy conversion congress and exposition, pp. 2713-2717, 2010. [63] y. k. ramadass, a. a. fayed and a. p. chandrakasan, "a fully-integrated switched-capacitor step-down dc-dc converter with digital capacitance modulation in 45nm cmos", ieee j. solid-state circuit, vol. 45, pp. 2557-2565, 2010. [64] s. musunuri and p. l. chapman, "improvement of light-load efficiency using width-switching scheme for cmos transistors", ieee power electron. lett., vol. 3, pp. 105-110, 2005. [65] r. guo, l. yang, a. huang and j. endredy, "a high efficiency regulated charge pump over wide input and load range", in ieee applied power electron. conf. exposition, pp. 1172-1176, 2010. [66] e. bayer and h.
schmeller, "charge pump with active cycle regulation-closing the gap between linear and skip modes", in ieee annual power electron. specialists conf., pp. 1497-1502, 2000. [67] g. patounakis, y. w. li and k. l. shepard, "a fully integrated on-chip dc-dc conversion and power management system", ieee j. solid-state circuit, vol. 39, pp. 443-451, 2004. [68] i. chowdhury and d. ma, "design of reconfigurable and robust integrated sc power converter for self-powered energy-efficient devices", ieee trans. ind. electron., vol. 56, pp. 4018-4028, 2009. [69] c.-l. wei and m.-h. shih, "design of a switched-capacitor dc-dc converter with a wide input voltage range", ieee trans. circuits syst. i: regular papers, vol. 60, pp. 1648-1656, 2013. [70] x. zhang and h. lee, "an efficiency-enhanced auto-reconfigurable 2×/3× sc charge pump for transcutaneous power transmission", ieee j. solid-state circuit, vol. 45, pp. 1906-1922, 2010. [71] v. ng and s. sanders, "a 92%-efficiency wide-input-voltage-range switched-capacitor dc-dc converter", in ieee int. solid-state circuits conf. dig. tech. papers, pp. 282-284, 2012. [72] i. chowdhury and d. ma, "an integrated reconfigurable switched-capacitor dc-dc converter with a dual-loop adaptive gain-pulse control", in ieee int. symp. circuits and syst., pp. 2610-2613, 2008. [73] v. w. ng and s. r. sanders, "a high-efficiency wide-input-voltage range switched capacitor point-of-load dc-dc converter", ieee trans. power electron., vol. 28, pp. 4335-4341, 2013. [74] low noise, high efficiency, inductorless step-down dc/dc converter, ltc1911, linear technology, 2001. industrial wsn as a tool for remote on-line monitoring facta universitatis series: electronics and energetics vol. 30, n o 1, march 2017, pp. 
107-119 doi: 10.2298/fuee1701107n
industrial wireless sensor networks as a tool for remote on-line management of power transformers' heating and cooling process
aleksandar nikolić 1, nataša nešković 2, radoslav antić 1, ana anastasijević 2
1 university of belgrade, electrical engineering institute nikola tesla, serbia
2 university of belgrade, school of electrical engineering, serbia
abstract. an industrial wireless sensor network used for supervising the cooling system of high power transformers is presented in the paper. because the thermal power plant where the industrial prototype is installed is a very noisy environment, many problems had to be solved in order to obtain high reliability and accuracy of the system. the results of the analysis presented in the paper are obtained from a real thermal power plant, where the presented wireless sensor network based on-line monitoring system is used for continuous management of power transformers' heating and control of their cooling systems. the results obtained during system operation over a longer period confirm its stability, accuracy and the improvement in power plant operation.
key words: wireless sensor networks, power transformers, power plants, management, on-line monitoring, remote control
1. introduction
global processes of liberalization and deregulation of the energy sector have established new technical and technological requirements for research and development centers all over the world. the imperative requirements are increased energy efficiency, reliability and availability of energy resources. in the field of testing and diagnostics, individual measurements are replaced by integrated models. time-planned maintenance is replaced by condition based maintenance with respect to risk assessment, using information technologies (databases, intranet, internet).
the modern energy market requires the integration of several scientific disciplines and technologies: energetics, electronics, informatics, metrology, standardization, management. significant savings could be reached by preventing malfunctions and breakdowns through the introduction of on-line diagnostics and a condition based maintenance strategy. on-line monitoring gives timely information about the process and helps in further decisions about process operation [1]. this type of diagnostics has become very important in large power plants and their important equipment, especially high power transformers [2].
received april 10, 2016; received in revised form may 30, 2016
corresponding author: aleksandar nikolić, electrical engineering institute nikola tesla, university of belgrade, 8a koste glavinića, serbia (e-mail: anikolic@ieent.org)
generator power transformers are the largest units in power plants, since their capacity could be even 1400 mva. nowadays, two approaches for thermal management of such transformers are used. the expensive solution is based on optical sensors mounted in the transformer windings during the manufacturing or repairing process. the other solution relies on a mathematical model of the transformer and calculates the highest temperature in the transformer (i.e. the hot-spot temperature), measuring only the transformer top-oil temperature and load current [3]. the solution presented in this paper is based on calculation of the hot-spot temperature using the measured temperature of the transformer oil at the top of the housing, the ambient temperature and the load current. real-time calculation of the hot-spot temperature and transformer cooling control is implemented in an industrial type programmable controller. the temperature sensors are industrial pt100 sensors mounted on the pipes where the transformer oil is circulating. since the system also controls transformer cooling, it is wise to put one sensor at the input of the cooling device and the other at the output.
in that case, besides cooling control, more information about the operation of the cooling device could be obtained, such as a fan malfunction if the temperatures at the input and output are nearly the same. studies show that up to 90% of actionable process and environmental data remains uncollected. wired monitoring systems are expensive and unrealistic in challenging physical environments, and manual monitoring has proven simply to be cost-prohibitive [4]. during the analysis prior to the implementation of one such system in a power plant, it was found that cabling would be very difficult and expensive and could lead to a very complicated and unsuitable system. for that reason, research on the application of a wireless based solution was performed. the reliability of wireless networks is set by the quality of the radio link between the central access point and each endpoint [4]. as the simplest and most reliable solution, a wireless sensor network based system is proposed, and the results of its implementation and performance in real conditions are presented in the paper.
2. temperature monitoring of power transformers
in a transformer, heat is spread predominantly by convection. determination of the temperature distribution in a transformer, as a dominant loading factor, is a very complicated task. the temperature is different in all functional parts of the transformer (winding, core, tank and oil) and it changes within the volume of each part. the maximum temperature occurring in any part of the winding insulation system is called the "hot-spot temperature". this parameter represents the thermal limitation of the loading of the transformer. for on-line temperature monitoring it is suitable to calculate the hot-spot temperature using a differential equation with load factor and ambient temperature as time variables [1]. the real-time algorithm is developed using the concept with the differential equation from the iec 60076-7 standard, since it is suitable for on-line monitoring [3].
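the hot-spot calculation described above can be illustrated with a short numerical sketch. this is not the authors' controller code. it assumes the exponential (step-response) form of the iec 60076-7 thermal model for a load increase, with the functions f1(t) and f2(t) written as they are commonly given in that loading guide. all parameter values (rated rises, exponents x and y, loss ratio r, thermal constants k11, k21, k22 and time constants) are placeholders chosen for illustration, and the function and key names are hypothetical.

```python
import math

# Placeholder thermal parameters (assumptions for illustration only).
PARAMS = {
    "d_theta_or": 52.0,  # rated top-oil temperature rise, K
    "H_gr": 26.0,        # rated hot-spot-to-top-oil gradient H*g_r, K
    "R": 6.0,            # ratio of load losses to no-load losses
    "x": 0.8,            # oil exponent
    "y": 1.3,            # winding exponent
    "k11": 0.5, "k21": 2.0, "k22": 2.0,  # thermal model constants
    "tau_o": 150.0,      # oil time constant, min
    "tau_w": 7.0,        # winding time constant, min
}


def hot_spot_temperature(t_min, load_factor, theta_a,
                         d_theta_oi, d_theta_hi, p=PARAMS):
    """Hot-spot temperature t_min minutes after a step to load_factor.

    theta_a: ambient temperature; d_theta_oi, d_theta_hi: top-oil rise and
    hot-spot rise over top oil at the start of the step.
    """
    # f1: relative increase of the top-oil temperature rise.
    f1 = 1.0 - math.exp(-t_min / (p["k11"] * p["tau_o"]))
    # f2: relative increase of the hot-spot rise over top oil.
    f2 = (p["k21"] * (1.0 - math.exp(-t_min / (p["k22"] * p["tau_w"])))
          - (p["k21"] - 1.0)
          * (1.0 - math.exp(-t_min * p["k22"] / p["tau_o"])))

    # Steady-state rises for the applied load factor.
    top_oil_final = p["d_theta_or"] * (
        (1.0 + p["R"] * load_factor ** 2) / (1.0 + p["R"])) ** p["x"]
    gradient_final = p["H_gr"] * load_factor ** p["y"]

    return (theta_a
            + d_theta_oi + (top_oil_final - d_theta_oi) * f1
            + d_theta_hi + (gradient_final - d_theta_hi) * f2)


if __name__ == "__main__":
    for t in (0, 30, 120, 600):
        theta_h = hot_spot_temperature(t, 1.0, 25.0, 30.0, 15.0)
        print(f"t={t:4d} min  hot-spot={theta_h:.1f} degC")
```

at t = 0 the sketch returns the initial state (ambient plus the two initial rises), and for long times it settles at the ambient temperature plus the steady-state rises for the applied load, which is a quick sanity check on any implementation of this model.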
load factor and ambient temperature are time-dependent variables and there is no limit regarding the loading profile. if temperature rises are calculated using exponential functions, the expression for the hot-spot temperature is given in equation (1):
$$\theta_h(t) = \theta_a + \Delta\theta_{oi} + \left\{ \Delta\theta_{or} \left[ \frac{1 + R K^2}{1 + R} \right]^{x} - \Delta\theta_{oi} \right\} f_1(t) + \Delta\theta_{hi} + \left\{ H g_r K^{y} - \Delta\theta_{hi} \right\} f_2(t) \qquad (1)$$
where $\Delta\theta_{hi}$ is the hot-spot temperature rise at the start, $f_1(t)$ is the function of the top-oil temperature increase and $f_2(t)$ is the function of the hot-spot temperature increase relative to the top-oil temperature.
2.1. importance of power transformer on-line monitoring
as mentioned in the introduction, power transformers in power plants, especially generator power transformers, are among the most important units in the power system. maintenance of these transformers is complicated and expensive, and nowadays one has to wait more than 2-3 years for the production of a new generator transformer. monitoring and supervising systems for these transformers are very useful and necessary in order to improve efficiency and reliability and to reduce the risk and costs of unexpected failure [2]. monitoring of the transformer during its exploitation period can give accurate failure analyses while extending the life of assets. actual conditions drive maintenance and repair and make it possible to estimate additional operational costs. saving relevant data for further analysis and creating historical data is valuable for improving on-line diagnostics and creating decision making expert systems.
2.2. on-line diagnostics models
the analysis of the failure modes of the various components leads to a review of the inspection and maintenance procedures of power transformers.
on-line diagnostic condition assessment addresses common failure modes using:
• multiple sensors,
• multiple on-line models,
• automatic and continuous recording of all parameters,
• trend and limit alarms.
on-line models are focused on the main tank of the transformer [3]. these models rely on various sensors installed on the transformer and in the substation, combined with other manually entered parameters. this data is then fed into industry standard and accepted models, which calculate the various outputs.
2.2.1. load current model
the load current model accepts measurements of the winding current(s) as its input and provides trending and alarms of the particular load current as its output (fig. 1).
2.2.2. winding temperature model
the winding temperature model accepts top-oil and ambient temperature measurements and measurements of two or three winding currents as its inputs. additionally, some fixed parameters should be entered manually as inputs of the model: rated hot-spot temperature (hst) rise, rated load current and winding characteristics. as its output, it provides trending and alarms of the hot-spot temperature for each transformer winding (fig. 2).
fig. 1 load current model
fig. 2 winding temperature model
2.2.3. cooling control model
the cooling control model accepts the top-oil temperature measurement and measurements of two or three winding currents as its inputs. optionally, a signal about the cooling stage status could also be introduced. additionally, some fixed parameters should be entered manually as inputs of the model: top-oil temperature set point, hot-spot temperature set point and load current set point. as its output, it provides cooling stage on/off control and status, display and trending, and status alarms (fig. 3).
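the cooling control model described above can be sketched as a small decision function. the sketch is a hedged illustration, not the controller logic used in the plant: it assumes a single cooling stage that is switched on when any of the three monitored quantities reaches its set point, takes the highest of the measured winding currents, and raises an alarm when the commanded state differs from the reported stage status. all names and set-point values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SetPoints:
    top_oil_c: float       # top-oil temperature set point, degC
    hot_spot_c: float      # hot-spot temperature set point, degC
    load_current_a: float  # load current set point, A


def cooling_command(top_oil_c: float, hot_spot_c: float,
                    winding_currents_a: Sequence[float],
                    sp: SetPoints) -> bool:
    """Return True if the cooling stage should be switched on."""
    highest_current = max(winding_currents_a)
    return (top_oil_c >= sp.top_oil_c
            or hot_spot_c >= sp.hot_spot_c
            or highest_current >= sp.load_current_a)


def discrepancy_alarm(commanded_on: bool,
                      status_on: Optional[bool]) -> bool:
    """Alarm when the reported stage status contradicts the command.

    status_on is None when the optional status signal is not wired.
    """
    return status_on is not None and status_on != commanded_on


if __name__ == "__main__":
    sp = SetPoints(top_oil_c=65.0, hot_spot_c=85.0, load_current_a=900.0)
    command = cooling_command(58.0, 88.0, [700.0, 710.0, 695.0], sp)
    print("cooling on:", command)
    print("alarm:", discrepancy_alarm(command, status_on=False))
```

a real controller would normally add hysteresis or a minimum run time around the set points so that the stage does not chatter; that is left out here to keep the sketch short.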
contents of fig. 1 (load current model):
sensors: r winding current; s winding current (optional); t winding current (optional)
rules:
• measurement on one or three phases
• load current is measured every second and averaged over one minute
• the average value is used for the top-oil temperature calculation
• for the hot-spot temperature calculation, the highest value is used
output: minute average current on each winding; average current of phase a, b, c; maximum current of phase a, b, c; display and trending; warnings and alarms
contents of fig. 2 (winding temperature model):
sensors: top-oil temperature; ambient temperature; r winding current; s winding current (optional); t winding current (optional)
fixed parameters: rated hst rise; rated load current; winding characteristics
rules:
• continuously computes the winding hottest-spot temperature on each winding
• calculations are done using proven algorithms from ieee and iec loading guides
• additional fine-tuning of the algorithm is based on transformer manufacturer data
output: hottest-spot temperature for each winding; display and trending; warnings and alarms
fig. 3 cooling control model
sensors: top-oil temperature; r winding current; s winding current (optional); t winding current (optional); cooling stage status
fixed parameters: top-oil temperature set point; hot-spot temperature set point; load current set point
rules:
• the cooling system can be initiated from either the top-oil temperature, the load current, or the winding hot-spot temperature
• cooling control can detect discrepancies and raise an alarm in the case of cooling malfunction
output: stage(s) on/off control; control vs. status discrepancy alarm; display and trending
2.2.4. cooling efficiency model
the cooling efficiency model accepts the top-oil temperature measurement and measurements of two or three winding currents as its inputs. optionally, a signal about the cooling stage status could also be introduced. additionally, some fixed parameters should be entered manually as inputs of the model: rated top-oil temperature rise, top-oil time constant, ratio of load losses over no-load losses, and oil exponent. as its output, it provides information about the top-oil temperature discrepancy, a warning about deficiency of the cooling system, and display and trending information (fig. 4).
fig. 4 cooling efficiency model
sensors: top-oil temperature; ambient temperature; r winding current; cooling stage status
fixed parameters: ratio of load losses over no-load losses; rated top-oil temperature rise; top-oil time constant; oil exponent
rules:
• the theoretical top-oil temperature is calculated according to ieee methods
• a warning is initiated if the top-oil temperature is too high, indicating malfunction of the cooling system
output: top-oil temperature discrepancy; cooling deficiency warning; display and trending
3. industrial communication systems
industrial communication networks are often required to provide tight performance figures in terms of both real time and determinism. this is a consequence of the application fields, such as motion control, factory automation, manufacturing, and networked control systems, in which they are typically employed [5]. until recently, industrial networks were mostly wired. several network solutions were defined, from serial communication known as rs-485 to digital industrial automation protocols like the hart communications protocol (highway addressable remote transducer protocol), fieldbus and profibus (process field bus). subsequently, at the end of the 1990s, field networks based on the well-known ethernet technology started to be introduced. they are characterized by strong performance figures in that they are able to provide high transmission rates (typically up to 100 mb/s), very limited and predictable transfer times, high determinism, and low jitter.
one of the extensions is ethercat (ethernet for control automation technology), an open, high-performance ethernet-based fieldbus system.

3.1. wireless communications in industry

recently, wireless networks have also started to be considered an interesting solution for communication at the device level. one of the first applications was the wireless control of cranes in warehouses, where proprietary radios achieved flexible control of moving devices. during the past decade, standardized radio technologies such as wireless lan (ieee 802.11), wireless hart (ieee 802.15.4) and bluetooth (ieee 802.15.1) have become the dominant technologies for industrial use. no single wireless technology offers all the features and strengths needed to fit the various industrial application requirements, so standardized wireless technologies such as wireless lan, bluetooth and wireless hart, as well as a number of proprietary technologies, are all used in practice [6].

3.2. wireless sensor networks principle

a wireless sensor network (wsn) is a wireless network consisting of distributed autonomous devices that use sensors to cooperatively monitor physical or environmental conditions, such as temperature, sound, vibration, pressure, motion or pollutants, at different locations. in addition to one or more sensors, each node in a sensor network is typically equipped with a radio transceiver or other wireless communication device, a small microcontroller, and an energy source, usually a battery. although battery supply is one of the characteristic features of a wsn, in industrial applications it is wiser to use an external, stable dc supply. wsns have quickly become an area of great research interest for both industry and academia. nowadays, the enormous potential of this technology can be easily seen, along with its inherent difficulties.
in fact, the massachusetts institute of technology recently classified wsns as one of the 10 emerging technologies that will change the world [7]. sensor nodes are connected wirelessly to a central gateway that performs data acquisition and analysis [8]. a wireless sensor network is connected to other networks, usually ethernet, via a communication module that acts as the gateway. at the top of the wsn hierarchy, where the gateway function has to be realized, it is possible to use a communication module with a programmable controller and memory to perform the complete processing of the data collected from the sensor nodes and, possibly, control and management [9]. a simple example of a data acquisition system based on a wsn is shown in fig. 5.

fig. 5 example of data acquisition system with wireless data transfer based on wsn

the number of sensor nodes in a wsn can be further increased if some of the sensor modules are configured as mesh routers. in that case, besides transferring data from the sensors directly connected to it, the module also transfers data from other sensor modules in its proximity. such a wsn node routes data packets on their path from the source to the destination [10], as shown in fig. 5. this allows easy expansion of an existing wsn, both in functionality and in spatial coverage [10], [11].

4. realization of wsn based monitoring system

the proposed data acquisition and control system, which monitors the temperature of six power transformers and controls the cooling of two generator power transformers, is based on a wireless sensor network. according to the available literature, there is only one similar application in a nuclear power plant in the usa [12], and a few applications in power plants in china [13], [14].
the measuring points in this system are mostly located on the transformers themselves, with a maximum distance of less than 50 m for each transformer. the distance from the transformers to the control room ranges from 120 m to 200 m. in that case, building the sensor network on communication links such as gprs/gsm is not viable, because the user pays monthly charges for connectivity. therefore, the wireless sensor network uses zigbee communication between the nodes at the transformers and the receivers in the control room. the low-power zigbee communication protocol is based on the ieee 802.15.4 standard and uses the free 2.4 ghz ism band [10]. this makes it viable to read a large number of nodes and justifies the implementation and operation costs compared to the benefits.

the ieee 802.15.4 standard defines two layers, the mac and the physical layer (phy), and uses three license-free frequency bands. these bands have a total of 27 channels: 16 channels at 2.4 ghz with a data rate of 250 kbps, 10 channels at 902 to 928 mhz with a data rate of 40 kbps, and one channel at 868 to 870 mhz with a data rate of 20 kbps. however, only the 2.4 ghz band operates worldwide; the others are regional bands. the 868–870 mhz band operates in europe, while the 902–928 mhz band operates in north america, australia, and other countries [7].

for the proposed system in the thermal power plant, the chosen wsn sensors are of an industrial type that operates in the 2.4 ghz band, since it is globally available and license free. the choice is not based on the spectral content, although according to recommendation itu-r p.372 industrial noise of any type disappears at 900 mhz [15]. in a thermal power plant and its surroundings there is usually no large number of systems that operate in the 2.4 ghz band and could cause interference and endanger the stability of the installed wsn.
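The band plan described above can be tabulated with a short script. The following is a minimal sketch and not part of the described system: the channel numbering (0 for the 868 MHz band, 1 to 10 for the 915 MHz band, 11 to 26 for the 2.4 GHz band) and the centre-frequency formulas are taken from the IEEE 802.15.4 standard rather than from the paper, the data rates are the ones quoted in the text, and the function name `channel_info` is hypothetical.

```python
# Sketch of the IEEE 802.15.4 band plan summarized in the text.
# Assumption: channel numbers and centre-frequency formulas follow the
# standard's original channel assignment; they are not stated in the paper.

def channel_info(channel: int) -> tuple[float, int]:
    """Return (centre frequency in MHz, data rate in kbps) for a channel."""
    if channel == 0:                      # single European channel
        return 868.3, 20
    if 1 <= channel <= 10:                # 902-928 MHz band, 2 MHz spacing
        return 906.0 + 2.0 * (channel - 1), 40
    if 11 <= channel <= 26:               # 2.4 GHz ISM band, 5 MHz spacing
        return 2405.0 + 5.0 * (channel - 11), 250
    raise ValueError(f"channel {channel} is outside the range 0..26")


if __name__ == "__main__":
    for ch in range(27):
        freq_mhz, rate_kbps = channel_info(ch)
        print(f"channel {ch:2d}: {freq_mhz:7.1f} MHz, {rate_kbps} kbps")
```

Listing all 27 channels this way makes it easy to pick a 2.4 GHz channel that lies away from frequencies already occupied by other networks on site.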
the other fact is that the 2.4 ghz ism band has the highest data rate, 250 kbps (compared with 40 kbps at 915 mhz and 20 kbps at 868 mhz), and supports 16 channels (10 at 915 mhz and only 1 at 868 mhz). these wsn devices are designed to monitor assets or environments in outdoor or harsh settings and hard-to-reach places. they can operate in the industrial temperature range (-40 °c to 70 °c), which has also been proved in the case of the proposed system. in addition, a variety of international safety, electromagnetic compatibility, and environmental certifications and ratings are available for these devices. finally, equipment selected according to the ieee 802.15.4 standard allows the development of an independent system that requires neither significant investment in infrastructure nor the monthly payments of a mobile-network-based system.

although sensors in a wsn could be battery supplied and operate for several months using standard aa batteries, in the presented system the wsn nodes and gateways are supplied from an external 24 v dc source using industrial-grade supplies with isolated input/output. the line voltage for these supplies is taken from the plant’s secured (uninterrupted) supply. batteries were used to supply the nodes only during installation, since at that time the particular transformer and its whole supply were switched off for maintenance.

the sensor network consists of several measuring nodes per transformer. one analog-input node measures the transformer current on one of its analog inputs and controls 4 cooling units via digital outputs. the temperature measuring nodes are connected to pt100 sensors that measure the top-oil temperature, the input and output temperatures of the cooling units, and the ambient temperature. the receivers (one per transformer) are placed in the plant control room at a point from which the transformer can be seen, in order to avoid poor signal reception. one receiver is equipped with a microcontroller and memory, so it is used for real-time algorithm deployment.
both receivers communicate with each other and with the supervising computer mounted in the control room via ethernet. the positions of the sensor nodes and wsn gateways are shown in fig. 6 on an aerial view of the power plant. labels on a white background denote power transformers, where 5t and 3t are generator transformers, 25t and 23t are their corresponding self-consumption transformers, respectively, while 1t and 2t are common group transformers. locations of wsn nodes are marked with yellow circles, while locations of wsn gateways are marked with yellow squares.

the application that runs on the supervising pc communicates with the receivers via the modbus tcp/ip protocol, displays all the data needed by the operators and stores the values in a mysql database. in order to provide remote supervision and application modifications, the whole system is connected to the internet via the power plant lan.

a part of the realized system for the 100 mva generator transformer is shown in fig. 7. the photo is taken from the control room, where the main receiver with the real-time application is mounted. the right side of fig. 7 shows the open industrial enclosure with ip65 protection. the equipment installed in the enclosure is designated as follows: wsn nodes – 1, wsn antennas – 2, isolated 24 v dc power supply – 3, measurement transmitter for translating ma signals into mv – 4, relays for switching fans and oil pumps and reading the switching status – 5, 230 v ac socket for supplying instruments and tools during maintenance – 6.

fig. 6 disposition of wsn nodes and gateways in thermal power plant

fig. 7 industrial wsn based generator power transformer thermal management system

5. experimental verifications

the analysis of signal reception from the nodes at the receiver showed that several precautions should be taken in complex industrial environments such as a thermal power plant.
each node should be placed in such a way that it can be seen from the place where the receiver is mounted. this is important to avoid signal loss at obstacles. since there is other equipment around the transformer, including large cooling units, it is better to put all the measurement nodes for one power transformer and its auxiliary consumption transformer in a single enclosure, as shown in fig. 7. in the case of transformer 25t and its main transformer 5t, the signals of the auxiliary transformer are simply added to the existing wsn nodes. for transformer 23t the situation is slightly different because, unlike 25t and 5t, it is not placed just behind its main transformer 3t. transformer 23t is located on the other side of the main road in the power plant from 3t. therefore, an additional wsn node is used for 23t. since that node is not clearly visible from the control room building, the wsn was modified: the wsn node used for transformer 3t (mounted in the enclosure near 3t) that is closest to transformer 23t is reconfigured as a mesh router. the wsn node near 23t is clearly “seen” by the mesh router node, which resends both its own measured data and the data obtained from the wsn node near transformer 23t.

the wsn modules work with a constant transmitter output power of 10 dbm (10 mw), and the receiver sensitivity is -102 dbm. the operation of the wireless sensor network is analyzed by monitoring the change in signal level at the receiver input of each wsn module over a longer period of time (the results for one segment are shown in fig. 8). the observed dynamics of the signal is in the range of 9 db to 45 db, depending on the relative positions of the wsn modules that communicate. the wireless sensor network is realized in outdoor conditions (outside buildings), in the complex propagation environment of the thermal power plant te kostolac a.
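The transmitter power and receiver sensitivity quoted above allow a rough link-budget check. The sketch below is only illustrative: it assumes ideal free-space propagation, 0 dBi antennas and a carrier frequency of 2.44 GHz (none of which are stated in the paper), and it takes the 200 m node-to-control-room distance mentioned earlier as the worst case. The function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss, 20*log10(4*pi*d*f/c), in dB."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S
    )


def link_margin_db(tx_power_dbm: float, rx_sensitivity_dbm: float,
                   path_loss_db: float, antenna_gains_db: float = 0.0) -> float:
    """Margin between the received level and the receiver sensitivity, in dB."""
    received_dbm = tx_power_dbm + antenna_gains_db - path_loss_db
    return received_dbm - rx_sensitivity_dbm


if __name__ == "__main__":
    # 10 dBm and -102 dBm are from the text; 200 m and 2.44 GHz are assumptions.
    loss = free_space_path_loss_db(distance_m=200.0, frequency_hz=2.44e9)
    margin = link_margin_db(tx_power_dbm=10.0, rx_sensitivity_dbm=-102.0,
                            path_loss_db=loss)
    print(f"free-space loss: {loss:.1f} dB, margin: {margin:.1f} dB")
```

A free-space estimate is only a starting point here, because the measured signal level varies considerably in this reflective environment; on-site measurements remain the deciding criterion.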
this thermal power plant consists of several buildings that are good reflective surfaces, so a large number of reflected components (besides the direct wave) arrive at the receiver input. this results in great instability of the signal level at the receiver inputs (fig. 8).

fig. 8 signal level change at the inputs of wsn module receivers sampled on 1min

after testing, the final places for both the wsn nodes at the transformers and the receivers (gateways) were found. the main criterion is to keep the signal level (link quality) over 30%. this assures stable work of the whole system under various influences, such as disturbances, power disruptions, different weather conditions, etc. although it is not necessary for a node to be seen by the gateway, we have proved by testing that for stable work in such an environment it is better to provide optical visibility between them, even over short distances (less than 100 m). existing wireless networks should be clearly identified and separated, especially wlans, since their signals have a higher level than those in zigbee networks. in that case, some measurements should be made beforehand to find the most suitable channel for zigbee communication.

the proposed system has been developed and has been under testing in one thermal power plant for almost one year. it has passed all tests without loss of communication or other malfunctions under different plant operating regimes and ambient conditions. it should be noted that during the first installations in summer the temperature was over 40 °c, while in winter it was below -25 °c. in both cases the system worked without any interruption. this is confirmed by the values stored in the sql database saved on the local hard disk of the pc mounted in the control room.
the results of the wsn transformer thermal monitoring system are obtained through the application installed on the panel pc in the control room of the thermal power plant. the application receives updates from the real-time gateway every ten seconds and stores the data in the sql database. the main screen of the data acquisition application, with clearly separated parts that relate to a particular transformer (5t, 25t, 3t, 23t, 1t and 2t), is shown in fig. 9. in order to explain the meaning of some parts of the screen, arrows point to characteristic values in the case of transformer 5t: 1 – load current, 2 – top-oil temperature, 3 – temperature at the entrance and exit of the cooling group, 4 – calculated hot-spot temperature, 5 – ambient temperature, 6 – cooling group status (green square – group is activated, gray square – group is deactivated).

fig. 9 signal level change at the inputs of wsn module receivers sampled on 1min

from fig. 9 it can be seen that all fields for transformer 23t are grayed out. this is because transformer 23t was under maintenance, out of operation, when the screen was captured from the application.

a further merit of the proposed system is that the application is reachable from outside the plant through the internet via a protected vpn channel. the system can thus be monitored remotely, and the application can even be updated and replaced without the need to visit the plant. this significantly simplifies system maintenance and reduces additional costs [16].

6. conclusions

an industrial wireless sensor network system for thermal management of high-power transformers is presented in the paper. the importance of the implemented on-line temperature monitoring system lies in a completely new solution based on wireless sensor networks. this is a unique solution that was developed because of potential problems with the installation of the cables required for the same purpose.
prior to installation and during the operation of the system, a detailed analysis of the signal quality was carried out individually on all wireless connections in the network (i.e., for each pair of sensor nodes). after the third upgrade of the system, the operation of the wireless sensor network can be monitored remotely, since data about signal quality is recorded in a database on a computer in the control room. low-consumption wsn modules can work for several months without changing batteries (standard aa type), which gives the system an additional level of protection in case of failure of the auxiliary power supply in the plant. also, the system allows easy upgrades by deploying and configuring new wsn modules. remote control via the internet adds flexibility to the entire system and significantly shortens the time needed for analysis and testing of transformers. remote changes to the software are also possible; for security reasons they are performed in communication and cooperation with the operators in the plant. since in the case of accidents in power plants the efficiency of decision and realization is the priority, the significance of the realized system is clear. results obtained from the real industrial plant confirm the proposed wireless network configuration, even at very high and very low ambient temperatures.

acknowledgement: results presented in the paper are part of an innovation project “possibilities for wireless sensor networks application in smart grid power systems”, granted by serbian ministry of education, science and technological development, no. 451-03-2802/2013-16/79, 2014.

references

[1] b. sparling, “transformer monitoring and diagnostics”, in proceedings of the ieee power engineering society 1999 winter meeting, 31 jan-4 feb 1999.
[2] b.
flynn, “case studies regarding the integration of monitoring & diagnostic equipment on aging transformers with communications for scada and maintenance”, in proceedings of the distributech 2008 conference and exhibition, tampa fl, usa, january 22-24, 2008.
[3] j. li, t. jiang, s. grzybowski, “hot spot temperature models based on top-oil temperature for oil immersed transformers”, in proceedings of the ieee conference on electrical insulation and dielectric phenomena, 2009. ceidp '09, 18-21 october 2009, pp. 55.
[4] d. laurence, “wireless sensor networks”, awe international, issue 15, june 2008.
[5] l. seno, f. tramarin, s. vitturi, “performance of industrial communication systems”, ieee industrial electronics magazine, pp. 27-37, june 2012.
[6] m. anderson, “a look at wireless technologies for industrial applications”, industrial ethernet book, issue 71, pp. 8-13, 2012.
[7] a.-b. garcía-hernando et al., problem solving for wireless sensor networks, springer-verlag london limited, 2008.
[8] i.f. akyildiz, w. su, y. sankarasubramaniam, e. cayirci, “wireless sensor networks: a survey”, computer networks, vol. 38, no. 4, 2002, pp. 393–422.
[9] l. doherty, k.s.j. pister, l. el ghaoui, “convex position estimation in wireless sensor networks”, in proceedings of the infocom 2001, twentieth annual joint conference of the ieee computer and communications societies, anchorage, usa, 2001, pp. 1655–1663.
[10] j. n. al-karaki, a. e. kamal, “routing techniques in wireless sensor networks: a survey”, ieee wireless communications, vol. 11, no. 6, pp. 6-28, 2004.
[11] h. karl, a. willig, protocols and architectures for wireless sensor networks, west sussex: john wiley and sons ltd, uk, 2007.
[12] r. lin, z. wang, y.
sun, “wireless sensor networks solutions for real time monitoring of nuclear power plant”, in proceedings of the fifth world congress on intelligent control and automation, wcica 2004, 2004, pp. 3663-3667.
[13] s. z. huang, x. z. zhao, “application of wireless sensor networks on power plants monitoring”, applied mechanics and materials, pp. 762-766, 2013.
[14] t. li, m. fei, “vibration monitoring of auxiliaries in power plants based on ar (p) model using wireless sensor networks”, communications in computer and information science, vol. 98, pp. 213-222, 2010.
[15] international telecommunication union, radiocommunication sector, recommendation itu-r p.372-12, radio noise, p series: radiowave propagation, 07/2015.
[16] a. nikolic, “remote supervising and decision support for on-line monitoring systems in power plants”, plenary lecture, in proceedings of the 2nd international conference on intelligent control, modelling and systems engineering (icms '14), cambridge, ma, usa, january 29-31, 2014, pp. 14.

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 199-208
doi: 10.2298/fuee1702199d

a non-intrusive identification of home appliances using active power and harmonic current

srđan đorđević, marko dimitrijević, vančo litovski
university of niš, faculty of electronic engineering, niš, serbia

abstract. in recent years, research on non-intrusive load monitoring has become very popular since it allows customers to better manage their energy use and reduce electrical consumption. the traditional non-intrusive load monitoring method, which uses active and reactive power as signatures, has poor performance in detecting small non-linear loads. this drawback has become more prominent because the use of nonlinear appliances has increased continuously during the last decades. to address this problem, we propose a nilm method that utilizes harmonic current in combination with the changes of real power.
the advantages of the proposed method with respect to the existing frequency-analysis-based nilm methods are lower computational complexity and the use of only one feature to characterize the harmonic content of the current.

key words: non-intrusive load monitoring (nilm), load signature, energy management

received june 3, 2016; received in revised form september 1, 2016
corresponding author: srđan đorđević, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: srdjan.djordjevic@elfak.ni.ac.rs)

1. introduction

the rapid growth in energy consumption and carbon emissions has generated interest in the deployment of efficient household energy management systems. a home energy management system enables consumers to control and manage their electrical consumption according to information on the consumption of individual loads [1, 2]. therefore, in order to significantly reduce waste in residential energy consumption, it is necessary to use a load monitoring system. appliance load monitoring is useful not only for energy saving, but also in fault detection systems, remote monitoring systems and some residential applications such as in-home activity tracking [3, 4].

there are two methods for monitoring individual electrical loads: 1. distributed direct sensing, or intrusive load monitoring, and 2. single point sensing, or non-intrusive load monitoring (nilm). the first approach requires a complex instrumentation system to measure the energy consumption of each device separately. this solution has many practical disadvantages, such as complex installation, low scalability, low reliability and high cost due to the large number of sensors and communication devices. a more practical solution for monitoring individual loads is nilm, which uses only one sensor attached to the electric utility service entry.
this method disaggregates the whole-house energy consumption into the energy usage of individual appliances. the most commonly used steady-state nilm method detects the operation of individual loads from the step changes in real and reactive power [5]. this method works well for devices with two states of operation, but it is not suitable for extracting variable loads and multi-state appliances. another problem is the detection of loads that consume similar steady-state power, since their two-dimensional signatures overlap in the p-q plane. it is especially difficult to distinguish loads with small power consumption (lower than 150 w).

the current trends in electricity consumption show a rapid increase in the type and number of household appliances [6], most of which are not predictable or controlled. consequently, the task of load identification becomes more challenging. an additional problem is the inability of previous nilm algorithms to detect low-power devices, which have become more numerous and diverse. a solution to this problem has been proposed in [7], through the use of circuit-level instead of whole-house power measurements. this approach represents a trade-off between intrusive and non-intrusive load monitoring, which facilitates load disaggregation at the expense of cost and complexity.

in order to improve performance in detecting small non-linear loads, several nilm methods have used current harmonics [8-11]. however, most of these techniques are not practical due to the calculation of many harmonics in real time [12, 13]. in our previous works [14, 15], we have proposed the use of distortion power for appliance identification. the aim of these papers was to improve the identification of small nonlinear loads by the analysis of three electrical quantities (active, reactive and distortion power), which are easy to obtain from metering devices.
however, this method does not take into account the fact that the time variations of the voltage harmonics make the load identification imprecise. namely, distortion power, which is used to characterize the nonlinear loads, mainly consists of cross-products of voltage and current harmonics of different orders. in this work we suggest the use of harmonic current instead of distortion power in order to improve the appliance disaggregation accuracy.

this paper proposes a nilm method based on the analysis of steady-state values of harmonic current and active power. the proposed approach uses only one feature to characterize the harmonic content of the current, as opposed to the previous nilm methods. we explore the effectiveness of the proposed electrical quantities in recognizing low-power loads.

the remainder of the paper is organized as follows: in the next section we review some of the commonly used nilm techniques. the proposed method for load monitoring is discussed in section 3. section 4 presents the results of the application of the proposed nilm method on low-power appliances. the conclusion is reported in section 5.

2. background

the main stages of a nilm system are: a) data acquisition, b) feature extraction and event detection, and c) load identification. the purpose of the first stage is to gather the voltage and current measurements at an adequate sampling rate. the sampling frequency depends on the electrical characteristic used by the nilm method. generally, the data acquisition for nilm can be classified in terms of the sampling rate as high frequency or low frequency. the next step is to transform the raw data into a specific appliance feature, or load signature. in order to extract features it is necessary to first detect load events such as switching on/off or changing state.
the load signature can be derived from the steady-state signal component, which can be expressed as a finite number of sinusoids, or from the transient signal component [16-18]. dong et al. [19] studied non-intrusive extraction of load signatures and demonstrated their technique by using smart meter data. the final step is the estimation of the appliance-specific states by using machine learning algorithms. there are two categories of load identification algorithms: supervised learning algorithms, which require a training procedure, and unsupervised learning algorithms, which are able to directly recognize appliance operations.

residential nilm systems usually use steady-state instead of transient load signatures. transient load monitoring systems are not suitable for residential energy disaggregation since they require expensive hardware (high-frequency energy meters) that makes them impractical. in addition, turn-off events are very difficult to detect with transient signatures. recently, wang et al. [20] developed a new nilm method which is not limited to transient or steady-state analysis and categorizes the appliances according to working style.

the most common method of nilm uses power measurement to characterize appliances [5]. since the load signature of this method involves two electrical parameters, the steady-state changes in active and reactive power, they are mapped to a two-dimensional signature space (p-q plane). an important characteristic of the pq signature is that it can be obtained by using data from existing smart meters. the second advantage of the power change method is that it allows automatic identification of on-off appliances. however, the method has some limitations. first, it requires step changes in power level to identify loads. therefore, there is a problem in detecting devices with variable power draw.
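The power-change method described above can be illustrated with a small sketch. It is not the algorithm of [5] or of this paper: the event threshold, the nearest-neighbour matching in the P-Q plane and the appliance signatures are all hypothetical choices made for illustration.

```python
from math import hypot


def detect_events(p_watts, q_var, threshold_w=30.0):
    """Return (index, dP, dQ) for every step change in active power
    whose magnitude reaches the threshold (a hypothetical value)."""
    events = []
    for k in range(1, len(p_watts)):
        dp = p_watts[k] - p_watts[k - 1]
        dq = q_var[k] - q_var[k - 1]
        if abs(dp) >= threshold_w:
            events.append((k, dp, dq))
    return events


def classify(dp, dq, signatures):
    """Match a step change to the nearest (P, Q) signature.
    Turn-off steps are mirrored so they compare with turn-on signatures."""
    sign = 1.0 if dp > 0 else -1.0
    name = min(
        signatures,
        key=lambda n: hypot(sign * dp - signatures[n][0],
                            sign * dq - signatures[n][1]),
    )
    return name, "on" if dp > 0 else "off"


if __name__ == "__main__":
    # Hypothetical signatures: (active power in W, reactive power in var).
    signatures = {"fridge": (100.0, 50.0), "heater": (1000.0, 0.0)}
    p = [0, 0, 100, 100, 1100, 1100, 1000, 1000, 0]
    q = [0, 0, 50, 50, 50, 50, 0, 0, 0]
    for index, dp, dq in detect_events(p, q):
        print(index, classify(dp, dq, signatures))
```

The sketch also makes the stated limitations visible: a load whose power drifts gradually never produces a step above the threshold, and two appliances with nearly equal (P, Q) signatures cannot be separated by the nearest-neighbour rule.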
despite the fact that the steady-state power level between two events of on-off devices is easy to detect, the method has difficulty distinguishing these devices in some cases. a false positive may occur when two or more devices change state at nearly the same time, since the sum of the power consumptions of these devices may be associated with another load. furthermore, different loads may exhibit overlapping signatures in the pq signature space, which becomes more prominent as the number of loads increases.

3. non-intrusive load monitoring by using harmonic current and active power

nowadays, the number of non-linear household loads, such as energy-efficient variable speed drives and switched mode power supplies, increases continuously. since the nonlinear loads inject harmonic currents into the power system, it is promising to use harmonics as a load signature in residential nilm. this kind of method requires spectral analysis, as opposed to the power-based methods, which use features directly derived from the raw current and voltage waveforms. due to the presence of many linear loads in residential buildings, it is not possible to use harmonic content as a unique load signature for load disaggregation.

harmonic currents may also be caused by transients during shutdown and turn-on events. the harmonic content of the transient waveforms varies with time and may have frequencies that are not related to the fundamental frequency. some of the nilm methods are based on the frequency analysis of the transient waveforms [17, 18]. the problem with this approach is that transient detection is prone to errors. to overcome this problem some researchers have proposed the use of steady-state current harmonics to characterize the nonlinear loads [10, 11]. the online calculation of many current harmonics implies higher computational requirements [12].
consequently, a practical harmonic-based nilm system must use a limited number of harmonics. to solve this problem many researchers have proposed various nilm methods. cole and albicki [21] developed the first harmonic-based nilm method, which is based on the calculation of the first eight odd harmonics. the authors of [22] proposed the use of only the 2nd and 3rd harmonics, while the authors of [23] considered the first sixteen odd harmonics. recently, several authors [24] have proposed a method which uses the first three odd harmonics. the nonlinear loads can be characterized not only by harmonic components in the current signal, but also by other quantities such as total harmonic distortion of current, distortion power, harmonic current, crest factor and distortion power factor.

the focus of our research is on the identification of small non-linear loads. when large loads are active, smaller loads are difficult to identify due to the limited resolution of the data acquisition. therefore, we need to consider a load signature that enables identification of low-consumption appliances in the presence of high-power devices. in a typical household most of the existing load is still linear, while nonlinear loads are small. therefore the influence of small loads on the fundamental current harmonic is negligible. each of the aforementioned electrical parameters (thdi, d, dpf, ki, and ih) can be expressed in terms of the current and voltage harmonics. according to these equations only harmonic current, as opposed to the other quantities, is not mathematically related to the fundamental current harmonic. therefore, we propose a novel approach in which non-linear loads are characterized by steady-state harmonic current. the proposed method utilizes harmonic current in combination with the changes of real power.
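The dependence on the fundamental can be made explicit for two of these quantities. The relations below are the usual textbook definitions, quoted as standard forms rather than taken from the paper, and they assume a current without a dc component; $I_H$ denotes the harmonic current, $I_1$ the rms value of the fundamental and $I_{rms}$ the total rms current:

$$ \mathrm{THD}_I = \frac{I_H}{I_1}, \qquad \mathrm{DPF} = \frac{I_1}{I_{rms}} = \frac{1}{\sqrt{1 + \mathrm{THD}_I^2}} $$

Both ratios change whenever $I_1$ changes, for example when a large linear load is switched on, even if the harmonic content of the total current stays the same. The harmonic current $I_H$ itself does not contain $I_1$, which is the property the proposed signature relies on.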
the current signal can be represented as the sum of the fundamental and harmonic components, as follows:

$$I_{rms}^2 = I_0^2 + I_1^2 + \sum_{h=2}^{\infty} I_h^2 = I_0^2 + I_1^2 + I_H^2 \tag{1}$$

where: $I_{rms}$ is the total rms value of the current, $I_1$ is the rms value of the fundamental harmonic, $I_0$ is the dc component of the current, $I_h$ is the rms value of the h-th harmonic component of the current signal, and $I_H$ is the harmonic current. therefore, the harmonic current can be simply expressed in terms of the effective current and the fundamental harmonic of the current as follows:

$$I_H = \sqrt{I_{rms}^2 - I_1^2 - I_0^2} \tag{2}$$

the effective current is usually calculated by using the root mean square method as:

$$I_{rms} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} i[n]^2} \tag{3}$$

where: n is the sample index, i[n] is the current value measured at sampling point n, and N is the number of samples taken during a full wave of the current.

the standard method for calculation of the harmonic currents and voltages is based on the use of the discrete fourier transform (dft). this method works well for estimation of a periodic signal in steady state. the rms value of the fundamental harmonic can be calculated using

$$I_1 = \sqrt{2}\,\sqrt{\mathrm{Re}\{\bar{I}_1\}^2 + \mathrm{Im}\{\bar{I}_1\}^2} \tag{4}$$

where $\bar{I}_1$ is the first harmonic current obtained by the discrete fourier transform as:

$$\bar{I}_1 = \frac{1}{N} \sum_{n=1}^{N} i[n]\, e^{-j 2\pi n / N} \tag{5}$$

according to (1-3), the calculation of the harmonic current is less computationally demanding than the calculation of harmonics. therefore, we can claim that the proposed method is more computationally efficient than other approaches that use harmonic analysis. to the best of our knowledge, the computational complexity of the harmonic-based nilm algorithms has not been explicitly stated by researchers. however, the computational cost of these methods can be determined according to the algorithm used to calculate the dft and the number of frequencies required by the method.
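the calculation described by (2)-(5) can be sketched in a few lines of python. this is a minimal illustration and not the authors' implementation: it assumes that the dft in (5) is normalized by 1/N, so that the rms value of the fundamental is √2 times the magnitude of that single bin, and the function name `harmonic_current` and the test signal are hypothetical.

```python
import numpy as np


def harmonic_current(samples):
    """Return (I_rms, I_1, I_H) from the samples of one full current cycle."""
    i = np.asarray(samples, dtype=float)
    n_samples = i.size
    # Index 0..N-1 is used here; shifting to 1..N only rotates the phase.
    n = np.arange(n_samples)

    i_rms = np.sqrt(np.mean(i ** 2))                       # eq. (3)
    # Single DFT bin at the fundamental, normalized by 1/N (assumption).
    i1_bar = np.sum(i * np.exp(-2j * np.pi * n / n_samples)) / n_samples  # eq. (5)
    i_1 = np.sqrt(2.0) * np.abs(i1_bar)                    # eq. (4)
    i_0 = np.mean(i)                                       # dc component
    i_h_squared = i_rms ** 2 - i_1 ** 2 - i_0 ** 2         # eq. (2)
    # Clamp tiny negative values caused by floating-point rounding.
    return i_rms, i_1, np.sqrt(max(i_h_squared, 0.0))


# Hypothetical test signal: 10 A peak fundamental plus 1 A peak third harmonic.
N = 200
t = np.arange(N) / N
signal = 10.0 * np.cos(2 * np.pi * t) + 1.0 * np.cos(2 * np.pi * 3 * t + 0.3)
i_rms, i_1, i_h = harmonic_current(signal)
print(f"I_rms = {i_rms:.4f} A, I_1 = {i_1:.4f} A, I_H = {i_h:.4f} A")
```

for this test signal the fundamental should come out at 10/√2 ≈ 7.071 a and the harmonic current at 1/√2 ≈ 0.707 a. only one dft bin is evaluated, regardless of how many harmonics the load actually injects.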
in the most nilm methods steady-state current harmonics are obtained by applying fft [11, 21 ,22]. however, when only a few dft frequencies are needed, as in the proposed method, it is more suitable to use the goertzel's algorithm. the main advantage of the goertzel algorithm over often used fft algorithms is less mathematical operations required for harmonic analysis. goertzel algorithm has linear complexity for n data points and m applications (required harmonics) it is o(mn), while fft algorithms has o(nlog2n) complexity. it is clear that for m n, and   1 m nn s m r  (such that uij(||rij||) = 0 when rs  rij  rh). from formula (1), it follows that the resulting force function is:         1 1 || || || || 1 1 || || || || || || (|| || ) 0 ( || || ) ( || || ) ij ij ij h s ij h s ij ij ij m n ij sr r s ij h m n h ij kr r r r r r u f r m n r r r r r m n r r r                                            (2) compared to alternative models such as [33][45][46], the addressed model can efficiently formulate the “aggregation” of carp school. fig. 5 (a) aggregation of carp; (b) interaction zones between neighboring carp. 444 y. liang, c. wu rs, rh, and rk are illustrated in fig. 5(b), ||rij|| indicates the distance between two neighboring carps.  is constant coefficient derived from empirical data. it should be remarked that the moving orientation, water flow velocity and blind zone is not considered in the formulation of formula (1). fig. 6 (a) inter-carp potential energy; (b) inter-carp force (m = 12; n = 6) fig. 6(a) shows the potential energy incurred by the pair-wise interaction between two neighboring carps. fig. 6(b) shows the resulting inter-carp force. it can be observed that, inter-carp potential energy has a stable zone (or parallel zone), within which the intercarp potential energy is basically constant so that the neighboring carps can cruise without influencing each other. fig. 
7 (a) interaction between neighboring carps; (b) ij value with variant max (denoting the visible zone) according to ichthyology [31-32], the interaction between carps is supposed to be corresponding to blind-zone (figures 5(b) and 7(a)). as illustrated in the figure 7(a), i is the velocity of i-th carp. max is the maximal perceptible angle, obviously 0  max  . ij indicates the angle between i and rij, it is defined by the following formula: arccos || || || || i ij ij i ij r r             (3) a hadoop-enabled sensor-oriented information system for knowledge discovery about... 445 given the blind-anglemax, the inter-carp potential is defined as: * ij ij ij u u  (4) where 2 2 2 max2 max ( ) 2 ij ij ij e           (5) as a consequence, the inter-carp interaction force is determined by the following formula: * * || || ij ij ij ij ij f u f r      (6) where fij is defined in equation (2). based on the above mathematical model for carp schooling, the future status of carp school can be predicted according to the currently observed sensory data using agent-based mathematics model addressed above. furthermore, based on the preliminary simulation results, the motivations of fish aggregation, such as foraging advantages, reproductive advantages, predator avoidance, or hydrodynamic efficiency, can be disclosed. fig. 8 snapshots about the simulation of carp aggregation and corresponding standard-deviation of kinetic energy (the size of fish school is 50): (a) initial stage; (b) aggregation stage 446 y. liang, c. wu figure 8 demonstrated the aggregation process of a fish school of size 50. it is illustrated that carp gradually gather due to the pairwise interaction between neighboring carp; in addition, standard-derivation of kinetic energy of carp can be used to measure the aggregation status of carp school, namely a carp school in aggregation has smaller standard deviation of kinetic energy. 5.2. 
vehicle traffic analysis traffic flow analysis plays a significant role in civil engineering, transportation management, and homeland security [34]. due to the influence of rapid urbanization and modern industrialization, traffic congestion has become an intolerable issue in today’s world. modeling and simulation of traffic flow provides an efficient way to understand traffic congestion and disclose corresponding remedy. mathematical models for traffic flow are categorized into microscopic (or agent-based) and macroscopic strategies. macroscopic models study traffic from an average (or continuum) perspective, while microscopic models study the motion of individual vehicles. macroscopic model uses temporal and spatial-dependent partial differential equations (generally hyperbolic partial differential equations.) to formulate the expected traffic flow. representative macroscopic models for traffic flow are lighthill-whitham-richard model [35], aw-rascle model [47], and zhang model [36]. none of the above models can efficiently and accurately those formulate complicated road scenario such as nozzle, merging, diverging, and roundabout, etc. different from above models, the proposed work defines the governing equations for traffic flow using the following partial differential equations: ( ) 0v t       (7.1) (7.2) (7.3) where  (x,t) is the number of vehicles over unit length, v(x,t) is the expected velocity of the vehicle, vmax is the speed limit, a(x) is the cross-section width (or bandwidth) of the road. equation (7.1) is derived from conservation of mass. equation (7.2) ensures that traffic flow slows down up at nozzle and keeps constant speed at fork (illustrated in figure 9). fig. 9 traffic flow through (a) nozzle; (b) fork max ( , ) ( ) log ( ) r x t v a x a x   2dv p v g dt        a hadoop-enabled sensor-oriented information system for knowledge discovery about... 
as illustrated in figure 10(a), this work acquires the citywide traffic status using an electro-optical sensor array mounted on an unmanned aerial vehicle (uav). figure 10(b) shows the expected traffic velocity field resulting from the solution of the governing equations. the boundary conditions and the coefficients of the governing equations are obtained from empirical traffic data. using the expected traffic flow as a reference, the observed vehicles can be measured and evaluated.

fig. 10 (a) traffic status acquired using uav-mounted electro-optical sensor array; (b) expected traffic flow derived from empirical data and equations (7.1)-(7.3).

6. macro-cell strategy

real-world problems generally involve a large scene such as a metropolitan city or a huge lake. as a result, a sois-hadoop system should be scalable so as to solve large-scale problems.

fig. 11 two partition strategies: (a) euler formulation of a transportation network; (b) lagrange formulation of a lake.

in this work, a macro-cell strategy, which partitions the global physics domain (or scene) into multiple overlapping or non-overlapping elements (cells) and then manipulates them independently [9][10][16][48][49], is employed to enhance the capability of the sois-hadoop framework to handle large-scale problems. as illustrated in fig. 11, the physics domain (or scene) of interest can be discretized using the euler formulation or the lagrange formulation [49]. inter-cell communications only occur between neighboring cells, and they are triggered only when anomalous crowd behavior is observed and detected. cells of particular interest are analyzed further using the modeling and simulation strategy (which is relatively computationally costly). table 1 lists sample feature values for the macro-cell-oriented analysis of carp aggregation [46].
through sufficient training, the addressed system can employ the known cellular features to predict the likelihood of aggregation occurrence through appropriate machine learning methods [50] such as logistic regression, neural networks, hidden markov models (hmm), and bayesian learning.

table 1 cell-by-cell analytics of carp aggregation (the lake is divided into 1000 cells).

cell id   fish density   total kinetic energy   std (kinetic energy)   entropy   aggregation occurs?
1         25.6           77                     10.2                   10        yes
2         12.7           89                     30.98                  40        no
3         7.9            101                    105                    90        …
..        …              …
10        14             25                     133                    30

7. conclusion

a pilot sois-hadoop system has been set up and applied to a variety of real-world problems such as the prediction of the aggregation of carp [16] and vehicle traffic analysis [6][9][24]. some preliminary yet promising outcomes have been achieved. in the near future, we intend to make progress in the following directions: (1) broaden the application of the proposed sensor-oriented information analysis system to areas such as the simulation of the spread of epidemic diseases [16], anomalous pedestrian detection [8][15], and structural health monitoring; (2) develop scalable numerical methods for the mathematical modeling of the sensor-oriented information analysis system: time integration methods for the solution of the governing equations, domain decomposition in the finite element method, and polynomial preconditioning; (3) optimize the exploitation of sensory data using dimensionality reduction (e.g., pca) [50]; (4) optimize the cooperative control of sensor assets so as to obtain optimal observation and high energy efficiency; (5) employ more advanced and accurate mathematical models to formulate the expected behavior of the toi. for example, stochastic analysis can be introduced to formulate the uncertainty of the sois-hadoop framework, and multi-scale modeling can be used to seamlessly merge the microscopic and macroscopic descriptions of the toi.
acknowledgement: this work is jointly sponsored by the national science foundation (nsf) with proposal number 1240734 (“a design proposal for the center of cyber sensor networks for human and environmental applications”) and 1111542 (“ri: large: collaborative research: a robotic network for locating and removing invasive carp from inland lakes”). the authors would like to thank dr. kimberly kendrick from university of nevada - las vegas (nevada, usa), dr. xiaofang wei from central state university (ohio, usa), mr. darrell barker and ms. olga mendoza-schrock of the sensors directorate in air force research laboratory (ohio, usa) for their support and guidance of this work.

references

[1] o. bott, m. marschollek, k.-h. wolf, and r. haux, "towards new scopes: sensor-enhanced regional health information systems-part 1: architectural challenges", meth. inform. med., vol. 46, pp. 476-483, 2007. [2] j. v. c. schneider, information systems today: managing in the digital world. prentice hall, 2015. [3] f. zhao, j. shin, and j. reich, "information-driven dynamic sensor collaboration", ieee signal process. mag., vol. 19, pp. 61-72, 2002. [4] w. m. ulrich, legacy systems: transformation strategies. prentice hall, 2002. [5] s. fernandes, y. liang, s. sritharan, x. wei, and r. kandiah, "real time detection of improvised explosive devices using hyperspectral image analysis", in proceedings of the 2010 ieee national aerospace and electronics conference (naecon 2010). 2010. [6] s. fernandes and y. liang, "chipping and segmentation of target of interest from low-resolution electrooptical data", in proceedings of the spie defense, security, and sensing. 2013, pp. 87440r-87440r-8. [7] k. grolinger, w. a. higashino, a. tiwari, and m. a. capretz, "data management in cloud environments: nosql and newsql data stores", j. cloud. comput. adv. syst. appl., vol. 2, p. 22, 2013. [8] y. liang, w. melvin, s. fernandes, m. henderson, s. i. sritharan, and d. barker, "a crowd motion analysis framework based on analog heat-transfer model", american journal of science and engineering, vol. 2, pp. 33-43, 2013. [9] y. liang, m. henderson, s. fernandes, and j. sanderson, "vehicle tracking and analysis within a city", in proceedings of the spie defense, security, and sensing. 2013, pp. 87510f-87510f-15. [10] y. liang, m. szularz, and l. t. yang, "finite-element-wise domain decomposition iterative solvers with polynomial preconditioning", math. comput. model., vol. 58, pp. 421-437, 2013. [11] a. s. foundation. (2014). hadoop releases. available: http://www.apache.org/ [12] y. liang and c. wu, "a sensor-oriented information system based on hadoop cluster", in proceedings on the international conference on internet computing (icomp). 2014, p. 1. [13] y. liang and c. wu, "an agent-based mathematical model about carp aggregation", in proceedings of the spie sensing technology+ applications. 2015, pp. 94860q-94860q-11. [14] r. c. taylor, "an overview of the hadoop/mapreduce/hbase framework and its current applications in bioinformatics", bmc bioinformatics, vol. 11, p. s1, 2010. [15] y. liang, w. melvin, s. i. sritharan, s. fernandes, and d. barker, "cma-ht: a crowd motion analysis framework based on heat-transfer analog model", in proceedings of the spie defense, security, and sensing. 2012, pp. 84020j-84020j-13. [16] y. liang, z. shi, s. i. sritharan, and h. wan, "simulation of the spread of epidemic disease using persistent surveillance data", in proceeding of comsol 2010, boston, 2010. [17] k. langendoen and n. reijers, "distributed localization in wireless sensor networks: a quantitative comparison", comput. netw., vol. 43, pp. 499-518, 2003. [18] h. qi and j. b. moore, "direct kalman filtering approach for gps/ins integration", ieee trans. aerosp. electron. syst., vol. 38, pp. 687-693, 2002. [19] g. a.
bekey, autonomous robots: from biological inspiration to implementation and control. mit press, 2005. [20] g. anastasi, m. conti, m. di francesco, and a. passarella, "energy conservation in wireless sensor networks: a survey", ad hoc networks, vol. 7, pp. 537-568, 2009. [21] t. s. rappaport, wireless communications: principles and practice. vol. 2: prentice hall, 2002. [22] j. n. al-karaki and a. e. kamal, "routing techniques in wireless sensor networks: a survey", ieee wireless commun., vol. 11, pp. 6-28, 2004. [23] y. liang, j. weston, and m. szularz, "generalized least-squares polynomial preconditioners for symmetric indefinite linear equations", parallel. comput., vol. 28, pp. 323-341, 2002. [24] j. sanderson and y. liang, "no-reference image quality measurement for low-resolution images", in spie defense, security, and sensing. 2013, pp. 874404-874404-17. [25] w. ren and r. w. beard, distributed consensus in multi-vehicle cooperative control. springer, 2008. [26] s. camazine, self-organization in biological systems. princeton university press, 2003. [27] i. d. couzin, j. krause, r. james, g. d. ruxton, and n. r. franks, "collective memory and spatial sorting in animal groups", j. theor. biol., vol. 218, pp. 1-11, 2002. [28] a. deutsch and s. dormann, cellular automaton modeling of biological pattern formation. 2005. [29] a. huth and c. wissel, "the simulation of the movement of fish schools", j. theor. biol., vol. 156, pp. 365-385, 1992. [30] p. b. johnsen and a. d. hasler, "winter aggregations of carp (cyprinus carpio) as revealed by ultrasonic tracking", trans. am. fish. soc., vol. 106, pp. 556-559, 1977. [31] r. jullien and r. botet, aggregation and fractal aggregates. world scientific pub co inc, 1987. [32] s. stöcker, "models for tuna school formation", math. biosci., vol. 156, pp. 167-190, 1999. [33] t. vicsek, a. czirók, e. ben-jacob, i. cohen, and o.
shochet, "novel type of phase transition in a system of self-driven particles", phys. rev. lett., vol. 75, p. 1226, 1995. [34] m. bando, k. hasebe, a. nakayama, a. shibata, and y. sugiyama, "dynamical model of traffic congestion and numerical simulation", phys. rev. e: stat., nonlinear, soft matter phys., vol. 51, p. 1035, 1995. [35] m. j. lighthill and g. b. whitham, "on kinematic waves. ii. a theory of traffic flow on long crowded roads", in proceedings of the royal society of london a: mathematical, physical and engineering sciences. 1955, pp. 317-345. [36] h. m. zhang, "a mathematical theory of traffic hysteresis", transport. res. b-meth., vol. 33, pp. 1-23, 1999. [37] a. c. r. c. committee, "asian carp control strategy framework", 2013. [38] e. h. buck, h. f. upton, c. v. stern, and j. e. nicols, "asian carp and the great lakes region", 2010. [39] j. rosenfeld, "assessing the habitat requirements of stream fishes: an overview and evaluation of different approaches", trans. am. fish. soc., vol. 132, pp. 953-968, 2003. [40] r. naylor, s. williams, and d. strong, "aquaculture-a gateway for exotic species", science (wash.), vol. 294, pp. 1655-1656, 2001. [41] r. goldburg, m. s. elliott, r. naylor, and p. o. commission, marine aquaculture in the united states: environmental impacts and policy options. pew oceans commission, 2001. [42] t. m. koel, k. s. irons, and e. n. ratcliff, "asian carp invasion of the upper mississippi river system", us department of the interior, us geological survey, upper midwest environmental sciences center, 2000. [43] p. bajer, c. chizinski, and p. sorensen, "using the judas technique to locate and remove wintertime aggregations of invasive common carp", fish. manage. ecol., vol. 18, pp. 497-505, 2011. [44] a. r. leach, molecular modelling: principles and applications. prentice hall, 2001. [45] a. czirók, m. vicsek, and t. vicsek, "collective motion of organisms in three dimensions", physica. a., vol. 264, pp. 299-304, 1999. [46] c. w. 
reynolds, "flocks, herds and schools: a distributed behavioral model", in acm siggraph computer graphics. 1987, pp. 25-34. [47] a. aw and m. rascle, "resurrection of "second order" models of traffic flow", siam journal on applied mathematics, vol. 60, pp. 916-938, 2000. [48] y. liang, the use of parallel polynomial preconditioners: in the solution of systems of linear equations. lap lambert academic publishing, 2013. [49] a. toselli and o. widlund, domain decomposition methods: algorithms and theory. vol. 3: springer, 2005. [50] s. theodoridis, machine learning: a bayesian and optimization perspective. academic press, 2015.

facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp. 1-25 doi: 10.2298/fuee1701001b

microelectronic reliability models for more than moore nanotechnology products

alain bensoussan
institute of technology antoine de saint exupery, toulouse, france

abstract. disruptive technologies face a lack of reliability engineering standards and physics of failure (pof) heritage. devices based on gan, sic, optoelectronics, deep-submicron nanotechnologies or 3d packaging techniques, for example, suffer from a critical absence of screening methods and of qualification and reliability standards when they are intended for use in hi-rel applications. to prepare the hi-rel industry for just-in-time cots, reliability engineers must define proper and improved models to guarantee infant-mortality-free, long-term robust equipment that is capable of surviving harsh environments without failure. furthermore, time-to-market constraints require the shortest possible time for qualification. breakthrough technologies are generally industrialized for short-life consumer applications (typically smartphones or new pcs with less than 3 years lifecycle). how shall we qualify these innovative technologies for long-term hi-rel equipment operation?
more than moore law is the paradigm of updating what are now obsolete, inadequate screening methods, reliability models and standards to meet these demands. a state-of-the-art overview of quality assurance, reliability standards and test methods is presented in order to question how they must be adapted, harmonized and rearranged. here, we quantify failure rate models formulated for multiple loads and incorporating multiple failure mechanisms, in order to disentangle existing reliability models and fit the needs of industry 4.0.

key words: reliability, gan, sic, dsm, nanotechnology, more than moore.

received may 18, 2016
corresponding author: alain bensoussan, irt saint exupery, 118 route de narbonne cs 44248, 31432 toulouse cedex 4, france (e-mail: alain.bensoussan@irt-saintexupery.com)

1. introduction

hi-rel embedded system applications in aeronautics, space, railways, nuclear and telecommunications rely on reliability engineering standards [1] [2] related to physics of failure (pof) [3]. when systems are built on innovative and disruptive technologies, such standards and methods are in general obsolete and inadequate to prepare their industrialization and qualification for just-in-time commercialization. the suggested probabilistic design for reliability (pdfr) [4] and prognostic health monitoring (phm) [5] concepts open the door to anticipating and assessing their reliability and quantification. reliability predictions such as remaining useful life (rul), failure rate and acceleration factors are mathematical tools related to pof, describing macroscopic changes in materials and devices having their own microscopic behavior. indeed, statistics help to predict the behavior of a population but are unable to predict the performance of a single item of this population.
this is exactly what ludwig boltzmann (1844-1906) [6] did when he gave a new perception of the universe on the microscopic scale in the kinetic theory: a macroscopic state corresponds to some probability distribution of possible microstates.

section 1 of this paper will review existing standards and clarify some routes to implement and generalize existing jedec or mil reliability standards. these standard methods develop failure mechanism models and their associated activation energies or acceleration factors that may be used in making system failure rate estimations. for large scale integration processes in the nanoscale range (now below 10 nm) used for microcontrollers or pc chips, the physics of interaction, the temperature distributions and the critical path for signal processing are extremely variable. the average value of the apparent activation energies of the various failure mechanisms cannot be exploited because a) different failure mechanisms have different weighting factors and affect each portion of an ic differently and b) the apparent activation energy values affect the acceleration factor exponentially rather than linearly. section 2 will detail accelerated stress models as presented in well-established jedec documents before recalling the multiple stress boltzmann-arrhenius-zhurkov (baz) reliability model [7], [8], which can also be considered as a development of the cox proportional hazards model [9]. we will establish that multiple failure mechanisms [10] must be considered for dsm nanotechnology nodes and will show how the htol reliability model elaborated by j. bernstein [11] [12] can support a more robust, easy-to-use theory. section 3 will show how a multi-dimensional tool named m-storm (multi-physics multi-stressors predictive reliability model) [13] can be implemented in a concrete situation for deep-submicron process devices, highlighting the remaining steps to be carried out for a complete tool release.

2. quality standard overview

well-known quality standards in various industry domains rely on or are close to military standards (mil-std) and jedec methods. now that we are entering the industry 4.0 paradigm, the fourth industrial revolution (the age of cyber and robots), quality/reliability models and tools headed by health monitoring (hm) lead toward more crucial and vital questions. this section is not intended to be an exhaustive cookbook; rather, it highlights which generic approaches and hypotheses are considered to assure the quality of products and equipment, and how to build reliability into products dynamically. the term “dynamically” means that hardware and software must be designed to pre-identify and characterize system degradation while still in operating condition. diagnosing the health of a system in order to anticipate failure requires new approaches: designing dedicated hardware and software installed within the system itself, and defining procedures and tests which will decide self-corrections at hardware and/or software level (artificial intelligence). this requires a high level of intelligence integration within a system or a product, and this is the challenge of the 4.0 era.

jedec or mil standards are generally based on the principle of separating the variables and considering a single stress at a time and a single failure mode and mechanism at a time. a failure mechanism may be characterized by how a degradation process proceeds, including the driving force, e.g., oxidation, diffusion, electric field, current density. when the driving force is known, a mechanism may be described by an explicit failure rate model; identifying that model with its associated parameters is the main objective.
the existing technologies, and likewise highly critical innovative technologies, oblige design engineers to quantify those driving forces while considering multiple internal stress parameters that induce interfering stress settings (current, voltage, power and temperature) and loads (dc and ac, and environmental loads such as thermal cycling, radiation, electrostatic discharge (esd), electrical over-stress (eos), energetic electromagnetic pulse, etc.).

2.1. european standards

as an example, the european cooperation for space standardization (ecss) (www.ecss.nl) is an initiative established to develop a coherent system of european space standards. the ecss standardization policy develops a documentation architecture with three branches (project management, product assurance and engineering) to overcome issues due to the existing standards, which result in higher costs, lower effectiveness and a less competitive industry. the framework and basic rules of the system were defined with the involvement of the european space industry. a short overview of the main system documentation is presented here with the intention of showing how, when and where the quality assurance requirements affect the electronic parts supply chain for long-term space missions in harsh environments. most space product assurance documents are constructed to guarantee final customers' and operators' satisfaction for satellite mission durations greater than 18 years without repair. most of them rely on well-established technologies and products and avoid innovative products. the ecss-q-st-60c [2] standard defines the requirements for selection, control, procurement and usage of electronic, electrical, and electromechanical (eee) components for space projects, considering the characteristics of the space environment. once selected, parts must be integrated into the system based on best design practices.
the “space product assurance derating eee components” standard, ecss-q-st-30-11c [14], specifies electrical derating requirements applicable to eee components. derating is a long-standing practice applied to components used on spacecraft. cots microcontrollers and core ic chips produced in nanoscale technology now integrate 1 billion transistors (below the 10 nm node) on a single chip with cache memory, i/o accesses, cpu, flash and ddr memory, all biased at low voltage (below 1 v) and accessed at increasing clock frequency (a few ghz). as derating is under the control of the designers and manufacturers of nanoscale devices, and given the tremendous increase of system capability, big data management, world-wide telecommunication and the internet of things, the space industry must collaborate with them or impose new design rules if it wants to use such innovative technologies. at another scale, new packaging and connection techniques must also be considered. the ecss-q-st-70-08c [15] standard, “space product assurance manual soldering of high-reliability electrical connections”, defines the technical requirements and quality assurance provisions for the manufacture and verification of manually-soldered, high-reliability electrical connections. for temperatures outside a normal range (−55°c to +85°c), special design, verification and qualification testing is performed to ensure the necessary environmental survival capability. packaging and assembly reliability models must be improved too when additive manufacturing techniques and new materials for high power dissipation are used. the document “commercial electrical, electronic and electromechanical (eee) components”, ecss-q-st-60-13c [16], applies only to commercial components which meet technical parameters that are demonstrated, at the system application level, to be unachievable with existing space components or only achievable with qualitative and quantitative penalties.
all of these normative documents, such as the ecss and escc standards, are generally based on mil-std and jedec test methods. the determination of component and system failures has been extensively described in handbooks and tools, but all of them are now mostly obsolete with respect to the emerging technologies proposed on the cots market. they are unable to predict and quantify the reliability of new products that have a short life cycle and are complex and technically highly sophisticated.

2.2. standards and handbooks

for eee parts, the at&t reliability manual [17] is more than just a prediction methodology. although it contains component failure data, it outlines prediction models based on a decreasing hazard rate model, which is modeled using weibull data. fides [18] is a new reliability data handbook (available since january 2004). the fides guide is a global methodology for reliability engineering in electronics, developed by a consortium of french industry under the supervision of the french dod (dga). the important fact is that the fides evaluation model proposes a reliability prediction with constant failure rates. the infant mortality and wear-out periods are currently excluded from the prediction. the iec 62380 electronic reliability prediction supports methods based on the latest european reliability prediction standard. it was originally the rdf 2000 (ute c 80810, iec-62380-tr ed.1) [19] from the cnet handbook, previously published as rdf93, and covers most of the same components as mil-hdbk-217. mil-hdbk-217 [1], reliability prediction of electronic equipment, has been the mainstay of reliability predictions for about 40 years, but it has not been updated since 1995. the siemens sn29500 [20] failure rates of components and expected values method was developed by siemens ag for use by siemens associates as a uniform basis for reliability prediction.
the telcordia sr-332 document [21], “reliability prediction procedure for electronic equipment”, recommends methods for predicting device and unit hardware reliability. this procedure is applicable to commercial electronic products whose physical design, manufacture, installation, and reliability assurance practices meet the appropriate telcordia (or equivalent) generic and product-specific requirements. in july 2006, riac released 217plus™ [22] as the successor to the dod-funded, defense technical information center (dtic)-sponsored version 1.5 of the prism® software tool. the rac electronic parts reliability data (eprd) handbook database is the same as that previously used to support mil-hdbk-217, and is supported by prism®. the models provided differ from those within mil-hdbk-217. the prism software is available from the reliability analysis center [23]. the models contain failure rate factors that account for operating periods, non-operating periods and cycling. traditional methods of reliability prediction model development have relied on the statistical analysis of empirical field failure rate data. the new riac approach is predicated on component models that combine additive and multiplicative model forms to predict a separate failure rate for each class of failure mechanism.
a typical example of a general failure rate model that takes this form is:
$$\lambda_P = \lambda_O \pi_O + \lambda_E \pi_E + \lambda_C \pi_C + \lambda_I + \lambda_{SJ} \pi_{SJ} \qquad (1)$$
where:
$\lambda_P$ = predicted failure rate
$\lambda_O$ = failure rate from operational stresses
$\pi_O$ = product of failure rate multipliers for operational stresses
$\lambda_E$ = failure rate from environmental stresses
$\pi_E$ = product of failure rate multipliers for environmental stresses
$\lambda_C$ = failure rate from power or temperature cycling stresses
$\pi_C$ = product of failure rate multipliers for cycling stresses
$\lambda_I$ = failure rate from induced stresses, including electrical overstress and esd
$\lambda_{SJ}$ = failure rate from solder joints
$\pi_{SJ}$ = product of failure rate multipliers for solder joint stresses
one can note that part-count prediction assumes a “constant failure rate per part”, expressed as a linear combination (+ and ×) of λ factors and specific π factors. the failure rate is, for a stated period of the life of an item, the ratio of the total number of failures in a sample to the cumulative time of that sample. a consistent framework for reliability qualification using the physics-of-failure (pof) concept is provided by the jedec jep148 procedure [24]. the pof concept [25] is an approach to the design and development of reliable products that prevents failure based on knowledge of the root-cause failure processes. it is based on understanding:
• the relationships between requirements and the physical characteristics of the product (and their variation in the production process),
• the interactions of product materials with loads (stresses at application conditions) and their influence on product reliability with respect to the use conditions.
2.3. discussion
reliability engineering and its mathematics have been presented many times; see for example the detailed treatment by e. suhir in his book “reliability applied probability for engineers and scientists”, mcgraw-hill [26].
reliability engineering of objects studies the properties of complex elements that do not lend themselves to any restoration (repair) and have to be replaced after the first failure. their reliability is entirely determined by their dependability. this property is measured by the probability that a device or a system will perform a required function under stated conditions for a stated period of time. as suhir explains, this involves three major concepts:
1. probability: the performance of a group of devices in a system is described as a failure rate. such an overall statistic has no meaning for an individual device.
2. definition of a “reliability function”: for a device, a failure is relatively easy to define, based on a guaranteed performance which can be measured. for a system, this concept is rather elusive and harder to set, since it is based on customer satisfaction.
3. time: what is “time” in defining reliability? there may be many critical time periods at component, equipment or system level, but the reliability for each critical time period can be determined in appropriate terms.
the standards listed in section 2.2 generally relate to items such as parts and system hardware functions, and are based on a constant failure rate. they consider that the elements of interest have been manufactured and screened efficiently and operate in a given environment, and they assume that the wearout failure rate appears well beyond the operating end of life (eol). the next sections of this paper show how these hypotheses must be reexamined for present and future applications based on new technologies, but also on existing ones such as the deep sub-micron nanotechnologies already used for asics, fpgas or memories. the book by p. a. tobias and d. c. trindade [27], “applied reliability” (3rd edition), is an extensive and powerful document presenting the mathematics, methods and statistical software that help reliability engineers address applied industrial reliability problems.
even once statistical life distribution models have been developed, reliability prediction and quantification for emerging technologies is somewhat like looking into a fuzzy crystal ball. a reasonable set of data cannot be obtained from short endurance stress tests and then extrapolated to approximate the behavior under normal use conditions. what is a product likely to experience at a much lower stress, knowing its failure rate at a higher stress? the models used to bridge the stress gap are known as acceleration models, but they are constructed and grounded on several hypotheses:
• lot homogeneity and reproducibility: it is assumed that the components under stress are manufactured from a homogeneous lot, with no major change in manufacturing technology,
• stress effects are representative, homogeneous and reproducible,
• failure mechanism duplication: the mechanism is independent of the level of stress, and reproducible,
• the failure rate of a device is independent of time. this is the usual, but often very inappropriate, assumption in conventional reliability-prediction methods.
• linear acceleration: when every time to failure and every distribution percentile is multiplied by the same acceleration factor to obtain the projected values at another operating stress, we say we have linear acceleration [27].
• temperature effect governed by the arrhenius law: “things happen faster at high temperature”. lower temperatures may not necessarily increase reliability [10] [5], since some failure mechanisms are accelerated at lower temperature, as seen for example for hot carrier degradation mechanisms. quality standards and prediction tools generally focus only on high temperature acceleration models.
• multiplicity: multiple stresses (loads) and multiple failure mechanisms act at the same time (cf. discussion in sections 3 and 4).
• pof signature: the activation energy is determined from experiments based on catastrophic degradation or on the drift of an electrical parameter (a predictor).
• temperature definition: an accurate and agreed definition of temperature must be at the core of any reliability prediction tool based on thermally accelerated testing, since electronic equipment is designed on the premise that its reliability is affected by temperature.
“influence of temperature on microelectronics and system reliability”, published by p. lall, m. pecht and e. b. hakim in 1997 [28], discussed various modelling methodologies for the temperature acceleration of microelectronic device failures. mil-hdbk-217, fides and the jedec standards have the advantage of describing such models, but they are mostly not adapted to breakthrough and immature new technologies. how can reliability be quantified for disruptive technologies? it is known that a) multiple failure mechanisms are in competition, b) activation energies are parameters determined experimentally, c) they are derived from accelerated tests carried out at extreme temperatures (both high and low), and d) they are supposed to be constant but are in fact modified by stress conditions. the physics of failure (pof) methodology is the alternative approach suggested in the mid-1990s by the u.s. cadmp alliance, now known as the electronic components alliance [5]. problems arise when the failure mechanisms precipitated at accelerated stress levels are not activated in the equipment operating range, as highlighted by lall, pecht and hakim [28]. since 2010, we have defined a generalized multiple-stress reliability model, and e. suhir has published a comprehensive model called the boltzmann-arrhenius-zhurkov (baz) model [7], [8], [29], [30]. the premises of this model were addressed by d. cox [9] in the journal of the royal statistical society in 1972.
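as an illustration of the arrhenius and linear-acceleration hypotheses listed above, the following python sketch computes the acceleration factor between a stress temperature and a use temperature and rescales a time to failure by that single factor. the function names, the activation energy of 0.7 ev and the temperatures are assumptions chosen for illustration only; they are not taken from the standards cited here.

```python
import math

K_B_EV = 8.6174e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor between a stress and a use temperature (Arrhenius law)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / K_B_EV) * (1.0 / t_use_k - 1.0 / t_stress_k))


def time_at_use(ttf_stress_h, af):
    """Linear acceleration: every time to failure is multiplied by the same factor."""
    return ttf_stress_h * af


if __name__ == "__main__":
    af = arrhenius_af(ea_ev=0.7, t_use_c=55.0, t_stress_c=125.0)
    print(f"acceleration factor: {af:.1f}")
    print(f"1000 h at stress ~ {time_at_use(1000.0, af):.0f} h at use")
    # A negative effective activation energy (as reported for hot carrier
    # degradation) gives AF < 1: the higher temperature is then the milder one.
    print(f"AF with Ea = -0.1 eV: {arrhenius_af(-0.1, 55.0, 125.0):.3f}")
```

the last call shows why a tool that only models high temperature acceleration can be misleading: with a negative effective activation energy the same formula predicts that heating slows the mechanism down.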
over the last decade, two advanced probabilistic design-for-reliability (pdfr) concepts have been addressed in application to the prediction of the reliability of aerospace electronics: 1) the boltzmann-arrhenius-zhurkov (baz) model, which, in combination with the exponential law of reliability, can be used to predict the probability of non-failure, and 2) the extreme value distribution (evd) technique, which can be used to predict the number of repetitive loadings that closes the gap between the capacity (stress-free activation energy) of a material (device) and the demand (loading), thereby leading to a failure. the second concern illustrated by the previous discussion is related to multiple failure mechanisms being in competition. the monograph and papers published since 2008 by prof. j. bernstein [11], [25], [31] quite precisely define the context and the modified m-htol [12] approach, whose development is part of sections 3 and 4 below.
3. reliability mathematics and tools
many books and papers define the basic concepts of reliability, and particularly of reliability prediction analysis, such as fmeca (failure modes, effects and criticality analysis), rbd (reliability block diagram) or fault tree analysis. in reliability engineering and reliability studies, the general convention is to deal with unreliability and unavailability values rather than reliability and availability (see for example http://www.reliabilityeducation.com/):
• the reliability $R(t)$ of a part or system is defined as the probability that the part or system remains operating from time $t_0$ to $t_1$, given that it was operating at $t_0$.
• the availability $A(t)$ of a part or system is defined as the probability that the component or system is operating at time $t_1$, given that it was operating at time $t_0$.
• the unavailability $Q(t)$ of a part or system is defined as the probability that the component or system is not operating at time $t_1$, given that it was operating at $t_0$.
hence,
$$R(t) + F(t) = 1 \ \text{ or unreliability } \ F(t) = 1 - R(t), \quad \text{and} \quad A(t) + Q(t) = 1 \qquad (2)$$
figure 1 shows the schematic representation of the failure distribution functions. the instantaneous failure rate (ifr), also named the hazard rate $\lambda(t)$, is the ratio of the number of failures during the time period $\Delta t$, for the devices that were healthy at the beginning of testing (operation), to the time period $\Delta t$:
$$\lambda(t) = \frac{f(t)}{1 - F(t)} \qquad (3)$$
fig. 1 instantaneous failure rate, probability density function and reliability distribution functions
the cumulative probability distribution function $F(t)$ for the probability of failure is related to the probability density function $f(t)$ as
$$F(t) = \int_0^t f(x)\,dx \qquad (4)$$
and the reliability function $R(t)$, the probability of non-failure, is defined as
$$R(t) = 1 - F(t) \qquad (5)$$
failure rates are often expressed in failure units (fits): 1 fit = 1 failure in $10^9$ device-hours. probability data obtained when performing accelerated tests (halt or foat) can be modeled by various distribution models, such as the exponential law, the weibull law, normal or log-normal distributions, etc. in most practical applications, life is a function of more than one or two variables (stress types). the next important question is how to consider and relate the reliability figures when applying stresses other than temperature, such as thermal cycling or radiation. in the jedec publication jep122g, reliability models such as electromigration [32], ohmic contact degradation [33] [34], coffin-manson [35], eyring [36], humidity [37], time dependent dielectric breakdown (tddb) [38], hot carrier injection [39] [40] [41], hydrogen poisoning [42] [43], thermo-mechanical stress [44] and nbti [45] are generally expressed as a function of a stress parameter, or of an electrical predictor, multiplying the exponential activation energy factor.
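relations (3) to (5) and the fit unit can be checked numerically. the sketch below assumes a two-parameter weibull life distribution (one of the distribution models named above); the shape and scale values are arbitrary illustration values, not measured data, and the function names are hypothetical.

```python
import math


def weibull_cdf(t, beta, eta):
    """F(t): probability of failure before time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def weibull_pdf(t, beta, eta):
    """f(t): probability density of the time to failure."""
    return (beta / eta) * (t / eta) ** (beta - 1.0) * math.exp(-((t / eta) ** beta))


def hazard(t, beta, eta):
    """Instantaneous failure rate f(t) / (1 - F(t)), in failures per hour."""
    return weibull_pdf(t, beta, eta) / (1.0 - weibull_cdf(t, beta, eta))


def to_fit(rate_per_hour):
    """Convert a failure rate in 1/h to FIT (failures per 1e9 device-hours)."""
    return rate_per_hour * 1e9


if __name__ == "__main__":
    eta = 1.0e6  # characteristic life in hours (assumed)
    # shape < 1: decreasing rate, shape = 1: constant rate, shape > 1: wearout
    for beta in (0.5, 1.0, 3.0):
        rates = [to_fit(hazard(t, beta, eta)) for t in (1e3, 1e4, 1e5)]
        print(beta, [f"{r:.1f}" for r in rates])
```

with a shape parameter of 1 the hazard rate is constant and equal to the inverse of the scale parameter, which is the exponential law assumed by constant-failure-rate handbooks.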
referring to stress parameters as either stressors or electrical predictors may sometimes be confusing, because the first concept (stressors) indicates how high or low the gibbs free energy barrier to cross is, while the second concept (predictors) indicates how fast the device will cross that barrier. any generalization of the existing models must reconcile this apparent antagonism by using precise definitions and effects. in general this has not been addressed in the major published papers. the reader will see in the next paragraph how such confusion is handled. all studies consider that the activation energy is deduced experimentally as a constant with respect to temperature (low vs high), stress conditions, and other predictors, as for example charge de-trapping for hot carrier degradation or nbti for pmos devices under negative gate voltages at elevated temperature. these models are generally applicable to a given technology. some end-users and customers even focus on qualifying a production lot rather than a process. there is a need to simplify the forest of existing models. is it possible to harmonize the mathematics of the existing paradigm? the first consideration is to define precisely the elements and the role of each parameter, separating the thermodynamics (activation energy, gibbs free energy) from the stressor and predictor parameters and their effects in failure mechanisms.
3.1. reliability standards and accelerated stress models
formally, on the one hand, the activation energy is related to a single, pure temperature effect and disregards other stress parameters. on the other hand, in practice it is defined as an effective activation energy, mostly modified by the several other types of stress applied and by the failure mechanisms considered.
in steady-state temperature stress tests, temperature is considered the only stress parameter affecting reliability, and the degradation observed is typically time-dependent and temperature-related. failure mechanisms may or may not be thermally activated, and can be either catastrophic or parametric (drift of characteristics). a sudden catastrophic failure can be caused by electrical overstress, in which case it is called burnout, or by a high electric field inducing catastrophic breakdown. breakdown and burnout limits are also temperature dependent. as a consequence, it is reasonable to consider that the same failure mechanism can be induced by anything from a pure thermal stress to a pure electrical stress: in this case any intermediate condition between these two extremes is modeled by a pure arrhenius activation energy modified by a factor depending on the stresses applied. this postulate justifies the boltzmann-arrhenius-zhurkov (baz) model presented in section 3.2. the idealized experimental bathtub curve of a material or a device shown in figure 2 exhibits the combined effect of the statistics-related and reliability-physics-related processes. in the analysis developed by suhir [46], a probabilistic predictive model (ppm) is developed for the evaluation of the failure rates and the probabilities of non-failure. here we draw a synthesized view of how some concepts can be clarified for a comprehensive harmonization of the existing reliability models of failure mechanisms:
• internal electrical stresses, labelled stressor parameters, are responsible for the wearout failure rate (weibull β greater than 1). they are of only four types of applied and imposed stress conditions: voltage, current, dissipated power, and input signal or esd/eos/emc energies, and they can be either static, dynamic, transient or surge. they are quantified by comparing the level of stress applied to the level that causes the instantaneous burnout failure mode.
but for the sake of standardization and normalization they are limited by the maximum values allowed by the technology.
• when the device operates under external stress (thermal management constraints, packaging and assembly constraints, atmospheric contaminants, radiation environments), the levels of such stressor parameters are modified with respect to their maximum burnout and breakdown limits, thus accelerating wearout failures compared to temperature and bias stress applied in the absence of external environmental stress.
• failure modes of interest are the electrical or mechanical signatures of the failure mechanisms observed, and are the predictor parameters. such parameters can be measured as the absolute drift of an electrical parameter or as a relative percentage of drift.
fig. 2 bathtub curve. weibull distribution with two parameters (shape and time).
• constant (random) failure rates are caused by random defects and random events. the failure rate is modeled by a weibull shape parameter close to 1, which is equivalent to an exponential distribution law.
• lot-to-lot production variation (respectively device-to-device variation) and the performance dispersion within a single manufacturing lot (respectively device) affect the burnout limits, inducing in return a change in the percentage of stress applied to a given lot (device). statistical dispersion affects the time to failure in a similar way (producing the same statistical effect). such dispersion at lot and device level impacts the remaining useful life (rul) for some part of the population.
• infant mortality failures are caused by “defects” and correlate with defect-related yield loss. they are reduced by improved manufacturing quality and by screening.
3.2. baz model and transition state theory accelerated stresses
design for reliability (dfr) is a set of approaches, methods and best practices that are supposed to be used at the design stage of the product to minimize the risk that it might not meet the reliability requirements, objectives and expectations. these considerations have been the basis of the generalized baz model mentioned in section 2, constructed from zhurkov's 1965 solid-state physics model [47], which is a generalization of arrhenius' 1889 chemical kinetics model [48], which is, in its turn, a generalization of boltzmann's 1886 model (“boltzmann statistics”) [49] in the kinetic theory of gases. the paradigm of the transition state theory (tst), developed by e. wigner in 1934 [50] and by m. evans and m. polanyi in 1938 [51], is viewed as the equivalent approach we apply to the concept of a unified semiconductor reliability model. the arrhenius equation relates the rate $r$ of the transition from a reactant in state a to a product in state b to the temperature and the activation energy, as also modeled by transition state theory. the probability that the particular energy level $U$ is exceeded has been expressed in boltzmann's theory of gases: ( ) (6) and a total distribution is found to be: ∫ ( ) (7) this function defines the probability $P$ that the energy of a defect exceeds the activation energy, which can be assessed as the ratio of the time constant $\tau_0$ to the lifetime $\tau$:
$$P = \frac{\tau_0}{\tau} = \exp\left(-\frac{E_a}{kT}\right) \qquad (8)$$
figure 3.a shows a schematic drawing of the principle of the transition state theory, which represents the amount of free energy δgǂ required to allow a chemical reaction to occur from an initial state to a final state.
if the chemical reaction is accelerated by a catalyst effect, the energy height δgǂ is reduced, allowing the transition initial state → final state to occur with a lower energy barrier to cross. in transition state theory with a catalyst effect it is possible to obtain a negative effective activation energy (shown in figure 3.b), as observed for example for the hci failure mechanism: hot carrier injection induced effects are exaggerated at lower temperatures, demonstrating clearly negative effective activation energies.
fig. 3.a transition state theory principle diagram
fig. 3.b with catalyst effect, with negative ea, for the hci failure mechanism
the boltzmann-arrhenius-zhurkov (baz) model [8] determines the lifetime $\tau$ of a material or a device experiencing the combined action of an elevated temperature and an external stress:
$$\tau = \tau_0 \exp\left(\frac{E_a - \gamma \cdot S}{k \cdot T}\right) \qquad (9)$$
where $S$ is the applied stress (it can be any stimulus or group of stimuli, such as voltage, current, signal input, etc.), $T$ is the absolute temperature, $\gamma$ is a loading factor characterizing the role of the level of stress (the product $\gamma \cdot S$ is the stress per unit volume and is measured in the same units as the activation energy $E_a$), and $k$ is boltzmann's constant (1.3807 × 10^−23 j/k or 8.6174 × 10^−5 ev/k). the generalized baz model proceeds from the rationale that the damage process is temperature dependent, but is due primarily to the accumulation of damage resulting from loading above the threshold stress level. each level of stress is characterized by the corresponding term $\gamma \cdot S$ normalized by the term $k \cdot T$, thereby defining the relationship between the elevated temperature and the energy contained in an elementary volume of the material or the active zone of a device. in recent papers [52] [53], e. suhir et al. presented the substance of the multi-parametric baz model, in which the lifetime $\tau$ of the baz model is viewed as the mttf.
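a minimal numerical sketch of the baz lifetime may help to see how the stress term lowers the barrier. it assumes the form τ = τ0·exp((ea − γ·s)/(k·t)) with a single stressor; the function name and all parameter values (time constant, activation energy, loading factor, voltages) are hypothetical illustration values.

```python
import math

K_B_EV = 8.6174e-5  # Boltzmann constant in eV/K, as quoted in the text


def baz_lifetime(tau0, ea_ev, gamma, stress, temp_k):
    """Lifetime tau = tau0 * exp((Ea - gamma * S) / (k * T)).

    gamma * stress must be expressed in eV, like ea_ev.
    """
    return tau0 * math.exp((ea_ev - gamma * stress) / (K_B_EV * temp_k))


if __name__ == "__main__":
    tau0 = 1.0e-3   # time constant in hours (assumed)
    ea = 0.7        # stress-free activation energy in eV (assumed)
    gamma = 0.05    # loading factor in eV per volt (assumed)
    for volts in (0.0, 2.0, 4.0):
        tau = baz_lifetime(tau0, ea, gamma, volts, temp_k=398.15)
        print(f"S = {volts:.1f} V -> tau = {tau:.3e} h")
```

with zero stress the expression reduces to the plain arrhenius lifetime, and when γ·s equals ea the exponent vanishes and the lifetime collapses to the time constant τ0.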
the failure rate of a system described by the baz equation is found as:
$$\lambda = \frac{1}{\tau} = \frac{1}{\tau_0} \exp\left(-\frac{E_a - \gamma \cdot S}{k \cdot T}\right) \qquad (10)$$
assuming that the probability of non-failure at the moment $t$ of time is
$$P = \exp(-\lambda t) \qquad (11)$$
this formula is known as the exponential formula of reliability. if the probability of non-failure $P$ is established for the given time $t$ in operation, then the exponential formula of reliability can be used to determine the acceptable failure rate. such an assumption suggests that the mttf corresponds to the moment of time when the entropy of this law reaches its maximum value. using the famous expression due to gibbs for the entropy, which was later used by shannon to define information [54], from the formula:
$$H(P) = -P \ln P \qquad (12)$$
we obtain that the maximum value of the entropy $H(P)$ is equal to $e^{-1} = 0.3679$. with this probability of non-failure, formula (11) yields:
$$t = \frac{1}{\lambda} = \tau_0 \exp\left(\frac{E_a - \gamma \cdot S}{k \cdot T}\right) \qquad (13)$$
comparing this result with equation (9), suhir concludes that the t50% or mttf expressed by this equation corresponds to the moment of time when the entropy of the time-dependent process $P = P(t)$ is the largest. let us elaborate on the substance of the multi-parametric baz model using the example of a product subjected to the combined action of multiple stressors $S_i$ (electrical stresses such as dc biasing current, voltage, power dissipation or dynamic input signal). let us assume that the wearout failure rate $\lambda_{WF}(t)$ of an electronic product, which characterizes the propensity of a material or a device to fail, is determined during testing or operation by the relative drift of an electrical predictor parameter $p$, the electrical signature of the failure mode of concern [55]. considering equation (10), one could seek the probability of non-failure of the material or the device in the form: [ ( ) ( ∑ )] (14) where $p_0$ is the value of the predictor parameter at time $t = 0$, and the sensitivity factors reflect the sensitivities of the device to the corresponding predictor and to each stressor $S_i$ respectively.
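the entropy argument used above to identify the mttf can be written out in two lines. the derivation below is a sketch that assumes the entropy takes the form H(P) = −P ln P, which is consistent with the maximum value e^−1 = 0.3679 quoted in the text, and combines it with the exponential law P = exp(−λt) and with λ = 1/τ.

```latex
\begin{aligned}
H(P) &= -P \ln P, \qquad
\frac{dH}{dP} = -\ln P - 1 = 0
\;\Rightarrow\; P = e^{-1}, \quad H_{\max} = e^{-1} \approx 0.3679, \\
P &= e^{-\lambda t} = e^{-1}
\;\Rightarrow\; t = \frac{1}{\lambda} = \tau
= \tau_0 \exp\!\left(\frac{E_a - \gamma S}{kT}\right).
\end{aligned}
```

the second derivative of H is −1/P, which is negative for every P in (0, 1), so the stationary point is indeed a maximum.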
the model can easily be made multi-parametric, i.e. generalized for as many stimuli as necessary [55]. the sensitivity factors must be determined experimentally. because of that, the structure of the multi-parametric baz model expressed by equation (14) should not be interpreted as a superposition of the effects of different stressors, but rather as a convenient and physically meaningful representation of the foat data. under such conditions the suggested approach is to determine the γ factors reflecting the sensitivities of the device to the corresponding stimuli (stressors). this will be detailed when considering the baz model derived from the transition state theory in the following section on multi-dimensional reliability models. one may note that equation (14) can be viewed as a cox proportional hazards model [9]. survival models consist of two parts: the underlying hazard function, denoted $\lambda_0(t)$, describing how the risk of an event per time unit changes over time at baseline levels of the covariates; and the effect parameters, denoted here $\beta$, describing how the hazard varies in response to explanatory covariates. the hazard function for the cox proportional hazards model has the form:
$$\lambda(t \mid x_i) = \lambda_0(t) \exp\left(\beta^{\mathsf{T}} x_i\right) \qquad (15)$$
this expression gives the hazard rate at time $t$ for subject $i$ with covariate vector (explanatory variables) $x_i$. that said, the cox model has one limitation as a reliability analysis method: for a part that is sound at time $t$, the failure probability during the interval $[t, t+dt]$ is related to the stress applied during this period $dt$ only, without taking into account the history of the stresses applied before $t$. this may be a limitation when modeling a stress that is not constant over time (e.g. a step stress test).
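the proportional hazards form can be sketched in a few lines of python. the code assumes the standard cox form, a baseline hazard multiplied by the exponential of a linear combination of covariates; the effect parameters and covariate values below are invented for illustration, and nothing here is fitted to data.

```python
import math


def cox_hazard(baseline_hazard, betas, covariates):
    """Hazard lambda0(t) * exp(sum_j beta_j * x_j) evaluated at one time point."""
    linear_predictor = sum(b * x for b, x in zip(betas, covariates))
    return baseline_hazard * math.exp(linear_predictor)


def hazard_ratio(betas, x_a, x_b):
    """Ratio of the hazards of two units; the baseline hazard cancels out."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(betas, x_a, x_b)))


if __name__ == "__main__":
    betas = [2.0, 0.5]       # effect parameters (assumed)
    stressed = [0.8, 1.0]    # e.g. normalized voltage and a cycling flag (assumed)
    nominal = [0.5, 0.0]
    for lam0 in (1e-9, 5e-8):  # baseline hazard at two different times (assumed)
        ratio = cox_hazard(lam0, betas, stressed) / cox_hazard(lam0, betas, nominal)
        print(f"lambda0 = {lam0:.1e} -> ratio = {ratio:.3f}")
    print(f"closed form ratio = {hazard_ratio(betas, stressed, nominal):.3f}")
```

the ratio printed for both baseline values is identical: the model only sees the covariates in force at the current instant, which is exactly the memoryless behavior identified above as a limitation for step stress tests.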
the proportional hazards (ph cox) model can be generalized (gph) by assuming that at any moment the ratio of hazard rates depends not only on the values of the covariates but also on the resources used until this moment. the application of the pdfr concept, and particularly of the multi-parametric baz model, enables one to improve dramatically the state of the art in the field of microelectronic product reliability prediction and assurance.
4. multi-dimensional reliability models
as seen in sections 2 and 3, existing quality standards consider stress tests and the related pof mechanisms without entanglement. device failure rates are seen as the sum of each existing failure rate taken individually. the bathtub curve is an idealized view of the instantaneous failure rate scenario generally considered in the well-known mil, jedec or telcordia standards. the multidimensional approach of the boltzmann-arrhenius-zhurkov (baz) reliability model and the multiple-mechanism htol (high temperature operating life test) model proposed by j. bernstein are now discussed, with the intent of generalizing them into an easy-to-use way to quantify and predict the probability of failure of new products and technologies.
4.1. multiple stressors and predictors
the baseline of the model rests on concepts taken from the transition state theory: the health of a population of devices evolves with time and with the stresses applied. the first concept is that a device, or a homogeneous lot consisting of a population of “identical” devices, must fail after an observed time due to aging, either under operation or under storage conditions. the statistics of this behavior are related to the entropy evolution of such a population. the transformation from a sound item to a failed one is similar to what is described in the transition state theory, which likewise considers a system of reactants combining into a new system of products when energy is provided to the system.
stressor definition and normalization
in a similar way, a population of devices submitted to heating alone will only degrade continuously up to malfunction and failure. but when a high (or low) temperature is combined with adequate stressors, the time to failure of such a population is reduced. the term “stressors” is defined here as the electrical factors applied to the device of concern. stressors are all limited by technology boundaries, defined by the burnout value of each related electrical parameter (breakdown voltage, current overstress and burnout, power burnout, input signal overstress). these stressors can be normalized with respect to their burnout limits, and stresses are weighted as percentages of the breakdown limits. the main hypotheses, verified by experiments on electronic devices and populations of similar devices, are:
i. the physical instantaneous degradation phenomena due to electrical stress above the limits are observed at any temperature and depend on the active zone temperature of the device under test (s. m. sze [56]).
ii. the relative drift of a predictor parameter is a function of time (for example a square root law for diffusion mechanisms) and relates to a failure mechanism activated by temperature and biasing.
iii. for a bias set high and close to the breakdown limit, the two failure mechanisms (i.e. the diffusion mechanism and the instantaneous catastrophic one) are in competition and occur simultaneously; for the sake of simplicity it is assumed that they are progressively and linearly combined, from a pure diffusion mechanism at nominal bias to a pure burnout at high bias (voltage, current or power dissipation).
this last hypothesis is the foundation of the baz model, as the stressor is seen as a catalyst effect able to modify the height of the barrier of the pure temperature failure mechanism (arrhenius, thermally activated) and to quantify the effect of biasing on the barrier properties.
the predictor parameter is then the sensitive tool we can use to measure this barrier height under various temperature and bias conditions. for unit homogeneity, the stressor is multiplied by a constant factor $\gamma$ to be determined by experiments, and the term $\gamma \cdot S$ is expressed in ev. indeed, the $\gamma$ coefficients can be easily determined because of hypothesis iii) above: as shown in figure 3, the apparent height of the barrier is reduced to zero and we can verify:
$$E_a - \gamma \cdot S_{burnout} = 0 \qquad (16)$$
i.e. when the bias is high enough to reach the instantaneous catastrophic failure. this major principle is called the failure equivalence (fe) principle. because $E_a$ (pure thermal effect) is assumed to be a constant, and considering that the burnout limit is temperature dependent and potentially distributed (gaussian distribution), the $\gamma$ factor should also reflect this temperature dependence and have the same gaussian-like distribution. the present paper does not consider this extension, and the $\gamma$ factor is assumed to be a constant as a first approximation.
predictor definition
as mentioned previously, an electrical predictor parameter $p$ is defined as the electrical signature (failure mode) of a failure mechanism of interest. such a parameter is normalized with respect to its initial value at time zero. similarly to the stressor context, we can define an equivalent energy using a prefactor, as outlined in equation (14). figure 4 is a schematic drawing showing how the fe principle applies and how predictors and stressors take their place in the baz model highlighted by the transition state theory. all vertical axes are transformed into energy units.
fig. 4 predictor p and stressor s for baz model and transition state theory
the predictor relative drift shown is an example of actual measurements performed on microwave transistors submitted to steady-state aging testing [57].
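the failure equivalence principle for the stressor can be illustrated with a short sketch. it assumes a single stressor, a temperature-independent burnout limit and the condition that the barrier vanishes at burnout (ea − γ·s_burnout = 0), so that the effective barrier is ea·(1 − s/s_burnout). the function names, the 0.7 ev activation energy and the 10 v limit are hypothetical.

```python
def gamma_from_burnout(ea_ev, s_burnout):
    """Loading factor such that Ea - gamma * S_burnout = 0 (failure equivalence)."""
    return ea_ev / s_burnout


def effective_barrier(ea_ev, stress, s_burnout):
    """Effective barrier Ea - gamma * S, i.e. Ea * (1 - S / S_burnout)."""
    return ea_ev - gamma_from_burnout(ea_ev, s_burnout) * stress


def normalized_stress(stress, s_burnout):
    """Stress expressed as a fraction of its burnout limit (0 = none, 1 = burnout)."""
    return stress / s_burnout


if __name__ == "__main__":
    ea = 0.7          # pure thermal activation energy in eV (assumed)
    v_burnout = 10.0  # breakdown voltage in volts (assumed)
    for volts in (0.0, 2.5, 5.0, 10.0):
        x = normalized_stress(volts, v_burnout)
        barrier = effective_barrier(ea, volts, v_burnout)
        print(f"x = {x:.2f} -> effective barrier = {barrier:.3f} eV")
```

under these assumptions the loading factor needs no fitting: one measurement of the burnout limit and one of the stress-free activation energy fix it, which is the practical appeal of normalizing every stressor to its own limit.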
the predictor of each single device is normalized with respect to its initial measurement (mean value), and the failure criterion was a drift of 20%. the drawing is therefore set so that all devices with a drift greater than 20% are considered failed.
4.2. baz model simplification and applicability
as observed in section 3, the baz model is a generalization of the well-known arrhenius equation, modified by commonly accepted industrial models such as eyring. as presented in ref. [29], all the failure mechanism models detailed in jedec jep122 can be rearranged in the following form ( ) (17) where the function g(s) is a function of the stressor parameter, always expressed in one of two generalized forms: [ ] (18.a) or (18.b) where m, and p = 1 or −1, are power law factors. applying the normalization process to each stressor si with respect to its burnout limit or electrical parameter limit, we set: (19.a) ( ) ( ) (19.b) in these equations, xi and xj are assumed to vary from 0, when no electrical stress is applied, to 1, when the maximum electrical stress induces an instantaneous failure at any given temperature. the burnout value of a stressor is considered, as a first approximation, not to be temperature dependent. this can be reformulated when the model is refined to take this dependence into account. merging equations (17) to (19), it is easy to express the general equation of the failure rate as: ( ) (20) with the effective activation energy in the form [13]: ( ∑ [ ( )] ) (21) expression (21) is based on the assumption that the stressors are temperature independent and are applied simultaneously, and are therefore simply added as a linear approximation.
the stressors are considered independent, and they aggregate up to a value which exactly compensates the "pure" arrhenius activation energy, leading to an instantaneous burnout (see figure 3.a for clarification). consequently, the principle of superposition cannot be invoked in this case; rather, it is a principle of aggregation and compensation. the stressors defined above are drawn from experiments and accumulated data reported in the literature. of course, any other type of stressor can easily be introduced in lieu of, or together with, the listed stressors, provided it is relevant to the model considered. the proposed reliability methodology is agile: it consists of measuring the true burnout or breakdown limits (including the lot dispersion, i.e. mean and standard deviation), or some physical limit as for hci, in order to normalize the new stress parameter with respect to its limit and to include it in equation 13.
4.3. multiple failure mechanisms (m-tol)
the key novelty of the multiple-temperature operational life (m-tol) testing method proposed by j. bernstein [58] is its success in separating the different failure mechanisms in devices in such a way that actual reliability predictions can be made for any user-defined operating conditions. this contrasts with the common approach to assessing device reliability today, high temperature operating life (htol) testing [59], which is based on the assumption that just one dominant failure mechanism is acting on the device [31]. however, it is known that multiple failure mechanisms act on the device simultaneously [25]. the m-tol method predicts the reliability of electronic components by combining the failure in time (fit) of multiple failure mechanisms [60]. degradation curves are generated for components exposed to accelerated testing at several different temperatures and core stress voltages.
the data clearly reveal that different failure mechanisms act on the components in different regimes of operation, causing different mechanisms to dominate depending on the stress and the particular technology. a linear matrix solution, as presented in [60], allows the failure rates of the separate mechanisms to be combined linearly to calculate the actual reliability of the system, measured in fit, based on the physics of degradation at specific operating conditions. an experimental evaluation of the m-tol method on both 45 nm and 28 nm fpga devices from xilinx, processed at tsmc (according to the xilinx data sheets), is running in the frame of a project granted by the research institute of technology irt saint exupery, toulouse (france). the fpgas are tested over a range of voltages, temperatures and frequencies, and the test program is conducted by j. bernstein, ariel university, ariel (israel). the frequencies of multiple asynchronous ring oscillators in a single fpga were read and recorded simultaneously during stress. hundreds of oscillators and the corresponding frequency counters were burned into a single fpga to allow monitoring of statistical information in real time. since the frequency itself monitors the device degradation, there is no recovery effect whatsoever, giving a true measure of the effects of all the failure mechanisms in real time. the common intrinsic failure mechanisms affecting electronic devices are hot carrier injection (hci), bias temperature instability (bti), electromigration (em) and time dependent dielectric breakdown (tddb). tddb is not discussed in this paper since it was never observed in our test results. the standard models for failure mechanisms in semiconductor devices are classified by the jedec solid state technology association and listed in publication jep-122g.
the failure mechanisms can be separated thanks to the different physical nature of each individual mechanism. using fpgas as the evaluation vehicle for our m-tol verification relies on the fact that this chip is built with the basic cmos standard cells that would be found in any digital process using the same technology. the system runs hundreds of internal oscillators at several different frequencies asynchronously, allowing independent measurements across the chip and the separation of current-induced from voltage-induced degradation effects. when degradation occurred in the fpga, a decrease in performance and in the frequency of the ring oscillators (ro) could be observed and attributed to either an increase in resistance or a change in the threshold voltage of the transistors. the test conditions were predefined to allow separation and characterization of the relative contributions of the various failure mechanisms by controlling voltage, temperature and frequency. extreme core voltages and environmental temperatures, beyond the specifications, were imposed to accelerate individual mechanisms so that each dominates the others at a given condition, e.g. sub-zero temperatures at very high operating voltages to exaggerate hci. the acceleration conditions for each failure mechanism allowed us to examine the specific effect of voltage and temperature versus frequency on that particular mechanism at the system level, and thus to define its unique physical characteristics even from a finished product. finally, after completing the tests, some of the experiments with different frequency, voltage and temperature conditions were chosen to construct the m-tol matrix. the results of our experiments give both ea and γ for the three mechanisms we studied at temperatures ranging from -50 to 150°c. the eyring model [36] is utilized here to describe the failure in time (fit) for all of the failure mechanisms.
the specific ttf of each failure mechanism follows these formulae: (22) (23) (24) the correct activation energy and the corresponding voltage factor were determined simultaneously. the procedure was followed for all three mechanisms, for the 45 nm as well as the 28 nm devices. the ea and γ values found for the 45 nm devices are summarized in table 1.
table 1 summary of ea and γ for fpga 45 nm
mechanism    ea (ev)    γ
hci          -0.37      22.7
bti          0.52       3.8 v⁻¹
em           1.24       3.8
as presented by regis et al. [61], the impact of scaling on the reliability of integrated circuits is a current concern. it is particularly necessary to focus on three basics of safety analyses for aeronautical systems: failure rates, lifetimes and susceptibility to atmospheric radiation. deep sub-micron technologies need to be modeled in terms of robustness and reliability, because the increase in failure rate, the reduction in useful life and the increased vulnerability to high energy particles are the most critical safety concerns. the well documented failure mechanisms related to the die only can be grouped in two families: those related to what is called the front end of line (feol), i.e. at transistor level, and those occurring in the back end of line (beol), mainly in the metallization. as illustrated in figure 5 (extracted from [61]), ics are affected by different degradation mechanisms during their useful life. these degradation mechanisms can shift the properties of electronic devices and thereby affect circuit performance. because the acceleration factor is an exponential function of voltage, frequency (equivalent to current) or temperature (see equations 22 to 24), it is mandatory to consider at least 3 mechanisms, each of them in competition and accelerated.
fig. 5 wear-out phenomena localization (65 nm ic cross section) (from [61]).
the paper by j. bernstein [12] offers a new reliability point of view and is summarized below.
the proposed m-tol approach is defined with multiple failure mechanisms in competition, on the assumption of non-equal failure probabilities at use conditions, in order to describe and determine the correct proportionality. the basic method for solving the system of equations is described in another paper by j. bernstein [62], and uses the sum-of-failure-rates method suggested in jedec standard jep122g. it is clear that the manufacturers of electronic components recognize the importance of combining failure mechanisms in a sum-of-failure-rates method. each mechanism 'competes' with the others to cause an eventual failure. when more than one mechanism exists in a system, the relative acceleration of each one must be defined and averaged under the applied condition. every potential failure mechanism should be identified, and its unique af should then be calculated at the given temperature and voltage so that the fit rate can be approximated for each mechanism separately. the final fit is then the sum of the failure rates per mechanism, as described by:
$$\mathrm{fit} = \sum_{i} \mathrm{fit}_i \qquad (25)$$
where each mechanism leads to an expected failure unit per mechanism, fiti. thus, we describe here the prediction of system reliability using a linear matrix solution. although so far we have only verified the methodology on verifiable microelectronic device failure mechanisms, it will apply directly to additional mechanisms, including thermal and mechanical stresses due to wafer bonding and any failure mechanism that can be modelled by physics of failure, including wide bandgap semiconductors and even packaging failures. whereas each intrinsic mechanism is known to have a different statistical distribution, the combination of distributions becomes, at the ensemble level, an approximately constant rate, as demonstrated by r.f. drenick [63].
in his theorem, drenick suggests and justifies the summation of failure rates approach, as also explained in the jedec handbook. the mechanism matrix is described in table 2. each row of the matrix describes one of the operating conditions under which the system is tested. each experiment, i, is operated with its unique voltage, frequency and temperature. the "results" column, fiti, is obtained from the average time at which failure occurs under the experimental condition, which is associated with a pre-determined failure point. the example studied uses 10% performance degradation as the failure point; however, any reasonable value will work as long as it is consistent with the application. the result fiti is a failure rate (λ), measured as 10⁹/mttf.
table 2 m-tol matrix used to solve models with measured times to fail [12]
condition     hci     bti     em      results
v1, f1, t1    x·a1    y·b1    z·c1    fit1
v2, f2, t2    x·a2    y·b2    z·c2    fit2
v3, f3, t3    x·a3    y·b3    z·c3    fit3
we assume that each mechanism (a, b, c) affects the system linearly with its own acceleration factor (af) for a given frequency. the acceleration factor formulas are in table 3. each equation is calculated with the experimental condition of each result on the right hand side.
table 3 the equations for the acceleration factors matrix [12]
mechanism                                matrix term    acceleration factor
hot carrier injection                    ai             afhci
negative bias temperature instability    bi             afnbti
electromigration                         ci             afem
then the matrix is solved to find a set of constants, pi, shown here as x, y, z, across the whole matrix that matches the experimental results with the calculated acceleration factors. this linear matrix is solved by multiplying the inverse matrix, af⁻¹, by λ at each condition, as shown in table 4. the solution gives the coefficients (x, y, z), which make up the relative contribution of each failure mechanism to the system.
table 4 matrix solution [12].
$$\underbrace{\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}}_{af} \cdot \underbrace{\begin{bmatrix} x \\ y \\ z \end{bmatrix}}_{p_i} = \underbrace{\begin{bmatrix} \mathrm{fit}_1 \\ \mathrm{fit}_2 \\ \mathrm{fit}_3 \end{bmatrix}}_{\lambda}$$
$$(af) \cdot (p_i) = (\lambda) \;\Rightarrow\; (p_i) = (af)^{-1} \cdot (\lambda)$$
knowledge of these coefficients allows prediction of the mttf or the fit for any other operating conditions that were not tested, and gives an accurate prediction of the reliability of the device under different conditions. this matrix was then used to construct the full reliability profile, in which fit is calculated versus temperature for several conditions for the 45 nm fpga process, as shown in figure 6. the 45 nm technology shows frequency related effects both at low temperatures (below 5°c), due to hci, and at high temperatures. it is observed that the high voltage bias (1.2 v) enhances the effect of frequency, which reduces the overall hci contribution at low frequency. the dominant failure mechanism at medium ambient temperature (from 10°c to 150°c) is related to nbti, while the em failure mechanism is rather observed at high temperature.
fig. 6 reliability curves for 45nm technology showing fit versus temperature for voltages above and below nominal (1.2v) and frequencies from 10 mhz (dashed line) to 2ghz (solid line).
[plot: fit (log scale, 0.01 to 10000000) versus temperature t (°c) from -50 to 150; curves for v = 0.8 v, f = 1.0 ghz; v = 1.0 v, f = 1.5 ghz; v = 1.2 v, f = 2.0 ghz; v = 0.8 v, f = 0.01 ghz; v = 1.0 v, f = 0.01 ghz; v = 1.2 v, f = 0.01 ghz; regions marked hci, bti, em]
how to disentangle reliability models for more than moore microelectronics based on nanotechnologies? an innovative and practical way is to use the various physics of failure equations together with accelerated testing for reliability prediction of devices exhibiting multiple failure mechanisms. we presented an integrated accelerating and measuring platform to be implemented inside fpga chips, which makes the m-tol testing methodology more accurate and allows these tests at the chip and at the system level, rather than only at the transistor level.
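the matrix solution of table 4 can be sketched in a few lines of python. this is a minimal sketch under stated assumptions, not the procedure of [12]: the acceleration factors keep only an arrhenius temperature term built from the ea values of table 1 (the voltage and frequency terms of table 3 are omitted), and the 25°c reference, the three test temperatures, the "true" coefficients and the resulting "measured" failure rates are made up. all function and variable names are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # boltzmann constant in eV/K

# activation energies from table 1 (45 nm fpga), in eV
EA_EV = {"hci": -0.37, "bti": 0.52, "em": 1.24}
MECHANISMS = ("hci", "bti", "em")


def thermal_af(ea_ev, temp_c, ref_c=25.0):
    """Arrhenius failure-rate ratio between temp_c and ref_c (thermal term only)."""
    t = temp_c + 273.15
    t_ref = ref_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_ref - 1.0 / t))


def af_row(temp_c):
    """One row (a_i, b_i, c_i) of the acceleration-factor matrix."""
    return [thermal_af(EA_EV[m], temp_c) for m in MECHANISMS]


def solve_linear(a, b):
    """Solve a . x = b by gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            raise ValueError("singular acceleration-factor matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def predict_fit(p, temp_c):
    """Sum of failure rates (equation 25) at an untested temperature."""
    return sum(af * p_i for af, p_i in zip(af_row(temp_c), p))


# synthetic "measurements": failure rates generated from made-up coefficients
TRUE_P = [2.0, 5.0, 0.001]            # hypothetical x, y, z
TEST_TEMPS_C = [-50.0, 25.0, 150.0]   # hypothetical test conditions
AF = [af_row(t) for t in TEST_TEMPS_C]
LAMBDA = [sum(a * p for a, p in zip(row, TRUE_P)) for row in AF]

P = solve_linear(AF, LAMBDA)           # recovered x, y, z
FIT_AT_85C = predict_fit(P, 85.0)      # extrapolation to an untested condition
MTTF_HOURS_AT_85C = 1e9 / FIT_AT_85C   # fit = 1e9 / mttf
```

a small gaussian elimination is used instead of an explicit inverse so that the sketch needs only the standard library. with measured fit values in place of the synthetic ones, the same two calls would give the coefficients and the extrapolated fit.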
the calibration of physics models with highly accelerated testing of complete commercial devices makes physical reliability prediction possible. the m-tol matrix can provide information about the proportional effect of each competing failure mechanism, offering an easy and simple tool to extrapolate the expected reliability of the device under various conditions. this practical platform can be implemented on almost any fpga device and technology to enable fit calculations and reliability predictions. the results of this approach provide the basis for improvements in performance and reliability for any design or application. the method can be extended to other processes and new technologies, and can include more failure mechanisms, thus producing a more complete view of the system's reliability. the baz model and the m-tol methodology have been combined in a general multi-dimensional tool named m-storm (multi-physics multi-stressors predictive reliability model) [13], which can be applied to the concrete case of deep-submicron process devices but also to any other disruptive microelectronic technology.
5. conclusion
to this day, the users of our most sophisticated electronic systems, which include optoelectronic, photonic and mems devices, gan power devices, asics and deep-sub-micron technologies, are expected to rely on a simple reliability value (fit) published by the supplier. the fit is determined today in the product qualification process by use of htol or another standardized test, depending on the product. the manufacturer reports a zero-failure result from the given conditions of the single-point test and uses a single-mechanism model to fit an expected mttf at the operator's use conditions. zero-failure qualification is well known as a very expensive exercise that provides nearly no useful information.
as a result, designers often rely on halt testing and on handbooks such as fides, telcordia or mil-hdbk-217 to estimate the failure rate of their products, knowing full well that these approaches act as guidelines rather than as reliable prediction tools. furthermore, with zero failures required for the "pass" criterion, as well as the poor correlation of expensive htol data with test and field failures, there is no means of communicating this knowledge to designers so that they can build in reliability or trade it off against performance. prediction is not really the goal of these tests; however, current practice is to assign an expected failure rate, fit, based only on this test, even if the presumed acceleration factor is not correct. we presented, in this paper, a simple approach to predictive reliability assessment using the common language of failure in time or failure unit (fit). we examined the goal of finding the mtbf and evaluated the wisdom of various approaches to reliability prediction. our goal is to predict reliability based on the system environment, including space, military and commercial. it is our intent to show that the era of confidence in reliability prediction has arrived and that we can make reasonable reliability predictions from qualification testing at the system level. our research will demonstrate the utilization of physics of failure models in conjunction with qualification testing, using our multiple-temperature operating life (m-tol) matrix solution, to make cost-effective reliability predictions that are meaningful and based on the system operating conditions. the baz model and the m-tol methodology have been combined in a general multi-dimensional tool named m-storm (multi-physics multi-stressors predictive reliability model) applicable to disruptive microelectronic technologies.
acknowledgement: the paper is a part of the research done at irt saint exupery, toulouse, france.
the study was conducted in the frame of the electronic robustness contract project irt-008, sponsored by the following funding partners: agence nationale de la recherche, airbus operations sas, airbus group innovation, continental automotive france, thales alenia space france, thales avionics, laboratoire d'analyse et d'architecture des systèmes - centre national de la recherche scientifique (laas-cnrs), safran labinal power systems, bordeaux university, institut national polytechnique bordeaux (ims - umr 5218), and hirex engineering. i would like to particularly thank professor joseph b. bernstein, from ariel university, ariel (israel), for the major comments and deep discussions we had during the manuscript preparation.
references
[1] dod, "mil-hdbk-217, military handbook for reliability prediction of electronic equipment," washington, dc, usa, december 1991.
[2] european space comp. information exchanges systems, "ecss-q-st-60c rev2 space product assurance, electrical, electronic and electromechanical (eee) components," ecss secretariat-esa-estec-component, material and processes related ecss and esa pss standards; requirements & standards division, noordwijk, the netherlands, 21 october 2013. [online]. available: https://escies.org. [accessed april 2016].
[3] standard, "jedec jep-122g failure mechanisms and models for semiconductor devices," jedec solid state technology association, arlington, 2011.
[4] m. pecht, "a prognostics and health management roadmap for information and electronics-rich systems," ieice fundamentals review, vol. 3, no. 4, p. 25, 2010.
[5] p. lall, r. lowe and k. goebel, "prognostic health monitoring for a micro-coil spring interconnect subjected to drop impacts," in proc. of the 2013 ieee conference on prognostics and health management (phm), 2013.
[6] l. boltzmann, "further investigations on the thermal equilibrium of gas molecules," in proceeding of the imperial academy of science, vienna, vol. ii, no. 76, p. 428, 1872.
[7] e.
suhir, "probabilistic design for reliability," chipscale rev., vol. 14, no. 6, 2010.
[8] e. suhir, "predicted reliability of aerospace electronics: application of two advanced concepts," in proc. of the ieee aerospace conference, 2-9 march 2013.
[9] d. cox, "regression models and life-tables," journal of the royal statistical society. series b (methodological), vol. 34, no. 2, pp. 187-220, 1972.
[10] j. w. mcpherson, reliability physics and engineering time-to-failure modeling; 2nd edition, plano (tx) usa: springer, 2013.
[11] m. white and j. b. bernstein, "microelectronics reliability: physics-of-failure based modeling and lifetime evaluation," jpl publication 08-5, jet propulsion laboratory/california institute of technology, february 2008.
[12] j. b. bernstein, m. gabbay and o. delly, "reliability matrix solution to multiple mechanism prediction," microelectronics reliability journal, vol. 54, pp. 2951-2955, 2014.
[13] a. bensoussan, "m-storm: multi-physics multi-stressors predictive reliability model," microelectronic reliability journal, 2016 (to be published).
[14] n. t. n. esa-estec secretariat, "ecss-q-st-30-11c : space product assurance derating-eee components, rev 1, october 4th 2011," [online].
[15] esa-estec, ecss-q-st-70-08c, space product assurance manual soldering of high-reliability electrical connections, noordwijk, the netherlands.
[16] esa-estec secretariat, noordwijk, the netherlands, "ecss-q-st-60-13c: 'commercial electrical, electronic and electromechanical (eee) components'," 21 october 2013. [online]. available: www.ecss.nl. [accessed 2016].
[17] klinger, d.j., yoshinao nakada and m. a. mendez, at&t reliability manual, van nostrand reinhold, 1990.
[18] fides_guide, "reliability methodology for electronic systems, dga," 2004.
[19] "rdf 2000 (ute c 80-810, iec-62380-tr ed.1)," [online].
available: http://www.ute-fr.com/lanormalisation/ute-and-standardisation.
[20] siemens ag, sn29500, reliability and quality specifications failure rates of components, siemens technical liaison and standardisation, 1986.
[21] bellcore, sr-332, reliability prediction procedure for electronic equipment, bellcore telcordia, 2011.
[22] d. nicholls, "what is 217plus (tm) and where did it come from?," in proc. of the annual reliability and maintainability symposium, orlando, fl, 2007.
[23] "prism now 217plus (tm) riac the reliability analysis center," [online]. available: http://www.theriac.org.
[24] jep148b, "reliability qualification of semiconductor devices based on physics of failure risk and opportunity assessment," jedec solid state technology association, arlington, va, december 2004.
[25] j. b. bernstein, s. salemi, l. yang, j. dai and j. qin, physics-of-failure based handbook of microelectronic systems, utica, ny: reliability information analysis center, 2008.
[26] e. suhir, applied probability for engineers and scientists, new york: mcgraw-hill, 1997.
[27] p. a. tobias and d. c. trindade, applied reliability (3rd ed.), boca raton, fl: crc press, 2012.
[28] p. lall, m. pecht and e. b. hakim, influence of temperature on microelectronics and system reliability, boca raton, ny: crc press, 1997.
[29] a. bensoussan and e. suhir, "design-for-reliability (dfr) of aerospace electronics: attributes and challenges," ieee aerospace conf., 2-9 march 2013.
[30] a. bensoussan, "how to quantify and predict long term multiple stress operation: application to normally-off power gan transistor technologies," microelectronics reliability journal, special issue on reliability in power electronics, vol. 58, pp. 103-112, march 2016.
[31] j. b. bernstein, reliability prediction from burn-in data fit to reliability models, london: elsevier ap, 2014.
[32] j. black, "electromigration a brief survey and some recent results," ieee trans. electron devices, vol. ed-16, p. 388, 1969.
[33] m. yoder, "ohmic contacts in gaas," solid state el., vol. 23, pp. 117-119, 1980.
[34] c. lee, b. welch and w. fleming, "reliability of auge/pt and auge/ni ohmic contacts on gaas," electronics letters, vol. 17, pp. 407-408, 1981.
[35] h. cui, "accelerated temperature cycle test and coffin-manson model for electric packaging," ieee trans. rams, pp. 556-560, 2005.
[36] h. eyring, s. lin and s. lin, basic chemical kinetics, new york chichester brisbane toronto: john wiley & sons, 1980.
[37] d. peck, "comprehensive model for humidity testing correlation," in proc. of the ieee 23rd international reliability physics symp. (irps), anaheim, 1986.
[38] i. chen, s. holland and c. hu, "a quantitative physical model for time-dependent breakdown in sio2," in proc. of the ieee 23rd international reliability physics symp. (irps), orlando, 1985.
[39] m. dai, c. gao, k. yap, y. shan, z. cao, k. liao, l. wang, b. cheng and s. liu, "a model with temperature-dependent exponent for hot-carrier injection in high-voltage nmosfets involving hot-hole injection and dispersion," ieee trans. electron devices, vol. 55, pp. 1255-1258, 2008.
[40] e. takeda, y. nakagome, h. kume and s. asai, "new hot-carrier injection and device degradation in submicron mosfet's," iee proc, vol. 130, no. 3, pp. 144-149, 1983.
[41] e. takeda, h. kume, t. toyabe and s. asai, "submicron mosfet structure for minimizing channel hot-electron injection," in proc. of the symposium on vlsi tech., 1981.
[42] k. decker, "gaas mmic hydrogen degradation study," in gaas reliability workshop, philadelphia, pa, usa, 1994.
[43] m. delaney, t. wiltsey, m. chiang and k. yu, "reliability of 0.25μm gaas mesfet mmic process: results of accelerated lifetests and hydrogen exposure," in gaas reliability workshop, philadelphia, pa, usa, 1994.
[44] m. ciappa, f. carbognani and w.
fichtner, "lifetime modeling of thermomechanics-related failure mechanisms in high power igbt modules for traction applications," in proc. of the ieee 15th int. symp. power semicond. devices ics, pp. 295-298, 2003.
[45] d. schroder and j. babcock, "negative bias temperature instability: road to cross in deep submicron silicon semiconductor manufacturing," journal of applied physics, vol. 94, no. 1, pp. 1-18, 2003.
[46] e. suhir, "statistics-related and reliability-physics-related failure processes in electronics devices and products," modern physics letters b, vol. 28, no. 13, 2014.
[47] s. zhurkov, "kinetic concept of the strength of solids," int. j. of fracture mechanics, vol. 1, no. 4, 1965.
[48] s. arrhenius, "ueber den einfluss des atmosphärischen kohlensäurengehalts auf die temperatur der erdoberfläche," in proceedings of the royal swedish academy of science, vol. 22, no. 1, 1896.
[49] l. boltzmann, "the second law of thermodynamics. populare schriften," in essay 3, address to a formal meeting of the imperial academy of science, 1886.
[50] e. wigner, "the transition state method," faraday society (london) trans, vol. 34, pp. 29-41, 1938.
[51] m. evans and m. polanyi, "inertia and driving force of chemical reaction," faraday society (london) trans, vol. 34, pp. 11-29, 1938.
[52] e. suhir, a. bensoussan, g. khatibi and j. nicolics, "probabilistic design for reliability in electronics and photonics: role, significance, attributes, challenges," in proc. of the international reliability physics symposium, monterey, ca, 2015.
[53] e. suhir, r. mahajan, a. lucero and l. bechou, "probabilistic design-for-reliability concept and novel approach to qualification testing of aerospace electronic products," in aerospace conf., 2012 ieee, big sky, mt, march 2012.
[54] c. shannon, "a mathematical theory of communication," bell system technical journal, vol. 27, no. 3, pp. 379-423, 1948.
[55] e. suhir and a.
bensoussan, "quantified reliability of aerospace optoelectronics," in sae international j. aerosp. 7(1), cincinnati, 2014.
[56] s. sze, physics of semiconductor devices, new york: john wiley and sons, 1981.
[57] a. bensoussan, p. coval, w. roesch and t. rubalcava, "reliability of a gaas mmic process based on 0.5µm au/pd/ti gate mesfets," in proc. of the 32nd annual proceeding reliability physics, irps, san jose (ca), april 1994.
[58] j. bernstein, a. bensoussan and e. bender, "reliability prediction with mtol," to be published in microelectronics reliability journal, elsevier, 2016.
[59] xilinx, "reliability report," xilinx ug116 (v10.4), 1st april 2016.
[60] j. bernstein, "reliability prediction for aerospace electronics," in proc. of the ieee aerospace conference, big sky (mt), 2015.
[61] d. regis, j. berthon and m. gatti, "dsm reliability concerns impact on safety assessment," in sae 2014 aerospace systems and technology conference, cincinnati, oh, september 2014.
[62] j. bernstein, m. gurfinkel, x. li, j. walters and y. shapira, "electronic circuit reliability modeling," microelectronics reliability j., vol. 46, pp. 1957-1979, 2006.
[63] r. drenick, "mathematical aspects of the reliability problem," journal of the society for industrial and applied mathematics, vol. 8, no. 1, pp. 125-149, 1960.
[64] b. agarwala et al., "dependence of electromigration-induced failure time on length and width of aluminium thin-film conductors," j. appl. phys., vol. 41, p. 3954, 1970.
[65] n. 1. r. hdbk, "nswc-10 reliability hdbk jan2010 45818/," 2010. [online]. available: http://everyspec.com/usn/nswc/nswc-10_reliability_hdbk_jan2010_45818/. [accessed 2016].
[66] j. evans, p. lall and r. bauernschub, "a framework for reliability modeling of electronics," in proc. of the reliability and maintainability symposium, 1995.
10646 facta universitatis series: electronics and energetics vol.
35, no 4, december 2022, pp. 619-634
https://doi.org/10.2298/fuee2204619l
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
performance analysis and optimization of 10 nm tg n- and p-channel soi finfets for circuit applications
abdelaziz lazzaz1, khaled bousbahi2, mustapha ghamnia1
1laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie
2esgee d’oran, oran, algérie
abstract. this paper analyses the electrical characteristics of 10 nm tri-gate (tg) n- and p-channel silicon-on-insulator (soi) finfets with hafnium oxide gate dielectric. the analysis has been performed through simulations using silvaco atlas tcad with the bohm quantum potential (bqp) algorithm. the influence of the geometrical parameters on the threshold voltage vth, the subthreshold swing (ss), the transconductance and the on/off current ratio, ion/ioff, is investigated. the two structures have been optimized for cmos inverter implementation. the simulation results show that the n-finfet and the p-finfet reach a minimum ss value at fin heights of 15 nm and 9 nm, respectively. in addition, low threshold voltages of 0.61 v and 0.27 v for n- and p-channel soi finfets, respectively, are obtained at a fin width of 7 nm.
key words: finfet, cmos, quantum effect, leakage current
1. introduction
nanoelectronics is a field of engineering concerned with controlling device properties at the nanometric scale. to meet the increasing demands of high-performance and high-speed applications, transistors need to be aggressively scaled down. this requires major modifications both in the development of new device structures and in the fabrication processes. when the channel length is less than 20 nm, short-channel effects (sces) become insurmountable and, consequently, the device performance degrades.
multi-gate fets have successfully enabled cmos scaling and are considered to be the best alternative structures that can extend to the 10 nm technology node.
received april 8, 2022; revised june 21, 2022 and june 26, 2022; accepted june 26, 2022
corresponding author: abdelaziz lazzaz, laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie
e-mail: lazzaz.abdelaziz@gmail.com
most of the reported finfets are fabricated with a silicon channel. they present several advantages, such as: i) reduced sces and a low leakage current, ii) superior electrostatic control through tri-gate structures, iii) a reduced effect of substrate bias on the threshold voltage and excellent carrier transport properties, along with more aggressive channel length scaling possibilities [1]. conventional finfet technology has to face competition from other technology options because of its high channel access resistance, which is due to its extremely thin body. to improve finfet performance, one must address the quantum confinement problem. hence, the use of the bqp algorithm, which is based on the bohm interpretation of quantum mechanics [8], may become more important. n.p. maity et al. in 2017 [22] explored the application of the promising high-k dielectric material hfo2 to mos devices. they observed that the tunneling current is inversely proportional to the dielectric constant of the oxide material. niladri pratap maity et al. in 2016 [23] developed an analytical model to evaluate the impact of hfo2 on the current density, with a comparison between the theoretical model and experimental measurements. lazzaz et al. in 2022 [27] demonstrated that quantum effects play a dominant role in nanostructures. they used the bqp method to fit experimental measurements of the ids-vds characteristics of a 14 nm tg n-finfet. neha gupta et al.
in 2020 [28] have evaluated the effect of a high-k gate stack on the analog and rf figures of merit (foms) of a 9 nm soi finfet. the results of their simulation confirm that placing a high-k dielectric material together with sio2 between the gate and the fin reduces device limitations such as sces, leakage current and parasitic capacitance, which paves the way for high-speed switching and rf applications. anisur rahman et al. in 2018 [29] found that intel’s 10 nm technology achieved scaling benefits over its preceding 14 nm generation at matched or better transistor reliability. marupaka aditya et al. in 2021 [30] have confirmed that using high-k dielectric materials increases the on current and improves the device performance. sanghamitra das et al. in 2021 [31] have studied the effect of finfet geometric parameters (channel length and fin height) on the rf foms by using tcad simulations. their results confirm that decreasing the channel length or increasing the fin height improves the rf parameters. mostak ahmed et al. in 2021 [31] have simulated the electrical characteristics of a 3d tg n-channel soi finfet with a channel length of 5 nm using different gate dielectric materials. the results of their simulation confirm that high-k dielectric materials are the better option for fabricating future tg finfet devices.

the above literature survey indicates the importance of using high-k dielectrics in finfet devices to reduce sces. in this paper, the transfer and the transconductance characteristics have been computed in order to find the electrical response of tg n- and p-channel soi finfets with 10 nm channel length. the bqp algorithm available in the silvaco atlas tcad software has been used to simulate the i-v characteristics. the simulated devices have been optimized in terms of geometry to obtain optimal voltage transfer characteristics (vtc) for a cmos inverter [14], [15].
2. device structure

the hafnium-based oxide is extensively used because of its low leakage property and its high thermal stability with silicon [25]. the geometric parameters used in this simulation are represented in table 1 and the operating parameters of the two structures are presented in table 2.

table 1 geometric parameters
symbol | designation | value
l | channel length | 10 nm
ld, ls | drain, source length | 12 nm
eot | equivalent oxide thickness | 0.68 nm
vdd | supply voltage | 0.75 v

table 2 operating parameters
symbol | designation | value
eg | gap energy | 1.12 ev
k(si) | dielectric constant of silicon | 11.9
k(hfo2) | dielectric constant of hafnium dioxide | 24
nch | channel doping concentration | 10^16 cm^-3
ns/d | source/drain doping concentration | 10^21 cm^-3
φg | gate work function | 4.85 ev

tg finfet technology is based on the following fin geometry: fin length (l), fin width, wfin, fin height, hfin, and oxide thickness, tox. the numerical resolution, which includes the gate work function and the choice of physical models, represents one of the two main steps in the silvaco atlas tools. the shockley-read-hall (srh) theory has been used. figure (1a) shows the top-view layout of the tg soi finfet with 10 nm gate length, and figure (1b) illustrates the 3d schematic view of the finfet. the gate oxide thickness is the same for all three sides of the fin. the hfin is considered as the distance between the top gate and the bottom gate oxides. the wfin is represented as the distance between front gate and back gate. lg is the gate length and box is the buried oxide.

fig. 1 (a) top-view layout of tg soi finfet [21], (b) 3d schematic view of tg soi finfet

all simulations have been performed using the atlas and devedit 3d device simulators, and different operating parameters, such as the supply voltage, are extracted from the predictive technology model (ptm) [32].
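as a rough worked example of what the equivalent oxide thickness in table 1 implies, the sketch below converts the 0.68 nm eot into a physical hfo2 thickness and a gate capacitance per unit area. it assumes a relative permittivity of 3.9 for sio2 (a standard textbook value that is not listed in table 2) and a pure hfo2 layer without interfacial oxide; the function names are illustrative only and are not part of the simulation flow described above.

```python
EPS0 = 8.854e-12   # vacuum permittivity in F/m
K_SIO2 = 3.9       # assumed relative permittivity of SiO2 (not given in table 2)


def physical_thickness_nm(eot_nm, k_highk, k_sio2=K_SIO2):
    """Physical high-k thickness that yields the given EOT: t = EOT * k_highk / k_sio2."""
    return eot_nm * k_highk / k_sio2


def oxide_capacitance_per_area(eot_nm, k_sio2=K_SIO2):
    """Gate oxide capacitance per unit area in F/m^2: C_ox = eps0 * k_sio2 / EOT."""
    return EPS0 * k_sio2 / (eot_nm * 1e-9)


# Values taken from table 1 (EOT) and table 2 (k of HfO2).
t_hfo2_nm = physical_thickness_nm(0.68, 24.0)
c_ox = oxide_capacitance_per_area(0.68)

print(f"physical HfO2 thickness: {t_hfo2_nm:.2f} nm")
print(f"oxide capacitance: {c_ox:.4f} F/m^2")
```

under these assumptions the hfo2 layer is several times thicker than the equivalent sio2 layer while providing the same gate capacitance, which is the usual argument for the lower tunneling leakage of high-k dielectrics.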
3. drain current model of the tg finfet

the device electrostatics is governed by the 3-d poisson’s equation [5][19]:

$$\frac{d^2\psi(x,y,z)}{dx^2} + \frac{d^2\psi(x,y,z)}{dy^2} + \frac{d^2\psi(x,y,z)}{dz^2} = \frac{q\,n(x,y,z)}{\varepsilon_{si}} \tag{1}$$

ψ: electrostatic potential; q: electron charge; εsi: silicon permittivity; n(x,y,z): electron density.

at these dimensions, quantum effects become dominant and are difficult to control in the device. hence, in this study, they are taken into account by selecting an appropriate model, the bqp. the bqp model can also be used with the energy balance and hydrodynamic models, where the semi-classical potential is modified by the quantum potential in a similar way as for the continuity equations [20]. according to the bohm interpretation of quantum mechanics, the wave function can be represented in polar form by the following expression [8]:

$$\psi = R\exp\left(\frac{iS}{\hbar}\right) \tag{2}$$

r: probability density per unit volume; s has the dimension of an “action” (energy × time).

the schrödinger equation can be written as:

$$-\frac{\hbar^2}{2}\nabla\left(M^{-1}\nabla\left(R\exp\left(\frac{iS}{\hbar}\right)\right)\right) + V R\exp\left(\frac{iS}{\hbar}\right) = E R\exp\left(\frac{iS}{\hbar}\right) \tag{3}$$

m-1∇s: the local velocity of the particle associated with the wave function. e is the conserved total energy, which includes the potential energy v and the kinetic energy [8]. the quantum potential is derived from the bohm interpretation of quantum mechanics and is described by the following equation [8][20]:

$$Q = -\frac{\hbar^2}{2}\,\frac{\nabla\left(M^{-1}\nabla R\right)}{R} \tag{4}$$

the threshold voltage expression in the case of a finfet structure can be defined by [18]:

$$V_{th} = V_{fb} + 2\phi_b + \frac{\lambda_b}{C_{ox}}\sqrt{2 q N_s \varepsilon_s \left(2\phi_b + V_{sb}\right)} - \lambda_d V_{ds} \tag{5}$$

vfb: flat band voltage; ∅b: body potential; cox: gate oxide capacitance; q: electron charge; ns: doping concentration; εs: dielectric constant of the semiconductor; vsb: the reverse bias between the source and the body; λd: drain-induced barrier lowering (dibl) coefficient; λds: channel length modulation; λb: barrier variation coefficient.
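the quantum potential of equation (4) can be illustrated with a small numerical check. the sketch below is not the atlas implementation: it assumes one dimension, a constant scalar effective mass in place of the tensor m, and a gaussian amplitude r(x) chosen only because its quantum potential is known in closed form. the effective mass of 0.26 m0 and the 2 nm width are hypothetical illustration values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J*s
M0 = 9.1093837015e-31    # free electron mass in kg
Q_E = 1.602176634e-19    # elementary charge in C (used to convert J to eV)


def quantum_potential_ev(r, dx, m_eff):
    """1-D form of eq. (4) with scalar mass: Q = -(hbar^2 / 2m) * R'' / R.

    R'' is approximated by a central second difference, so the result is
    returned for interior grid points only (len(r) - 2 values), in eV.
    """
    q = []
    for i in range(1, len(r) - 1):
        d2r = (r[i + 1] - 2.0 * r[i] + r[i - 1]) / dx ** 2
        q.append(-(HBAR ** 2) / (2.0 * m_eff) * d2r / r[i] / Q_E)
    return q


# Hypothetical test profile: Gaussian amplitude of width sigma.
sigma = 2e-9            # assumed width, 2 nm
dx = 0.01e-9            # grid step, 0.01 nm
m_eff = 0.26 * M0       # assumed scalar effective mass
xs = [-5e-9 + i * dx for i in range(1001)]
r = [math.exp(-x * x / (2.0 * sigma ** 2)) for x in xs]


def q_exact_ev(x):
    """Closed form for the Gaussian: R''/R = x^2/sigma^4 - 1/sigma^2."""
    return -(HBAR ** 2) / (2.0 * m_eff) * (x * x / sigma ** 4 - 1.0 / sigma ** 2) / Q_E


q_num = quantum_potential_ev(r, dx, m_eff)
q_center_num = q_num[499]          # interior index 499 corresponds to xs[500], x close to 0
q_center_exact = q_exact_ev(0.0)
print(q_center_num, q_center_exact)
```

the tighter the confinement (smaller sigma), the larger the quantum potential at the centre of the profile, which is consistent with the role of the quantum correction in narrow fins.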
the transconductance, gm, represents the drain current variation with respect to gate voltage. it is represented by the following equation [4][13]:

$$g_m = \frac{dI_d}{dV_{gs}} \tag{6}$$

ss is a major parameter for calculating the leakage current, and is calculated as [13]:

$$SS = \frac{dV_{gs}}{d\left(\log_{10} I_{ds}\right)} \tag{7}$$

the value of the dibl is [3]:

$$DIBL = \frac{\Delta V_{th}}{\Delta V_{ds}} \tag{8}$$

vth: threshold voltage; vds: drain-source voltage.

4. results

finfet is considered to be a promising candidate for the ultimate cmos device structure because it is robust against sces and has higher current drivability. soi finfets have shown several advantages over bulk finfets. an soi finfet can suppress the leakage current between source and drain through the body below the channel fin, and it has low source/drain-to-substrate capacitance, thereby improving the speed characteristics. in this section, the effect of changing the fin width and the fin height is analyzed and investigated for finfet structures. the different electrical parameters which are derived in this simulation, such as leakage and ss, are compared with other published results [9][12][17] in order to validate our model. the bqp model has better convergence properties in many situations and it can be calibrated against results from the schrödinger-poisson equation under conditions of negligible current flow.

4.1. simulation and analysis

srh theory accounts for the generation and recombination of charge carriers through electron and hole capture and emission states within the energy gap. tcad software was used to simulate the structure and the characteristics of the tg finfet. figure 2 represents the ids-vgs transfer characteristics of the n-channel finfet in linear scale with vds = 0.7 v, which is higher than the threshold voltage. the on current in this simulation is 4 μa when vgs = vdd.
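the figures of merit defined in equations (6) to (8) are usually extracted from sampled i-v curves by finite differences. the sketch below shows one possible way to do this; it is not the extraction procedure used by the authors. the sweep data are synthetic (an ideal exponential subthreshold curve with an 80 mv/dec slope) and the two threshold voltages used for the dibl example are hypothetical; only the on and off currents in the last lines are taken from the values quoted in this section.

```python
import math


def transconductance(vgs, ids):
    """Eq. (6): forward-difference g_m = dI_d/dV_gs between adjacent sweep points (A/V)."""
    return [(ids[i + 1] - ids[i]) / (vgs[i + 1] - vgs[i]) for i in range(len(vgs) - 1)]


def subthreshold_swing_mv_per_dec(vgs, ids):
    """Eq. (7): minimum of dV_gs / d(log10 I_ds) over the sweep, in mV/dec."""
    slopes = [
        (vgs[i + 1] - vgs[i]) / (math.log10(ids[i + 1]) - math.log10(ids[i]))
        for i in range(len(vgs) - 1)
    ]
    return 1e3 * min(slopes)


def dibl_mv_per_v(vth_low_vds, vth_high_vds, vds_low, vds_high):
    """Eq. (8): |delta V_th| / delta V_ds, in mV/V."""
    return 1e3 * abs(vth_high_vds - vth_low_vds) / (vds_high - vds_low)


# Synthetic subthreshold sweep: I = 1e-14 A * 10^(V_gs / 80 mV), for illustration only.
vgs = [0.0, 0.1, 0.2, 0.3]
ids = [1e-14 * 10 ** (v / 0.080) for v in vgs]

ss = subthreshold_swing_mv_per_dec(vgs, ids)
gm = transconductance(vgs, ids)
# Hypothetical threshold voltages at V_ds = 0.05 V and V_ds = 0.75 V.
dibl = dibl_mv_per_v(0.620, 0.605, 0.05, 0.75)
# On/off ratio from the n-channel currents quoted in the text (4 uA and 1e-14 A).
on_off_ratio = 4e-6 / 1e-14

print(ss, dibl, on_off_ratio)
```

on real simulation output the swing is normally evaluated only in the subthreshold region, so the sweep passed to the swing function would be restricted to gate voltages below vth.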
one can observe that the threshold voltage, vth, in this simulation is 0.62 v. the vth of the device is related to the position of the fermi level with respect to the sub-band energy levels. increasing the fin height reduces the carrier quantum confinement, thereby reducing the sub-band energy.

fig. 2 transfer characteristics for n-channel finfet

to include quantum confinement in the computation of the id-vgs characteristics, the bqp model has been used. as a result, the value of the drain current has been corrected. figure 3 represents the transfer characteristics in logarithmic scale. the gate voltage is swept from 0 to 0.75 v. we note that the leakage current is 10^-14 a when vgs = 0 v. the leakage current obtained in this simulation is lower than that obtained by dhananjaya tripathy et al. [26].

fig. 3 transfer characteristics of n-channel finfet in logarithmic scale

figure 4 represents the output characteristics of the n-channel soi finfet. we note that the drain saturation current is 2x10^-6 a at vds = vdd = 0.75 v. we observe that the early effect is more pronounced. we find that an increase in vgs results in higher channel conductivity. the output characteristics have been obtained for vgs = 0.7 v.

fig. 4 output characteristics of n-channel finfet

figure 5 represents the transconductance of the n-channel finfet. the gate voltage is swept from 0 to 0.75 v. we note that the maximum value of the transconductance at vds = 0.7 v is 5x10^-5 a/v.

fig. 5 transconductance characteristics of n-channel finfet

the higher value of the transconductance can be attributed to the higher strain in the short-channel device. the transconductance peak can be reduced by reducing the channel length.
a shorter gate length, lg, provides less resistance and lower surface-roughness scattering, which leads to a higher transconductance and mobility. the apparent mobility gain is induced by quasi-ballistic transport rather than by a true mobility increase. table 3 represents the different performance parameters of the n-channel soi finfet [6][11]:

table 3 performance parameters of n-channel finfet
parameter | value
ion | 4 µa
ioff | 10^-14 a
ion/ioff | 4x10^8
vth | 0.62 v
dibl | 21.05 mv/v
ss | 79.48 mv/dec

the values of ss and dibl indicate a performance comparable with the state of the art obtained with the ptm 10 nm hp nmos and pmos models, for which ss is 102.4 mv/dec and dibl is 212 mv/v. however, these values must still be optimized in order to obtain good device performance [17]. the ss is comparable to the one obtained by ajay kumar et al. [12], and the calculated performance ratio is better than that obtained by buryk et al. [9].

figure 6 represents the transfer characteristics of the p-channel finfet in linear scale with vsd = 0.7 v. we note that the on current in this structure is 6.25x10^-5 a. the on current is measured at vsg = vdd = 0.75 v. the value of the threshold voltage in this simulation is 0.30 v. the performance ratio, ion/ioff, calculated in this simulation is higher than that calculated by a.s. opanasyuk et al. [9].

fig. 6 transfer characteristics of p-channel finfet

figure 7 represents the transfer characteristics of the p-channel finfet in logarithmic scale. we note that the leakage current is 1.58x10^-8 a when vsg = 0 v.

fig. 7 transfer characteristics in logarithmic scale of p-channel finfet

figure 8 represents the output characteristic of the p-channel finfet. we note that the drain saturation current is 5.5x10^-5 a.

fig. 8 output characteristics of p-channel finfet

table 4 represents the different performance parameters:

table 4 performance parameters of p-channel finfet
parameter | value
ion | 5.5x10^-5 a
ioff | 2.51x10^-8 a
ion/ioff | 2.19x10^3
vth | 0.31 v
ss | 133.50 mv/dec

4.2. effect of the fin height

in this section, we investigate the impact of the fin height variation on the id-vgs characteristics. quantum effects are included in the simulation. we think that the calculated optimized values of fin height in n- and p-channel finfets can be used as geometric parameters in the ptm for the design of cmos inverters.

fig. 9 transfer characteristics with different fin heights for n-channel finfets

figure 9 represents the transfer characteristics of n-channel finfets with different fin height values: 9 nm, 11 nm, 13 nm and 15 nm. it is clear that as we increase the fin height, the on current increases from 2.5 μa up to 4 μa. the on current determines the driving capability of the device. the increase of fin height increases the inversion charge density and thereby increases the on current [24]. we note that the leakage current increases with the increase of the fin height because trap-assisted tunneling is more important than direct tunneling. table 5 represents the impact of fin height on the subthreshold swing and the threshold voltage of the simulated device:

table 5 impact of fin height of n-channel finfet
parameter | 9 nm | 11 nm | 13 nm | 15 nm
ss (mv/dec) | 73.86 | 78.57 | 79.76 | 83.75
vth (v) | 0.67 | 0.66 | 0.65 | 0.64

we note that the increase of fin height increases the subthreshold swing and decreases the threshold voltage of the device. the increase of the subthreshold swing is due to the increase in the total capacitance, so we need to minimize the parasitic capacitance in order to reduce the power consumption. the decrease of the threshold voltage vth is due to the decrease of the fermi level. figure 10 represents the impact of fin height on the performance ratio (ion/ioff).
we note that the increase of fin height up to 13 nm decreases the performance ratio of the device. above this height, the performance ratio increases because of the decrease of leakage current [2].

fig. 10 the impact of fin height on the performance ratio of n-channel finfet

the suitable value of fin height for the simulated device is 15 nm because it shows a larger performance ratio, equal to 4x10^9. figure 11 represents the impact of fin height variation on the p-channel finfet transfer characteristics. we note that the on current increases with the increase of fin height. we note that the leakage current increases with the increase of fin height because direct tunneling is more important than trap-assisted tunneling.

fig. 11 impact of fin height on the transfer characteristics of tg p-channel finfet

table 6 displays the impact of fin height on subthreshold swing and vth in the 10 nm p-channel finfet.

table 6 impact of fin height on p-channel finfet
parameter | 9 nm | 11 nm | 13 nm | 15 nm
ss (mv/dec) | 129.62 | 137.50 | 133.33 | 133.50
vth (v) | 0.35 | 0.33 | 0.32 | 0.31

the variation of subthreshold swing is due to the total capacitance of the device. the threshold voltage in the p-channel finfet decreases with increasing fin height. figure 12 illustrates the impact of fin height on the performance ratio of the p-channel finfet. we note that the increase of fin height until hfin = 13 nm decreases the performance ratio, then the performance ratio increases when hfin > 13 nm because of the decrease of the leakage current [2].

fig. 12 impact of fin height on the performance ratio of p-channel finfet

the suitable value of this optimization is 9 nm because it allows a desirable performance ratio.

4.3. effect of fin width

in this section, we investigate the impact of fin width on the performance of the n-channel finfet.

fig. 13 impact of fin width on n-channel finfet

figure 13 shows the transfer characteristics of the n-channel finfet for different values of fin width. the gate voltage is swept from 0 v to 0.75 v. we note that the on current increases from 4x10^-6 a up to 5.25x10^-6 a, then it falls to 3.30x10^-6 a because the strain effect in the channel increases. a large fin width decreases the mobility and the inversion charge and results in a smaller drain current. the leakage current increases from 10^-14 a up to 3.16x10^-12 a when the fin width is 9 nm. when the fin width is greater than 9 nm, the leakage current decreases to 1.58x10^-13 a because direct tunneling is more important than trap-assisted tunneling. table 7 represents different values of subthreshold swing and vth for different fin widths. we note that the subthreshold swing and the threshold voltage increase with increasing fin width. because of the quantum effects along the wfin direction, the channel electrons populate the discrete sub-bands. the vth increases because more gate bias is required to populate electrons into the lowest sub-band, which is significantly above the bottom of the conduction band by evth. it should be underlined that a large fin width enlarges the total gate width; therefore, the gate and depletion capacitances increase and the subthreshold swing increases [16].

table 7 impact of fin width of n-channel finfet
parameter | 7 nm | 8 nm | 9 nm | 10 nm
ss (mv/dec) | 95.31 | 106.89 | 114.54 | 171.05
vth (v) | 0.61 | 0.62 | 0.63 | 0.65

figure 14 illustrates the performance ratio of the n-channel finfet. we note that ion/ioff decreases down to 1.66x10^6 at wfin = 9 nm, then the performance ratio increases up to 4x10^8. the increase of the performance ratio is due to the decrease of leakage current [2].

fig. 14 impact of fin width on the performance of n-channel finfet

transistor dimensions are scaled down in order to improve drive current and circuit speed, and the ratio ion/ioff must exceed 10^6 [2]. the suitable value of fin width for the n-channel finfet is 10 nm because it has the best performance ratio, 4x10^8.

fig. 15 impact of fin width on p-channel finfet

figure 15 represents the impact of fin width on the transfer characteristics of the p-channel finfet. we note that the on current increases and the maximum value is 6.25x10^-6 a. the increase of fin width increases the leakage current in the device after wfin = 9 nm because direct tunneling current is more important than trap-assisted tunneling. figure 16 represents the impact of fin width on the performance ratio of the p-channel finfet. we notice that the performance of the device decreases with increasing fin width up to 9 nm; above this value, it starts to increase [2].

fig. 16 impact of fin width on the performance ratio of p-channel finfet

table 8 represents the different results of subthreshold swing and vth of the p-channel finfet:

table 8 impact of fin width of p-channel finfet
parameter | 7 nm | 8 nm | 9 nm | 10 nm
ss (mv/dec) | 168.75 | 175.01 | 132.01 | 137.25
vth (v) | 0.27 | 0.28 | 0.33 | 0.35

the increase of threshold voltage is due to the presence of the sub-bands. the quantum confinement raises the conduction band edge, ec, to the lower order eigenvalues. this shift has a direct influence on the device threshold voltage because it requires more band bending (potential energy lowering) in order to create the inversion layer [10]. the variation of subthreshold swing is due to the total capacitance of the device [7]. the suitable value of fin width in the p-channel finfet is 7 nm because it has a larger performance ratio of 4x10^2.

5. discussion and conclusion

throughout this study, we have shown that both n- and p-channel finfets have good performance ratios only when short-channel effects are minimized.
we have also shown that the bqp algorithm is a good simulation tool for computing parameters that control the quantum effects. it also allows the calculation of the optimal geometrical parameters for optimal performance of devices that can be implemented in cmos circuits. the results show that in order to have a good threshold voltage, one needs to increase the fin height, which allows the increase of the energy level of the sub-bands. to minimize the sces, the subthreshold swing must be around 60 mv/dec and the total capacitance must be decreased in both devices by using high-k oxides and larger thicknesses. the integration of tri-gate soi finfets provides new opportunities for achieving high performance in cmos technology. this requires the improvement of certain parameters such as leakage currents and the control of the threshold voltages. we have also shown that the device characteristics depend on fin widths and fin heights. as a result, the 10 nm finfet with hafnium dioxide, simulated with quantum confinement, can be considered a promising device for future cmos manufacturing processes. our results can be used as spice parameters for ptm in cmos inverter design.

acknowledgments: the author lazzaz wishes to thank professor ghibaudo of inp grenoble and professor pierpaolo palestri for their very helpful pieces of advice.

references
[1] b. yu, l. chang, s. ahmed, h. wang, s. bell, c. yang, d. kyser, "finfet scaling to 10 nm gate length", in digest of the ieee international electron devices meeting, 2002, pp. 251-254.
[2] y. eng, l. hu, t. chang, s. hsu, c. chiou, t. wang, m. chang, "importance of ∆diblss/(ion/ioff) in evaluating the performance of n-channel bulk finfet devices", ieee electron devices society, vol. 6, pp. 207-213, january 2018.
[3] m. lundstrom, "fundamentals of nano transistors (lessons from nanoscience: a lecture notes)", wspc, 2015, pp. 05-334.
[4] j. baker, "cmos circuit design, layout and simulation, 3rd edition (ieee press series on microelectronic systems)", wiley-ieee press, 2010, pp. 187-188.
[5] a. tsormpatzoglou, "characterization and modeling of nanoscale multi-gate mosfets", ph.d dissertation, institut polytechnique de grenoble, grenoble, france, 2009.
[6] n. collaert, "high mobility materials for cmos applications", woodhead publishing series in electronic and optical materials (1st ed.), woodhead publishing, 2018, pp. 297-298.
[7] y. chauhan, d. lu, s. venugopalan, s. khandelwal, j. duarte, n. paydavosi, c. hu, "finfet modeling for ic simulation and design: using the bsim cmg standard" (1st ed.), academic press, pp. 131-135, 2015.
[8] g. iannaccone, g. curatola, g. fiori, "effective bohm quantum potential for device simulators based on drift-diffusion and energy transport", simulation of semiconductor processes and devices, springer, vienna, pp. 275-278, 2004.
[9] i. buryk, m. ivashchenko, a. golovnia, a. opanasyuk, "numerical simulation of fet transistors based on nanowire and fin technologies", ieee, pp. 257-259, 2020.
[10] w. han, z. wang, "toward quantum finfet", springer, pp. 54-67, 2013.
[11] a. zhang, j. mei, l. zhang, h. he, j. he, m. chan, "numerical study on dual material gate nanowire tunnel field-effect transistor", in proceedings of the 2012 ieee international conference on electron devices and solid state circuit (edssc), 2012, pp. 1-5.
[12] a. kumar, s. saini, a. gupta, n. gupta, m. tripathi, r. chaujar, "sub-10 nm high-k dielectric soi-finfet for high-performance low power applications", in proceedings of the 2020 6th international conference on signal processing and communication (icsc), 2020, pp. 310-314.
[13] n. bourahla, a. bourahla, b. hadri, "comparative performance of the ultra-short channel technology for the dg-finfet characteristics using different high-k dielectric materials", indian journal of physics, pp. 1-8, 2020.
[14] c. hu, "3d finfet and other sub-22 nm transistors", in proceedings of the 19th ieee international symposium on the physical and failure analysis of integrated circuits, 2012, pp. 1-5.
[15] k. kuhn, "cmos scaling for the 22 nm node and beyond: device physics and technology", in proceedings of 2011 ieee international symposium on vlsi technology, systems and applications, pp. 1-2, 2011.
[16] n. boukortt, b. hadri, s. patane, "investigation on tg n-finfet parameters by varying channel doping concentration and gate length", silicon, vol. 9, no. 6, pp. 885-893, 2017.
[17] a. rassekh, m. fathipour, "a single-gate soi nanosheet junctionless transistor at 10-nm gate length: design guidelines and comparison with the conventional soi finfet", journal of computational electronics, vol. 19, no. 2, pp. 631-639, 2020.
[18] o. bonnaud, "physique des solides, des semiconducteurs et dispositifs", université de rennes, vol. 1, p. 78, 2003.
[19] k. ren, y. liang, c. huang, "compact physical models for algan/gan mis-finfet on threshold voltage and saturation current", ieee transactions on electron devices, vol. 65, no 4, pp. 1348-1354, 2018.
[20] c. mohan, s. choudhary, b. prasad, "gate all around fet: an alternative of finfet for future technology nodes", international journal adv. res. sci. eng., vol. 6, no. 7, pp. 563-569, 2017.
[21] a. lazzaz, k. bousbahi, m. ghamnia, "modeling and simulation of dg soi n finfet 10 nm using hafnium oxide", in proceedings of the ieee 21st international conference on nanotechnology (nano 2021), 2021, pp. 177-180.
[22] n. maity, r. maity, s. baishya, "voltage and oxide thickness dependent tunneling current density and tunnel resistivity model: application to high-k material hfo2 based mos devices", superlattices and microstructures, vol. 111, pp. 628-641, 2017.
[23] n. maity, r. maity, r. thapa, s. baishya, "a tunneling current density model for ultra thin hfo2 high-k dielectric material based mos devices", superlattices and microstructures, vol. 95, pp. 24-32, 2016.
[24] n. boukortt, "3-d simulation of nanoscale soi n-finfet at a gate length of 8 nm using atlas silvaco", transaction on electrical and electronic materials, vol. 16, pp. 156-161, june 2015.
[25] n. bourahla, a. bourahla, b. hadri, "comparative performance of the ultra-short channel technology for the dg-finfet characteristics using different high-k dielectric materials", indian journal of physics, vol. 95, no. 10, pp. 1977-1984, 2021.
[26] d. tripathy, d. acharya, p. rout, s. biswal, "influence of oxide thickness variation on analog and rf performances of soi finfet", facta universitatis, series: electronics and energetics, vol. 35, no 1, pp. 001-011, 2022.
[27] a. lazzaz, k. bousbahi, m. ghamnia, "optimized mathematical model of experimental characteristics of 14 nm tg n finfet", micro and nanostructures, p. 207210, 2022.
[28] n. gupta, a. kumar, "assessment of high-k gate stack on sub-10 nm soi-finfet for high-performance analog and rf applications perspective", ecs journal of solid-state science and technology, vol. 9, no. 12, p. 123009, 2020.
[29] a. rahman, j. dacuna, p. nayak, pinakpani, "reliability studies of a 10 nm high-performance and low-power cmos technology featuring 3rd generation finfet and 5th generation hk/mg", in proceedings of the ieee international reliability physics symposium (irps 2018), pp. 6f.4-1-6f.4-6, 2018.
[30] m. aditya, k. rao, srinivasa, k. sravani, "design, simulation and analysis of high-k gate dielectric fin field effect transistor", international journal of nano dimension, vol. 12, no 3, pp. 305-309, 2021.
[31] m. ahmed, s. islam, d. al mamun, "numerical simulation of the electrical characteristics of nanoscale tg n-finfet with the variation of gate dielectric materials", international journal of semiconductor science & technology (ijsst), vol. 11, no. 3, 2021.
[32] s. sinha, g. yeric, v. chandra, b. cline, y. cao, "exploring sub-20 nm finfet design with predictive technology models", in proceedings of the ieee dac design automation conference, 2012, pp. 283-288.

bridging the snmp gap: simple network monitoring the internet of things

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 475-487 doi: 10.2298/fuee1603475s

mihajlo savić
university of banja luka, faculty of electrical engineering, republic of srpska, bosnia and herzegovina

abstract. the things that form the internet of things can vary in every imaginable aspect. from the simplest devices with barely any processing and memory resources, through networking devices like switches and routers that handle communication, to powerful servers that provide the needed back-end resources in cloud environments, all are needed for real-world implementations of the internet of things. monitoring of the network and server parts of the infrastructure is a well-known area with numerous approaches that enable efficient monitoring. the most prevalent technology used is snmp, which forms part of the ip stack and is as such universally supported. on the other hand, the “things” domain is evolving very fast, with a number of competing technologies used for communication and monitoring. when discussing small, constrained devices, the two most promising protocols are coap and mqtt. combined, they cover a wide area of communication needs for resource-constrained devices, from a simple messaging system to one that enables connecting to the restful world. in this paper we present a possible solution to bridge the gap in monitoring by enabling snmp access to monitoring data obtained from constrained devices that cannot feasibly support snmp or are not intended to be used in such a manner.

key words: iot, monitoring, snmp, coap, mqtt

received june 30, 2015; received in revised form november 12, 2015
corresponding author: mihajlo savić, university of banja luka, faculty of electrical engineering, patre 5, 78000 banja luka, republic of srpska, bosnia and herzegovina (e-mail: badaboom@etfbl.net)

1. introduction

internet of things (iot) may mean many different things to many different people, but few would disagree that in order to achieve the full potential of a smart environment based on iot, one needs to be able to fully monitor all of the things that make iot possible. although there is a wealth of monitoring products, as well as a comparable number of standards and platforms that go hand in hand with them, there is one standard that has been around for a long time, is implemented in almost all networking devices and is even a part of the set of protocols that enable modern networking to exist. as is often the case, with age it gained robustness and reliability, but lost some of its appeal to newer generations and younger monitoring systems, though one would be hard pressed to find a monitoring product that does not support it. it is also important to note that monitoring is never easy, and in production, tried and true solutions have proven themselves worthy throughout history.

to monitor the iot we need to monitor any and every device that makes it up or provides services to it, from the smallest and simplest single-function sensors to virtualized back-end services needed to transform raw data into usable information. currently, simple network management protocol (snmp) is the protocol that enables uniform monitoring of all parts of the iot infrastructure, save for the simplest of devices. as even those devices need to be monitored, one of the possible snmp-based solutions for end-to-end monitoring is presented in this paper.
the solution described in this paper covers one possible use of snmp in monitoring iot infrastructures, enabling monitoring of just iot devices as well as of larger heterogeneous infrastructures that can also contain complex iaas entities that provide services to iot devices.

2. simple network management protocol

simple network management protocol (snmp) is a part of the internet protocol suite (ip) set of protocols as defined by the internet engineering task force (ietf) [1], the organization in charge of defining standards and protocols that provide the base for the existence and exchange of data over the internet. snmp defines a set of standards for network management that include an application protocol, a database schema, as well as the definition of data sets. a relatively small number of what we generally consider to be standards are in fact full standards, which only adds weight to snmp and its use in the management and monitoring areas.

snmp is commonly used in a default configuration that consists of at least one computer or other device that has an administrative role (master) and a group of managed networked devices that are controlled by the master device. every managed device (slave) runs a software component called an agent that is in charge of communication with the master node. agents provide access to various system variables of the managed device (e.g. system identification, available resources, resource consumption, etc.) but also provide a mechanism to control the device by setting specified variables to desired values (e.g. bringing network interfaces up or down, changing their addresses, etc.). data transfer is typically done over user datagram protocol (udp), with default port numbers 161 on the agent side and 162 on the master side.
communication can be initiated by the master through the use of get operations for accessing the data and set operations for modifying the data, as well as by the managed device through the use of trap or inform operations that send data to the management node.

2.1. versions of snmp protocol

the snmp standard has so far been defined in three versions, as described in the following text. snmp version 1 (snmpv1) was defined by rfc documents 1155, 1157 and 1215. although it has a “historic” status today, it is still widely used, as it is supported by almost all network equipment manufacturers for nearly all networking devices. its security model leaves a lot to be desired, as it is based on so-called “community” strings that can be seen as shared secrets or access passwords. the biggest issue lies in the fact that all communication, including community strings, is performed in unencrypted form. snmp version 2 (snmpv2) was defined by rfc documents 1441-1452 and introduced a host of improvements in the area of security, by utilizing a more complex security model, and in performance, by the introduction of the getbulk operation. snmp version 3 (snmpv3), as defined by rfc documents 3411-3418, is also known as std0062 and represents the official version of the standard recognized by ietf. older versions of the standard are considered to be historic or obsolete. the main improvement in this version is an advanced security model based on version 2. it is important to note that there is no compatibility between different versions of the snmp protocol, as the message format and the protocol itself were changed. possible scenarios for coexistence between different versions of the snmp protocol are described in rfc 2576 [20].

2.2.
data organization

every network device accessible by the snmp protocol is described by one or more management information bases (mib) – a virtual database representing a hierarchically organized set of information available for a given device. a mib consists of managed objects (mib objects) that are uniquely identifiable in the mib hierarchy by a value called the object identifier (oid). the mib tree has an unnamed root node that branches out into subtrees controlled by the organizations in charge of standards, which are further divided at lower levels of the hierarchy. a mib object consists of at least one instance that can be seen as a variable or variables. there are two types of mib objects: scalars (that define a single instance of the object) and tables (that define multiple linked instances that make up the mib table). one of the aims of the snmp standard is to solve the problem of differing data representations on various platforms, a task that was solved by the use of a subset of iso osi abstract syntax notation one (asn.1) – structure of management information (smi). snmpv1 smi specific data types can be either simple (integer, octet-string, oid) or application-wide (network address, counter, gauge, time tick, opaque, integer, unsigned integer).

2.3. extending the snmp functionality

as was previously described, snmp allows for a flexible approach to the management of networked devices, but is unfortunately limited to the functionality implemented in the agent component. if one desires to access additional data or enable new functionality, there are several approaches, among which the most used are: modification of the agent, use of external programs and use of the agentx protocol. the most efficient, but also the most difficult to implement and least flexible, approach is to modify the agent so that it implements the required functions, which requires access to and modification of the source code of the agent in question.
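as a brief illustration of the data organization in section 2.2, the hierarchical ordering of oids can be sketched in a few lines of python. this is a toy model only: `parse_oid`, `MibView` and `SAMPLE_MIB` are names assumed for this example and do not belong to any snmp library, and the stored values are invented. the four oids themselves (sysDescr, sysUpTime, sysName and ifNumber from mib-ii) are standard. the sketch shows how treating an oid as a tuple of integers yields the lexicographic order on which getnext-style traversal of a mib relies.

```python
from bisect import bisect_right


def parse_oid(text):
    """Turn a dotted OID string such as '1.3.6.1.2.1.1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


def format_oid(oid):
    """Turn a tuple of ints back into dotted notation."""
    return ".".join(str(part) for part in oid)


# Illustrative scalar instances; the values are made up for this example.
SAMPLE_MIB = {
    "1.3.6.1.2.1.1.1.0": "example temperature gateway",  # sysDescr.0
    "1.3.6.1.2.1.1.3.0": 123456,                         # sysUpTime.0
    "1.3.6.1.2.1.1.5.0": "sensor-gw-01",                 # sysName.0
    "1.3.6.1.2.1.2.1.0": 2,                              # ifNumber.0
}


class MibView:
    """A toy read-only MIB: OIDs kept in lexicographic order."""

    def __init__(self, entries):
        self._data = {parse_oid(key): value for key, value in entries.items()}
        # Python compares tuples element by element, which matches OID ordering.
        self._oids = sorted(self._data)

    def get(self, oid):
        """Return the value stored at exactly this OID, or None."""
        return self._data.get(parse_oid(oid))

    def get_next(self, oid):
        """Return (oid, value) of the first instance strictly after 'oid', or None."""
        index = bisect_right(self._oids, parse_oid(oid))
        if index == len(self._oids):
            return None
        successor = self._oids[index]
        return format_oid(successor), self._data[successor]

    def walk(self, root):
        """Collect every instance below 'root' by repeated get_next calls."""
        prefix = parse_oid(root)
        found = []
        current = root
        while True:
            step = self.get_next(current)
            if step is None or parse_oid(step[0])[:len(prefix)] != prefix:
                return found
            found.append(step)
            current = step[0]
```

for instance, `MibView(SAMPLE_MIB).walk("1.3.6.1.2.1.1")` visits the three system group instances and stops before ifNumber, because that oid lies outside the requested subtree.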
if it is impossible or infeasible to modify the agent, or if there is a need for several agents on the same device, a solution can be obtained by the use of snmp proxy software. the use of a proxy increases the complexity of the system, as the introduction of an additional layer in the architecture also requires full support for all relevant requirements on this layer (e.g. the proxy layer becomes a key component in the security aspect). an alternative solution is the use of external programs for access to the required data. the simplest solution is execution of the external program every time the need for specific data arises. this approach can severely degrade performance, as the program could be executed during any snmp operation. a better solution is parallel execution of both the agent and the external program, providing the means for communication between them. as this problem has been present since the early days of snmp, in parallel to the development of various ad-hoc solutions, a process for a standardized solution of the problem was created. the result of this process is the agentx protocol [2], which is based on the master-slave principle within one or more devices. this protocol is a continuation of the snmp-smux and snmp-dpi protocols, which were relegated to historic and experimental statuses. in 1995 ietf formed the snmp agent extensibility working group [3], which defined an extension framework [2] and the corresponding mib document [4]. these documents define the protocol, the master agent, sub-agents, the coding of all required data types, as well as the handling of all communication between parties.

3. internet of things and monitoring

when talking about iot and monitoring, there are two major protocols that cannot be overlooked: coap (constrained application protocol) and mqtt (message queuing telemetry transport).
as per rfc 7252 that defines it, coap “is a specialized web transfer protocol for use with constrained nodes and constrained (e.g., low-power, lossy) networks” aimed at m2m (machine to machine) applications, and is intended to be usable on devices with very limited processor, memory and networking resources [5]. it is udp based and employs an adapted subset of http optimized for m2m use cases, offering features not present in http but highly valuable in m2m environments, such as discovery, multicast support and asynchronous message exchanges [5]. it was specifically designed to require very little processing power in normal operation. from the request perspective, coap messages are very similar to http request methods, but are limited to get, post, put and delete messages that implement the corresponding http method functions. the core (constrained restful environments) link format, as described by rfc 6690 [6], defines a well-known entry point ("/.well-known/core") that enables a client to list the links hosted by the server and as such can be used for discovery, resource collection, resource directory and similar needs. there is ongoing work on implementing coap on alternative transports such as tcp, p2p, websockets, zigbee and other network protocols that would enable wider use of coap in iot scenarios. mqtt as defined by oasis [7] is a lightweight, open and simple client-server oriented publish/subscribe messaging transport protocol. like coap, it is aimed at use in m2m applications and resource constrained devices. it runs over tcp/ip or other network protocols that provide ordered, lossless and bi-directional connections (for example the zigbee protocol [8]). there is a special version of mqtt aimed at sensor networks, named mqtt-sn, that enables the use of mqtt in very unreliable networking conditions by severely resource constrained devices via mqtt-sn forwarders and mqtt-sn gateways [9].
mqtt utilizes the publish-subscribe pattern, in which clients, here referred to as publishers, connect to servers (messaging brokers) and are able to send messages to selected topics with no need to specify the exact recipient of the message, in this context called the subscriber. messages are filtered by their attributes, chief of which is called the topic and is represented by a utf-8 string. topics can have a hierarchical organization in which different levels are separated by a forward slash. an example of such a topic is “building1/room007/rack02/server27/temperature”. as messaging is asynchronous, topics can exist even with no currently connected publishers or subscribers, which enables use in unreliable environments, as individual nodes can connect and disconnect as the need arises. this allows for considerable flexibility, as a subscriber can precisely choose to listen only to messages in topics related to, for example, a certain room or building, or to listen to all messages related to temperature data in all rooms or buildings. but iot does not consist of constrained devices only. fundamental to the proper functioning of any iot infrastructure is also the proper functioning of the interconnecting network as well as, most often, the proper functioning of back-end services running on any kind of server device. further complicating things is the fact that both the networking and the service components of modern architectures can be virtualized. this represents a problem especially for monitoring of performance, as the nms (network management system) traditionally has access only to monitoring data inside the virtualized environment, while performance data of the actual physical device running the virtualization software is available only to the infrastructure provider. when discussing the networking component, outside of possible specialized hardware, for example mqtt-sn forwarders and similar, almost all networking devices support snmp for monitoring.
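returning briefly to the hierarchical mqtt topics described above, the way a subscriber selects "a certain room" or "all temperature data" can be sketched as follows. the sketch assumes the two wildcard characters of the mqtt 3.1.1 specification [7], which the text above does not name: `+` for exactly one topic level and `#` for any number of trailing levels. the function name `topic_matches` is hypothetical, and special handling of topics that begin with `$` is deliberately left out.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic name is selected by a subscription filter.

    '+' matches exactly one level, '#' matches the parent and any number of
    child levels and is only valid as the last level of the filter.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for position, wanted in enumerate(filter_levels):
        if wanted == "#":
            # Valid only in the final position; then everything below matches.
            return position == len(filter_levels) - 1
        if position >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if wanted != "+" and wanted != topic_levels[position]:
            return False

    # Without a trailing '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


TOPIC = "building1/room007/rack02/server27/temperature"

# Everything published for one room.
room_filter = "building1/room007/#"
# Temperature readings from every server in every rack, room and building.
temperature_filter = "+/+/+/+/temperature"
```

with these filters, a subscriber interested in a single room and a subscriber interested in temperature across the whole installation both receive the example message, while a filter such as `building2/#` does not select it.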
devices that do not support snmp are usually unmanageable devices that provide no means for remote monitoring and are as such not suitable for use in the described circumstances. virtualized servers running back-end services are under the control of the infrastructure user and can be easily configured to support snmp monitoring if this is not already the case. as mentioned earlier, the real problem lies in the fact that the virtual machine that contains the service has no access to the non-virtualized performance data of the physical host. the following example illustrates the issue. let us assume that the server running our hourly data collection service is spending a disproportionately large percentage of time waiting for the database server to complete processing of new records. in a non-virtualized situation we could monitor the processes in the system and see that, for example, we are waiting for the storage system to complete writing to disk because another process, archiving of previous data in this example, is consuming the resource at the same time. this would give us enough information to solve the issue by rescheduling the offending process or decreasing its priority in order to ensure that data collection is completed properly. but in a virtualized environment, if another virtual machine is consuming the resources, we have no way of knowing that this is happening, as all the performance data suggests that nothing is consuming the resources, yet they are unavailable to our service. this is but one example that illustrates how any of the limited resources on the physical host (processor, memory, networking, storage, etc.) can be temporarily unavailable without us having any means to determine whether the issue lies with our code or with the wider environment. the fact that virtual machines can be migrated, without shutting down, from one host to another with different resources available further complicates the monitoring aspect of the back-end.

3.1.
use of snmp for monitoring the internet of things

we can divide the devices we want to monitor into three categories depending on their support for snmp. the first category consists of devices that support snmp and provide the needed monitoring data. the second category includes devices that support snmp but do not provide the needed data directly, while the third category is made up of devices that do not support snmp. for our needs, the second and third categories are essentially the same, as there is no simple way for our monitoring system to directly access the required data, whatever it may be. the first group mostly consists of devices providing network connectivity, as they were usually designed to be remotely managed and monitored by snmp. there is very little for us to do here, barring the cases where the supported version of snmp does not provide sufficient security (versions 1 and 2) or there are other reachability issues (vpn, nat, etc.). most of these issues can be solved by using snmp proxy services or other similar techniques. physical infrastructure in a cloud environment can also be in this category, provided that we are self-hosting the operation or have specific arrangements with the hosting provider. there are four principal ways to gather data from devices that do not provide it in a form suitable for monitoring:
1. devices that support messaging or event notifications allow us to subscribe to relevant topics and queues or implement listeners and receive the data as it is generated by the device. this is the best approach, as all the data is current and the required resources are minimal, but it is limited by the support offered by the monitored device.
2. polling at predefined intervals is simple and robust, and enables us to estimate the needed resources in advance. the downsides are possible monitoring of devices whose data is not required, the risk of stale data, or higher resource consumption if polling more frequently.
3. proxying data collection as requests are made.
this provides for minimal resource usage, as we are collecting only the data that is needed when it is needed, but it introduces an unknown response delay in the system, as we have to wait for all required devices to respond; makes estimates of resource usage difficult, as we are dealing with what are, for us, random requests (an example would be frequent monitoring of a slow responding device by a large number of clients); and makes aggregate data calculations almost impossible.
4. proxying with caching extends the previous approach by introducing a proxy level cache that can reduce system load at the price of not returning current data to all requests and of significantly increasing the complexity of the system.

the described approaches can be combined in a number of ways to create a hybrid solution tailored to one’s specific needs, again at the price of increasing the already significant level of complexity. as lindholm-ventola and silverajan have shown in [10], monitoring of constrained devices using coap can be done by using a coap to snmp proxy, with or without a database component, in principle corresponding to the third and fourth approaches described above. in their work they conclude that further research is needed on the implementation of notifications in iot monitoring systems. of the four described ways to monitor devices in an iot environment, only the first approach provides for meaningful handling and generation of notifications. the remaining three approaches will either introduce a possibly significant delay, in the case of polling, or might completely miss the event if there were no requests to monitor the device. if a device supports messaging or can generate snmp notifications, we can process and respond to an event with minimal delay.

3.2.
mqtt-snmp bridge

in order to enable snmp monitoring of mqtt and mqtt-sn devices, we need to implement a system that would listen to messages generated by monitored devices, send requests to monitored devices if needed, and transform the collected data into a form suitable for serving to snmp clients. although it is possible to serve standalone snmp clients, setups like this are most often a part of a larger monitoring infrastructure where the snmp clients are in fact nmss (network management systems). the architecture of such an iot-snmp bridge system is presented in figure 1. the system consists of: monitored devices supporting either mqtt or, in the case of severely constrained devices, the mqtt-sn protocol; mqtt-sn gateways and forwarders; an mqtt broker; an iot-snmp collector and server; and a varying number of snmp clients. mqtt-sn forwarders and gateways exist in configurations where there is a need to monitor mqtt-sn devices. mqtt-sn gateways can be, and usually are, a part of the mqtt broker. the broker itself should be chosen to be a polyglot type broker, enabling simple use of different messaging protocols by other endpoints in the system. the choice of a suitable broker would also simplify the infrastructure of the complete system that will be described later in the text. when it comes to collecting the data and serving snmp clients, it is possible to create a monolithic system where both functions would be centralized, but by separating the collector and the server we can easily scale the system or introduce additional load balancing and fault tolerance by employing multiple instances of the needed service. the broker infrastructure can also be made scalable and/or fault tolerant by employing a suitable broker like apache activemq [11], which can function both in a classic clustered environment and in a so-called network of brokers that enables distributed queues and topics across a number of brokers.

fig.
1 overview of iot-snmp bridge

3.3. snmp monitoring of iaas

development of the monitoring component for iaas in this paper is a continuation of work performed in the areas of grid computing and monitoring of distributed services, started in the see-grid-sci project [12], that resulted in the bbmgridsnmp system [13], and is heavily influenced by the solutions implemented there. the architecture of the cloudsnmp system is given in figure 2. data is collected from various iaas endpoints by listening to messages generated by endpoints and sent through the queue server (broker), by listening to snmp notifications and performing snmp monitoring of physical devices that are a part of the infrastructure, as well as by accessing the needed information through the iaas api specific to a given iaas implementation or through generalized and standardized interfaces like the ones produced by dmtf cmwg (distributed management task force cloud management working group) [14], etsi (european telecommunications standards institute) [15], oasis camp tc (organization for the advancement of structured information standards cloud application management for platforms technical committee) [16] and ogf occi (open grid forum open cloud computing interface) [17]. the depicted queue server also supports at least one of the jms (java message service) [18] or amqp (advanced message queuing protocol) [19] protocols.

fig. 2 architecture of cloudsnmp (iaas-snmp bridge)

the overall architecture mirrors the one used in collecting and processing the data from constrained devices, enabling unification of many of the components in this complex infrastructure. for example, it is possible to use the same brokers connected in a load balancing and fault tolerant architecture to handle messages from both constrained devices and iaas service endpoints. this also enables sharing of code between the iot-snmp and cloudsnmp collector and server components and further modularization of the code.
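the division of work between a collector and a server, which both bridges share, can be sketched in python as a small in-memory table. this is an illustration under stated assumptions and not the implementation described in the paper: the class name `BridgeTable`, the three-column layout and the subtree `1.3.6.1.4.1.99999.1.1` are hypothetical (99999 is only a placeholder for a private enterprise number), and the broker connection itself is omitted. the collector side would call `on_message` for every message received from the broker, while the server side would answer snmp requests from `snapshot`.

```python
import time

# Hypothetical location of the table inside a private enterprise subtree.
BASE_OID = (1, 3, 6, 1, 4, 1, 99999, 1, 1)

COLUMN_TOPIC = 1      # topic the reading was published on
COLUMN_VALUE = 2      # last payload received on that topic
COLUMN_UPDATED = 3    # time of the last update, in seconds


class BridgeTable:
    """Shared state between a message collector and an SNMP-facing server."""

    def __init__(self, clock=time.time):
        self._rows = {}     # topic -> row index (assigned on first sight)
        self._cells = {}    # (column, row index) -> value
        self._clock = clock

    def on_message(self, topic, payload):
        """Collector side: record the latest payload seen on a topic."""
        index = self._rows.setdefault(topic, len(self._rows) + 1)
        self._cells[(COLUMN_TOPIC, index)] = topic
        self._cells[(COLUMN_VALUE, index)] = payload
        self._cells[(COLUMN_UPDATED, index)] = int(self._clock())

    def snapshot(self):
        """Server side: return {oid tuple: value} in column-then-row order."""
        return {BASE_OID + cell: value
                for cell, value in sorted(self._cells.items())}
```

because the table is the only point of contact between the two roles, the same structure could be fed from iaas endpoint messages instead of sensor messages, which is the kind of code sharing the paragraph above refers to.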
there are two principal users of the served data: the operator and the client. operator access should allow for full access to real monitored data and should provide any information of interest to the operator. this can be achieved by designing and implementing a custom mib that contains tables where rows represent monitored resources and enable the operator to easily access summary data for any required parameter. as the monitoring is already done, at least in part, by using snmp, there are existing snmp servers with already configured access rules, thus the simplest solution is to extend their functionality by using the agentx protocol. client access has various restrictions imposed and enables the client to access only the data relevant for a specific virtual machine, or a set of individual virtual machines. this requirement mandates either the use of many instances of the snmp server, one for every monitored virtual machine, or some other mechanism that would allow for efficient access to the monitoring data. in order to make it possible for a client of the iaas infrastructure to access the data of the physical server hosting the monitored virtual machine, we employ snmp contexts. in simple terms, snmp contexts allow multiple instances of a data structure to be created, be it a full tree or some subset, with the right instance served to each client. in our use, this enables one snmp server to perform the function of several servers, one for each context, without unnecessary duplication of resources. as the cloudsnmp server has the data from all physical virtualization servers in the infrastructure, by connecting a certain context value to a unique virtual machine, the client can be served data from the correct virtualization server even after migration to another server has taken place.
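the role of contexts in following a virtual machine across hosts can be sketched as a two-step lookup, from context to virtual machine and from virtual machine to its current host. the python fragment below is a simplified model and not the cloudsnmp code: `ContextRouter`, its method names, and the host, machine and context names used with it are all hypothetical. the only standard element is the oid of hrMemorySize.0 from the host resources mib.

```python
# hrMemorySize.0 from the Host Resources MIB (value reported in KBytes).
HR_MEMORY_SIZE = "1.3.6.1.2.1.25.2.2.0"


class ContextRouter:
    """Serve physical-host data to clients, selected by SNMP context name."""

    def __init__(self):
        self._context_to_vm = {}   # context name -> virtual machine id
        self._vm_to_host = {}      # virtual machine id -> current physical host
        self._host_data = {}       # physical host -> {oid: value}

    def bind_context(self, context, vm):
        """Tie a context name to exactly one virtual machine."""
        self._context_to_vm[context] = vm

    def place_vm(self, vm, host):
        """Record where a virtual machine runs; call again after a migration."""
        self._vm_to_host[vm] = host

    def update_host(self, host, data):
        """Store the latest data collected from a physical host."""
        self._host_data.setdefault(host, {}).update(data)

    def lookup(self, context, oid):
        """Return the value for 'oid' on the host currently running the VM.

        Raises KeyError for an unknown context, so one client cannot read
        data belonging to a context it was not given.
        """
        vm = self._context_to_vm[context]
        host = self._vm_to_host[vm]
        return self._host_data[host].get(oid)
```

in this model a migration is a single call to `place_vm`; the client keeps using the same context name and starts receiving data from the new host without any change on its side.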
the example in figure 3 presents data propagation for an snmp sub-tree providing processor, memory and basic storage statistics from the physical device to the cloudsnmp server. the data is served to the infrastructure operator as an extension of existing snmp data by utilizing the agentx protocol, and to the client by a custom snmp server that masks and transforms the data prior to replying to the client request. depending on the requirements of the system, it is possible to serve different versions of the data to clients, both to ensure that we are serving only the data that needs to be served and to avoid sudden changes in the configuration of the monitored device after migration. for example, it is possible to provide the following levels of data masking:
1. no masking – served data is identical to the data gathered from the physical server. this enables the best performance monitoring by the client, but also provides deep insight into the actual configuration of the infrastructure and can cause trouble for monitoring software, as it is possible for a server to suddenly gain or lose cpu cores, ram or networking interfaces.
2. normalize to virtual machine resource – data will be normalized to the maximum resources that can be occupied by the monitored virtual machine. for example, if the virtual machine can utilize up to 8 cpu cores and the server has 16 cpu cores, served data will be scaled to 8 cpu cores, even after migration to a different server with 64 cpu cores. this both limits the amount of information we are publishing to the client and provides consistent measurements, as the maximum values remain the same. an issue arises from the fact that it is now possible to serve data that conflicts with data recorded within the virtual machine.
3. normalize to fixed value – any resource is to be normalized to a predefined fixed value and be seen as a proportion of the resource currently utilized.
this hides almost all information from end users while still providing for limited performance monitoring and troubleshooting.

fig. 3 data propagation in cloudsnmp to operator and client

3.4. overview of security aspects

while iot promises a wealth of possibilities in the future, there are also some worrisome aspects that cannot go unmentioned, security being the chief one. due to the pervasive nature of iot and its access to sensitive information, any compromise can have potentially grave consequences. when discussing the security of the described system, we can divide it into several possible attack surfaces: snmp based components, messaging components and iot components. when discussing the security of snmp it is important to distinguish between different versions. versions 1 and 2c are prone to packet sniffing and other general attacks applicable to unencrypted communication. the only non-obsolete version of the protocol is version 3, which employs standard cryptographic features. due to the limits imposed by the stateless nature of the protocol, it can be attacked by brute force and dictionary attacks. the modular architecture of snmp enables the use of tls [21] and dtls [22] within the transport subsystem [23]. proper configuration and utilization are of paramount importance in order to provide a secure operating environment. messaging components allow for the use of complex authentication and authorization mechanisms as well as the use of encryption. while this component and its security analysis lie outside the scope of this paper, it is worth noting that there have been a number of security vulnerabilities in various widely used ssl/tls libraries in the past few years, affecting systems ranging from simple embedded solutions to mobile devices and dedicated servers [24][25][26][27].
discussing security models of iot is complicated by the nature of iot and the fact that it covers everything from simple sensors to connected cars and vast industrial infrastructures. examples of security issues range from vulnerabilities in the widely used zigbee protocol [28] to vulnerabilities present in connected cars [29]. a concise overview is given by sadeghi, wachsmann and waidner in [30]. one of the benefits of the described monitoring system is the possibility to provide effective monitoring to users of the infrastructure while limiting possible attack surfaces to the exposed monitoring servers. it is also worth noting that this approach enables the system to function as a proxy that exposes secure snmp version 3 to the outside world even though the monitored devices might support only insecure versions of the protocol. in stark contrast to resource constrained devices, these servers can possess ample hardware and software resources and are much better equipped to handle possible attacks, possibly through detection in cooperation with an ids (intrusion detection system) or mitigation when coupled with an ips (intrusion prevention system).

3.5. integration with existing systems

although it is possible to use specialized systems to gather, analyze and present monitoring data related to iot, most organizations already use some form of nms (network management system) that can be used for both management and monitoring of the infrastructure. there exists a vast variety of monitoring systems, running on different platforms and utilizing different architectures, operational procedures and data collection methods. some representatives of popular nmss are nagios [31], zenoss [32], zabbix [33] and opennms [34]. one example of using zenoss in iot monitoring was given in [35].
mazhelis et al. have analyzed the possibilities of using the coap protocol for monitoring of iot infrastructure, as well as of adapting the existing accounting and monitoring of authentication and authorization infrastructure services (amaais) project [37] for such use [36]. although nms products can differ significantly from each other, practically all of them support at least data gathering via snmp. this enables the previously described system to extend the reach of general purpose network monitoring systems to the iot part of the infrastructure. depending on the exact purpose and system configuration, it is possible to serve either raw collected data or data derived through previously defined transformations. this can also be used to mitigate or solve some of the privacy aspects of possibly sensitive data, as the said data can be thoroughly filtered and modified to provide anonymization and/or aggregation. one example of a complex monitoring system in a heterogeneous and distributed computing infrastructure such as see-grid [12] was described in [13]. developing software for systems as diverse as iot infrastructures can be a daunting task. the sheer diversity of available devices and implementations makes for a very dynamic environment, often difficult to set up for testing purposes. while developing the system, the contiki [38] based cooja network simulator [39] can be used in place of physical devices. for testing mqtt and coap, as well as for stress testing the system, a simple load generator was developed in java utilizing the californium coap framework [40] and fusesource mqtt libraries [41]. a proof of concept snmp server was first created in java using the jax toolkit [42] utilizing the agentx protocol, but was rewritten in the python programming language [43] utilizing the pysnmp library [44].

4.
conclusion

in this paper we presented one solution for end-to-end monitoring of iot devices, including severely constrained devices such as sensors, iaas installations, as well as the networking infrastructure that connects them together. on the constrained devices end of the spectrum, the use of coap and mqtt was covered, while the networking infrastructure natively supports snmp, and four approaches to iaas and virtualization equipment data gathering were presented. integration into existing network management and monitoring systems enables a simpler transition to full utilization of iot infrastructures in practice. the often neglected aspect of harmonization of operational procedures in different domains can be significantly simplified by enabling a uniform view and/or control interface for the whole infrastructure. by limiting exposed attack surfaces to monitoring servers that are simpler to manage and secure, the security of the complete system can be increased, while some of the privacy aspects of data gathering are also alleviated through the use of data transformation and anonymization prior to serving. the described solution provides for non-blocking asynchronous data collection and for scalable and fault tolerant data processing and serving but, most importantly, it provides a uniform standards based interface needed for reliable monitoring.

references
[1] “rfc 2571 an architecture for describing snmp management frameworks.” [online]. available: https://tools.ietf.org/html/rfc2571.
[2] “rfc 2741 agent extensibility (agentx) protocol version 1.” [online]. available: https://tools.ietf.org/html/rfc2741.
[3] “agent extensibility working group (agentx).” [online]. available: http://www.ietf.org/html.charters/agentx-charter.html.
[4] “rfc 2742 definitions of managed objects for extensible snmp agents.” [online]. available: https://tools.ietf.org/html/rfc2742.
[5] “rfc 7252 the constrained application protocol (coap).” [online]. available: https://tools.ietf.org/html/rfc7252.
[6] “rfc 6690 constrained restful environments (core) link format.” [online]. available: https://tools.ietf.org/html/rfc6690.
[7] “mqtt version 3.1.1.” [online]. available: http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/mqtt-v3.1.1.html.
[8] zigbee alliance. zigbee specification. technical report document 053474r06, version 1.0, 2005.
[9] a. stanford-clark and h. linh truong, mqtt for sensor networks (mqtt-sn) protocol specification. ibm, http://mqtt.org/new/wp-content/uploads/2009/06/mqtt-sn_spec_v1.2.pdf.
[10] lindholm-ventola, hanna; silverajan bilhanan, “coap-snmp interworking iot scenarios,” tampere university of technology, department of pervasive computing. report 3, tampere, 2013.
[11] “apache activemq.” [online]. available: http://activemq.apache.org/.
[12] a. balaž, o. prnjat, d. vudragović, v. slavnić, i. liabotis, e. atanassov, b. jakimovski, m. savić, “development of grid e-infrastructure in south-eastern europe,” j of grid comp, vol. 9, no. 2, pp. 135-154, 2011.
[13] m. savic, s. gajin, m. bozic, “snmp based grid infrastructure monitoring system,” in proceedings of the 34th international convention mipro, 2011, pp. 231-235.
[14] d. davis, g. pilz, “cloud infrastructure management interface (cimi) model and restful http-based protocol,” technical report, distributed management task force (dmtf), 2012.
[15] “etsi ict standards, gsm, tetra, nfv, gprs, 3gpp, its, umts, utran, m2m.” [online]. available: http://www.etsi.org/standards.
[16] “oasis cloud application management for platforms (camp) technical committee | charter.” [online]. available: https://www.oasis-open.org/committees/camp/charter.php.
[17] “open cloud computing interface” [online]. available: http://occi-wg.org/.
[18] m. hapner, r. burridge, r. sharma, j. fialli, and k. stout, “java message service,” sun microsystems inc., santa clara, ca, 2002.
[19] “advanced message queuing protocol website” [online].
available at http://www.amqp.org/. [20] “rfc 2576 coexistence between version 1, version 2, and version 3 of the internet-standard network management framework.” [online]. available: https://tools.ietf.org/html/rfc2576. [21] “the transport layer security (tls) protocol version 1.2” [online]. available: https://tools.ietf. org/html/rfc5246. [22] “datagram transport layer security” [online]. available: https://tools.ietf.org/html/rfc4347. [23] “transport layer security (tls) transport model for the simple network management protocol (snmp)” [online]. available: https://tools.ietf.org/html/rfc5953. [24] “cve-2014-1266” [online]. available: https://web.nvd.nist.gov/view/vuln/detail?vulnid=cve-2014-1266. [25] “cve-2015-0282” [online]. available: https://web.nvd.nist.gov/view/vuln/detail?vulnid=cve-2015-0282. [26] “cve-2014-0160” [online]. available: https://web.nvd.nist.gov/view/vuln/detail?vulnid=cve-2014-0160. [27] “microsoft security advisory 3046015.” [online]. available: https://technet.microsoft.com/en-us/library/ security/3046015. [28] “zigbee exploited – the good, the bad and the ugly” [online]. available: http://cognosec.com/zigbee_ exploited_8f_ca9.pdf [29] s. kamkar, “drive it like you hacked it”[online]. available: http://samy.pl/defcon2015/2015-defcon.pdf [30] a.-r. sadeghi, c. wachsmann, and m. waidner, “security and privacy challenges in industrial internet of things”, in proceedings of the 52nd annual design automation conference, 2015, p. 54. [31] “nagios core. nagios open source project.,” nagios. [online]. available: https://www.nagios.org/. [32] “zenoss,” zenoss. [online]. available: http://www.zenoss.com/. [33] “zabbix: the enterprise-class open source network monitoring solution.” [online]. available: http://www.zabbix.com/. [34] “the opennms project.” [online]. available: http://www.opennms.org/. [35] u. gupta, “monitoring in iot enabled devices,” arxiv preprint arxiv:1507.03780, 2015. [36] o. mazhelis, m. waldburger, g. s. machado, b. 
stiller, and p. tyrväinen, “extending monitoring and accounting infrastructure towards constrained devices in internet-of-things applications”, technical paper, university of zurich, 2013. available: https://www.merlin.uzh.ch/contributiondocument/download/5076 [37] b. stiller, “accounting and monitoring of aai services.” switch journal, 2010(2):12–13,october 2010. [38] a. dunkels, b. grönvall, and t. voigt, “contiki-a lightweight and flexible operating system for tiny networked sensors,” in proceedings of the 29th annual ieee international conference on local computer networks, 2004, pp. 455-462. [39] f. osterlind, a. dunkels, j. eriksson, n. finne, and t. voigt, “cross-level sensor network simulation with cooja”, in proceedings of the 31st ieee conference on local computer networks, 2006, pp. 641-648. [40] “californium (cf) coap framework java coap implementation.” [online]. available: http://people.inf. ethz.ch/mkovatsc/californium.php. [41] “fusesource mqtt libraries.” [online]. available: https://github.com/fusesource/mqtt-client. [42] “jasmin: jax java agentx client toolkit.” [online]. available: https://www.ibr.cs.tu-bs.de/projects/ jasmin/jax.html. [43] g. vanrossum and f. l. drake, the python language reference. python software foundation, 2010. [44] “snmp library for python.” [online]. available: http://pysnmp.sourceforge.net/. 10904 facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 121-131 https://doi.org/10.2298/fuee2301121b © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper performance of wearable circularly polarized antenna on different high frequency substrates for dual-band wireless applications rama s. r. basupalli1, naresh k. darimireddy2, rajasekhar. nalanagula3, sujatha. mandala4 1bvrit (a), medak, telangana, india 2,3,4lendi institute of engineering & technology, ap, india abstract. 
this paper investigates the effect of different substrate dielectric constants on a microstrip patch antenna deployed on jeans textile for military wireless applications. initially, the structure is designed with double l-shaped slits inserted on both sides of the patch on an fr4 substrate with a dielectric constant of 4.4. the antenna dimensions are 40 × 25 mm², which is small compared to the wavelength (λ) at the desired operating frequency. the performance of the proposed antenna, in terms of simulated parameters such as gain in dbi, return loss (s11), directivity, and radiation efficiency, is evaluated with the cst mw em simulator. however, a design that works on fr4 may not be as reliable when it is realized on a jeans substrate. besides, all the above parameters extracted from the simulator should hold a low value to implement a high-performance wearable antenna. the paper's outcome shows the importance of the simulations and measurements undertaken for the proposed antenna, assuming the dielectric constants of both fr4 and jeans cloth material (with εr of 1.7). the main contribution is an antenna that resonates at 3.17 ghz with circular polarization and at 5.04 ghz with linear polarization. the antenna prototype is described, and its performance is validated using measurements. the proposed structure also improves the 10-db impedance bandwidth, with an average gain of 5 dbi.

key words: jean’s dielectric, fr4 substrate, sar, textile antenna, dual-band, circular polarization

1. introduction

wearable antennas are becoming extremely popular due to their potential in multiple applications, covering users such as military soldiers, firefighters, and paramedics.
establishing a communication link from the wearable antenna to the base station camp would relieve soldiers of the heavy telecommunication equipment they otherwise have to carry along with them [1]-[2].

received july 09, 2022; revised august 17, 2022, september 06, 2022 and october 09, 2022; accepted october 12, 2022
corresponding author: naresh k. darimireddy, lendi institute of engineering & technology, ap, india, e-mail: darn0005@uqar.ca

several current and upcoming wireless modules either require, or can be realized with, one or more antennas embedded in a piece of textile or cloth or integrated directly into personal accessories such as shoes, glasses, buttons, and helmets [3]. the specific operating and thermal conditions under which textile antennas are used impose requirements that should be included in the performance criteria of antenna experiments. to be developed as wearable, an antenna must combine a suitable choice of materials, both for the conductive and non-conductive parts, with a standard antenna configuration. the most straightforward approach is to combine or knit conductive yarns into a portion of clothing [4]-[5]. for a given wearable antenna, the wearer's placement, stance, and movements detrimentally impact the input impedance and radiation pattern. wearable antennas are usually broadband to compensate for such casual variations [6]. several techniques have been applied to design single and multiband wearable antennas over the past years; a few include slit [7] or u or l-shaped slot-loaded configurations [8], ebg structures [9], and monopole/planar antennas [10]. although these wearable antennas exhibit single/dual-band characteristics, they exhibit a limited bandwidth at the higher resonant modes. furuya et al.
[11] proposed a wideband wearable antenna for digital tv reception in the frequency range of 470-700 mhz. however, the antenna radiation efficiency needs to be improved while meeting the required -10 db return loss and 3-db axial ratio bandwidth. earlier, several slits/slots were loaded along the boundaries of various patch configurations [12]-[13] fabricated on non-flexible substrates to generate orthogonal modes for circularly polarized (cp) radiation by adequately positioning the feed point. owing to the interest in obtaining dual bands and the circular polarization feature, the proposed antenna is designed to resonate at frequencies covering military wireless applications.

this article designs a novel cp antenna with double l-shaped slits. the dual l-shaped slits are placed on either side of the conductive surface of the textile. slits on either side of the wearable patch antenna provide bandwidth variation. furthermore, the design obtains circular polarization by exciting two orthogonally polarized modes, tm01 and tm10, through the exact placement of the feed point.

2. wearable antenna design and simulations

the propagation and loss properties of the candidate material at the desired frequency band(s) must be known before antenna design and fabrication. in addition, permittivity and loss tangent have to be characterized in order to choose a textile dielectric material as a substrate among different constructions and thicknesses. effective permittivity and loss tangent can be extracted using the formula based on the resonant frequency [14]-[15].

2.1. proposed antenna structure

the structure of the proposed antenna, with dimensions ls × ws, is shown in figure 1 together with the placement of the narrow dual l-shaped slits. a modified ground plane for the improvement of characteristics is also shown in figure 1. by optimizing the dimensions of the wearable patch, an improvement in impedance bandwidth and in the circular polarization feature is observed.
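the permittivity extraction mentioned at the start of section 2 can be sketched as follows. this is only a minimal sketch under stated assumptions: it uses the textbook transmission-line model of a rectangular patch, which may differ from the formula actually used in [14]-[15]; the function names, the choice of wp = 25 mm as the patch width, the 1 mm thickness, and the 30 mm effective length in the round-trip check are illustrative and not taken from the paper.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def effective_permittivity(eps_r: float, h: float, w: float) -> float:
    """Quasi-static effective permittivity of a microstrip patch.

    Textbook closed form, valid for w/h >= 1 (assumed model, not from the paper).
    eps_r: substrate relative permittivity, h: thickness (m), w: patch width (m).
    """
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h / w)


def resonant_frequency(eps_eff: float, l_eff: float) -> float:
    """Dominant-mode resonance f_r = c / (2 * l_eff * sqrt(eps_eff)), in Hz."""
    return C0 / (2 * l_eff * math.sqrt(eps_eff))


def eps_eff_from_resonance(f_r: float, l_eff: float) -> float:
    """Invert the resonance formula to extract eps_eff from a measured f_r."""
    return (C0 / (2 * f_r * l_eff)) ** 2


if __name__ == "__main__":
    h, w = 1e-3, 25e-3  # assumed: 1 mm substrate, wp = 25 mm taken as width
    for name, eps_r in (("fr4", 4.4), ("jeans", 1.7)):
        print(name, round(effective_permittivity(eps_r, h, w), 3))
```

with these assumptions the effective permittivity comes out slightly below the bulk value for both substrates, which is the expected behaviour because part of the fringing field travels through air.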
fig. 1 (a) top and (b) bottom layers of the wearable antenna

2.2. design parameters

the parametric study of both dielectric materials (fr4 and jeans) with microstrip line feed is presented in this section. a single l-shaped slit on the top layer of the conventional patch makes an additional resonant band possible and slightly enhances the bandwidth. moreover, an additional l-shaped slit introduces a circular polarization feature through the formation of orthogonal modes at the initial resonant band of 3.17 ghz. the dimensions (in mm) of the parameters are as follows: ls = 75, ws = 40, wp = 25, lp = 40, w1 = 15, w2 = 1, w3 = 3.95, w4 = 2.8, w5 = 2, l1 = 39, l2 = 26, l3 = 10, l4 = 20. the effect of both dielectric substrates, fr4 (4.4) and jean’s cloth (1.7), is discussed to analyze various characteristics. the relevant critical parameter in the antenna design is the fabric's conductivity (σ), in siemens per meter (s/m). equation (1) relates the conductivity to the surface resistivity (ρs) and the thickness (t) of the fabric:

$$\sigma = \frac{1}{\rho_s \, t} \qquad (1)$$

2.3. effect of different substrates with simulations

the proposed wearable antenna with dual l-shaped slits and a monopole ground plane improves gain and impedance bandwidth at the resonance bands of 3.17 ghz and 5.04 ghz. figure 2 shows that degenerate modes are formed at 3.17 ghz, leading to an axial ratio bandwidth of 8 mhz for the circular polarization feature. additionally, a higher impedance bandwidth is obtained at both resonant bands compared to fr4. the coverage obtained with cp would improve short-range communication for military applications, which is why the prototype is embedded in textile material. for this study, the characteristics of the wearable design are compared with those of the fr4 substrate. fig. 3 shows the comparison of the vswr values for both bands.
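the vswr and s11 curves compared in this section describe the same quantity, the magnitude of the reflection coefficient, so one can be converted into the other. the following is a minimal sketch of that standard conversion; the function names are hypothetical and the code is not part of the authors' cst workflow.

```python
import math


def vswr_from_s11_db(s11_db: float) -> float:
    """Convert S11 in dB (a negative number for a passive antenna) to VSWR."""
    if s11_db >= 0:
        raise ValueError("S11 must be below 0 dB for a finite VSWR")
    gamma = 10 ** (s11_db / 20)  # reflection coefficient magnitude
    return (1 + gamma) / (1 - gamma)


def s11_db_from_vswr(vswr: float) -> float:
    """Convert VSWR (>= 1) to S11 in dB."""
    if vswr < 1:
        raise ValueError("VSWR cannot be smaller than 1")
    gamma = (vswr - 1) / (vswr + 1)
    return -math.inf if gamma == 0 else 20 * math.log10(gamma)


if __name__ == "__main__":
    print(round(s11_db_from_vswr(2.0), 2))   # matching limit often used with VSWR
    print(round(vswr_from_s11_db(-10.0), 3))  # the 10-dB impedance bandwidth limit
```

a vswr of 2 corresponds to an s11 of about -9.5 db, so the vswr < 2 criterion and the 10-db impedance bandwidth criterion are close but not identical.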
the vswr values are kept below 2 when evaluating the simulated impedance bandwidth and vswr responses. the cst microwave studio simulator is used to analyze both dielectric constants for different thickness values; for this study, the thickness values are 1 mm and 0.8 mm.

a conventional rectangular geometry with l-shaped slits is chosen for implementation on electro-textiles due to its simplicity. however, the thickness intensity of the yarns in the woven structures would be an ideal choice for the prototype design. minor variations in the length and width of the conductive fabric used as a wearable antenna result in slight variations in the antenna characteristics, as displayed in figure 2. in general, wearable materials have a very low dielectric constant. therefore, the thickness and dielectric constant of the substrate are vital in designing the wearable patch to extract the efficient parameters.

fig. 2 s11 response of different dielectric constants (fr4 and jean’s)

fig. 3 vswr response of different dielectric constants (fr4 and jean’s)

table 1 summarizes the parameters obtained for both substrates in this work, along with the effect of the different thickness values considered. in addition, it shows the high impedance bandwidths of 1470 mhz (54%) and 60 mhz (1.2%) obtained with a thickness of 1 mm at 3.17 ghz and 5.04 ghz, respectively.

fig. 4 gain versus frequency response for both (fr4 and jean’s) substrates

the results indicate that reducing the dielectric constant of the wearable substrate improves the antenna performance of the proposed structure.
moreover, a moderate gain of 5 dbi is obtained with the wearable antenna, compared with the lower gain for the fr4 substrate at the same substrate thickness. circular polarization is also obtained at the operating frequency of 3.17 ghz. the gain response curve for the jean’s substrate shows gain values of 2 dbi and 3.2 dbi at 0.9 ghz and 3.2 ghz, respectively.

table 1 performance comparison of both the substrates (jean’s and fr4) in terms of gain and impedance bandwidth

substrate thickness   frequency (ghz)   impedance bandwidth (mhz, %)   vswr   gain (dbi)
dielectric (fr4), εr = 4.4
t = 0.8 mm            1.82              100, 5.49                      1.86   2
                      4.17              30, 1.79                       1.79   1.8
t = 1 mm              1.81              270, 14.6                      1.6    1.6
                      2.72              200, 7.3                       1.8    2.5
                      4.06              60, 1.4                        1.5    3.68
dielectric (jean’s cloth), εr = 1.7
t = 0.8 mm            3.17              1420, 52.3                     1.31   3.5
                      5.07              30, 0.6                        1.56   4
t = 1 mm              3.17              1470, 54                       1.1    4.1
                      5.04              60, 1.2                        1.3    5

the simulated sar response at an operating frequency of 0.91 ghz is shown in figure 5; the sar value indicated on the scale is moderate, at 18.2 w/kg. the sar value should be kept low since the textile antenna is worn on a conductive body. material thickness, conductivity, operating frequency range and resonant behavior are carefully chosen to be distinct to better understand the resulting sar. sar is a measure of the power absorbed per unit mass (kg) in human body tissue.

fig. 5 simulated sar response of the antenna operating at 0.91 ghz

3. experimental results and discussion

the proposed textile antenna prototype with optimized dimensions is constructed and investigated experimentally. fig. 6(a) and (b) show the conventional and modified printed textile antennas that were fabricated, respectively. the radiator patch with dual l-slits significantly improves the impedance matching for the high resonance bands and maintains stable gain across the remaining bands.
the experimental setup used to measure the radiation pattern and axial ratio of the proposed dual-band antenna is shown in fig. 7. simulated and measured return losses of the antenna design are shown in fig. 8. the impedance bandwidths of the proposed antenna prototype are 1470 mhz and 60 mhz for the two generated bands, respectively. the discrepancies between measured and simulated results are due to in-house manufacturing and soldering losses. the return loss of the proposed antenna module is measured using the keysight e5071c ena series network analyzer. fig. 9 presents the measured axial ratio of the designed antenna. the 3-db axial ratio (ar) bandwidth of the lower resonant band is about 11 mhz (3.17 to 3.172 ghz). the 3-db ar bandwidth lies within (overlaps) the 10-db impedance bandwidth, which is desirable. though the measured response is not in close agreement with the simulated response due to tolerance issues and sar, the measured impedance bandwidths are about 60 mhz and 112 mhz, covering the dual resonant bands at 900 mhz and 3.2 ghz.

fig. 6 fabricated textile antenna prototypes (a) front-view and (b) rear view of proposed dual band antenna

fig. 7 experimental setup of proposed dual band antenna

fig. 8 measured and simulated return loss response of the proposed textile antenna

fig. 9 measured axial ratio against frequency of the proposed textile antenna

radiation patterns are plotted for both bands, and the co-polarization patterns are observed to dominate, indicating that the dual-band antenna is well suited for military applications. the measured radiation patterns for the jeans substrate are shown in fig. 10 for the operating frequency of 3.17 ghz and in fig. 11 for the operating frequency of 5.04 ghz. an unstable pattern is found at 5.04 ghz for the jeans substrate.
this is due to the back radiation of the monopole structure on the ground plane.

fig. 10 far field radiation patterns in xoy plane and xoz plane at 3.17 ghz for jean’s cloth

fig. 11 far field radiation patterns in xoy plane and xoz plane at 5.04 ghz for jeans cloth

the dimensions, operating frequencies, return loss bandwidth, axial ratio bandwidth, and gain are compared with existing designs in table 2 below. the wearable antenna proposed in this work is small in size and operates at dual bands, with circular polarization at one band.

table 2 calculated bandwidth and gain parameters of proposed wearable antenna comparing with existing structures

ref.         dimensions (mm²)   frequency (ghz)   10-db rlbw (mhz) (%)   3-db arbw (lp/cp)   gain (dbi)
[8]          110×130            1.927, 2.45       -6 db                  lp, lp              0, 0
[9]          120×120            2.45, 5.5         4%, 16%                lp, lp              3, 2
[11]         240×125            6.2               48%                    lp
[14]         80×80              10                20%                    lp                  8.5
[proposed]   40×25              3.17, 5.04        60%, 1.2%              11 mhz (cp), lp     4.1, 5
[measured]   40×25              0.91, 3.2         2%, 10%                lp, lp              2, 3.2

rlbw: return loss bandwidth, arbw: axial ratio bandwidth, lp: linearly polarized, cp: circularly polarized

4. conclusions

a novel double l-shaped slit textile antenna is analyzed and developed for military wireless applications. a comparative study is performed for both fr4 and jeans dielectrics, and the extracted parameters are tabulated. this paper mainly focuses on obtaining good cp radiation at the first band and lp at the second resonant band, as obtained from multiple iterations of the antenna. moderate gains of 4.1 dbi and 5 dbi are achieved at 3.17 ghz and 5.04 ghz, respectively. the results show a high impedance bandwidth with the jeans wearable dielectric compared to the conventional fr4 substrate.
it is observed that the proposed wearable antenna gives an ar of 0.2 db, indicating good quality cp (close to 0 db) for the resonant band. though this extended slit-based model executes tri-band cp, the radiation patterns are degraded. top and bottom layered structures with line feeding techniques conclude that the proposed prototype antenna could benefit the operating frequencies of military wireless applications. experimental evaluation of the latest wearable dielectrics with different substrate thicknesses, and of their significance, will be carried out as part of future work.

acknowledgement: the authors would like to thank the advanced communications laboratory, bvrit(a), medak and the central r & d cell, lendi iet (a) for providing the facilities required to carry out the proposed work.

references

[1] a. kalis, t. antonakopoulos and v. maklos, "a printed circuit switched array antenna for indoor communications", ieee transactions on consumer electronics, vol. 46, no. 3, pp. 531-538, aug. 2000.
[2] s. han and s. k. park, "performance analysis of wireless body area network in indoor off-body communication", ieee transactions on consumer electronics, vol. 57, no. 2, pp. 335-338, may 2011.
[3] j. park and j. chun, "dtv receivers using an adaptive switched beamformer with an online-calibration algorithm", ieee transactions on consumer electronics, vol. 56, no. 1, pp. 34-41, feb. 2010.
[4] c. ahn, b. ahn, s. kim and j. choi, "experimental outage capacity analysis for off-body wireless body area network channel with transmit diversity", ieee transactions on consumer electronics, vol. 58, no. 2, pp. 274-277, may 2012.
[5] b. zhang and f. yu, "lswd: localization scheme for wireless sensor networks using directional antenna", ieee transactions on consumer electronics, vol. 56, no. 4, pp. 2208-2216, nov. 2010.
[6] s. koskinen, l. pykäri and m. mäntysalo, "electrical performance characterization of an inkjet-printed flexible circuit in a mobile application", ieee transactions on components, packaging and manufacturing technology, vol. 3, no. 9, pp. 1604-1610, sept. 2013.
[7] b. zhang and f. yu, "lswd: localization scheme for wireless sensor networks using directional antenna", ieee transactions on consumer electronics, vol. 56, no. 4, pp. 2208-2216, nov. 2010.
[8] p. salonen et al., "dual-band wearable textile antenna", in proceedings of the ieee antennas and propagation society international symposium, 2004, vol. 1, pp. 463-466.
[9] z. shaozhen and r. langley, "dual-band wearable textile antenna on an ebg substrate", ieee transactions on antennas and propagation, vol. 57, no. 4, pp. 926-935, 2009.
[10] s. lingam and b. gupta, "development of textile antennas for body wearable applications and investigations on their performance under bent conditions", piers b, vol. 22, pp. 53-71, 2010.
[11] k. furuya et al., "wide band wearable antenna for dtv reception", in proceedings of the 2008 ieee antennas and propagation society international symposium, 2008, pp. 1-4.
[12] n. k. darimireddy, r. r. reddy and a. m. prasad, "asymmetric and symmetric modified bow‐tie slotted circular patch antennas for circular polarization", etri journal, vol. 40, no. 5, pp. 561-569, 2018.
[13] n. k. darimireddy, r. r. reddy and a. m. prasad, "asymmetric triangular semi-elliptic slotted patch antennas for wireless applications", radioengineering, vol. 27, no. 1, p. 85, 2018.
[14] k. x. wang and h. wong, "a wideband millimeter-wave circularly polarized antenna with 3-d printed polarizer", ieee transactions on antennas and propagation, vol. 65, no. 3, pp. 1038-1046, march 2017.
[15] i. bouhassoune et al., "optimization of uhf rfid five-slotted patch tag design using pso algorithm for biomedical sensing systems", int. j. environ. res. public health, vol. 17, id. 8593, 2020.

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 275-298 doi: 10.2298/fuee1402275s

why and how photovoltaics will provide cheapest electricity in the 21st century

rajendra singh, githin f. alapatt, guneet bedi
holcombe department of electrical and computer engineering and center for silicon nanoelectronics, clemson university, clemson, sc, usa

abstract. with the advent of solar panels and windmills, and our ability to generate and use electrical energy locally without the need for long-range transmission, the world is about to witness transformational changes in energy infrastructure. the use of photovoltaics (pv) as a source of direct current (dc) power reduces the cost and improves the reliability of the pv system. dc microgrids and nanogrids based on pv and storage can provide sustainable electric power to all human beings in an equitable fashion. bulk volume manufacturing of batteries will lead to cost reductions similar to those experienced in pv module manufacturing. future manufacturing innovations and r & d directions that can further reduce the cost of pv systems are discussed. if the current trends of pv growth continue, we expect the cost of pv electricity with storage to reach $0.02 per kwh in the next 8-10 years.

key words: photovoltaics, direct current, local electricity, nanogrid, energy policy, batteries

1. introduction

the world population is currently about 7 billion, and by the end of the 21st century it is projected to reach nearly 11 billion people [1]. providing green energy to all human beings in an equitable fashion is one of the biggest technical challenges. the costs of generating, transmitting and utilizing energy must be decreased to ensure sustainability. for any energy technology to be truly sustainable it must be environmentally friendly, conserve water, and be affordable [2].
as solid state devices, silicon based integrated circuits, popularly known as “computer chips”, have brought a revolution in information technology that started in the 20th century and is continuing to shape the world of tomorrow [3]. the use of solid-state devices for power generation, power delivery and power utilization is bringing a green energy revolution in a manner similar to the role played by solid-state devices in the field of global communication [4]. in particular, photovoltaics (pv) is playing the central role in the emerging green energy revolution [5], [6]. the objective of this paper is to demonstrate that the progress made in recent years in reducing the cost of the electricity generated by pv is phenomenal, and that in the very near future pv will emerge as the lowest cost and most sustainable electricity generation technology. for achieving the goal of lowest cost, the importance of the direct current (dc) generated by pv systems will be highlighted. in addition, we will discuss manufacturing innovations, research directions and energy policies that will continue to reduce the cost of electricity generated by pv systems.

received february 6, 2014
corresponding author: rajendra singh, holcombe department of electrical and computer engineering and center for silicon nanoelectronics, clemson university, clemson, sc, usa (e-mail: srajend@clemson.edu)

2. energy sources

for an equitable and sustainable energy scenario, we must consider environmental concerns and the conservation of water for future generations of mankind. in earlier publications we have stated that nuclear energy is not economical for any country and that no new nuclear reactors should be constructed [4]-[7]. hydro-energy, biomass energy, and geothermal energy are partially renewable, but are not totally sustainable [2]. as of today, no cost effective technology exists for producing bio-fuels.
a fundamental breakthrough is required to produce cost-effective bio-fuels [2]. consideration of renewable and nonrenewable energies (figure 1) shows that only solar and wind energies are truly renewable and can provide the ultimate in sustainable energy to meet the global energy needs of the 21st century [2].

3. why photovoltaics?

there is no direct competition between solar and wind energy, since without storage solar energy can be used during the daytime and wind energy mostly during the nighttime. however, other than the larger amount of available solar energy, there are fundamental differences between solar energy and wind energy. as shown in figures 2 and 3, solar energy is more uniformly distributed than wind energy. 98 % of the world population receives more than 3 kwh/m² of solar irradiance per day. the other difference relates to the cost and reliability of pv systems and wind energy systems. during the last several years the annual global installation of wind energy was much larger than that of pv systems. in 2010, one of us predicted that due to its inherent advantages, pv will overtake wind and eventually become the dominant electricity generation technology [11]. globally in 2013, 33.8 gw of new onshore wind farms plus 1.7 gw of offshore wind capacity will be installed [12]. the total wind capacity of 35.5 gw in 2013 is lower than the 36.7 gw of pv installed in 2013 [12]. since 2008, solar pv panel prices have fallen well over 70 percent, with the cost of wind turbines decreasing by 40 percent during that same time. similar to the experience of semiconductor products, the cost of pv systems will continue to decrease in coming years.

fig. 1 importance of solar and wind energy in global context [8]

fig. 2 global mean wind speed at 80 m [9]

fig. 3 global mean solar irradiance [10]

the solar energy received on the earth's surface per year is about 89 pw (1 pw = 10^15 w). global energy consumption in 2009 was about 16 tw (1 tw = 10^12 w), which is about 0.016 % of the solar energy received on the earth's surface. the challenge is to convert this enormous amount of solar energy into electricity at a lower cost than any other technique of generating electricity. solar energy can be converted into electricity either by concentrated solar power (csp) or by photovoltaics. due to a number of cost and reliability related factors, the successful implementation of csp is much lower than that of pv. as an example, in 2012 the installed capacity of csp was 2.8 gw [13] while the installed capacity of pv was 28.4 gw [14]. in fact, between 2007 and 2012, only 7.44 gw of csp was installed. the primary interest in csp is due to the fact that, other than the optical system, the operation of csp is similar to coal, nuclear and gas generation of electricity. thus utilities were initially more interested in csp than pv; however, due to the intrinsic cost advantages of pv over csp, the situation has now changed.

4. important role of pv as dc source of power and the development of pv based dc nanogrid and dc microgrid

the current global electric infrastructure is dominated by alternating current (ac). due to the development of solid state dominated power electronics in the last 50 years, high voltage dc transmission has certain advantages [15] over high voltage long haul ac transmission, and currently about 2 % of installed global generating capacity is handled by high voltage dc transmission [16]. however, except for a few applications, all loads around us (smart phone, laptop, refrigerator, air conditioner, light source etc.) need a dc power source.
due to local generation of power by pv and the availability of power electronics to step up or step down dc voltages, it is important to revisit thomas edison’s original concept of local dc power generation [17]. in the context of 21st century power generation and utilization, edison’s concept can be extended in two directions. ideally the distance between electricity generation sources and loads must be at a minimum; however, cost-effective solar and wind farms at a particular site also meet the requirements of local dc power. minimum conversion from dc to ac and/or ac to dc must take place to conserve energy. according to the us energy information administration (eia), local power generation is defined as electricity that is (i) self-generated, (ii) produced by either the same entity that consumes the power or an affiliate, and (iii) used in the direct support of a service or industrial process located within the same facility or group of facilities that house the generating equipment. because of the novelty of direct use of the electricity, local electricity generation is on the rise in the united states. this increase is partly due to the compatibility of local dc electricity infrastructures, which can co-exist with existing electrical infrastructures that are based upon alternating current (ac). regarding power storage, dc storage devices such as batteries, capacitors and fuel cells also meet the requirements of local dc electricity. in essence, the self-sufficient power network of energy generation and energy storage sources, known as the microgrid, is basically a smaller version of the larger power grid. in the absence of external connectivity of the microgrid with the main grid, this self-sufficient pv based “nanogrid” can generate, store and distribute its own power. figure 4 shows the structure of the proposed pv based nanogrid.
this concept is innovative in that it uses dc power generation sources, dc storage devices and minimum distance between the power sources and the dc loads for the new electricity infrastructure of the 21st century. rooftop pv with storage is a typical example of a nanogrid. a pv based nanogrid is also ideally suited for rural electrification where there is no existing grid.

fig. 4 pv based nanogrid for rural electrification

though there are many factors driving the growth of local dc electricity, we list the key points below:

(i) the traditional model of large base-load centralized ac power generation and long haul distribution via high-voltage transmission and low-voltage lines causes huge energy losses and high operating costs. as clearly shown in table 1, approximately 70% of the energy consumed by the electric power sector is lost in generation, transmission and distribution. the annual energy loss amounts to about 40 trillion kwh; assuming that the cost of electricity is $0.1/kwh, this is worth about 4 trillion dollars.

(ii) direct current (dc) electricity generated locally by renewable energy sources such as solar panels and wind mills, and used with minimum conversion (dc to ac or ac to dc) and minimum transmission, can reduce by 30% or more the energy that is typically lost in ac generation, transmission, and distribution.

(iii) unlike 20th century technologies, the cost of generating local power from solar pv and wind systems is decreasing steadily, with the substitution of dc for ac power further reducing that cost. the cost of centralized ac power generation, however, has either increased or remained unchanged during that time. wind and solar generated power is cheaper than power from coal-fired plants when the social costs of carbon are factored in. some utilities are now using more pv as it has become more cost-competitive with natural gas. us companies are increasingly turning to solar panels and dc microgrids to offset energy costs.
in minnesota, for example, rooftop pv electricity costs 36-75% less than natural gas during peak delivery times. steam generation by pv for so-called enhanced oil recovery projects costs about $5 to $7 per million british thermal units of energy, less than half of the $12 to $18 price for liquefied natural gas [18].

(iv) dc based pv and wind power systems are more reliable than ac based systems. while the inverter accounts for less than about 20% of the pv system cost, an inverter malfunction can shut the whole system down, with a total loss of energy production [19]. wind turbines are more reliable in dc configurations, due to the greatly reduced complexity of the mechanical transmissions that are required for turbine ac generation [16].

(v) pv systems are extremely reliable. after almost 20 years of continuous outdoor exposure, the average performance decay of silicon pv modules is only 4.42% for the whole period [20]. such reliability results guarantee safe investments, for the benefit of all pv users and stakeholders [20].

(vi) batteries, capacitors, and fuel cells are used to store dc electricity. the use of ac in place of dc increases the cost of the storage device; for batteries, ac based storage systems increase the cost by as much as 50 % [21].

(vii) increased energy efficiency translates into job creation and economic growth. according to the energy information administration, the electricity consumed in the us in 2011 was 3,839 billion kwh and is expected to increase by 0.91% annually until 2040. assuming that by 2015 dc electricity is used for 10% of the generation and distribution of the electricity consumed in the us, more than 60,000 jobs will be created.

(viii) integrated circuits and other solid state devices have revolutionized virtually every facet of human life. except in very few cases (e.g. certain motor-based systems), all loads require dc power. for example, unlike old cathode-ray tube televisions, solid-state tvs do not use ac current.
similarly, though lighting consumes about 20% of the electricity produced worldwide, it too uses dc power. also, in contrast to a dc supply, typical ac based cell phone chargers waste approximately 20-35% of the energy used [16]. electric vehicles do not require ac power for charging batteries. with the revolution in the it industry, more semiconductor-based electronics are being used, with a concurrent increase in dc loads and a decrease in ac loads.

(ix) battery-based hybrid and electric vehicles and solid-state led lighting, both of which are powered by direct current, are transforming the transportation and lighting industries.

(x) energy-efficient appliances use adjustable speed motor drives in which a rectifier converts the ac from the grid into an internal dc bus voltage. though one option entails directly powering the appliances from a dc source, it is also possible to redesign appliances without these embedded rectifiers. such redesigns may require a refit of manufacturing facilities, but state and/or federal government subsidies and related financial incentives to the consumer can offset the costs. globally, 268.1 million major appliances were sold in 2012. developing public policies to offset the cost of retrofitting manufacturing factories and exporting this new technology will create many new jobs.

(xi) a dc nanogrid is the key enabler of the “zero energy building model.” with minimum wastage in transmission and conversion, the use of locally generated dc electricity can provide 100% of the energy needs of a building.
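the loss figures quoted in item (i) can be checked directly against the totals of table 1. the sketch below is a worked check rather than part of the original analysis. it takes the 2008 eia totals listed in table 1 (194 quadrillion btu consumed by the electric power sector, 135 quadrillion btu of electricity losses), the $0.1/kwh price assumed in item (i), and the standard conversion 1 btu ≈ 2.93071e-4 kwh; the function and variable names are introduced here for the example.

```python
KWH_PER_BTU = 2.93071e-4  # standard conversion: 1 btu ~ 1055.06 J, 1 kWh = 3.6e6 J
QUAD = 1e15               # one quadrillion btu


def loss_summary(power_sector_quads, loss_quads, price_per_kwh):
    """Return (fraction lost, energy lost in kWh, value of the loss in dollars)."""
    loss_fraction = loss_quads / power_sector_quads
    loss_kwh = loss_quads * QUAD * KWH_PER_BTU
    return loss_fraction, loss_kwh, loss_kwh * price_per_kwh


# table 1 totals (quadrillion btu) and the price assumed in item (i)
fraction, kwh, dollars = loss_summary(194, 135, 0.10)
print(f"share of power-sector energy lost: {fraction:.1%}")
print(f"energy lost: {kwh / 1e12:.1f} trillion kWh per year")
print(f"value at $0.10/kWh: ${dollars / 1e12:.1f} trillion per year")
```

under these inputs about 70% of the energy entering the electric power sector is lost, which is roughly 40 trillion kwh per year. note that 40 trillion is the energy figure in kwh; at $0.1/kwh the corresponding dollar value is one tenth of that number, about $4 trillion.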
table 1 2008 world energy consumption by sector, in quadrillion btu [source: us energy information administration (eia)]

end-use sector | energy end-use (includes end-use of electricity but excludes losses) | electricity losses (generation, transmission, and distribution) | total energy use (includes electricity losses) | share of total energy use
commercial | 28 | 32 | 60 | 12%
industrial | 191 | 64 | 255 | 51%
residential | 52 | 37 | 89 | 18%
transportation | 98 | 2 | 100 | 20%
total end-use sectors | 369 | | |
electric power sector | | | 194 | 39%
total electricity losses | | 135 | |
total energy use | | | 505 |

5. global manufacturing advantages of pv generated dc electricity

cutting energy costs increases the competitiveness of manufacturing industry and saves jobs worldwide; in some cases the energy cost is as great as one-third of the operating cost of a manufacturing plant. this is typically true for aluminum plants and many other high energy consuming manufacturing industries. as shown in table 2 [22], aluminum plants lose about 6.3% of the total energy due to conversion from ac to dc current, a process that cannot be avoided today. based on world aluminum data, 93,576 thousand metric tons of aluminum was produced in 2012. using the average data of table 2 and an electricity cost of $0.1/kwh, a net saving of $9.6 billion is possible through the use of dc instead of ac power. similarly, other high energy consuming industries (such as the pulp and paper industries) can also be retrofitted for dc current.

table 2 2012 data of energy consumed in producing one ton of aluminum [22]

nation or region | dc energy (kwh) | ac energy (kwh) | % loss in current plants
north america | 14,540 | 15,458 | 6.31
world | 13,756 | 14,639 | 6.42
china | 13,014 | 13,844 | 6.38

as clearly indicated in table 3 [23], different ac standards of voltage and frequency are used in different countries.
japan, however, is an exception in that two frequency standards are used within that nation. the worldwide adoption of dc power can prevent such redundancy of effort by providing uniform voltage standards worldwide, thus reducing the cost of the related power electronics and yielding an overall lower manufacturing cost for every dc based electrical system.

table 3 voltage and frequency standards of 16 developing/developed nations [23]

country | voltage (v) | frequency (hz)
australia | 230 | 50
brazil | 110 and 220 | 60
canada | 120 | 60
china | 220 | 50
cyprus | 240 | 50
egypt | 220 | 50
guyana | 240 | 60
south korea | 220 | 60
mexico | 127 | 60
japan | 100 | 50 and 60
oman | 240 | 50
russian federation | 220 | 50
spain | 230 | 50
taiwan | 110 | 60
united kingdom | 230 | 50
united states | 120 | 60

6. current status of photovoltaics

as shown in fig. 5, by the end of 2012 the cumulative installed solar pv electricity generation capacity had exceeded 100 gw, and it is expected to double from about 100 gw in 2012 to 200 gw in 2015 [24]. annual pv demand is expected to reach 36 gw in 2013 and 49 gw in 2014 [25]. the large-scale solar pv market, which comprises rooftop projects above 100 kilowatts (kw) in size and ground-mounted solar pv projects, was about 26 gw in 2013 [26]. based on manufacturing considerations discussed at length in earlier publications [5], [27]-[29], silicon based pv modules will continue to dominate the pv market. fundamentally, there is nothing wrong in assuming that concentration photovoltaic (cpv) systems should provide lower cost compared to non-concentration solar cells. however, engineering problems, including thermal and optical challenges, have not permitted the large-scale commercialization of concentration solar cells (fig. 6). the average cost of installed pv systems for various segments of the us market is shown in fig. 7 [31].
the average solar pv system cost for various sizes and locations in australia is given in table 4 [32]. the cost of pv modules for various countries is shown in fig. 8 [33]. the data of fig. 7, fig. 8 and table 4 clearly indicate that we are approaching pv module and installed pv system costs of $0.50/wp and $1.00/wp respectively. as shown in fig. 9 and fig. 10, the european union (eu) dominated the pv market in the past, while china, japan and the us are currently dominating it. pv module manufacturing in 2013 was dominated by companies based in china and taiwan [34]. in 2013, only one us based pv company (first solar) was in the list of top 10 pv manufacturers [34]. however, pv growth in the us is significant, since the water conservation advantages of pv are quite important [35].

table 4 installed cost of pv system in australia (1 australian $ ≈ 0.87 us $) [32]

fig. 5 global growth in pv electricity generation capacity worldwide [24]

fig. 6 past, current and projected market of cpv [30]

fig. 7 average installed prices of pv system in us for various market segments [31]

fig. 8 silicon pv module prices for various countries [33]

fig. 9 dominance of european union pv market between 2007-2011 [33]

fig. 10 dominance of china, japan and us pv market between 2012-2016 [33]

7. manufacturing innovations leading to constant reduction of pv system cost

as we have stated before, more than 90% of the installed pv capacity employs bulk silicon solar cells. the rise in the pv market (fig. 5) and innovations in materials and processing have led to reduced cost of silicon solar cells (fig. 11). the highest reported am 1.5g efficiencies of silicon solar cells and silicon pv modules are 25 % and 21.5 % respectively [29]. except for some minor improvements, no major increase in the efficiency of silicon solar cells is expected.
further cost reduction will be achieved by using thinner wafers and by building new process equipment with higher throughput, lower defect density and reduced footprint. all the new tools must provide lower cost of ownership [29] than current manufacturing tools. in a recent publication [37] we have shown that, as compared to conventional furnace processing and rapid thermal processing, the use of ultraviolet (uv) and/or vacuum ultraviolet (vuv) photons enhances the diffusion coefficient of dopants by orders of magnitude. as shown in figure 12, for wavelengths below about 0.3 micrometer, the diffusion coefficient is higher by two to four orders of magnitude. thus, in the case of rapid photothermal processing (rpp), the vuv photons are used as a source of optical energy in addition to thermal energy. the principal advantages of rpp over other thermal processing techniques are (i) lower density of defects, (ii) minimum process variation, (iii) higher throughput, and (iv) lower deposition temperature. based on a conservative estimate, we expect that the throughput of rpp based diffusion and annealing tools will be at least an order of magnitude higher than that of current thermal processing tools. similar to silicon integrated circuit (ic) manufacturing (fig. 13), pv manufacturers can use larger substrates to further reduce the cost of silicon solar cells. gigawatt scale pv system manufacturing, shown in fig. 14, will provide the ultimate lowest cost of pv systems.

fig. 11 innovations and supply-chain advantages leading to low cost of silicon solar cells [36]

fig. 12 diffusion coefficient under vacuum ultraviolet (vuv) photons [37]

fig. 13 use of larger wafer size by silicon ic manufacturers to reduce the cost of silicon based ics [38]

fig. 14 gigawatt scale pv system manufacturing will lead to ultimate lowest cost of pv system

8. r & d directions that can lead to further reductions of cost of pv system and lead to new applications of pv

in a recent publication [29] we have shown that most of the research community is working in areas that either have fundamental flaws or do not meet fundamental manufacturing requirements; the interested reader should consult reference [29]. since the publication of reference [29], the lead halide perovskite solar cell [39] has received a lot of publicity. the following are the main reasons that this type of solar cell will never be manufactured:

(i) as a single junction solar cell, silicon cannot be replaced by another solar cell unless that cell is based on abundant materials and has an efficiency of at least 30 % in large area cells.

(ii) the use of lead in the solar cell reported in reference [39] does not meet manufacturing requirements.

(iii) the area of the high efficiency solar cell reported in ref. [39] is less than 0.1 cm^2. the authors of reference [40] reported that a silicon solar cell with an area of less than 0.25 cm^2 would not show the efficiency degradation due to series resistance effects. in other words, simply scaling up a cell whose area is below this critical size will lead to a significant reduction in efficiency. using the am 1.5g spectrum and other data as used in the theoretical calculations of ref. [29], we have used the method of reference [40] to calculate the minimum solar cell area that should be used in reporting any new type of solar cell. these results are shown in fig. 15.

(iv) the band gap of lead halide perovskite, 1.5 ev, is not optimum for multijunction multiterminal solar cells [29].
(v) the use of graphene in lead halide perovskite solar cells will lead to significant process variability, and the performance of the module will be worse than that of other thin film materials (cdte, cigs and a-si) used in manufacturing thin film pv modules [29]. thus modules based on lead halide perovskite solar cells will not be able to compete with existing thin film pv modules.

based on solid scientific and engineering principles, the following are productive r & d directions that can lead to the advancement of pv module manufacturing:

(a) multiterminal multijunction solar cells
the next major improvement in the performance and cost reduction of silicon solar cells can be achieved by taking advantage of both bulk silicon and thin film solar cells. in reference [29] we have introduced the concept of multiterminal multijunction solar cells. fig. 16 shows the concept for a two-junction, four-terminal device. the optimal band gap of the top junction should be about 1.8 ev.

(b) thin film solar cells for building integrated photovoltaics (bipv)
an enormous opportunity exists to use the glass employed in the construction of every building for generating pv electricity. the low efficiency, leading to high cost, of existing commercial thin films for bipv applications is the major barrier. the reliability of bipv must be very high, since glass does not degrade under normal weather conditions.

(c) use of pv in transport sector
all kinds of vehicles used in the transport sector can use pv for power generation [7]. the limited surface area of a vehicle requires high efficiency and low cost pv modules. mounting pv modules on the surface of the vehicle requires innovative design concepts.

(d) integration of pv and consumer products
with the rise in the number of consumer products used by modern man, there is a need to convert ambient light into electricity. this will require innovation in the integration of pv and consumer products.
a recent patent filed by apple demonstrates the importance of this area [41].

(e) use of pv electricity for desalination
pv electricity can be used for desalination. however, to reduce the cost of drinkable water, the design of the pv system and the desalination plant must be reconsidered to conserve energy.

(f) pv as source of combined heat and power
the part of the spectrum not used by pv can be used to collect heat generated by the pv system. however, the cost effectiveness of such a combined heat and power (chp) system has not been proven.

(g) solid state capacitor as energy storage device
the fundamental problem of current capacitor technology is the low value of capacitance density. based on ultra-high dielectric constant (k) materials (k > 10^6), solid-state capacitors have the potential of providing high energy and high power density. unfortunately, at present, due to defects in the material, the dielectric constant degrades with both electric field and temperature, and the leakage current is very high. finding a solution to all these problems can provide a cost effective large-scale solution for storing electric energy.

(h) use of nanostructures in the fabrication of solar cells
for almost two decades, the buzzword “nanotechnology” has been advocated by a large group of researchers as a way to improve the efficiency of solar cells. however, to date there is not a single work in which the efficiency of nanostructure based solar cells for terrestrial applications has exceeded the efficiency of bulk solar cells. in previous publications we have critically examined the role of nanostructures for solar cell applications [29], [42]-[44], and a summary of our findings is reported here. figure 17 shows the quantization of properties as the dimension changes from “3-d” to “0-d”. the 2-d properties of quantum wells have been exploited very well in the fabrication of iii-v solar cells. the use of self-assembly for 2-d and 1-d nanostructures may provide the properties of isolated structures as expected theoretically.
however, when self-assembly is used in device fabrication, the process variation results in lower efficiency devices. further, when such devices are used in module manufacturing, the lowest performance device will dictate the efficiency of the module. as shown in figure 18 [46], quantum dot or “0-d” devices show an increase in the value of the energy gap in the quantum confinement region. below 8 nm, the band gap of silicon quantum dots increases (figure 19). there is no experimental proof that at any given wavelength quantum dots can provide a quantum efficiency greater than one. similarly, there is no experimental proof that hot electrons can provide higher efficiency than normal operating devices. as discussed at length in reference [29], the concept of the intermediate band gap is flawed, and one cannot get a higher conversion efficiency than that of the normal bulk material solar cell. there is a need to invent new process control techniques so that the unique properties of nanostructures can be obtained with tolerable process variation. in the absence of such processing tools, no practical devices can be made in which one can exploit the unique properties offered by nanostructures.

fig. 15 minimum solar cell area as a function of band gap to observe the degradation of efficiency of solar cell.

fig. 16 (a) schematic of the proposed two-junction four-terminal solar cell. (b) external electric circuitry to combine the electricity generated separately by the two junctions.

fig. 17 quantization of properties with scaling of dimensions [45]

fig. 18 increase in energy gap with decrease in the size of quantum dot [46]

9. storage of electricity generated by pv

batteries [21], ultracapacitors [48] and fuel cells [49] are all useful for storing dc electricity.
apart from safety issues, fuel cells are also expensive. significant progress has been made in recent years in reducing the cost of batteries. also, the increasing use of evs [50] and large-scale grid storage [51]-[53] will increase the demand for batteries. similar to the experience of cost reduction of pv by volume manufacturing (fig. 20), the reduction of battery prices will continue. indeed, utility scale battery storage is now competitive with natural gas in the us; eos energy storage inc. has developed a battery system that costs approximately $160/kwh [21]. semiconductor manufacturing techniques can also further reduce the cost of batteries. similar to solar cells [29], a series and parallel combination of various cells in a battery yields the desired watt-hours of the battery. the equipment used in battery manufacturing is generally based on statistical process control, and the resultant process variation leads to variations in the output of the various cells of the battery. advanced process control can reduce this process variation, resulting in higher power output from the same battery. in addition, large scale manufacturing of batteries in a single location will provide tight control of the supply chain and further reduce the cost of batteries.

fig. 19 experimental results on the variation of optical bandgap of nanostructured silicon with diameter of silicon nanograins [47]

10. misconception about subsidies

in the context of pv, wind energy and other renewables, there are many misconceptions regarding the concept of energy subsidies, a review of which is provided elsewhere [55]. globally, fossil fuel industries receive nearly $1 trillion a year in subsidies, approximately twelve times that allocated to the renewables industry [56]. most alarmingly, nearly 43% of subsidies to fossil fuel industries in the developing world end up in the pockets of the richest 20%; only 7% go to the bottom 20% of households [57].
eliminating subsidies for oil, gas, coal and other fossil fuels would make a significant dent in global warming pollution [56]. in table 5 we provide a historical average of us federal energy subsidies. a similar situation exists in other parts of the world.

table 5 historical average of federal energy subsidies in us [55]

energy source | subsidies (2010 $ billion)
oil and gas | 4.86
nuclear | 3.50
biofuels | 1.08
renewables | 0.37

11. photovoltaics and the future of utilities

the traditional business model of the utility, “investing in equipment, turning meters, and earning steady profits”, is undergoing a transformational change with the emergence of rooftop pv leading to new business models [58]. the rooftop pv dominated concept of the dc microgrid poses no threat to a utility industry that is willing to adapt to the rapid technological changes in the power industry that pv, storage, power electronics and wind technologies will accelerate. if utilities fail to adapt to these all but certain developments, they will become as archaic as the sears catalog business of the 20th century.

13. energy policy issues

the worldwide adoption of pv generated dc power is a wise global public policy move in terms of sustainability and economic growth of developed, developing and underdeveloped economies. a pv trade war between any two countries is not in the best interest of any nation [59]. a new business model that capitalizes on the buying power of a nation or group of nations can further reduce the cost of implementing pv generated dc power. the real or virtual vertical business model [2] will lead to the lowest cost of pv electricity generated by either the dc microgrid or the dc nanogrid.

14. photovoltaics for underprivileged people (pup)

economic disparity is a serious issue, since the 85 richest people are as wealthy as the poorest half of the world [60].
globally, 2.5 billion people in the developing world rely on biomass (fuel from wood, charcoal and animal dung) to meet their energy needs for cooking and other daily necessities. the continuous decrease in the cost of pv generated electricity is now making it possible to provide electrical energy to these populations, who can be served entirely by pv generated dc electricity. similar to the explosive growth of cell phones (no need for land lines), pv combined with a dc energy distribution system will provide badly needed clean alternatives to dirty sources of fuel. unlike developed economies, in which replacing an aging electricity infrastructure is a challenge, implementing a new low-cost dc power system infrastructure in developing economies that have no such infrastructure is a much easier proposition. a pv based dc nanogrid (figure 4) is the most practical low-cost method of providing cost effective electricity to such developing societies worldwide. indeed, the market size is huge and the societal implications are monumental. the united nations, the world bank, and developed, developing and underdeveloped countries need to work together and invent a new real or virtual business model as shown in fig. 21. since underdeveloped countries do not have the technology or capital to manufacture pv systems, a new business model must be developed in which the combined purchasing power of the underdeveloped countries is treated as a single entity and the developed or developing economies gain a huge market share without investment in marketing. the power situation in emerging and underdeveloped countries is a serious issue. as an example, the shortage of electricity in india is a significant hindrance to economic growth [7].
although it is not an optimal low-cost engineering solution, out of desperation india is running a pilot project in which each customer will get uninterrupted 100 w dc power that will be obtained from each substation and supplied on a separate line and separate meter [61]. a nanogrid for underprivileged people is also a solution for national security [62], [63].

fig. 20 doubling the volume manufacturing reduces the pv module prices by 20 % [54]

fig. 21 proposed virtual or real vertical integrated business model will provide lowest cost of solar electricity

15. concluding remarks

in this paper we have provided an in-depth review of photovoltaics for generating sustainable green electricity in the 21st century. the importance of pv is examined from a global perspective. without storage, the cost of a pv system is approaching about $1 per peak watt, which can provide a dc electricity generation cost of about $0.02-$0.03 per kwh for most of the world’s population. we have identified future manufacturing innovations and r & d directions that can further reduce the cost of pv systems. bulk volume manufacturing of batteries will lead to cost reduction in a manner similar to the cost reduction experience of pv module manufacturing. for underprivileged people, the united nations and the world bank need to seriously think about pv and develop a new vertically integrated business model that can capitalize on the buying power of the underprivileged people living all over the world. current trends of pv market growth are such that the market size doubles in about two years. if this trend continues, we expect terawatt (1,000 gw) pv installations by the end of this decade. under this scenario we expect the pv electricity cost with storage to reach $0.02 per kwh in the next 8-10 years.
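the two quantitative statements in these concluding remarks can be reproduced with a few lines of arithmetic. the sketch below is a back-of-the-envelope illustration, not the authors’ calculation. the starting point of about 100 gw of cumulative capacity at the end of 2012 comes from section 6 and the two-year doubling time from the paragraph above, here applied to cumulative capacity as a simplifying assumption. the 25-year system life and the 20 % capacity factor are hypothetical values chosen for the example, and discounting, operation and maintenance are ignored.

```python
import math


def years_to_reach(target_gw, start_gw, doubling_years):
    """Years needed for exponentially growing capacity to reach the target."""
    return doubling_years * math.log2(target_gw / start_gw)


def simple_cost_per_kwh(cost_per_wp, lifetime_years, capacity_factor):
    """Undiscounted lifetime cost per kWh for one peak watt of pv.

    lifetime_years and capacity_factor are hypothetical example inputs.
    """
    # energy delivered by 1 Wp over its life, in kWh (8760 hours per year)
    lifetime_kwh = capacity_factor * 8760 * lifetime_years / 1000.0
    return cost_per_wp / lifetime_kwh


years = years_to_reach(target_gw=1000, start_gw=100, doubling_years=2)
print(f"1 TW reached about {years:.1f} years after the end of 2012")

cost = simple_cost_per_kwh(cost_per_wp=1.0, lifetime_years=25, capacity_factor=0.20)
print(f"simple generation cost: ${cost:.3f}/kWh")
```

with these inputs the terawatt level is reached between six and seven years after 2012, that is, before 2020, and one peak watt costing $1 delivers about 44 kwh over its life, which falls inside the $0.02-$0.03 per kwh range quoted above. a real levelized cost would add financing, maintenance and module degradation.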
references

[1] http://blogs.ei.columbia.edu/2013/07/15/world-population-projected-to-cross-11-billion-threshold-in-2100/
[2] r. singh and g. f. alapatt, “innovative paths for providing green energy for sustainable global economic growth”, proc. spie 8482, photonic innovations and solutions for complex environments and systems (pisces), 848205 (october 11, 2012); doi:10.1117/12.928058
[3] r. singh, l. colombo, k. schuegraf, r. doering, and a. diebold, “semiconductor manufacturing,” in guide to state-of-the-art electronic devices, j. n. burghartz, ed.; new york, ny, usa: wiley, ch. 10, pp. 121–132, 2013.
[4] r. singh, n. gupta, k. f. poole, “global green energy conversion revolution in 21st century through solid state devices”, proc. 26th international conference on microelectronics, nis, serbia, may 11-14, 2008, vol. 1, pp. 45-54, ieee, new york, ny.
[5] r. singh, g. f. alapatt and k. f. poole, “photovoltaics: emerging role as a dominant electricity generation technology in the 21st century”, proc. 28th international conference on microelectronics, ieee, new york, ny, pp. 53-63, 2012.
[6] r. singh, k. shenai, g. f. alapatt, and s. m. evon, “semiconductor manufacturing for clean energy economy”, invited paper, proc. ieee energy tech 2013 technology frontiers in sustainable power and energy, may 21-23, 2013, case western reserve university, published by ieee, 2013, pp. 1-7, doi: 10.1109/energytech.2013.6645351
[7] r. singh, g. f. alapatt, and m. abdelhamid, “green energy conversion & storage for solving india's energy problem through innovation in ultra large scale manufacturing and advanced research of solid state devices and systems”, 2012 international conference on emerging electronics, pp. 1-8, 2012, doi: 10.1109/icemelec.2012.6636220
[8] http://www.asrc.cestm.albany.edu/perez/2011/solval.pdf
[9] http://www.3tier.com/static/ttcms/us/images/support/maps/3tier_5km_global_wind_speed.jpg
[10] http://www.3tier.com/static/ttcms/us/images/support/maps/3tier_solar_irradiance.jpg
[11] http://www.renewableenergyworld.com/rea/news/article/2010/10/champions-of_photovoltaics
[12] http://www.sunwindenergy.com/news/more-photovoltaics-wind-power-installed-2013
[13] http://www.sbc.slb.com/sbcinstitute/publications/~/media/files/sbc%20energy%20institute/sbc%20energy%20institute_solar_factbook_jun%202013.ashx
[14] http://cleantechnica.com/2013/04/11/total-global-solar-pv-capacity-approaching-100-gw/
[15] http://www.economist.com/blogs/babbage/2013/01/power-transmission
[16] http://smartgrid.ieee.org/questions-and-answers/902-ieee-smart-grid-experts-roundup-ac-vs-dc-power
[17] http://www.abb.com/cawp/seitp202/c646c16ae1512f8ec1257934004fa545.aspx
[18] http://www.renewableenergyworld.com/rea/news/article/2014/01/solar-beats-natural-gas-to-unlockmiddle-easts-heavy-oil-says-glassdoor-solar
[19] http://www.solarindustrymag.com/issues/si1401/feat_02_a-lifecycle-approach-to-invertermanagement.html
[20] d. polverini, m. field, e. dunlop, and w. zaaiman, “polycrystalline silicon pv modules performance and degradation over 20 years”, prog. photovolt: res. appl., vol. 21, pp. 1004–1015, 2013.
[21] http://reneweconomy.com.au/2013/eos-utility-scale-battery-storage-competitive-with-gas-36444
[22] http://www.world-aluminium.org/statistics/
[23] m. h. el-sharkawi, “electrical energy: an introduction”, taylor & francis group, third edition, chapter 2, 2003.
[24] m. barker. (2013, mar. 15). reaching new heights: cumulative pv demand to double again by 2015 [online]. available: http://www.solarbuzz.com/resources/blog/2013/03/reaching-new-heights-cumulativepv-demand-to-double-again-by-2015
[25] http://www.pv-magazine.com/news/details/beitrag/global-solar-pv-demand-to-reach-49-gw-in-2014--saynpd-solarbuzz_100013796/#axzz2sdoix3mn
[26] http://www.businessspectator.com.au/news/2014/1/13/solar-energy/large-scale-pv-exceeded-26-gw-year
[27] r. singh and j. d. leslie, “economic requirements for new materials for solar cells”, solar energy, vol. 24, pp. 589-592, 1980.
[28] r. singh, “why silicon is and will remain the dominant photovoltaic material”, journal of nanophotonics, vol. 3, 032503 (16 july 2009).
[29] r. singh, g. f. alapatt, and a. lakhtakia, “making solar cells a reality in every home: opportunities and challenges for photovoltaic device design”, ieee journal of the electron devices society, vol. 1, no. 6, pp. 129-144, june 2013.
[30] http://www.renewableenergyworld.com/rea/news/article/2013/12/cpv-outlook-demand-doubling-costshalved-by-2017?cmpid=solarnl-tuesday-december17-2013
[31] http://www.computerworld.com/s/article/9244836/solar_power_installation_costs_fall_through_the_floor
[32] http://www.businessspectator.com.au/article/2014/1/20/solar-energy/solar-pv-price-check---january
[33] http://www.greentechmedia.com/articles/read/regional-pv-module-pricing-dynamics-what-you-need-to-know
[34] http://www.renewableenergyworld.com/rea/news/article/2014/01/top-ten-pv-manufacturers-from-2000-topresent-a-pictorial-retrospective?cmpid=solarnl-thursday-january23-2014
[35] http://campverdebugleonline.com/main.asp?sectionid=36&subsectionid=73&articleid=41284
[36] http://www.businessspectator.com.au/article/2013/11/25/solar-energy/solar-silicon-wafers-below-20cwatt
[37] s. shishiyanu, r. singh, t. shishiyanu, s. asher and r. reedy, “the mechanism of enhanced diffusion of phosphorus in silicon during rapid photothermal processing of solar cells”, ieee transaction of electron devices, vol. 58, pp. 776-781, 2011.
[38] s. w. jones, a simulation study of 450mm wafer fabrication costs, ic knowledge llc.
[39] g. hodes and d. cahen, “perovskite solar cell roll forward”, nature photonics, vol. 8, pp. 87-88, 2014.
[40] k. rajkanan and j. shewchun, “a better approach to the evaluation of the series resistance of solar cells”, solid-state electronics, vol. 22, no. 2, pp. 193-197, 1979.
[41] http://au.ibtimes.com/articles/536794/20140201/apple-macbook-pro.htm
[42] n. gupta, g. f. alapatt, r. podila, r. singh, and k. f. poole, “prospects of nanostructure-based solar cells for manufacturing future generation of photovoltaic modules,” int. j. photoenergy, vol. 2009, article no. 154059, 2009.
[43] g. f. alapatt, r. singh, and k. f. poole, “fundamental issues in manufacturing photovoltaic modules beyond the current generation of materials”, advances in optoelectronics, article id 782150, 10 pages, doi:10.1155/2011/782150, 2011.
[44] g. f. alapatt, r. singh, n. gupta and k. f. poole, “fundamental problems of self assembly for manufacturing semiconductor products”, emerging materials research, vol. 1, issue s1, pp. 1-5, 2012.
[45] j. hank, introduction to the theory of nanostructures, international max planck research school of science and technology of nanostructures, 2006.
[46] http://www.sigmaaldrich.com/materials-science/nanomaterials/quantum-dots.html
[47] v. a. belyakov, v. a. burdov, r. lockwood, and a. meldrum, “silicon nanocrystals: fundamental theory and implications for stimulated emission,” adv. opt. technol., vol. 2008, article no. 279502, 2008.
[48] http://www.solarpowerworldonline.com/2014/01/ultracapacitors-grid-scale-solar-smoothing/ [49] http://www.njspotlight.com/stories/13/04/02/hydrogen-fuel-cells-could-add-year-round-reliability-torenewable-energy/ [50] http://www1.eere.energy.gov/vehiclesandfuels/pdfs/1_million_electric_vehicles_rpt.pdf [51] http://energy.gov/sites/prod/files/2013/12/f5/grid%20energy%20storage%20december%202013.pdf [52] http://www.pv-tech.org/news/saft_and_ingeteam_to_build_combined_pv_plant_and_battery_storage _project_on [53] http://www.renewableenergyworld.com/rea/news/article/2014/01/transmission-and-energy-storage-2014outlook-the-macro-and-micro-transformation-of-electric-grids?cmpid=wnl-friday-january24-2014 [54] http://www1.eere.energy.gov/solar/pdfs/47927_chapter4.pdf [55] http://i.bnet.com/blogs/dbl_energy_subsidies_paper.pdf. [56] http://www.nrdc.org/international/rio-2012/cleanenergy.asp [57] http://www.economist.com/news/finance-and-economics/21593484-economic-case-scrapping-fossil-fuelsubsidies-getting-stronger-fuelling [58] http://www.greentechmedia.com/articles/read/new-utility-business-models [59] r. singh, “can the us return to manufacturing glory?,” photovoltaics world, pp. 40-43, march/april 2012. [60] http://www.latimes.com/business/money/la-fi-mo-oxfam-world-economic-forum-income-inequality20140120,0,7080817.story#axzz2s8ppvaqu [61] http://economictimes.indiatimes.com/news/news-by-industry/energy/power/iit-madras-project-to-supplylow-power-dc-may-end-outages/articleshow/29458820.cms [62] http://online.wsj.com/news/articles/sb10001424052702304851104579359141941621778?mg=reno64wsj&url=http%3a%2f%2fonline.wsj.com%2farticle%2fsb10001424052702304851104579359141941 621778.html [63] http://live.wsj.com/video/mystery-assault-on-power-grid-raises-alarms/9afcc446-5b2e-4749-a8ac6e4b0a8a7301.html#!9afcc446-5b2e-4749-a8ac-6e4b0a8a7301 11151 facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 
239-251 https://doi.org/10.2298/fuee2302239s
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

strength analysis of a blade with different cross-sections*

bader somaiday1, ireneusz czajka1, muhammad a. r. yass2
1agh university of science and technology, krakow, poland
2university of technology, baghdad, iraq

abstract. this paper examines how the cross-section airfoil type affects the efficiency of horizontal axis wind turbine (hawt) blades. three different airfoils were examined: asymmetric (naca 4412), symmetric (naca 0012), and supercritical (eppler 417). the analyses combined theory and experiment. theoretical analyses were carried out using a fortran 90 code and the blade element momentum-based qblade code. the blades were designed in solidworks and produced on a 3d printer for testing. the experimental tests supported the theoretical conclusions. the eppler 417 blade, which has a supercritical airfoil, performed better than the other blades examined. naca 4412, naca 0012, and eppler 417 have power coefficients of 0.516, 0.492, and 0.510, respectively. according to the experimental data, the eppler 417 airfoil outperforms the other airfoils in terms of power and speed reduction. to calculate the deformation and stresses of the three blades with different cross-sections, cfd analysis was done in ansys workbench. the cfd results showed that naca 4412 has the highest strength, but eppler 417 was considered the optimum cross-section based on power generation and acceptable stress values.

key words: hawt, cfd analysis, optimum power coefficient, qblade code

1. introduction

the aerodynamic efficiency of an airfoil is defined by the lift-to-drag ratio. it achieves the highest values at a specific angle of attack, and the value of this angle varies between airfoils.
[2] the lift-to-drag ratio depends on zero-lift drag, aspect ratio, and span efficiency and is independent of weight. the choice of airfoil is less restricted for a wind turbine than for an aircraft wing because a wind turbine operates at a lower speed than an aircraft. [3] there have been many theoretical and scientific studies on the performance of wind turbine blades.

received september 29, 2022; revised december 11, 2022; accepted january 06, 2023
corresponding author: bader somaiday, agh university of science and technology, krakow, poland, e-mail: somaiday@agh.edu.pl
* an earlier version of this paper was presented at the 7th virtual international conference science, technology and management in energy (energetics 2021), belgrade, serbia, december 16-17 2021 [1].

symbols
a     axial factor
a'    angular factor
cd    drag coefficient
cl    lift coefficient
cp    power coefficient
n     number of revolutions of rotor per minute (rpm)
p     power (w)
r     local radius (m)
R     radius of turbine rotor (m)
u     wind speed (m/s)
α     angle of attack (degree)
λ     tip speed ratio
φ     relative flow angle (degree)
ω     angular velocity of the rotor (rad/s)

acronyms
cfd     computational fluid dynamics
hawt    horizontal axis wind turbine
bem     blade element momentum
naca    national advisory committee for aeronautics
fem     finite element method
urans   unsteady reynolds averaged navier-stokes

in [4], an aerodynamic performance evaluation system was investigated using two groups of naca profiles for three hawt blades: a five-digit group (63-221, 65-415; 23012, 23021) and a four-digit group (2421, 2412, 4412, 4424). the same airfoils were used along the entire blade. a computer program was developed to automate the entire procedure. the results show that the elementary power coefficient of naca 4412 and naca 23012 was higher than that of the other profiles.
in [5], a stable aerodynamic design of a mini hawt using the naca 4412 profile, with a blade length of 800 mm and a power of 600 w, was proposed. the chord length and twist angle distributions of the initial blade model were calculated. a reasonable compromise between high efficiency and good starting torque was obtained. the blades were developed using matlab software. the optimized blade chord was reduced by 24% and the thickness by 44%. the power of the optimized blade was increased by 30% compared with the standard blade.

the design and optimization of a small hawt blade using custom code was described in [6]. the blades were made using naca 4412, naca 2412 and naca 1812 at a wind speed of 5 m/s, which was the most frequent wind speed prevailing in the indian peninsula. using a self-written code based on bem theory, an optimum blade profile was generated that performs with high efficiency using multiple airfoils. the twist angle distribution, chord distribution and other parameters for the different airfoil sections along the blade were determined using the proposed code. for a rotor with a diameter of 4.46 m, a power coefficient of 0.490 and an output power of 0.56 kw were obtained. the blade analysis result obtained using q-blade software showed reliable agreement with the proposed code and wind turbine performance analysis. the power coefficient obtained using the matlab code was 0.490, which was very close to that obtained using q-blade (0.514). in addition, the difference in output power between the two values was only 28.58 w.

the behaviour and performance of a multi-section hawt blade with and without a fence were studied in [7]. the multi-section hawt blade was designed using supercritical airfoils (sg6043, fx63-137s and fx66-s-196v). the overall performance of the multi-section blade was compared with the single-section naca4412 blade.
numerical analyses were performed using the author's code (fortran 90) and the qblade package based on bem theory. the multi-section blades showed an increase of about 8% in power coefficient compared with the single-section blades. boundary layer theory was used to design the fences, and their position was determined experimentally. the fences increased the total power coefficient by about 16% and gave high flutter stability.

in [8], the aeroelastic behaviour of flexible horizontal axis wind turbine (hawt) blades was investigated using a combined computational and aerodynamic modelling approach. to study the unsteady aerodynamic properties of the blade airfoil, the beddoes-leishman (b-l) dynamic stall model was added to the modified blade element momentum (bem) model.

in [9], the aeroelastic model of a multi-rotor hawt was investigated. the method integrated the single-rotor tool hawcstab2 with the multi-rotor tool hawc2. the fidelity of this linear time-invariant aeroelastic modelling method was verified by comparing the frequency responses of different rotors.

in [10], the authors studied how the aeroelastic behaviour of a multi-megawatt (mw) hawt is influenced by the fidelity of the aerodynamic simulation. the main purpose of that research was to compare engineering model results with cfd aeroelastic simulation results, which need less empirical modelling. to investigate the influence of the aerodynamic models on the aeroelastic results for a large hawt, two distinct models (bem- and cfd-based) were used.

the two-part study in [11] proposes horizontal axis wind turbine blades with improved aeroelastic performance and hence higher yearly energy output. the structural characteristics of a standard blade were then idealized using an adaptable shear.
the power curve of this design was evaluated, and it was shown to boost yearly energy output by 1.51% over traditional systems, with maximum power at a wind speed of 15 m/s.

in [12], the urans equations were coupled with the fem in a flexible way to describe the aeroelastic behaviour of the tjæreborg horizontal axis wind turbine blades. the approach was validated by comparing simulated and experimental data at four different horizontal inflow wind speeds. the aeroelastic behaviour of the tjæreborg wind turbine was also estimated and studied for yaw angles of 10, 30, and 60 degrees.

in [13], a horizontal axis wind turbine rotor blade model was developed to show the coupling between the bulk rotation of the rotor and the flexible motion of the blade. the model was created with lagrange's technique, and the blade was discretized using the finite element method (fem). the two different relationships between the aerodynamic wind loading and the structural behaviour are captured in this design.

in [14], a wind speed model was developed for n-bladed horizontal axis wind turbines that accounts for wind shear and the shadow effect of buildings. a systematic approach was used to calculate the wind shear, building shadow, synthesis, and equivalent wind velocity disturbance components, as well as their relative positions in the rotor disc region.

in [15], the influence of diagonal inflow on the wake parameters of a hawt inside a wind farm was investigated. a hawt with a generator rating of 30 kw and a rotor diameter of 10.0 m was employed in that work. a field study was used to analyse the influence of tilt angle on the energy and thrust efficiency of the hawt. on this foundation, the wake properties of the hawt were investigated for various wind orientations and pitch angles.
as a result, the peak power coefficients cp were 0.31, 0.33, and 0.27 at tip speed ratios λ = 7.5, 7.4, and 6.8 for pitch angles β = 0°, 2°, and 4°, respectively. the wind turbine experimental model predicts around 51% of annual power generation compared to the experimental research model.

in this study, the behaviour and performance of blades with different cross-sections, namely symmetrical, asymmetrical and supercritical airfoils (naca 0012, naca 4412 and eppler 417, respectively), were investigated experimentally and numerically in ansys. the horizontal axis wind turbine blade design was performed using a fortran 90 code and qblade software based on bem theory.

2. the blade cross-section

table 1 shows the airfoil characteristics of the cross-sections, and the airfoil shapes are shown in figure 1. the hawt rotor design parameters are shown in table 2.

table 1 cross-section airfoils distribution
airfoil       max t/c (at chord position)   max cl/cd (at angle of attack)
naca 4412     12% at 30%                    129.4 at 5.25°
naca 0012     12% at 30%                    75.6 at 7.5°
eppler 417    14.2% at 38.35%               135.9 at 2.25°

fig. 1 airfoils geometry

table 2 parameter of design
rotor diameter      1.07 m
hub diameter        0.20 m
number of blades    3
rated power         600 w
cut in speed        2 m/s

3. power coefficient

fig. 2 shows the actuator disk model used to find the maximum power that can be produced by wind flowing through the rotor ring [7]. the velocity across the disc is assumed to be constant (u2 = u3), and the pressures far upstream and far downstream are assumed to be equal. the following equations yield the rotor power coefficient [16]:

$$u_2 = \frac{u_1 + u_4}{2} \qquad (1)$$

$$a = \frac{u_1 - u_2}{u_1} \qquad (2)$$

$$u_2 = u_1 (1 - a) \qquad (3)$$

$$u_4 = u_1 (1 - 2a) \qquad (4)$$

$$p = \int_{r_h}^{R} dp = \int_{r_h}^{R} \omega \, dq \qquad (5)$$

$$c_p = \frac{p}{p_{wind}} = \frac{\int_{r_h}^{R} \omega \, dq}{\frac{1}{2} \rho \pi R^2 u_1^3} \qquad (6)$$

$$c_p = \frac{8}{\lambda^2} \int_{\lambda_h}^{\lambda} \lambda_r^3 \, a' (1 - a) \left[ 1 - \frac{c_d}{c_l} \cot\varphi \right] d\lambda_r \qquad (7)$$

the tip speed ratio is

$$\lambda = \frac{2 \pi n R}{60 \, u_1} \qquad (8)$$

here q denotes the rotor torque, ρ the air density, r_h the hub radius, a' the angular factor, λ_r = ωr/u_1 the local speed ratio at radius r, and λ_h its value at the hub.

fig. 2 wind turbine actuator disk model

4.
design and manufacturing blades

a program in fortran (f.90) was written and the qblade package was used to calculate the aerodynamic data and power coefficient based on blade element momentum (bem) theory, as shown in tables 3, 4 and 5 and figure 3. solidworks software was used to design the 3d blade shapes (see figure 4). the developed models were fabricated on a 3d printer (see figure 5). due to the limited size of the printer's print area, the blades were divided into several sections and then combined. the blades with different profiles in sections (naca 4412, naca 0012 and eppler 417) were mounted to the wind turbine for testing as shown in figure 6.

table 3 naca 4412 cross-section geometry
no.   position (m)   chord (m)   twist (deg)   foil
1     0.00           0.167       43.42         naca 4412
2     0.10           0.156       23.14         naca 4412
3     0.17           0.136       16.72         naca 4412
4     0.27           0.109       10.91         naca 4412
5     0.37           0.090       7.42          naca 4412
6     0.47           0.076       5.11          naca 4412
7     0.57           0.066       3.47          naca 4412
8     0.67           0.058       2.25          naca 4412
9     0.77           0.052       1.30          naca 4412
10    0.87           0.046       0.56          naca 4412
11    0.97           0.042       -0.05         naca 4412
12    1.07           0.038       -0.56         naca 4412

table 4 naca 0012 cross-section geometry
no.   position (m)   chord (m)   twist (deg)   foil
1     0.00           0.206       41.43         naca 0012
2     0.10           0.194       21.14         naca 0012
3     0.17           0.168       14.72         naca 0012
4     0.27           0.136       8.91          naca 0012
5     0.37           0.112       5.42          naca 0012
6     0.47           0.094       3.11          naca 0012
7     0.57           0.082       1.47          naca 0012
8     0.67           0.072       0.25          naca 0012
9     0.77           0.064       -0.69         naca 0012
10    0.87           0.057       -1.44         naca 0012
11    0.97           0.052       -2.05         naca 0012
12    1.07           0.048       -2.56         naca 0012

table 5 eppler 417 cross-section geometry
no.   position (m)   chord (m)   twist (deg)   foil
1     0.00           0.283       47.42         eppler 417
2     0.10           0.266       27.14         eppler 417
3     0.17           0.231       20.72         eppler 417
4     0.27           0.186       14.90         eppler 417
5     0.37           0.153       11.42         eppler 417
6     0.47           0.129       9.11          eppler 417
7     0.57           0.112       7.47          eppler 417
8     0.67           0.098       6.25          eppler 417
9     0.77           0.087       5.31          eppler 417
10    0.87           0.079       4.55          eppler 417
11    0.97           0.072       3.94          eppler 417
12    1.07           0.066       3.44          eppler 417
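the actuator disk relations (1) to (4) and the tip speed ratio (8) of section 3 can be illustrated with a minimal python sketch. it is not the authors' fortran 90 code; the function names are hypothetical, and the ideal power coefficient 4a(1 - a)^2 is the standard actuator disk result, which the text does not state explicitly. the sample call uses 178 rpm at 7.5 m/s from the experimental results and assumes a rotor radius of 1.07 m (the last blade station in tables 3 to 5); table 2 lists 1.07 m as the rotor diameter, so this value is an assumption.

```python
import math


def disk_velocities(u1, a):
    """Velocity at the disk (eq. 3) and in the far wake (eq. 4).

    u1 is the free-stream wind speed (m/s), a the axial factor.
    """
    u2 = u1 * (1.0 - a)
    u4 = u1 * (1.0 - 2.0 * a)
    return u2, u4


def tip_speed_ratio(n_rpm, rotor_radius, u1):
    """Tip speed ratio from eq. 8: lambda = 2*pi*n*R / (60*u1)."""
    return 2.0 * math.pi * n_rpm * rotor_radius / (60.0 * u1)


def ideal_power_coefficient(a):
    """Standard actuator disk result cp = 4a(1-a)^2 (not given in the text)."""
    return 4.0 * a * (1.0 - a) ** 2


if __name__ == "__main__":
    # Hypothetical illustration: a = 1/3 is the axial factor that maximises
    # the ideal power coefficient (Betz limit).
    u2, u4 = disk_velocities(u1=7.5, a=1.0 / 3.0)
    print("u2 =", u2, "m/s, u4 =", u4, "m/s")
    print("ideal cp =", ideal_power_coefficient(1.0 / 3.0))
    # Rotor radius of 1.07 m is an assumption (see lead-in).
    print("lambda =", tip_speed_ratio(n_rpm=178, rotor_radius=1.07, u1=7.5))
```

the sketch only covers the one-dimensional momentum part of the model; the full power coefficient of equation (7) additionally needs the radial distributions of a, a', cl and cd, which a bem solver such as qblade iterates for each blade station.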
fig. 3 cross-section blades by qblade package

fig. 4 cross-section blade by solidworks (a) naca 4412 blade (b) naca 0012 blade (c) eppler 417 blade

fig. 5 3d printing process

fig. 6 wind turbines (a) with naca4412 cross-section blades (b) with naca0012 cross-section blades (c) with eppler 417 cross-section blades

the material assigned to the blades was carbon fiber, and its properties are shown in figure 7. applied pressures of 14.25 pa, 251 pa, 986.4 pa and 1370 pa were used to derive the post-processing results of total deformation, equivalent stress and equivalent strain shown in table 7.

fig. 7 carbon fiber properties

5. results and discussion

the primary design element of a wind turbine blade is the airfoil cross-section, which transforms the airflow velocity into a pressure distribution along the length of the blade. in this investigation, three profiles were used: symmetrical, asymmetrical, and supercritical. when evaluating the performance of the profiles, the primary factors to consider are the amount of energy absorbed from the free stream, the maximum lift-to-drag ratio, and the angle of attack. not just the peak power coefficient but also the overall efficiency of the airfoil cross-section was taken into account. figure 8 demonstrates that, compared to the other profiles, the eppler 417 profile produced less drag. additionally, as shown in figure 9, the eppler 417 profile produced the greatest pressure dispersion in the second third of the blade radius. as indicated in fig. 10, the naca 4412 profile had the highest power coefficient (cp = 0.516), followed by the eppler 417 (cp = 0.510) and the naca 0012 (cp = 0.491) profiles. however, according to figure 11, the eppler 417 profile had the highest overall efficiency.
according to the experimental findings (see table 6), eppler 417 performs best and produces more power than the other profiles. the cfd results showed that naca 4412 undergoes the least deformation and stress (figures 12 to 17 and table 7).

fig. 8 normal force distribution along the blades radius

fig. 9 tangential force distribution along the blades radius

fig. 10 the power coefficient of the cross-sections blades versus tip speed ratio

fig. 11 the area under power coefficient curve (a) naca 4412 cross-section blade (b) naca 0012 cross-section blade (c) eppler 417 cross-section blade

table 6 experimental results
wind speed    naca 4412            naca 0012            eppler 417
(m/s)         rpm    power (w)     rpm    power (w)     rpm    power (w)
3             62     13            51     10            68     20
4.2           88     93            69     28            95     122
5.4           107    144           96     125           122    173
6.5           125    306           114    285           147    330
7.5           171    506           132    363           178    546

fig. 12 equivalent stress for naca 4412 at 251 pa and 1370 pa

fig. 13 equivalent stress for naca 0012 at 251 pa and 1370 pa

fig. 14 equivalent stress for eppler 417 at 251 pa and 1370 pa

fig. 15 total deformation for naca 4412 at 251 pa and 1370 pa

fig. 16 total deformation for naca 0012 at 251 pa and 1370 pa

fig. 17 total deformation for eppler 417 at 251 pa and 1370 pa

table 7 cfd results
blade model   applied pressure (pa)   total deformation (mm)   equivalent stress (mpa)   equivalent strain (mm/mm)
eppler 417    14.25                   0.1016                   0.0653                    3.49e-06
eppler 417    251                     1.7888                   1.1505                    6.15e-05
eppler 417    986.4                   7.0299                   4.5213                    2.42e-04
eppler 417    1370                    9.7638                   6.2796                    3.35e-04
naca 0012     14.25                   0.4138                   0.1628                    8.70e-06
naca 0012     251                     7.2898                   2.8688                    1.53e-04
naca 0012     986.4                   28.648                   11.274                    6.02e-04
naca 0012     1370                    39.789                   15.659                    8.37e-04
naca 4412     14.25                   2.05e-04                 1.49e-04                  8.13e-09
naca 4412     251                     3.61e-03                 2.62e-03                  1.43e-07
naca 4412     986.4                   1.42e-02                 0.01028                   5.63e-07
naca 4412     1370                    1.97e-02                 0.01427                   7.81e-07

6. conclusions

the effects of several cross-section airfoil types on hawt blade efficiency were studied.
analysis was done on three different airfoils: supercritical (eppler 417), symmetric (naca 0012), and asymmetric (naca 4412). the analyses combined theory and experiment. theoretical analyses were carried out using a fortran 90 code and the blade element momentum-based qblade code. the experimental tests supported the theoretical conclusions. at a small angle of attack, supercritical airfoils always produce the highest lift-to-drag ratio. eppler 417 has a high chord length and twist angle, so it generates the highest power. the cfd results show that naca 4412 has the lowest total deformation and equivalent stress. this is because naca 4412 has a greater cross-section area, and stress is inversely proportional to the area. overall, eppler 417 is the optimum blade cross-section, as it produces more power and has less deformation than naca 0012.

references
[1] b. somaiday, i. czajka, m. a. r. yass, "the influence of cross-section airfoil on the hawt efficiency", in proceedings of the 7th virtual international conference on science, technology and management in energy (energetics 2021), belgrade, serbia, december 16-17, 2021, pp. 545-551.
[2] c. bak et al., "wind tunnel test on wind turbine airfoil with adaptive trailing edge geometry", in proceedings of the 45th aiaa aerospace sciences meeting and exhibit, 2007, p. 1016.
[3] d. g. hull, fundamentals of airplane flight mechanics, vol. 19. springer, 2007.
[4] n. tenguria, n. d. mittal and s. ahmed, "evaluation of performance of horizontal axis wind turbine blades based on optimal rotor theory", j. urban environ. eng., vol. 5, no. 1, pp. 15-23, 2011.
[5] s. a. kale and r. n. varma, "aerodynamic design of a horizontal axis micro wind turbine blade using naca 4412 profile", int. j. renew. energy res., vol. 4, no. 1, pp. 69-72, 2014.
[6] f. javed, s. javed, t. bilal and v.
rastogi, "design of multiple airfoil hawt blade using matlab programming", in proceedings of the ieee international conference renewable energy resources application, 2016, pp. 425-430.
[7] a. h. muheisen, m. a. r. yass and i. k. irthiea, "enhancement of horizontal wind turbine blade performance using multiple airfoils sections and fences", j. king saud univ. eng. sci., 2021.
[8] w. w. mo, d. y. li, x. n. wang, c. t. zhong, "aeroelastic coupling analysis of the flexible blade of a wind turbine", energy, vol. 89, pp. 1001-1009, 2015.
[9] o. t. filsoof, a. yde, p. bøttcher and x. zhang, "on critical aeroelastic modes of a tri-rotor wind turbine", int. j. mech. sci., p. 106525, 2021.
[10] m. sayed, l. klein, th. lutz and e. kramer, "the impact of the aerodynamic model fidelity on the aeroelastic response of a multi-megawatt wind turbine", renew. energy, vol. 140, pp. 304-318, 2019.
[11] m. capuzzi, a. pirrera and p. m. weaver, "a novel adaptive blade concept for large-scale wind turbines. part ii: structural design and power performance", energy, vol. 73, pp. 25-32, 2014.
[12] l. dai, q. zhou, y. zhang, s. yao, s. kang and x. wang, "analysis of wind turbine blades aeroelastic performance under yaw conditions", j. wind eng. ind. aerod., vol. 171, pp. 273-287, 2017.
[13] d. ju and q. sun, "modeling of a wind turbine rotor blade system", j. vib. acoust. trans. asme, vol. 139, pp. 1-15, 2017.
[14] s. wan, l. cheng and x. sheng, "numerical analysis of the spatial distribution of equivalent wind speeds in large-scale wind turbines", j. mech. sci. technol., vol. 31, no. 2, pp. 965-974, 2017.
[15] y. wang, y. kamada, t. maeda, j. xu, s. zhou, f. zhang and c. cai, "diagonal inflow effect on the wake characteristics of a horizontal axis wind turbine with gaussian model and field measurements", energy, vol. 238, p. 121692, 2022.
[16] m. ragheb and a. m. ragheb, "wind turbines theory-the betz equation and optimal rotor tip speed ratio", fundam. adv. top. wind power, vol. 1, no.
1, pp. 19-38, 2011.
[17] m. a. r. yass, "highest power coefficient of horizontal axis wing turbine (hawt) using multiple airfoil section", test eng. manag., vol. 83, pp. 30029-30041, 2020.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 435-453 doi: 10.2298/fuee1403435n

wireless sensor node with low-power sensing

goran nikolić1, mile stojčev1, zoran stamenković2, goran panić2, branislav petrović1
1faculty of electronic engineering, university of niš, niš, serbia
2ihp innovations for high performance microelectronics, frankfurt, germany

abstract. a wireless sensor network consists of a large number of simple sensor nodes that collect information from the external environment with sensors, process the information, and communicate with neighboring nodes in the network. usually, sensor nodes operate unattended on exhaustible batteries. since manual replacement or recharging of the batteries is not an easy, desirable or always possible task, power consumption becomes a very important issue in the development of these networks. the total power consumption of a node is the result of all steps of its operation: sensing, data processing and radio transmission. most published papers assume that the sensing subsystem consumes significantly less energy than the radio block. however, this assumption does not hold in numerous applications, especially when the power consumption of the sensing activity is comparable to or bigger than that of the radio. in that context, this work focuses on the impact of the sensing hardware on the total power consumption of a sensor node. firstly, we describe the structure of the sensor node architecture, identify its key energy consumption sources, and introduce an energy model for the sensing subsystem as a building block of a node.
secondly, with the aim of reducing energy consumption, we investigate the joint effectiveness of two common power-saving techniques in a specific sensor node: duty-cycling and power-gating. duty-cycling is effective at the system level. it is used for switching a node between active and sleep mode (with a duty-cycle factor of 1%, a reduction in dynamic energy consumption is achieved). power-gating is used at the circuit level with the goal of decreasing the power loss due to leakage current (in our design, a reduction of the dynamic and static energy consumption of off-chip sensor elements, as constituents of the sensing hardware within a node, is achieved). compared to a sensor node architecture in which both energy saving techniques are omitted, the matlab simulation results suggest that in total, thanks to the duty-cycling and power-gating techniques, a reduction of three orders of magnitude in the energy consumption of sensing activities can be achieved.

key words: wireless sensor networks, sensor elements, power consumption, duty-cycling, power-gating

received february 18, 2014; received in revised form may 29, 2014
corresponding author: goran nikolić, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (goran.nikolic@elfak.ni.ac.rs)

1. introduction

wireless sensor networks, wsns, consist of a large number of sensor nodes, sns, deployed randomly (or in some specific places) within a restricted area. applications of wsns include consumer electronics, military target tracking, industrial monitoring, health monitoring, home environmental control, forest fire detection, greenhouse monitoring, etc. [1]. since sns are usually battery-powered devices and operate unattended for a relatively long period of time, maximizing the energy efficiency of the sn is critical [1], [2]. typically, this constraint is imposed by the limited capacity of the sn's battery [3].
to optimize the design of an sn, an accurate power consumption model, which allows a good forecast of battery lifetime, is needed. in order to extend the lifetime of an sn, a wide variety of techniques for minimizing the sn's energy consumption have been proposed in the literature [4], [5], [6]. some of them save energy at the mac (media access control) level [7], [8], [9], others in routing protocols [10], [11], [12], others through data dissemination, aggregation or fusion [13], [14], others by introducing novel architectures that use optimized radio and digital parts [15], [16], [17], and still others employ on-chip power gating in order to reduce the static power loss [18], [19]. to address the problem of power saving within an sn, two promising approaches based on dynamic voltage scaling [20], [21] and power gating [22], [23] are used. the first is a useful solution for high performance sns, while the second is effective in sns operating with a low duty-cycle, where the sns alternate between off and on states to minimize energy consumption [22], [16], [17].

sns, as constituents of a wsn, are capable of performing computation-, communication- and sensing-oriented tasks. accurate prediction of the sn lifetime requires an accurate energy consumption model and an estimate of sensor activities. an energy model which accurately reveals the energy consumption of the sn is extremely important for protocol development, sensor node micro-architecture design (radio, microcontroller and sensing subsystem), battery capacity selection, and performance evaluation in wsns. there have been various attempts to model sn energy consumption. in [24] a model that includes mcu processing and radio transmission and reception is considered. in [25] and [26] sensing activities, including sensor sensing, sensor logging and actuation, are omitted. in [23] a comprehensive energy model for wsns that takes into account all key energy consumption sources within an sn is described.
by studying component energy consumption in different sn states, the authors of [27] present energy models of the sn core components. in [28] a combination of two complementary approaches intended to reduce the energy consumed by a sensor node is proposed: duty cycling (waking up a sensing board only for the time needed to acquire a new set of samples and powering it off immediately afterwards) and an adaptive sensing strategy (a computation-heavy approach that dynamically adapts the sensor activity to the real dynamics of the process). as reported in [4], [29], [30], the on-time radio operation dominates the system power budget by an order of magnitude with respect to the other two operations (data processing and sensing) combined, even when the radio module operates at a low duty cycle (approximately 1 to 2 %). since data processing and sensing activities account for a small fraction of the power budget, the authors suggest that improving the sn's lifetime requires a significant reduction in communication activities.

however, our current research shows that by using a more realistic power consumption model of the sensing subsystem, one which clearly separates the power consumption of each sensor element, it is possible to derive clearer results which provide insight into which sensing elements limit the wsn performance. in other words, in this work we extract the impact of the sensing hardware on the total power consumption and show that the contribution of the sensing subsystem to the total power consumption of the sn cannot be neglected, especially when wsns with medium- or high-energy-consuming sensor elements are used. the main novelty presented in this paper is the joint use of two common power-saving techniques (duty cycling and power gating) during the operation of a sensor node.
due to space constraints, this paper concentrates only on the power consumption of the sensing subsystem. for discussions of wireless communication and data processing activities, readers can refer to [6], [27], [30], [31].

the rest of the paper is organized as follows. in section 2, the sensor node architecture is introduced and the operating functions of all constituents are identified. in addition, details concerning the connection of sensor elements and the power supply are given. section 3 concentrates on the sensor node energy profile. the justification for using two power-saving techniques, duty-cycling at the system level and off-chip power-gating at the sensing subsystem level, is discussed there, too. section 4 deals with power estimation; the energy profile during initialization and sensing activities is also calculated. section 5 concludes the paper.

2. sensor node architecture

the overall hardware structure of an sn is presented in fig. 1.

fig. 1 overall block scheme of a sensor node.

the sn consists of several building blocks:

a) mcu – referred to as the processing subsystem, it controls the operation of all constituents within the sn and performs data processing. the mcu includes a microcontroller and memory for local data processing. most existing processing subsystems employ microcontrollers, notably texas instruments' msp430, intel's strongarm, or atmel's avr. these microcontrollers allow some of their internal components to be turned off completely when they are idle or asleep. cmos-compatible memories, including static random-access memory (sram) and embedded dynamic random-access memory (dram), permit sns to perform more complex digital signal processing algorithms (collection, aggregation, and compression) and to log more sensor data.

b) off-chip sensor elements (ocse) – called the sensing subsystem and implemented as a set of passive and active sensors (digital or analog) that convert input information from the external environment into electrical signals. in most applications, wireless sns are used for monitoring light, pressure, vibration, flow rates in pipelines, temperature, ventilation, electricity, etc. commonly, sensor elements generate voltage or current signals at their outputs. these signals are first amplified (conditioned) and then digitized with an analog-to-digital converter (adc) before the data are digitally processed, stored and transmitted.

c) radio block (rb) – implemented as a short-range transceiver which provides wireless communication with the host or with other sns within a wsn. the power consumption of a transceiver can be reduced both i) at the circuit level, by developing more energy-efficient rf circuits (using weak-inversion operation in the rf building blocks, rf-mems passive components, or ultra-wideband transceivers which send narrow pulses of energy to transmit data), and ii) at the system level, by using rf communication more economically (shortening the communication distance, minimizing the amount of data sent over the rf link, using energy-efficient communication protocols, or powering down the transceiver during idle periods, i.e., applying a duty-cycling concept). for more details on this topic see reference [30].

d) battery supply unit (bsu) – a part of the power subsystem acting as a controllable unit which individually switches the power supply of each of the sn's building blocks on and off. the bsu is responsible for providing the right supply voltage to each individual sn hardware component. a bulky battery is included in the bsu to power the sn's subsystems.
the bsu is a very important building block of the sn, and numerous techniques based on the efficient exploitation of energy resources have been introduced with the aim of prolonging the wsn lifetime. for more details see [31]. as already mentioned, sns are currently powered by batteries. however, batteries have several disadvantages, including: i) the need to either replace or recharge them periodically; and ii) a large size and weight compared to the sn electronics. one promising way to overcome these drawbacks is to harvest energy from the environment, either to recharge a battery or to power the sn directly. as presented in table 1, energy harvesting circuits can be classified into two groups.

table 1 classification of energy harvesting circuits

energy source | type of energy
human | kinetic, thermal
environment | kinetic, thermal and radiation

for more details see references [32], [33]. among the most popular harvesting circuits used in sns are those that convert solar energy, a radiation type of energy. the main advantages of using solar energy are as follows: i) it is excellent in remote or difficult-to-access locations; ii) it is a totally clean and renewable source; iii) it is suitable for supplying small current loads such as sns; and iv) this use of solar energy is feasible throughout the entire territory of any country. depending on the specific application, sns may also include additional components such as a location-finding system to determine their position, a mobilizing unit to change their location, etc. more details about sn architectures and the functions of their building blocks can be found in [34]. different types of communication interfaces, such as parallel and serial buses, interconnect the aforementioned subsystems.
among serial buses, the most frequently used interconnects are spi (serial peripheral interface) and i2c (inter-integrated circuit). spi is the preferable design solution for high-speed communication, and i2c for low-speed communication. today's wireless sn is a simple device, and the components that make up its subsystems are commonplace off-the-shelf components, usually located on a printed circuit board.

2.1. connecting sensor elements

within an sn architecture, sensor elements can be implemented as: a) on-chip constituents, typical of future-generation (advanced system-on-chip, soc) wireless sn designs; and b) off-chip constituents, where the sn is composed of discrete components, typical of currently available off-the-shelf wireless sn systems. recent progress in ultra-low-power circuit design is creating new opportunities for sn architectures with on-chip temperature and image sensor elements [35], [36]. important advances have been made, such as millimeter-scale sns with standby power as low as 30 pW [37] and microwatt successive approximation register (sar) adcs with a figure of merit down to 4.4 fJ per conversion step [38], but many design challenges remain open. our design choice is based on the use of off-the-shelf components. this implies that the sensor elements are of the off-chip type, i.e., components connected externally to the adc (in our proposal the adc is a constituent of the mcu). in this paper, by using adequate energy models, we consider implementations of the duty-cycling and power-gating techniques and investigate how the dynamic and static power can be reduced when both power-saving approaches are used.

2.2. power supply subsystems

in an sn, each subsystem or circuit requires a different supply voltage for its operation. for example, in the most common current designs, the mcu and other digital circuits can run at a supply voltage in the range from 1.8 V to 3 V.
analog components such as the rf transceiver and the sensor elements require, in order to provide correct operation and noise margins, higher supply voltages, which range from 1.2 V to 2.5 V. batteries (lithium, 3.3 V−4.2 V) incorporated as power sources in sns are limited in their output voltage by their chemistries, and their voltages degrade with use. since battery voltages do not usually match the desired subsystem or circuit supply voltages, power-converting electronics, either a switching dc-to-dc converter or a linear low drop-out (ldo) voltage regulator, is used. bearing in mind that the current consumption of an sn ranges from several tens of mA (in active mode) down to several µA (in sleep mode), the power electronics must be specifically designed for low-power operation. as the preferable solution, we propose a linear low drop-out voltage regulator for powering the sn subsystems. in general, for powering low-power devices such as sns, a linear low drop-out voltage regulator performs better than a dc-to-dc converter (dc-to-dc converters are usually designed for high output power levels and do not efficiently convert the low power levels needed by sns [30]).

before describing the principle of operation of the bsu (see fig. 1), it is necessary to explain the meaning of two terms: power gating and duty cycle. power gating is a technique used in integrated circuit design to reduce power consumption by shutting off the current to blocks of the circuit that are not in use [39]. a duty cycle is the percentage of one period in which a signal is active. a period is the time it takes for a signal to complete an on-and-off cycle [40]. in our case, the time intervals during which the sn is on and off are known as its active time interval, $t_{on}$, and its inactive (sleep) time interval, $t_{off}$, respectively.
according to the above, the duty cycle (dc) is defined as:

$$DC = \frac{t_{on}}{t_{on} + t_{off}} \qquad (1)$$

the focus of our interest in this paper is the implementation of a power distribution system, as part of the bsu, which switches both the sensor elements within the ocse and the transceiver (as a constituent of the rb) on and off in a timely defined manner. by combining duty-cycling, which relates to powering the sn at the system level, with power-gating, which is intended to power the sn at the sensor element level, a significant saving of dynamic and static power during sn operation can be achieved. the global scheme of the bsu is presented in fig. 2. it consists of:

a) battery – acts as the main energy source for powering the sn's functional units;

b) dual-channel controllable ldo regulator – implemented as a single-input (in1), two-output (out1 and out2) linear low drop-out voltage regulator. by setting the control enable signals en1 and en2 to logic one/zero, the voltage at the outputs out1 and out2 can be switched on/off. the voltage at output out1 is always present, since en1 = 1, while the voltage at output out2 can be switched on/off by setting en2 = 1/0, respectively. power-gating for the ocse is achieved by switching the pin out2 (an output of the ldo, see fig. 2) on/off.

[fig. 2 block labels: battery; dual-channel controllable ldo regulator (in, out1, out2, en1, en2); controllable turn-on/off load switches cls1, cls2, ..., clsn; analog or digital sensors se1, se2, ..., sen; uninterruptable power supply line to the mcu; on/off power supply line to the rf block; global power-gating enable line for the sensor block; individual power-gating enable lines for the sensor elements, from the mcu; sensor outputs to the adc or spi part of the mcu]

fig.
2 power distribution system of sensor node

c) controllable turn-on/off load switches (clss) – each cls is implemented as a p-channel or n-channel mosfet transistor which can be individually switched on/off. in this manner, power-gating at a local control level within the sensing subsystem is provided (i.e., the mcu can separately switch the power supply voltage of each sensor element on/off by setting the corresponding control line to logic one/zero).

3. energy profile

the wsn considered in this paper is composed of several sns deployed in a restricted area. this system is primarily intended to monitor scalar values such as acceleration, space orientation, and audio signals. in this type of application almost none of the mentioned sensor measurements need to be taken continuously, which implies that the environmental conditions can be sampled periodically. for example, taking one sample every two minutes could be adequate to monitor temperature, pressure, light, humidity, etc. power management is an efficient way to conserve energy in a wsn. the crucial idea of power management is to dynamically make the sns inactive in order to reduce their energy consumption, i.e., to decide when an sn should go to the inactive state and how long it should stay there. most power management strategies proposed in the literature [31], [41] assume that data acquisition (sensing activity) consumes significantly less energy than wireless data transmission [4]. however, in a large number of practical applications this assumption does not hold, especially when the power consumption of an active (not passive) sensor element is comparable to that of the communication subsystem.
a similar problem was considered in [42], [43]. in order to cope with this challenge effectively, we propose to implement the power management concept at two levels: the system level and the component level. at the first level, a duty-cycle technique is used, by which we identify the idle and active time periods of the sn's constituents. at the second level, a power-gating technique is used, by which unused sensor elements are switched off while the sensor element being read is switched on. in other words, our goal is that, for most of the time, the unnecessary power consumption of sensor elements caused by a non-optimal configuration of hardware and software components is significantly reduced. let us note that a sensor node, as an electrical system, is time invariant, i.e., its total energy consumption depends on its individual energy consumption components. having this in mind, in the sequel we separately analyze the effects and benefits of implementing the duty-cycling and power-gating techniques on energy consumption, only for the sensing subsystem as an sn constituent.

3.1. duty cycling

duty cycling is a well-known technique for minimizing power consumption in wireless sns. the main idea behind it is clear: keep the hardware (the sensing and communication subsystems, and some parts of the power and processing subsystems, see fig. 1) in a low-power sleep state, except when the hardware is needed. many realizations of the duty-cycling technique allow even the mcu to be put into a low-power state for long time periods, while its internal or external clock tracks the time in order to trigger a later wake-up. the wake-up time is the time from the activation of the interrupt signal (by a real-time clock, rtc, circuit) to the beginning of the interrupt service routine. let us note that all activities related to duty-cycle operation (switching the transceiver, the mcu, and the low drop-out regulator into different power modes) are performed by the mcu under software control.

the total energy consumed by an sn, $E_t$, is the sum of the dynamic (active) energy, $E_d$, and the static (leakage) energy loss, $E_s$:

$$E_t = E_d + E_s = DC \cdot t_{lf} \cdot P_d + t_{lf} \cdot P_s \qquad (2)$$

where $0 < DC < 1$, $t_{lf}$ is the lifetime of the sn, and $P_d$ and $P_s$ correspond to the dynamic and static power, respectively. from eq. (2), the portion of energy lost due to leakage is

$$\frac{E_s}{E_t} = \frac{1}{1 + DC \cdot \dfrac{P_d}{P_s}} \qquad (3)$$

the ratio $P_d / P_s$ is technology dependent and depends on the mos transistor channel properties. similarly to [18] (see footnote 1), taking the corresponding $P_d / P_s$ for three different cmos technologies, we have calculated the share of the energy loss in the total energy consumption as a function of the dc factor. the obtained results are presented in fig. 3.

[fig. 3 plot: $E_s / E_t$ [%] versus duty cycle, with curves for the 0.25 µm, 0.18 µm and 0.13 µm cmos technologies]

fig. 3 energy loss due to leakage as a function of duty-cycle for different cmos technologies

for the digital components of the sn, similarly to reference [44], we assume that $P_d / P_s$ is approximately 1000 for the 0.25 µm technology, approximately 20 for the 0.18 µm technology, and approximately 4 for the 0.13 µm technology.

footnote 1: for the sake of clarity, reference [18] defines the power consumption in the active state ($P_a = P_d + P_s$) and the power consumption in the inactive state ($P_i = P_s$).

by analyzing fig. 3 we can conclude the following:

1. with cmos technology scaling, the energy loss due to static power increases. in other words, the static power loss becomes comparable to the dynamic power loss (a large amount of power is lost due to the leakage currents of cmos circuitry [45]).

2.
in standard applications the dc factor of the sn is low (around 1%), which makes the total system power dominated by the standby power, i.e., by static power losses.

3. theoretically, better energy efficiency (achieved by decreasing $E_d$) can be obtained by further decreasing the dc factor. however, in this case the influence of the clock system, as a component of the sn, on the overall time synchronization accuracy of the wsn becomes critical [46], [47]. in particular, the impact of variations in environmental temperature on clock drift is pronounced in highly duty-cycled wireless sns [47].

3.2. power gating

with the aim of switching off the leakage currents of inactive sensor elements, we decided to implement power-gating, because as a design technique it is primarily used to reduce the overall static power loss in a circuit [48]. the efficiency of power gating depends on the activity profile of the sn's components. by adopting an event-driven control mechanism, we first present the activity model of an sn at a general level (see fig. 4), and then in section 4 we study the energy consumption of the sensor elements (constituents of the sensing subsystem, see fig. 2) that are switched on and off during the sensing period.

[fig. 4 panels: (a) events profile: wake-up events generated by the on-chip rtc with period t = 2 min; (b) duty-cycle profile: sn_active state ($t_{on}$) and sn_sleep state ($t_{off}$); (c) profile of sn activities during the sn_active state: initialization, sensing, data processing, communication; the initialization and sensing activities, of duration $t_{sen}$, are detailed in fig. 6 and the power consumption of a single sensor element in fig. 5]

fig. 4 activity profile of a sensor node

as can be seen from fig. 4 a), the rtc circuit, as a building block of the mcu, periodically generates an interrupt signal called wakeup. the period of wakeup is t, in our case t = 2 min.
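as a numerical illustration of eqs. (1) and (3) from section 3.1, the following python sketch computes the duty cycle for the wake-up period t = 2 min and the resulting share of leakage energy for the three $P_d / P_s$ ratios assumed in the text. the function names and the active time of 1.2 s (chosen only so that dc = 1%) are illustrative assumptions, not part of the original design.

```python
def duty_cycle(t_on: float, t_off: float) -> float:
    """Eq. (1): DC = t_on / (t_on + t_off)."""
    if t_on < 0 or t_off < 0 or t_on + t_off <= 0:
        raise ValueError("t_on and t_off must be non-negative and not both zero")
    return t_on / (t_on + t_off)


def leakage_fraction(dc: float, pd_over_ps: float) -> float:
    """Eq. (3): E_s / E_t = 1 / (1 + DC * P_d / P_s)."""
    if not 0.0 < dc <= 1.0:
        raise ValueError("duty cycle must lie in (0, 1]")
    return 1.0 / (1.0 + dc * pd_over_ps)


# P_d / P_s ratios assumed in the text for the three CMOS technologies.
PD_OVER_PS = {"0.25 um": 1000.0, "0.18 um": 20.0, "0.13 um": 4.0}

T = 120.0    # wake-up period in seconds (T = 2 min)
T_ON = 1.2   # assumed active time in seconds, chosen to give DC = 1 %

dc = duty_cycle(T_ON, T - T_ON)
fractions = {tech: leakage_fraction(dc, r) for tech, r in PD_OVER_PS.items()}

for tech, share in fractions.items():
    print(f"{tech}: E_s/E_t = {100 * share:.1f} %")
```

with dc = 1% the leakage share comes out at roughly 9%, 83% and 96% for the 0.25 µm, 0.18 µm and 0.13 µm technologies, respectively, which is in line with the observation that at low duty cycles the total energy is dominated by static losses in scaled technologies.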
the appearance of the signal wakeup initiates a sn and it enters into sn_active state (see fig. 4b)). during sn_active state (fig. 4c)) four sequential activities are performed, initialization, sensing, data processing, and communication. activity initialization deals with restoring the content of mcu registers to the preceding sn_active state and setting peripherals (ldo regulator, controllable load switches, and transceiver – see fig. 2) into the corresponding operating mode. the sensing activity is responsible for information collection and analog-to-digital conversion. the energy consumption during this activity comes from multiple operations, including power-on (-off) switching of sensor elements, signal sampling, and analog-todigital conversion/spi communication. if we assume that n sensor elements are connected to the mcu (see fig. 2), then the total energy consumption of the sensing subsystem, est, can be expressed as: 1 ( ) n st foi ofi wi ci i e n e e e e       (4) where:  efoi (eofi) is the one time energy consumption of opening (closing) sensor element operation – switching sensor element i from off (on) to on (off) state;  ewi – energy consumption during warm-up time period of sensor element i;  eci – energy consumption during analog-to-digital conversion period;  n – number of sn active states during lifetime of a sn, and  n – number of sensor elements in a sn. power consumption profile of a sn during single sensing activity of a sensor element is sketched in fig. 5. power time off on t  wu t con t on off t  off p on p sensor period t fig. 5 power consumption profile of a single sensor element notice: a time interval toffon (tonoff) includes transient time of controllable load switch clsi, i = 1,...,8, and transient time of a sensor element sei wireless sensor node with low-power sensing 445 as is marked in fig. 
5, $t_{off \to on}$ ($t_{on \to off}$) corresponds to the time interval needed for switching the sensor element from the off (on) to the on (off) state, $t_{wu}$ to the warm-up time interval, and $t_{con}$ to the analog-to-digital conversion time interval. the basic constituents of most sensor elements are analog circuits (input and output amplifiers, active filters, etc.). in analog circuits, the power gates must be turned on long enough before active system operation to allow the circuits to reach a stable dc state. this implies that the sensor elements, and their coupling with the source of the stimulus, cannot always respond instantly. namely, the sensor has a time-dependent characteristic, and a delay (latency) appears in representing the true value of a stimulus. in fig. 5 this delay corresponds to the warm-up time $t_{wu}$. in essence, the warm-up time is the time between applying power or an excitation signal to the sensor and the moment when the sensor can operate within its specified accuracy [48]. the warm-up time depends on the type of sensor. many sensors have a negligibly short warm-up time (in the range from 100 µs up to 1 ms), but those that operate in thermally or humidity-controlled environments, such as thermostat and humidity sensors, may require considerably longer, up to seconds or minutes of warm-up time after power-up. from the aspect of energy consumption, a sensor with a shorter warm-up time causes a lower energy loss. let us assume that all sensors are homogeneous. this means that for every i, i = 1,...,n, the following holds: $E_{wi} = E_w$, $E_{ci} = E_c$ and $E_{foi} = E_{ofi} = E_s$. according to the aforementioned, the eq.
(4) now has the form

$$E_{st} = N \cdot n \cdot \left( 2E_s + E_w + E_c \right) \qquad (5)$$

the total energy consumed during the warm-up time is

$$E_{tw} = N \cdot n \cdot E_w \qquad (6)$$

if we take that, on average, $E_s = 0.1 E_c$, then the portion of energy due to the warm-up is

$$\frac{E_{tw}}{E_{st}} = \frac{N \cdot n \cdot E_w}{N \cdot n \cdot \left( 2E_s + E_w + E_c \right)} = \frac{1}{1 + 1.2 \dfrac{E_c}{E_w}} \qquad (7)$$

if we further take $E_w = k \cdot E_c$, where k is a real number, the portion of energy loss, $E_{tw} / E_{st}$, as a function of k is presented in table 2.

table 2 portion of energy lost in terms of k

k | 0 | 0.1 | 0.2 | 0.5 | 0.8 | 1 | 2 | 5 | 10 | 20 | 50 | 100 | 1000 | ∞
E_tw / E_st | 0 | 0.077 | 0.143 | 0.294 | 0.400 | 0.454 | 0.625 | 0.800 | 0.892 | 0.943 | 0.976 | 0.989 | 0.999 | 1

by analyzing the results presented in table 2 we can conclude that, as the warm-up time increases, the portion $E_{tw} / E_{st}$ asymptotically approaches 1. this means that at the lower limit, $t_{wu} = 0$, the total energy loss is $E_{st} = N \cdot n \cdot (2E_s + E_c)$, and at the upper limit, $t_{wu} \to \infty$, the total energy loss is $E_{st} \approx N \cdot n \cdot E_w$, i.e., $E_w$ becomes dominant. in general, the better design solution concerning $E_{tw}$ is one in which $t_{sw} \to 0$, but in that case the sensor elements are active all the time. as a direct consequence of this approach, the power consumption of the sensing subsystem will be high. to cope efficiently with this problem, the use of the power-gating technique represents a good compromise. in such a solution, however, the sensor warm-up time cannot be ignored when the sn's energy model is considered.

4. power estimation

in this article we continue our work [49] and present a complete energy consumption profile of the wireless sensor node during the initialization and sensing activities only, within the sn_active state. in our case, the sensing subsystem ocse (see fig. 2) is composed of eight sensor elements, se1, ..., se8. sensor elements se1 to se7 are of analog type and drive the on-chip adc (a component of the mcu (msp430fr59xx)).
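the entries of table 2 follow directly from eq. (7) with $E_w = k \cdot E_c$ and $E_s = 0.1 E_c$, i.e. $E_{tw} / E_{st} = k / (k + 1.2)$. the short python sketch below (the function name and the parameter es_over_ec are illustrative assumptions) recomputes them; the results agree with the tabulated values to within about 0.01.

```python
def warmup_fraction(k: float, es_over_ec: float = 0.1) -> float:
    """Eq. (7) with E_w = k * E_c and E_s = es_over_ec * E_c.

    Returns E_tw / E_st = k / (2 * es_over_ec + k + 1).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return k / (2.0 * es_over_ec + k + 1.0)


# Finite values of k listed in table 2.
K_VALUES = [0, 0.1, 0.2, 0.5, 0.8, 1, 2, 5, 10, 20, 50, 100, 1000]
table2 = {k: warmup_fraction(k) for k in K_VALUES}

for k, share in table2.items():
    print(f"k = {k:>6}: E_tw/E_st = {share:.3f}")
```

for example, k = 2 (warm-up energy twice the conversion energy) gives a warm-up share of 0.625, and the share tends to 1 as k grows, which is the asymptotic behaviour discussed above.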
these sensor elements are used for sensing temperature (lmt87), humidity (sht21s), acceleration (adxl377), ambient light (isl76671), position (ss345pt), motion (l3g3250a), and audio (microphone mp33ab01), respectively. the last sensor (t5400) is used for pressure measurement and transfers data to the mcu via an spi interface. for more details about the electrical and timing specifications of the sensor elements see the farnell website [50]. the power supply voltage is out2 = 3 V (marked as $V_{out2}$, an output of the low drop-out dual-channel voltage regulator tlv716 [51]). each cls_i is implemented as a p-channel mosfet load switch tps22908 [52] (see fig. 2). the electrical and timing specifications (found in the device documentation and determined by direct measurements) and the energy consumption per sensor element (determined by calculation and direct measurements) are presented in table 3.

table 3 electrical and timing specifications, and calculated energy consumption per sensor element

sensor element | type | out2 [V] | se av. current [mA] | t_off-on [ms] | E_fo [µJ] | t_wu [ms] | E_w [µJ] | t_con [ms] | E_c [µJ] | t_on-off [ms] | E_of [µJ]
1 | lmt87 | 3 | 0.0041 | 2.01 | 0.012 | n.a. | n.a. | 0.0035 | 0.000043 | 0.005 | 0.000031
2 | sht21 | 3 | 0.1811 | 150.11 | 40.778 | 8000 | 4346.400 | 0.0035 | 0.001902 | 0.005 | 0.001358
3 | adxl377 | 3 | 0.3011 | 5.11 | 2.308 | n.a. | n.a. | 0.0035 | 0.003162 | 0.005 | 0.002258
4 | isl76671 | 3 | 0.0361 | 0.205 | 0.011 | 0.350 | 0.038 | 0.0035 | 0.000379 | 0.005 | 0.000271
5 | ss345pt | 3 | 3.0011 | 0.11 | 0.495 | 0.0015 | 0.014 | 0.0035 | 0.031512 | 0.005 | 0.022508
6 | l3g3250a | 3 | 6.3011 | 0.11 | 1.040 | 0.3 | 5.671 | 0.0035 | 0.066162 | 0.005 | 0.047258
7 | mp33ab01 | 3 | 0.3011 | 0.11 | 0.050 | n.a. | n.a. | 0.0035 | 0.003162 | 0.005 | 0.002258
8 | t5400 | 3 | 0.7911 | 2.61 | 3.097 | 10 | 23.733 | 16 | 37.9728 | 0.005 | 0.005933

notice: the conversion time $t_{con}$ is determined by the sar adc, a constituent of the mcu; for 12-bit resolution it is 3.5 µs (identical for all sensor elements read through the adc); n.a. stands for data not available from the catalog.

according to eq.
(4) and the data presented in table 3, under the assumption that $N = 1$ and $n = 8$, the estimated energy consumption of our design during the powering-up of sensor elements (the initial phase of the sensing activity) can be expressed as

$$E_{st} = \sum_{i=1}^{8} \left( E_{foi} + E_{ofi} + E_{wi} + E_{ci} \right) = 47.791 + 0.081875 + 4375.856 + 38.07912 = 4461.808\ \mu\mathrm{J} \qquad (8)$$

let us note that this value corresponds to the worst case of energy consumption for all cls_i and se_i during the sensing activity. after the powering-up of the sensor elements, this activity happens only once during the lifetime of the sensor node; it is typical of the stabilization of a sensor element to the environmental conditions. therefore its impact on the power estimation can be neglected.

4.1. energy profile during initialization and sensing activities

in order to determine the total energy consumption during the active state of a sensor node, it is necessary to take into account the energy loss of the other building blocks (the mcu and the bsu, see figs. 1 and 2) during the time period $t_{sen}$ (see fig. 4). a detailed timing diagram of the initialization and sensing activities is presented in fig. 6. as can be seen from fig. 6a), the initialization activity begins at $t_{start-on}$ and ends at $t_1$. the sensing activity occupies the right part of fig. 6a) (time interval from $t_1$ to $t_2$), continues in fig. 6b) (time interval from $t_2$ to $t_3$), and ends in the left part of fig. 6c) (time interval from $t_3$ to $t_{end-on}$). the right part of fig. 6c) covers the data processing and communication activities (time interval from $t_{end-on}$ to $t_{start-off}$) and the sn_sleep state (time interval from $t_{start-off}$ to $t_{end-off}$). the duration of the time interval from $t_{start-on}$ to $t_{end-off}$ is 2 min. table 4 gives the duration of every time sub-interval of the initialization and sensing activities (defined in fig. 6), together with the average current and energy consumption for each sub-interval.
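the total in eq. (8) can be reproduced by summing the per-sensor energies of table 3 according to eq. (4) with N = 1 and n = 8. in the minimal python sketch below, warm-up entries marked n.a. in the table are treated as zero, which is what the partial sums in eq. (8) imply; the data layout and variable names are illustrative assumptions.

```python
# Per-sensor energies from table 3, in microjoules:
# (sensor, E_fo, E_w, E_c, E_of); n.a. warm-up entries are taken as 0.
TABLE3_UJ = [
    ("lmt87",    0.012,  0.0,      0.000043, 0.000031),
    ("sht21",    40.778, 4346.400, 0.001902, 0.001358),
    ("adxl377",  2.308,  0.0,      0.003162, 0.002258),
    ("isl76671", 0.011,  0.038,    0.000379, 0.000271),
    ("ss345pt",  0.495,  0.014,    0.031512, 0.022508),
    ("l3g3250a", 1.040,  5.671,    0.066162, 0.047258),
    ("mp33ab01", 0.050,  0.0,      0.003162, 0.002258),
    ("t5400",    3.097,  23.733,   37.9728,  0.005933),
]

N_ACTIVE = 1  # number of SN active states considered (N = 1)

# Column sums over the n = 8 sensor elements.
e_fo = sum(row[1] for row in TABLE3_UJ)
e_w = sum(row[2] for row in TABLE3_UJ)
e_c = sum(row[3] for row in TABLE3_UJ)
e_of = sum(row[4] for row in TABLE3_UJ)

# Eq. (4): E_st = N * sum_i (E_foi + E_ofi + E_wi + E_ci)
e_st = N_ACTIVE * (e_fo + e_of + e_w + e_c)

print(f"E_fo = {e_fo:.3f} uJ, E_of = {e_of:.6f} uJ")
print(f"E_w  = {e_w:.3f} uJ, E_c  = {e_c:.5f} uJ")
print(f"E_st = {e_st:.3f} uJ")
```

the four partial sums match the terms of eq. (8), and the last line prints E_st = 4461.808 uJ. the warm-up term of the humidity sensor (sht21) alone accounts for almost all of it.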
the total duration of the initialization and sensing activities is $t_{sen}$ = 191.036 ms, and the corresponding energy consumption during this period is 278.31 µJ. timing diagrams and the power consumption profile during the initialization and sensing activities (obtained with matlab) are presented in fig. 7. subplot 1 (lower-left part of fig. 7) deals with the initialization activity and with acquiring data from se1 and se2. subplot 2 (lower-right part of fig. 7) refers to acquiring data from se3 to se8.

[fig. 6 timing diagrams: (a) initialization ($t_{start-on}$ to $t_1$) and sensing with s1 and s2 ($t_1$ to $t_2$); (b) sensing with s3 to s6 ($t_2$ to $t_3$); (c) sensing with s7 and s8 ($t_3$ to $t_{end-on}$), data processing and communication ($t_{end-on}$ to $t_{start-off}$), and sn_sleep ($t_{start-off}$ to $t_{end-off}$, lpm 3.5 operating mode); traces: mcu current, ldo out1 and out2 quiescent currents, ldo out2 current and shutdown current, cls quiescent and leakage currents, and sensor currents s1 to s8]

fig.
6 profile of power consumption during sensing activity wireless sensor node with low-power sensing 449 table 4 time interval duration, average current and energy consumption during initialization and sensing activities for each mcu and ocse sub-interval time interval duration [ms] average current [ma] energy [j] twum 0.3500 0.290 0.1522500 tin 1.0000 0.495 1.4850000 twul 0.9000 0.495 0.6682500 twls 0.1600 0.570 0.1368000 tws1 1.9000 0.590 1.6815000 tadc 0.0035 0.720 0.0075600 tcmt 0.0050 0.075 0.0005625 0.1600 0.078 0.0187200 tws2 150.000 0.729 164.02500 tadc 0.0035 0.877 0.0092085 tcmt 0.0050 0.075 0.0005625 0.1600 0.235 0.0564000 tws3 5.0000 0.869 6.5175000 tadc 0.0035 1.017 0.0106785 tcmt 0.0050 0.075 0.0005625 0.1600 0.375 0.0900000 tws4 0.4450 0.604 0.4031700 tadc 0.0035 0.752 0.0078960 tcmt 0.0050 0.075 0.0005625 0.1600 0.110 0.0264000 tws5 0.0015 3.569 0.00803025 tadc 0.0035 3.717 0.0390285 tcmt 0.0050 0.075 0.0056250 0.1600 3.075 0.7380000 tws6 1.0000 6.869 10.303500 tadc 0.0035 7.017 0.0736785 tcmt 0.0050 0.075 0.0056250 0.1600 6.375 1.5300000 tws7 3.0000 0.869 3.9105000 tadc 0.0035 1.017 0.0106785 tcmt 0.0050 0.075 0.0056250 0.1600 0.078 0.0187200 tws8 10.000 1.359 20.385000 tspi 16.000 1.373 65.904000 tol 0.1000 0.496 0.0744000 notice: where twum – wake-up time of the mcu; tin – mcu initialization; twul – out2 wake-up time of the ldo; twls – wake-up time of the cls; twsx – warm-up time of a sensor x={1,2, ..,8}; tadcx – conversion time x={1,2, ..,7}; tcmt – switching time which includes tturn-off(lsx+sx) + t turnon(lsx); tspi – spi time; tol – time-off ldo 450 g. nikolić, m. stojĉev, z. stamenković, g. panić, b. petrović fig. 
7 diagrams of power consumption during initialization and sensing activities for mcu and ocse blocks in order to evaluate the performance of our design concerning energy reduction, we have compared the following two design solutions: a) total energy consumption of a sensor subsystem epg during initialization and sensing activities, with the implemented duty-cycling and power-gating techniques (epg = 638j ); and b) total energy consumption of a sensor subsystem ewpg without implementation of duty-cycling and power-gating techniques (ewpg = 3.92 j). the estimated ratio is ewpg / epg = 6146. the obtained result justifies the involvement of both power saving techniques in a sensing subsystem of a wireless sensor node. 5. conclusion wireless sensor nodes place sensor elements in the physical world in order to gather information. this activity consumes energy. due to the limited battery capacity, energy conservation becomes a goal. this paper attempts to provide a comprehensive insight into aspects of energy consumption of a sensing subsystem within a sensor node architecture. in order to achieve reduction in energy consumption in a sensor node operation, we propose using a combination of two power saving techniques. the first one, called dutycycling, is used for power reduction at a system level, i.e. switching on/off the sensor node architecture between active and sleep state. the second one, referred to as power wireless sensor node with low-power sensing 451 gating, is intended for switching on/off the sensor elements (constituent of the sensing subsystem within sn), during acquiring information from the external environment. the obtained results based on the analysis and validation by matlab show that on average, three order of reduction in energy consumption can be achieved when the mentioned two techniques intended for power saving are implemented with respect to the case when they are turned off. 
for the time period of two minutes the energy consumption when the two techniques are used is 638 µJ, compared to 3.92 J in the case when the duty-cycling and power-gating techniques are turned off.

acknowledgement: this work was supported by the serbian ministry of education, science and technological development, project no. tr-32009, “low-power reconfigurable fault-tolerant platforms”.

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 205-219 doi: 10.2298/fuee1402205h

demonstration of protein hydrogen bonding network application to microelectronics

marin h. hristov 1, rostislav p. rusev 2, george v. angelov 1, elitsa e. gieva 1
1 technical university of sofia/department of microelectronics, sofia, bulgaria
2 technical university of sofia/department of technology and management of communication systems, sofia, bulgaria

abstract. a model of the hydrogen bonding networks in the active site of β-lactamase during the last intermediate (ey) of the acylenzyme reaction semicycle is presented. the I-V characteristics of each hydrogen bond are calculated following marcus theory and the theory of protein electrostatics. simulations showed that the hbn characteristics are similar to the characteristics of microelectronic devices such as an amplifier, a signal modulator and a triangular pulse source. the results demonstrated the analogy of the hbns in the active site of the β-lactamase protein to a microelectronic integrated circuit with multiple outputs, each with different characteristics.
key words: bioelectronics, microelectronics, proteins, hydrogen bonding networks, β-lactamase, acylenzyme reaction, proton transfer.

received january 20, 2014
corresponding author: george v. angelov, technical university of sofia/department of microelectronics, sofia, bulgaria (e-mail: angelov@ecad.tu-sofia.bg)

1. introduction

bioelectronics is a relatively new field associated with the integration of biomolecules with electronic elements to yield functional devices. the first molecular materials used in electronics originate from material science and are related to the development of electronic and optoelectronic devices that utilize the macroscopic features of organic compounds. the remarkable biochemical and biotechnological progress in tailoring new biomaterials by genetic engineering or bioengineering provides unique and novel means to synthesize new enzymes and protein receptors, and to engineer monoclonal antibodies for nonbiological substrates (such as explosives or pesticides) and dna-based enzymes. all these materials provide a broad platform of functional units for integration with electronic elements. after many years of research, organic light-emitting devices, synthetic electronic circuits, and chemical and biochemical sensors have drawn attention.

molecular scale devices attract substantial research efforts because of the fundamental scientific questions they raise and the potential practical applications of such systems. in particular, efforts are focused on the behavior of single molecules or groups of molecules and on precise 3d position control over single atoms and molecules. the major activities in the field of bioelectronics relate to the development of biosensors that transduce biorecognition or biocatalytic processes into electronic signals [1], [2], [3].
bioelectronic molecular structures have certain applications in switches, dna devices and other molecular devices that could be implemented in standard solid-state silicon electronics [4]; such molecular structures could become real competitors to state-of-the-art silicon microelectronic devices. other research efforts are directed at utilizing the biocatalytic electron transfer functions of enzymes to assemble biofuel cells that convert organic fuel substrates into electrical energy [5], [6]. exciting opportunities exist in the electrical interfacing of neuronal networks with semiconductor microstructures. the excitation of ion conductance in neurons may be followed by electron conductance of semiconductor devices, thus opening the way to future neuron-semiconductor hybrid systems for dynamic memory and active learning [7].

one goal of molecular electronics is to imitate the complex behavior of solid-state circuits in molecular structures, which would allow the creation of bioelectronic devices. biological molecules (namely enzymes, proteins, and dna) are unique in that they have benefited from natural selection and evolution, which has resulted in highly optimized properties custom-tailored for specific biological functions. these molecules have evolved to function in a wide range of environmental conditions, often with efficiencies unmatched by nonbiological or synthetic methods. rational selection by the researcher can bring their novel functionalities to device applications.

from the standpoint of applying bioobjects in devices similar to conventional microelectronic devices, it is important that proteins are sensitive to the presence of many types of molecules, through both specific and nonspecific binding. this is especially true for enzymes, which participate in a large variety of specific interactions with small molecules. the primary result is conformational modification of the protein, resulting in a change of activity.
molecules that interact with proteins through nonspecific or indirect interactions typically disrupt noncovalent bonds, which in turn alters the structure. the challenge here is to detect these changes and transform them into signals that can be processed analogously to those of solid-state microelectronic devices. for certain classes of proteins, e.g. photoactive yellow protein (pyp) [8], green fluorescent protein [9] and bacteriorhodopsin [10], the problem of signal formation and processing is simpler because these proteins can produce (or at least can be made to produce) a measurable response that can be sensed in the form of a signal.

the understanding of charge transport phenomena through biological structures is essential for the problem of signal formation. it is constantly evolving due to intensive theoretical and experimental work. the contributions of marcus theory [11], the superexchange charge transfer theory [12], and the definition of superior tunneling paths in proteins [13] had an enormous impact on the understanding of biological processes in numerous electrochemical and photoelectrochemical biosensing systems.

bacteriorhodopsin (br) has drawn a large amount of attention from the perspective of both the basic and the applied sciences. it is a unique protein because it acts as a light-driven engine that converts light into chemical energy in an efficient manner. bacteriorhodopsin possesses hydrogen bonding networks that execute proton transport. this implies that proteins with hydrogen bonding networks (hbns) can process information. one such protein with hydrogen bonding networks is β-lactamase [14].

in this paper, we demonstrate the application of the hbns in the active site of the β-lactamase protein as an analog of a microelectronic integrated circuit with multiple outputs, each with different characteristics.

2. hbn modeling approach

signal transfer in hydrogen bonding networks (hbns) is carried out by protons. the model of proton transfer in hydrogen bonds is based on marcus theory and the protein electrostatic theory [15]. the proton current in the hydrogen bonds depends on the value of pH. changing the pH causes polarization and ionization of the protein groups. as a result, the charges in the protein-water system are redistributed and the donor/acceptor electrostatic potentials change. the proton transfer parameter (and hence the proton current) between donor and acceptor changes as well.

in analogy to traditional microelectronic four-terminal elements, where the input circuit correlates to the output circuit and the electrical current is formed by electrons, some protein hydrogen bonds that are connected in an hb network can be modeled as four-terminal block-elements; the current in each block-element is formed by a transfer of protons between the donor and acceptor parts of the heavy atoms in the network [16]. this analogy allows us to model hbns with four-terminal circuit block-elements. the I-V characteristics of each block-element are proportional to the K-V characteristics of the respective hydrogen bond. the current (I) of each block-element represents the proton transfer parameter (K) of the hydrogen bond and the voltage (V) of each block-element represents the electrostatic potential (el. pot.) [16].

2.1. types of hbns studied

in our earlier research in the field of bioelectronics [17]-[21], we have investigated the β-lactamase protein and, in particular, its hydrogen bonding networks (hbns). we have examined different types of hbns, including branching hbns, linear hbns with protein residues and water molecules, hbns from the protein main chain, and hbns in the active site of the protein. we have compared the characteristics of the hydrogen bonds to the characteristics of various well-known electronic elements such as transistors, amplifiers, filters, current sources, decoders, etc.
a branching network is depicted in fig. 1. linear networks with and without a water molecule are given in fig. 2. these hydrogen bonding networks (hbns), extracted from β-lactamase, consist of residues in the periphery of the protein and water molecules. b. atanasov et al. [22] assume that proton transfer in the active site hbns of β-lactamase is performed during the interaction of the protein with the ligand (acylenzyme reaction).

fig. 1 hydrogen bonding network. nh1, nh2, and ne: nitrogen atoms of arginine residue r164; oe1 and oe2: carboxyl oxygen atoms of glutamic acid residue e171; od1 and od2: carboxyl oxygen atoms of aspartic acid residues; oh: oxygen atoms of water molecules (w295, w753 and w859)

fig. 2 hydrogen bonding network virtually separated into two parts (with a dashed line). (m182) is a methionine residue, og1 is the hydroxyl oxygen of threonine residues (t160, 181, 189), od2 is the carboxyl oxygen of aspartic acid residue (d157), nz is the nitrogen atom of lysine residue (k192), oh is the oxygen atom of water molecules (w356, 440)

2.2. acylenzyme reaction

the hbns in the active site of the β-lactamase protein during the acylenzyme reaction are given in fig. 3. there are two hbns participating in the acylenzyme reaction. the first hbn, referred to as nucleophilic, consists of residues s70, w297, n170, e166, k173, n132. the second hbn, referred to as electrophilic, consists of s130, k234, w309, d214, s235. it should be noted that proton transfer takes place in parallel in both hbns during the different intermediates of the reaction.
the acylenzyme reaction cycle has the following intermediates:
(e) the networks in the active site of the free enzyme;
(es) the formation of the michaelis complex;
(t1) the transient state of the reaction, where the networks change due to the nucleophilic attack by s70;
(ey) the end of the reaction, when the acylenzyme is formed and the networks of hydrogen bonds have changed due to the opening of the ligand ring and the binding of the ligand to s70.

the catalyzed β-lactam nitrogen protonation is supposed to be energetically favored as the initiating event, followed by nucleophilic attack on the carbonyl carbon of the lactam group. nitrogen protonation is catalyzed through a hydrogen bonding network involving the 2-carboxylate group of the substrate, the s130 and k234 residues, and a water molecule. the nucleophilic attack on the carbonyl carbon is carried out by s70, with proton abstraction catalyzed by a water molecule hydrogen-bonded to the side chain of e166.

fig. 3 acylenzyme reaction intermediates. hbns in the active site of: (e) free enzyme, (es) michaelis complex, (t1) transient state, (ey) acyl enzyme. the first hbn, referred to as “nucleophilic”, consists of residue s70, water molecule w297, residues n170, e166, k173, n132 and the ligand. the second, referred to as “electrophilic”, consists of residues s130, k234, water molecule w309, and residues d214, s235

proton transfer between the donor and the acceptor of each hydrogen bond in the protein is studied following marcus theory. the proton transfer parameter K is calculated by

$$K = \frac{k_B T}{2\pi} \exp\left(-\frac{E_b - h\omega/2}{k_B T}\right) \tag{1}$$

where k_B is the boltzmann constant, E_b is the barrier energy, h is the planck constant, ω is the frequency, and T is the temperature in kelvins.
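as a rough numerical illustration of eq. (1), the sketch below assumes the form K = (k_B T / 2π) exp(−(E_b − hω/2) / (k_B T)) with all energies expressed in eV. the temperature, the term hω and the barrier energies are hypothetical values chosen only to show the trend; they are not taken from the paper, and the variable names are illustrative.

```matlab
% sketch of eq. (1): K = (kB*T/(2*pi)) * exp(-(Eb - h*omega/2) / (kB*T))
kb = 8.617333e-5;               % boltzmann constant [eV/K]
t  = 300;                       % temperature [K], assumed
hw = 0.10;                      % h*omega [eV], hypothetical value for illustration
eb = [0.05 0.10 0.20 0.30];     % barrier energies [eV], hypothetical values
kt = kb * t;                    % thermal energy kB*T [eV]
k_par = (kt / (2*pi)) .* exp(-(eb - hw/2) ./ kt);   % proton transfer parameter [eV]
disp([eb; k_par]);              % first row: barriers, second row: K
```

in this sketch K falls off exponentially as the barrier grows, and for E_b = hω/2 it reduces to the prefactor k_B T / 2π, which is consistent with reading a larger K as a larger proton current.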
the energy barrier is calculated by:

$$E_b = \left(s_A \left(R(DA) - t_A\right)^2 + v_A\right) + s_B E_{12} + \left(s_C \exp\left(-t_C \left(R(DA) - 2\right)\right) + v_C\right) E_{12}^2 \tag{2}$$

where R(DA) is the distance between donor and acceptor and E_12 is the difference between the energies of donor and acceptor (cf. two-well potential). K has the dimension of free energy and, from the calculations, it can be interpreted as follows: the greater the parameter K, the more readily the proton transfer between the donor and the acceptor of the hbn is accomplished, i.e. the greater the proton current. on the other hand, the proton transfer parameter depends on the donor/acceptor potentials, similarly to the potentials supplying microelectronic components. therefore we can construct three- and four-terminal electronic block-elements analogous to the hydrogen bonds in the following way.

the electrostatic potential of each protein atom is calculated by the protein electrostatic theory. both the K parameter and the electrostatic potential depend on the pH of the environment. it is observed that the electrostatic potential of the donor/acceptor of each hydrogen bond can be compared to the voltage of a conventional microelectronic circuit device, and the proton transfer parameter K can be compared to the current of the device. this lets us introduce the analogy between hydrogen bonds and standard microelectronic devices, and respectively between hbns and microelectronic circuits. in particular, we will study the behavior of circuits of block-elements, modeled with polynomials, in matlab [23]. afterwards, the matlab simulations with polynomials are compared to the results of simulations obtained using marcus theory in [24].

3. circuit model

3.1. circuit formulation

we create circuit block-elements corresponding to each hbn in order to emulate the operation of hbns. the hbn is divided into the heavy atoms that form the hydrogen bonds (fig. 4).

fig. 4 sample hbn with its heavy atoms x, d, and a

fig. 5 block-element that is analogous to a heavy atom from the hydrogen bonding network

in the analogous circuit each heavy atom (which is both donor and acceptor, designated with “x”) is represented as a separate block-element (fig. 5). in fig. 5 the acceptor part of the heavy atom “x” is assigned as the input of the respective block-element, where the input current iin flows in; the donor part of the heavy atom “x” is assigned as the output of the respective block-element, where the output current iout flows out. the potentials at the input and the output of the block-element are equal to the potential u of the heavy atom. the magnitudes of the input and output currents are proportional to the proton transfer parameter of the hydrogen bonding network where the heavy atom is present.

in each protein hbn we can find a strong proton donor, a strong proton acceptor, and atoms exhibiting both donor and acceptor properties. the strong proton donor of each hbn always behaves as a circuit input and the strong acceptor of each hbn always behaves as a circuit output.

we consider the application of the hbn properties on the example of the hydrogen bonding network in the active site during the last intermediate of the acylenzyme reaction (case ey of fig. 3). both hbns in the active site of the β-lactamase protein are represented by the analogous circuit depicted in fig. 6. this circuit could be considered as an “integrated” circuit built of molecules.

fig. 6 correspondence between protein residues and water molecules in the ey case and the respective block-elements: k73 and k234 correspond to the t1 and t10 block-elements (i.e. they are circuit inputs), e166 → t4 (uout1), n170 and w297 → t2, n132 → t5, n170 → t6, d214 → t12, s130og → t13, s235og → t14

the subject of the modeling effort is to describe the hbn in the active site of the β-lactamase protein; the equivalent circuit of the hydrogen bonding network is given in fig. 6, which is analogous to the network in fig. 3 (ey). the equivalent circuit consists of two subcircuits corresponding to the nucleophilic and the electrophilic hbn of the acyl enzyme (ey), respectively. proton transfer depends on the pH of the environment and on the interaction of the enzyme with the ligand. therefore, the two circuits are bound together in a common circuit (which can be considered as an “integrated” circuit) and cannot be separated. the ligand is represented in the equivalent circuit by the switch s: the switching of elements and the change of the i/o currents depend on the position of s, so by the position of the switch we may in fact select the intermediate of the acylenzyme reaction. it should be noted that the ligand formation and its charge in the active site strongly affect proton transfer through the hbns.

the input of the first electric circuit is denoted by uin1; this circuit is analogous to the nucleophilic hbn. the t1 block-element corresponds to the k73nz lysine, which here behaves as a proton donor and is therefore interpreted as a current source in the circuit; t1 has equal input and output voltages but different input and output currents. t2 substitutes the water molecule w297. there is no block-element designated “t3” here: we examine the last intermediate of the acylenzyme reaction, where the s70og residue has already completed the reaction, so the block-element of the water molecule, which is t3 in the other intermediates, is renumbered to t2. t4 is juxtaposed to e166, which is a proton acceptor and can form different hydrogen bonds; t4 sums three input currents. the t5 and t6 block-elements are analogous to the n132 and n170 asparagines, respectively.
asparagines can be both donors and acceptors and hence they can alter the current direction; in the circuit, this is modeled by the s switch (s corresponds to the ligand).
the input of the second electric circuit is denoted by uin2; this circuit is analogous to the electrophilic hbn. the t10 block-element corresponds to the k234nz proton donor. it is a current source (similarly to the current source t1 in the first electric circuit) and again it has equal input and output voltages but different input and output currents. t11 is analogous to the water molecule w309. t12 is juxtaposed to the d214 residue, which is in fact an output of the circuit. t13 represents the s130og residue; it has equal input and output voltages but different input and output currents. t14 is the other output of the circuit; it corresponds to the s235og residue and has the same properties as s130og.
3.2. equation formulation
the i-v characteristics of the block-elements, which correspond to the proton transfer parameter and the electrostatic potential of the hydrogen bonds, are coded in matlab. because the acyl reaction goes together with proton transfer, the proton transfer through each hydrogen bond in each reaction intermediate is simulated. this proton transfer is also compared to the current flow through known electronic elements.
the relations between the currents and voltages of each block-element in the equivalent circuit are given by polynomials. first, we list the equations for the nucleophilic subcircuit. equations (3) and (4) describe the voltage and current of the first output of t1 (the input voltage uin is between −2.3 v and +2.2 v):
$u_{in} = u_1$ (3)
$i_1 = 3\times10^{-5}\,u_1^{2} + 0.0004\,u_1 + 0.0045$ (4)
the i-v equations for t5 are:
$u_{51} = 0.0516\,u_1 - 0.2473$ (5)
$u_5 = 1.0595\,u_{51} - 0.242$ (6)
$i_5 = -3\times10^{-6}\,u_5^{4} + 5.5\times10^{-6}\,u_5^{3} + 2\times10^{-5}\,u_5^{2} - 5\times10^{-5}\,u_5 + 1.2\times10^{-4}$ (7)
the t2 block-element is modeled by:
$u_2 = 1.0211\,u_1 - 0.1005$ (8)
$i_2 = -1.0658\,u_2^{3} - 0.1179\,u_2^{2} + 10.912\,u_2 + 151.84$ (9)
voltage and current of t6, which is output no.
3 of the circuit, are:
$u_6 = 1.1204\,u_1 - 0.3978$ (10)
$i_6 = 0.0045\,u_6^{2} + 0.0139\,u_6 + 0.0251$ (11)
t4, which is output no. 1 of the circuit, is described by
$u_4 = 1.0732\,u_2 - 0.1933$ (12)
the current of t4 is the sum of all input currents:
$i_4 = i_3 + i_5 + i_6 = i_{out1}$ (13)
the equations for the electrophilic subcircuit of the hbns are listed below. the t10 block-element is modeled by:
$u_{10} = 0.9658\,u_1 + 0.2266$ (14)
$i_{10} = -0.0062\,u_{10}^{3} - 0.002\,u_{10}^{2} + 0.0751\,u_{10} + 0.5283$ (15)
the i-v characteristics of t11 are:
$u_{11} = 1.002\,u_{10} + 0.1153$ (16)
$i_{11} = -3\times10^{-6}\,u_{11}^{5} + 9\times10^{-6}\,u_{11}^{4} + 1.7\times10^{-5}\,u_{11}^{3} - 4\times10^{-5}\,u_{11}^{2} - 9.2\times10^{-5}\,u_{11} + 0.00025$ (17)
for t12, which is circuit output no. 12, we have:
$u_{12} = 0.9904\,u_{11} + 0.4309$ (17)
$i_{12} = i_{11}$ (18)
circuit output no. 13 is described by:
$u_{13} = 1.0318\,u_{10} - 0.1714$ (19)
$i_{13} = 0.0008\,u_{13}^{3} - 0.0018\,u_{13}^{2} - 0.0001\,u_{13} + 0.0042$ (20)
the last output, no. 14, is modeled by:
$u_{14} = 1.01\,u_{11} - 0.1509$ (21)
$i_{14} = -0.0431\,u_{14}^{4} + 0.1242\,u_{14}^{3} + 0.1987\,u_{14}^{2} - 0.1987\,u_{14} + 12.893$ (22)
3.3. matlab code
below we list an excerpt of the matlab code used to model the equivalent circuit behavior:

    % block t14 (s235), out14: one input, one output
    % input sweep and upstream voltages, added so that the excerpt runs on
    % its own (the number of points, 200, is chosen here)
    u1  = linspace(-2.3, 2.2, 200);   % input voltage sweep [v], range from section 3.2
    u10 = 0.9658*u1 + 0.2266;         % eq. (14)
    u11 = 1.002*u10 + 0.1153;         % eq. (16)

    % equation for u14 = f(u11), eq. (21)
    u14 = 1.01*u11 - 0.1509;
    plot(u11, u14, 'linewidth', 2);
    set(gca, 'fontweight', 'bold', 'fontsize', 14)
    grid on
    title('t14 out14');
    xlabel('u11 [v]');
    ylabel('uout14 [v]');
    % only the simulated curve is plotted in this excerpt, so the legend has one entry
    set(legend('simulation', 'location', 'northeast'), 'fontsize', 12);
    pause;

    % equation for i14 = f(u14), eq. (22)
    i14 = -0.0431*u14.^4 + 0.1242*u14.^3 + 0.1987*u14.^2 - 0.1987*u14 + 12.893;
    plot(u14, i14, 'linewidth', 2);
    set(gca, 'fontweight', 'bold', 'fontsize', 14)
    grid on
    title('t14 out14');
    xlabel('uout14 [v]');
    ylabel('iout14 [pa]');
    set(legend('simulation', 'location', 'northeast'), 'fontsize', 12);
    pause;

    % comparison plot; u7, i51 and i7 are not defined in this excerpt
    plot(u7, i51, '-r', u7, i7, '--b', 'linewidth', 2);
    set(gca, 'fontweight', 'bold', 'fontsize', 14)
    grid on
    title('i51, i7 vs u7');
    xlabel('u7 [v]');
    ylabel('i51, i7 [pa]');
    set(legend('i51', 'i7', 'location', 'best'), 'fontsize', 12);
    pause;

4. simulation results and discussion
we perform dc and transient analyses in matlab to study the circuit behavior in different modes of operation.
4.1. dc analysis
the dc analysis is carried out by sweeping the input voltage between –2.3 and +2.2 v. the results simulated with the above polynomial equations are compared to the results of [24] (cf. fig. 7). the polynomials describe the behavior of the modeled block-elements well. similar results were obtained for the rest of the block-elements (not shown here). the maximal error is 5.66 %.
fig. 7 i-v characteristics of block-elements t1 and t11 (representing k73 and w309) for the (ey) intermediate of the acyl-enzyme reaction
next, we perform static analysis of the equivalent circuit (fig. 6). the simulated i-v characteristics are illustrated in figures 8 to 13.
fig. 8 iout1 vs. uout1
fig. 9 iout2 vs. uout2
fig. 10 iout3 vs. uout3
the simulation results show an s-shaped form of the output characteristics (in fig. 10 the form is closer to an exponential), which implies that the circuit can operate as an amplifier.
fig. 11 iout12 vs. uout12
fig. 12 iout13 vs. uout13
fig. 13 iout14 vs. uout14
the i-v characteristic in fig. 11 cannot be directly compared to a common microelectronic device. conversely, in fig. 12 the current exhibits two regions: 1) iout13 increases almost linearly and then 2) iout13 saturates; hence the characteristic is analogous to the i-v characteristic of a transistor. in fig.
13 we observe a characteristic that is typical for a tunnel diode. we also simulated the dependence of the output voltages on the input voltage. the results showed that all output voltages increase linearly with the input voltage.
4.2. dynamic analysis
taking into account the specifics of the hydrogen bonding networks and the proton transfer, which takes place over a period of approximately $10^{-11}$ s, the analogous electronic circuit should transfer signals in the ghz range. that is why we begin with an input voltage with amplitude between –2.2 and +2.2 v at a frequency of 10 ghz and a time sweep between 0 and 0.1 ns. afterwards, we feed input voltages with positive amplitude only and then with negative amplitude only (fig. 14).
fig. 14 uin vs. time: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; b) $u_{in} = 1 + \sin(5\times10^{11}t)$; c) $u_{in} = -1 + \sin(5\times10^{11}t)$
in fig. 15 we show the characteristics of the t4 block-element, which is output no. 1 of the nucleophilic circuit.
fig. 15 iout1 vs. time at different input voltages: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; b) $u_{in} = 1 + \sin(5\times10^{11}t)$; c) $u_{in} = -1 + \sin(5\times10^{11}t)$
the results show that iout1 is always positive regardless of whether the input voltage is both positive and negative, positive only, or negative only. in the case of fig. 15a) we observe that the signal iout1 is cut from the bottom. therefore, we can compare the results to the characteristics of a signal limiter.
in fig. 16 we show the characteristics of the t2 block-element, which is output no. 2 of the nucleophilic circuit.
fig. 16 iout2 vs. time at different input voltages: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; b) $u_{in} = 1 + \sin(5\times10^{11}t)$; c) $u_{in} = -1 + \sin(5\times10^{11}t)$
we observe that the sine output characteristic in fig. 16a) is cut from the top and bottom, in fig. 16b) from the top, and in fig. 16c) from the bottom. in fig.
17 we show the characteristics of the t6 block-element, which is output no. 3 of the nucleophilic circuit.
fig. 17 iout3 vs. time at different input voltages: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; b) $u_{in} = 1 + \sin(5\times10^{11}t)$
fig. 18 gives the characteristics of the t12 block-element, which is output no. 12 of the electrophilic circuit.
fig. 18 iout12 vs. time at different input voltages: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; b) $u_{in} = 1 + \sin(5\times10^{11}t)$
in fig. 19 we present the characteristics of the t13 block-element, which is output no. 13 of the electrophilic circuit.
fig. 19 iout13 vs. time at different input voltages: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; c) $u_{in} = -1 + \sin(5\times10^{11}t)$
fig. 20 shows the characteristics of the t14 block-element, which is output no. 14 of the electrophilic circuit.
fig. 20 iout14 vs. time at different input voltages: a) $u_{in} = 2.2\sin(5\times10^{11}t)$; b) $u_{in} = 1 + \sin(5\times10^{11}t)$; c) $u_{in} = -1 + \sin(5\times10^{11}t)$
the signal in fig. 21a) indicates a modulator's nature: the output signal is modulated. from the transient analyses we obtain characteristics that can be compared to an amplitude limiter, a modulator and a triangular pulse source.
5. conclusion
the presented circuit model of the hydrogen bonding networks in the active site of the β-lactamase protein demonstrates that such a biofunctional system exhibits properties akin to those of common microelectronic devices. the analogous microelectronic circuit may operate in static mode as a signal amplifier, transistor or tunnel diode. in dynamic mode, the microelectronic circuit may operate as a signal limiter or signal modulator in the ghz range. furthermore, signals with different frequency, amplitude, and width might be generated at each circuit output. thus, it can be concluded that the electrophilic and nucleophilic networks of hydrogen bonds in the active site can operate like an integrated circuit consisting of individual devices in the form of hydrogen bonds.
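as an illustration of how the transient curves of section 4.2 follow from the polynomials of section 3.2, the sketch below evaluates the electrophilic chain t10 → t11 → t14 (equations (14), (16), (21) and (22)) for the three input waveforms of fig. 14. it is written in python rather than the matlab used in the paper. the function names, the 201-point time grid, and the assumption that the static polynomials may be applied sample by sample to a time-varying input are choices made for this sketch and are not taken from the original script.

```python
import math

OMEGA = 5e11  # angular frequency [rad/s], as written in the captions of fig. 14


def u14_from_u1(u1):
    """chain the linear voltage relations (14), (16) and (21) from u1 to u14."""
    u10 = 0.9658 * u1 + 0.2266   # eq. (14), block-element t10
    u11 = 1.002 * u10 + 0.1153   # eq. (16), block-element t11
    return 1.01 * u11 - 0.1509   # eq. (21), block-element t14


def i14_from_u14(u14):
    """static i-v polynomial of t14, eq. (22)."""
    return (-0.0431 * u14**4 + 0.1242 * u14**3
            + 0.1987 * u14**2 - 0.1987 * u14 + 12.893)


# the three input waveforms of fig. 14
WAVEFORMS = {
    "a": lambda t: 2.2 * math.sin(OMEGA * t),
    "b": lambda t: 1.0 + math.sin(OMEGA * t),
    "c": lambda t: -1.0 + math.sin(OMEGA * t),
}


def transient(waveform, t_stop=0.1e-9, n_points=201):
    """return (t, iout14) pairs on a uniform time grid from 0 to t_stop."""
    times = [t_stop * k / (n_points - 1) for k in range(n_points)]
    return [(t, i14_from_u14(u14_from_u1(waveform(t)))) for t in times]


if __name__ == "__main__":
    for label, waveform in WAVEFORMS.items():
        currents = [i for _, i in transient(waveform)]
        print(f"case {label}: iout14 from {min(currents):.3f} to {max(currents):.3f}")
```

the same pattern applies to the other outputs: replace the voltage chain and the polynomial with the corresponding equations.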
the biocircuit is extremely flexible and is applicable to multiple circuit purposes. the results are expected to have important applications in finding novel solutions in bioelectronics research.
acknowledgement: the present paper is a part of the research carried out in the framework of project дунк 01/03 of 29.12.2009.
references
[1] k. habermüller, m. mosbach, and w. schuhmann, “electron-transfer mechanisms in amperometric biosensors,” fresenius' journal of analytical chemistry, vol. 366, no. 6-7, pp. 560-568, 2000.
[2] a. heller, “electrical connection of enzyme redox centers to electrodes,” acc. chem. res., vol. 23, pp. 128–134, 1990.
[3] f.a. armstrong and g.s. wilson, “recent developments in faradaic bioelectrochemistry,” electrochim. acta, vol. 45, no. 15-16, pp. 2623-2645, 2000.
[4] c. dekker and m.a. ratner, “electronic properties of dna,” physics world, vol. 14, no. 8, pp. 29-33, august 2001.
[5] a. heller, “miniature biofuel cells,” phys. chem. chem. phys., vol. 6, pp. 209–216, 2004.
[6] e. katz, a.n. shipway, i. willner, in handbook of fuel cells – fundamentals, technology, applications (eds.: w. vielstich, h. gasteiger, a. lamm), vol. 1, part 4, wiley, chichester, chapter 21, pp. 355–381, 2003.
[7] p. fromherz, “electrical interfacing of nerve cells and semiconductor chips,” chem. phys. chem., vol. 3, pp. 276-284, 2002.
[8] m. baca, g. borgstahl, m. boissinot, p. burke, d. williams, k. slater, e. getzoff, “complete chemical structure of photoactive yellow protein: novel thioester-linked 4-hydroxycinnamyl chromophore and photocycle chemistry,” biochemistry, vol. 33, pp. 14369–14377, 1994.
[9] h. zhang, q. sun, z. li, s. nanbu, s.s. smith, “first principle study of proton transfer in the green fluorescent protein (gfp): ab initio pes in a cluster model,” computational and theoretical chemistry, vol. 990, pp. 185–193, 2012.
[10] k. j. wise, n. b. gillespie, j. stuart, m. p. krebs, and r. r.
birge, “optimization of bacteriorhodopsin for bioelectronic devices,” trends in biotechnology, vol. 20, no. 9, pp. 387–394, september 2002.
[11] r.a. marcus, n. sutin, “electron transfers in chemistry and biology,” biochim. biophys. acta, vol. 811, pp. 265–322, 1985.
[12] m. bixon, j. jortner, “electron transfer. from isolated molecules to biomolecules,” adv. chem. phys., vol. 106, pp. 35–202, 1999.
[13] h.b. gray, j.r. winkler, “electron tunneling through proteins,” q. rev. biophys., vol. 36, pp. 341–372, 2003.
[14] f.k. majiduddin, i.c. materon, t.g. palzkill, “molecular analysis of beta-lactamase structure and function,” international journal of medical microbiology, vol. 292, no. 2, pp. 127-137, 2002.
[15] m.a. lill and v. helms, “compact parameter set for fast estimation of proton transfer rates,” j. chem. phys., vol. 114, no. 3, pp. 1125-1132, 2001.
[16] r. rusev, g. angelov, t. takov, m. hristov, “biocircuit for signal modulation based on hydrogen bonding network,” annual j. of electronics, vol. 3, no. 2, pp. 155-158, 2009.
[17] r. rusev, g. angelov, b. atanassov, t. takov, m. hristov, “development and analysis of a signal transfer circuit with hydrogen bonding,” in proc. of the 17th intl. scientific and appl. science conf. (electronics et’2008), sozopol, bulgaria, book 4, september 2008, pp. 37-42.
[18] r. rusev, g. angelov, t. takov, b. atanasov, m. hristov, “comparison of branching hydrogen bonding networks with microelectronic devices,” annual j. of electronics, vol. 3, no. 2, pp. 152-154, 2009.
[19] r. rusev, g. angelov, e. gieva, t. takov, m. hristov, “hydrogen bonding network as a dc level shifter and a power amplifier,” in proc. of the 17th intl. conf. mixed design of integrated circuits and systems (mixdes 2010), wroclaw, poland, june 24-26, 2010, pp. 408-411.
[20] r. rusev, g. angelov, e. gieva, m. hristov, t. takov, “hydrogen bonding network emulating frequency driven source of triangular pulses,” international journal of microelectronics and computer science, vol.
1, no. 3, pp. 293-298, 2010.
[21] e. gieva, l. penov, r. rusev, g. angelov, m. hristov, “protein hydrogen bonding network electrical model and simulation in verilog-a,” annual j. of electronics, vol. 5, no. 2, pp. 132-134, 2011.
[22] b. atanasov, d. mustafi, m. makinen, “protonation of the β-lactam nitrogen is the trigger event in the catalytic action of class a β-lactamases,” proc. natl. acad. sci. usa, vol. 97, no. 7, pp. 3160-3165, 2000.
[23] matlab website http://www.mathworks.com/
[24] r. rusev, g. angelov, e. gieva, b. atanasov, m. hristov, “microelectronic aspects of hydrogen bond characteristics in active site of β-lactamase during the acylenzyme reaction,” annual j. of electronics, vol. 6, no. 2, pp. 35-38, 2012.

facta universitatis
series: electronics and energetics vol. 30, no 3, september 2017, pp. 391-402
doi: 10.2298/fuee1703391m
an investigation of side lobe suppression in integrated printed antenna structures with 3d reflectors
marija milijić 1, aleksandar nešić 2, bratislav milovanović 3
1 faculty of electronic engineering, university of niš, serbia
2 "imtel-komunikacije" a.d., novi beograd, serbia
3 university singidunum, 11000 belgrade, serbia
abstract. the paper discusses the problem of side lobe suppression in the radiation pattern of printed antenna arrays with different 3d reflector surfaces. the antenna array of eight symmetrical pentagonal dipoles with corner reflectors of various angles is examined. all investigated antenna arrays are fed by the same feeding network of impedance transformers enabling the necessary amplitude distribution. considering the different reflector surfaces, the influence of parasitic radiation from the feeding network on side lobe suppression is studied to prevent the reception of unwanted noise and to increase gain.
key words: printed antenna array, reflector antennas, side lobe suppression, symmetrical pentagonal dipole
1.
introduction
modern wireless communication systems place strong requirements on antennas relating to their size, weight, cost, performance and ease of installation. printed (microstrip) antennas can meet most of the requirements set by many government and commercial applications (mobile radio and wireless communications), as well as high-performance aircraft, spacecraft, satellite, and missile applications. printed antennas feature low profile and low weight, simple and inexpensive production using the standard photolithographic technique, great reproducibility and the possibility of integration with other microwave circuits [1]. the major disadvantages of microstrip antennas are spurious feed radiation, tolerances in fabrication, very narrow frequency bandwidth and the surface wave effect [2]. printed antenna arrays with symmetrical pentagonal dipoles can largely overcome the mentioned limitations of printed antennas. the antenna array is an assembly of radiating elements in an electrical and geometrical configuration that improves the majority of antenna parameters.
received october 3, 2016; received in revised form november 14, 2016
corresponding author: marija milijić, faculty of electronic engineering, university of niš, serbia, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: marija.milijic@elfak.ni.ac.rs)
the symmetrical pentagonal dipole operates at the second resonance (antiresonance), which gives both a much slower impedance variation with frequency and a wider useful bandwidth than operation at the first resonance [3]-[7]. consequently, the proposed antenna array has lower sensitivity to fabrication tolerances, enabling the use of a low-cost photolithographic printing process for its manufacture. further, the feeding network is also a symmetrical printed structure, which reduces parasitic radiation and the surface wave effect.
however, spurious radiation from the feed network has a very substantial, although indirect, effect on the side lobe level [2]. international standards and recommendations define side lobe suppression (sls) for modern communication systems between 20 db and 40 db. moreover, radar systems that are employed to control civil and military objects have stricter sls requirements. side lobes should be minimized to avoid false target indications through the side lobes, which can have catastrophic consequences. also, insufficient sls indicates that more power is radiated in the side lobes, resulting in low antenna gain.
the proposed antenna array of eight symmetrical pentagonal dipoles uses a dolph-chebyshev distribution of the second order with a 19 db pedestal (imax/imin ratio) in order to achieve an sls of 44 db in the ideal case without realization errors. besides realization deviations that can reduce sls, the parasitic radiation from the feeding network may influence antenna parameters, especially sls. reflector plates can be a good tool to overcome the undesired effect of the feeding structure's radiation. furthermore, they can improve gain and control the radiation pattern, which is desirable for many modern wireless applications.
2. printed antenna arrays with high side lobe suppression
the investigated printed antenna consists of an array of eight symmetrical pentagonal dipoles (labelled d1-d8), a feeding network, a balun and a 3d reflector surface (fig. 1). the array, feeding network and balun are printed on the same dielectric substrate of 0.508 mm thickness with εr=2.1. the vertex of the corner reflector with angle α is at distance h from the antenna array.
fig.
1 printed antenna array of eight symmetrical pentagonal dipoles with feeding network, balun and reflector
2.1. antenna array
the eight symmetrical dipoles form the antenna array. each has a pentagonal shape, one half of which is on one side of the substrate and the other half, turned the opposite way, on the opposite side. the single pentagonal dipole has a very large bandwidth regardless of the reflector plate used. considering its impedance variation in the desired frequency range, its bandwidth (vswr below 2) is more than 35% of the central frequency [8]. however, most antenna parameters of a single pentagonal dipole are hard to control. therefore, pentagonal dipoles are combined in an array to achieve higher gain, better sls and other required parameters.
the pentagonal dipoles in the array are axially positioned at distance d. the array is mirror symmetrical, with the line of symmetry passing through the middle of the array. consequently, dipoles d1 and d8 have the same dimensions and parameters, as do dipoles d2 and d7, d3 and d6, and d4 and d5. each dipole of the array is fed by a symmetrical feeding line that passes through holes at the contact between the two reflector metal surfaces. the holes must be of sufficient diameter (2.3 mm) to minimize the influence of the metallic plates.
previous research [8-10] has shown that the reflector plate influences antenna parameters such as sls, gain and bandwidth. planar reflector plates [8-10] allow the antenna array to be used in a wide frequency range, although its gain and sls are not satisfactory for many modern communication systems. unlike planar reflector plates, the antenna array with corner reflectors [8-9] can achieve greater gain and sls but a narrower bandwidth.
the improvement in gain and sls is more noticeable when the angle of the corner reflector is smaller. even though antenna arrays with corner reflectors of 90° and 60° angle [8-9] have been examined, their sls values are not satisfactory for many wireless applications. further investigations should optimize the angle of the corner reflector to obtain satisfactory sls and gain. furthermore, the antenna array with optimal parameters is planned to be realized and measured. the radiation patterns in the h plane will also be investigated.
2.2. feeding network
the feeding network, also a symmetrical printed structure, provides the amplitude distribution calculated by the linplan software [11]. it begins with a balun that is used for the transition from the conventional printed to the symmetrical printed structure. after it, there are impedance transformers, t-junctions, and feeding lines (fig. 2).
fig. 2 feeding network of impedance transformers for antenna array with high side lobe suppression
feeding lines with impedance zc = 100 Ω correspond to the dipoles with the same impedance zd = 100 Ω. there are several t-junctions (one in the first stage, two in the second stage and four in the third stage of the feeding network). the t-junctions are marked by points a, b, c, and d (fig. 2). the impedance at the separating points a, b, c, and d is zs = 50 Ω. the impedance transformer zt = 70.7 Ω is used to transform the impedance of the feeding line zc = 100 Ω into the impedance at the separating point zs = 50 Ω. the other impedance transformers (z1, z2, z3, z4, za, and zb) are employed to obtain the requested amplitude weight for each radiating element of the array, as obtained by the linplan software [11].
the linplan software enables the adjustment of an antenna array's parameters in order to achieve optimal values of sls, gain and hpbw (half-power beamwidth) [8]. also, the distance d between radiating elements should have an optimal value to keep their mutual impedance from becoming excessive.
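the value zt = 70.7 Ω quoted above is what the standard quarter-wave matching relation zt = sqrt(z1·z2) gives for the 100 Ω feeding line and the 50 Ω separating point. a minimal python sketch of that arithmetic follows. the function names are illustrative, and the lossless, single-frequency line model is an assumption of the sketch; it says nothing about how the other transformers (z1 to z4, za, zb) were dimensioned, since those depend on the amplitude weights.

```python
import math


def quarter_wave_impedance(z_a, z_b):
    """characteristic impedance of a quarter-wave line that matches z_a to z_b."""
    return math.sqrt(z_a * z_b)


def input_impedance_quarter_wave(z_t, z_load):
    """impedance seen through a lossless quarter-wave line of impedance z_t."""
    return z_t ** 2 / z_load


Z_C = 100.0  # feeding line impedance [ohm], from the text
Z_S = 50.0   # impedance at the separating points [ohm], from the text

Z_T = quarter_wave_impedance(Z_C, Z_S)

if __name__ == "__main__":
    print(f"zt = {Z_T:.1f} ohm")
    print(f"100 ohm seen through zt: {input_impedance_quarter_wave(Z_T, Z_C):.1f} ohm")
```

the second function confirms the direction of the transformation: a 100 Ω load seen through the 70.7 Ω quarter-wave section appears as 50 Ω at the junction.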
furthermore, the value of the pedestal determines the width of the impedance transformers in the feeding network, which should be moderate to ease realization by photolithographic printing. considering all requirements, a pedestal of 19 db and a distance between the array elements of d = 0.77λ0 = 19.25 mm (λ0 is the wavelength in vacuum at the centre frequency fc = 12 ghz) are chosen [8]. the amplitude distribution is shown in table 1.
table 1 the distribution coefficients for dolph-chebyshev distribution of the second order with pedestal of 19 db
dipole number i                 1/8      2/7      3/6      4/5
distribution coefficient ui     0.121    0.387    0.742    1
all impedance transformers are λg/4 long (λg is the wavelength at the centre frequency fc for the dielectric substrate whose thickness is 0.508 mm and dielectric constant is 2.17). their characteristics and dimensions have been calculated [9]-[10] using the values ui, i = 1, 2, 3, 4, from table 1 for a dielectric substrate of 0.508 mm thickness, relative dielectric permittivity 2.17, metal conductivity 41 MS/m, and negligibly small values of loss tangent and conductor thickness (table 2).
table 2 the impedance transformers of the feeding network
impedance transformer            z1        z2       z3        z4      za        zb
width [mm]                       0.152     1.232    0.615     0.97    0.147     1.245
characteristic impedance [Ω]     236.95    74.08    118.66    88      228.37    74.36
moreover, expected tolerances in the standard photolithographic process have been assumed in order to estimate the sls degradation due to amplitude, phase and radiating element positioning deviations from the optimized values at the operating frequency of 12 ghz.
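to make the role of the table 1 coefficients concrete, the following python sketch evaluates the array factor of an eight-element broadside array with those amplitudes and spacing d = 0.77λ0, and estimates the highest side lobe relative to the main beam. it is only an idealised illustration: the elements are assumed to be isotropic and fed in phase, there is no reflector and no mutual coupling, and the function names and the 0.05° angular step are choices made here. for these reasons the printed figure need not reproduce the ideal-case sls quoted in the paper.

```python
import cmath
import math

C0 = 299_792_458.0                          # speed of light in vacuum [m/s]
F_C = 12e9                                  # centre frequency [hz]
LAMBDA0_MM = C0 / F_C * 1e3                 # free-space wavelength [mm]
SPACING_WL = 0.77                           # element spacing d in wavelengths
HALF_WEIGHTS = [0.121, 0.387, 0.742, 1.0]   # table 1, dipoles 1..4
WEIGHTS = HALF_WEIGHTS + HALF_WEIGHTS[::-1]  # mirror symmetry gives dipoles 5..8


def array_factor_db(theta_deg, weights, spacing_wl):
    """normalised array factor [db]; theta is measured from broadside."""
    psi = 2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    centre = (len(weights) - 1) / 2.0
    total = sum(w * cmath.exp(1j * psi * (k - centre))
                for k, w in enumerate(weights))
    magnitude = abs(total) / sum(weights)
    return 20.0 * math.log10(max(magnitude, 1e-12))


def side_lobe_level_db(weights, spacing_wl, step_deg=0.05):
    """highest lobe outside the main beam, in db relative to the peak."""
    n_steps = int(round(90.0 / step_deg))
    # the pattern is symmetric about broadside, so 0..90 degrees is enough
    pattern = [array_factor_db(k * step_deg, weights, spacing_wl)
               for k in range(n_steps + 1)]
    k = 0
    while k + 1 < len(pattern) and pattern[k + 1] < pattern[k]:
        k += 1                      # walk down the main lobe to its first minimum
    outside = pattern[k:]
    return max(outside) if len(outside) > 1 else float("-inf")


if __name__ == "__main__":
    print(f"lambda0 = {LAMBDA0_MM:.2f} mm, d = {SPACING_WL * LAMBDA0_MM:.2f} mm")
    print(f"highest side lobe: {side_lobe_level_db(WEIGHTS, SPACING_WL):.1f} db")
```

multiplying the array factor by an element pattern, or perturbing the weights and positions, would be the natural next step for studying the tolerance cases, but that goes beyond what the text specifies.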
besides the ideal case without realization errors, two more cases have been considered:
• the real case, in which deviations in the distances between radiating elements in the array are 1 percent of λ0, phase deviations are 0.908° (approximately 40 μm tolerance in the length of the feeding line) and amplitude deviations along the feeding lines are 1 db;
• the worst case, in which deviations in the distances between radiating elements in the array are 2 percent of λ0, phase deviations are 1.835° (approximately 80 μm tolerance in the length of the feeding line) and amplitude deviations along the feeding lines are 2 db.
the radiation patterns simulated by linplan [11] for all three cases show that the proposed antenna array has an sls of 44.8 db at 12 ghz in the ideal case and that the expected sls is 39.8 db (in the real case) and 36.2 db (in the worst case) at the same frequency. further, linplan [11] works with abstract radiating elements, and it can be expected that real antenna arrays with real elements would show greater degradation.
3. side lobe suppression of printed antenna array
initially, all dipoles are fed by single generators and their dimensions are adjusted in order to obtain an impedance of zd = 100 Ω for all dipoles at the centre frequency (fc = 12 ghz), taking into consideration mutual coupling and the reflector influence. afterwards, the array of dipoles with optimized dimensions is connected to the feeding network of impedance transformers.
3.1. simulation results
the first investigated antenna array is situated in a corner reflector with an angle of 90° whose vertex is at distance h = λ0/2 = 12.5 mm from the centres of the dipoles. both reflector plates have dimensions of 308 mm x 60.8 mm. its radiation patterns in both the e and h plane, simulated with the wipl-d software, are presented in fig. 3 and fig. 4, respectively.
the first simulated model, in which the dipoles are fed by single generators, has a side lobe suppression of 40.6 db in the e plane (fig. 3). the second model is generated by integrating the antenna array with the feeding network. its sls is 36.65 db. it can be supposed that the unwanted radiation from the feeding network degrades the sls.
fig. 3 radiation pattern of printed antenna array in corner reflector with angle of 90° in e plane
fig. 4 radiation pattern of printed antenna array in corner reflector with angle of 90° in h plane
however, it does not influence the gain or the radiation pattern in the h plane (fig. 4). the gain in the e plane for both antenna simulation models is 19 dbi.
the second investigated antenna array is located between two metallic plates of dimensions 308 mm x 76 mm joined at a 60° angle. the distance between the array and the vertex of the corner reflector is h = λ0/2 = 12.5 mm. the antenna array fed by eight single generators has sls = 43.7 db and gain g = 20.5 dbi in the e plane (fig. 5). when the feeding network of impedance transformers is integrated with the antenna array, sls decreases to 37.3 db while the gain stays approximately the same.
fig. 5 radiation pattern of printed antenna array in corner reflector with angle of 60° in e plane
fig. 6 radiation pattern of printed antenna array in corner reflector with angle of 60° in h plane
the radiation pattern in the h plane, simulated with the wipl-d software, does not change with the feeding method (fig. 6). moreover, it is significantly narrower than the simulated h-plane radiation pattern of the antenna array in the corner reflector with 90° angle.
the last examined antenna array is placed in a corner reflector with an angle of 45°. the dimensions of each reflector plate are 308 mm x 106 mm. the smaller angle of the corner reflector requires a greater distance between its vertex and the antenna array.
therefore, the distance h = 0.6 λ0 = 15 mm is selected. the first wipl-d simulation model, in which the dipoles in the array are fed by single generators, has sls = 41 db (fig. 7). the second model, which integrates the antenna array of eight dipoles with the feeding network of impedance transformers, has sls = 38.8 db (fig. 7).
since the simulation results are satisfactory for both wipl-d models, the proposed antenna has been realized. fig. 9 shows photographs of the fabricated antenna: fig. 9 a) is a view of the antenna array in the corner reflector with an angle of 45° and fig. 9 b) is a view of the antenna array with one metallic plate and the feeding network. measured results are presented in fig. 7 and fig. 8.
the gain in the e plane of the realized antenna is about 21 dbi, which is the value obtained by the wipl-d software for both simulated models (fig. 7). however, the sls is smaller than the value expected from the wipl-d simulations. the sls of the realized antenna is 32 db, which is about 6.8 db smaller than the simulated sls of the antenna array with feeding network. possible reasons for the sls degradation of the realized antenna are: an accidental reflection during the measurement, tolerances in the fabrication of the very thin impedance transformers (z1 and za), the influence of the corner reflector's metallic plates on the feeding structure, etc. although all these influences are hard to investigate and some of them cannot be eliminated, the measured sls is appropriate for many commercial wireless services. furthermore, the realized antenna has a very good gain of 21 dbi, which is very important for applications where all potential users require a wireless signal of good quality.
fig. 8 presents the simulated and measured radiation pattern in the h plane for the antenna array in the corner reflector with 45° angle. it is the narrowest beam in the h plane among the examined antennas.
fig. 7 radiation pattern of printed antenna array in corner reflector with angle of 45° in e plane
fig.
8 radiation pattern of printed antenna array in corner reflector with angle of 45° in h plane
the entire symmetrical printed structure, comprising the array of eight dipoles in the corner reflector, the feeding network and the balun, is simple and easy to fabricate by printing on a single substrate. in particular, it is fabricated by a cheap and simple photolithographic process that satisfies the requirements of mass production.
fig. 9 a) realized antenna with corner reflector b) antenna array, feeding network and one metallic plate of corner reflector
the simulated and measured vswr is presented in fig. 10. the simulated vswr of the antenna array is less than 2 in a wide frequency range for every considered corner reflector: from below 9 ghz to 13.9 ghz for the reflector with 90° angle, between 10.4 ghz and 13.4 ghz for the reflector with 60° angle and between 10.3 ghz and 13.7 ghz for the reflector with 45° angle. however, the realized antenna array in the corner reflector with 45° angle has a vswr, measured by an agilent n5227a network analyzer, below 2 for frequencies between 11.13 ghz and 13.11 ghz. while the realized antenna is less wideband than the simulation models, possibly due to fabrication tolerances and losses introduced by connectors, it still demonstrates a good bandwidth of 1.98 ghz (16.5% of the central frequency).
fig. 10 vswr of antenna arrays in corner reflector with different angles
the presented results show that the corner reflector can significantly influence the radiation patterns in both the e and the h plane. using a corner reflector of smaller angle can increase the gain and sls of the antenna array (fig. 11).
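the bandwidth figures quoted for the measured antenna can be checked with a few lines of arithmetic. the python sketch below computes the absolute and fractional bandwidth from the measured band edges (11.13 ghz and 13.11 ghz, relative to the 12 ghz centre frequency) and converts the vswr = 2 threshold to the equivalent reflection-coefficient magnitude and return loss using the standard relations |Γ| = (vswr − 1)/(vswr + 1) and rl = −20 log10|Γ|. the function names are illustrative only.

```python
import math


def fractional_bandwidth(f_low, f_high, f_centre):
    """bandwidth as a fraction of the centre frequency."""
    return (f_high - f_low) / f_centre


def reflection_magnitude(vswr):
    """magnitude of the reflection coefficient for a given vswr."""
    return (vswr - 1.0) / (vswr + 1.0)


def return_loss_db(vswr):
    """return loss [db]; valid for vswr > 1 (a perfect match has infinite return loss)."""
    return -20.0 * math.log10(reflection_magnitude(vswr))


F_LOW_GHZ, F_HIGH_GHZ, F_CENTRE_GHZ = 11.13, 13.11, 12.0  # measured values from the text

BANDWIDTH_GHZ = F_HIGH_GHZ - F_LOW_GHZ
FRACTIONAL = fractional_bandwidth(F_LOW_GHZ, F_HIGH_GHZ, F_CENTRE_GHZ)

if __name__ == "__main__":
    print(f"bandwidth: {BANDWIDTH_GHZ:.2f} ghz ({100 * FRACTIONAL:.1f} % of centre)")
    print(f"vswr 2 -> |gamma| = {reflection_magnitude(2.0):.3f}, "
          f"return loss = {return_loss_db(2.0):.2f} db")
```

in other words, the vswr < 2 criterion used throughout corresponds to at most one third of the incident voltage wave being reflected.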
in order to confirm the advantages of the corner reflector, an antenna array with the same parameters as the investigated arrays (distribution, feeding network, distance between elements, dielectric, frequency, etc.), but without any reflector plate, is studied. the simulation results of the antenna array without reflector are presented in fig. 11.
fig. 11 simulated radiation pattern in e plane of antenna arrays without reflector and with corner reflector with different angles
it is obvious that its simulation results are far worse than all results of the antenna array in corner reflectors. its gain is 8.6 dbi while its sls is 15.5 db. even though its simulation results are not satisfactory for communication systems that require high sls, thanks to its planar form and optimal dimensions (185 mm x 50 mm), the printed antenna array without reflector is suitable for many other applications: iot equipment [12], portable devices, rfid systems, etc. the distance h between the corner reflector’s vertex and the dipoles must increase as the angle α of the reflector decreases [1]. furthermore, for reflectors with smaller angles, the dimensions of the reflector plates must be larger [1], increasing the size of the entire antenna. although the gain increases as the angle between the reflector plates decreases, there is an optimum dipole-to-vertex distance h for each angle α of the corner reflector. if the distance h becomes too small, the antenna can be inefficient. for a very large distance h, the system produces undesirable multiple lobes and loses its directional characteristics [1]. consequently, the corner reflector whose plates are set at the 45° angle has the distance h = 0.6λ0 between its vertex and the dipoles in the array, and gives satisfactory simulated and measured results.
even though using a corner reflector with a smaller angle can increase antenna gain and sls, it also results in larger overall antenna dimensions and can lead to an inadequate radiation pattern.
4. conclusion
side lobe suppression is one of the most important antenna parameters, and its value must be sufficient to minimize false target indications through the side lobes. side lobe levels of −20 db or smaller are usually not desirable in most applications. antennas with side lobe suppression greater than 30 db or 40 db (mostly in radar systems) must be carefully designed and realized. furthermore, modern wireless systems require compact, light and simple antennas that are easy to implement. microstrip (printed) antennas satisfy all the listed requirements, although several limitations hinder them from achieving high side lobe suppression: tolerances in fabrication, mutual coupling between radiating elements, the surface wave effect, and parasitic radiation from the feeding network. the symmetrical printed antenna array of pentagonal dipoles can overcome most of these obstacles to achieve great side lobe suppression. the presented simulated and measured results show that symmetrical printed antenna arrays can achieve great sls. however, several factors must be considered in their design. the choice of distribution determines the maximum sls that can be obtained, as well as the parameters of the impedance transformers in the feeding network. if a tapered distribution with a great pedestal is used, the transformers with the greatest and the smallest impedance will have the smallest and the biggest width, respectively. the impedance transformers with the smallest width are mechanically unreliable; they can easily be broken. the impedance transformers with the biggest width can support higher-order modes.
also, the technical tolerances of photolithographic realization must be considered, because deviations from the designed width and length of the impedance transformers can change the amplitude and phase at the radiating elements, causing sls degradation. unwanted radiation from the feeding structure has a significant influence on the side lobe level. although the feeding network of symmetrical impedance transformers radiates less than standard microstrip feeding structures, its radiation cannot be completely eliminated. the simulated results show that the use of corner reflectors with different angles can partially solve the problem of parasitic radiation from the feeding network. a corner reflector with a smaller angle better suppresses the spurious radiation from the feeding network, so greater sls can be achieved. moreover, greater gain can be obtained using a corner reflector with a smaller angle. furthermore, the angle of the corner reflector influences the beamwidth in h plane. despite all the mentioned advantages of the corner reflector, the side lobe suppression of the realized antenna array in the corner reflector of 45° is about 6.8 db lower than the value expected from simulation. the measured sls degradation can be attributed to imperfect measuring conditions and to fabrication tolerances. however, the gain of the realized antenna is 21 dbi, which is an optimal value for many modern wireless applications.
acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under project no. tr 32052. the authors would like to thank n. tasić and m. pešić from "imtel-communication" for the antenna fabrication. special thanks also go to msc i. radnović and n. popović for their help and support.
references
[1] c.a. balanis, antenna theory: analysis and design, 3rd edition, wiley-interscience, 2005.
[2] d. m. pozar and b.
kaufman, ”design considerations for low sidelobe microstrip arrays”, ieee trans. on antennas and propagation, vol. 38, no.8, pp. 1176-1185, august 1990. [3] a. nešić, i. radnović, z. mićić, s. jovanović, “side lobe suppression of printed antenna arrays for integration with microwave circuits”, microwave j., vol. 53, no. 10, pp. 72-80, 2010. [4] m. milijić, a. nešić, b. milovanović, “design, realization and measurements of a corner reflector printed antenna array with cosecant squared-shaped beam pattern”, ieee antenn. wirel. pr., vol. 15, pp. 421-424, 2016. [5] a. nešić, i. radnović, “new type of millimeter wave antenna with high gain and high side lobe suppression”, optoelectron. adv. mat., vol. 3, no. 10, pp. 1060-1064, 2009. [6] a. nešić, z. mićić, s. jovanović, i. radnović, d. nešić, “millimeter wave printed antenna arrays for covering various sector width”, ieee antennas propag., vol. 49, no. 1, pp. 113-118, 2007. [7] m. milijić, a. nešić, b. milovanović, “wideband printed antenna array in corner reflector with cosecant square-shaped beam pattern”, in proc. of the 22nd telecommun. forum telfor, belgrade, serbia, november 25-27, 2014, pp.780-783. [8] m. milijić, a. nešić, b. milovanović, “the investigation of reflector influence on the bandwidth of symmetrical printed antenna structures”, in proc. of the 3rd international conference on electrical, electronic and computing engineering icetran 2016, zlatibor, serbia, june 13 – 16, 2016, pp. mti1.1.1-6. [9] m. milijić, a. nešić, b. milovanović, “printed antenna arrays with high side lobe suppression: the challenge of design”, microwave review, 2013, vol. 19, no. 2, pp. 15-20. [10] m. milijić, a. nešić, b. milovanović, “side lobe suppression of printed antenna array with perpendicular reflector”, in proc. of the 11th int. conf. telsiks, nis, serbia, october 16–19, 2013, pp. 217-220. [11] m. mikavica, a. nešić, cad for linear and planar antenna array of various radiating elements, norwood, ma, artech house, 1992. 
[12] i. đurić, v. ratković-živanović, m. labus, d. groj, n. milanović, “designing an intelligent home media center”, facta universitatis, series: electronics and energetics, vol. 29, no 3, pp. 461 – 474, september 2016.

facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp. 67-80 doi: 10.2298/fuee1701067l
a non-inverting buck-boost converter with an adaptive dual current mode control
srđan lale1, milomir šoja1, slobodan lubura1, dragan d. mančić2, milan đ. radmanović2
1university of east sarajevo, faculty of electrical engineering, east sarajevo, bosnia and herzegovina
2university of niš, faculty of electronic engineering, niš, serbia
abstract. this paper presents an implementation of adaptive dual current mode control (adcmc) on non-inverting buck-boost converter. a verification of the converter operation with the proposed adcmc has been performed in steady state and during the disturbances in the input voltage and the load resistance. the given simulation and experimental results confirm the effectiveness of the proposed control method.
key words: adaptive dual current mode control, non-inverting buck-boost converter, operating modes, transient response
1. introduction
a non-inverting buck-boost power electronics converter is one of the most versatile non-isolated converter topologies. it has become increasingly popular in many applications, including: electric vehicles [1], dc microgrids [2], battery-powered portable electronic devices (e.g. cellular phones and laptops) [3], [4], power factor correction (pfc) circuits [5], photovoltaic systems [6], etc. the non-inverting buck-boost converter provides the output voltage that is either lower or higher than the input voltage. this property is significant in so-called dynamic voltage scaling (dvs)-based power-efficient supplies, which provide adjustable voltage levels, according to the instantaneous operating conditions [7].
one of the most important features of this converter type is bidirectional operation, which is especially useful in applications such as dc microgrids and electric vehicles. the conventional two-switch topology of the non-inverting buck-boost converter is shown in fig. 1 (a), being a result of a cascaded combination of a buck converter followed by a boost converter. it contains two power switches t1 and t2. if a bidirectional operation of the non-inverting buck-boost converter is required, a four-switch topology from fig. 1 (b) must be used, where the diodes d1 and d2 from fig. 1 (a) are replaced with additional power switches t3 and t4.
received january 25, 2016; received in revised form april 10, 2016
corresponding author: srđan lale, university of east sarajevo, faculty of electrical engineering, vuka karadžića 30, 71126 lukavica, east sarajevo, bosnia and herzegovina (e-mail: srdjan.lale@etf.unssa.rs.ba)
fig. 1 a 2-switch (a) and 4-switch (b) topology of the non-inverting buck-boost converter
depending on the ratio between the input voltage vg and the output voltage vo, the non-inverting buck-boost converter can operate in buck mode (vo<vg) or boost mode (vo>vg). as discussed in [8], these operating modes can be achieved in different ways. a conventional way is to control the switches t1 and t2 simultaneously with the same gate signal. although this switching scheme is simple, it provides low converter efficiency. in order to increase the efficiency, the operating modes are split: the converter operates either as a buck converter (only switch t1 is controlled, while t2 is always turned off) when vo<vg, or as a boost converter (only switch t2 is controlled, while t1 is always turned on) when vo>vg. however, the control of the switches is more complicated in this case, because it is necessary to provide mode detection and a smooth and stable transition between the modes. different control methods can be applied to the non-inverting buck-boost converter, depending on the application.
this paper is focused on using only current mode control (cmc). in most cases, for example in [2], [3], [5], [9], regardless of the applied cmc method, it is suggested that the non-inverting buck-boost converter operates either as buck or boost converter, as described above. in [5], a non-inverting buck-boost converter as a part of pfc rectifier works in both modes during fundamental period. after detection of each operating mode, the built-in control logic decides to work as conventional peak cmc (pcmc) or valley cmc (vcmc). therefore, the control shifts between pcmc (boost mode with duty cycle below 0.5) and vcmc (buck mode with duty cycle above 0.5) when the input rectified voltage crosses the output dc voltage, without need for slope compensation. in [9] a synchronous buck-boost led driver controller is presented, which uses more complex control as a combination of pcmc and vcmc with slope compensation. however, as it is stated in [2], an implementation of conventional cmc methods to this converter, such as pcmc and vcmc, is not a simple task, because they require information about converter operating modes. an average cmc (acmc) can be applied to the non-inverting buck-boost converter, without determination of operating modes [2]. by using a dual-carrier modulator described in [2], it is possible to achieve a smooth transition between the buck and the boost mode and to precisely control the inductor current throughout the entire operating range. there are other acmc approaches, for example in [3], which unlike the above mentioned acmc [2] has a mode selector circuit, which determines the operating mode during a switching cycle. 
due to the inherent ability of natural transition between pcmc and vcmc and vice versa, a dual current mode control (dcmc) proposed in [10] could be suitable for implementation on the non-inverting buck-boost converter, with simultaneously controlled switches t1 and t2. the converter will operate in buck mode with pcmc (duty cycle below 0.5) and in boost mode with vcmc (duty cycle above 0.5). in this way, there is no need for detection of operating modes. also, the converter is stable for the entire range of duty cycle from 0 to 1, that is, the subharmonic oscillations do not exist. on the other hand, all excellent features of pcmc and vcmc are preserved, such as fixed switching frequency, good dynamics and, what is very important, simplicity. a modified version of dcmc, named adaptive dual current mode control (adcmc), is proposed in [11] and elaborated in detail in [12], which improves some features of dcmc, while the basic operating principles remain the same. in [12], only simulation results are given for the non-inverting buck-boost converter. in this paper, besides some simulation results, the experimental verification of adcmc of this converter is presented. the paper is organized in the following way. the basic operating principles of adcmc, on the example of the non-inverting buck-boost converter, are described in section 2. the simulation and experimental results are presented in section 3. section 4 gives the concluding remarks.
2. operating principles of adcmc of non-inverting buck-boost converter
the basic scheme of adcmc of the conventional non-inverting buck-boost converter is presented in fig. 2 (a). the switches t1 and t2 are controlled simultaneously with the same gate signal.
in order to increase the converter efficiency, there is a possibility of a synchronous version of this converter, where the diodes d1 and d2 are replaced with power switches, as shown in fig. 1 (b). in this paper, a synchronous version is not used. however, as stated in the introduction, if bidirectional operation of the non-inverting buck-boost converter is required, these additional two switches are necessary. a quiescent value of the output voltage vo of the non-inverting buck-boost converter from fig. 2 (a) is equal to:
$v_o = \frac{d\,v_g}{1-d}$, (1)
where d and vg are the quiescent values of the duty cycle and the input voltage, respectively. according to (1), when 0<d<0.5 the converter operates in buck mode (vo<vg), and when d>0.5 it operates in boost mode (vo>vg). since pcmc is used when d<0.5 (buck mode) and vcmc when d>0.5 (boost mode), a stable operation of the non-inverting buck-boost converter is guaranteed for the entire range of d without slope compensation. instead of mode detection and artificial shifting between pcmc and vcmc, dcmc proposed in [10] is suitable for this application, because it has a natural ability of shifting between pcmc, when d<0.5, and vcmc, when d>0.5, without any mode selector circuits. similarly to pcmc and vcmc, dcmc has a drawback in the existence of a peak-to-average current error (a difference between the reference current iref and the average value of the inductor current $\langle i_l(t)\rangle_{t_s}$ over the switching period ts). in the ideal case of cmc, the aim is to control precisely the average value of the inductor current over each switching period, that is, to make this error equal to zero. in order to eliminate the peak-to-average current error, an enhanced version of dcmc is proposed in [11], named adcmc.
fig. 2 a) adcmc of the non-inverting buck-boost converter, b) operating modes
the operating modes of adcmc applied to the non-inverting buck-boost converter are shown in fig. 2 (b).
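the relation between duty cycle, output voltage and operating mode in (1), that is, vo = d·vg/(1−d), can be illustrated with a short numerical sketch. this is only an idealized steady-state calculation (lossless converter in ccm), not the authors' implementation; the function names are hypothetical, and the input voltage of 12 v and the output voltages of 7 v and 30 v are the values used later in section 3.

```python
def output_voltage(v_g: float, d: float) -> float:
    """Ideal quiescent output voltage from (1): v_o = d * v_g / (1 - d)."""
    if not 0.0 <= d < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= d < 1")
    return d * v_g / (1.0 - d)


def duty_cycle(v_g: float, v_o: float) -> float:
    """Inverse of (1): d = v_o / (v_g + v_o)."""
    return v_o / (v_g + v_o)


def operating_mode(d: float) -> str:
    """Mode implied by the duty cycle when t1 and t2 share one gate signal."""
    if d < 0.5:
        return "buck (pcmc)"
    if d > 0.5:
        return "boost (vcmc)"
    return "boundary (v_o = v_g)"


V_G = 12.0  # input voltage in volts, as in table 1

for v_o in (7.0, 30.0):
    d = duty_cycle(V_G, v_o)
    print(f"v_o = {v_o:4.1f} V -> d = {d:.3f}, mode: {operating_mode(d)}")
```

for vg = 12 v, an output of 7 v corresponds to d ≈ 0.37 (buck mode) and an output of 30 v to d ≈ 0.71 (boost mode), so both sides of d = 0.5 are covered.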
the main difference between adcmc and dcmc is that the width between the peak iref+ib and valley iref-ib current boundaries (the current bandwidth 2ib) is not constant and predefined for adcmc, unlike dcmc, but is adaptive and calculated online by using the instantaneous peak-to-peak ripple of the inductor current δilpp in each switching period ts. the adaptive current bandwidth 2ib for the non-inverting buck-boost converter is calculated as (fig. 2 (a)):
$2i_b = k_{ib}\,\Delta i_{lpp} = k_{ib}\,\frac{v_g v_o}{l f_s (v_g + v_o)}$, (2)
where kib is the scaling gain (kib≥1), fs=1/ts is the switching frequency, and l is the inductance value. the gain kib determines whether 2ib≥δilpp. when kib=1, the adaptive current bandwidth 2ib becomes equal to the measured instantaneous peak-to-peak current ripple δilpp, giving zero peak-to-average current error. it is evident from (2) that the calculation of the adaptive current bandwidth 2ib depends on the inductance value l, which can be inconvenient if the l parameter is wrong or variable in different operating conditions. a wrong l parameter will lead to an inaccurate current bandwidth 2ib and the appearance of the peak-to-average current error. a possible solution for this issue is to directly measure the instantaneous peak-to-peak ripple from the measured inductor current. this solution will be considered in future work. a detailed analysis of adcmc, including small-signal models and the design of the output voltage compensator gc(s), is presented in [12] for three types of dc–dc power electronics converters: buck, boost, and non-inverting buck–boost converter. this paper is focused on experimental verification of adcmc of the non-inverting buck-boost converter.
3. simulation and experimental results
the operation of the non-inverting buck-boost converter under adcmc, with the topology from fig. 2 (a), was verified with simulations in matlab/simulink and experimentally.
the parameters of the non-inverting buck-boost converter working in the continuous conduction mode (ccm), which are the same for both simulations and experiments, are listed in table 1. the experimental setup is shown in fig. 3. the developed setup can be used for testing adcmc on various types of converters, because the used prototype is made as a universal four-quadrant (4q) converter, with the possibility of easy configuration to the desired topology, such as buck, boost, non-inverting buck-boost, etc.
table 1 parameters of the non-inverting buck-boost converter
parameter    value
vg [v]       12
l [µh]       220
c [µf]       1000
r [ω]        20
fs [khz]     23
fig. 3 experimental setup: 1) the prototype of the non-inverting buck-boost converter; 2) electronic module for measurements and inner current loop; 3) pc with built-in mf624 board; 4) driver module; 5) input voltage source of the converter; 6) power supply units; 7) tektronix mso 2014 oscilloscope
a separate electronic module, which is connected to the mf624 multifunctional data acquisition input/output digital board [13], is used for implementation of the measurements and the inner current loop. the measurement of the inductor current, which is necessary for the inner current loop, is performed with the lem current transducer hx 10-np [14]. the converter input and output voltages are measured with galvanic isolation via the optocoupler il300 [15] and sampled by the 14-bit a/d converter (conversion time about 2 µs) of the mf624 board. the mf624 board is built into the computer and provides real-time processing with the matlab/simulink environment. the implementation of the outer voltage loop and the calculation of the adaptive current bandwidth 2ib for adcmc are performed in real time in simulink, using the real time windows target (rtwt) environment. the reference current iref and the current boundaries iref+ib and iref-ib are obtained from the 14-bit d/a converter of the mf624 board and fed into the inner current loop.
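the calculation of the adaptive current bandwidth can be sketched numerically with the parameters from table 1. the snippet below evaluates (2), 2ib = kib·vg·vo/(l·fs·(vg+vo)), for ideal ccm operation. it is an offline illustration with hypothetical function and variable names, not the real-time simulink implementation described above; kib = 1 and the output voltages of 7 v and 30 v are chosen here only as examples.

```python
def adaptive_current_bandwidth(v_g: float, v_o: float, l: float,
                               f_s: float, k_ib: float = 1.0) -> float:
    """Adaptive bandwidth 2*i_b from (2), in amperes.

    v_g, v_o in volts, l in henries, f_s in hertz, k_ib >= 1 (scaling gain).
    With k_ib = 1 the result equals the ideal peak-to-peak inductor ripple.
    """
    if k_ib < 1.0:
        raise ValueError("scaling gain k_ib must be >= 1")
    return k_ib * v_g * v_o / (l * f_s * (v_g + v_o))


# Parameters from table 1.
V_G = 12.0      # input voltage, V
L = 220e-6      # inductance, H
F_S = 23e3      # switching frequency, Hz

for v_o in (7.0, 30.0):  # example buck-mode and boost-mode output voltages
    two_ib = adaptive_current_bandwidth(V_G, v_o, L, F_S)
    print(f"v_o = {v_o:4.1f} V -> 2*i_b = {two_ib:.3f} A")
```

because l appears in the denominator, an error in the assumed inductance scales the computed bandwidth by the same relative amount, which is the sensitivity noted in section 2.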
the fundamental sampling time for real time operation in simulink was set to 25 μs, which is the minimum sampling time for this hardware. power mosfets irf540n (100 v, 33 a) [16] are used as power switches t1 and t2. a dual-channel galvanically isolated mosfet driver module (turn on/off delay of 0.6 µs) was developed for driving the power switches. a primary objective of the performed simulations and experiments is to demonstrate that the proposed adcmc can be successfully applied to the non-inverting buck-boost converter, ensuring a stable operation in all operating modes and good dynamical properties, regardless of the application. several cases of the converter operation were tested: in steady state for buck and boost operating modes, during the step changes in the input voltage and the load resistance, and during the gradual change of the input voltage.
3.1. operation of the non-inverting buck-boost converter in steady state
the output compensator, as a key part of the outer voltage loop, produces the reference current iref for the inner current loop (fig. 2 (a)). in steady state, the reference current practically has a constant value. therefore, in order to test the behavior of the inner current loop in steady state, the outer voltage loop can be disabled and the reference current can be set manually as a constant signal. the operation of the non-inverting buck-boost converter with adcmc in steady state was tested for both cases: with and without the outer voltage loop. when the voltage loop is disabled, two values of the reference current were used to provide buck and boost operating mode. the simulation waveforms of the inductor current in steady state are shown for iref = 0.5 a (buck mode) in fig. 4 (a) and iref = 5 a (boost mode) in fig. 4 (b). the corresponding experimental waveforms are given in fig. 5 (a), (b).
in the second case, a simple proportional-integral (pi) compensator for the regulation of the output voltage was employed. a design procedure for the output voltage compensator is derived in detail in [12]. as in the first case, both operating modes were considered. the simulation waveforms of the inductor current in steady state are shown for two values of the output voltage: vo = 7 v (buck mode) and vo = 30 v (boost mode), in fig. 4 (c) and fig. 4 (d), respectively. the corresponding experimental results are presented in fig. 5 (c), (d). it is evident from fig. 4 that there is an excellent matching between the reference current and the average value of the inductor current. a very small peak-to-average current error still exists, which can be attributed to the delays in numerical calculation of the simulation. the experimental results from fig. 5 are similar to the simulation results from fig. 4. a small peak-to-average current error appears as a consequence of imperfections of the components used for realization of adcmc. on the basis of the given results from fig. 4 and fig. 5 it can be concluded that adcmc provides a stable operation of the non-inverting buck-boost converter for both ranges of the duty cycle: d < 0.5 and d > 0.5.
3.2. robustness to the disturbances in the input voltage and load
it is very important to evaluate how sensitive the converter with a certain control is to the various disturbances which can occur during operation. in this paper, disturbances such as the step and gradual changes of the input voltage and the step changes in the load resistance were considered. line regulation, which is defined as the converter's ability to maintain the specified output voltage despite changes in the input voltage, was tested for adcmc of the non-inverting buck-boost converter.
fig.
4 the simulation waveforms of the inductor current, reference current and current boundaries in steady state, when the outer voltage loop is: a), b) disabled; c), d) enabled
fig. 5 the experimental waveforms of the inductor current, reference current and current boundaries in steady state, when the outer voltage loop is: a), b) disabled; c), d) enabled
first, step changes from 12 v to 6 v and vice versa were introduced in the input voltage. the output voltage was regulated to 9 v. the load resistance was set to r=10 ω. these step changes were performed in order to make a transition from buck to boost mode and vice versa, and to examine the dynamical behavior of adcmc. the waveforms of the output voltage and the inductor current are shown in fig. 6 (a), (b) (simulation) and fig. 7 (experiment). the same parameters of the output voltage compensator were used in both simulations and experiments. as the simulation and experimental results show, the converter naturally crosses from buck to boost mode and vice versa. due to the adaptation of the current bandwidth 2ib, the transition of the inductor current from one mode to another is smooth, which gives satisfactory line regulation.
fig. 6 the simulation waveforms of the output voltage and the inductor current, for the step changes in the input voltage (a), (b) and the load resistance (c), (d)
fig. 7 the experimental waveforms of the output voltage (top) and the inductor current (bottom), when the input voltage changes from 12 v to 6 v (left) and vice versa (right)
fig. 8 the experimental waveforms of the output voltage (top) and the inductor current (bottom), when the load resistance changes from 20 ω to 10 ω (left) and vice versa (right)
fig.
9 the experimental waveforms of the output voltage, when the input voltage changes from 6 v to 12 v (top) and vice versa (bottom), for σ=100, 150, 200 and 500
fig. 10 the experimental waveforms of the output voltage, when the load resistance changes from 10 ω to 20 ω (top) and vice versa (bottom), for σ=100, 150, 200 and 500
in order to test the step load response, step changes in the load resistance from r=20 ω to r=10 ω and vice versa were performed. the output voltage was regulated to 20 v. the simulation and experimental waveforms of the output voltage and the inductor current are shown in fig. 6 (c), (d) and fig. 8, respectively. it is evident that adcmc successfully rejects the introduced load disturbances. the transient response in the output voltage for the considered step disturbances also depends on the designed output voltage compensator, as shown in fig. 9 and fig. 10. several values of the parameter σ, which determines the transient response time (about 5/σ) and the gains of the pi compensator [12], are considered. it is evident from the experimental results in fig. 9 and fig. 10 that better responses regarding the transient response time and over/undershoot are obtained for higher values of the adjustable parameter σ. the optimization of the output voltage compensator is not the subject of this paper. the aim was to obtain satisfactory results in accordance with the design procedure from [12] (the chosen settling time is about 10-50 ms). also, the output voltage loop is designed to be slow in order to emphasize the behavior of the inner current loop.
fig. 11 the experimental waveforms of the output voltage (top) and the inductor current (bottom), for the gradual change of the input voltage from 15 v to 5 v (left) and vice versa (right)
besides the step changes, a gradual linear change in the input voltage was also introduced in the experiments.
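the link between the parameter σ and the quoted settling times can be checked directly from the approximation given above (transient response time of about 5/σ). the sketch below assumes that σ is expressed in 1/s, which is not stated in the text but is consistent with the 10-50 ms range; the function name is hypothetical.

```python
def transient_time_ms(sigma: float) -> float:
    """Approximate transient response time 5/sigma, returned in milliseconds.

    Assumption: sigma is given in 1/s.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return 5.0 / sigma * 1e3


# Values of sigma used in fig. 9 and fig. 10.
for sigma in (100.0, 150.0, 200.0, 500.0):
    print(f"sigma = {sigma:5.0f} -> about {transient_time_ms(sigma):.1f} ms")
```

under this assumption, σ = 100 and σ = 500 correspond to about 50 ms and 10 ms, which matches the ends of the chosen settling time range.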
the input voltage was gradually changed from 15 v to 5 v and vice versa, while the output voltage was regulated to 10 v, in order to make a gradual transition from buck to boost mode and vice versa. the experimental results are shown in fig. 11. it is obvious that adcmc is robust against these changes. the output voltage is successfully regulated, without any disruptions between the two operating modes.
4. conclusion
in this paper, an implementation of a novel adcmc method on the non-inverting buck-boost converter has been presented. the given simulation and experimental results confirm that there is no need for the detection of converter operating modes, because this method ensures a natural and stable transition between the buck and the boost mode, and vice versa. the given results show that adcmc provides a stable operation of the non-inverting buck-boost converter for the entire range of duty cycle from 0 to 1. also, it is robust against disturbances, such as the step and gradual changes in the input voltage and the step changes in the load resistance, with good dynamical performance. the next task will be to use the proposed adcmc of the non-inverting buck-boost converter in various popular applications, such as battery chargers/dischargers, led drivers, etc., and to compare it with other relevant methods in the same applications.
references
[1] m. a. khan, a. ahmed, i. husain, y. sozer and m. badawy, "performance analysis of bidirectional dc–dc converters for electric vehicles", ieee trans. ind. appl., vol. 51, no. 4, pp. 3442-3452, july/aug. 2015.
[2] i. aharon, a. kuperman and d. shmilovitz, "analysis of dual-carrier modulator for bidirectional noninverting buck–boost converter", ieee trans. power electron., vol. 30, no. 2, pp. 840-848, feb. 2015.
[3] wei chia-ling, chen chin-hong, wu kuo-chun and ko i-ting, "design of an average-current-mode noninverting buck–boost dc–dc converter with reduced switching and conduction losses", ieee trans. power electron., vol. 27, no. 12, pp. 4934-4943. [4] c.-h. tsai, y.-s. tsai and h.-c. liu, "a stable mode-transition technique for a digitally controlled non-inverting buck–boost dc–dc converter", ieee trans. ind. electron., vol. 62, no. 1, pp. 475-483, jan. 2015. [5] g. k. andersen and f. blaabjerg, "current programmed control of a single-phase two-switch buck-boost power factor correction circuit", ieee trans. ind. electron., vol. 53, no. 1, pp. 263-271, feb. 2006. [6] t.-f. wu, c.-l. kuo, k.-h. sun, y.-k. chen, y.-r. chang and y.-d. lee, "integration and operation of a single-phase bidirectional inverter with two buck/boost mppts for dc-distribution applications", ieee trans. power electron., vol. 28, no. 11, pp. 5098-5106, nov. 2013. [7] l. feng and m. dongsheng, "design of digital tri-mode adaptive-output buck–boost power converter for power-efficient integrated systems", ieee trans. ind. electron., vol. 57, no. 6, pp. 2151-2160, june 2010. [8] haifeng fan, "design tips for an efficient non-inverting buck-boost converter", analog applications journal, texas instruments, pp. 20-25, 2014. [9] linear technology, "60v 4-switch synchronous buck-boost led driver controller", lt3791 datasheet, rev. b, 2012. available: http://cds.linear.com/docs/en/datasheet/3791fb.pdf. [10] a. v. anunciada and m. m. silva, "a new current mode control process and applications", ieee trans. power electron., vol. 6, no. 4, pp. 601–610, oct. 1991. [11] s. lale, m. šoja, s. lubura and m. radmanović, "modeling and analysis of new adaptive dual current mode control", in proceedings of the 10th international symposium on industrial electronics indel 2014, vol. 10, no. t-02, pp. 73–76. [12] available: http://www.indel.etfbl.net/resources/proceedings_2014/indel_2014_paper_11.pdf. [13] s. lale, m. 
šoja and s. lubura, "a modified dual current mode control method with an adaptive current bandwidth", int. j. circ. theor. appl., 2015.
[14] humusoft, "mf624 multifunction i/o card", mf624 user’s manual, 2014. available: http://www2.humusoft.cz/www/datacq/manuals/mf624um.pdf.
[15] lem, "current transducer hx 05..15-np", hx 10-np datasheet. available: http://www.lem.com/docs/products/hx%205_15-np_e%20v10.pdf.
[16] vishay semiconductors, "linear optocoupler, high gain stability, wide bandwidth", il300 datasheet. available: http://www.vishay.com/docs/83622/il300.pdf.
[17] international rectifier, "hexfet ® power mosfet", irf540n datasheet. available: http://www.irf.com/product-info/datasheets/data/irf540n.pdf.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 43-51 https://doi.org/10.2298/fuee2301043r © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

dual band mimo antenna for lte, 4g and sub–6 ghz 5g applications

pinku ranjan1, swati yadav2, amit bage3
1department of electrical / electronic engineering, atal bihari vajpayee-indian institute of information technology and management (abv-iiitm), gwalior, india
2department of electronics & telecommunication engineering, college of engineering roorkee, uttarakhand-247667, india
3department of electronics and communication engineering, national institute of technology, hamirpur, india

abstract. in this manuscript, a compact mimo antenna for wireless applications is presented. the proposed antenna consists of an f-shaped radiator with a circular slot in the center and a rectangular ground plane on the other side of the substrate. the overall size of the proposed antenna is 48 × 48 mm2. the antenna is designed to work in two frequency bands, from 1.5 to 2.3 ghz and from 3.7 to 4.2 ghz, with resonant frequencies of 1.8 ghz and 3.9 ghz respectively.
the diversity performance of the antenna is also evaluated using parameters such as the envelope correlation coefficient (ecc), diversity gain (dg) and total active reflection coefficient (tarc). the value of ecc is 0.02, which indicates good diversity performance of the antenna. the proposed antenna has been fabricated to validate the simulated results, and the simulated and measured results show good agreement with each other.

key words: mimo antenna; envelope correlation coefficient (ecc); total active reflection coefficient (tarc)

received may 26, 2022; revised july 14, 2022, and july 19, 2022; accepted july 25, 2022
corresponding author: pinku ranjan, abv-indian institute of information technology and management (iiitm), gwalior, madhya pradesh, india
e-mail: pinkuranjan@iiitm.ac.in

1. introduction

worldwide, wireless communication is considered to be the fastest growing technology. in 2020, it is expected that 70 percent of the world’s population will have at least one smart phone. each new generation of wireless communication requires improvements in data rate, antenna size and gain. a technology that meets the higher demands of such future wireless communication is the use of multiple input multiple output (mimo) antennas. in mimo antenna design, multiple antennas are used on both the transmitting and the receiving side in order to increase the radio link capacity. in this technique, more than one data signal is simultaneously transmitted or received over the same radio channel. with mimo technology, the signal capturing capacity of the receiver is increased by allowing the antennas to combine data streams that arrive over different paths at different times. mimo is an important technique in current research and will play a key role in next generation wireless systems, including 5g networks.
for a mimo antenna system, isolation between the two radiating elements is very important. therefore, the two radiators should be designed in such a way that the isolation between them is better than 15 db, that is, the coupling between the ports stays below –15 db. ensuring this isolation between the antenna elements within a miniaturized size is a big challenge for antenna designers. in the past few years many researchers have proposed different mimo antennas based on different techniques [1]–[10]. in [11], a two-element semi-ring antenna along with a uwb amplifier is used to design a mimo antenna. in [12], an annular slot antenna and two shorts in opposite directions, placed at 45 degrees between the microstrip lines, are used to achieve isolation. in 2018, a. dkiouak et al. [13] presented a compact mimo antenna for wireless application based on two symmetrical monopoles with a t-shaped junction, where the t-shaped junction enhances the isolation between the two antennas. in [14], parasitic elements are used to suppress the reactive coupling between the different antenna elements of a mimo antenna. in [15], a highly isolated compact 2 × 2 mimo antenna is designed using pifa, and dgs is used to improve the inter-port isolation.

in the proposed design, a simple f–shaped radiator is used to obtain dual band operation of the mimo antenna for wireless communication. the f–shaped radiator is chosen to cover the desired application bands. the antenna has been designed to work in two frequency bands, 1.5–2.3 ghz and 3.7–4.2 ghz, with resonant frequencies of 1.8 ghz and 3.9 ghz respectively. the numerical analysis has been carried out using the high frequency structure simulator (hfss).

the manuscript is organized as follows. in section 2, the antenna design and configuration are presented with the design steps. in section 3, simulated and measured return loss, isolation between ports and radiation pattern are presented.
in section 4, diversity performance is evaluated in terms of ecc, tarc, and dg. finally, the conclusion is provided in section 5.

2. antenna design and configuration

2.1. methodology

the flowchart of the proposed antenna, from design specification to fabrication and measurement, is shown in fig. 1. the design methodology of the proposed mimo antenna starts from the antenna design specification. based on this specification, a single element antenna is designed with the desired frequency response. the single element antenna is then modified into a two-element square patch mimo antenna. in order to achieve the desired mimo characteristics and frequency ranges, an f-shaped mimo antenna with a circular slot is designed. in the next step, all parameters of the designed antennas are optimized and the performance is checked. once the desired performance is achieved, the proposed antenna is fabricated and measured.

fig. 1 flow diagram from antenna specification to fabrication

2.2. design parameter

the front view and back view of the proposed antenna with its dimensions are shown in fig. 2. the antenna has been designed on an fr-4 dielectric substrate with thickness = 1.6 mm, copper thickness = 0.035 mm, dielectric constant = 4.4, and loss tangent = 0.02. the overall dimensions of the proposed antenna are 48 × 48 mm2. the antenna consists of two f-shaped radiating elements, placed horizontally next to each other on the top side of the substrate, and a rectangular ground plane on the bottom side of the dielectric substrate.

fig. 2 (a) top view and (b) bottom view of proposed antenna, where l = 48, lp1 = 23, lp2 = 41, lp3 = 4, lp4 = 6, lp5 = 5, w = 48, wp1 = 3, wp2 = 13, wp3 = 4, wp4 = 5, wp5 = 8 and lg1 = 20 (all in mm)

the design steps of the proposed antenna are shown in fig. 3. three design steps are needed to achieve the desired characteristics. at first, a square shaped radiator is designed along with the microstrip feed line, as shown in fig. 3(a). the square shaped antenna shows dual band performance with 1.4–2.3 ghz and 3.9–4.4 ghz bands, which are not the desired operating frequency bands. also, for the square shaped antenna, the second operating band shows poor impedance matching.

fig. 3 evolution of the proposed antenna (a) antenna 1 (b) antenna 2 (c) antenna 3

as a result, two f-shaped slots are etched from the radiator in the next stage, as shown in fig. 3(b). the two f-slots are etched from the rectangular patch with different dimensions. this f–shaped antenna operates in the 1.5–2.3 ghz and 3.8–4.3 ghz frequency ranges, which still do not match the required lte and sub-6 ghz 5g bands. to reach the desired frequency band, the second operating band has to be shifted to a lower frequency. for this purpose, a circular slot is etched from the upper part of the rectangular patch along with the f-slot. with this circular slot, the proposed antenna achieves the desired dual band performance with two operating bands, 1.8–2.3 ghz and 3.7–4.2 ghz. the width of the microstrip feed is kept the same for all three designs and is equal to 3 mm. the gap between the two radiators is 8 mm.

the simulated s-parameters for the antennas of fig. 3 are shown in fig. 4. fig. 4(a) reveals that antenna 3, which is the f–shaped structure with the circular slot, shows good performance. it is also clear from fig. 4(b) that the designed f–shaped antenna shows good impedance matching over the two frequency bands with center frequencies of 1.8 ghz and 3.9 ghz, and the isolation between the two antennas is less than –5 db and –15 db for the two bands.

fig. 4 (a) simulated s12/s21 parameters versus frequency for antenna 1, 2 and 3 and (b) simulated s11/s22 parameters versus frequency for antenna 1, 2 and 3

to analyze the behavior of the antenna, the current densities at the two resonant frequencies are determined with port 1 excited. the current distribution of the proposed antenna at 1.8 ghz and 3.9 ghz is shown in fig. 5.

fig. 5 surface current densities at (a) 1.8 ghz and (b) 3.9 ghz

the figure reveals that, at the 1.8 ghz resonant frequency, the surface current is uniformly distributed along the feed line and the lower part of the f-shaped radiator. at 3.9 ghz the current is uniformly distributed along the lower strip of the f-shaped structure.

3. result and discussion

in order to validate the numerical analysis, the proposed dual band mimo antenna has been fabricated using a pcb prototype machine. the top view and bottom view of the fabricated antenna are shown in fig. 6. the fabricated antenna is measured using an agilent n5230a vector network analyzer.

fig. 6 photograph of the fabricated dual band mimo antenna, (a) top view, and (b) bottom view

the simulated and measured s-parameters (s11/s22 and s12/s21) are compared in fig. 7. the figure shows that they are in good agreement with each other. the radiation patterns are measured inside an anechoic chamber for each element, with the other element terminated in a matched load. the radiation pattern at the two frequencies is calculated for the two principal planes (e-plane and h-plane), as shown in fig. 8.

fig. 7 the comparison of simulated and measured results (a) s11/s22 (db) and (b) s12/s21 (db)

fig. 8 simulated radiation pattern for proposed antenna at (a) 1.8 ghz and (b) 3.9 ghz

fig. 8(a) shows the simulated radiation pattern of the proposed antenna at 1.8 ghz for the e and h planes, and fig. 8(b) shows the radiation pattern at 3.9 ghz for the e and h planes. the figure shows that the antenna has a consistent radiation pattern in both frequency bands. fig. 9 shows the gain of the presented antenna for the two frequency bands. from the figure, it is clear that the designed antenna has a gain of around 4 db and 2 db at 1.8 ghz and 3.9 ghz respectively.

fig. 9 simulated gain versus frequency graph for the proposed dual band mimo antenna

a comparison of the characteristics of the proposed mimo antenna with a few other reported mimo antennas [11]–[15] is given in table 1.

table 1 comparison of presented antenna with previous literature
ref. | impedance bw (ghz) | isolation (db) | size (mm) | electrical size in guided wavelength | ecc
[11] | 1.8–5.5 | -12 | 50×90×0.76 | 1.13 × 2.0486 × 0.0173 | 0.33
[12] | 3–12 | -15 | 80×80×0.6 | 4.19 × 4.19 × 0.031 | –
[13] | 2.35–3.05 and 5.12–5.51 | -12 | 43×37×1.6 | 0.811 × 0.69 × 0.0302 | 0.001
[14] | 3.2–3.7, 5.1–5.6, 6.7–7.5 | -30 | 70×50×0.6 | 1.6886 × 1.2061 × 0.0145 | –
[15] | 5.2–6 | -25 | 100×50×0.8 | 3.95 × 1.97 × 0.0316 | <0.5
proposed antenna | 1.5–2.3 and 3.3–4.2 | -15 | 48×48×1.6 | 0.6377 × 0.6377 × 0.0213 | <0.002

4. diversity performance

for a mimo antenna, the diversity performance shows how efficiently the two antennas work individually. the diversity performance can be calculated using different parameters such as the envelope correlation coefficient (ecc), diversity gain (dg), total active reflection coefficient (tarc), etc. the capacity of each antenna to receive information individually is expressed through ecc.
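as an illustration of how the diversity metrics of this section can be evaluated from two-port s-parameters, the following python sketch may help. it is not the authors' code: it assumes the commonly used s-parameter form of ecc for a lossless two-port antenna, takes dg in db as 10·sqrt(1 − ecc²), and evaluates tarc with zero phase difference between the port excitations. the function names and the sample s-parameter values are hypothetical and are not taken from the measurements.

```python
import cmath
import math


def ecc(s11, s12, s21, s22):
    """Envelope correlation coefficient from complex two-port S-parameters."""
    num = abs(s11.conjugate() * s12 + s21.conjugate() * s22) ** 2
    den = (1 - abs(s11) ** 2 - abs(s21) ** 2) * (1 - abs(s22) ** 2 - abs(s12) ** 2)
    return num / den


def diversity_gain_db(rho):
    """Diversity gain in dB for a given ECC value (assumed factor of 10)."""
    return 10.0 * math.sqrt(1.0 - abs(rho) ** 2)


def tarc_db(s11, s12, s21, s22):
    """TARC in dB, assuming equal-amplitude, in-phase port excitations."""
    mag = math.sqrt((abs(s11 + s12) ** 2 + abs(s21 + s22) ** 2) / 2.0)
    return 20.0 * math.log10(mag)


def from_db(mag_db, phase_deg):
    """Build a complex S-parameter from magnitude in dB and phase in degrees."""
    return cmath.rect(10 ** (mag_db / 20.0), math.radians(phase_deg))


# Hypothetical sample point: |S11| = |S22| = -20 dB, |S12| = |S21| = -15 dB, all in phase.
s11 = s22 = from_db(-20.0, 0.0)
s12 = s21 = from_db(-15.0, 0.0)
rho = ecc(s11, s12, s21, s22)
print(rho, diversity_gain_db(rho), tarc_db(s11, s12, s21, s22))
```

in practice the four s-parameters would be read per frequency point from the simulated or measured data, and the three functions applied point by point.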
to achieve better performance, the value of ecc should be less than 0.2, and it can be calculated using the method proposed in [16]:

$$ECC = \frac{\left|S_{11}^{*} S_{12} + S_{21}^{*} S_{22}\right|^{2}}{\left(1 - |S_{11}|^{2} - |S_{21}|^{2}\right)\left(1 - |S_{22}|^{2} - |S_{12}|^{2}\right)} \qquad (1)$$

the diversity gain (dg) can be calculated from the envelope correlation coefficient. for the proposed dual band mimo antenna, the diversity gain in db is obtained using [16]:

$$DG = 10\sqrt{1 - |ECC|^{2}} \qquad (2)$$

the simulated ecc and dg of the proposed antenna are shown in fig. 10. the figure shows that the ecc of the proposed antenna is below 0.002 in both frequency bands, which ensures good diversity performance of the presented mimo antenna. in the same figure, the diversity gain of the proposed antenna is above 9.9 db at the resonant frequencies.

fig. 10 simulated ecc and dg of the proposed antenna

in the transmission and reception process of mimo antenna systems, the joint operation of multiple antennas affects the overall operating bandwidth and efficiency. the effect of the antenna elements on each other is expressed through the total active reflection coefficient (tarc). tarc is defined as the square root of the ratio of the total reflected power to the total incident power and is the apparent return loss of the overall mimo antenna system. for the proposed two-port, dual-band mimo system, the value of tarc can be calculated using the equation given in [17]:

$$TARC = \frac{\sqrt{|S_{11} + S_{12}|^{2} + |S_{21} + S_{22}|^{2}}}{\sqrt{2}} \qquad (3)$$

the value of tarc should be <0 db for mimo communication. the simulated tarc of the proposed mimo antenna is shown in fig. 11. the figure reveals that at both resonant frequencies the tarc is below -25 db.

fig. 11 simulated tarc of the proposed dual band mimo antenna

5. conclusion

this manuscript introduces a small mimo antenna for lte 4g and the sub-6 ghz 5g channel.
the suggested antenna operates effectively in two frequency bands, 1.8–2.3 ghz and 3.7–4.3 ghz, with bandwidths of 500 mhz and 600 mhz respectively. the measured and simulated results of the proposed antenna are compared and show good agreement with each other. the antenna also shows good diversity performance, with a low envelope correlation coefficient, good diversity gain and a low value of tarc. the radiation patterns in the e–plane and h–plane at both resonant frequencies are omnidirectional.

acknowledgement: the authors would like to acknowledge iit kanpur for fabricating the antenna at their institute.

references

[1] b. x. wang, w. q. huang and l. l. wang, "ultra-narrow terahertz perfect light absorber based on surface lattice resonance of a sandwich resonator for sensing applications", rsc advances, vol. 7, pp. 42956-42963, 2017.
[2] d. hu, t. meng, h. wang, y. ma and q. zhu, "ultra-narrow-band terahertz perfect metamaterial absorber for refractive index sensing application", results phys., vol. 19, p. 103567, 2020.
[3] f. yan, q. li, h. tian, z. wang and l. li, "ultrahigh q-factor dual-band terahertz perfect absorber with dielectric grating slit waveguide for sensing", j. phys. d: appl. phys., vol. 53, p. 235103, 2020.
[4] q. xie, g. dong, b. wang and w. huang, "design of quad-band terahertz metamaterial absorber using a perforated rectangular resonator for sensing applications", nanoscale res. lett., vol. 13, p. 137, 2018.
[5] m. janneh, a. de marcellis, e. palange, a. t. tenggara and d. byun, "design of a metasurface-based dual-band terahertz perfect absorber with very high q-factors for sensing applications", opt. commun., vol. 416, pp. 152-159, 2018.
[6] w. yin, z. shen, s. li, l. zhang and x. chen, "a three-dimensional dual-band terahertz perfect absorber as a highly sensitive sensor", front. phys., vol. 9, p. 665280, 2021.
[7] x. hu, g. xu, l. wen, h. wang, y. zhao, y. zhang, d. r. s.
cumming and q. chen, "metamaterial absorber integrated microfluidic terahertz sensors", laser photonics rev., vol. 10, pp. 962-969, 2016.
[8] l. cong, s. tan, r. yahiaoui, f. yan, w. zhang and r. singh, "experimental demonstration of ultrasensitive sensing with terahertz metamaterial absorbers: a comparison with the metasurfaces", appl. phys. lett., vol. 106, p. 031107, 2015.
[9] a. kovačević, m. potrebić and d. tošić, "sensitivity analysis of possible thz virus detection using quad-band metamaterial sensor", in proceedings of the ieee 32nd international conference on microelectronics (miel), niš, serbia, 2021, pp. 107-110.
[10] n. akter, m. m. hasan and n. pala, "a review of thz technologies for rapid sensing and detection of viruses including sars-cov-2", mdpi biosensors, vol. 11, p. 349, 2021.
[11] n. shen, p. tassin, t. koschny and c. soukoulis, "comparison of gold- and graphene-based resonant nanostructures for terahertz metamaterials and an ultra-thin graphene-based modulator", phys. rev. b, vol. 90, no. 11, p. 115437, 2014.
[12] wipl-d pro 17, 3d electromagnetic solver, wipl-d d.o.o., belgrade, serbia, 2021. available online: http://www.wipl-d.com (accessed on 29 april 2022).
[13] b. dadonaite, b. gilbertson, m. l. knight, s. trifković, s. rockman, a. laederach, l. e. brown, e. fodor and d. l. v. bauer, "the structure of the influenza a virus genome", nat. microbiol., vol. 4, no. 11, pp. 1781-1789, 2019.
[14] m. amin, o. siddiqui, h. abutarboush, m. farhat and r. ramzan, "a thz graphene metasurface for polarization selective virus sensing", carbon, vol. 176, pp. 580-591, 2021.
[15] b. wang, a. sadeqi, r. ma, p. wang, w. tsujita, k. sadamoto, y. sawa, h. r. nejad, s. sonkusale, c. wang et al, "metamaterial absorber for thz polarimetric sensing", in proceedings of the spie, terahertz, rf, millimeter, and submillimeter-wave technology and applications xi, san francisco, ca, usa, 2018, vol. 10531, pp. 1-7.
[16] f. lan, f. luo, p. mazumder, z. yang, l. meng, z.
bao, j. zhou, y. zhang, s. liang, z. shi et al, "dual-band refractometric terahertz biosensing with intense wave-matter-overlap microfluidic channel", biomed. opt. express, vol. 10, pp. 3789-3799, 2019.

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 245-256 doi: 10.2298/fuee1702245l

novel approach to modelling of lightning current derivative

karl lundengård 1, milica rančić 1, vesna javor 2, sergei silvestrov 1
1 mälardalen university, ukk, division of applied mathematics, västerås, sweden
2 university of niš, faculty of electronic eng., dept. of power engineering, niš, serbia

abstract. a new approach to mathematical modelling of the lightning current derivative is proposed in this paper. it builds on the methodology, previously developed by the authors, for representing lightning current and electrostatic discharge (esd) current waveshapes. it uses a multi-peaked form of the analytically extended function (aef) for approximation of current derivative waveshapes. the aef parameters are estimated using the marquardt least-squares method (mlsm), and the framework for fitting the multi-peaked aef to a waveshape with an arbitrary number of peaks is briefly described. the procedure is validated by a few numerical experiments, including fitting the aef to single- and multi-peaked waveshapes corresponding to measured current derivatives.

key words: analytically extended function, lightning current derivative, lightning current function, lightning stroke, marquardt least-squares method

1. introduction

besides different parameters of the lightning electromagnetic field and lightning discharge currents, which greatly endanger the functionality of power systems, electrical equipment and electronic devices, the lightning current derivative signal is often measured at tall instrumented towers, at towers on elevated terrain and at rocket-triggered stations, [1]-[8].
approximation of current derivatives is important for the calculation of lightning induced overvoltages and for further improvement of lightning discharge models [1], [5], [9]. generalizing the function for representing lightning currents from [10]-[12], the proposed multi-peaked analytically extended function (aef) has been applied by the authors to the modelling of different lightning currents, including those defined in the iec standard 62305-1 [13], slow and fast-decaying ones, as well as measured ones, see e.g. [14]-[16]. furthermore, it has recently been used in [9] and [17] for representation of the electrostatic discharge (esd) current corresponding to the iec standard 61000-4-2 waveshape as given in [18]-[19]. the aef’s parameters were fitted to the desired current waveshapes using the marquardt least-squares method (mlsm), [20].

received august 29, 2016; received in revised form november 26, 2016
corresponding author: vesna javor, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: vesna.javor@elfak.ni.ac.rs)

in this paper we explore the possibility of reproducing the waveshape of the lightning current first derivative using the aef and adjusting its non-linear parameters with the mlsm. the validity of the approximation and of this methodology is tested by a few numerical experiments on modelling of lightning current derivative signals measured at the cn tower [5]. since installation, simultaneous measurements of currents and current derivatives by rogowski coils, together with the corresponding electromagnetic field values detected by sensors and high-speed cameras at a distance of a few km from the tower, have been providing useful data for analysis, [1]-[5]. reflection coefficients are estimated for the cn tower and employed for magnetic field calculation in [5].
reflections occur at the tip of this tower, at the top and bottom of its restaurant and at the ground, as well as at the upward-propagating lightning return-stroke channel front, and they produce peaks in the current derivative waveshape. in this paper, the lightning current derivative is approximated taking into account the initial peak and the subsequent peaks in the derivative waveshape, regardless of their cause. the same procedure may be used when measured current derivatives have multi-peaked waveshapes for other reasons, e.g. due to various configurations of the terrain and some tall structures, or due to lightning current channel discontinuities and branching.

2. modelling of the lightning current derivative

2.1. analytically extended function (aef) and some of its properties

the basic building block of the multi-peaked aef is, as referred to in [18], the power exponential function (pef) given by

$$x(\beta; t) = \left(t e^{1-t}\right)^{\beta}, \quad 0 \le t, \qquad (1)$$

where the β-parameter determines the steepness of both its rising and its decaying part. the aef is constructed as a function consisting of piecewise linear combinations of pefs that have been scaled and translated to ensure that the resulting function is continuous.
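a minimal python sketch of the pef in (1) is given below to make the role of β concrete. the function name pef and the sample values are illustrative assumptions; the sketch only evaluates the formula and is not the authors' implementation.

```python
import math


def pef(beta, t):
    """Power exponential function x(beta; t) = (t * exp(1 - t))**beta for t >= 0."""
    if t < 0:
        raise ValueError("the PEF is defined for t >= 0")
    return (t * math.exp(1.0 - t)) ** beta


# Compare a gentle and a steep PEF before, at and after the peak at t = 1.
for beta in (1.0, 4.0, 16.0):
    samples = [pef(beta, t) for t in (0.5, 1.0, 2.0)]
    print(beta, samples)
```

since t·e^(1−t) equals 1 at t = 1 and is smaller everywhere else, every pef peaks at t = 1 with value 1, and a larger β makes both the rise and the decay steeper.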
in [18], the aef is defined as

$$\frac{\mathrm{d}i(t)}{\mathrm{d}t} = \begin{cases} \displaystyle\sum_{k=1}^{q-1} I_{dm_k} + I_{dm_q} \sum_{k=1}^{n_q} \eta_{q,k}\, x_{q,k}(t), & t_{m_{q-1}} \le t \le t_{m_q},\ 1 \le q \le p, \\[2mm] \displaystyle\left(\sum_{k=1}^{p} I_{dm_k}\right) \sum_{k=1}^{n_q} \eta_{q,k}\, x_{q,k}(t), & t_{m_p} \le t,\ q = p+1, \end{cases} \qquad (2)$$

where:
- $I_{dm_1}, I_{dm_2}, \ldots, I_{dm_p}$ are the differences in height between each pair of peaks,
- $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ are the times corresponding to these peaks,
- $n_q > 0$ is the number of terms in each time interval,
- $\eta_{q,k}$ are real values such that $\sum_{k=1}^{n_q} \eta_{q,k} = 1$, and
- $x_{q,k}(t)$ are pefs defined by the $\beta_{q,k}$ parameters in the following way:

$$x_{q,k}(t) = \begin{cases} \left[\dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}} \exp\!\left(1 - \dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}}\right)\right]^{\beta_{q,k}^{2}+1}, & 1 \le q \le p, \\[3mm] \left[\dfrac{t}{t_{m_p}} \exp\!\left(1 - \dfrac{t}{t_{m_p}}\right)\right]^{\beta_{q,k}^{2}}, & q = p+1, \end{cases} \qquad (3)$$

where $\Delta t_{m_q} = t_{m_q} - t_{m_{q-1}}$, with $t_{m_0} = 0$. expression (2) can be written more compactly as

$$\frac{\mathrm{d}i(t)}{\mathrm{d}t} = \begin{cases} \displaystyle\sum_{k=1}^{q-1} I_{dm_k} + I_{dm_q}\, \boldsymbol{\eta}_q^{\mathrm{T}} \mathbf{x}_q(t), & t_{m_{q-1}} \le t \le t_{m_q},\ 1 \le q \le p, \\[2mm] \displaystyle\left(\sum_{k=1}^{p} I_{dm_k}\right) \boldsymbol{\eta}_q^{\mathrm{T}} \mathbf{x}_q(t), & t_{m_p} \le t,\ q = p+1, \end{cases} \qquad (4)$$

after introducing

$$\boldsymbol{\eta}_q = [\eta_{q,1}\ \eta_{q,2}\ \ldots\ \eta_{q,n_q}]^{\mathrm{T}}, \qquad \mathbf{x}_q(t) = [x_{q,1}(t)\ x_{q,2}(t)\ \ldots\ x_{q,n_q}(t)]^{\mathrm{T}}.$$

the first derivative of the multi-peaked aef corresponds to the second derivative of the lightning discharge current, i(t), and can be easily found since the aef consists of elementary functions. its compact form is given by

$$\frac{\mathrm{d}^{2}i(t)}{\mathrm{d}t^{2}} = \begin{cases} I_{dm_q}\, \dfrac{t_{m_q} - t}{t - t_{m_{q-1}}}\, \dfrac{\boldsymbol{\eta}_q^{\mathrm{T}} \mathbf{B}_q\, \mathbf{x}_q(t)}{\Delta t_{m_q}}, & t_{m_{q-1}} \le t \le t_{m_q},\ 1 \le q \le p, \\[3mm] \displaystyle\left(\sum_{k=1}^{p} I_{dm_k}\right) \dfrac{t_{m_p} - t}{t\, t_{m_p}}\, \boldsymbol{\eta}_q^{\mathrm{T}} \mathbf{B}_q\, \mathbf{x}_q(t), & t_{m_p} \le t,\ q = p+1, \end{cases} \qquad (5)$$

where $\mathbf{B}_q$ are diagonal matrices:

$$\mathbf{B}_q = \begin{cases} \operatorname{diag}\!\left(\beta_{q,1}^{2}+1,\ \beta_{q,2}^{2}+1,\ \ldots,\ \beta_{q,n_q}^{2}+1\right), & 1 \le q \le p, \\ \operatorname{diag}\!\left(\beta_{q,1}^{2},\ \beta_{q,2}^{2},\ \ldots,\ \beta_{q,n_q}^{2}\right), & q = p+1. \end{cases}$$

based on this expression, it is easy to see that the current’s second derivative is also continuous, since it will be zero at each $t_{m_q}$.
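the piecewise construction can be sketched in python as follows. the sketch assumes that the first interval starts at t = 0, uses the exponent β² + 1 on the rising intervals and β² on the decaying part, and normalises the decaying part by the last peak time; the function names and the two-peak parameter values are hypothetical.

```python
import math


def pef(beta, tau):
    """Power exponential function (tau * exp(1 - tau))**beta."""
    return (tau * math.exp(1.0 - tau)) ** beta


def aef(t, t_peaks, heights, etas, betas):
    """Evaluate a multi-peaked AEF at time t >= 0.

    t_peaks : peak times t_m1 < ... < t_mp
    heights : height differences I_dm1, ..., I_dmp between consecutive peaks
    etas, betas : p + 1 lists of weights and beta-parameters, one list per
                  interval; the last list belongs to the decaying part
    """
    p = len(t_peaks)
    q = 0  # 0-based index of the interval containing t; q == p is the decaying part
    while q < p and t > t_peaks[q]:
        q += 1
    if q < p:
        t_prev = t_peaks[q - 1] if q > 0 else 0.0
        tau = (t - t_prev) / (t_peaks[q] - t_prev)
        shape = sum(e * pef(b * b + 1.0, tau) for e, b in zip(etas[q], betas[q]))
        return sum(heights[:q]) + heights[q] * shape
    tau = t / t_peaks[-1]
    shape = sum(e * pef(b * b, tau) for e, b in zip(etas[p], betas[p]))
    return sum(heights) * shape


# Hypothetical two-peak example: the weights in each interval sum to 1.
t_peaks = [1.0, 3.0]
heights = [10.0, -4.0]
etas = [[1.0], [0.4, 0.6], [1.0]]
betas = [[1.0], [0.5, 2.0], [1.5]]
print([aef(t, t_peaks, heights, etas, betas) for t in (0.0, 1.0, 2.0, 3.0, 5.0)])
```

because every pef equals 1 at the end of its interval and the weights sum to 1, the function takes the accumulated peak heights exactly at the peak times, which is what makes the pieces join continuously.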
the integral of the aef corresponds to the lightning discharge current i(t) and is also relatively straightforward to find, since the integral of the pef can be written using the lower incomplete gamma function ([21]), i.e.

$$\int_{t_0}^{t_1} x(\beta; t)\,\mathrm{d}t = \frac{e^{\beta}}{\beta^{\beta+1}}\left(\gamma(\beta+1, \beta t_1) - \gamma(\beta+1, \beta t_0)\right) = \frac{e^{\beta}}{\beta^{\beta+1}}\,\Delta\gamma(\beta+1, t_1, t_0), \qquad (6)$$

where $\gamma(\beta, t) = \int_0^t \tau^{\beta-1} e^{-\tau}\,\mathrm{d}\tau$ is the incomplete gamma function. combining (6) and (2) we obtain the integral of the rising part of the aef

$$\begin{aligned} \int_{t_a}^{t_b} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\,\mathrm{d}t ={} & (t_{m_a} - t_a)\sum_{k=1}^{a-1} I_{dm_k} + I_{dm_a}\sum_{k=1}^{n_a}\eta_{a,k}\, g_a(t_a, t_{m_a}) \\ & + \sum_{q=a+1}^{b-1}\Delta t_{m_q}\left(\sum_{k=1}^{q-1} I_{dm_k} + I_{dm_q}\sum_{k=1}^{n_q}\eta_{q,k}\,\hat{g}\!\left(\beta_{q,k}^{2}+1\right)\right) \\ & + (t_b - t_{m_{b-1}})\sum_{k=1}^{b-1} I_{dm_k} + I_{dm_b}\sum_{k=1}^{n_b}\eta_{b,k}\, g_b(t_{m_{b-1}}, t_b), \end{aligned} \qquad (7)$$

for $0 \le t_a < t_b \le t_{m_p}$, $t_{m_{a-1}} \le t_a \le t_{m_a}$, $t_{m_{b-1}} \le t_b \le t_{m_b}$ and $a < b$, with

$$g_q(t_0, t_1) = \begin{cases} \Delta t_{m_q}\,\dfrac{e^{\beta_{q,k}^{2}+1}}{\left(\beta_{q,k}^{2}+1\right)^{\beta_{q,k}^{2}+2}}\,\Delta\gamma\!\left(\beta_{q,k}^{2}+2,\ \dfrac{t_1 - t_{m_{q-1}}}{\Delta t_{m_q}},\ \dfrac{t_0 - t_{m_{q-1}}}{\Delta t_{m_q}}\right), & 1 \le q \le p, \\[3mm] t_{m_p}\,\dfrac{e^{\beta_{q,k}^{2}}}{\left(\beta_{q,k}^{2}\right)^{\beta_{q,k}^{2}+1}}\,\Delta\gamma\!\left(\beta_{q,k}^{2}+1,\ \dfrac{t_1}{t_{m_p}},\ \dfrac{t_0}{t_{m_p}}\right), & q = p+1, \end{cases}$$

and

$$\hat{g}(\beta) = \frac{e^{\beta}}{\beta^{\beta+1}}\,\gamma(\beta+1, \beta).$$

the integration formula corresponding to the decaying part is

$$\int_{t_0}^{t_1} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\,\mathrm{d}t = \left(\sum_{k=1}^{p} I_{dm_k}\right)\sum_{k=1}^{n_{p+1}}\eta_{p+1,k}\, g_{p+1}(t_0, t_1), \quad t_{m_p} \le t_0 < t_1 < \infty,$$

i.e.

$$\int_{t_{m_p}}^{\infty} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\,\mathrm{d}t = t_{m_p}\left(\sum_{k=1}^{p} I_{dm_k}\right)\sum_{k=1}^{n_{p+1}}\eta_{p+1,k}\,\tilde{g}\!\left(\beta_{p+1,k}^{2}\right), \qquad (8)$$

with

$$\tilde{g}(\beta) = \frac{e^{\beta}}{\beta^{\beta+1}}\left(\Gamma(\beta+1) - \gamma(\beta+1, \beta)\right)$$

and $\Gamma(\beta) = \int_0^{\infty} t^{\beta-1} e^{-t}\,\mathrm{d}t$, the gamma function, [21].

2.2. marquardt least-squares method (mlsm)

a detailed explanation of the mlsm algorithm is given in [15], [16]; here we only go over the parts specific to the multi-peaked aef. the mlsm is used for estimating the β-parameters, and from these the corresponding η-parameters are calculated. in each iteration step, the η-parameters are obtained using the regular least-squares method, since for fixed β-parameters the aef is linear in η.
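the linear step can be illustrated for one rising interval with two terms. in this python sketch the constraint η1 + η2 = 1 is imposed by substituting η2 = 1 − η1, which reduces the fit to a one-parameter least-squares problem with a closed-form solution. this elimination, the function names and the synthetic data are assumptions made for illustration; the full mlsm iteration over β described in [15], [16] is not reproduced.

```python
import math


def pef(beta, tau):
    """Power exponential function (tau * exp(1 - tau))**beta."""
    return (tau * math.exp(1.0 - tau)) ** beta


def fit_eta_two_terms(taus, ys, beta1, beta2):
    """Least-squares weights (eta1, eta2) with eta1 + eta2 = 1 for fixed betas.

    taus : normalised times (t - t_prev) / delta_t inside one rising interval
    ys   : data with the previous peak level subtracted and divided by the
           height difference of the interval, so the interval runs from 0 to 1
    """
    num = 0.0
    den = 0.0
    for tau, y in zip(taus, ys):
        x1 = pef(beta1 ** 2 + 1.0, tau)
        x2 = pef(beta2 ** 2 + 1.0, tau)
        d = x1 - x2  # regressor left after substituting eta2 = 1 - eta1
        num += d * (y - x2)
        den += d * d
    eta1 = num / den
    return eta1, 1.0 - eta1


# Synthetic, noise-free data generated with known weights (hypothetical values).
taus = [0.1 * i for i in range(1, 10)]
true_eta = (0.3, 0.7)
ys = [true_eta[0] * pef(0.5 ** 2 + 1.0, tau) + true_eta[1] * pef(2.0 ** 2 + 1.0, tau)
      for tau in taus]
eta1, eta2 = fit_eta_two_terms(taus, ys, 0.5, 2.0)
print(eta1, eta2)
```

with more than two terms per interval the same idea leads to a small linear system instead of a single ratio, and the two β-values must differ so that the regressor is not identically zero.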
based on these η-parameters, a new set of β-parameters is found. the mlsm uses a jacobian matrix, denoted by j, containing partial derivatives of the residuals. the least-squares fitting of the multi-peaked aef to a set of data points can be done separately between each pair of peaks (and after the final one), and the corresponding j matrix is

$$\mathbf{J}_q = \begin{bmatrix} p_{q,1}(t_{q,1}) & p_{q,2}(t_{q,1}) & \cdots & p_{q,n_q}(t_{q,1}) \\ p_{q,1}(t_{q,2}) & p_{q,2}(t_{q,2}) & \cdots & p_{q,n_q}(t_{q,2}) \\ \vdots & \vdots & \ddots & \vdots \\ p_{q,1}(t_{q,k_q}) & p_{q,2}(t_{q,k_q}) & \cdots & p_{q,n_q}(t_{q,k_q}) \end{bmatrix}, \qquad (9)$$

where $k_q$ is the number of data points between the (q-1)th and qth peak, $t_{q,r}$ is the time corresponding to these data points, and

$$p_{q,k}(t_{q,r}) = \left.\frac{\partial}{\partial \beta_{q,k}}\frac{\mathrm{d}i(t)}{\mathrm{d}t}\right|_{t = t_{q,r}} = \begin{cases} 2\, I_{dm_q}\, \eta_{q,k}\, \beta_{q,k}\, h_q(t_{q,r})\, x_{q,k}(t_{q,r}), & 1 \le q \le p, \\[2mm] 2\left(\displaystyle\sum_{j=1}^{p} I_{dm_j}\right) \eta_{q,k}\, \beta_{q,k}\, h_q(t_{q,r})\, x_{q,k}(t_{q,r}), & q = p+1, \end{cases}$$

with

$$h_q(t) = \begin{cases} \ln\!\left(\dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}}\right) - \dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}} + 1, & 1 \le q \le p, \\[3mm] \ln\!\left(\dfrac{t}{t_{m_p}}\right) - \dfrac{t}{t_{m_p}} + 1, & q = p+1. \end{cases}$$

3. aef representing measured lightning current derivative examples

in this section we validate our model by representing measured lightning current derivative data obtained at the 553 m cn tower, toronto, canada, [5]. the time and current values corresponding to the aef peaks were chosen manually, and the rest of the aef parameters were obtained using the framework briefly described in section 2.2. the number of time intervals and the number of terms in each of them vary from example to example. the general notation aefp(n1, …, np), with nq, q = 1, …, p, is used to denote an aef with p peaks and a chosen number of terms nq in each time interval q.

3.1. single-peaked waveshape

the first example illustrates the application of a single-peaked aef to the representation of the measured initial current derivative impulse occurring in the first 0.5 µs, given in [5, fig. 4].
the best fitting was obtained choosing two terms in each of the two time intervals: 0-tm and tm-0.5 s (the moment tm corresponds to the maximum current derivative). current derivative value at t=0 is treated as the first point of approximation, so there are 4 terms in total, for these 2 intervals. obtained aef2(2,2) model is illustrated in fig. 1a along with the measured data, data points used for the mlsm fitting, and the locations of peaks observed in this waveshape. using the expressions (7) and (8) we also obtained the aef’s integral, i.e. the lighting discharge current. it can be observed in fig. 1b along with the numerically integrated measured data. 250 k. lundengård, m. ranĉić, v. javor, s. silvestrov a) b) fig. 1 a) aef2(2,2) representing measured lightning current derivative from [5, fig. 4], and b) the corresponding lightning current 3.2. multi-peaked waveshapes in this part we attempt modeling of the measured current derivatives data that include the initial and a number of subsequent impulses. the recoded waveshapes have great number of peaks and therefore is harder to model them using standard functions, but these are more suitable for modelling by the multi-peaked aef. the first example corresponds to an event of lightning discharge measured at the cn tower, using the rogowski coil positioned at 474 m, illustrated in [5, fig. 2] in 10 s. such current derivative waveshape corresponds to typical fast-rising negative lightning discharge which occurs in about 80% of the registered cases (in 126 flashes out of 160 [5]). the complexity of the aef used for modelling of such multi-peaked waveshapes depends on the desired level of accuracy of the data representation. novel approach to modelling of lighting current derivative 251 in fig. 
2 are presented two aefs with different number of peaks, including the starting current derivative value at t = 0 and other peaks which are chosen such that they correspond to local maxima only: a) aef6(1,2,2,2,2,2) with 6 intervals and 11 terms in total, and b) aef8(1,2,2,2,2,2,2,2) with 8 intervals and 15 terms in total. the increased number of time intervals fixes representation of the waveshape part corresponding to the period between the fourth and fifth peak of aef6, and also after its sixth peak, so that the total number of intervals in aef8 is increased by 2, whereas the number of terms by 4. a) b) fig. 2 multi-peaked aefs (using starting point and maxima only) representing measured lightning current derivative from [5, fig. 2]: a) aef6(1,2,2,2,2,2) with 6 peaks, b) aef8(1,2,2,2,2,2,2,2) with 8 peaks 252 k. lundengård, m. ranĉić, v. javor, s. silvestrov additional improvement is needed and could be achieved by further segmentation and including also local minima, so as by increasing the number of terms. two such aef models are illustrated in figs. 3a and 3b, both with thirteen peaks, but for different number of terms chosen to represent some of its intervals. thirteen peaks in aef13 include 4 minima added to aef8 and also one more maximum at its ending part, so that the number of peaks is increased from 8 (in fig.2b) to 13 (in figs. 3a and 3b). these two aefs are denoted by a) aef13a(1,1,1,1,1,2,1,1,1,1,2,2,1) with 13 intervals and 16 terms in total, and b) aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2) with 13 intervals and 21 terms in total, where the bold numbers in brackets point out to the changed number of terms, in some intervals increased from 1 to 2. a) b) fig. 3 multi-peaked aefs with 13 peaks (using starting point, 8 maxima and 4 minima) representing measured lightning current derivative from [5, fig. 
2]: a) aef13a(1,1,1,1,1,2,1,1,1,1,2,2,1), b) aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2)

results for the same lightning current derivative measured at the cn tower are given for the first 7 s in figs. 4a and 4b, where the aefs are fitted to the data from [5, fig. 6]. the model aef7(1,2,2,2,2,2,2) with 7 peaks (starting point and maxima only) and 13 terms is presented in fig. 4a; it is able to capture the initial impulse and the subsequent peaks due to reflections at the tower discontinuities. aef7 has one more peak added at the end of aef6, and 2 more terms in total. in fig. 4b, the aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) model is presented with a total of 13 peaks (the starting point, 8 maxima and 4 minima), which models the measured data set almost perfectly. it has 23 terms in total, 4 terms added and 2 excluded compared to aef13b. the difference between the two is that 1 peak was added for aef13c between the tenth and eleventh peak of aef13b, which significantly improved the approximation, and the thirteenth peak was excluded from the end of aef13b.

fig. 4 multi-peaked aef representing the measured current derivative from [5, fig. 6]: a) aef7(1,2,2,2,2,2,2) with 7 peaks (using starting point and maxima only), b) aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) with 13 peaks (starting point, 8 maxima and 4 minima)

figure 5 illustrates the lightning discharge currents corresponding to the multi-peaked current derivative waveshapes modelled above. again, expressions (7) and (8) are employed to calculate them, and the numerically integrated measured data are also given for comparison. fig. 5a corresponds to the aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2) model shown in fig. 3b, while fig. 5b relates to the model aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) from fig. 4b.

fig. 5 lightning currents corresponding to derivatives modelled by aefs: a) aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2) from fig.
3b, b) aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) from fig. 4b

4. conclusions

approximation of lightning current derivatives is needed for the calculation of lightning induced effects and for the improvement of lightning discharge models. the suitability of the multi-peaked aef for representing lightning current derivatives is demonstrated in this paper through a few examples. the non-linear parameters of the aef are calculated using the marquardt least-squares method (mlsm), so that the measured current derivative signals [5] are well approximated. the approximation by aefs in this paper is done for single- and multi-peaked current derivative waveshapes. increasing the number of maxima and minima, as well as the total number of terms, improves the approximation of the current derivative by the aef. the lightning current waveshape is obtained with great accuracy as the analytically integrated aef representation of the measured derivative. multi-peaked lightning current derivatives are characteristic of lightning discharges to tall towers and high structures on elevated terrain, but also of subsequent lightning strokes and of lightning current channels with discontinuities and branching. further work should be aimed at including such a current and its derivative function in lightning stroke models in order to obtain the lightning electromagnetic field measured at certain distances.

references

[1] k. elrodesly and a. m. hussein, "cn tower lightning return-stroke current simulation", journal of lightning research, vol. 4, suppl 2: m3, pp. 60-70, 2012.
[2] a. m. hussein, m. milewski, and w. janischewskyj, "correlating the characteristics of the cn tower lightning return-stroke current with those of its generated electromagnetic pulse", ieee transactions on electromagnetic compatibility, vol. 50, no. 3, pp. 642-650, aug. 2008.
[3] a. m. hussein, m. milewski, e. burnazovic and w.
janischewskyj, "current waveform characteristics of cn tower negative and positive lightning", in proceedings of the x international symposium on lightning protection, curitiba, brazil, 2009, pp. 451-456.
[4] b. kordi, r. moini, w. janischewskyj, a. m. hussein, v. o. shostak and v. a. rakov, "application of the antenna theory model to a tall tower struck by lightning", journal of geophysical research, vol. 108, no. d17, 4542, doi: 10.1029/2003jd003398, 2003.
[5] m. milewski and a. m. hussein, "tall-structure lightning return-stroke modelling", in proceedings of the 14th international middle east power systems conference (mepcon'10), cairo university, egypt, 2010, paper id 313, pp. 947-952.
[6] f. rachidi, w. janischewskyj, a. m. hussein, c. a. nucci, s. guerrieri, b. kordi and j-s. chang, "current and electromagnetic field associated with lightning-return strokes to tall towers", ieee transactions on electromagnetic compatibility, vol. 43, no. 3, pp. 356-367, aug. 2001.
[7] v. a. rakov, "transient response of a tall object to lightning", ieee transactions on electromagnetic compatibility, vol. 43, no. 4, pp. 654-661, 2001.
[8] m. a. uman, j. schoene, v. a. rakov, k. j. rambo and g. h. schnetzer, "correlated time derivatives of current, electric field intensity, and magnetic flux density for triggered lightning at 15 m", journal of geophysical research, vol. 107, no. d13, doi: 10.1029/2000jd000249, 2002.
[9] v. javor, "an analytically extended function for representing the lightning current first derivative", in proceedings of the int. colloquium on lightning and power systems, bologna, italy, 2016, p13_s3.2, pp. 1-8.
[10] v. javor and p. d. rancic, "a channel-base current function for lightning return-stroke modeling", ieee transactions on electromagnetic compatibility, vol. 53, no. 1, pp. 245-249, feb. 2011.
[11] v.
javor, "multi-peaked functions for representation of lightning channel-base currents", in proceedings of 2012 international conference on lightning protection iclp, vienna, austria, 2012, pp. 1-4.
[12] v. javor, "new function for representing iec 61000-4-2 standard electrostatic discharge current", facta universitatis, series: electronics and energetics, vol. 27(4), pp. 509-520, 2014.
[13] iec 62305-1, protection against lightning - part i: general principles, ed. 2.0, 2010-12.
[14] k. lundengård, m. rančić, v. javor and s. silvestrov, "application of the multi-peaked analytically extended function to representation of some measured lightning currents", serbian journal of electrical engineering, vol. 13(2), pp. 1-11, 2016.
[15] k. lundengård, m. rančić, v. javor and s. silvestrov, "estimation of parameters for the multi-peaked aef current functions", methodology and computing in applied probability, springer, pp. 1-15, 2016. doi: 10.1007/s11009-016-9501-z
[16] k. lundengård, m. rančić, v. javor and s. silvestrov, "on some properties of the multi-peaked analytically extended function for approximation of lightning discharge currents", engineering mathematics i: electromagnetics, fluid mechanics, material physics and financial engineering, series: springer proceedings in mathematics & statistics, vol. 178, eds. s. silvestrov and m. rančić, springer, heidelberg, 2016, pp. 151-172, ebook isbn 978-3-319-42082-0; hardcover isbn 978-3-319-42081-3; doi 10.1007/978-3-319-42082-0
[17] k. lundengård, m. rančić, v. javor and s. silvestrov, "multi-peaked analytically extended function representing electrostatic discharge (esd) currents", in aip conference proceedings of icnpaa 2016, la rochelle, france, 2016, pp. 1-10.
[18] emc - part 4-2: testing and measurement techniques - electrostatic discharge immunity test. iec international standard 61000-4-2, basic emc publication, 1995+a1:1998+a2:2000.
[19] emc - part 4-2: testing and measurement techniques - electrostatic discharge immunity test. iec international standard 61000-4-2, basic emc publication, ed. 2, 2009.
[20] d. w. marquardt, "an algorithm for least-squares estimation of nonlinear parameters", journal of the society for industrial and applied mathematics, vol. 11(2), pp. 431-441, 1963.
[21] m. abramowitz and i. a. stegun, handbook of mathematical functions with formulas, graphs, and mathematical tables. dover, new york, 1964.

facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 259-273 doi: 10.2298/fuee1402259n

physical modeling of electrical and dielectric properties of high-k ta2o5 based mos capacitors on silicon

nenad novkovski
institute of physics, faculty of natural sciences and mathematics, university "ss. cyril and methodius", arhimedova 3, 1000 skopje, macedonia

abstract. in this paper we present an integral physical model for describing the electrical and dielectric properties of mos structures containing a dielectric stack composed of a high-k dielectric (with emphasis on pure and doped ta2o5) and an interfacial silicon dioxide or silicon oxynitride layer. based on the model, an equivalent circuit of the structure is proposed. the validity of the model was demonstrated for structures containing different metal gates (al, au, pt, w, tin, mo) and different ta2o5 based high-k dielectrics, grown on bare or nitrided silicon substrates. the model describes very well the i-v characteristics of the considered structures, as well as the frequency dependence of the capacitance in accumulation. stress-induced leakage currents are also effectively analyzed by the use of the model.

key words: high-k dielectrics, metal-insulator-silicon structures, conduction mechanisms in dielectrics, leakage currents

1.
introduction

received february 5, 2014
corresponding author: nenad novkovski, institute of physics, faculty of natural sciences and mathematics, university "ss. cyril and methodius", arhimedova 3, 1000 skopje, macedonia (e-mail: nenad@iunona.pmf.ukim.edu.mk)

further scaling of microelectronic devices, required for new generations of integrated circuits, is confronting multiple challenges, an important one being the fabrication of the ultrathin dielectric layers used particularly in mosfets and drams. as the lateral size of devices decreases, the equivalent oxide thickness must also decrease in order to obtain the required capacitance. this requirement can be met either by decreasing the physical thickness or by increasing the permittivity of the dielectric (gate oxide for mosfets, dielectric in mos capacitors of drams). doped, mixed and laminate high-permittivity (high-k) dielectric stacks attract increasing attention as a solution for further improvement of the electrical and dielectric properties [1]-[13]. it has been shown that ta2o5, known as one of the most attractive dielectrics for nanoscale dynamic random-access memories, can be improved further by doping with suitable elements [14]. detailed studies of the properties of tantalum pentoxide doped with al, ti and hf and mixed with hfo2 have been reported [15]-[30]. in addition, it has been shown that nitridation of the si substrate substantially improves the electrical, dielectric and reliability properties of metal-high-k-si structures [31]. in [32] we described in detail a comprehensive model for the i-v characteristics of metal-ta2o5/sio2-si structures.
in this work we present in an integral form the generalization of the comprehensive model for mis structures containing a dielectric stack composed of a high-k dielectric (particularly pure and doped ta2o5) and an interfacial silicon dioxide or silicon oxynitride layer, and review the important results obtained by using specific cases of this model for various mos structures of the considered type.

2. theoretical model

2.1. band diagram

the band diagram of the considered structure in the case of an al gate is shown in fig. 1.

fig. 1 band diagram of the considered structure (labels recoverable from the figure: vacuum level, al, high-k, sio2 or sioxny, si, 4.05 ev, 4.25 ev, 1.12 ev, ec, ef, ev, and the band offsets defined below)

in fig. 1, $\Delta E_{hk}$ and $\Delta E_{if}$ are the bandgaps of the high-k and the interfacial layer, respectively. $\phi_e'$ and $\phi_h'$ are the band offsets for electrons and holes, respectively, at the contact between the high-k and the interfacial layer, while $\phi_e$ and $\phi_h$ are the band offsets for electrons and holes, respectively, at the contact between the interfacial layer and the silicon substrate. $\phi_{ms}$ is the work function difference between the metal gate and si, while $\phi_s$ is the schottky barrier height for electrons. in the case of an al gate, a ta2o5 high-k dielectric and an sio2 interfacial layer the values are those summarized in table 1. the work function difference, $\phi_{ms}$, depends on the si substrate doping and is the same as in the case of the corresponding metal-sio2-si structure. for p-type substrates it is around 0.5 ev.

table 1 values of bandgaps and band offsets for al-ta2o5/sio2-si structures
$\Delta E_{hk}$ (ev) | $\Delta E_{if}$ (ev) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\phi_e'$ (ev) | $\phi_h'$ (ev) | $\phi_s$ (ev)
4.4 | 8.97 | 3.15 | 4.97 | 3.06 | 1.51 | 0.29

2.2. conduction mechanisms

the conduction mechanisms that have to be considered in the general case for the interfacial layer are:
• hopping conduction, which is a result of the quantum diffusion of electrons between the localized states in the insulator, typical of disordered materials.
this is a bulk-limited conduction mechanism, and hence it does not depend on the gate voltage polarity. since the current density in this case is a linear function of the electric field, we can consider it as a conductivity of ohmic type.
• trap-assisted inelastic tunneling [33]-[34]. electrons tunnel from the silicon to the traps in the sio2 layer. as sio2 is an amorphous material with low trap density, this effect is expected to be observed only in films where traps are created as a result of stress, radiation or process induced damage. in the case of an sioxny interfacial layer a significantly higher density of traps is to be expected. however, this density is still very low compared to that of typical high trap density materials.
• direct tunneling (through a trapezoidal barrier) and fowler-nordheim injection (through a triangular barrier) into the interfacial layer. the tunneling current can be created by the electrons or the holes from the si substrate. the barrier for the tunneling of holes is different from that for electrons, thus a remarkable asymmetry can be observed between the opposite polarities. a particular mechanism involving both the sioxny and the high-k layer is the tunneling through a double barrier (through a trapezoidal barrier in sio2 and a triangular barrier in the high-k layer).

the conduction mechanisms that have to be considered for the high-k dielectric are:
• the poole-frenkel mechanism, which is bulk-limited, and hence independent of the gate bias polarity. electrons are excited to the conduction band from the traps by field-enhanced thermal emission and they drift through the layer. because of the high defect density, they are easily trapped by other positively charged defects. new electrons are released from other traps, thus transporting the charge step by step from one surface of the film to the opposite one (fig. 2). when the gate is negative, electrons first need to enter the insulator from the metal gate.
it is to be noted that they do not need to obtain enough energy to enter the conduction band, but just to move to a defect-related state in the vicinity of the metal surface. the activation energies of the defects responsible for the poole-frenkel emission in ta2o5 are 0.2 ev (type a, [35]) and 0.8 ev (type d, most probably the first ionization level of the double-donor oxygen vacancy, [36]). they are close to or lower than the metal-gate fermi level (0.29 ev under the conduction band of ta2o5). we estimated the tunneling probability from the al gate to the neighboring traps to be so high that extremely high current densities of the order of 100 a/cm^2 can be attained for a voltage drop of only a few mv.
• schottky emission, which is an electrode-limited effect. schottky conduction is excluded for the gate positively biased, because the side of the high-k layer near the negative electrode is not in direct contact with a metal or semiconductor. for the gate negatively biased, the barrier is low (for ta2o5 only 0.29 ev), and hence schottky emission is to be expected. however, it is not expected to be a current-limiting mechanism, because the electrons injected in this way are quickly trapped in the high-k layer near the contact with the metal, and the transport continues by poole-frenkel emission from the traps. namely, the pure schottky effect occurs when electrons are injected from the metal into vacuum. the situation is similar when they are injected into a medium where they can almost freely traverse the distance from the injecting to the opposite electrode, as is the case with ultra-thin sio2 or sioxny if the defect density is fairly low. for metals with higher absolute values of the work function this issue requires further consideration. we observed a particular effect of charge trapping at the interface between the metal gate and the high-k dielectric for au and pt [37]-[39].
although schottky emission from the metal to the high-k conduction band is practically impossible, an emission to the traps can substantially influence the leakage currents. for example, in the case of ta2o5 and a pt electrode, the fermi level in the metal is about 0.6 ev lower than the trapping level of the type d defect. in that case the filling of the type d traps can occur by thermal emission from the metal, leading to a schottky-like effect at low applied voltages, as was observed on au-ta2o5-pt-si structures with the pt electrode negatively biased [40]. this issue requires deeper investigation in a separate study on metal-insulator-metal structures. one possible approach to this problem is to use the multi-step trap-assisted tunneling model, as was done in [41] for metal-al2o3-si structures.

fig. 2 illustration of the poole-frenkel conduction mechanism

• hopping conduction in the ta2o5 layer is of much lower importance, because the poole-frenkel mechanism already gives a much higher conductivity in ta2o5 than the hopping conductivity in sio2. specifically, when ta2o5 is polycrystalline, as is the case with the films studied here [42], the hopping conductivity is very weak, while the trap density (related to oxygen vacancies, grain boundaries etc.) becomes extremely high. therefore, it is reasonable to neglect the hopping conductivity.

2.2. differences between the cases of positive and negative gate

in the case of the gate positively biased, the electrons that tunnel through the sio2 barrier enter the ta2o5 conduction band. they drift for a small distance and then become trapped, but new electrons are subsequently emitted from the traps and continue the transport, step by step, until entering the metal (fig. 3).

fig.
3 conduction mechanisms at positive gate

in the case of the gate negatively biased, some electrons from the traps near the ta2o5/sio2 interface can move to the localized states in the sio2 layer and then, by quantum diffusion, contribute to the hopping conduction. tunneling of electrons through the sio2 layer from the ta2o5 layer and of holes from the si substrate could occur. the usual assumption that the electron current gives the dominant contribution in this case is not valid, because fowler-nordheim and direct tunneling are possible only where an electron gas from a metal or semiconductor is in contact with an sio2 surface [43]. there, the dominant part of the electrons moving towards this surface is reflected, while a small part tunnels through the sio2 layer, entering the opposite electrode (direct tunneling) or the sio2 conduction band (fowler-nordheim tunneling). in the case of an insulator, the density of electrons in the conduction band is practically zero and electron tunneling is practically impossible. therefore only the holes from the substrate contribute to the tunneling current [44]. for sufficiently high fields, the holes injected from the si substrate enter the valence band of the ta2o5 layer. because of the high trap density, after passing a small distance they recombine with the electrons on the traps. special attention has to be devoted to the case of lower fields, where the holes cannot tunnel to the valence band (fig. 4). other authors [45] have attempted to describe a similar situation by double barrier tunneling.

fig.
4 conduction mechanisms at negative gate

[labels of figs. 3 and 4 (layers: si, sio2 or sioxny, high-k, metal). positive gate: tunneling of electrons; trapping of the electrons injected through sio2 into the high-k conduction band; poole-frenkel transport of electrons. negative gate: tunneling of holes; trapping of holes; recombination of the electrons from the high-k traps with the holes injected through sio2; poole-frenkel transport of electrons.]

our estimations in connection with the proposed comprehensive model showed poor agreement with the experimental results if a double barrier tunneling mechanism is invoked. the reason is that the dominant conduction mechanism for the ta2o5 layer is poole-frenkel conduction and not tunneling. once the charge carriers enter the forbidden gap of the tantalum pentoxide, they become trapped after a short distance, because the defect related trap density there is extremely high. tunneling is typical of sio2 films and is observed in si3n4 films of very high quality, where the defect density is low and the injected charge carriers can pass long distances (of the order of 100 nm) with a small probability of being trapped. in some cases (sio2 thinner than 4 nm) even ballistic transport is observed [46]. the most probable route of the holes injected into the ta2o5 forbidden gap is to be first trapped near the ta2o5/sioxny interface and then to recombine with electrons from other traps or from the conduction band (fig. 4). a similar situation can also appear in the case of low fields for the opposite gate polarity.

2.3. construction of the model

the current density due to the hopping conductivity in sio2 ($J_{hc}$) is described by the following expression:

$$J_{hc} = \sigma_{if} E_{if} \qquad (1)$$

where $\sigma_{if}$ is the temperature dependent hopping conductivity and $E_{if}$ is the field in the interfacial layer.
the direct tunneling current density through the interfacial layer ($J_{td}$) is given by the following expression:

$$J_{td} = \frac{q^{2} E_{if}^{2}}{8\pi h \phi} \exp\left\{ -\frac{8\pi \sqrt{2 q m^{*}}\, \phi^{3/2}}{3 h E_{if}} \left[ 1 - \left( 1 - \frac{E_{if} d_{if}}{\phi} \right)^{3/2} \right] \right\} \qquad (2)$$

and that for the fowler-nordheim injection ($J_{fn}$) by

$$J_{fn} = \frac{q^{2} E_{if}^{2}}{8\pi h \phi} \exp\left( -\frac{8\pi \sqrt{2 q m^{*}}\, \phi^{3/2}}{3 h E_{if}} \right), \qquad (3)$$

where $q$ is the electron charge, $h$ is planck's constant, $m^{*}$ is the effective tunneling mass of the charge carriers injected through the interfacial layer, $d_{if}$ is the thickness of the interfacial layer, $\phi$ is the tunneling barrier height and $E_{if}$ is the electric field in the interfacial layer. the total current density flowing through the interfacial layer ($J_{if}$) is given by the following expression:

$$J_{if} = J_{hc} + \begin{cases} J_{td}, & E_{if} d_{if} < \phi \\ J_{fn}, & E_{if} d_{if} \ge \phi \end{cases} \qquad (4)$$

and the voltage drop on the interfacial layer ($V_{if}$) is

$$V_{if} = d_{if} E_{if}. \qquad (5)$$

the current density due to the poole-frenkel effect in the high-k layer ($J_{pf}$) is described by the following expression:

$$J_{pf} = \sigma_{hk}(0)\, E_{hk} \exp\left( \frac{1}{r k T} \sqrt{\frac{q^{3} E_{hk}}{\pi \varepsilon_{0} k_{t}}} \right), \qquad (6)$$

where $\sigma_{hk}(0)$ is a temperature dependent defect related constant having the dimensions of conductivity, $r$ is the degree of compensation [47], $k$ is the boltzmann constant, $T$ is the temperature, $\varepsilon_{0}$ is the dielectric permittivity of vacuum, $k_{t}$ is the optical frequency dielectric constant of the high-k dielectric and $E_{hk}$ is the electric field in it. the voltage drop on the layer ($V_{hk}$) is given by:

$$V_{hk} = d_{hk} E_{hk}, \qquad (7)$$

where $d_{hk}$ is the thickness of the high-k dielectric layer. the numerical procedure consists in the simultaneous computation of the following two quantities: the oxide voltage

$$V_{ox} = V_{hk} + V_{if} = d_{hk} E_{hk} + d_{if} E_{if} \qquad (8)$$

and the current density in steady state (kirchhoff's laws)

$$J = J_{pf} = J_{if}. \qquad (9)$$

first the current density $J = J_{if}$ was determined for a given field $E_{if}$ in the interfacial layer. then the field in the high-k layer was computed as an inverse function of the current density $J_{pf} = J$.
at the end, the oxide voltage was calculated with the use of expression (8). we intend to use a minimum of fitting parameters. the defect density parameter for the high-k layer was chosen first, because it depends on the technological parameters and is difficult to determine by independent methods. the silicon dioxide layer thickness was also treated as a fitting parameter in a restricted range (2 to 3 nm) close to the measured value, because small variations in it cause substantial variations in the result. later, these results were compared with independent measurements. the hopping conductivity was also treated as a fitting parameter, since there are no available data from independent experiments. because the different mechanisms do not exclude each other, they are considered in a single form for the entire measurement region; as we discussed in [48], this approach is unavoidable in the case of nano-layered dielectrics, where the contributions of different conduction mechanisms cannot be separated by standard methods assuming a single dominant conduction mechanism in a given voltage range. in the case of al-ta2o5/sio2-si structures the following typical values can be taken from the literature: tunneling electron mass in ultrathin sio2, $m_e^{*} = 0.61\, m_e$ [49], where $m_e$ denotes the mass of the free electron; tunneling hole mass in sio2, $m_h^{*} = 0.51\, m_e$; optical frequency dielectric constant of ta2o5, $k_t = n^2 = 2.1^2 = 4.4$; tunneling barrier height for holes in sio2, $\phi_h$ = 4.70 ev [49]; tunneling barrier height for electrons in sio2, $\phi_e$ = 3.15 ev [50]; and compensation factor, $r = 1$ (we consider the poole-frenkel effect without compensation).

the voltage on the stacked insulating layer ($V_{ox}$) can be calculated by using the relation involving the gate voltage ($V_g$), the flatband voltage ($V_{fb}$) and the voltage drop in the semiconductor ($V_s$):

$$V_{ox} = V_g - V_{fb} - V_s. \qquad (10)$$

the value of $V_{fb}$ was determined with the standard method, which is not described here.
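the three-step numerical procedure behind eqs. (1)-(9) can be sketched in a few lines of python. this is only a minimal illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the barrier height phi is taken in volts (so that the prefactor reads q^2 and the square root contains 2 q m*), only one carrier type is treated, the inversion of eq. (6) is done here by bisection (the text only says that the field is computed "as an inverse function"), and the numerical inputs in the usage part are merely illustrative values of the kind quoted in the text, converted to si units.

```python
import math

# physical constants (SI units)
Q = 1.602176634e-19       # elementary charge, C
H = 6.62607015e-34        # planck constant, J s
K_B = 1.380649e-23        # boltzmann constant, J/K
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
M_E = 9.1093837015e-31    # free-electron mass, kg


def j_interfacial(e_if, d_if, phi, m_eff, sigma_hc):
    """Eqs. (1)-(4): hopping plus direct or fowler-nordheim tunneling, A/m^2.

    e_if in V/m (> 0), d_if in m, phi in volts (assumption), m_eff in kg,
    sigma_hc in S/m.
    """
    prefactor = Q**2 * e_if**2 / (8.0 * math.pi * H * phi)
    b = 8.0 * math.pi * math.sqrt(2.0 * Q * m_eff) * phi**1.5 / (3.0 * H * e_if)
    v_if = e_if * d_if                       # eq. (5)
    if v_if < phi:                           # trapezoidal barrier: direct tunneling
        shape = 1.0 - (1.0 - v_if / phi) ** 1.5
    else:                                    # triangular barrier: fowler-nordheim
        shape = 1.0
    return sigma_hc * e_if + prefactor * math.exp(-b * shape)


def log_j_poole_frenkel(e_hk, sigma_hk0, k_t, temp, r=1.0):
    """Natural log of eq. (6); the log form avoids overflow at high fields."""
    beta = math.sqrt(Q**3 * e_hk / (math.pi * EPS0 * k_t))
    return math.log(sigma_hk0 * e_hk) + beta / (r * K_B * temp)


def invert_poole_frenkel(j, sigma_hk0, k_t, temp, r=1.0, iterations=200):
    """Find e_hk with J_pf(e_hk) = j by bisection (J_pf grows monotonically)."""
    lo, hi = 0.0, j / sigma_hk0              # exp(...) >= 1, so the root is <= hi
    target = math.log(j)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_j_poole_frenkel(mid, sigma_hk0, k_t, temp, r) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def iv_point(e_if, d_if, d_hk, phi, m_eff, sigma_hc, sigma_hk0, k_t, temp=300.0):
    """One (v_ox, j) point: eq. (4), then eq. (9), inverse of eq. (6), eq. (8)."""
    j = j_interfacial(e_if, d_if, phi, m_eff, sigma_hc)
    e_hk = invert_poole_frenkel(j, sigma_hk0, k_t, temp)
    v_ox = d_hk * e_hk + d_if * e_if
    return v_ox, j


if __name__ == "__main__":
    # illustrative inputs only; 1 ohm^-1 cm^-1 = 100 S/m
    params = dict(d_if=2.78e-9, d_hk=47e-9, phi=3.15, m_eff=0.61 * M_E,
                  sigma_hc=8.1e-17 * 100.0, sigma_hk0=8.2e-11 * 100.0, k_t=4.4)
    for e_if in (2e8, 5e8, 8e8, 1.2e9):
        v_ox, j = iv_point(e_if, **params)
        print(f"E_if = {e_if:.1e} V/m  V_ox = {v_ox:.3f} V  J = {j * 1e-4:.3e} A/cm^2")
```

sweeping e_if and collecting the returned pairs gives a parametric j(v_ox) curve; the gate voltage then follows from eq. (10) once the flatband voltage and the voltage drop in the semiconductor are known. sweeping the field rather than the voltage keeps every step explicit, since only eq. (6) has to be inverted numerically.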
a low value of the fixed charge density in the sio2 was assumed, i.e. the ideal value of the flatband voltage ($V_{fb}^{id}$) was used. this assumption will be discussed later, though it can be simply treated as an approximation that holds for insulating films of high quality, where the oxide charge density is fairly low. the voltage drop in si ($V_s$) is connected with the electric field strength in the interfacial layer ($E_{if}$) by the following expression:

$$E_{if} = \frac{\varepsilon_{si}}{\varepsilon_{if}} \begin{cases} \sqrt{\dfrac{2 k T p_0}{\varepsilon_0 \varepsilon_{si}}} \left[ \left( e^{-q V_s / kT} + \dfrac{q V_s}{kT} - 1 \right) + \dfrac{n_i^2}{p_0^2} \left( e^{q V_s / kT} - \dfrac{q V_s}{kT} - 1 \right) \right]^{1/2}, & \text{p-type si} \\[2ex] \sqrt{\dfrac{2 k T n_0}{\varepsilon_0 \varepsilon_{si}}} \left[ \left( e^{q V_s / kT} - \dfrac{q V_s}{kT} - 1 \right) + \dfrac{n_i^2}{n_0^2} \left( e^{-q V_s / kT} + \dfrac{q V_s}{kT} - 1 \right) \right]^{1/2}, & \text{n-type si} \end{cases} \qquad (11)$$

where $\varepsilon_{si}$ is the relative permittivity of silicon, $\varepsilon_{if}$ is the relative permittivity of the interfacial layer, $n_0$ is the density of electrons in n-type silicon, $p_0$ is the majority carrier density in p-type silicon and $n_i$ is the intrinsic carrier density in silicon.

in strong inversion (positive gate for p-type substrate, negative gate for n-type substrate) the leakage current density reaches an almost saturated value of the order of magnitude of 1 ma/cm^2. this saturation is due to the exhaustion of the minority carriers (electrons for p-type and holes for n-type) caused by their extraction from the substrate. namely, the maximum tunneling current density of the electrons from the substrate is limited by the thermal generation rate of electrons in the inversion region of si, similarly to the case of the diode reverse current. the values observed in our experiment are comparable to the values obtained for p-n si diode reverse currents for voltages between 1 v and 10 v.

2.3. equivalent circuit

combining the model described above with the standard description of mis structures [51], a complete equivalent circuit of the considered structure can be constructed (fig. 5).
the diode (d) shown at the left end of the figure accounts for the effect of exhaustion of minority carriers in strong inversion, as described above. the diode orientation shown in the figure corresponds to an n-type substrate; for the case of a p-type si substrate the orientation is reversed.

fig. 5 equivalent circuit of the considered structure

the meanings of the symbols for the physical quantities in fig. 5 are as follows: $R_l$ - series resistance, $R_{hk}$ - voltage dependent resistance of the high-k layer, $R_{if}$ - voltage dependent resistance of the interfacial layer, $R_{it}$ - interface traps resistance, $C_{hk}$ - capacitance of the high-k layer, $C_{if}$ - capacitance of the interfacial layer and $C_{it}$ - interface traps capacitance. the capacitances of the layers of the dielectric stack are given by the following expressions:

$$C_{hk} = \frac{\varepsilon_0 \varepsilon_{hk} A}{d_{hk}} \qquad (12)$$

and

$$C_{if} = \frac{\varepsilon_0 \varepsilon_{if} A}{d_{if}}, \qquad (13)$$

where $\varepsilon_{hk}$ is the relative permittivity of the high-k layer and $A$ is the electrode area of the capacitor. $R_l$, $R_{it}$, $C_{if}$ and $C_{it}$ are to be extracted from the c-g-v curves at various frequencies, while $R_{hk}$ and $R_{if}$ are extracted from the i-v curves using the model described here. $R_{hk}$ and $R_{if}$ are both voltage dependent.

3. results

3.1. i-v curves

first we discuss the values of those parameters obtained from fitting the theoretical to the experimental curve that can also be obtained by independent methods. this is the case with the interfacial layer thickness ($d_{if}$) and the band offsets ($\phi_e$ and $\phi_h$) at the contact between si and sio2. for $\phi_e$ and $\phi_h$, values close to the literature data, 3.15 ev and 4.70 ev, respectively, have been obtained [44]. in [44], the fitted value $d_{if}$ = 2.8 nm was obtained, close to the value of 2.6 nm measured by transmission electron microscopy. some of the results obtained from applying the model to the experimental i-v curves of different al-high-k/sioxny-si structures are displayed in table 2.
several important features of the structures are clearly identified by the values of the important parameters.

table 2 values of fitting parameters for al-high-k/sioxny-si structures

r.f. sputtered ta2o5 on bare si at substrate temperature 493 k (unpublished data)
annealed | $d_{if}$ (nm) | $d_{hk}$ (nm) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\sigma_{hc}$ ($\Omega^{-1}$ cm$^{-1}$) | $\sigma_{hk}(0)$ ($\Omega^{-1}$ cm$^{-1}$)
not | 2.90 | 27 | 2.50 | 3.30 | 1×10^-16 | 3.95×10^-17
at 893 k | 2.95 | 27 | 3.05 | 3.40 | 1×10^-16 | 3.95×10^-15
at 1193 k | 2.97 | 26 | 3.15 | 4.70 | 1×10^-16 | 1.98×10^-12

ta2o5 obtained by thermal oxidation of ta in pure o2 at 873 k on bare si [44]
gate | $d_{if}$ (nm) | $d_{hk}$ (nm) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\sigma_{hc}$ ($\Omega^{-1}$ cm$^{-1}$) | $\sigma_{hk}(0)$ ($\Omega^{-1}$ cm$^{-1}$)
al | 2.78 | 47 | 3.15 | 4.70 | 8.1×10^-17 | 8.2×10^-11
au | 2.72 | 47 | 3.15 | 4.70 | 8.1×10^-17 | 6.6×10^-14
w | 2.80 | 47 | 3.15 | 4.70 | 8.1×10^-17 | 1.7×10^-13

r.f. sputtered ta2o5 at 493 k on si nitrided in nitrous oxide at temperatures $T_{on}$ [52]
$T_{on}$ (k) | $d_{if}$ (nm) | $d_{hk}$ (nm) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\sigma_{hc}$ ($\Omega^{-1}$ cm$^{-1}$) | $\sigma_{hk}(0)$ ($\Omega^{-1}$ cm$^{-1}$)
973 | 2.65 | 17.3 | 2.92 | 3.35 | 4×10^-15 | 3.3×10^-8
1073 | 2.70 | 17.3 | 2.85 | 3.50 | 1×10^-15 | 3.3×10^-8
1123 | 2.80 | 17.2 | 2.80 | 3.50 | 3×10^-15 | 3.3×10^-8

r.f. sputtered ta2o5 at 493 k on si nitrided in ammonia at temperatures $T_{on}$ [52]
$T_{on}$ (k) | $d_{if}$ (nm) | $d_{hk}$ (nm) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\sigma_{hc}$ ($\Omega^{-1}$ cm$^{-1}$) | $\sigma_{hk}(0)$ ($\Omega^{-1}$ cm$^{-1}$)
973 | 2.70 | 17.3 | 2.60 | 3.30 | 1×10^-15 | 3.3×10^-8
1073 | 2.80 | 17.2 | 2.85 | 3.25 | 1×10^-15 | 3.3×10^-8

ta2o5 obtained by thermal oxidation of ta in pure o2 at 873 k on bare si [53]
gate | $d_{if}$ (nm) | $d_{hk}$ (nm) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\sigma_{hc}$ ($\Omega^{-1}$ cm$^{-1}$) | $\sigma_{hk}(0)$ ($\Omega^{-1}$ cm$^{-1}$)
al | 1.84 | 8.1 | 3.15 | 4.4 | 1×10^-15 | 2×10^-9
w | 2.04 | 8.0 | 3.15 | 4.7 | 2×10^-15 | 8×10^-11
au | 2.05 | 8.0 | 3.15 | 4.7 | 5×10^-16 | 8×10^-11

metal-hf:ta2o5/sioxny-si structures (work in progress)
gate | $d_{if}$ (nm) | $d_{hk}$ (nm) | $\phi_e$ (ev) | $\phi_h$ (ev) | $\sigma_{hc}$ ($\Omega^{-1}$ cm$^{-1}$) | $\sigma_{hk}(0)$ ($\Omega^{-1}$ cm$^{-1}$)
ag | 2.56 | 5.44 | 2.6 | 4.2 | 2×10^-16 | 2×10^-16
w | 2.24 | 5.76 | 2.6 | 4.2 | 7×10^-15 | 2×10^-14
tin | 2.10 | 5.90 | 2.6 | 4.2 | 1.2×10^-12 | 1×10^-11

first, as is seen from the data for r.f.
sputtered ta2o5 on bare si at substrate temperature 493 k, unannealed films possess a high defect density, as manifested by a high value of the parameter $\sigma_{hk}(0)$; annealing substantially reduces the density of these defects. annealing also increases the band offsets, thus substantially reducing leakage currents. this is attributed to the improved stoichiometry of the interfacial silicon oxide. second, for ta2o5 obtained by thermal oxidation of ta in pure o2 at 873 k on bare si, the band offsets are found to be those for sio2, indicating that thermally grown films possess an sio2-like interfacial layer. the parameter depending on the defect density in the high-k layer, $\sigma_{hk}(0)$, is about two orders of magnitude higher for the reactive al gate than for the nonreactive au, w and tin gates, indicating that deposition of the reactive gate creates a high amount of defects in the high-k layer. the thickness of the layer is practically independent of the gate material for films as thick as 50 nm [44], and weakly dependent on the gate material in the case of films as thin as 10 nm or thinner (nanosized dielectric) [53]. the low-field conductivity ($\sigma_{hc}$) for films as thick as 50 nm is independent of the gate material [44], while for nanosized films it is somewhat reduced in the case of the reactive al gate [53]. therefore, we conclude that, in the case of nanosized high-k dielectrics, the reactive gate also affects the interfacial layer. third, it is seen that substrate nitridation reduces the band offsets [52]. with this effect alone, nitridation would degrade the leakage properties of the dielectric films. nevertheless, there is a more important beneficial effect of nitridation, consisting in an increase of the relative permittivity of the interfacial layer and a substantial decrease of the equivalent thickness with nitridation.
as a result, leakage currents at the same equivalent thickness are lower for films grown on nitrided substrates than for films grown on bare substrates. detailed analyses of the electrical and dielectric properties of different mos structures containing a high-k dielectric grown on nitrided si substrates have been reported in several works [31],[52],[62]. the model is also applicable to structures containing ta2o5 with different metals (one example is given in the last section of table 2). in addition, in [54] we have shown that the model described in this work is also applicable to hfo2 high-k dielectrics, by fitting the experimental i-v curves obtained by other authors [55]. the same or a slightly modified model is expected to be applicable to various similar structures. recently, an analysis of the leakage properties of al-ta2o5/sioxny-si structures based on a derived model has been published by other authors [56].

3.2. effective capacitance

standard methods for the characterization of mos structures include the measurement of c-v and g-v (or r-v) curves in parallel mode (i.e., cp-v and g-v or rp-v) [51]. an alternative approach is to use c-v and r-v curves obtained in serial mode (cs-v and rs-v). our extensive experience with metal/high-k/si structures suggests that better results are obtained when the serial mode is used to characterize the capacitance properties of the considered structures. this approach has been supported by additional studies of ac capacitance and resistance measurements at various frequencies [57],[58]. based on the model described here, an equivalent circuit (a simplified version of that shown in fig. 5) for the capacitance in accumulation has been constructed and applied to describe experimental results for measured capacitances and resistances as a function of the signal frequency, both in parallel and serial mode [57].
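as an illustration, the following is a minimal numerical sketch of such a simplified equivalent circuit. it assumes that the circuit consists of the serial resistance in series with two parallel resistor-capacitor branches (one for the high-k layer, one for the interfacial layer), i.e. the circuit of fig. 5 without the interface-trap branch and the diode. the function names and all component values are hypothetical and chosen only to make the frequency dependence visible; they are not fitted to any of the measured structures.

```python
import math


def parallel_rc(r, c, f):
    """Complex impedance of a resistor r (ohm) in parallel with a capacitor c (F) at f (Hz)."""
    omega = 2.0 * math.pi * f
    return r / (1.0 + 1j * omega * r * c)


def series_mode(f, r_l, r_hk, c_hk, r_if, c_if):
    """Effective series resistance (ohm) and capacitance (F) of the assumed circuit.

    Assumed topology: R_L in series with (R_hk || C_hk) and (R_if || C_if).
    The series-mode values follow from writing Z = R_s + 1 / (i * 2*pi*f * C_s).
    """
    z = r_l + parallel_rc(r_hk, c_hk, f) + parallel_rc(r_if, c_if, f)
    r_s = z.real
    c_s = -1.0 / (2.0 * math.pi * f * z.imag)
    return r_s, c_s


if __name__ == "__main__":
    # Made-up values, for illustration only.
    params = dict(r_l=50.0, r_hk=1e6, c_hk=2e-9, r_if=1e9, c_if=3e-9)
    for f in (1e2, 1e3, 1e4, 1e5, 1e6):
        r_s, c_s = series_mode(f, **params)
        print(f"{f:9.0f} Hz  R_s = {r_s:14.3f} ohm  C_s = {c_s * 1e9:8.4f} nF")
```

at high frequencies the effective series capacitance of this sketch approaches the series combination of the two layer capacitances, while at low frequencies the leaky layer is shunted by its resistance and the effective capacitance rises, which is the kind of frequency dispersion the serial-mode analysis is meant to capture.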
the impedance of the considered equivalent circuit ($Z$) is given by the following expression:

$$Z = R_L + \frac{R_{hk}}{1+(2\pi f C_{hk} R_{hk})^2} + \frac{R_{if}}{1+(2\pi f C_{if} R_{if})^2} + \frac{1}{2\pi i f}\left[\frac{1}{C_{hk}}\,\frac{1}{1+\dfrac{1}{(2\pi f C_{hk} R_{hk})^2}} + \frac{1}{C_{if}}\,\frac{1}{1+\dfrac{1}{(2\pi f C_{if} R_{if})^2}}\right], \qquad (14)$$

where $f$ is the measurement signal frequency. for measurements in serial mode (at a given gate voltage $V$ in accumulation), the corresponding effective serial capacitance ($C_s$) and resistance ($R_s$) are frequency dependent and given by the following expressions:

$$C_s(f) = \left[\frac{1}{C_{hk}}\,\frac{1}{1+\dfrac{1}{(2\pi f C_{hk} R_{hk}(V))^2}} + \frac{1}{C_{if}}\,\frac{1}{1+\dfrac{1}{(2\pi f C_{if} R_{if}(V))^2}}\right]^{-1} \qquad (15)$$

and

$$R_s(f) = R_L + \frac{R_{hk}}{1+(2\pi f C_{hk} R_{hk}(V))^2} + \frac{R_{if}}{1+(2\pi f C_{if} R_{if}(V))^2}. \qquad (16)$$

in [57] excellent fits to the experimental results for al-ta2o5/sio2 structures have been obtained using expressions (15) and (16). detailed analyses of the c-v, r-v and g-v curves for metal(al,w,au)-ta2o5/sio2 structures, both in parallel and serial mode, have been reported in [51]. all the results obtained are consistent with the model described in this work.

3.3. stress-induced leakage currents

in addition to describing the leakage currents of fresh structures, this model has been successfully applied to the description of stress-induced leakage currents. we predominantly studied the case of constant current stress. we have shown that the i-v characteristics of stressed al-ta2o5/sio2 structures can be described very well by our model [59]. the increase of the leakage currents with stress has been attributed to the degradation of the interfacial layer by the creation of a high density of defects in a part of it. this part can be degraded to the point where it can be regarded as a conductive material in which conduction occurs through percolation paths [59]-[61].

4.
conclusions

a comprehensive physical model for describing the electrical and dielectric properties of mos capacitors containing a high-k/(sio2,sioxny) dielectric stack has been described in detail. the corresponding equivalent circuit has been constructed and displayed. the proposed model describes very well mos structures containing ta2o5-based dielectric layers, both those obtained with different technological procedures and those with different doping. it has also been shown that the model can be used for other high-k dielectrics such as hfo2. based on the model, degradation of the dielectric properties of the high-k dielectric layer induced by a reactive metal gate, such as al, can be clearly distinguished from other effects. the model is applicable to fresh as well as to high-field/current stressed samples, thus allowing analysis of the stress-induced leakage currents at medium fields. finer details of the effect of various technological processes on the electrical and dielectric properties of the considered structures can be extracted using the model.

acknowledgement: this work was supported by macedonian ministry of education and sciences under contract 13-3573.

references

[1] j. zhang, z. li, h. zhou, c. ye and h. wang, “electrical, optical and micro-structural properties of ultrathin hftion films”, applied surface science, in press, http://dx.doi.org/10.1016/j.apsusc.2013.12.064.
[2] c. ye, c. zhan, j. zhang, h. wang, t. deng and s. tang, “influence of rapid thermal annealing temperature on structure and electrical properties of high permittivity hftio thin film used in mosfet”, microelectronics reliability 54, 2014, pp. 388–392.
[3] s. chen, zh. liu, l. feng, x. che and x. zhao, “the dielectric properties enhancement due to yb incorporation into hfo2”, appl. phys. lett. 103, 2013, pp. 132902 (4 pages).
[4] g. lee, b.-k. lai, c. phatak, r. s. katiyar and o.
auciello, “interface-controlled high dielectric constant al2o3/tiox nanolaminates with low loss and low leakage current density for new generation nanodevices”, j. appl. phys. 114, 2013, pp. 027001 (5 pages).
[5] m. ali khaskheli, p. wu, r. chand, x. li, h. wang, sh. zhang, s. chen and yili pei, “structural and dielectric properties of ti and er co-doped hfo2 gate dielectrics grown by rf sputtering”, applied surface science 266, 2013, pp. 355–359.
[6] b. toomey, k. cherkaoui, s. monaghan, v. djara, é. o’connor, d. o’connell, l. oberbeck, e. tois, t. blomberg, s.b. newcomb and p.k. hurley, “the structural and electrical characterization of a hferox dielectric for mim capacitor dram applications”, microelectronic engineering 94, 2012, pp. 7–10.
[7] z. essa, c. gaumer, a. pakfar, m. gros-jean, m. juhel, f. panciera, p. boulenc, c. tavernier and f. cristiano, “evaluation and modeling of lanthanum diffusion in tin/la2o3/hfsion/sio2/si high-k stacks”, appl. phys. lett. 101, 2012, pp. 182901 (5 pages).
[8] t. usui, s. a. mollinger, a. t. iancu, r. m. reis and f. b. prinz, “high aspect ratio and high breakdown strength metal-oxide capacitors”, appl. phys. lett. 101, 2012, pp. 033905 (4 pages).
[9] w. yang, q.-q. sun, r.-c. fang, l. chen, p. zhou, s.-j. ding and d.w. zhang, “the thermal stability of atomic layer deposited hflaox: material and electrical characterization”, current applied physics 12, 2012, pp. 1445–1447.
[10] t. yu, c. jin, x. yang, y. dong, h. zhang, l. zhuge, x. wu and z. wu, “the structure and electrical properties of hftaon high-k films prepared by dibsd”, applied surface science 258, 2012, pp. 2953–2958.
[11] x. zhang, h. tu, y. guo, h. zhao, m. yang, f. wei, y. xiong, z. yang, j. du and w. wang, “atomic configuration of the interface between epitaxial gd doped hfo2 high-k thin films and ge (001) substrates”, j. appl. phys. 111, 2012, pp. 014102 (4 pages).
[12] l. ning, f. yang, c. duan, y. zhang, jun liang and z.
cui, “structural properties and 4f→5d absorptions in ce-doped lualo3: a first-principles study”, j. phys.: condens. matter 24, 2012, pp. 055502 (10 pages).
[13] l. kornblum, b. meyler, c. cytermann, s. yofis, j. salzman and m. eizenberg, “investigation of the band offsets caused by thin al2o3 layers in hfo2 based si metal oxide semiconductor devices”, appl. phys. lett. 100, 2012, pp. 062907 (3 pages).
[14] k.m.a. salam, h. fukuda and s. nomera, “effects of additive elements on improvement of the dielectric properties of ta2o5 films formed by metalorganic decomposition”, j. appl. phys. 93, 2003, pp. 1169–1175.
[15] e. atanassova, n. novkovski, d. spassov, a. paskaleva and a. skeparovski, “time-dependent-dielectric-breakdown characteristics of hf-doped ta2o5/sio2 stack”, microelectron. reliab. 54, 2014, pp. 381–387.
[16] e. atanassova, n. stojadinovic, d. spassov, i. manic and a. paskaleva, “time-dependent dielectric breakdown in pure and lightly al-doped ta2o5 stacks”, semicond. sci. technol. 28, 2013, pp. 055006–055006-9.
[17] e. atanassova, d. spassov, n. novkovski, and a. paskaleva, “constant current stress of lightly al-doped ta2o5”, materials science in semiconductor processing 15, 2012, pp. 98–107.
[18] y. karmakova, a. paskaleva and e. atanassova, “interfacial layers in ta2o5 based stacks and constituent depth profiles by spectroscopic ellipsometry”, appl. surf. sci. 258, 2012, pp. 4507–4512.
[19] e. atanassova, a. paskaleva and d. spassov, “doped ta2o5 and mixed hfo2–ta2o5 films for dynamic memories applications at the nanoscale”, microelectron. reliab. 52, 2011, pp. 642–650.
[20] a. paskaleva, m. ťapajna, e. dobročka, k. hušeková, e. atanassova and k. fröhlich, “structural and dielectric properties of ru-based gate/hf-doped ta2o5 stacks”, appl. surf. sci. 257, 2011, pp. 7876–7880.
[21] a. skeparovski, n. novkovski, e. atanassova, a. paskaleva and v. k. lazarov, “effect of al gate on the electrical behaviour of al doped ta2o5 stacks”, j. phys. d: appl. phys.
44, 2011, pp. 235103–235103-10.
[22] i. manić, e. atanassova, n. stojadinović, d. spassov and a. paskaleva, “hf-doped ta2o5 stacks under constant voltage stress”, microelectron. eng. 88, 2011, pp. 305–313.
[23] d. spassov, e. atanassova and a. paskaleva, “lightly al-doped ta2o5: electrical properties and mechanisms of conductivity”, microelectron. reliab. 51, 2011, pp. 2102–2109.
[24] n. novkovski and e. atanassova, “charge trapping during constant current stress in hf-doped ta2o5 films sputtered on nitrided si”, thin solid films 519, 2011, pp. 2262–2267.
[25] e. atanassova, n. novkovski, a. paskaleva and d. spassov, “constant current stress-induced leakage current in mixed hfo2–ta2o5 stacks”, microelectron. reliab. 50, 2010, pp. 794–800.
[26] a. paskaleva and e. atanassova, “evidence for a conduction through shallow traps in hf-doped ta2o5”, mat. sci. semicond. proc. 13, 2010, pp. 349–355.
[27] e. atanassova, m. georgieva, d. spassov and a. paskaleva, “high-k hfo2–ta2o5 mixed layers: electrical characteristics and mechanisms of conductivity”, microelectron. eng. 87, 2010, pp. 668–676.
[28] d. spassov, e. atanassova, n. novkovski, “electrical behaviour of ti-doped ta2o5 on n2o and nh3 nitrided si”, semicond. sci. technol. 24, 2009, pp. 075024–075024-10.
[29] a. skeparovski, n. novkovski, e. atanassova, d. spassov and a. paskaleva, “temperature dependence of leakage currents in ti doped ta2o5 films on nitrided silicon”, j. phys. d: appl. phys. 42, 2009, pp. 095302–095302-8.
[30] a. paskaleva, e. atanassova and n. novkovski, “constant current stress of ti-doped ta2o5 on nitrided si”, j. phys. d: appl. phys. 42, 2009, pp. 025105–025105-8.
[31] n. novkovski, “analysis of the improvement of al-ta2o5/sio2-si structures reliability by si substrate plasma nitridation in n2o”, thin solid films 517, 2009, pp. 4394–4401.
[32] n. novkovski and e. atanassova, “a comprehensive model for the i-v characteristics of metal-ta2o5/sio2-si structures”, appl. phys.
a 83, 2006, pp. 435–445.
[33] e. rosenbaum and l. f. register, “mechanism of stress-induced leakage current in mos capacitors”, ieee trans. electron dev. 44, 1997, pp. 317–323.
[34] m. houssa, m. tuominen, m. naili, v. afanas’ev, a. stesmans, s. haukka and m. m. heyns, “trap-assisted tunneling in high permittivity gate dielectric stacks”, j. appl. phys. 87, 2000, pp. 8615–8620.
[35] w. s. lau, l. zhong, allen lee, c. h. see, taejoon han, n. p. sandler and t. c. chong, “detection of defect states responsible for leakage current in ultrathin tantalum pentoxide (ta2o5) films by zero-bias thermally stimulated current spectroscopy”, appl. phys. lett. 71, 1997, pp. 500–502.
[36] w. s. lau, l. l. leong, t. han and n. p. sandler, “detection of oxygen vacancy defect states in capacitors with ultrathin ta2o5 films by zero-bias thermally stimulated current spectroscopy”, appl. phys. lett. 83, 2003, pp. 2835–2837.
[37] n. novkovski, a. skeparovski and e. atanassova, “charge trapping effect at the contact between a high-work-function metal and ta2o5 high-k dielectric”, j. phys. d: appl. phys. 41, 2008, pp. 105302–105302-4.
[38] l. stojanovska-georgievska, n. novkovski and e. atanassova, “charge trapping at pt/high-k dielectric (ta2o5) interface”, physica b: condensed matter 406, 2011, pp. 3348–3353.
[39] l.s. georgievska, n. novkovski and e. atanassova, “charge trapping at low injection currents in (tin, mo, pt)/ta2o5:hf/sio2/si structures”, 2012 28th international conference on microelectronics, proceedings, miel2012, pp. 331–334.
[40] f.-c. chiu, j.-j. wang, j. y. lee and s. c. wu, “leakage currents in amorphous ta2o5 thin films”, j. appl. phys. 81, 1997, pp. 6911–6915.
[41] o. blank, h. reisinger, r. stengl, m. gutsche, f. wiest, v. capodieci, j. schulze and i. eisele, “a model for multistep trap-assisted tunneling in thin high-k dielectrics”, j. appl. phys. 97, 2005, pp. 044107–044107-7.
[42] e. atanassova, d. spassov, a. paskaleva, j. koprinarova and m.
gueorguieva, “influence of oxidation temperature on the microstructure and electrical properties of ta2o5 on si”, microel. j. 33, 2002, pp. 907–920.
[43] m. lenzlinger and e. h. snow, “fowler-nordheim tunneling into thermally grown sio2”, j. appl. phys. 40, 1969, pp. 278–283.
[44] n. novkovski and e. atanassova, “injection of holes from the silicon substrate in ta2o5 films grown on silicon”, appl. phys. lett. 85, 2004, pp. 3142–3144.
[45] c. chaneliere, j. l. autran and r.a.b. devine, “conduction mechanisms in ta2o5/sio2 and ta2o5/si3n4 stacked structures on si”, j. appl. phys. 86, 1999, pp. 480–486.
[46] m. v. fischetti and d. j. dimaria, “hot electrons in sio2: ballistic to steady-state transport”, solid-st. electron. 31, 1988, pp. 629–636.
[47] j. r. yeargan and h. l. taylor, “the poole-frenkel effect with compensation present”, j. appl. phys. 39, 1968, pp. 5600–5604.
[48] n. novkovski, “limitations in the methods of determination of conduction mechanisms in high-permittivity dielectric nano-layers”, physica b: condensed matter 398, 2007, pp. 28–32.
[49] k. n. yang, h. t. huang, m. c. chang, c. m. chu, y. s. chen, m. j. chen, y. m. lin, m. c. yu, s. m. yang, d. c. h. yu and m. s. liang, “a physical model for hole direct tunneling current in p+ poly-gate pmosfets with ultrathin gate oxides”, ieee trans. electron dev. 47, 2000, pp. 2161–2166.
[50] n. yang, w.k. henson, j.r. hauser and j. wortman, “modeling study of ultrathin gate oxides using direct tunneling current and capacitance-voltage measurements in mos devices”, ieee trans. electron dev. 46, 1999, pp. 1464–1471.
[51] d. k. schroder, semiconductor material and device characterization. hoboken, new jersey: john wiley & sons, 2006, chapter 9, pp. 347–350.
[52] n. novkovski, a. paskaleva and e. atanassova, “dielectric properties of rf sputtered ta2o5 on rapid thermally nitrided si”, semicond. sci. technol. 20, 2005, pp. 233–238.
[53] n.
novkovski, “conduction and charge analysis of metal (al, w and au)-ta2o5/sio2-si structures”, semicond. sci. technol. 21, 2006, pp. 945–951.
[54] a. skeparovski and n. novkovski, “on the nature of the high-k dielectrics leakage current reduction by postdeposition annealing”, j. optoelectron. adv. mat. 9, 2007, pp. 897–901.
[55] w. j. zhu, t.-p. ma, t. tamagawa, j. kim and y. di, “current transport in metal/hafnium oxide/silicon structure”, ieee electron device lett. 23, 2002, pp. 97–99.
[56] s. huang, “oxygen annealing effects on transport and charging characteristics of al-ta2o5/sioxny-si structure”, ieee trans. electron. dev. 60, 2013, pp. 2741–2746.
[57] n. novkovski and e. atanassova, “frequency dependence of the effective series capacitance of metal-ta2o5/sio2-si structures”, semicond. sci. technol. 22, 2007, pp. 533–536.
[58] n. novkovski and e. atanassova, “peculiarities of capacitance measurements of nanosized high-k dielectrics: case of ta2o5”, j. optoelectron. adv. mat.-symposia 1, 2009, pp. 398–403.
[59] n. novkovski and e. atanassova, “origin of the stress-induced leakage currents in al-ta2o5/sio2-si structures”, appl. phys. lett. 86, 2005, pp. 152104-1–152104-3.
[60] n. novkovski, e. atanassova and a. paskaleva, “stress-induced leakage currents of the rf sputtered ta2o5 on n-implanted silicon”, appl. surf. sci. 253, 2007, pp. 4396–4403.
[61] n. novkovski, e. atanassova and a. paskaleva, “model based analysis of electrical and wear-out characteristics of ultra-thin ta2o5/sioxny stacks on si”, proc. 26th international conference on microelectronics, 10-14 may, 2008, vol. 2, pp. 533–536.
[62] n. novkovski and e. atanassova, “dielectric properties of ta2o5 films grown on silicon substrates plasma nitrided in n2o”, appl. phys. a 81, 2005, pp. 1191–1195.

facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp.
585–597 doi: 10.2298/fuee1704585b

spectral parameters for finger tapping quantification *

vladislava n. bobić 1, milica d. djurić-jovičić 2, nathanael jarrasse 3, milica ječmenica-lukić 4, igor n. petrović 4, saša m. radovanović 5, nataša dragašević 4, vladimir s. kostić 4

1 school of electrical engineering, university of belgrade, serbia
2 innovation center of school of electrical engineering, university of belgrade, serbia
3 institut des systèmes intelligents et de robotique, université pierre et marie curie, paris, france
4 neurology clinic, clinical center of serbia, medical faculty, university of belgrade, serbia
5 institute for medical research, university of belgrade, serbia

abstract. a miniature inertial sensor placed on the fingertip of the index finger during the finger tapping test can be used for objective quantification of the finger tapping motion. temporal and spatial parameters such as cadence, tapping duration, and tapping angle can be extracted for detailed analysis. however, these parameters, although intuitive and simple to interpret, do not always provide all the necessary information regarding the subject’s motor performance. analysis of the frequency content of the finger tapping movement can provide crucial information about the patient's condition. in this paper, we present parameters extracted from spectral analysis that we found to be significant for finger tapping assessment. with these parameters, the tapping’s intra-variability, movement smoothness and anomalies that may occur within the tapping performance can be detected and described, providing significant information for further diagnostics and for monitoring the progress of the disease or the response to therapy.

key words: frequency analysis, finger tapping, parkinson's disease.

1.
introduction

received november 29, 2016; received in revised form march 29, 2017
corresponding author: vladislava n. bobić, school of electrical engineering, university of belgrade, kralja aleksandra blvd. 73, 11120 belgrade, serbia (e-mail: vladislava.bobic@yahoo.com)
* an earlier version of this paper received best section paper award at 3rd international conference on electrical, electronic and computing engineering, icetran 2016, zlatibor, serbia, june 13 – 16, 2016 [1].

patients with parkinson’s disease (pd) exhibit severe motor problems; therefore objective assessment of their movements is crucially important for diagnostics and evaluation of the progress of the disease. frequency analysis is widely used for such assessment of parkinsonian patients. some usual frequency-derived measures obtained from the fast fourier transform (fft), such as amplitude, median power frequency, power dispersion, and power percentage within the 4–7 hz frequency range, were used for quantification of hand tremor [2]. a body-area inertial sensing system and signal processing based on filter-bank analysis and cross correlation were used for the interpretation of tremor frequency and energy [3]. one study proposed a new technique for tremor detection from gyro data [4] that comprises empirical mode decomposition and the hilbert spectrum, introducing the concept of instantaneous frequency in the field of tremor. frequency-derived measures were extracted from the results of welch's averaged modified periodogram method of spectral estimation performed on acceleration data and used for the assessment of stride-to-stride variability in pd patients and healthy controls in real-life settings [5].
the authors defined four parameters for the main peak of the power spectral density function: its frequency, its amplitude, its width at half of its amplitude and the slope from the point of the peak’s maximum to the point of half of the peak’s amplitude. body motion of pd patients was also assessed by using a maximum-likelihood-estimator-based fractal analysis method for triaxial accelerometer data [6]. freezing of gait in patients with pd was quantified from the power spectral density of the shank acceleration [7]. the researchers defined a new index, named frequency ratio, as the square of the total power in the 3–8 hz band divided by the square of the total power in the 0.5–3 hz band. the results showed that this parameter differentiates between patients better than traditional spatial gait measures. although spectral components hidden in the performed movement can indicate motor impairment [8], fourier analysis is not the most effective tool for the analysis of transient behavior or discontinuities that are typical for human movement. in such cases, time-frequency algorithms can provide a detailed analysis of the signal’s frequency content over time, allowing detection of localized features at specific time instants. the time-frequency algorithms short-time fourier transform (stft) and wavelet transform (wt) have already been used in many studies in the field of human movement [9]-[11]. detection of transient episodes and tripping in inertial data can be performed with both the stft and the discrete wavelet transform [12]. however, wavelets proved to be superior at describing anomalies, pulses and other transient events that start and stop within a movement signal [13].
parameters expressing the main frequencies, pattern decrement and activity volume of the basic finger tapping rhythm, as well as the vigor of the performed movements, were extracted from the coefficients of the continuous wavelet transform performed on gyro signals, providing classification between pd patients and healthy subjects [14]. neurological disorders, including parkinson’s disease [15], can affect the smoothness of the patient’s motor performance. because of that, an objective measure of movement smoothness can be a very important part of the assessment of the patient’s motor abilities. it was shown that frequency analysis can provide information about movement smoothness by analyzing the spectral arc length (sparc) [16]. repetitive finger tapping represents one of the descriptive characteristics of the patient's motor ability that is included in the unified parkinson’s disease rating scale (updrs test, e.g., fahn et al., 1987 [17]). in clinical practice, the finger tapping performance is often evaluated visually, which results in a low diagnostic resolution [18]. however, using appropriate instrumentation, such as miniature inertial sensors, finger tapping performance can be quantified, allowing the objective assessment of specific characteristics or changes in the finger tapping pattern over time [19]-[20]. our goal is to offer a new method for the objective quantification of finger tapping performance, which is regularly used for assessment and visually estimated by physicians. we suggest a set of frequency-derived parameters that can provide the assessment of the tapping’s rhythmic behavior, the vigor of its performance, intra-variability, tremor and motor blocks. in this way, a quantitative assessment of repetitive finger tapping performance can be obtained, thus providing support in monitoring of the patient's condition and response to therapy, as well as in differential diagnostics of parkinsonism.

2.
methods and materials

instrumentation

the instrumentation includes an inertial sensor unit comprising a 3d gyroscope l3g4200 (stmicroelectronics, usa) [21]. in our system, the small-sized (10x12 mm) and lightweight (3 g) sensor is placed on the fingertip of the subject’s index finger (fig. 1). the sensor is connected to its sensor control unit (scu), positioned on the forearm, by a thin, light, flexible and loose cable. the designed instrumentation and mounting concept ensure that the movement path and range are not hindered in any aspect. different technical and mounting solutions (sensor gloves, wireless sensors) have also been considered; however, all of them showed certain shortcomings in terms of size and weight (e.g. a wireless sensor on the fingertip requires a mounted battery, which increases the size and weight), limited performance and tactility (gloves), as well as hygiene and price. the signals are collected by the scu and wirelessly transmitted to a remote computer. a custom-made, user-friendly graphical interface, developed in cvi (cvi 9.0, ni labwindows, usa), controls data acquisition and storage and provides export (ascii comma-separated values (csv) format) for further analysis.

fig. 1 system setup: sensor (s) positioned on fingertip connected to sensor control unit (scu) mounted on the subject’s hand.

experiments

twenty patients with parkinson's disease (age: 61.39 ± 9.7) and twelve age- and gender-matched controls (age: 56.53 ± 9.13) were enrolled in this study. during the test, subjects were sitting comfortably in a chair, with their hand placed in front of them. as part of the test, they repeatedly tapped the index finger and thumb as rapidly and as widely as possible for 15 s, as described in [19]. each recording began and ended with their fingers closed at the "zero posture". for each subject, three trials per affected hand were recorded. a resting period of one minute was given between trials, because fatigue may compromise the performance. the study was performed at the neurology clinic, clinical centre of serbia, belgrade, in accordance with the ethical standards of the declaration of helsinki. all the participants gave informed written consent prior to participation in the study.

signal processing

angular velocity was recorded using digital gyroscopes with the sampling frequency fs = 200 hz, calibrated and directly processed by a custom-made matlab script (matlab 7.6.0., r2008a). examples of recorded signals for one healthy control (ctrl) and two pd patients are presented in fig. 2.

fig. 2 the examples of recorded gyro signals for: two pd patients and one ctrl subject.

firstly, tapping performance was described with parameters typically used for tapping description [19]:
• duration of the taps tt – expressed in seconds,
• tapping cadence ct – expressing the number of taps in the observed 15 s long sequence,
• angle that the index finger forms relative to the “zero posture” of the fingers αt – expressed in degrees.

additionally, the continuous wavelet transform (cwt), welch's averaged modified periodogram method of spectral estimation and the spectral arc length method (sparc) [16] were applied to the observed 15 s long sequences of the signal. the methods were performed for the frequency range between 0.01 and 20 hz (with a frequency increment of 0.01 hz), covering the complete possible spectral content of finger tapping.

continuous wavelet transformation

the continuous wavelet transformation based on the fft algorithm was applied to the 15 s long sequences of the gyro signal. for this application, we used a mother wavelet from the complex morlet wavelet family, with center frequency f0 = 1 hz and time-frequency resolution σ = 0.7. the fourier transform of the wavelet function was found for each scale (the reciprocal of each frequency from the defined band 0-20 hz) and multiplied by the representation of the gyro signal in the frequency domain.
complex cwt coefficients were obtained using the inverse fourier transform and then normalized with the weighting function, i.e., by dividing the coefficients by the square root of the scale. the final result is obtained in the form of a matrix, with the same time resolution ∆t = 5 ms (∆t = 1/fs = 1/(200 hz)) as the original gyro signal (no additional interpolation or downsampling was performed). examples of the obtained cwt coefficients, presented in the shape of a 3d scalogram, are shown in fig. 3. the scalogram is a color-coded illustration of the wavelet coefficients. for this application, we used the jet colormap, where small amplitudes are represented with cold color tones (starting from navy blue), whereas warmer colors (ending with dark red) follow the increase of the amplitude.

fig. 3 3d representation of cwt coefficients. an example is given for patient pd1.

in order to observe temporal changes of tapping activities, we defined the cross-sectional area perpendicular to the t-axis (csa-ttot) [14]. csa-ttot was calculated by summing, at each time instant, the absolute values of the cwt coefficients, and was finally expressed as a percentage of the maximum energy of the csa-ttot characteristic. by introducing two thresholds at 50 and 25% (light and dark dashed grey lines in fig. 4, respectively), we found signal parts where tapping performance was compromised, causing the energy to drop below the two defined levels.

fig. 4 representative example of csa-ttot [%] distribution given for one pd patient. light and dark dashed grey lines mark two defined thresholds at 50 and 25%, whereas dashed blue and solid red rectangles outline signal parts with energy loss below defined levels (50 and 25%, respectively).

in this way, tapping performance can be described regarding the disturbance of its basic rhythmic behavior, e.g., motor blocks.
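the thresholding step can be sketched in a few lines. the sketch below is only an illustration under stated assumptions: the function names are hypothetical, the input is taken to be a frequency-by-time matrix of cwt coefficients, and the duration of an anomaly is taken to be the total time spent below a threshold (the text does not say whether total or longest contiguous time was used). the toy matrix is made up and is not measured data.

```python
def csa_percent(coeffs):
    """Cross-sectional profile along time, as a percentage of its maximum.

    coeffs[k][n] is the (possibly complex) CWT coefficient at frequency
    index k and time index n; magnitudes are summed over all frequencies.
    """
    n_times = len(coeffs[0])
    csa = [sum(abs(row[n]) for row in coeffs) for n in range(n_times)]
    peak = max(csa)
    return [100.0 * value / peak for value in csa]


def time_below(csa_pct, threshold_pct, dt):
    """Total time in seconds during which the profile stays below threshold_pct.

    Assumption: 'duration' means the sum over all samples below the threshold.
    """
    return sum(1 for value in csa_pct if value < threshold_pct) * dt


if __name__ == "__main__":
    dt = 1.0 / 200.0  # sampling period for fs = 200 Hz
    # Toy 2 x 8 coefficient matrix (made-up numbers) with a dip in the middle.
    toy = [
        [4, 4, 1, 0.5, 0.5, 3, 4, 4],
        [4, 4, 2, 1.0, 0.5, 3, 4, 4],
    ]
    profile = csa_percent(toy)
    print(f"time below 50%: {time_below(profile, 50.0, dt):.3f} s")  # 0.015 s
    print(f"time below 25%: {time_below(profile, 25.0, dt):.3f} s")  # 0.010 s
```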
we introduced two parameters representing the duration of the detected anomalies, expressed in seconds (cwt<50 and cwt<25, respectively).

welch's method of spectral estimation

power spectral density was calculated with welch's method of spectral estimation. for this application, a window size of 800 samples and an overlap between the windows of 50% were applied. the fft length was 2 times the next higher power of 2 of the signal length. for each subject, we extracted four parameters for the main peak, i.e., the dominant harmonic of the obtained power spectral density function (fig. 5) [5]:
• the frequency of the peak, f;
• the amplitude of the peak, h;
• the width of the peak at half of its amplitude, w (the red lines in fig. 5);
• the slope of the peak, calculated from the point of half of the peak's amplitude to the peak's maximum point, s (the blue lines in fig. 5).

fig. 5 representation of power spectral density function. blue line marks the slope of the peak, whereas red line shows the width of the peak at half of its amplitude. the examples are given for one ctrl subject (top panel) and two pd patients (middle and bottom panels).

sparc method for assessment of tapping smoothness

the spectral arc length method is used for the assessment of smoothness of signals describing any rhythmic sensorimotor behavior [22]-[24]. the sparc method applied here is the modified spectral arc length method defined in [16]. it represents the signal smoothness as a single scalar, by calculating the arc length of the fourier spectrum of a given velocity within the defined frequency range. the final value of this parameter was expressed as the negative logarithm of the calculated arc length. bigger values correspond to greater smoothness.
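a minimal python sketch of the arc-length idea described above is given here for illustration only. it follows the wording of this section (arc length of the fourier magnitude spectrum up to 20 hz, reported as its negative logarithm) and makes two assumptions that are labelled in the code: the amplitude is normalised by the dc value and the frequency axis is scaled to [0, 1]. any adaptive choice of the cut-off frequency is omitted, a coarse 0.1 hz frequency step is used to keep the naive transform short, and the function names and test signals are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal, fs, f_max=20.0, df=0.1):
    """Magnitude of the Fourier transform on the grid 0, df, ..., f_max (Hz)."""
    n_freqs = int(round(f_max / df)) + 1
    freqs = [k * df for k in range(n_freqs)]
    mags = []
    for f in freqs:
        acc = sum(x * cmath.exp(-2j * math.pi * f * n / fs)
                  for n, x in enumerate(signal))
        mags.append(abs(acc))
    return freqs, mags


def spectral_arc_length(signal, fs, f_max=20.0, df=0.1):
    """Arc length of the normalised magnitude spectrum between 0 and f_max."""
    freqs, mags = magnitude_spectrum(signal, fs, f_max, df)
    norm = [m / mags[0] for m in mags]  # assumption: normalise by the dc value
    length = 0.0
    for k in range(1, len(norm)):
        d_f = (freqs[k] - freqs[k - 1]) / f_max  # assumption: axis scaled to [0, 1]
        d_v = norm[k] - norm[k - 1]
        length += math.hypot(d_f, d_v)
    return length


def smoothness(signal, fs, f_max=20.0, df=0.1):
    """Negative logarithm of the arc length; bigger means smoother."""
    return -math.log(spectral_arc_length(signal, fs, f_max, df))


# Hypothetical test signals: a bell-shaped 0.3 s velocity profile sampled at
# 200 Hz, and the same profile with a superimposed 15 Hz ripple.
fs = 200.0
n = 60
smooth = [math.sin(math.pi * k / n) ** 2 for k in range(n)]
jerky = [v * (1.0 + 0.8 * math.sin(2.0 * math.pi * 15.0 * k / fs))
         for k, v in enumerate(smooth)]
```

with these signals, `smoothness(smooth, fs)` is larger than `smoothness(jerky, fs)`, because the ripple adds a second lobe to the spectrum and therefore lengthens the arc.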
smoothness was calculated for the upward trend of the taps, because it corresponds partially to both opening and closing but does not include the moment when the fingers are closed, which may cause some changes in the signal and thus introduce error. the procedure was repeated for all the taps, which were previously segmented. for each subject we calculated the total measure of tapping smoothness, expressed as descriptive statistics (average ± std.dev), and the trend of change in smoothness across all segmented taps, represented by the slope of the fitted linear regression line across the corresponding smoothness characteristic (the red dashed line in fig. 6).

fig. 6 sparc smoothness characteristic with corresponding slope (red dashed line) for one ctrl subject (top panel) and two pd patients (middle and bottom panels). dashed blue rectangle marks detected change in movement smoothness.

statistical analysis

the two groups were compared using the t-test for two independent samples (if both groups satisfied the normal distribution) or the mann-wilcoxon test (if the distributions were not normal). differences were considered statistically significant at p < 0.05 (2-tailed tests). statistical analysis was performed in spss v17.0 (chicago, il).

3. results

by observing the examples of recorded gyro signals (fig. 2), one can notice that the healthy subject had rapid and vigorous performance. patient pd1 performed even more rapidly, but less vigorously, less rhythmically and with noticeable amplitude changes within the signal, as the consequence of a motor block that occurred during the performance. on the other hand, patient pd2 had slower and non-smooth but more rhythmical tapping performance.
descriptive statistics (average ± std.dev) for the parameters expressing duration of tapping performance, tapping cadence and angles, summarized for all the participants, as well as the statistical differences between the two groups, are given in table 1. distributions of the introduced parameters are shown in fig. 7. although those parameters show statistically significant differences between the groups (the grey shaded cells in table 1), they cannot provide information about changes in tapping shape and the appearance of specific transient events, and therefore they are not suitable for the detection or description of such noticeable characteristics of tapping performance. because of that, the evaluation of the tapping pattern needs to be supplemented with the frequency analysis of gyro data.

table 1 descriptive statistics of finger tapping duration, cadence and angle for both ctrl and pd subjects

param.           ctrl (av±std)     pd (av±std)       p-value
tt [s]           0.32 ± 0.07       0.65 ± 0.41       0.001
ct [taps/15 s]   49.00 ± 13.02     30.40 ± 17.22     0.001
αt [°]           61.88 ± 18.18     39.53 ± 18.74     0.024

in order to provide the complete analysis of tapping data, we applied cwt, sparc and welch's method of spectral estimation to the 15 s long sequences of the signal. continuous wavelet transformation has an important role in the detection and localization of anomalies that may appear within the movement signal. patient pd1 had some changes in the tapping motion which are obvious from the raw gyro signal (marked with the solid red rectangle in fig. 4). by using the cwt method, this disturbance can be described in terms of the degradation level (below 25% of the maximum performing energy) and duration. however, the suggested technique also allowed detection of another, less noticeable tapping "anomaly" (marked with the dashed blue rectangle, around 12 s), which could otherwise be left unnoticed.
by combining the csa-ttot function with a color-coded illustrative representation of cwt coefficients such as the 3d scalogram (fig. 3), clinicians can assess anomalies in tapping performance, localize them in time and evaluate the duration and severity of those disturbances. by using parameters extracted from welch's algorithm of spectral estimation, tap-to-tap variability can be assessed. the sparc algorithm allowed calculation of tapping smoothness and its decrement in time. the combined frequency analysis of all three performed methods can provide clinicians with crucial information about tapping performance that can be used for further analysis, or assistance in diagnostics.

the applied analysis is summarized in table 2, showing descriptive statistics (average ± std.dev) for the listed frequency parameters for all the subjects, as well as the statistical difference between the two groups. a statistically significant difference between pd patients and healthy subjects was found for all the parameters (except the slope of sparc). in addition, for all ctrl subjects the value of the cwt<25 parameter was equal to zero, indicating that none of them had severe energy loss below 25%, as opposed to pd patients, who demonstrated the appearance of those anomalies lasting up to 5 s. this indicates that cwt based evaluation is suitable for finger tapping quantification, with potential for differential diagnostics.

table 2 descriptive statistics of cwt, welch and sparc based parameters of finger tapping for both ctrl and pd subjects

param.        ctrl (av±std)       pd (av±std)       p-value
cwt<50 [s]    1.02 ± 1.49         5.23 ± 3.26       <0.001
cwt<25 [s]    0.00 ± 0.00         0.94 ± 1.74       0.023
f [hz]        3.47 ± 0.92         2.10 ± 1.21       0.002
h [psd]       1.34 ± 0.29         1.14 ± 0.39       0.039
s [psd/hz]    3.42 ± 0.70         2.90 ± 1.09       0.042
w [hz]        0.39 ± 0.04         0.42 ± 0.07       0.041
sparc         -3.13 ± 0.13        -3.69 ± 0.70      0.001
sparcs        -0.0005 ± 0.003     -0.03 ± 0.05      0.373

the distributions of cwt<50 and the four psd based parameters for the two groups of subjects (ctrl and pd) are shown in fig. 7. sparc smoothness parameter distributions are presented for 10 randomly selected healthy subjects and 10 pd patients with different patterns of tapping performance, and are shown in the form of a boxplot in the bottom panel of fig. 7. based on the presented results of the applied sparc analysis, it can be seen that healthy subjects have small intra- and inter-subject variability of tapping smoothness. on the other hand, patients with pd have a wider range of the sparc index within their tapping patterns (intra-variability) as well as within the group (inter-variability). this finding shows that the sparc parameter is suitable for the analysis of tapping performance and has potential for differential diagnostics.

fig. 7 boxplot representation of all listed parameters for both ctrl subjects and pd patients.

4. discussion and conclusion

tapping performance can be described with temporal and spatial parameters, describing tapping duration and cadence and the angle between fingers at maximum opening. although the mentioned characteristics of tapping performance can be used for distinction between healthy individuals and patients (table 1), they are not suitable for the detailed analysis of changes that may occur within tapping performance, movement variability and smoothness.
therefore, the analysis should be supplemented with other techniques that can provide such evaluation of tapping performance. in this paper, three frequency based methods were applied to the gyro signal acquired from one miniature sensor mounted on the subject's index finger, and the results of the performed techniques are used for quantification of finger tapping performance. by implementing the continuous wavelet transform, the frequency content of the signal can be observed over time (fig. 3), but also analyzed in terms of energy changes that can be useful for anomaly detection (the solid red rectangle in fig. 4). two cwt based parameters expressing the duration of energy loss below 50% and 25% proved to be statistically different between groups (the grey shaded cells in table 2). in previous research studies, the smaller slope and larger width of the dominant frequency within welch's power spectral density function were defined as indicators of greater signal intra-variability. the most prominent peak of the psd function was described with the f, h, s and w parameters, which proved to be statistically different between the two groups of subjects (the grey shaded cells in table 2). for the pd group, the smaller slope and higher values of the width parameter compared to the ctrl group indicate prominent tapping intra-variability for pd patients. this finding agrees with the results of weiss et al., obtained on gait data [5]. the sparc based parameter provides the assessment of movement smoothness, whereby bigger values indicate smoother movements. in this paper, it was demonstrated (table 2, fig. 7) that pd patients have decreased movement smoothness, with a statistically significant difference from healthy subjects. by implementing this method, the patient's motion smoothness and its decrement in time can be assessed. also, the combined analysis of these methods allows detection of some changes (the dashed blue rectangle in fig. 4 and fig.
6), which are not obvious from the gyro signal, and therefore can be overlooked. based on the presented analysis, finger tapping can be quantified in terms of its rhythmic behavior, the vigor of its performance, tapping intra-variability, tremor and motor blocks that can occur within the tapping performance. these methods allow monitoring of the patient's response to therapy and the progress of the disease, and comparison with other evaluated patients. in the future, the defined parameters will be complemented with additional parameters which can provide the complete assessment of tapping movement. the designed methodology will be implemented in an automated differential diagnostic system.

acknowledgment: this work was partially supported by the serbian ministry of education, science and technological development under grant no. 175016, grant no. 175090 and grant "pavle savic" bilateral collaboration with france. we would also like to thank phd student minja belić for assisting with recordings.

references
[1] v. n. bobić, m. d. djurić-jovičić, n. jarrasse, m. ječmenica-lukić, i. n. petrović, s. m. radovanović, n. dragašević and v. s. kostić, "frequency analysis of repetitive finger tapping – extracting parameters for movement quantification", in proceedings of the 3rd international conference on electrical, electronic and computing engineering (icetran 2016), zlatibor, serbia, june 13-16, 2016, pp. mei2.2 1-5.
[2] c. duval, "rest and postural tremors in patients with parkinson's disease", brain research bulletin, vol. 70, no. 1, pp. 44-48, 2006.
[3] h. c. powell, m. a. hanson and l. john, "on-body inertial sensing and signal processing for clinical assessment of tremor", biomedical circuits and systems, ieee transactions on, vol. 3, no. 2, pp. 108-116, 2009.
[4] e. rocon, j. l. pons, a. o. andrade and s. j.
nasuto, "application of emd as a novel technique for the study of tremor time series", in proceedings ieee eng med biol soc conf, 2006, pp. 6533-6536.
[5] a. weiss, s. sharifi, m. plotnik, j. p. van vugt, n. giladi and j. m. hausdorff, "toward automated, at-home assessment of mobility among patients with parkinson disease, using a body-worn accelerometer", neurorehabilitation and neural repair, vol. 25, no. 9, pp. 810-818, 2011.
[6] m. sekine, m. akay, t. tamura, y. higashi and t. fujimoto, "fractal dynamics of body motion in patients with parkinson's disease", journal of neural engineering, vol. 1, no. 1, pp. 8, 2008.
[7] s. t. moore, h. g. macdougall and w. g. ondo, "ambulatory monitoring of freezing of gait in parkinson's disease", j. neurosci. methods, vol. 167, no. 2, pp. 340-348, 2008.
[8] i. shimoyama, t. ninchoji and k. uemura, "the finger-tapping test: a quantitative analysis", arch neurol, vol. 47, no. 6, pp. 681-684, 1990.
[9] g. strang, "wavelet transforms versus fourier transforms", bulletin of the american mathematical society, vol. 18, pp. 288-305, 1993.
[10] t. m. e. nijsen, p. j. m. cluitmans, p. a. m. griep and r. m. aarts, "short time fourier and wavelet transform for accelerometric detection of myoclonic seizures", embs benelux symposium, pp. 155-158, december 7-8, 2006.
[11] a. napieralski, z. ciota, m. janicki, m. kamiński, r. kotas, p. marciniak, a. mielczarek, m. napieralska, r. ritter, b. sakowicz, w. tylman and m. zubert, "examples of medical software and hardware expert systems for dysfunction analysis and treatment", facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp. 29-50, 2014.
[12] m. a. hanson and l. john, "assessing joint time-frequency methods in the detection of dysfunctional movement", in proceedings of the fortieth asilomar conference on signals, systems and computers, acssc'06, 2006.
[13] b. xu, a. song and j. wu, "algorithm of imagined left-right hand movement classification based on wavelet transform and ar parameter model", in proceedings of the 1st int. conf. on bioinformatics and biomedical engineering, icbbe 2007, 6-8 july 2007, pp. 539-542.
[14] m. d. djuric-jovicic, v. n. bobic, m. jecmenica-lukic, i. n. petrovic, s. m. radovanovic, n. s. jovicic, v. s. kostic and m. b. popovic, "implementation of continuous wavelet transformation in repetitive finger tapping analysis for patients with pd", in proc of the 22nd telecommunications forum telfor 2014, ieee, 2014, pp. 541-544.
[15] j. jankovic and j. d. frost, "quantitative assessment of parkinsonian and essential tremor clinical application of triaxial accelerometry", neurology, vol. 31, no. 10, pp. 1235-1235, 1981.
[16] s. balasubramanian, a. melendez-calderon, a. roby-brami and e. burdet, "on the analysis of movement smoothness", journal of neuroengineering and rehabilitation, vol. 12, no. 1, pp. 1, 2015.
[17] s. fahn and r. l. elton, "unified parkinsons disease rating scale", in: s. fahn, c. d. marsden, m. goldstein and d. b. calne, recent developments in parkinsons disease ii, committee mot ud, new york: macmillan, pp. 153-63, 1987.
[18] á. jobbágy, p. harcos, r. karoly and g. fazekas, "analysis of finger-tapping movement", journal of neuroscience methods, vol. 141, pp. 29-39, 2005.
[19] m. djurić-jovičić, i. petrović, m. ječmenica-lukić, s. radovanović, n. dragašević-mišković, m. belić, v. miler-jerković, m. b. popović and v. s. kostić, "finger tapping analysis in patients with parkinson's disease and atypical parkinsonism", journal of clinical neuroscience, vol. 30, pp. 49-55, 2016.
[20] s. r. muir, r. d. jones, j. h. andreae and i. m. donaldson, "measurement and analysis of single and multiple finger tapping in normal and parkinsonian subjects", parkinsonism & related disorders, elsevier science ltd, great britain, vol. 1, no. 2, pp. 89-96, 1995.
[21] n. s. jovičić, l. v. saranovac and d. b.
popović, "wireless distributed functional electrical stimulation system", journal of neuroengineering and rehabilitation, vol. 9, no. 1, pp. 1-10, 2012.
[22] s. balasubramanian, a. melendez-calderon and e. burdet, "a robust and sensitive metric for quantifying movement smoothness", ieee transactions on biomedical engineering, vol. 59, no. 8, pp. 2126-2136, 2012.
[23] v. crocher, j. fong, m. klaic, d. oetomo and y. tan, "a tool to address movement quality outcomes of post-stroke patients", in replace, repair, restore, relieve – bridging clinical and engineering solutions in neurorehabilitation. springer international publishing, 2014, pp. 329-339.
[24] s. estrada, m. k. o'malley, c. duran, d. schulz and j. bismuth, "on the development of objective metrics for surgical skills evaluation based on tool motion", in proceedings of the 2014 ieee international conference on systems, man, and cybernetics. ieee, 2014, pp. 3144-3149.

facta universitatis
series: electronics and energetics vol. 30, no 2, june 2017, pp. 145-160
doi: 10.2298/fuee1702145e

load sharing methods for inverter-based systems in islanded microgrids - a review

augustine m. egwebe, meghdad fazeli, petar igic, paul holland
electronic system design center at college of engineering, swansea university, wales

abstract. this paper explores and discusses various design considerations for inverter-based systems. different load sharing techniques are presented for the integration of renewable energy sources within islanded microgrids. in off-grid connection, renewable energy sources are often configured to share power based on their rated capacity. this paper explores both conventional and dynamic load sharing interaction between distributed generation units, in both inductive (high voltage) and resistive (low voltage) networks. load sharing based on the proper design of virtual impedance is also reviewed.
key words: distributed generation, microgrids, droop control, virtual impedance, photovoltaic, renewable sources.

received october 28, 2016
corresponding author: augustine marho egwebe, electronic system design center at college of engineering, swansea university, wales (e-mail: augustine.egwebe@swansea.ac.uk)

1. introduction

the need for clean and reliable energy generation has propelled global activity in various spheres of human endeavor to develop alternative sources of energy. the provision of affordable, reliable and sustainable access to energy in different forms remains one of the key challenges of economic and social development, especially in developing countries [1, 2]. while it may be practically impossible to eliminate conventional nuclear and fossil fueled steam turbines, renewable energy sources (res) offer huge prospects to ease the ever-increasing demand burden on large, centralized conventional power systems. a vast reduction of greenhouse gas emissions can also be achieved via res integration with the existing electricity grid networks [3]. distributed generation is a term commonly used to describe small-scale and modular power generation sources that are located close to the distribution network rather than large power stations connected to the high voltage transmission network [4, 5]. distributed generators (dg) include small-scale fossil and renewable energy generation technologies including wind, photovoltaic, micro-hydro-turbines, biogas, geothermal, tidal and steam turbines, with supplementary storage devices like fuel cells and batteries. dg therefore serves as a contrast to conventional large power stations that use a small number of large-scale, frequency controlled generators; it offers improved power quality and enhanced system security, mitigates issues like blackouts and gives better control over the cost of energy [6].
with distributed generation, consumers now have some degree of flexibility in their energy utilization [7]. increased penetration of green renewable energy requires high-level engineering prowess in maintaining and improving the technologies that make them effective, durable and sustainable [8]. the integration of res with the existing power network mainly involves the strategies and schemes employed via the use of technologies, processes, and advanced control protocols to balance the production and demand of electrical energy within the network. these control schemes enhance the reliability of energy supply irrespective of the intermittent nature of the renewable source (i.e. fluctuating sunshine or wind profile). they include strategies for the optimal harnessing of the available renewable power, effective energy management, power/voltage device control, intelligent control of energy transformation, islanding detection, and line fault management [7-9]. to balance generated energy with demand in modern microgrids, various renewable sources and converters are often interconnected for load sharing and complementary energy support. renewable power generation units are also often supplemented with dispatchable resources such as energy storage systems and local auxiliary generators; the absence of such resources can result in the malfunctioning of the inverter-based sources (ivbs) [10]. an intermediate solution to some of the problems with the integration of dg with the existing power network is the concept of the microgrid shown in fig. 1: an electrical distribution system using distributed energy resources such as generators, storage devices and controllable loads, which are coordinated when connected to the main power network or operated in islanded mode [9, 10].
in grid-connected mode, control measures are relatively easy to implement since the utility grid regulates voltage and frequency for loads within the microgrid; whereas in islanded mode voltage and frequency must be actively controlled for the continuous and stable performance of the network [11-13]. the microgrid, when operated in islanded mode, must be able to integrate and coordinate several energy resources with appropriate voltage-frequency control strategies. the electrical power generated from dgs must be well regulated to suit sensitive non-linear loads within the distribution level (i.e. computers, motor drives, battery chargers), without causing unregulated constraint on the generator. the control measures in dgs also aim to offer greater power quality control and the low voltage ride through required for eliminating transient stability issues [14, 15]. additionally, microgrids help to reduce congestion on the utility grid, serve as an uninterrupted supply for critical loads, encourage the localized generation of power on the consumer side and offer extra support regarding voltage support, demand response as well as spinning reserve via inbuilt storage devices [16]. the hierarchical control approach employed in a microgrid allows autonomously coordinated generation output from dgs and energy storage systems while ensuring appropriate load sharing and interaction with the national grid (ng) [17]. in the absence of free generation capacity in the system due to each dg hitting its maximum generating capacity, the microgrid should be self-sustainable without violating sensitive network parameters like voltage and frequency [18]. a microgrid can connect to and disconnect from the ng to enable it to operate in either grid-connected or islanded mode using the microgrid central switch shown in fig. 1. a required basic characteristic of the microgrid is seamless “islanding” and “reconnection” from/to the ng.
disconnection can be the result of grid events, which include faults, voltage collapse, and blackout [8, 19]. all dgs within the microgrid must be well regulated to present peer-to-peer and plug-and-play characteristics.

fig. 1 an example of a microgrid network (elements shown: pv arrays, small wind turbine, energy storage bank, diesel generator, domestic houses, small industries, local amenities e.g. hospital, transformer, mv network (10 kv), microgrid central switch; lv: low voltage, mv: medium voltage, mc: microsource controller, mcc: microgrid central controller)

islanding detection of distributed generation systems, voltage regulation, protection, power quality improvement and stability of the power system network are some of the technical challenges facing dgs as they are increasingly connected to microgrids. accurate and intelligent controller design is thus required to ensure swift interaction with connected loads and the microgrid while ensuring system stability when disconnecting from the ng in the case of a fault or disturbance [20]. in this paper, a thorough survey of the core components and techniques required for the effective integration of dgs in an islanded microgrid is presented. control paradigms to facilitate the efficient load sharing, operation, and energy management of dgs in a microgrid are also presented.

2. inverter-based systems

inverter-based systems (ivbs) play a vital role in the effective integration of renewables with the microgrid at a synchronized system frequency. ivbs are commonly employed for switching dc voltages from renewable sources to ac voltages supplied to the microgrid and locally connected loads [19, 21]. monitoring and control functionality are essential requirements for the power electronics interface used in ivbs, so as to ensure the protection of the dg system as well as to meet the connection specifications of the ng [19].
active power, reactive power, voltage and frequency at the point of common connection are some of the critical monitoring parameters for these types of systems, as shown in fig. 2. also, proper conditioning of voltage and current ensures successful control of power flow per specific power references under varying load or dg input sources. motor drives and distributed generation systems use ivbs due to their inherent advantages of adjustable power factor, low total harmonic distortion (thd), and high efficiency.

fig. 2 generic inverter-based system (blocks and labels shown: power controller, gv-pr(s), gc-pr(s), abc/αβ transformations, pwm, inverter, lf, cf, ll, vdc)

a conventional inverter circuit consists of controllable transistorized switches, such as igbts with parallel diodes to provide a bypass path for transient currents, as shown in fig. 3.a. the three-phase igbt bridge circuit operates according to the control signal (vcon) generated by the control algorithm of the controller, as shown in fig. 3.b. the three-phase igbt-based inverter in fig. 3 consists of six switching devices (q1-q6), which are directly controlled by pulse width modulation (pwm) signals (s1 through s6) to be on (closed) or off (open) according to a well-structured switching pattern to produce the desired output ac waveforms [23, 24].

fig. 3 (a) three-phase bridge inverter [22]; (b) spwm control signal generator [22] (labels shown: dc link vdc, switches q1-q6 with gate signals s1-s6, phase outputs a, b, c, reference sine-wave generator (vcon), carrier triangular wave (sets switching frequency, fsw), comparator)

the use of pwm switching together with closed-loop voltage and current controllers produces a sinusoidal output current in phase with the grid voltage, with thd aligned to grid regulations.
conventional grid-mode ivbs in pv applications ensure that: (1) pv modules operate at maximum power point (mpp); (2) the injected ac current into the grid is sinusoidal, with consideration for the ieee 1547 demand standards for grid connection. these standards include issues such as power quality, islanding detection mode, grounding and harmonics. one of the challenges of switching ivbs at high frequencies (2-20 khz) is the creation of high-order harmonics. thd in current and voltage can lead to low power factor, overheating of distribution system components, mechanical oscillation in generators and motors, poor performance of communication equipment, and unpredictable behavior of security protection systems [22, 23, 25]. the low-pass filter connected to the output of the inverter helps to prevent the injection of high-frequency harmonics into the ac bus [23, 25]. line frequency transformers are used for galvanic isolation when interfacing the microgrid with the ng, as shown in fig. 1. sinusoidal pulse width modulation (spwm) is the simplest continuous carrier-based pwm method for generating pulses, and for switching inverter-based devices with a fundamental frequency of 50 or 60 hz. the main objectives of any modulation scheme are: (1) lower switching losses, (2) reduced thd of output current, (3) minimized computational switching time, (4) better dc bus utilization, and (5) easy digital implementation [26]. in the spwm-based system in fig. 2, the three-phase fundamental components of the ac output voltage of the inverter are given by (1) [27-31].

$$
\begin{aligned}
v_{ia} &= 0.5\, m_a v_{dc} \cos(\omega_0 t) \\
v_{ib} &= 0.5\, m_b v_{dc} \cos(\omega_0 t - 120^\circ) \\
v_{ic} &= 0.5\, m_c v_{dc} \cos(\omega_0 t + 120^\circ)
\end{aligned}
\qquad (1)
$$

where ma, mb, mc = modulation index per phase; vdc = dc link voltage; and ω0 = fundamental angular frequency of the system.
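equation (1) can be evaluated numerically with a few lines of python. the sketch below is illustrative only: it assumes the usual -120°/+120° phase displacement for phases b and c, and the function name and the numerical values (m = 0.8, vdc = 700 v, 50 hz) are hypothetical.

```python
import math


def phase_voltages(m_a, m_b, m_c, v_dc, omega_0, t):
    """Fundamental components of the inverter output voltage, eq. (1).

    Assumption: phase b lags phase a by 120 degrees and phase c leads by 120 degrees.
    """
    shift = 2.0 * math.pi / 3.0  # 120 degrees in radians
    v_a = 0.5 * m_a * v_dc * math.cos(omega_0 * t)
    v_b = 0.5 * m_b * v_dc * math.cos(omega_0 * t - shift)
    v_c = 0.5 * m_c * v_dc * math.cos(omega_0 * t + shift)
    return v_a, v_b, v_c


# Hypothetical operating point: equal modulation index on all phases.
omega_0 = 2.0 * math.pi * 50.0
v_a0, v_b0, v_c0 = phase_voltages(0.8, 0.8, 0.8, 700.0, omega_0, t=0.0)
```

with equal modulation indices the three components sum to zero at every instant, and the peak phase voltage equals 0.5·m·vdc (280 v for the values used here).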
by using the vector-control approach, (1) can be represented by its αβ-components as shown in (2), which offers better steady-state tracking performance for the proportional-resonant (pr) controller.

$$
v_{i\alpha} = 0.5\, m_\alpha v_{dc}, \qquad v_{i\beta} = 0.5\, m_\beta v_{dc} \qquad (2)
$$

the magnitude of the ac output voltage of the dg in (2) is provided in (3).

$$
v_{i\text{-}\alpha\beta} = 0.5\, v_{dc} \sqrt{m_\alpha^2 + m_\beta^2} = 0.5\, m\, v_{dc} \qquad (3)
$$

in (3), the fundamental component of the ac output voltage is thus controlled by controlling the inverter amplitude modulation index m, where m is defined as the ratio of the amplitude of the modulating signal to that of the carrier signal. the inverter switching process works well for 0 < m < 1, which prevents unwanted harmonic distortion [23, 32, 33]. the dc link voltage vdc of the inverter-based source must satisfy (3) to avoid pwm over-modulation and to ensure the stable operation of the dg in a microgrid. however, when there is a reduction in the renewable energy resource level (i.e. wind or solar irradiance level), and hence a decrease in vdc, m must increase to maintain vi-αβ in (3). at m = 1, vi-αβ depends solely on vdc. therefore, when designing the ivbs, the minimum dc link voltage that satisfies (3) must be taken into consideration. the closed-loop control scheme of each inverter-based dg in an islanded microgrid is shown in fig. 2. the direct proportional-resonant (pr) control approach can be used to simplify fig. 2, as shown in the block diagram representation in fig. 4.

fig. 4 block diagram of the closed-loop inverter-based source [20, 34] (blocks shown: controllers gv-pr(s) and gc-pr(s), pwm inverter gpwm(s), lc filter 1/(slf + rf) and 1/(scf); signals: vo*, il*, vi, il, ic, io, vo, vdc)

in the closed-loop dg model in fig. 4, an outer voltage loop gv-pr is used to control the output voltage of the inverter.
the main control objective of gv-pr is to keep the dg voltage clean, balanced and as close as possible to the given sinusoidal reference voltage so that the thd of the output voltage is minimized. the voltage reference is compared with the measured voltage in the αβ-frame to produce an error signal. the error signal is fed into a pr compensator, which in turn generates the current reference signal for the inner current loop. similarly, in the inner current controller gc-pr in fig. 4, the reference current from the outer voltage loop is compared with the measured output current. the error signal is fed into a pr controller to generate the reference signal for the pwm generator. the controlled output wave from the current controller is transformed back to the abc-frame using the αβ/abc-coordinate transformation, to generate the reference control signal for the inverter switching devices. the inner current controller is usually designed with a much higher bandwidth than the outer voltage loop to achieve a fast dynamic response. in general, the voltage and current controllers are designed to provide nearly perfect sinusoidal output voltage waveforms at a nominal switching frequency, to offer good damping for the output filter of the inverter, and to reject high-frequency disturbances.
fig. 5 droop controlled power sharing for islanded dg [30] (blocks: calculation of p and q from vcα, vcβ, ioα and ioβ; low-pass filters ωf/(s + ωf); droop terms mp·p and nq·q applied to ω* and v*; integrator 1/s giving θ from ω0)
the dynamics of the control scheme depend mainly on the bandwidth of the pq controller shown in fig. 4, since the bandwidths of the current and voltage controllers are designed to be much higher than that of the pq controller [25]. the power controller block is used for accurate sharing of p and q according to the droop characteristics as shown in fig. 5 [10]. the low-pass filter with cutoff frequency ωf is used to extract the average powers as shown in fig. 5.
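the power calculation and averaging stage of fig. 5 can be sketched as follows. this is a minimal illustration, not the authors' implementation: the sign convention q = vcβ·ioα − vcα·ioβ, the absence of any scaling factor, the backward-euler discretization of ωf/(s + ωf), and all numerical values are assumptions made here.

```python
import math


def instantaneous_pq(v_alpha, v_beta, i_alpha, i_beta):
    """Instantaneous active and reactive power from alpha-beta quantities.

    Assumed convention: q is positive when the current lags the voltage.
    """
    p = v_alpha * i_alpha + v_beta * i_beta
    q = v_beta * i_alpha - v_alpha * i_beta
    return p, q


class LowPass:
    """First-order low-pass filter omega_f / (s + omega_f), backward-Euler discretized."""

    def __init__(self, omega_f, t_s, y0=0.0):
        # alpha lies in (0, 1), so the recursion below is always stable
        self.alpha = omega_f * t_s / (1.0 + omega_f * t_s)
        self.y = y0

    def step(self, x):
        self.y += self.alpha * (x - self.y)
        return self.y


# hypothetical test signal: 100 V, 10 A, current lagging the voltage by 30 degrees
theta, phi = 0.3, math.pi / 6.0
p, q = instantaneous_pq(100.0 * math.cos(theta), 100.0 * math.sin(theta),
                        10.0 * math.cos(theta - phi), 10.0 * math.sin(theta - phi))

# average q with an assumed 5 Hz cutoff and a 100 microsecond sample time
q_filter = LowPass(omega_f=2.0 * math.pi * 5.0, t_s=1e-4)
for _ in range(10000):
    q_avg = q_filter.step(q)
print(p, q, q_avg)
```

for balanced sinusoidal inputs p and q are constant, so the filter output simply settles to the input; with distorted or unbalanced waveforms the filter removes the resulting power ripple before the droop stage.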
the non-ideal pr controller adopted in this paper overcomes two well-known drawbacks of the conventional pi controller: (1) the inability to track a sinusoidal reference with zero steady-state error, and (2) poor disturbance rejection capability. an ideal pr controller has infinite gain at the fundamental frequency [35], which reduces the steady-state error to zero. equation (4) shows the transfer function of the adopted practical non-ideal pr controller, which has a finite but high gain at the ac line frequency.
$$ g_{pr}(s) = k_p + \frac{2 k_i \omega_c s}{s^2 + 2 \omega_c s + \omega_0^2} \tag{4} $$
fig. 6 frequency response of a pr controller for kp = 0.01, ki = 1 and ωc = 1, 5, 15, 25, 50, 100 rad/s
the frequency response of (4) shows a wider bandwidth around the 50 hz resonant frequency, which reduces the sensitivity to slight frequency variations caused by load disturbances. the pr controller's bandwidth can be varied with the damping factor ωc as shown in fig. 6. it can be seen that ωc has an effect on both the magnitude and phase of the controller. when choosing ωc, there has to be a compromise between reduced sensitivity to frequency variation and steady-state error.
2.1. current controller design
the design objective of the current controller is to have a high loop bandwidth with sufficient stability margins. from control theory, systems with greater gain margins can withstand greater changes in system parameters before becoming unstable in closed loop. when designing via frequency response analysis, the goal is to predict the closed-loop behavior from the open-loop response of the current control loop shown in fig. 4. the closed-loop transfer function of the current loop, with the output current treated as a disturbance, is given in (5) [36]. feedforward terms are added to the current loop in order to decouple the αβ components of the output voltage [37].
$$ g_c(s) = \frac{i_l}{i_l^*} = \frac{g_{c-pr}(s)\, g_{pwm}(s)\, g_{lf}(s)}{1 + g_{c-pr}(s)\, g_{pwm}(s)\, g_{lf}(s)} \tag{5} $$
where gc-pr is the pr current controller; glf is the transfer function of the lc filter; gpwm(s) = 1 / (1 + 1.5 ts s) represents the pwm and computational delay with respect to the sampling period ts. by setting ωc in (4) equal to 10 rad/s, (5) can be tuned for a closed-loop bandwidth of 1 khz to give kp and ki of 12.5 Ω and 250 Ω respectively. note that the bandwidth of (5) is usually selected as one-tenth of the switching frequency.
2.2. voltage controller design
the voltage controller is also based on the pr structure discussed in (4), where a generalized integrator is used to achieve zero steady-state error. the closed-loop dynamic behavior of the dg in fig. 4 is approximated by an equivalent thevenin equation as given in (6):
$$ v_o = \frac{g_{pwm}(s)\, g_{c-pr}(s)\, g_{v-pr}(s)}{a_x s^2 + b_x s + c_x}\, v_o^* - \frac{l_f s + r_f + g_{pwm}(s)\, g_{c-pr}(s)}{a_x s^2 + b_x s + c_x}\, i_o = g_o(s)\, v_o^* - z_o(s)\, i_o \tag{6} $$
where ax = lf cf; bx = (rf + gpwm(s) gc-pr(s)) cf; cx = gpwm(s) gc-pr(s) gv-pr(s); gv-pr(s) is the pr capacitor voltage controller; rf is the parasitic resistance of the filter inductor; go(s) is the closed-loop control transfer function; zo(s) is the output impedance.
fig. 7 shows the open-loop frequency response of the dg's voltage loop when the output current is treated as a disturbance; the positive gain margin (46.4 db) and phase margin (79.3 degrees) both confirm the stability of the overall system. the bandwidth of the voltage controller is tuned to be about one-fifth of the bandwidth of the current controller as shown in the closed-loop frequency response in fig. 8, to give kp and ki values of 0.5 Ω and 7 Ω respectively.
fig. 7 open-loop frequency response of the dg voltage loop
fig. 8 closed-loop frequency response of the dg voltage loop
2.3.
virtual impedance design for p and q decoupling
fig. 9 block diagram representation of virtual impedance loop
in order to ensure a stable output impedance of the dg, the output dg voltage is dropped proportionally with the output current as shown in fig. 9 and explained in (7):
$$ v_o = g_o(s)\, v_o^* - \left( g_o(s)\, z_v(s) + z_o(s) \right) i_o \tag{7} $$
the output impedance of the inverter is re-designed to mitigate the influence of control parameters and line impedance on the power-sharing accuracy around the fundamental frequency as shown in fig. 9, so that the power is shared precisely between the distributed ivbs [34]. references [21, 34, 38, 39] propose a design scheme to eliminate the impact of dg output impedance on the overall system dynamics; hence the virtual impedance loop was implemented for power decoupling and for restraining the circulating current between dgs. a performance comparison of virtual impedance techniques used in droop-controlled islanded microgrids was presented in [40, 41]. it was noted that the virtual inductive loop makes the output impedance of the inverters predominantly inductive, thereby improving the power-sharing accuracy of the droop control algorithm. similarly, a virtual resistive loop increases the output impedance of the inverters such that it becomes more resistive. the overall effect of impedance mismatches is also reduced by the virtual resistance loop, thereby improving the current sharing. a virtual resistance allows sharing of linear and nonlinear loads in microgrid applications without introducing additional losses in the network and improves the stability of the microgrid [41]. according to fig. 10, the magnitude of the ivbs output impedance at the fundamental frequency is approximately zero. this shows the effectiveness of the designed control parameters of the voltage and current loops.
hence, the output impedance of the ivbs is designed to be equal to the virtual impedance around the fundamental frequency as shown in fig. 10. fig. 10 also illustrates the effect of the virtual inductance on the overall output impedance of the ivbs. as can be seen, the overall output impedance becomes more inductive as the virtual inductance increases.
fig. 10 output impedance frequency response of the ivbs with varying virtual inductance
3. conventional load sharing schemes in an islanded microgrid
load sharing without communication between the parallel dgs is the most favored option in an autonomous microgrid, as the network can be complex and can span a large geographical area [19, 42]. numerous studies have presented the droop scheme, in which parallel dgs are locally controlled to deliver the required active and reactive power to the microgrid network. by adopting the droop scheme, two local independent network quantities (voltage and frequency) are controlled to regulate active and reactive power with consideration for the allowable frequency and voltage deviation within the microgrid. the small-signal stability analysis of the droop scheme has also been explored in the literature [19, 30, 43]. one major concern with the droop scheme is its sensitivity to the imbalance in the system's closed-loop output impedance and line impedance, which can lead to poor decoupling of the active and reactive power [21, 42].
the complex power delivered to the common bus in fig. 2 can be expressed as shown in (8).
$$ s = p + jq \tag{8} $$
$$ p = \frac{v_o v_b}{z} \cos(\theta - \phi) - \frac{v_b^2}{z} \cos\theta, \quad q = \frac{v_o v_b}{z} \sin(\theta - \phi) - \frac{v_b^2}{z} \sin\theta \tag{9} $$
where p and q are the active and reactive power delivered by the dg; vo is the ac output voltage of the dg; vb is the bus voltage; z is the magnitude of the output impedance; θ is the phase angle of the output impedance; and ɸ is the power angle.
3.1.
active power-frequency droop scheme
conventionally, the output impedance is considered to be purely inductive (i.e. z ≈ jx), hence (9) is rewritten as (10).
$$ p = \frac{v_o v_b}{x} \sin\phi, \quad q = \frac{v_o v_b \cos\phi - v_b^2}{x} \tag{10} $$
the power droop controller in fig. 5 aims to adjust the frequency and voltage difference relative to increasing load in a stable manner. in an inductive-based microgrid, the droop equation is expressed as (11) and shown in fig. 11.
$$ \omega_0 = \omega^* - m_p (p - p^*); \quad v_o = v^* - n_q (q - q^*) \tag{11} $$
where ω* is the fundamental frequency; v* is the ac reference voltage; p* and q* are the reference active and reactive powers; ɸ is the power angle; p and q are the instantaneous active and reactive power of the dg. the droop gains mp and nq are calculated for a given range of frequency and voltage as shown in (12).
$$ m_p = \frac{\omega_{max} - \omega_{min}}{p_{rated}}; \quad n_q = \frac{v_{max} - v_{min}}{q_{rated}} \tag{12} $$
fig. 11 steady-state characteristic of conventional droop scheme
equation (11) indicates that the active power of the dg is dependent on the power angle, whereas the voltage amplitude difference mainly influences the reactive power. equation (13) shows the load distribution for n parallel-connected dgs in a microgrid when (11) is adopted.
$$ m_{p1} p_1 = m_{p2} p_2 = \dots = m_{pn} p_n, \quad n_{q1} q_1 = n_{q2} q_2 = \dots = n_{qn} q_n \tag{13} $$
3.2. active power-voltage droop scheme
it is noted in the literature that the performance of the conventional droop control is severely affected by the resistance-to-inductance (r/x) ratio of the output and line impedance. equation (9) is given as (14) when the output impedance of the dg is resistive (i.e.
z ≈ r):
$$ p = \frac{v_o v_b \cos\phi - v_b^2}{r}, \quad q = -\frac{v_o v_b}{r} \sin\phi \tag{14} $$
since low voltage microgrid electrical distribution networks present a high r/x ratio, the voltage amplitude is used to control active power, while reactive power is controlled by the system frequency as shown in (15).
$$ \omega_0 = \omega^* + m_p (q - q^*); \quad v_o = v^* - n_q (p - p^*) \tag{15} $$
$$ m_p = \frac{\omega_{max} - \omega_{min}}{q_{rated}}; \quad n_q = \frac{v_{max} - v_{min}}{p_{rated}} $$
3.3. virtual impedance load sharing scheme
the active and reactive power can also be controlled autonomously using the virtual impedance scheme in (16), without any requirement for an additional power controller, as studied in [43]. equation (16) ensures accurate load sharing between the dgs and compensates reactive power differences due to output voltage mismatches or line impedance mismatches. in order to avoid the steady-state frequency deviation, a pll is introduced. this way, the pll adjusts the phase of the inverter, and the system is controlled by a virtual resistance controlling current as in a dc electrical system. reference [44] proposes an autonomous load sharing scheme using the virtual resistance loop and a synchronous reference frame phase-locked loop. this scheme provides both instantaneous current sharing and fast dynamic response of the paralleled ivbs. the relationship between ioαβ and the virtual resistance rv for n dgs is given as (16).
$$ i_{o\alpha 1} r_{v1} = i_{o\alpha 2} r_{v2} = \dots = i_{o\alpha n} r_{vn}, \quad i_{o\beta 1} r_{v1} = i_{o\beta 2} r_{v2} = \dots = i_{o\beta n} r_{vn} \tag{16} $$
the small-signal analysis shows that the α-axis and β-axis output currents of paralleled inverters are inversely proportional to their virtual resistances, since the current sharing performance is influenced only by the output impedance ratio and not by the output impedance value of the dgs [43].
fig. 12 characteristics of virtual impedance droop (v-p)
3.4.
energy saving via dynamic load sharing
reference [10] presented a dynamic load sharing scheme for photovoltaic (pv) inverter-based systems in an inductive microgrid, using the pv array's current vs voltage characteristics to define an operating range for the inverter-based source. the dynamic load sharing scheme is based on the available solar power to ensure an efficient load sharing interaction with other dgs, without the need for energy support from a locally connected fossil-fuelled auxiliary generator, thereby providing significant energy saving compared with conventional static droop control techniques. in the dynamic loading scheme, the droop gains of the power controller in (11) are redefined for dynamic load interaction between the dgs [45] as follows:
$$ m_p = \frac{\Delta\omega}{p_{dc-max}}; \quad n_q = \frac{\Delta v}{q_{avail}} \tag{17} $$
where Δω and Δv are the allowed frequency and voltage ranges (ωmax − ωmin and vmax − vmin); pdc-max is the maximum available power of the pv array, which is deduced from the maximum power curve in fig. 13; qavail is the available reactive power that the dg can supply as defined in (18).
$$ q_{avail} = \sqrt{s_{rated}^2 - p_{dg}^2} \tag{18} $$
figure 14 shows the load sharing profiles of two dgs interfaced to the islanded microgrid [10]. fig. 14.b shows load sharing based on the conventional droop scheme in (11), where a drop in the available power of dg2 causes a similar drop in dg1 even though it has enough available capacity. as a result, the total generation becomes less than the load, and the auxiliary generator (ag) is triggered to supply the shortfall paux. in fig. 14.c, the load is adaptively shared based on the available pv power using (17). thus, a drop in the available energy in dg2 causes a proportional drop in its contribution to pload. similarly, dg1 dynamically compensates for this drop by supplying more power. hence no extra power is required from the ag (paux ≈ 0 in fig. 14.c).
fig.
13 steady-state characteristic of pv operating zone
fig. 14 simulation results of two dg systems using droop-based load sharing scheme showing active power sharing: (a) available solar power in pu; (b) static scheme: active power in pu; (c) dynamic scheme: active power in pu.
4. conclusion
a thorough review of the effective integration of inverter-based systems in islanded microgrids was presented in this paper. different control and load sharing methods were discussed with respect to the output impedance of the dg, and a frequency response analysis of the influence of the pr controller on the performance of the dg was also presented. in the dynamic load sharing scheme presented, the droop parameters were tuned based on the available power of the dg. the dynamic load sharing scheme offers energy savings when compared to the conventional loading scheme.
references
[1] m. olken and a. zomers. (2014, jul.-aug.) energy for all: world access to electricity. power and energy society.
[2] j. c. vasquez, j. m. guerrero, m. savaghebi, j. eloy-garcia and r. teodorescu, "voltage support provided by a droop-controlled multifunctional inverter," ieee trans. ind. electronics, vol. 56, pp. 4510-4519, oct. 2009.
[3] p. basak, s. chowdhury, s. halder, s. p. chowdhury, "a literature review on integration of distributed energy resources in the perspective of control, protection and stability of microgrid," renewable and sustainable energy reviews, vol. 16, pp. 5545-5556, 2012.
[4] n. jenkins, r. allan, p. crossley, d. kirshen, and g. strbac, embedded generation. london: the institute of electrical engineers, 2000.
[5] y. levron, j. m. guerrero and y. beck, "optimal power flow in microgrids with energy storage," ieee transactions on power systems, vol. 28, pp. 3226-3234, 2013.
[6] m. milligan, b. frew, b. kirby, m. schuerger, k. clark, d. lew, p. denholm, b. zavadi, m. o’malley, and b.
tsuchida, "alternatives no more: wind and solar power are mainstays of a clean, reliable, affordable grid," ieee power and energy magazine, vol. 13, pp. 78-87, 2015.
[7] p. k. olulope, k. a. folly, and g. k. venayagamoorthy, "modeling and simulation of hybrid distributed generation and its impact on transient stability of power system," in proc. of the 2013 ieee international conference on industrial technology (icit), 2013, pp. 1757-1762.
[8] q. fu, a. hamidi, a. nasiri, v. bhavaraju, s. b. krstic, and p. theisen, "the role of energy storage in a microgrid concept: examining the opportunities and promise of microgrids," ieee electrification magazine, vol. 1, pp. 21-29, 2013.
[9] g. a. jiménez-estévez, "energy access challenge: it takes a village," ieee power and energy society trans., vol. 12, pp. 60-69, 2014.
[10] a. m. egwebe, m. fazeli, p. igic, and p. m. holland, "implementation and stability study of dynamic droop in islanded microgrids," ieee transactions on energy conversion, vol. 31, pp. 821-832, 2016.
[11] j. rocabert, g. m. s. azevedo, a. luna, j. m. guerrero, j. i. candela, and p. rodríguez, "intelligent connection agent for three-phase grid-connected microgrids," ieee trans. power electronics, vol. 26, pp. 2993-3005, oct. 2011.
[12] j. y. kim, j. h. jeon, s. k. kim, c. cho, j. h. park, h. m. kim, and k. y. nam, "cooperative control strategy of energy storage system and microsources for stabilizing the microgrid during islanded operation," ieee transactions on power electronics, vol. 25, pp. 3037-3048, 2010.
[13] m. fazeli, g. m. asher, c. klumpner, l. yao, "novel integration of dfig-based wind generators within microgrids," ieee trans. energy conversion, vol. 26, pp. 840-850, aug. 2011.
[14] l. yun wei and k. ching-nan, "an accurate power control strategy for power-electronics-interfaced distributed generation units operating in a low-voltage multibus microgrid," ieee transactions on power electronics, vol. 24, pp. 2977-2988, 2009.
[15] c.
trujillo rodriguez, d. velasco de la fuente, g. garcera, e. figueres, and j. a. guacaneme moreno, "reconfigurable control scheme for a pv microinverter working in both grid-connected and island modes," ieee trans. ind. electronics, vol. 60, pp. 1582-1595, nov. 2013.
[16] l. gao, r. a. dougal, s. liu, and a. p. lotova, "parallel-connected solar pv system to address partial and rapidly fluctuating shadow conditions," ieee transactions on industrial electronics, vol. 56, pp. 1548-1556, 2009.
[17] b. homchaudhuri and m. kumar, "market based allocation of power in smart grid," in proceedings of the 2011 american control conference, 2011, pp. 3251-3256.
[18] n. s. wade, p. c. taylor, p. d. lang, and p. r. jones, "evaluating the benefits of an electrical energy storage system in a future smart grid," energy policy, vol. 38, pp. 7180-7188, 2010.
[19] r. majumder, b. chaudhuri, a. ghosh, g. ledwich, and f. zare, "improvement of stability and load sharing in an autonomous microgrid using supplementary droop control loop," ieee power and energy society general meeting, pp. 1-1, jul. 2010.
[20] x. wang, f. blaabjerg, and z. chen, "an improved design of virtual output impedance loop for droop-controlled parallel three-phase voltage source inverters," in 2012 ieee energy conversion congress and exposition (ecce), 2012, pp. 2466-2473.
[21] s. golestan, f. adabi, h. rastegar, and a. roshan, "load sharing between parallel inverters using effective design of output impedance," in proc. of the power engineering conference, 2008. aupec '08. australasian universities, 2008, pp. 1-5.
[22] n. mohan, t. undeland, and w. robbins, power electronics: converters, applications and design. new jersey: john wiley & sons, inc., 2003.
[23] a. keyhani, design of smart power grid renewable energy systems. hoboken, new jersey: john wiley and sons, 2011.
[24] p.
igic, "review of advanced igbt compact models dedicated to circuit simulation," facta universitatis series: electronics and energetics, vol. 27, pp. 1-12, 2014.
[25] m. n. marwali, j. jin-woo, and a. keyhani, "stability analysis of load sharing control for distributed generation systems," ieee trans. energy conversion, vol. 22, pp. 737-745, sep. 2007.
[26] m. trabelsi, l. ben-brahim, t. yokoyama, a. kawamura, r. kurosawa, and t. yoshino, "an improved svpwm method for multilevel inverters," in proc. of the 15th international power electronics and motion control conference (epe/pemc), 2012, pp. ls5c.1-1-ls5c.1-7.
[27] v. f. pires, j. f. martins, and c. hao, "dual-inverter for grid-connected photovoltaic system: modeling and sliding mode control," sciencedirect: solar energy, vol. 86, pp. 2106-2115, jul. 2012.
[28] y. mohamed and e. f. el-saadany, "adaptive decentralized droop controller to preserve power sharing stability of paralleled inverters in distributed generation microgrids," ieee trans. power electron., vol. 23, pp. 2806-2816, nov. 2008.
[29] p. h. divshali, s. h. hosseinian, and m. abedi, "a novel multi-stage fuel cost minimization in a vsc-based microgrid considering stability, frequency, and voltage constraints," ieee trans. power sys., vol. 28, pp. 931-939, may 2013.
[30] s. hongtao, z. fang, h. lixiang, y. xiaolong, and z. dong, "small-signal stability analysis of a microgrid operating in droop control mode," ieee ecce asia downunder (ecce asia), pp. 882-887, jun. 2013.
[31] m. antchev and g. kunov, "investigation of three-phase to single-phase matrix converter," facta universitatis series: electronics and energetics, vol. 22, pp. 245-252, 2009.
[32] m. liserre, r. teodorescu, and j. rodriguez, grid converter for photovoltaic and wind power systems. chichester, west sussex: john wiley & sons inc, 2011.
[33] z. grbo, s. vulkovic, and e.
levi, "a novel power inverter for switched reluctance motor drives," facta universitatis series: electronics and energetics, vol. 18, 2005.
[34] j. c. vasquez, j. m. guerrero, m. savaghebi, j. eloy-garcia, and r. teodorescu, "modeling, analysis, and design of stationary-reference-frame droop-controlled parallel three-phase voltage source inverters," ieee transactions on industrial electronics, vol. 60, pp. 1271-1280, 2013.
[35] h. cha, t. k. vu, and j. e. kim, "design and control of proportional-resonant controller based photovoltaic power conditioning system," in 2009 ieee energy conversion congress and exposition, 2009, pp. 2198-2205.
[36] a. chatterjee and k. b. mohanty, "design and analysis of stationary frame pr current controller for performance improvement of grid tied pv inverters," in proc. of the ieee 6th india international conference on power electronics (iicpe), 2014, pp. 1-6.
[37] f. de bosio, l. a. d. s. ribeiro, m. s. lima, f. freijedo, j. m. guerrero, and m. pastorelli, "inner current loop analysis and design based on resonant regulators for isolated microgrids," in proc. of the ieee 13th brazilian power electronics conference and 1st southern power electronics conference (cobep/spec), 2015, pp. 1-6.
[38] x. wang, f. blaabjerg, and z. chen, "autonomous control of inverter-interfaced distributed generation units for harmonic current filtering and resonance damping in an islanded microgrid," ieee transactions on industry applications, vol. 50, pp. 452-461, 2014.
[39] j. m. guerrero, v. luis garcia de, j. matas, m. castilla, and j. miret, "output impedance design of parallel-connected ups inverters with wireless load-sharing control," ieee transactions on industrial electronics, vol. 52, pp. 1126-1135, 2005.
[40] a. micallef, m. apap, c. spiteri-staines, and j. m. guerrero, "performance comparison for virtual impedance techniques used in droop controlled islanded microgrids," in proc.
of the international symposium on power electronics, electrical drives, automation and motion (speedam), 2016, pp. 695-700.
[41] g. herong, g. xiaoqiang, and w. weiyang, "accurate power sharing control for inverter-dominated autonomous microgrid," in proc. of the 7th international power electronics and motion control conference (ipemc), 2012, pp. 368-372.
[42] j. m. guerrero, j. matas, v. luis garcia de, m. castilla, and j. miret, "decentralized control for parallel operation of distributed generation inverters using resistive output impedance," ieee trans. ind. electron., vol. 54, pp. 994-1004, apr. 2007.
[43] y. guan, j. c. vasquez, j. m. guerrero, and e. a. a. coelho, "small-signal modeling, analysis and testing of parallel three-phase-inverters with a novel autonomous current sharing controller," in proc. of the ieee applied power electronics conference and exposition (apec), 2015, pp. 571-578.
[44] y. guan, j. c. vasquez, and j. m. guerrero, "a simple autonomous current-sharing control strategy for fast dynamic response of parallel inverters in islanded microgrids," in proc. of the ieee international energy conference (energycon), 2014, pp. 182-188.
[45] d. wu, f. tang, j. m. guerrero, j. c. vasquez, g. chen, and l. sun, "autonomous active and reactive power distribution strategy in islanded microgrids," in proc. of the ieee applied power electronics conference and exposition apec 2014, 2014, pp. 2126-2131.
facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 349-377 https://doi.org/10.2298/fuee2203349r © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
combined effects of electrostatic and electromagnetic interferences of high voltage overhead power lines on aerial metallic pipeline
djekidel rabah1, mohamed lahdeb1, sherif salama m.
ghoneim2, djillali mahi1
1laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria
2electrical engineering department, college of engineering, taif university, p.o. box 11099, taif 21944, saudi arabia
abstract. the main purpose of this paper is to model and analyze the electrostatic and electromagnetic interferences between an hv overhead power line and an aerial metallic pipeline running parallel to it at a close distance. these interferences are typically modelled for safety reasons, to ensure that the induced voltage does not pose any risk to the operating and maintenance personnel or to the integrity of the pipeline. the methodologies adopted for the electrostatic and electromagnetic interferences are based, respectively, on the charge simulation and current simulation methods combined with the teaching learning based optimization (tlbo) algorithm. the friedman test analysis indicates that the tlbo algorithm can be used for parameter optimization, as it showed better results. where the induced current and voltage values exceed the limits authorized by the international cigre standard, mitigation measures become necessary. the simulation results obtained were compared with those provided, respectively, by the admittance matrix analysis and carson's method; good agreement was obtained.
key words: charge simulation method (csm), current simulation technique (cst), teaching learning based optimization (tlbo), friedman test, hv power line, aerial metallic pipelines
received november 3, 2021; revised february 25, 2022; accepted february 28, 2022
corresponding author: djekidel rabah, laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria
e-mail: rabah03dz@live.fr
350 r. djekidel, m. lahdeb, s. s. m. ghoneim, d.
mahi
acronyms:
ac: alternating current
cigre: international council on large electric systems
csm: charge simulation method
cst: current simulation technique
dc: direct current
eas: evolutionary algorithms
emf: electromotive force
fba: flower pollination algorithm
fdm: finite difference method
fem: finite element method
ga: genetic algorithm
hs: harmony search
hv: high voltage
ieee: institute of electrical and electronics engineers
nna: nodal network analysis
of: objective function
pso: particle swarm optimization
tlbo: teaching learning based optimization
1. introduction
metallic pipelines for hydrocarbon and water transport (buried or aerial) that share a common right-of-way with a high-voltage overhead transmission network are subject to electrostatic and electromagnetic interferences created by the electric and magnetic fields that these hv power lines emit in normal operating conditions. these fields can induce voltages and currents in metallic pipelines installed in the immediate vicinity of the hv power lines. in some cases, the induced voltages can reach levels high enough to be hazardous to operating personnel who come into contact with the metallic pipeline, and to cause severe damage to the safe operation of the pipeline, to associated equipment and cathodic protection systems, and to the pipeline itself [1-4]. consequently, the induced voltages on metallic pipelines must be reduced to acceptable levels for the safety of personnel and to ensure the integrity of the pipeline. it is therefore important to assess the electrostatic and electromagnetic interference between transmission power lines and pipelines for performance and safety reasons in normal operating conditions of the electric network.
interference problems involving hv overhead power lines and metallic pipelines have been widely addressed in the literature, where considerable research has been devoted to evaluating the inductive and capacitive interference phenomena using various analytical and numerical methods. different simulation methodologies have been used [5,6]. they generally rely on the transmission line approach [7-15], on the finite element method (fem) alone [16-20] or in combination with circuit analysis [21-26], on nodal network analysis [27,28], on the finite difference method (fdm) [29,30], or on the charge simulation method (csm) [31-35]. the transmission line approach uses thevenin equivalent circuits as its basic assumption and provides reasonably good results for the induced voltage. the finite element method (fem) is a robust approach that gives reliable and accurate results for the induced voltage. the circuit theory approach gives more conservative results because it does not take into account the effects of infinite transmission line length. nodal network analysis (nna) can predict the induced voltage with sufficient accuracy. the finite difference method (fdm) is sufficiently rigorous and leads to accurate results. the charge simulation method (csm) is one of the most widely used approaches because it lends itself to optimization and yields accurate results.
this paper proposes a numerical modeling analysis of the electrostatic and electromagnetic couplings between hv overhead power lines and a nearby aerial metallic pipeline using hybrid simulation methods. the computation methodologies used are based, in turn, on the charge simulation method (csm) and the current simulation technique (cst) [36-38].
the main constraints of these analysis methods lie, respectively, in the number and position of the fictitious charges and of the line current filaments. to solve the associated optimization problem and obtain the optimal values of these parameters, which yield a sufficiently precise solution of these couplings, evolutionary algorithms (eas) are commonly used. evolutionary algorithms are stochastic optimization methods based on a rough simulation of the natural evolution of populations. one of the most prominent evolutionary algorithms is teaching learning based optimization (tlbo), a stochastic optimization metaheuristic originally proposed by rao et al. in 2011 [39]. this population-based search algorithm is inspired by the teaching-learning process and is based on the influence of a teacher on the output of students in a classroom; it is widely used because of its good performance, efficiency and simplicity of implementation [40]. it has been successfully applied to optimization problems in many scientific and engineering applications in recent years. finally, the validity of the simulation results obtained by the two proposed combined methods is demonstrated by comparison with those given, respectively, by the analytical approaches based on the admittance matrix analysis and carson's equations [15,35].

2. coupling mechanisms

in electricity, coupling is the transfer of energy from one element of an electrical system to another. there are three main types of coupling by which alternating voltages and currents can be induced on metallic pipelines near hv power transmission lines: electrostatic, electromagnetic and conductive coupling.

2.1.
electrostatic coupling from hv power line to pipeline

only a metallic pipeline installed above ground level is subject to electrostatic coupling; a buried pipeline is protected by the shielding effect of the ground. if a pipeline is located above ground level near an hv power line, it can acquire a large voltage to ground. this voltage is due to charge accumulation through the capacitance between the hv power line conductors and the pipeline, in series with the capacitance between the pipeline and the ground, which together form a capacitive voltage divider, as illustrated in figure 1 [1-3].

fig. 1 electrostatic coupling from hv power line to a metallic pipeline

2.2. electromagnetic coupling from hv power line to pipeline

electromagnetic interference results from the time variation of the magnetic field generated by the hv power lines, as shown in figure 2. aerial and buried pipelines running parallel to, or in close proximity to, hv transmission lines are subjected to voltages induced by the time-varying magnetic flux produced by the line currents, according to faraday's law of electromagnetic induction. the induced voltage causes currents to circulate in the pipeline and voltages to appear between the pipeline and the surrounding earth [1-3].

fig. 2 electromagnetic coupling from hv power line to a metallic pipeline

2.3. conductive coupling from hv power line to pipeline

conductive coupling appears when a phase-to-earth or phase-to-phase-to-earth fault occurs. in this case, a large current flows to earth through the pylon earthing, as shown in figure 3. this current raises the ground potential in the vicinity of the metallic pipeline. the resulting high voltage stresses the pipeline coating and can cause arcs that damage the coating or the pipeline itself.
in addition, this high voltage difference could pose an electric shock hazard to a person directly touching the pipeline [1-3].

fig. 3 conductive coupling from hv power line to a metallic pipeline

3. electrostatic coupling calculation

the charge simulation method (csm) is a numerical tool for solving boundary value problems of laplace's equation. it was initially proposed by steinbigler in 1969 [41] and has since been developed into a powerful and efficient tool for calculating the electric field of high-voltage equipment. the method is simple to use and implement, and it solves the problem quickly while providing an accurate solution [42, 43-45]. in this method, each conductor is simulated by a number of fictitious infinite line charges placed inside the conductor on a cylinder of fictitious radius. most problems solved with the csm have a plane of symmetry, generally the earth, whose reference potential is conventionally taken as zero; the ground effect is then taken into account by introducing image charges [46-50]. the number of boundary points selected on the conductor's surface is taken equal to the number of simulated charges, and the charges are determined so that the dirichlet boundary conditions are satisfied at these points.
once the magnitudes of these simulated charges are determined, the potential at any point in space outside the conductors can be determined by superposition as follows [50-53]:

$$ V_i = \sum_{j=1}^{n_c} P_{ij}\, q_j \tag{1} $$

where $n_c$ is the total number of simulated charges and $P_{ij}$ is the maxwell potential coefficient at contour point $i$ due to the simulated charge $q_j$. the magnitudes of the simulated charges are first computed by solving the system of $n_c$ linear equations for the $n_c$ unknown charges, in the form given in equation (2) [50-53]:

$$ [q_j]_{n_c} = [P_{ij}]^{-1}_{n_c \times n_c}\, [V_{ci}]_{n_c} \tag{2} $$

where $[q_j]$ is the column vector of the simulated charges on the conductors, $[V_{ci}]$ is the column vector of the known potentials at the boundary points of the conductors, and $[P_{ij}]$ is the matrix of the maxwell potential coefficients of the conductors. as an example, figure 4 shows three point charges in free space placed at different distances from the point $M_i$. according to the superposition principle, the potential $V_i$ at this point is [41]:

$$ V_i = P_{i1} q_1 + P_{i2} q_2 + P_{i3} q_3 = \frac{q_1}{4\pi\varepsilon_0 r_1} + \frac{q_2}{4\pi\varepsilon_0 r_2} + \frac{q_3}{4\pi\varepsilon_0 r_3} \tag{3} $$

once the magnitudes of the simulated charges have been calculated by solving equation (2), it is necessary to check whether they reproduce the real boundary conditions imposed on the conductors' surface, in order to obtain the best calculation precision. first, several check points are selected on the conductor surfaces and the potential produced by the simulated charges is computed at these points. second, the relative error between this computed potential and the real potential applied to the conductor contours is determined, which indicates the simulation accuracy. if this accuracy does not satisfy the simulation criterion, the number and/or location of the simulated charges must be changed.
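the solve-and-check procedure above can be illustrated with a minimal sketch. it is not the authors' implementation: it assumes a single conductor above a zero-potential ground, infinite line charges with image charges, and hypothetical geometry values (height, radii, number of charges). the helper names `ring`, `coeff`, `solve` and `csm_single_conductor` are introduced here for illustration only.

```python
import math

EPS0 = 8.854187817e-12  # permittivity of free space [F/m]


def ring(x0, y0, radius, n, offset=0.0):
    """Return n points evenly spaced on a circle centred at (x0, y0)."""
    return [(x0 + radius * math.cos(2 * math.pi * k / n + offset),
             y0 + radius * math.sin(2 * math.pi * k / n + offset))
            for k in range(n)]


def coeff(point, charge):
    """Potential coefficient of an infinite line charge plus its image
    (assumed form: ln(d_image / d) / (2*pi*eps0), ground at y = 0)."""
    (x, y), (xj, yj) = point, charge
    d = math.hypot(x - xj, y - yj)        # distance to the charge
    d_img = math.hypot(x - xj, y + yj)    # distance to its image
    return math.log(d_img / d) / (2 * math.pi * EPS0)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def csm_single_conductor(v0, x0, y0, r_real, r_fict, n):
    """Equation (2) for one conductor, then the check-point error in percent."""
    charges = ring(x0, y0, r_fict, n)     # fictitious charges inside the conductor
    contour = ring(x0, y0, r_real, n)     # contour points on the real surface
    p = [[coeff(pt, ch) for ch in charges] for pt in contour]
    q = solve(p, [v0] * n)
    # check points sit halfway between contour points on the real surface
    checks = ring(x0, y0, r_real, n, offset=math.pi / n)
    errors = []
    for pt in checks:
        v = sum(coeff(pt, ch) * qj for ch, qj in zip(charges, q))  # equation (1)
        errors.append(abs(v - v0) / abs(v0) * 100.0)
    return q, sum(errors) / len(errors)


# hypothetical example: 1 kV conductor, 10 m above ground, 8 charges
q, err = csm_single_conductor(v0=1000.0, x0=0.0, y0=10.0,
                              r_real=0.015, r_fict=0.006, n=8)
```

if the returned error is too large for the chosen criterion, the number of charges `n` or the fictitious radius `r_fict` would be changed and the solve repeated, which is the loop the optimization algorithm later automates.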
once this is done, the electric field strength at any point can be computed [50-53].

fig. 4 three point charges in free space

the charge simulation method (csm) is widely used to calculate the electric field strength in the vicinity of very high voltage overhead transmission lines. the charges used for overhead power lines are generally of infinite length, because the radius of the conductor is negligible compared to its length. the typical placement of simulated charges and contour points in the conductor/pipeline cross-section is shown in figure 5.

fig. 5 two-dimensional arrangement of simulation charges and contour points for the line conductor and the pipeline

the general form of the coordinates of the contour points and simulated charges in the orthogonal frame is given by the following equations [35,36,51]:

$$ x_k = x_0 + r_k \cos\left(\frac{2\pi}{n}(k-1)\right), \qquad y_k = y_0 + r_k \sin\left(\frac{2\pi}{n}(k-1)\right) \tag{4} $$

where $r_k = r_1$ if $k = i$ (contour point) and $r_k = r_2$ if $k = j$ (simulated charge); $n$ is the number of points placed around the cross-section; $y_0$ is the height of the conductor/pipeline above ground level; and $x_0$ is its horizontal coordinate. the electric field generated by an electric charge is described by gauss's law. for a three-phase transmission line, in a rectangular coordinate system, the horizontal and vertical components of the electric field intensity due to all the simulated charges, including the image charges, are given by [50-53]:

$$ E_{xi} = \sum_{j=1}^{n_c} f_{x,ij}\, q_j, \qquad E_{yi} = \sum_{j=1}^{n_c} f_{y,ij}\, q_j \tag{5} $$

where $f_{x,ij}$ and $f_{y,ij}$ are the electric field intensity coefficients between the contour points and the simulated charges $q_j$.
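equations (4) and (5) can be sketched as follows. the text does not spell out the field coefficients, so the expressions in `field_coefficients` are an assumption: they are the standard coefficients of an infinite line charge with its image in a ground plane at y = 0. the function names and the numerical values are hypothetical.

```python
import math

EPS0 = 8.854187817e-12  # permittivity of free space [F/m]


def placement(x0, y0, radius, n):
    """Equation (4): n points on a circle of the given radius around (x0, y0)."""
    return [(x0 + radius * math.cos(2 * math.pi * (k - 1) / n),
             y0 + radius * math.sin(2 * math.pi * (k - 1) / n))
            for k in range(1, n + 1)]


def field_coefficients(x, y, xj, yj):
    """Assumed (f_x, f_y) at (x, y) for a unit line charge at (xj, yj)
    and its image at (xj, -yj)."""
    d2 = (x - xj) ** 2 + (y - yj) ** 2
    d2_img = (x - xj) ** 2 + (y + yj) ** 2
    k = 1.0 / (2 * math.pi * EPS0)
    fx = k * ((x - xj) / d2 - (x - xj) / d2_img)
    fy = k * ((y - yj) / d2 - (y + yj) / d2_img)
    return fx, fy


def electric_field(x, y, charges, q):
    """Equation (5): sum the contributions of all simulated charges."""
    ex = ey = 0.0
    for (xj, yj), qj in zip(charges, q):
        fx, fy = field_coefficients(x, y, xj, yj)
        ex += fx * qj
        ey += fy * qj
    return ex, ey, math.hypot(ex, ey)


# hypothetical example: 4 charges of 1 nC/m on a 6 mm ring, 10 m above ground
charges = placement(0.0, 10.0, 0.006, 4)
ex, ey, e_total = electric_field(0.0, 0.0, charges, [1e-9] * 4)
```

at ground level the horizontal component vanishes for every charge, because the charge and its image are equidistant from the observation point, so the field there is purely vertical.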
the total electric field strength at any observation point is calculated as follows [43]:

$$ E_{res} = \sqrt{E_{xi}^2 + E_{yi}^2} \tag{6} $$

(legend of fig. 5: simulated charges, contour points and check points are marked; r1 is the real radius of the conductor/pipeline and r2 is its fictitious radius.)

the induced voltage on the aerial metallic pipeline due to the capacitive effect of all the electrical charges that simulate the conductors is evaluated as follows [1,33]:

$$ V_{ind} = \frac{1}{2\pi\varepsilon_0} \sum_{j=1}^{n_c} q_j \ln\left( \frac{\sqrt{(x - x_j)^2 + (y + y_j)^2}}{\sqrt{(x - x_j)^2 + (y - y_j)^2}} \right) \tag{7} $$

where $(x, y)$ are the coordinates of the observation point and $(x_j, y_j)$ are the coordinates of the simulated charges. if a person in contact with the ground touches this pipeline, the shock current passing through the body is given by the following relationship [1, 32]:

$$ I_{shock} = j\, \omega\, C_p\, L_p\, V_{ind} \tag{8} $$

where $L_p$ is the length of the pipeline exposed to the electrostatic coupling, $C_p$ is the pipeline's capacitance to earth per unit length, and $\omega$ is the angular frequency. when the discharge current through the human body exceeds the steady-state safe limit, set at 10 ma by the cigre standard [1], it must be reduced below the admissible level; the best protection is to connect the metallic pipeline to the ground through an adequate resistance $R_g$, whose value must satisfy [1,54]:

$$ R_g < \frac{R_{body}}{\beta - 1} \tag{9} $$

where $R_{body}$ is the body resistance and $\beta = I_{shock} / I_{adm}$ is the ratio of the shock current to the admissible current. according to the american standard ieee 80:2013, the overall resistance of the human body is usually taken as 1000 Ω [1,55].

3. electromagnetic coupling calculation

many analytical and numerical methods are available for modeling and simulating the magnetic induction due to very high voltage (vhv) overhead transmission lines.
the current simulation technique (cst) is the most suitable method for two-dimensional computation, as it is a reliable and efficient tool for the numerical solution of the magnetic induction equation in open boundary problems. its basic principle is very similar to that of the charge simulation method (csm) [37, 38]. high voltage transmission lines may use bundled conductors (multiple sub-conductors per phase) to increase their transmission capacity. in this method, the current passing through each sub-conductor is represented by a finite number $n_f$ of current filaments, which are distributed on a cylindrical surface of fictitious radius $r_j$. in a three-phase transmission line with bundled conductors, if each phase conductor consists of $m$ identical sub-conductors, the total number of sub-conductors is $3m$, as shown in figure 6. the number and position of the simulated filament currents depend on the total number of power line conductors, their spatial arrangement and the boundary conditions. the simulation currents $i_i$ ($i = 1, \dots, 3 m n_f$) in all the sub-conductors must satisfy the following conditions [56-59]:
1 the normal component of the magnetic field intensity on the sub-conductor surfaces is zero, according to the biot-savart law.
2 the sum of the filament currents that simulate the current in a sub-conductor must be equal to the real current passing through that sub-conductor.
after selecting several contour points on the sub-conductor surfaces, the unknown simulation currents can be obtained by solving the system of equations given below:

$$ A_{ij} = \sum_{i=1}^{3 m n_f} k_{ij}\, i_i = 0, \qquad j = 1, 2, 3, \dots, 3m(n_f - 1) \tag{10} $$

$$ \sum_{i=(q-1) n_f + 1}^{q\, n_f} i_i = I_{cq}, \qquad q = 1, 2, 3, \dots, 3m \tag{11} $$

where $m$ is the number of sub-conductors per phase; $n_f$ is the number of filament line currents per sub-conductor; $I_{cq}$ is the real current in sub-conductor $q$; and $k_{ij}$ is the coefficient of the normal magnetic field defined by the coordinates of the $i$-th contour point and the $j$-th filament line current, given by [37,38]:

$$ k_{ij} = \frac{\mu_0}{2\pi} \ln\left(\frac{r_j}{r_{ij}}\right) \tag{12} $$

where $r_{ij}$ is the distance between the simulation current point $j$ and the contour point $i$ on the sub-conductor surface, and $r_j$ is the fictitious radius of the current filament simulation (see figure 7).

fig. 6 three phase transmission line above ground with the images of line conductors

fig. 7 normal and tangential field components at a point on the sub-conductor surface

once the filament currents have been calculated by solving equations (10) and (11), their values and positions can be checked by following the same steps described above for the charge simulation method (csm). in quasi-static analysis, the magnetic induction $B$ is derived from the curl of the vector potential $A$; the horizontal and vertical components of the magnetic induction vector along the two perpendicular axes ($x$ and $y$) can thus be determined as follows [37,38]:

$$ \vec{B} = \nabla \times \vec{A}, \qquad B_{xi} = \frac{\partial A_{ij}}{\partial y} \quad \text{and} \quad B_{yi} = -\frac{\partial A_{ij}}{\partial x} \tag{13} $$

where $A_{ij}$ is the magnetic potential generated by the currents of the hv power line conductors, which can be expressed by the following relation [37,38]:

$$ A_{ij} = \frac{\mu_0}{2\pi} \sum_{i=1}^{3 m n} i_i\, k_{ij} \tag{14} $$

this magnetic induction calculation takes the earth effect into account.
the induced currents in the earth are represented by filament image currents located at a penetration depth $d_e$ below the surface of the earth, which can be calculated using the formula below [37,38]:

$$ d_e = 658.87 \sqrt{\frac{\rho_s}{f}} \tag{15} $$

where $\rho_s$ is the electrical resistivity of the soil and $f$ is the frequency of the source current. finally, the resulting magnetic induction at a given point in space is obtained by combining the horizontal and vertical components given in equation (13), as indicated below [37,38]:

$$ B_{res} = \sqrt{B_{xj}^2 + B_{yj}^2} \tag{16} $$

in this magnetic induction calculation it is also desirable to take into account the induced currents circulating in the earth wires and the metallic pipeline, which are caused by the three-phase currents in the phase conductors. they can be calculated by the following expression [60,61]:

$$ [I_g] = -[Z_{gg}]^{-1} [Z_{gc}] [I_c] \tag{17} $$

where $Z_{gg}$ are the self impedances of the earth wires and metallic pipeline; $Z_{gc}$ are the mutual impedances between the phase conductors and the earth wires/metallic pipeline; $I_c$ are the currents in the three-phase conductors of the power line; and $I_g$ are the induced currents in the earth wires and metallic pipeline. in the extremely low frequency domain, the self and mutual longitudinal impedances of conductors with ground return can be obtained from the simplified carson-clem formulas, respectively [60,61]:

$$ Z_{gg} = R_g + \frac{\mu_0 \omega}{8} + j\left[\frac{\mu_0 \omega}{2\pi} \ln\left(\frac{d_e}{R_{gm}}\right)\right] \tag{18} $$

$$ Z_{gc} = \frac{\mu_0 \omega}{8} + j\, \frac{\mu_0 \omega}{2\pi} \ln\left(\frac{d_e}{d_{gc}}\right) \tag{19} $$

where $R_g$ is the dc resistance of the conductor; $R_{gm}$ is the geometric mean radius of the conductor; $d_{gc}$ is the mutual distance between two conductors; $\omega$ is the angular frequency; $d_e$ is the penetration depth of the earth return; and $\mu_0$ is the permeability of free space.
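as a worked illustration of equations (15), (18) and (19), the short sketch below evaluates the penetration depth and the carson-clem impedances per metre. the soil resistivity (100 Ω·m), frequency (50 hz) and 45 m separation are the values used later in the case study; the earth-wire resistance is taken from the case study as a stand-in for $R_g$, and the geometric mean radius is a hypothetical placeholder. the function names are introduced here for illustration only.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space [H/m]


def penetration_depth(rho_s, f):
    """Equation (15): equivalent earth-return depth in metres."""
    return 658.87 * math.sqrt(rho_s / f)


def self_impedance(r_g, r_gm, d_e, f):
    """Equation (18): self impedance with ground return [ohm/m]."""
    omega = 2 * math.pi * f
    return complex(r_g + MU0 * omega / 8,
                   MU0 * omega / (2 * math.pi) * math.log(d_e / r_gm))


def mutual_impedance(d_gc, d_e, f):
    """Equation (19): mutual impedance with ground return [ohm/m]."""
    omega = 2 * math.pi * f
    return complex(MU0 * omega / 8,
                   MU0 * omega / (2 * math.pi) * math.log(d_e / d_gc))


d_e = penetration_depth(rho_s=100.0, f=50.0)          # about 931.78 m
z_m = mutual_impedance(d_gc=45.0, d_e=d_e, f=50.0)    # conductor to pipeline
# r_gm below is a hypothetical value, not taken from the paper
z_s = self_impedance(r_g=0.1489e-3, r_gm=0.005, d_e=d_e, f=50.0)
```

for these inputs the mutual impedance is roughly 0.049 + j0.190 Ω/km, which shows that the reactive term dominates at power frequency.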
the induced voltage on the aerial metallic pipeline due to the magnetic effect can be calculated from faraday's law of electromagnetic induction, which states that a time-varying magnetic induction induces a voltage in the pipeline. the total flux $\phi_t$ linking the pipeline, due to all the time-varying currents flowing through the conductors, is calculated as a surface integral [62-64]:

$$ \phi_t = \int_s B_{res}\, ds \tag{20} $$

where $\phi_t$ is the total flux produced by all the power line conductors and $s$ is the total surface area. the metallic pipeline and its earth return form a closed loop, located at the coordinates shown in figure 8. the total magnetic flux $\phi_t$ through the surface $s$ defined by the coordinates of the power line conductors and the pipeline can be expressed as follows [62-64]:

$$ \phi_t = \frac{\mu_0\, l}{4\pi} \sum_{i=1}^{n} I_i \ln\left( \frac{(x_p - x_i)^2 + (y_p + d_e + y_i)^2}{(x_p - x_i)^2 + (y_p - y_i)^2} \right) \tag{21} $$

where $(x_i, y_i)$ are the coordinates of the power line conductors; $(x_p, y_p)$ are the coordinates of the metallic pipeline; $I_i$ are the conductor currents; and $l$ is the exposed length of the pipeline. finally, using the total magnetic flux, the induced voltage on the metallic pipeline due to the magnetic coupling can be expressed as follows [62-64]:

$$ V_{ind} = -\frac{\partial \phi_t}{\partial t} = -j\, \omega\, \phi_t \tag{22} $$

in case of direct accidental contact with the metallic pipeline, the shock current flowing through the human body can be calculated by the equation below [15,55]:

$$ I_{shock} = \frac{V_{ind}}{Z_{pipe} + R_{body} + R_c} \tag{23} $$

where $R_{body}$ is the human body resistance; $R_c$ is the ground contact resistance of a person; and $Z_{pipe}$ is the total impedance of the metallic pipeline, calculated by the equation given below [1]:

$$ Z_{pipe} = \frac{\sqrt{\rho_p \mu_0 \mu_p \omega}}{\pi d_p \sqrt{2}} + \frac{\mu_0 \omega}{8} + j\left[ \frac{\sqrt{\rho_p \mu_0 \mu_p \omega}}{\pi d_p \sqrt{2}} + \frac{\mu_0 \omega}{2\pi} \ln\left( \frac{3.7 \sqrt{\rho_s\, \omega^{-1} \mu_0^{-1}}}{d_p} \right) \right] \tag{24} $$

where $d_p$ is the pipeline's diameter; $\mu_p$ is the relative permeability of the pipeline's metal; $\rho_p$ is the pipeline's resistivity; and $\rho_s$ is the soil resistivity.

fig.
8 determination of the induced voltage on the metallic pipeline

for touch voltages, for a soil with surface resistivity $\rho_s$, the contact resistance $R_c$ is calculated as [15]:

$$ R_c = 3.125\, \rho_s \tag{25} $$

in some cases, the induced voltage exceeds the acceptable limit recommended by international standards; the international cigre regulations require safety measures to be taken if the voltage on the pipeline exceeds 50 v in steady state [1]. in this case, mitigation is necessary to keep the voltage within the permitted limit; it is sufficient to connect the metallic pipeline to the ground with two identical electrodes, one at each end of the pipeline.

4. teaching learning based optimization (tlbo)

teaching learning based optimization (tlbo) is a metaheuristic optimization algorithm proposed by rao et al. [39]. it is inspired by the teaching-learning process and is based on the effect of a teacher's influence on the output of students in a classroom environment. the teacher-student interaction is the fundamental inspiration for this algorithm: a group of learners in a classroom is considered as the population, and the different subjects offered to the learners correspond to the design variables of the optimization problem. the results of a learner are analogous to the objective function value of the optimization problem, the number of exams is the number of iterations, and the best solution in the whole population is considered the teacher. the major advantage of this algorithm is that it does not require algorithm-specific control parameters. the teacher and the learners are the two essential components of the algorithm, which therefore describes two learning processes: learning through the teacher (the teacher phase) and learning through interaction with other learners (the learner phase) [65-70].

4.1.
teacher phase

during this phase, the teacher aims to impart knowledge to the learners and tries to improve the average result of the classroom. how much the learners' level of knowledge increases depends on the quality of the teaching provided by this teacher and on the skills of the learners present in the class. taking this into account, the difference between the teacher's result and the learners' average result in each subject is expressed as follows [65-70]:

$$ \mathit{diff}_i = r_i \left( X_{t,i} - T_F\, M_i \right) \tag{26} $$

where $X_{t,i}$ is the teacher's result; $M_i$ is the learners' mean result; $r_i$ is a random number in [0, 1]; and $T_F$ is the teaching factor, which depends on the teaching quality and equals either 1 or 2. the value of $T_F$ is chosen at random by the following formula [65-70]:

$$ T_F = \mathrm{round}\left[ 1 + \mathrm{rand}(0,1)\,\{2 - 1\} \right] \tag{27} $$

through the teaching process and the transfer of knowledge to the learners, their results are modified in the next exam according to this difference, as given by the following expression [65-70]:

$$ X'_{j,i} = X_{j,i} + \mathit{diff}_i \tag{28} $$

where $X'_{j,i}$ and $X_{j,i}$ are the new and old grades that learner $j$ earned in exam $i$, respectively. the better of the two results is accepted and used as input for the learner phase.

4.2. learner phase

in this second phase, the learners increase their knowledge by interacting with one another, in particular by discussing with better learners and working as a team, which helps to produce the best results $X''$. considering a population of size $n$, the interaction between two learners $A$ and $B$ in each exam, for minimization problems, is expressed as follows [65-70]:

$$ X''_{A,i} = \begin{cases} X'_{A,i} + r_i \left( X'_{A,i} - X'_{B,i} \right) & \text{if } f(X'_{A,i}) < f(X'_{B,i}) \\ X'_{A,i} + r_i \left( X'_{B,i} - X'_{A,i} \right) & \text{if } f(X'_{B,i}) < f(X'_{A,i}) \end{cases} \tag{29} $$

where $f$ is the objective function. $X''$ is accepted into the population if it gives a better function value.
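the two phases above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' matlab code: the function name `tlbo`, the bound clipping, the per-variable random numbers and the sphere test function are choices made here for the example.

```python
import random


def tlbo(f, bounds, pop_size=20, iterations=100, seed=0):
    """Minimise f over box bounds with a basic TLBO loop (sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # keep every variable inside its bounds (an assumption of this sketch)
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]

    for _ in range(iterations):
        # teacher phase: equations (26) to (28)
        teacher = pop[min(range(pop_size), key=cost.__getitem__)][:]
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for j in range(pop_size):
            tf = round(1 + rng.random())          # teaching factor, 1 or 2
            new = clip([pop[j][d] + rng.random() * (teacher[d] - tf * mean[d])
                        for d in range(dim)])
            c = f(new)
            if c < cost[j]:                       # greedy acceptance
                pop[j], cost[j] = new, c

        # learner phase: equation (29)
        for a in range(pop_size):
            b = rng.choice([k for k in range(pop_size) if k != a])
            sign = 1.0 if cost[a] < cost[b] else -1.0
            new = clip([pop[a][d] + sign * rng.random() * (pop[a][d] - pop[b][d])
                        for d in range(dim)])
            c = f(new)
            if c < cost[a]:
                pop[a], cost[a] = new, c

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


# hypothetical test problem: 3-variable sphere function
best_x, best_cost = tlbo(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 3)
```

apart from the population size and the number of iterations, no tuning parameter appears in the loop, which is the property the text highlights as the main advantage of tlbo.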
the implementation steps of the tlbo algorithm can be summarized as follows [71-73]:
step 1: define the optimization problem (minimization) and initialize the parameters of the algorithm: the population size, the number of variables, the maximum number of iterations and the objective function $f(X)$.
step 2: randomly initialize the grades (solutions) $X_{j,i}$ of the $n$ learners ($j = 1, 2, \dots, n$) in exam $i = 1$.
step 3: calculate the objective function for the $n$ learners in exam $i$.
step 4: calculate $M_i$ and $X_{t,i}$, identifying the best solution as the teacher according to $X_{teacher} = X\,|_{f(X) = \min}$.
step 5: calculate $\mathit{diff}_i$ for exam $i$ according to equation (26), using the teaching factor $T_F$.
step 6: calculate $X'_{j,i}$ for the $n$ learners in exam $i$ according to equation (28), compare the two solutions $X'_{j,i}$ and $X_{j,i}$, and pass the better one to the next step.
step 7: randomly choose pairs of learners, update the solution according to equation (29) and keep the better solution for the next step.
step 8: calculate the objective function for all learners and check whether the stopping criterion is met (the optimal solution is obtained); otherwise, return to step 4.

for the charge simulation method (csm), the objective function is the mean relative error, which has the simple form given in the following equation [38]:

$$ OF_1 = \frac{1}{n_c} \sum_{i=1}^{n_c} \frac{\left| V_{ci} - V_{vi} \right|}{V_{ci}} \times 100 \tag{30} $$

where $V_{vi}$ is the exact potential applied to the conductor, $V_{ci}$ is the potential computed at the check points, and $n_c$ is the total number of check points.
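equation (30) amounts to a mean relative error in percent, as the small sketch below shows. the function name and the sample potentials are hypothetical; following the equation as written, each deviation is normalised by the check-point value.

```python
def mean_relative_error_percent(check_values, exact_values):
    """Equation (30): mean of |V_c - V_v| / |V_c| over the check points, in percent."""
    terms = [abs(vc - vv) / abs(vc)
             for vc, vv in zip(check_values, exact_values)]
    return 100.0 * sum(terms) / len(terms)


# hypothetical check-point potentials for a conductor held at 100 V
of_value = mean_relative_error_percent([99.9, 100.1, 100.0], [100.0] * 3)
```

in the optimization, this value would be recomputed for every candidate number and radius of fictitious charges, and tlbo keeps the candidate that minimises it.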
for the current simulation technique (cst), the objective function is the relative error of the magnetic potential, as follows [38]:

$$ OF_2 = \frac{1}{n_f} \sum_{i=1}^{n_f} \frac{\left| A_{ci} - A_{vi} \right|}{A_{ci}} \times 100 \tag{31} $$

where $A_{ci}$ is the magnetic potential calculated from the current filament points; $A_{vi}$ is the new magnetic potential estimated at the matching points; and $n_f$ is the total number of matching points.

5. friedman's statistical test

to demonstrate that one optimization algorithm performs better than others, the friedman nonparametric test is most often used to determine whether the algorithms are statistically different and to rank them in terms of performance and speed, so that the best of them can be applied to the optimization problem. generally, the result of a statistical test is assessed by computing the p-value and comparing it with a predefined threshold (traditionally 5%). if the p-value is less than this threshold, the null hypothesis is rejected in favor of the alternative hypothesis, and the test result is declared statistically significant [74-77]. in this paper, friedman's statistical test is used to analyze the minimum values of the objective function obtained from different optimization algorithms, namely teaching learning based optimization (tlbo) [78], the flower pollination algorithm (fpa) [79], the harmony search algorithm (hs) [80], particle swarm optimization (pso) and the genetic algorithm (ga) [81], in order to identify the most efficient algorithm.

6. validation methods

in the case of electrostatic coupling, the induced voltage on the metallic pipeline caused by the hv power line conductors can be evaluated using the admittance matrix technique. under steady-state operating conditions, for a symmetrical hv overhead transmission power line system with an aerial metallic pipeline, the shunt admittance matrix per unit length of the circuit is determined by the following equation [1,54,82-85]:

$$ [Y_{ij}] = j\, \omega\, [P_{ij}]^{-1} \tag{32} $$

where $[P_{ij}]$ is the potential coefficient matrix of the circuit (overhead power line conductors and metallic pipeline). the current-voltage relations for this electric system can then be written in matrix form as follows:

$$ [I_i] = [Y_{ij}]\, [V_i] \tag{33} $$

the resulting shunt admittance matrix for all the conductors (three-phase conductors, earth wires and metallic pipeline) is given below [1,53,81-85]:

$$ \begin{bmatrix} I_c \\ I_p \\ I_g \end{bmatrix} = \begin{bmatrix} Y_{cc} & Y_{cp} & Y_{cg} \\ Y_{pc} & Y_{pp} & Y_{pg} \\ Y_{gc} & Y_{gp} & Y_{gg} \end{bmatrix} \begin{bmatrix} V_c \\ V_p \\ V_g \end{bmatrix} \tag{34} $$

where the subscripts $c$, $p$ and $g$ denote the three-phase conductors, the metallic pipeline and the earth wires, respectively. the current through the earthed earth wires is taken equal to zero; they can be eliminated by setting $I_g = 0$ in equation (34), which gives:

$$ \begin{bmatrix} I_c \\ I_p \end{bmatrix} = \begin{bmatrix} Y'_{cc} & Y'_{cp} \\ Y'_{pc} & Y'_{pp} \end{bmatrix} \begin{bmatrix} V_c \\ V_p \end{bmatrix} \tag{35} $$

where

$$ \begin{aligned} Y'_{cc} &= Y_{cc} - Y_{cg}\, Y_{gg}^{-1}\, Y_{gc}, & Y'_{cp} &= Y_{cp} - Y_{cg}\, Y_{gg}^{-1}\, Y_{gp}, \\ Y'_{pc} &= Y_{pc} - Y_{pg}\, Y_{gg}^{-1}\, Y_{gc}, & Y'_{pp} &= Y_{pp} - Y_{pg}\, Y_{gg}^{-1}\, Y_{gp} \end{aligned} \tag{36} $$

for an insulated metallic pipeline, the current flowing through it is zero ($I_p = 0$). substituting this into equation (35), the pipeline voltage to earth due to the electrostatic coupling with the hv power line is given by the following relation [1,53,81-85]:

$$ [V_p] = -[Y'_{pp}]^{-1}\, [Y'_{pc}]\, [V_c] \tag{37} $$

where $[V_c]$ is the column vector of the known three-phase voltages to earth of the hv power line conductors. in the electromagnetic coupling case, under steady-state conditions, the induced voltage on the metallic pipeline can be obtained by applying carson's method.
this approach is based on the mutual impedances between the conductors of the hv power line and the metallic pipeline; these impedances are determined using the carson-clem formula given previously in equation (19) [4,85-90]. the voltage that appears between the metallic pipeline and the adjacent earth is calculated in two steps: first, the electromotive force (emf) induced along the metallic pipeline by the time-varying magnetic field is determined; then the induced voltage along the metallic pipeline is obtained. the total longitudinal emf induced on the metallic pipeline is obtained from the mutual impedances between the pipeline and the power line conductors, which carry time-varying alternating currents. when the overhead power line is equipped with one earth wire, the induced emf is calculated according to the following equation [4,85-90]:

$$ E_{ind} = -I_{c1} Z_{pc1} - I_{c2} Z_{pc2} - I_{c3} Z_{pc3} - I_{g1} Z_{pg1} \tag{38} $$

this relation can be reduced to the general form below:

$$ E_{ind} = -\sum_{i=1}^{n_i} I_i\, Z_{pi} \tag{39} $$

where $Z_{pi}$ are the mutual impedances between the conductors of the power line (phase conductors, earth wires) and the metallic pipeline; $I_i$ are the currents in the three-phase conductors and the earth wires of the power line; and $n_i$ is the total number of conductors in the hv power line. the induced voltage on the metallic pipeline for an exposure length $L$ to the electromagnetic coupling can be found using the formula given below [4,85-90]:

$$ V_{ind} = E_{ind}\, L \tag{40} $$

as can be seen, this approach assumes that the induced emf per unit length is constant over the entire length of the metallic pipeline.
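equations (39) and (40) can be sketched as below, with the mutual impedances taken from equations (15) and (19). the function names are introduced here for illustration, the earth-wire current is omitted for brevity, and the conductor-to-pipeline distances are hypothetical because the coordinates of figure 9 are not reproduced in the text; the current magnitude, frequency, resistivity and exposure length are those of the case study.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # permeability of free space [H/m]


def mutual_impedance(distance, rho_s, f):
    """Equations (15) and (19): conductor-to-pipeline impedance [ohm/m]."""
    omega = 2 * math.pi * f
    d_e = 658.87 * math.sqrt(rho_s / f)
    return complex(MU0 * omega / 8,
                   MU0 * omega / (2 * math.pi) * math.log(d_e / distance))


def induced_emf(currents, distances, rho_s=100.0, f=50.0):
    """Equation (39): longitudinal emf per metre of pipeline [V/m]."""
    return -sum(i * mutual_impedance(d, rho_s, f)
                for i, d in zip(currents, distances))


def induced_voltage(currents, distances, length, rho_s=100.0, f=50.0):
    """Equation (40): emf per metre multiplied by the exposure length [V]."""
    return induced_emf(currents, distances, rho_s, f) * length


# balanced three-phase currents of 500 A
phases = [cmath.rect(500.0, math.radians(a)) for a in (0.0, -120.0, 120.0)]
# hypothetical conductor-to-pipeline distances in metres
v_ind = induced_voltage(phases, [45.5, 46.8, 48.6], length=25_000.0)
```

when the three distances are equal, the balanced currents cancel and the emf vanishes; the induced voltage in practice comes from the small differences between the distances of the individual phases to the pipeline.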
consider a 275 kv hv overhead single-circuit transmission line in vertical configuration, with one earth wire and an aerial insulated metallic pipeline in the immediate vicinity; the arrangement and geometric coordinates of the overhead power line and the metallic pipeline are shown in figure 9. the pipeline runs perfectly parallel to the axis of the hv overhead power line at a separation distance of 45 m; its height above the ground is 1 m and its radius is 0.3 m. the length of the metallic pipeline exposed to the ac interference is 25 km. the three-phase currents in the hv power line are assumed to be balanced, with a magnitude of 500 a and a nominal system frequency of 50 hz. the earth is assumed to be homogeneous with a resistivity of 100 Ω·m; the ac resistance is 0.1586 Ω/km for the phase conductor, 0.1489 Ω/km for the earth wire and 0.5 Ω/km for the metallic pipeline.

fig. 9 single circuit hv vertical configuration with an aerial metallic pipeline

7. results and discussions

the first aim is to select the best parameters to use in the simulation methods so as to achieve results with satisfactory accuracy. to obtain the optimal number and location of the fictitious charges and current filaments, a robust and powerful optimization algorithm is needed. in this context, the performances of different optimization algorithms (pso, fba, hs, tlbo, ga) were compared using the friedman statistical test under the same conditions, so that they could be ranked according to their performance. to ensure a fair comparison, these algorithms were implemented in matlab (r2014a), and the experiments for each algorithm were repeated 10 times on the same computer running the windows 10 operating system. the parameter settings of all the optimization algorithms are shown in table 1.
table 1 parameter settings of each algorithm (100 iterations)
particle swarm optimization (pso): swarm size n = 20; learning factors c1 = 2, c2 = 2; inertia weight wmax = 1.2, wmin = 0.4
flower pollination algorithm (fpa): population size n = 20; switch probability p = 0.8
harmony search algorithm (hs): harmony memory size hms = 5; harmony memory consideration rate hmcr = 0.95; pitch adjustment rate par = 0.25; band width distance bw = 0.02*(ub - lb)
teaching learning based optimization (tlbo): population size n = 20
genetic algorithm (ga): population size n = 20; mutation probability = 0.2; crossover probability = 0.4; number of bits = 25

the statistical and comparative analysis of the results obtained by the selected optimization algorithms, following the friedman ranking test, is presented in table 2.

table 2 results of friedman's statistical test of the optimization algorithms
test statistics:
friedman's chi-square statistic: 84
degrees of freedom (df): 4
number of observations n: 21
standard deviation (sigma): 1.5811
prob>chi-sq (p-value): 2.47e-17
mean rank by algorithm:
pso: 3
fba: 4
hs: 5
tlbo: 1
ga: 2

friedman's statistical test shows that the difference between the performances of the algorithms is significant: the p-value is very low and well below the critical value (p = 0.05). moreover, the tlbo algorithm achieved the first rank, with the smallest simulation error, and provides the best performance compared to the other algorithms. consequently, the tlbo algorithm can be used to solve the optimization problems in the methods adopted for the induced voltage calculation. the variation of the objective functions (of) given in equations (30) and (31) with the number of iterations is shown in figure 10, which illustrates the search process of this algorithm and the optimization based on the minimization of these objective functions.
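as a cross-check of table 2, the friedman statistic and its p-value can be recomputed from the reported mean ranks and number of observations. the sketch below uses the textbook friedman formula without tie correction and the closed-form chi-square tail for an even number of degrees of freedom; whether the authors' software applied exactly this form is an assumption, and the function names are introduced here for illustration.

```python
import math


def friedman_chi_square(mean_ranks, n):
    """Friedman statistic from mean ranks of k treatments over n blocks."""
    k = len(mean_ranks)
    rank_sums = [n * r for r in mean_ranks]
    return (12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums)
            - 3.0 * n * (k + 1))


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid for even df only."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


# mean ranks of pso, fba, hs, tlbo, ga with n = 21 observations (table 2)
chi2 = friedman_chi_square([3, 4, 5, 1, 2], n=21)   # 84
p_value = chi2_sf_even_df(chi2, df=4)               # about 2.47e-17
```

both values agree with table 2. the reported sigma of 1.5811 also coincides numerically with sqrt(k(k+1)/12) for k = 5 algorithms.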
it can be clearly observed that the objective function values decrease as the number of iterations increases and converge towards a minimum. the optimal values of the parameters to be inserted in these simulation methods are summarized in table 3.
table 3 optimum values of the simulation methods (csm and cst)
csm + tlbo (of value = 2e-14): number of fictitious charges: phase conductor 22, earth wire 15, pipeline 23; fictitious radius [m]: phase conductor 0.036, earth wire 0.008, pipeline 0.14.
cst + tlbo (of value = 9.9e-07): number of current filaments: phase conductor 25, earth wire 19, pipeline 30; fictitious radius [m]: phase conductor 0.03, earth wire 0.01, pipeline 0.1.
fig. 10 objective functions variation with number of iterations (panels: csm+tlbo and cst+tlbo; x-axis: iteration number; y-axis: objective function)
for electrostatic coupling analysis, figure 11 shows the lateral profile of the electric field distribution with and without the metallic pipeline. the graph shows that the initial electric field distribution is symmetrical about a point located 7 m from the suspension pylon. the presence of the metallic pipeline has a relatively significant effect on the electric field at the exact location of the pipeline, where the field increases slightly at the pipeline's surface because of the induced electric charges accumulated on it. therefore, it can be concluded that a metallic pipeline in the immediate vicinity of an overhead power line distorts the electric field at the location where the pipeline is installed. the profile of the perturbed electric field on the pipeline's surface, for pipeline positions on both sides of the right-of-way, is shown in figure 12.
it can be observed that the perturbed electric field reaches its maximum value (e = 7.12 kv/m) for a horizontal pipeline separation distance of +7 m. as the pipeline moves away from this point on either side, the electric field intensity declines and becomes almost minimal far from the point of symmetry of the electric field. as a result, it is suggested that the pipeline be located as far as possible from the power line in order to effectively reduce the electric field effects on it.
fig. 11 electric field profile with and without the metallic pipeline at 1 m above the ground (x-axis: lateral distance (x) [m]; y-axis: electric field [kv/m]; curves: without pipeline, with pipeline)
fig. 12 perturbed electric field profile on the metallic pipeline's surface (x-axis: pipeline position from the power line center [m]; y-axis: electric field [kv/m]; data tip: x = 45, y = 0.2397)
figure 13 shows the induced voltage profile on the pipeline's surface as a function of the pipeline separation distance along the right-of-way. in general, the voltage induced on the metallic pipeline is directly proportional to the perturbed electric field, so its distribution is very similar to that of the field; the maximum induced voltage is obtained at a pipeline separation distance of +7 m. as a general recommendation, the metallic pipeline should be installed at a distance, called the critical distance, at which the induced voltage is below the values prescribed by international standards. under normal operating conditions, the discharge current due to the capacitive coupling through the body of a person touching the metallic pipeline, for different separation distances along the right-of-way, is shown in figure 14.
it is important to note that the discharge current level is directly related to the induced voltage: the higher the induced voltage, the more intense the resulting current. the discharge current in this case study is 17 ma, which is unacceptable from a personnel safety point of view.
fig. 13 induced voltage on the insulated metallic pipeline due to hv power line (x-axis: pipeline position from the power line center [m]; y-axis: induced voltage on the pipeline [v]; data tip: x = 45, y = 72.89)
fig. 14 intensity of shock current flowing in human body (x-axis: pipeline position from the power line center [m]; y-axis: discharge current [ma]; data tip: x = 45, y = 17)
when the discharge current through the human body exceeds the safety limit of 10 ma recommended by the cigre standard, a protection procedure must be implemented; it is sufficient to connect the metallic pipeline to earth through an appropriate resistance calculated according to equation (9). the grounding resistance of the pipeline as a function of its horizontal proximity distance is shown in figure 15. as can be seen from this figure, the grounding resistance varies inversely with the discharge current. in this case study, the metallic pipeline is therefore grounded through a resistance of 1429 ω.
fig. 15 calculation of the earthing resistance of metallic pipeline (x-axis: pipeline position from the power line center [m]; y-axis: earthing resistance [ohms]; data tip: x = 45, y = 1429)
for electromagnetic coupling analysis, figure 16 shows the lateral profile of the magnetic induction distribution with and without the metallic pipeline, taking into account the effect of the currents induced in the earth wire and the metallic pipeline. without the pipeline, the profile is symmetrical about a point close to the center of the power line (x = +6 m); moving away from this point on either side, the magnetic induction intensity decreases rapidly with the lateral distance. the figure also indicates that the presence of a metallic pipeline in proximity to the power line disturbs the magnetic induction distribution: the profile is distorted where the metallic pipeline is installed. the pipeline is affected by the magnetic induction because of the current generated at its ends by the electromagnetic coupling.
fig. 16 magnetic induction profile with and without the metallic pipeline at 1 m above the ground (x-axis: lateral distance [m]; y-axis: magnetic induction [t]; curves: without pipeline, with pipeline)
the effect of the metallic pipeline's location along the right-of-way on the perturbed magnetic induction profile at its surface is shown in figure 17. it can be seen that the maximum value of the perturbed magnetic induction (b = 4.1 µt) is obtained close to the lateral phase, at a pipeline separation distance of x = +6 m; from this position the magnetic induction decreases continuously as the pipeline moves laterally away, reaching low values far from the power line center.
fig. 17 perturbed magnetic induction profile on the metallic pipeline's surface (x-axis: pipeline position from the power line center [m]; y-axis: magnetic induction [t]; data tip: x = 45, y = 1.886e-06)
the induced voltage on the metallic pipeline as a function of the pipeline's position along the right-of-way is shown in figure 18. as can be seen in this figure, the induced voltage is maximum when the pipeline is located at a proximity position of +6 m, and it then decreases progressively as the lateral distance of the pipeline increases on both sides. it is important to note that the magnitude of the induced voltage in the metallic pipeline is directly proportional to the magnetic induction. in this case study the pipeline is located 45 m from the pylon center, and the induced voltage obtained on the metallic pipeline is 270.9 v, which is much higher than the maximum value of 50 v permitted by the cigre standard.
fig. 18 induced voltage profile on the metallic pipeline (x-axis: pipeline position from the power line center [m]; y-axis: induced voltage [v]; data tip: x = 45, y = 270.9)
the variation of the electric shock current flowing through a person coming into contact with the metallic pipeline, as a function of the pipeline's separation distance from the pylon, is illustrated in figure 19. as reflected in this figure, the shock current that flows through the human body on accidental contact is proportional to the magnitude of the induced voltage on the metallic pipeline, so its graph is very similar in form to that of the induced voltage. in this case study, during normal operation the shock current due to accidental contact with the metallic pipeline is 204.2 ma, which represents a significant and severe risk to the human body when compared with the admissible body current.
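for the capacitive case discussed earlier, the earthing resistance of 1429 ω quoted at 45 m is consistent with a simple current-divider reading of the grounding step. the sketch below is only an illustration under stated assumptions, since equation (9) is not reproduced in this section: the capacitively coupled line is treated as an ideal current source delivering the 17 ma discharge current, the human body is modeled as a 1000 ω resistor (an assumed value), and the function names are hypothetical. because 17 ma is itself a rounded figure, the agreement should be read as approximate.

```python
import math

R_BODY_OHM = 1000.0  # assumed body resistance, not stated in this section
I_LIMIT_MA = 10.0    # safety limit quoted in the text for the cigre standard


def earthing_resistance(i_discharge_ma, i_limit_ma=I_LIMIT_MA, r_body_ohm=R_BODY_OHM):
    """Earthing resistance that leaves exactly i_limit_ma in the body (current divider)."""
    if i_discharge_ma <= i_limit_ma:
        return math.inf  # already below the limit: no earthing path is required
    return r_body_ohm * i_limit_ma / (i_discharge_ma - i_limit_ma)


def body_current_ma(i_discharge_ma, r_earth_ohm, r_body_ohm=R_BODY_OHM):
    """Share of the source current that flows through the body, in parallel with r_earth_ohm."""
    if math.isinf(r_earth_ohm):
        return i_discharge_ma
    return i_discharge_ma * r_earth_ohm / (r_earth_ohm + r_body_ohm)


# Case-study value at x = 45 m (data tip of fig. 14).
i_discharge_ma = 17.0
r_earth = earthing_resistance(i_discharge_ma)
i_body = body_current_ma(i_discharge_ma, r_earth)

print(f"earthing resistance = {r_earth:.0f} ohm")
print(f"body current with earthing = {i_body:.2f} mA")
```

under the same reading, the required resistance grows without bound as the discharge current falls towards the 10 ma limit, which agrees with the inverse trend noted for figure 15.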
fig. 19 intensity of shock current flowing through the human body (x-axis: pipeline position from the power line center [m]; y-axis: shock current in human body [ma]; data tip: x = 45, y = 204.2)
induced voltages on the metallic pipeline that are greater than the maximum value of 50 v admitted by the international cigre standard may pose a threat to the integrity of the pipeline and a risk to the safety of personnel. it then becomes imperative to implement an attenuation technique to keep the induced voltage within the recommended limit; it suffices to install low-value shunt resistances between the ends of the pipeline and the earth, which allow the current to be evacuated to earth along the pipeline section. figure 20 shows the electrode resistance value as a function of the separation distance of the metallic pipeline. this graph illustrates the earthing resistance values that ensure the safety of personnel and of the metallic pipeline; the behavior of the earthing resistance profile is exactly opposite to that of the electric shock current.
fig. 20 resistance of the ground electrode of metallic pipeline (x-axis: pipeline position from the power line center [m]; y-axis: ground resistance [ohm]; data tip: x = 45, y = 3.555)
figure 21 shows the voltage applied to the electric system that combines in series the metallic pipeline and the electrode resistance, used to obtain a safety limit voltage (50 v). in this case study, it is necessary to install an earthing resistance of 3.555 ω at each end of the metallic pipeline.
fig. 21 safe voltage in the electrode resistance (x-axis: electrode resistance value [ohm]; y-axis: electrode voltage [v]; curve: safe induced voltage; data tip: x = 3.555, y = 50)
the results presented in figures 22 and 23 show the combined effect of the electrostatic and electromagnetic couplings, represented by the total induced voltage on the metallic pipeline and the total discharge current passing through the human body. it can clearly be seen that the values obtained, as a function of the position of the metallic pipeline along the right-of-way, are very significant. they can constitute a serious danger to the safety of intervention and maintenance personnel and a great threat to pipeline integrity, through degradation caused by metal corrosion, damage to the applied coatings, and failure of the cathodic protection system and of the various devices connected to the metallic pipeline. in order to protect intervention and maintenance personnel, and to keep the metallic pipelines operating cost-effectively, the application of a mitigation procedure is necessary.
fig. 22 total induced voltage on the metallic pipeline due to the combined effect (x-axis: pipeline position from the power line center [m]; y-axis: total induced voltage on pipeline [v]; data tips at x = 45: electrostatic effect 72.89, electromagnetic effect 270.9, combined effect 343.8)
fig. 23 total shock current intensity flowing through the human body due to the combined effect (x-axis: pipeline position from the power line center [m]; y-axis: total electric shock current [ma]; data tips at x = 45: electrostatic effect 17, electromagnetic effect 204.2, combined effect 221.2)
finally, in order to verify the effectiveness of the proposed methods, the induced voltages obtained for the electrostatic and electromagnetic couplings were compared with those computed, respectively, by the admittance matrix analysis and by carson's method for the same data and geometry. figures 24 and 25 show these comparisons. the analysis indicates a very good correlation between the curves of the different methods; the maximum estimated relative errors between the methods for the two coupling cases were within the permissible range, which is sufficient to validate the precision of the adopted methods.
fig. 24 comparison of the induced values by the two calculation methods for electrostatic coupling (x-axis: pipeline position from the power line center [m]; y-axis: induced voltage [v]; curves: admittance matrix method, csm+pso; data tips at x = 45: 72.89 and 72.52)
fig. 25 comparison of the induced values by the two calculation methods for electromagnetic coupling (x-axis: pipeline position from the power line center [m]; y-axis: induced voltage [v]; curves: faraday's law, carson's method; data tips at x = 45: 271.2 and 270.9)
8. conclusion
in this paper, a rigorous quasi-static modeling approach is used to analyze the electrostatic and electromagnetic couplings under normal operating conditions between an hv power transmission line and an aerial metallic pipeline placed in parallel and in close proximity.
two hybrid simulation methods, based on the charge simulation method (csm) and the current simulation technique (cst), each combined with teaching learning based optimization (tlbo), were presented. the tlbo algorithm is applied to find the optimal position and the appropriate number of simulation charges and current filaments required by these methods. the intensities of the perturbed electric and magnetic fields and the induced voltage on the metallic pipeline were analyzed. for electrostatic coupling, the results show that the presence of an aerial metallic pipeline in the vicinity of an hv overhead power transmission line distorts the electric field at the pipeline's surface because of the electrostatic charges accumulated on this insulated surface. the maximum induced voltage on the pipeline occurs at a separation distance of 7 m; it then declines rapidly on both sides of this distance and becomes almost negligible at a critical distance, at which it is recommended to lay the metallic pipeline. if the discharge current flowing in the human body during direct contact with the metallic pipeline exceeds the authorized safety limit, a mitigation procedure is recommended; it is sufficient to ground the metallic pipeline through an appropriate resistance. for electromagnetic coupling, the results show that the presence of an aerial metallic pipeline in close proximity to an hv overhead power line disturbs the distribution of the magnetic field at the pipeline's surface because of the electric current induced in the pipeline. the maximum induced voltage in the metallic pipeline is obtained when the pipeline is located at a proximity distance of +6 m from the pylon; it then decreases rapidly as the separation distance of the pipeline increases on either side of the pylon. the amount of discharge current which passes through the human body when it accidentally touches the metallic pipeline is linearly proportional to the magnitude of the induced voltage. when the resulting induced voltage on the metallic pipeline exceeds the safety threshold of 50 v, it can present risks to the safety of intervention and maintenance personnel and to the pipeline's equipment. these risks can be eliminated by applying a mitigation measure: it is sufficient to connect the two ends of the metallic pipeline to earth through suitable resistances. the numerical results given by the developed hybrid methods were compared with the results obtained by two different approaches, one for each of the studied couplings; the comparison shows good agreement between the simulation results, which confirms the efficiency and the validity of the proposed methods.
references
[1] cigre, guide on the influence of high voltage ac power systems on metallic pipelines, working group 36.02, technical brochure no. 095, 1995. [2] r. a. gummow, a/c interference guideline final report, nace corrosion specialist, no.17, canadian energy pipeline association, 2014. [3] en 50443, effects of electromagnetic interference on pipelines caused by high voltage a.c. railway systems and/or high voltage a.c. power supply systems, cenelec report no: ics 33.040.20; 33.100.01, 2009. [4] australian new zealand standard, electrical hazards on metallic pipelines, standards australia, standards new zealand, 4853:2000. [5] d. d. micu, e. simion, d. micu and a. ceclan, "numerical methods for induced voltage evaluation in electromagnetic interference problems", in proceedings of the 9th international conference on electrical power quality and utilisation, 2007, pp. 1–6. [6] k. hyoun-su, h. y. min, j. g. chase and c. h.
kim, "analysis of induced voltage on pipeline located close to parallel distribution system", energies, vol. 14, pp. 8536–8536, 2021. [7] j. dabkowski, "how to predict and mitigate a.c. voltages on buried pipelines", pipeline & gas j., vol. 206, pp. 19–21, 1979. [8] a. taflove and j. dabkowski, "prediction method for buried pipeline voltages due to 60 hz ac inductive coupling part i analysis", ieee trans. power apparatus and systems., vol. pas-98, no. 3, pp. 780–787, 1979. [9] j. dabkowski, "the calculation magnetic coupling from overhead transmission lines", ieee trans. power appar. syst., vol. pas-100, no. 8, pp. 3850–3860, 1981. [10] f. p. dawalbi and r. d. southey, "analysis of electrical interference from power lines to gas pipelines part i: computation methods", ieee power eng. rev., vol. 9, no. 7, pp.70–70, 1989. [11] f. p. dawalibi and r. d. southey, "analysis of electrical interference from power lines to gas pipelines. ii. parametric analysis", ieee trans. power deliv., vol. 5, no. 1, pp. 415–421, 1990. [12] g. djogo and m. m. a. salama, "calculation of inductive coupling from power lines to multiple pipelines", electr. power syst. res., vol. 41, no. 1, pp. 75–84, 1997. [13] d. d. micu, g. c. christoforidis and l. czumbil, "ac interference on pipelines due to double circuit power lines: a detailed study", electr. power syst. res., vol. 103, pp. 1–8, 2013. [14] a. muresan, t. a. papadopoulos, l. czumbil, a. i. chrysochos, t. farkas and d. chioran, "numerical modeling assessment of electromagnetic interference between power lines and metallic pipelines: a case study", in proceedings of the 9th international conference on modern power systems. cluj-napoca, 2012, pp. 1–6. [15] r. djekidel and d. mahi, "calculation and analysis of inductive coupling effects for hv transmission lines on aerial pipelines", przegląd elektrotechniczny., vol. 190, no.9, pp. 151–156, 2014. [16] l. li and x. 
gao, "ac corrosion interference of buried long distance pipeline", in proceedings of the 3rd international conference on intelligent control-measurement and signal processing and intelligent oil field. xi’an, 2012, pp. 342–346. [17] k. j. satsios, d. p. labridis and p. s. dokopoulos, "finite element computation of field and eddy currents of a system consisting of a power transmission line above conductors buried in nonhomogeneous earth", ieee trans. power deliv., vol. 13, no. 3, pp. 876–882, 1998. [18] a. cristofolini, a. popoli and l. sandrolini, "numerical modelling of interference from ac power lines on buried metallic pipelines in presence of mitigation wires", in proceedings of the 2018 ieee international conference on environment and electrical engineering and 2018 ieee industrial and commercial power systems europe, palermo, 2018, pp. 1–5. [19] a. popoli, l. sandrolini and a. cristofolini, "finite element analysis of mitigation measures for ac interference on buried pipelines", in proceedings of the ieee international conference on environment and electrical engineering and industrial and commercial power systems europe, genova, 2019, pp. 1–5. [20] a. popoli, a. cristofolini, l. sandrolini, b. t. abe and a. jimoh, "assessment of ac interference caused by transmission lines on buried metallic pipelines using f.e.m," in proceedings of the 2017 international applied computational electromagnetics society symposium, firenze, 2017, pp. 1–2. [21] n. abdullah, "hvac interference assessment on a buried gas pipeline", iop conf. series: earth and environ. sci., vol. 704, no. 1, pp. 012009, 2021. [22] g. c. christoforidis, p. s. dokopoulos and k. e. psannis, "induced voltages and currents on gas pipelines with imperfect coatings due to faults in a nearby transmission line", in proceedings of the ieee international conference on porto power tech.
porto, 2001, pp. 401–406. [23] g. c. christoforidis and d. p. labridis, "inductive interference of power lines on buried irrigation pipelines", in proceedings of the ieee international conference of power, bologna, 2003, pp. 196–202. [24] g. c. christoforidis, d. p. labridi and p. s. dokopoulos, "a hybrid method for calculating the inductive interference caused by faulted power lines to nearby buried pipelines", ieee trans. power deliv., vol. 20, no. 2, pp. 1465–1473, 2005. [25] a. popoli, a. cristofolini and l. sandrolini, "a numerical model for the calculation of electromagnetic interference from power lines on nonparallel underground pipelines", math. comput. simul., vol. 183, pp. 221–233, 2021. [26] c. andrea, a. popoli, l. sandrolini, g. pierotti and m. simonazzi, "laplace transform for finite element analysis of electromagnetic interferences in underground metallic structures", appl. sci., vol. 12, no. 2, pp. 872–872, 2022. [27] h. g. lee, t. h. ha, y. c. ha, j. h. bae and d. k. kim, "analysis of voltages induced by distribution lines on gas pipelines," in proceedings of the ieee international conference on power system technology. singapore, 2004, pp. 598–601. [28] s. al‐alawi, a. al‐badi and k. ellithy, "an artificial neural network model for predicting gas pipeline induced voltage caused by power lines under fault conditions", int. j. comput. math. electr. electron. eng., vol. 24, no. 1, pp. 69–80, 2005. [29] a. popoli, l. sandrolini and a. cristofolini, "comparison of screening configurations for the mitigation of voltages and currents induced on pipelines by hvac power lines", energies j., vol. 14, pp. 3855–3855, 2021. [30] m. a. elhirbawy, l. s. jennings, s. m. ai dhalaan and w. w. l. keerthipala, "practical results and finite difference method to analyze the electric and magnetic field coupling between power transmission line and pipeline", in proceedings of the ieee international symposium on circuits and systems, 2003, pp. 431–434. 
[31] mazen abdel-salam, abdallah al-shehri, "induced voltages on fence wires and pipelines by ac power transmission lines", ieee trans. ind. appl., vol. 30, no. 2, pp. 341–349, 1994. [32] m. m. saied, "the capacitive coupling between ehv lines and nearby pipelines", ieee trans power deliv., vol. 19, no. 3, pp. 1225–1231, 2004. [33] a. gupta and m. j. thomas, "coupling of high voltage ac power line fields to metallic pipelines", in proceedings of the 9th ieee international conference on electromagnetic interference and compatibility (incemic 2006), bangalore, 2006, pp. 278–283. [34] h. m. ismail, a. m. amin and s. alkhoudary, "comparative study of the effect of hvtl electrostatic fields on gas pipelines using the atp-lcc& csm methods", int. j. eng. res. technol., vol. 2, no. 9, pp. 3037–3043, 2013. [35] r. djekidel and s. a. bessidek, "estimation and mitigation of electrostatic interferences on metallic pipeline by hv overhead power line using differential evolution algorithm", electrotehnica, electronica, automatica eea, vol. 64, no. 3, pp. 83–90, 2016. [36] r. djekidel, s. a. bessedik and a. hadjadj, "electric field modeling and analysis of ehv power line using improved calculation method", fu electr. energ., vol. 31, no. 3, pp. 425–445, 2018. [37] r. djekidel, s. a. bessedik and s. akef, "accurate computation of magnetic induction generated by hv overhead power lines", fu electr. energ., vol. 32, no. 2, pp. 267–285, 2019. [38] t. meriouma, s. a. bessedik and r. djekidel, "modelling of electric and magnetic field induction under overhead power line using improved simulation techniques", eur. j. electr. eng., vol. 23, no. 4, pp. 289–300, 2021. [39] r. v. rao, v. j. savsani and d. p. vakharia, "teaching–learning-based optimization: a novel method for constrained mechanical design optimization problems", comput. aided des. j., vol. 43, no. 3, pp. 303–315, 2011. [40] s. li, w. gong, l. wang, x. yan and c.
hu, "a hybrid adaptive teaching–learning-based optimization and differential evolution for parameter identification of photovoltaic models", energy convers. manag., vol. 225 p. 113474, 2020. [41] n. h. malik, "a review of the charge simulation method and its applications," ieee trans. electr. insul., vol. 24, no. 1, pp. 3–20, 1989. [42] f. lai, y. wang, y. lu and j. wang, "improving the accuracy of the charge simulation method for numerical conformal mapping", math. probl. eng., vol. 2017, p. 3603965, 2017. [43] r. djekidel and d. mahi, "effect of the shield lines on the electric field intensity around the high voltage overhead transmission lines", amse journals -series: modelling a., vol. 87; no. 1, pp. 1–16, 2014. [44] r. djekidel, d. mahi and a. ameur, "analysis of parameters affecting the capacitive interference between pipelines and power overhead line using genetic algorithms", int. j. electr. eng. inform., vol. 8, no. 2, pp. 315–330, 2016. [45] r. djekidel, "optimum phase configuration and location of the aerial pipeline in the vicinity of a high voltage overhead line", period. polytech. electr. eng. comput. sci., vol. 60, no. 2, pp. 143–150, 2016. [46] r. m. radwn and m. m. samy, "calculation of electric fields underneath six phase transmission lines," j. electr. syst., vol. 12, no. 4, pp. 839–851, 2016. [47] m. m. samy and a. m. emam, "computation of electric fields around parallel hv and ehv overhead transmission lines in egyptian power network", in proceedings of the ieee international conference on environment and electrical engineering and ieee industrial and commercial power systems europe, italy, 2017, pp. 1– 5. [48] y. wang and c. lv, "electric field calculation of the improved charge simulation method based on hybrid coding", chinese automation congress, pp. 1208–1213, 2019. [49] s. nakasumi, k. kikunaga, y. harada, m. ohkubo and k. takagi, "error evaluation of defect shape identification using charge simulation method for static electricity", j. 
electrostatics., vol. 114, p. 103633, 2021. [50] r. djekidel, s. a. bessedik and a. c. hadjadj, "assessment of electrical interference on metallic pipeline from hv overhead power line in complex situation", fu electr. energ., vol. 34, no. 1, pp. 53–69, 2021. [51] r. djekidel, a. choucha and a. c. hadjadj, "efficiency of some optimization approaches with the charge simulation method for calculating the electric field under extra high voltage power lines," iet gener. transm. distrib., vol. 11, no. 17, pp. 4167–4174, 2017. [52] f. yang, w. he, w. deng and t. chen, "a genetic algorithm‐based improved charge simulation method and its application", int. j. comput. math. electr. electron. eng., vol. 28, no. 6, pp. 1701–1709, 2009. [53] r. wang, j. tian, f. wu, z. zhang and h. liu, "pso/ga combined with charge simulation method for the electric field under transmission lines in 3d calculation model", electronics, vol. 8, no. 10, pp. 1140, 2019. [54] n. tleis, power systems modeling and fault analysis theory and practice, elsevier, second edition 2019, pp. 835–861. [55] ieee std 80-2013, ieee guide for safety in ac substation grounding, (revision of ieee standard 802000), 2013, pp. 1-226. [56] y. degui, l. bing, d. jun, h. danmei and w. xihong, "power frequency magnetic field of heavy current transmit electricity lines based on simulation current method", ieee world autom. congr., pp. 1–4, 2008. [57] r. roshdy, a. s. mazen, m. abdel-bary and s. mohamed, "laboratory validation of calculations of magnetic field mitigation underneath transmission lines using passive and active shield wires", innovative syst. des. eng., vol. 2, no. 4, pp. 218–232, 2011. [58] r. m. radwan, m. abdel-salam, m. m. samy and a.m. mahdy, "passive and active shielding of magnetic fields underneath overhead transmission lines theory versus experiment", in proceedings of the 17th international middle east power systems conference. mansoura, 2015, pp. 1–10. [59] m. abdel-salam, h. abdullah, m. th. 
el-mohandes and h. el-kishky, "calculation of magnetic fields from electric power transmission lines", electr. power syst. res., vol. 49, pp. 99–105, 1999. [60] m. albano, r. turri, s. dessanti, a. haddad and h. griffiths, b. howat, "computation of the electromagnetic coupling of parallel untransposed power lines", in proceedings of the 41st international universities power engineering conference. newcastle upon tyne, 2006, pp. 303–307. [61] r. djekidel, s. a. bessedik, p. spitéri and d. mahi, "passive mitigation for magnetic coupling between hv power line and aerial pipeline using pso algorithms optimization", electr. power syst. res., vol. 165, pp.18–26, 2018. [62] k. yamazaki, t. kawamoto and h. fujinami, "requirements for power line magnetic field mitigation using a passive loop conductor", ieee trans. power deliv., vol. 15, no. 2, pp. 646–651, 2000. [63] p. cruz, c. izquierdo and m. burgos, "optimum passive shields for mitigation of power lines magnetic field", ieee trans. power deliv., vol. 18, no. 4, pp. 1357–1362, 2003. [64] a. r. memari, "optimal calculation of impedance of an auxiliary loop to mitigate magnetic field of a transmission line", ieee trans. power deliv., vol. 20, no. 2, pp. 844–850, 2005. [65] d. tang, j. zhao and h. li, "an improved tlbo algorithm with memetic method for global optimization", int. j. adv. comput. technol., vol. 5, no. 9, pp. 942–949, 2013. [66] h. r. e. h. bouchekara, m. a. abido and m. boucherma, "optimal power flow using teaching learning based optimization", electr. power syst. res., vol. 114, pp. 49–59, 2014. [67] p. sarzaeim, o. b. haddad and x. chu, teaching-learning-based optimization (tlbo) algorithm. in: advanced optimization by nature-inspired algorithms. studies in computational intelligence, springer, singapore, vol. 720, pp. 51–58, 2018. [68] m. m. puralachetty, v. k. pamula, l. m. gondela, v. n. b.
akula, "teaching-learning-based optimization with two-stage initialization", in proceedings of the ieee students' international conference on electrical, electronics and computer science. bhopal, 2016, pp. 1–5. [69] r. venkata-rao, v. patel, "an improved teaching-learning-based optimization algorithm for solving unconstrained optimization problems", scientia iranica., vol. 20, no. 3, pp. 710–720, 2013. [70] o. bozorg-haddad, p. sarzaeim and h. a. loáiciga, "developing a novel parameter-free optimization framework for flood routing", sci. rep., vol. 11, no. 1, p. 16183, 2021. [71] r. venkata-rao and v. patel, "an elitist teaching-learning-based optimization algorithm for solving complex constrained optimization problems," int. j. ind. eng. comput., vol. 3, no. 4, pp. 535–560, 2012. [72] x. he, j. huang, y. rao and l. gao, "chaotic teaching-learning-based optimization with lévy flight for global numerical optimization", comput. intell. neurosci., vol. 8341275, pp. 1687–5265, 2016. [73] s. sleesongsom and s. bureerat, "four-bar linkage path generation through self-adaptive population size teaching-learning based optimization", knowledge-based syst., vol. 135, pp. 180–191, 2017. [74] t. hastie, r. tibshirani and j. friedman, the elements of statistical learning: data mining, inference, and prediction. new york: springer, second edition 2009, pp.745. [75] d. joaquin, g. salvador, m. daniel, h. francisco, "a practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms", swarm evol. comput., vol. 1, no.1, pp. 3–18, 2011. [76] m. a. el-shorbagy and a. y. ayoub, "integrating grasshopper optimization algorithm with local search for solving data clustering problems", int. j. comput. intell. syst., vol. 14, no. 1, pp. 783–793, 2021. [77] h. moayedi, h. nguyen and l. 
kok-foong, "nonlinear evolutionary swarm intelligence of grasshopper optimization algorithm and gray wolf optimization for weight adjustment of neural network", eng. with comput., vol. 37, no. 2, pp. 1265–1275, 2021. [78] w. li, y. fan and q. xu, "teaching-learning-based optimization enhanced with multiobjective sorting based and cooperative learning", ieee access j., vol. 8, p. 65937, 2020. [79] m. m. samy, s. barakat and h. s. ramadan, "a flower pollination optimization algorithm for an off-grid pv-fuel cell hybrid renewable system", int. j. hydrog. energy, vol. 44, no. 4, pp. 2141–2152, 2019. [80] n. sinsuphan, u. leeton and t. kulworawanichpong, "optimal power flow solution using improved harmony search method," appl. soft comput. j., vol. 13, no. 5, pp. 2364–2374, 2013. [81] s. shabir and r. singla, "a comparative study of genetic algorithm and the particle swarm optimization", int. j. electr. eng., vol. 9, no. 2, pp. 215–223, 2016. [82] m. h. shwehdi, m. a. alaqil and s. mohamed, "emf analysis for a 380 kv transmission ohl in the vvicinity of buried pipelines", ieee access j., vol. 8, pp. 3710–3717, 2020. [83] r. djekidel and d. mahi, "capacitive interferences modelling and optimization between hv power lines and aerial pipelines", int. j. electr. comput. eng., vol. 4, no. 4, pp. 486–497, 2014. [84] m. samy and a. emam, "induced pipeline voltage nearby hybrid transmission lines", innovative syst. des. eng., vol. 8, no. 3, pp. 31–40, 2017. [85] r. djekidel, a. hadjadj and s. a. bessedik, "electrostatic and electromagnetic effects of hv overhead power line on above metallic pipeline", in proceedings of the 5th ieee international conference on electrical engineering, boumerdes, 2017, pp. 1–6. [86] k. b. adedeji, "effect of hvtl phase transposition on pipelines induced voltage", indones. j. electr. eng. inform., vol. 4, no. 2, pp. 93–101, 2016. [87] a. hellany, m. nassereddine and m. 
facta universitatis
series: electronics and energetics vol. 36, no 1, march 2023, pp. 91-101
https://doi.org/10.2298/fuee2301091d
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

efficiency and radiative recombination rate enhancement in gan/algan multi-quantum well-based electron blocking layer free uv-led for improved luminescence

samadrita das1, trupti r. lenka1, fazal a. talukdar1, ravi t. velpula2, hieu p. t. nguyen2
1department of electronics and communication engineering, national institute of technology silchar, assam, 788010, india
2department of electrical and computer engineering, new jersey institute of technology newark, new jersey, 07102, usa

abstract. in this paper, an electron blocking layer (ebl) free gan/algan light emitting diode (led) with graded composition in the quantum barriers of the active region is designed using atlas tcad. the device has a gan buffer layer incorporated in a c-plane for better carrier transport and low efficiency droop.
the proposed led has triangular quantum barriers with aluminium composition graded from 20% to ~2%, whereas the conventional led has square barriers. the resulting structures exhibit significantly reduced electron leakage and improved hole injection into the active region, thus generating higher radiative recombination. the simulation outcomes exhibit the highest internal quantum efficiency (iqe) (48.4%), indicating a significant rise compared to the conventional led. the designed ebl free led with graded quantum barrier structure achieves a substantially minimized efficiency droop of ~7.72% at 60 ma. our study shows that the proposed structure has improved radiative recombination by ~136.7%, reduced electron leakage, and enhanced optical power by ~8.084% at 60 ma injected current as compared to the conventional gan/algan ebl led structure.

key words: ultra-violet (uv), light emitting diode (led), gallium nitride (gan), internal quantum efficiency (iqe), multi-quantum well (mqw), quantum barrier (qb), electron blocking layer (ebl)

received june 28, 2022; revised july 14, 2022, and july 26, 2022; accepted july 26, 2022
corresponding author: samadrita das
department of electronics and communication engineering, national institute of technology silchar, assam, india
e-mail: samadrita_rs@ece.nits.ac.in

1. introduction

ultra-violet light emitting diodes (leds) are of immense importance because of their potential applications and have attracted considerable attention in optical communication, pharmaceutical appliances, water and air purification, and many more. gallium nitride (gan), a promising material for generating uv luminescence over a wide range of spectrum, has attracted many researchers’ attention [1]–[4]. gan has a wide band gap of 3.4 ev within the iii-nitride range of 0.7 ev (inn) to 6.2 ev (aln), and this band gap can be widened further by introducing aluminum (al) to form algan alloy [5], [6].
moreover, gan is an environmentally friendly material with better biocompatibility and a low manufacturing cost [7]–[11]. gan is used to create high-efficiency, shorter-wavelength luminescence and to fabricate semiconductor devices such as leds and laser diodes (lds) with low threshold [12], [13]. in the last few years, research has focused on optimizing the design of gan-based led structures [14]–[17]. this optimization improves the symmetry of carrier transport, the injection of charge carriers and the confinement of carriers in the quantum wells, which further enhances the radiative recombination rate and leads to a breakthrough in the internal quantum efficiency (iqe) [18]–[20]. electron overflow remains a critical issue because it lowers the iqe and causes efficiency droop at high injection current [22]. although the electron blocking layer (ebl) introduced between the p-region and the active region can suppress the electron overflow [23], the hole injection efficiency is also strongly affected because of positive polarization sheet charges formed at the heterointerface of the last quantum barrier (qb) and the ebl [24]. additionally, due to the high magnesium (mg) activation energy in a high al content ebl, efficient p-doping is quite difficult [25]. to mitigate these problems, in this paper we use an ebl free multi-quantum well (mqw) uv-led operating at ~354.6 nm wavelength, which eliminates the formation of positive polarization sheet charges and shows significantly enhanced hole injection and reduced electron leakage. we present a distinctive qb design in the gan/algan mqw, with the composition graded throughout the entire barrier along the [0001] axis. as a result, the performance of the proposed structure is remarkably improved compared to the conventional uv-led structure using an ebl and square quantum barriers. the design of the led structure and its numerical simulation framework are presented in section 2, followed by results and analysis in section 3.
finally, the conclusion is drawn in section 4.

2. device structure and numerical simulation framework

in this study, the above-mentioned led structures are numerically studied using the computer-aided simulation tool silvaco atlas tcad, which is designed to analyze and optimize leds based on wurtzite semiconductor compounds [26]. a gan/algan led with a peak wavelength of ~354.6 nm is presented in fig. 1. the basic device structure considered as the conventional led (ledi) is constructed on a sapphire substrate with a thickness of 80 µm followed by an undoped gan buffer layer of thickness 1.2 µm, an n-doped algan coating layer (doping concentration: $1\times10^{20}$ cm$^{-3}$, width: 1.8 µm, al content: 18%), four pairs of gan (3 nm)/algan (7 nm) mqw, a p-doped algan layer as ebl [27] (doping concentration: $2\times10^{18}$ cm$^{-3}$, width: 20 nm, al content: 20%), a p-doped algan coating layer (doping concentration: $1\times10^{18}$ cm$^{-3}$, width: 180 nm, al content: 15%) and finally a p-doped gan contact layer (doping concentration: $2.5\times10^{18}$ cm$^{-3}$, width: 80 nm). the square quantum barriers have a uniform al composition of 20%.

fig. 1 (a) schematic diagrams of ledi conventional gan/algan mqw with square qbs, (b) ledii with triangular barriers

to improve performance, the square barriers of ledi have been replaced with graded-composition barriers. on the n-side of each qb, the al composition is fixed at 20% (al$_{0.2}$ga$_{0.8}$n), while on the p-side the al composition is defined by the variable x in the range 0 ≤ x ≤ 20%. the al composition in each qb gradually reduces from 20% to x (al$_x$ga$_{1-x}$n, 0 ≤ x ≤ 0.2) along the [0001] axis from the n-side to the p-side. the calculations use carrier mobilities of 90 cm$^2$v$^{-1}$s$^{-1}$ (electrons) and 15 cm$^2$v$^{-1}$s$^{-1}$ (holes), and the operating temperature is set to 300 k. the device with graded triangular barriers (x=0.02 for reference) is considered as ledii.
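the grading described above can be illustrated with a short python sketch. it assumes a linear grading of the al fraction across the 7 nm barrier (the text only says the composition "gradually reduces") and evaluates the band gap with the bowing relation that the paper gives as equation (2) in section 3.2. the function names `al_fraction` and `band_gap_ev` are hypothetical and are not part of the atlas tcad setup.

```python
# Sketch only: Al mole fraction across one graded quantum barrier and the
# resulting band gap. Linear grading is an assumption made for illustration;
# the bowing relation and its constants are those of equation (2).

EG_ALN_EV = 6.2    # band gap of AlN quoted with equation (2)
EG_GAN_EV = 3.42   # band gap of GaN quoted with equation (2)
BOWING_EV = 1.3    # bowing coefficient in equation (2)
BARRIER_NM = 7.0   # AlGaN barrier thickness from section 2


def al_fraction(z_nm, x_n=0.20, x_p=0.02, width_nm=BARRIER_NM):
    """Al mole fraction at depth z_nm measured from the n-side barrier edge."""
    if not 0.0 <= z_nm <= width_nm:
        raise ValueError("z_nm must lie inside the barrier")
    return x_n + (x_p - x_n) * z_nm / width_nm


def band_gap_ev(x):
    """Band gap of Al(x)Ga(1-x)N in eV according to equation (2)."""
    return EG_ALN_EV * x + EG_GAN_EV * (1.0 - x) - BOWING_EV * x * (1.0 - x)


if __name__ == "__main__":
    # n-side edge, barrier centre and p-side edge of a LEDII-like barrier
    for z in (0.0, 3.5, 7.0):
        x = al_fraction(z)
        print(f"z = {z:3.1f} nm  x = {x:.3f}  Eg = {band_gap_ev(x):.3f} eV")
```

under these assumptions the band gap falls monotonically from the n-side edge to the p-side edge of the barrier, which is the triangular profile the graded design aims for.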
the final proposed ebl free uv-led structure with graded triangular barriers is considered as lediii, which is the main goal of this paper. the al composition in each qb is increased to 25% on the n-side, while on the p-side the value of x is in the range 0 ≤ x ≤ 25%. the energy band gaps of the gan and algan used in the simulations are taken as 3.42 ev and 6.28 ev respectively. the respective radiative recombination coefficients (c$_{opt}$) are $2\times10^{-10}$ and $1.1\times10^{-10}$ cm$^3$/s. the lattice constant of gan is 0.3189 nm. the auger coefficient and carrier lifetime have their default values of $1\times10^{-34}$ cm$^6$/s and $1\times10^{-9}$ s respectively.

3. results and analysis

3.1. internal quantum efficiency

the iqes of the device for varying values of x in al$_x$ga$_{1-x}$n in the graded qbs are displayed as a function of injection current in fig. 2. as shown, the efficiency of ledi is the lowest compared to the other cases with graded qbs (0 ≤ x ≤ 0.2). with the band gap of al$_x$ga$_{1-x}$n decreasing from the n-side to the p-side, the iqe at the same injection current rises markedly. for a clearer view, the efficiencies at 60 ma as a function of x are displayed in the inset of fig. 2. while reducing the value of x from 0.2 to ~0.02, the efficiency increases from 34.79% to 45.68% and then decreases slightly as x approaches 0. this is because when the al composition is decreased towards 0.02, the band gap of al$_x$ga$_{1-x}$n decreases, which in turn increases the effective barrier heights for holes and electrons. ledii achieves a 31.3% rise in efficiency at 60 ma compared to ledi. fig. 3 shows that the optimized ebl free device (lediii) has the highest iqe of 48.4% at 60 ma. lediii has 39.12% and 5.95% higher iqe than ledi and ledii respectively. the efficiency droop is minimized from 13.87% (ledi) to 10.22% (ledii) and further to 7.72% (lediii) at the same current of 60 ma according to the equation given below:

$$\text{efficiency droop} = \frac{\mathrm{IQE}_{max} - \mathrm{IQE}_{60\,\mathrm{mA}}}{\mathrm{IQE}_{max}} \times 100\% \qquad (1)$$

this result establishes that the ebl free device with triangular barriers does contribute to the enhancement of iqe and the decrease of efficiency droop. in order to validate our device model and parameters, the iqe is compared with the closest available experimental result [28], as shown in fig. 2.

fig. 2 internal quantum efficiency vs. injected current with varying values of x. inset: values of iqes as a function of x.

fig. 3 internal quantum efficiency vs. injected current for all leds. inset: the efficiency droop for each led at 60 ma current

3.2. energy band diagrams

fig. 4 shows the calculated energy band diagrams for ledi, ledii and lediii at 60 ma injected current. the simulated results shown in fig. 4(a)-(b) depict the differences and the trend of variation of the energy band diagrams when the band gap of every qb is altered from uniform to graded composition. the band diagram of ledi shows a triangular shape because of the presence of the internal polarization field and forward bias [29]. the energy band gap (e$_g$) of al$_x$ga$_{1-x}$n can be calculated as

$$E_g(\mathrm{Al}_x\mathrm{Ga}_{1-x}\mathrm{N}) = E_g(\mathrm{AlN})\,x + E_g(\mathrm{GaN})\,(1 - x) - 1.3\,x\,(1 - x) \qquad (2)$$

where e$_g$(aln) = band gap energy of aln = 6.2 ev and e$_g$(gan) = band gap energy of gan = 3.42 ev [30].

fig. 4 energy band diagram of active region of gan/algan mqw for (a) ledi, (b) ledii and (c) lediii

this formula shows that the band gap of al$_x$ga$_{1-x}$n decreases with a decrease in the al composition. hence the band gap of every qb reduces while moving from the n-side to the p-side, thus influencing the effective barrier heights for electrons as well as holes. the formation of the hole depletion region due to the positive polarization sheet charges at the lqb/ebl interface lessens the hole injection efficiency in ledii [24]. this problem can be overcome by removing the ebl from ledii. ф$_{cn}$ and ф$_{ebl}$ are the effective conduction band barrier heights (cbbh) at the corresponding barrier (n) and the ebl respectively. as displayed in fig. 4(b), the values of ф$_{cn}$ for all qbs, i.e. ф$_{c1}$-ф$_{c5}$, are 370.8 mev, 408.2 mev, 442.2 mev, 367 mev and ф$_{ebl}$ is 468.9 mev for ledii, which is much higher than ф$_{ebl}$ for ledi (257.1 mev). the ф$_{cn}$ values in the proposed ebl-free lediii, i.e. ф$_{c1}$-ф$_{c5}$, are 460.2 mev, 618.3 mev, 505.2 mev and 632.1 mev, respectively. the higher and progressively increased ф$_{cn}$ in lediii effectively confines the electrons in the active region and resists the electron overflow into the p-region. this leads to significantly reduced non-radiative recombination in the p-region and enhances hole injection into the active region.

3.3. carrier concentration

the electron and hole concentration distributions in the mqw of the various leds are displayed in fig. 5 to further understand the reason behind the substantial performance improvement in lediii. ledi shows a hole concentration of $10.4\times10^{18}$ cm$^{-3}$ in the first quantum well from the n-side, which is much lower than in ledii ($16.9\times10^{18}$ cm$^{-3}$). these results indicate that the graded qb led has superior hole transport, which raises the hole concentration in the well farthest from the p-side. the distribution of electrons in the graded qb led appears more uniform than in ledi, which may be attributed to the superior transport of holes [31].

fig. 5 carrier concentration of (a) ledi, (b) ledii and (c) lediii

the electron leakage in lediii is notably mitigated and lower than in ledii, which blocks the undesired recombination of electrons with incoming holes in the p-region.
subsequently, lediii has higher electron ($22.6\times10^{18}$ cm$^{-3}$) and hole ($23.2\times10^{18}$ cm$^{-3}$) concentrations throughout the active region compared to ledii.

3.4. radiative recombination

the distribution of radiative recombination in the active region at an injection current of 60 ma is simulated and illustrated in fig. 6. the radiative recombination distribution in ledii is more uniform compared to ledi. in ledi, the radiative recombination rate in the first qw is $2.38\times10^{28}$ cm$^{-3}$s$^{-1}$. this is probably due to the poor spatial overlap between the hole and electron distributions [32]. the electrostatic field in the mqws of lediii is lower than in ledii, which supports the spatial overlap of electron-hole wave functions and improves the radiative recombination process [33]. thus, the recombination rate of lediii is increased by ~136.7% compared to ledii. as shown in fig. 5, most electrons still accumulate in the initial well, while the hole concentration in the last well is less than that in the previous wells. however, in the conventional led, both holes and electrons are concentrated in the wells close to p-gan, hence the radiative recombination is extremely effective at that location. the above outcomes suggest that in order to reduce the droop behaviour of the led without deteriorating total recombination, more attention has to be given to the spatial distribution of the holes and electrons [34].

fig. 6 radiative recombination rate of all led samples

3.5. power

fig. 7 illustrates the luminous power vs. current for ledi and ledii. the light output is observed to increase with decreasing x because the graded qb benefits from superior electron confinement and larger hole injection efficiency. these superior optical properties are also attributed to the decrease in the polarization field in the mqw [35]. this improved power means that more carriers recombine in the qws of the graded qb led, thus effectively improving the light efficiency of the gan/algan led [36].
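relative figures such as the iqe gains of section 3.1 and the power enhancement discussed in this section are simple ratios of two simulated values. as a cross-check, the following python sketch recomputes the iqe gains from the iqe values at 60 ma reported in section 3.1 (34.79% for ledi, 45.68% for ledii and 48.4% for lediii). the helper name `relative_gain_percent` is hypothetical and is introduced here only for illustration.

```python
# Sketch only: recompute the relative IQE gains quoted in section 3.1
# from the absolute IQE values at 60 mA.

IQE_AT_60MA = {"ledi": 34.79, "ledii": 45.68, "lediii": 48.4}  # in percent


def relative_gain_percent(new_value, reference_value):
    """Relative change of new_value with respect to reference_value, in %."""
    if reference_value == 0:
        raise ValueError("reference_value must be non-zero")
    return 100.0 * (new_value - reference_value) / reference_value


if __name__ == "__main__":
    pairs = [("ledii", "ledi"), ("lediii", "ledi"), ("lediii", "ledii")]
    for new, ref in pairs:
        gain = relative_gain_percent(IQE_AT_60MA[new], IQE_AT_60MA[ref])
        print(f"{new} vs {ref}: {gain:.2f} % higher iqe")
```

the same helper can be applied to any other pair of simulated values, for example output powers at a fixed injection current.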
furthermore, as shown in fig. 8, the output power of lediii is remarkably increased to 18.075 mw from 16.723 mw (ledi) at 60 ma current injection, i.e. an enhancement of ~8.084%. the normalised power spectral density of the three devices is displayed in fig. 9. the conventional ledi has a stronger quantum-confined stark effect (qcse) induced by the spontaneous and piezoelectric fields in the mqw layers, which shows an obvious screening effect and band-filling effect. this results in a blue-shift in ledi. from fig. 9, lediii shows a red shift of ~5 nm because of the negligible presence of qcse.

fig. 7 behaviour of luminous power versus forward current with varying values of x. inset: clearer view of power at 60 ma current

fig. 8 luminous power as a function of injected current for all leds

fig. 9 room temperature el spectra of all the led devices vs. wavelength

4. conclusion

to summarize, ebl free gan/algan mqw uv-leds with specially designed graded qbs are numerically simulated. when the band gap of algan is reduced along the [0001] axis from the n-region towards the p-region in every qb, the efficiency of the device improves. the graded led with x=0.02 (ledii) achieves its highest iqe of ~45.68% at 60 ma, which is 31.3% higher than that of the conventional one (ledi) with square barriers. this improvement is attributed to the modified energy band diagrams in the graded qbs. moreover, we have numerically demonstrated an ebl free uv-led graded qb structure and observed that it can effectively suppress electron overflow and support enhanced hole injection into the led active region as compared to the conventional led. the hole transport in the mqws was notably enhanced at a current of 60 ma, which is beneficial for droop reduction.
the efficiency droop was decreased from 13.87% in the conventional led to only 10.22% in the graded qb led and further to 7.72% in the proposed ebl free led. the proposed led has an 8.084% increase in the luminous power at an injection current of 60 ma as compared to the conventional led and a 39.12% rise in the efficiency. we believe that the el performance of leds based on gan materials can be further improved through elaborate device design and careful consideration of the varying carrier transport characteristics of gan based leds, which show different conduction-to-valence band-offset ratios in their mqw structures.

acknowledgement: this work is one of the outcomes of dst-serb, govt. of india sponsored matrics project no mtr/2021/000370 which is duly acknowledged for support.

references

[1] s. das, t. r. lenka, f. a. talukdar and r. t. velpula, "carrier transport and radiative recombination rate enhancement in gan/algan multiple quantum well uv-led using band engineering for light technology", in proceedings of the 2nd international conference on micro and nanoelectronics devices, circuits and systems, mndcs 2022, pp. 1–11. [2] m. usman, u. mushtaq, m. munsif, a. r. anwar and m. kamran, "enhancement of the optoelectronic performance of p-down multiquantum well n-gan light-emitting diodes", phys. scr., vol. 94, no. 10, p. 105808, 2019. [3] h. tao, s. xu, j. zhang, p. li, z. lin and y. hao, "numerical investigation on the enhanced performance of n-polar algan-based ultraviolet light-emitting diodes with superlattice p-type doping", ieee trans. electron devices, vol. 66, no. 1, pp. 478-484, 2019. [4] s. das et al., "effects of polarized-induced doping and graded composition in an advanced multiple quantum well ingan/gan uv-led for enhanced light technology", eng. res. express, vol. 4, no. 1, p. 015030, 2022. [5] y. nagasawa and a.
hirano, "a review of algan-based deep-ultraviolet light-emitting diodes on sapphire", appl. sci., vol. 8, no. 8, p. 1264, 2018. [6] m. usman et al., "zigzag-shaped quantum well engineering of green light-emitting diode", superlattices microstruct., vol. 132, p. 106164, 2019. [7] g. kim, j. h. kim, e. h. park, d. kang and b.-g. park, "extraction of recombination coefficients and internal quantum efficiency of gan-based light emitting diodes considering effective volume of active region", opt. express, vol. 22, no. 2, p. 1235, 2014. [8] h. hu, s. zhou, x. liu, y. gao, c. gui and s. liu, "effects of gan/algan/sputtered aln nucleation layers on performance of gan-based ultraviolet light-emitting diodes", sci. rep., vol. 7, p. 44627, 2017. [9] s. zhou, x. liu, h. yan, z. chen, y. liu and s. liu, "highly efficient gan-based high-power flip-chip light-emitting diodes", opt. express, vol. 27, no. 12, pp. a669–a692, 2019. [10] x. zhao, b. tang, l. gong, j. bai, j. ping and s. zhou, "rational construction of staggered ingan quantum wells for efficient yellow light-emitting diodes", appl. phys. lett., vol. 118, no. 18, p. 182102, 2021. [11] s. zhou et al., "numerical and experimental investigation of gan-based flip-chip light-emitting diodes with highly reflective ag/tiw and ito/dbr ohmic contacts", opt. express, vol. 25, no. 22, p. 26615, 2017. [12] y. meng et al., "growth and characterization of amber light-emitting diodes with dual-wavelength ingan/gan multiple-quantum-well structures", mater. res. express, vol. 6, no. 8, p. 0850c8, 2019. [13] c. h. wang et al., "efficiency droop alleviation in ingan/gan light-emitting diodes by graded-thickness multiple quantum wells", appl. phys. lett., vol. 97, no. 18, p. 181101, 2010. [14] z. lin, x. chen, y. zhu, x. chen, l. huang and g. li, "influence of thickness of p-ingan layer on the device physics and material qualities of gan-based leds with p-gan/ingan heterojunction", ieee trans. electron devices, vol. 65, no. 12, pp. 
5373–5380, 2018. [15] m. usman, a. r. anwar, m. munsif, s. malik and n. u. islam, "analytical analysis of internal quantum efficiency with polarization fields in gan-based light-emitting diodes", superlattices microstruct., vol. 135, p. 106271, 2019. [16] h. hu et al., "boosted ultraviolet electroluminescence of ingan/algan quantum structures grown on high-index contrast patterned sapphire with silica array", nano energy, vol. 69, p. 104427, 2020. [17] x. fan, s. xu, h. tao, r. peng, j. du, y. zhao, j. zhang, j. zhang and y. hao, "improved performance of gan-based ultraviolet leds with the stair-like si-doping n-gan structure", mdpi, vol. 11, no. 10, p. 1203, 2021. [18] m. h. kim et al., "origin of efficiency droop in gan-based light-emitting diodes", appl. phys. lett., vol. 91, no. 18, pp. 1-4, 2007. [19] j. h. park et al., "enhanced overall efficiency of gainn-based light-emitting diodes with reduced efficiency droop by al-composition-graded algan/gan superlattice electron blocking layer", appl. phys. lett., vol. 103, no. 6, 2013. [20] c. sheng xia, z. m. simon li, w. lu, z. hua zhang, y. sheng and l. wen cheng, "droop improvement in blue ingan/gan multiple quantum well light-emitting diodes with indium graded last barrier", appl. phys. lett., vol. 99, no. 23, p. 233501, 2011. [21] s. das, t. r. lenka, f. a. talukdar, r. t. velpula and h. p. t. nguyen, "carrier transport and radiative recombination rate enhancement in gan/algan multiple quantum well uv-led using band engineering for light technology", in: lenka, t.r., misra, d., fu, l. (eds) micro and nanoelectronics devices, circuits and systems. lecture notes in electrical engineering, vol. 904. springer, singapore. pp. 187-198. [22] j. cho, e. f. schubert and j. k. kim, "efficiency droop in light-emitting diodes: challenges and counter measures", laser photonics rev., vol. 7, no. 3, pp. 408-421, 2013.
[23] h. hirayama et al., "222-282 nm algan and inalgan-based deep-uv leds fabricated on high-quality aln on sapphire", phys. status solidi appl. mater. sci., vol. 206, no. 6, pp. 1176–1182, 2009. [24] c. chu et al., "on the origin of enhanced hole injection for algan-based deep ultraviolet light-emitting diodes with aln insertion layer in p-electron blocking layer", opt. express, vol. 27, no. 12, p. a620, 2019. [25] m. l. nakarmi, n. nepal, j. y. lin and h. x. jiang, "photoluminescence studies of impurity transitions in mg-doped algan alloys", appl. phys. lett., vol. 94, no. 9, pp. 1–5, 2009. [26] s. clara, "silvaco user's manual device simulation software", no. october, 2004, [online]. available: www.silvaco.com. [27] b.-c. lin et al., "hole injection and electron overflow improvement in ingan/gan light-emitting diodes by a tapered algan electron blocking layer", opt. express, vol. 22, no. 1, p. 463, 2014. [28] j. li et al., "carrier transport improvement in zno/mgzno multiple-quantum-well ultraviolet light-emitting diodes by energy band modification on mgzno barriers", opt. commun., vol. 459, 2020. [29] k. mehta et al., "theory and design of electron blocking layers for iii-n-based laser diodes by numerical simulation", ieee j. quantum electron., vol. 54, no. 6, pp. 1–11, 2018. [30] h. hirayama, s. fujikawa and n. kamata, "recent progress in algan-based deep-uv leds", electron. commun. japan, vol. 98, no. 5, pp. 1–8, 2015. [31] r. charash et al., "carrier distribution in ingan/gan tricolor multiple quantum well light emitting diodes", appl. phys. lett., vol. 95, no. 15, pp. 2007-2010, 2009. [32] s. zhou, j. lv, y. wu, y. zhang, c. zheng and s. liu, "reverse leakage current characteristics of ingan/gan multiple quantum well ultraviolet/blue/green light-emitting diodes", jpn. j. appl. phys., vol. 57, no. 5, p. 051003, 2018. [33] y. a. yin, n. wang, g. fan and y.
zhang, "investigation of algan-based deep-ultraviolet light-emitting diodes with composition-varying algan multilayer barriers", superlattices microstruct., vol. 76, pp. 149–155, 2014. [34] j. chang et al., "algan-based multiple quantum well deep ultraviolet light-emitting diodes with polarization doping", ieee photonics j., vol. 8, no. 1, pp. 1–7, 2016. [35] t. y. wang et al., "algan-based deep ultraviolet light emitting diodes with magnesium delta-doped algan last barrier", appl. phys. lett., vol. 117, no. 25, p. 251101, 2020. [36] h. li, c. j. chang, s. y. kuo, h. c. wu, h. huang and t. c. lu, "improved performance of near uv gan-based light emitting diodes with asymmetric triangular multiple quantum wells", ieee j. quantum electron., vol. 55, no. 1, pp. 1-4, 2019.

facta universitatis
series: electronics and energetics vol. 35, no 3, september 2022, pp. 379-391
https://doi.org/10.2298/fuee2203379l
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

verification of calculation method for drone micro-doppler signature estimation

aleksandar lebl, mladen mileusnić, dragan mitić, jovan radivojević, vladimir matić
iritel a.d., belgrade, serbia

abstract. drone micro-doppler signatures obtained by fmcw radars are an excellent means of malicious drone detection, identification and classification. a number of contributions deal with recorded spectrograms containing these micro-doppler signatures, but very few of them have analyzed the possibility of calculating the echo caused by drone moving parts. in this paper, starting from the already existing mathematical apparatus, we present such spectrograms as a function of changing characteristics of drone moving parts: number of rotors, number of blades, blade length and rotor rotation speed. this development is part of a wider project intended to prevent malicious drone usage.
key words: malicious drone detection, fmcw radar, spectrogram, drone micro-doppler signatures, calculation method

received november 25, 2021; revised december 28, 2021; accepted january 10, 2022
corresponding author: aleksandar lebl
iritel a.d., 11080 belgrade, batajnički put 23, serbia
e-mail: lebl@iritel.com

1. introduction

drones or unmanned aerial vehicles (uavs) are increasingly present in our everyday lives. they may be used in many friendly types of missions such as aerial photography, traffic supervision, disaster monitoring, precision agriculture, industrial inspection, goods delivery and so on. on the other hand, drones are used for a number of different malicious purposes [1]. drones may carry explosive devices with the intention of causing numerous victims and damage to objects such as airports, stadiums, governmental buildings, residential areas, commercial and industrial facilities, power plants, etc. they may be used for smuggling over state borders or into and out of prisons, for starting fires in hardly accessible forest areas or for assassinating important persons. there are a huge number of examples of each of these malicious activities, successfully or unsuccessfully realized. this is the reason why systems for drone detection, identification, localization and classification (dilc) have become very important today.

the most often applied sensors for drone dilc are radar, rf signal detector, optical camera, thermal camera and acoustic detector. the benefits and drawbacks of each sensor type are discussed in detail in [2]. drone dilc is usually performed using several of the mentioned sensor types. the selected sensors are then combined in one solution [3]-[6]. among these sensor types, radar, especially frequency modulated continuous wave (fmcw) radar, is probably the most often applied technique [7]-[17].
The main principles of FMCW radar realization are described in [18]-[20]. FMCW radar allows reliable classification of the detected drone based on the analysis of drone micro-Doppler signatures. Several typical drone construction and functional characteristics, such as the number of rotors, the number of blades in each rotor, rotor angular velocity and the length of blades, may be determined only by FMCW radar on the basis of the drone micro-Doppler signature, even in bad weather conditions.

Among contributions in the domain of FMCW radar, micro-Doppler signatures for various drone types are presented and analyzed in [8]-[11], [15], [17]. Contribution [8] gives several drone micro-Doppler signature graphs in various flight phases (take-off, hovering, flying). In this respect [8] is more complete than our paper, but it does not explain the connection between the graphs and the derived formulas for micro-Doppler signature calculation. The paper [9] presents a number of micro-Doppler signature graphs, but with the addition of signals used for communication between the drone and its operator, signals for drone video communication and so on. In [9] it is not possible to distinguish the spectrum behaviour caused by drone flight from the components of other frequency spectrum sources. Contributions [10], [15] are interesting because they pave the way for comparing drone and bird micro-Doppler signatures; drones and birds are often hard to distinguish due to their similar dimensions. The paper [21] contains a very detailed theoretical and practical analysis of drone micro-Doppler signatures, but on the basis of experiments performed for drones at a distance of only several meters from the radar. Drone micro-Doppler signature graphs are often analyzed by applying artificial intelligence algorithms, as for example in [22].

Elements which influence the characteristics of the drone micro-Doppler signature are briefly described in Section 2.
The calculation method for drone micro-Doppler signature determination is described in detail in Section 3. The calculation method is illustrated in Section 4 by a number of examples in which the drone physical characteristics and its position relative to the FMCW radar are changed. The concluding comments are given in Section 5.

2. Drone parts causing micro-Doppler effect

All drone moving parts may cause a micro-Doppler effect detectable by FMCW radar. This is very important, because even a hovering drone will be detected by the radar sensor. The drone's micro-movable parts are its rotors. Each drone has a certain number of rotors, as presented in Fig. 1. There are Nr=4 rotors in the example from Fig. 1. The drone micro-Doppler signature depends on this number of rotors. The second important factor which influences the drone micro-Doppler signature is the number of blades (N) in each rotor. Blades 1 and 2 are designated in Fig. 1. The two remaining blade characteristics which determine the micro-Doppler signature are the blade length (L) and the blade rotation speed (Ω). The drone micro-Doppler echo also depends on the drone (i.e. drone rotors) elevation angle (β) relative to the radar level. This angle is determined by the drone height (h) and its distance from the radar (R0).

Fig. 1 Elements which have influence on the drone micro-Doppler signature

3. Calculation method

The method for drone micro-Doppler signature calculation may be explained with reference to Fig. 1. The main characteristic of FMCW radar is that it generates a signal whose frequency varies as a function of time. This frequency change is usually linear (sweep signal) and it is essential for the FMCW radar detection principle of operation.
The generated signal may be expressed by the equation [23]

$$ s(t) = \cos\big(2\pi\,(f_c + b\,t)\,t\big) \tag{1} $$

where fc is the starting frequency of the FMCW radar sweep signal and b is the slope of the generated sweep signal. The generated signal is periodically repeated.

The returned echo signal from the rotor blades may be expressed by the equation from [7]:

$$ s_\Sigma(t) = \sum_{k=0}^{N-1} s_{Lk}(t) = L\,\exp\left\{-j\frac{4\pi}{\lambda}\left[R_0 + z_0\sin\beta\right]\right\}\sum_{k=0}^{N-1}\mathrm{sinc}\{\Phi_k(t)\}\,\exp\{-j\Phi_k(t)\} \tag{2} $$

where

$$ \Phi_k(t) = \frac{4\pi}{\lambda}\,\frac{L}{2}\,\cos\beta\,\cos\left(\Omega t + \varphi_0 + k\,\frac{2\pi}{N}\right), \quad k = 0, 1, 2, \ldots, N-1. \tag{3} $$

In these two equations:
▪ L is the length of each blade;
▪ N is the number of blades in each rotor;
▪ R0 is the distance between the radar and the drone rotor (approximately the same as between the radar and the drone);
▪ z0 is the drone height;
▪ β is the drone elevation angle in relation to the radar;
▪ Ω is the rotor angular rotation speed;
▪ φ0 is the initial rotation angle;
▪ λ is the FMCW signal wavelength.

The magnitude of the rotor echo signal is

$$ \left|s_\Sigma(t)\right| = \left| L\,\exp\left\{-j\frac{4\pi}{\lambda}\left[R_0 + z_0\sin\beta\right]\right\}\sum_{k=0}^{N-1}\mathrm{sinc}\{\Phi_k(t)\}\,\exp\{-j\Phi_k(t)\} \right|. \tag{4} $$

The echo signal of all drone rotors is calculated according to the expression from [8]:

$$ s_\Sigma(t) = \sum_{i=1}^{N_r}\sum_{k=0}^{N-1} s_{Lk,i}(t) = \sum_{i=1}^{N_r} L\,\exp\left\{-j\frac{4\pi}{\lambda}\left[R_{0i} + z_{0i}\sin\beta_i\right]\right\}\sum_{k=0}^{N-1}\mathrm{sinc}\{\Phi_{ik}(t)\}\,\exp\{-j\Phi_{ik}(t)\} \tag{5} $$

where Nr is the number of drone rotors and

$$ \Phi_{ik}(t) = \frac{4\pi}{\lambda}\,\frac{L}{2}\,\cos\beta_i\,\cos\left(\Omega_i t + \varphi_{0i} + k\,\frac{2\pi}{N}\right), \quad k = 0, 1, 2, \ldots, N-1. \tag{6} $$

As in the case of only one rotor, the magnitude of the whole drone echo signal is, similarly to equation (4):

$$ \left|s_\Sigma(t)\right| = \left|\sum_{i=1}^{N_r}\sum_{k=0}^{N-1} s_{Lk,i}(t)\right| = \left|\sum_{i=1}^{N_r} L\,\exp\left\{-j\frac{4\pi}{\lambda}\left[R_{0i} + z_{0i}\sin\beta_i\right]\right\}\sum_{k=0}^{N-1}\mathrm{sinc}\{\Phi_{ik}(t)\}\,\exp\{-j\Phi_{ik}(t)\}\right|. \tag{7} $$

The usual way to analyze drone micro-Doppler signatures is by means of drone spectrograms. Spectrograms present the frequency spectrum of a signal as a function of time. They are obtained by calculating the short-time Fourier transform (STFT) [24]:

$$ \mathrm{STFT}(s_n(m,\omega)) = \sum_{n=-\infty}^{\infty} s_n\, w_{n-m}\, \exp(-j\omega t_n) \tag{8} $$

or, on the logarithmic scale,

$$ \mathrm{STFT(dB)} = 20\cdot\log\big(\mathrm{STFT}(s_n(m,\omega))\big) \tag{9} $$

The meaning of the variables in (8) is:
▪ sn – sequence of time samples of the signal whose spectrogram is calculated;
▪ wn – sequence of time samples of the selected window function;
▪ m – time index, i.e. time shift of the moment for which the spectrogram is calculated;
▪ ω – frequency of the signal.

The Hanning window is most often selected for the calculation of STFT. The sequence of the discretized Hanning window function is expressed as [25]:

$$ w_n = \frac{1}{2}\left[1 - \cos\left(\frac{2\pi n}{N}\right)\right], \quad n = 0, 1, 2, \ldots, N \tag{10} $$

4. Drone spectrograms

Drone spectrograms obtained using equations (2) to (9) are presented in Figures 2 to 10. They are derived by varying the mechanical and position characteristics of drones in order to analyze how the change of each parameter influences the spectrogram. The analysis is presented for a hovering drone, which means that the rotor blades are the only moving parts of the drone. The majority of spectrograms are presented for a single rotor, which corresponds to the class of helicopter-shaped drones. This class is smaller in number than the class of quadcopters (which have four rotors).

The starting spectrogram is presented in Fig. 2. It corresponds to the case of only one rotor with one blade. The blade rotation speed is ωrot=30 rotations/s and the blade length is L=0.24 m.
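Equations (2), (3) and (8) to (10) can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the window length `n_win`, the hop size `hop`, the conversion of rotations/s to rad/s and the choice z0 = h are all assumptions introduced here.

```python
import numpy as np

def rotor_echo(t, L, N, rot_per_s, h, R0, wavelength, phi0=0.0):
    """Complex echo of one rotor with N blades, after Eqs. (2) and (3)."""
    beta = np.arcsin(h / R0)            # elevation angle, sin(beta) = h / R0
    Omega = 2.0 * np.pi * rot_per_s     # rotations/s -> rad/s (assumed conversion)
    k = np.arange(N)[:, None]           # blade index, one row per blade
    # Eq. (3): phase term of blade k at every time sample
    Phi = (4.0 * np.pi / wavelength) * (L / 2.0) * np.cos(beta) * np.cos(
        Omega * t[None, :] + phi0 + 2.0 * np.pi * k / N)
    # np.sinc(x) = sin(pi x)/(pi x), so divide by pi to get sin(Phi)/Phi
    blades = np.sinc(Phi / np.pi) * np.exp(-1j * Phi)
    # Eq. (2): common phase factor, with z0 taken equal to the drone height h
    carrier = np.exp(-1j * (4.0 * np.pi / wavelength) * (R0 + h * np.sin(beta)))
    return L * carrier * blades.sum(axis=0)

def stft_db(s, n_win=128, hop=16):
    """Spectrogram in dB after Eqs. (8)-(10); n_win and hop are illustrative."""
    n = np.arange(n_win + 1)
    w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / n_win))   # Eq. (10), n = 0..N
    starts = range(0, len(s) - len(w) + 1, hop)          # time shifts m
    frames = np.array([s[m:m + len(w)] * w for m in starts])
    spectrum = np.fft.fft(frames, axis=1)                # Eq. (8), up to a phase factor
    return 20.0 * np.log10(np.abs(spectrum) + 1e-12)     # Eq. (9); offset avoids log(0)

fs = 20e3                         # sampling rate f_step = 20 kHz
t = np.arange(2000) / fs          # 0.1 s observation interval
s = rotor_echo(t, L=0.24, N=1, rot_per_s=30.0, h=30.0, R0=100.0, wavelength=0.0125)
S_db = stft_db(s)                 # rows: time shifts, columns: frequency bins
```

The multi-rotor case of equation (5) would follow by summing `rotor_echo` over the rotors, each with its own distance, height and initial angle. The frequency axis and colour scale of the resulting array depend on the chosen window length, so the sketch is not expected to reproduce the figures exactly.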
After these mechanical characteristics, the drone position relative to the radar is defined by its height h=30 m and its distance from the radar R0=100 m, meaning that the drone elevation angle relative to the radar is β=arcsin(0.3). The radar functional characteristics are the operating frequency f=24 GHz (operating wavelength 0.0125 m) and the sampling rate fstep=20 kHz. The time interval for spectrogram presentation is 0.1 s and the waveform repeats 3 times during this interval, i.e. 30 times in 1 s. It means that the spectrogram appearance (time repetition rate) directly follows from the rotor rotation speed.

[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 2 Drone spectrogram for one rotor with one blade, the blade length L=0.24 m, blade rotation speed ωrot=30 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

For our analysis in this paper it is important to notice the frequency at which the signal echo falls below -40 dB, i.e. where the spectrogram colour changes from yellow to green. In the case of the spectrogram from Fig. 2 this frequency is 144 Hz.

Fig. 3 presents the drone spectrogram for the same parameters as in Fig. 2, with the only difference that the blade rotation speed is ωrot=20 rotations/s. Two modifications are noticeable as a consequence of the ωrot change: the signal repetition rate has dropped from 3 to 2 during 0.1 s and the frequency at which the signal echo falls below -40 dB is 96 Hz. In both cases the parameter ratio is 2/3, the same as the ratio of the ωrot values. This change of the important frequency bandwidth is important for our further analysis.

Fig. 4 presents the drone spectrogram for the case where the blade length has been changed compared to Fig. 2. In this case the frequency at which the signal echo falls below -40 dB is a bit more than 72 Hz. It means that the bandwidth of important frequencies has dropped in the ratio 1/2, the same as the ratio of blade lengths.

[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 3 Drone spectrogram for one rotor with one blade, the blade length L=0.24 m, blade rotation speed ωrot=20 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.
[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 4 Drone spectrogram for one rotor with one blade, the blade length L=0.12 m, blade rotation speed ωrot=30 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

Fig. 5 presents the drone spectrogram for the case when its height has changed from h1=30 m to h2=70 m. It means that the ratio of elevation angle cosine functions has changed in the ratio

$$ q_{elev} = \frac{\sqrt{1-\left(\dfrac{h_1}{R_0}\right)^2}}{\sqrt{1-\left(\dfrac{h_2}{R_0}\right)^2}} = 1.335 \tag{11} $$

The bandwidth of important frequencies has changed in approximately the same ratio: from 144 Hz to about 109 Hz for the limit of -40 dB or, in other words, this ratio is 1.32.

Fig. 6 presents the spectrogram for the more probable case that the rotor has two blades. The other parameters for this spectrogram are the same as in Fig. 2. The important frequencies bandwidth remains 144 Hz as in Fig. 2, but the repetition rate is twice that in Fig. 2, or 6 in total, due to the increased number of blades.
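The ratios read from Figs. 3 to 5 can be cross-checked numerically. The sketch below evaluates equation (11) and then scales the 144 Hz limit of Fig. 2, assuming that the -40 dB frequency is proportional to the product of blade length, rotation speed and the cosine of the elevation angle. That proportionality is an inference from the trends reported above, not a formula stated in the paper, and the function names are hypothetical.

```python
import math

def cos_elevation(h, R0):
    """Cosine of the elevation angle, with sin(beta) = h / R0."""
    return math.sqrt(1.0 - (h / R0) ** 2)

# Eq. (11): ratio of elevation-angle cosines for h1 = 30 m and h2 = 70 m
q_elev = cos_elevation(30.0, 100.0) / cos_elevation(70.0, 100.0)

def scaled_bandwidth(L, rot_per_s, h, R0=100.0, f_ref=144.0):
    """-40 dB frequency scaled from the Fig. 2 reference case.

    Assumes proportionality to L * rotation speed * cos(beta); this is an
    inference from the reported trends, not a formula from the paper.
    """
    ref = 0.24 * 30.0 * cos_elevation(30.0, R0)   # Fig. 2 parameters
    return f_ref * L * rot_per_s * cos_elevation(h, R0) / ref

cases = {
    "Fig. 3": (0.24, 20.0, 30.0),   # lower rotation speed
    "Fig. 4": (0.12, 30.0, 30.0),   # half blade length
    "Fig. 5": (0.24, 30.0, 70.0),   # greater height
    "Fig. 7": (0.12, 60.0, 30.0),   # half length, double speed
}
print("q_elev =", round(q_elev, 4))
for name, (L, rot, h) in cases.items():
    print(name, round(scaled_bandwidth(L, rot, h), 1), "Hz")
```

Under this assumption the scaled values for Figs. 3 and 4 are 96 Hz and 72 Hz, and the value for Fig. 5 comes out near 108 Hz, close to the roughly 109 Hz read from the spectrogram. The ratio itself evaluates to about 1.336, which the text quotes as 1.335.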
A highly similar spectrogram is obtained for the example of a rotor with one blade with two-fold rotation speed (ωrot=60 rotations/s) and half the blade length (L=0.12 m), and special attention has to be paid to distinguishing these two cases. The spectrogram for this second case is presented in Fig. 7.

[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 5 Drone spectrogram for one rotor with one blade, the blade length L=0.24 m, blade rotation speed ωrot=30 rotations/s, drone height h=70 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 6 Drone spectrogram for one rotor with two blades, the blade length L=0.24 m, blade rotation speed ωrot=30 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 7 Drone spectrogram for one rotor with one blade, the blade length L=0.12 m, blade rotation speed ωrot=60 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

The echo signal at the frequency 0 Hz may be used to distinguish between the case in Fig. 6 and the case in Fig. 7. The echo signal amplitude oscillations are significantly greater when the rotation speed is lower, as illustrated by the characteristics presented in Fig. 8 and Fig. 9. The peak-to-peak amplitude of the oscillations is about 17 dB when there are two blades of 0.24 m length rotating at 30 rotations/s (Fig. 8), compared to only about 2.5 dB when there is one blade of 0.12 m length rotating at ωrot=60 rotations/s (Fig. 9). Presenting the echo signal at the frequency 0 Hz in order to distinguish spectrograms more reliably in some cases is, to the best of our knowledge, an original contribution of this paper.

[Figure: echo amplitude A [dB], -30 to 0, versus time t [s], 0.002 to 0.092]

Fig. 8 Echo at the frequency 0 Hz for the case of one rotor with two blades, the blade length L=0.24 m, blade rotation speed ωrot=30 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

[Figure: echo amplitude A [dB], -30 to 0, versus time t [s], 0.002 to 0.092]

Fig. 9 Echo at the frequency 0 Hz for the case of one rotor with one blade, the blade length L=0.12 m, blade rotation speed ωrot=60 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

The typical drone construction has 4 rotors, each with two blades. The spectrogram for such a construction is presented in Fig. 10.
The consequence of having more rotors and blades is that the echo signal periodicity is less obvious and that the limit value of important echo frequencies is practically constant as a function of time.

[Figure: spectrogram; axes t (sec), 0.002 to 0.098, and f (Hz), 0 to 192; colour legend in 10 dB bands from -140 dB to 10 dB]

Fig. 10 Drone spectrogram for four rotors with two blades, the blade length L=0.24 m, blade rotation speed ωrot=30 rotations/s, drone height h=30 m, drone distance from radar R0=100 m, FMCW radar operating frequency f=24 GHz, digital sampling rate fstep=20 kHz.

The graphs in Figures 2-7 may be compared to the selected graph from [8], which corresponds to the micro-Doppler signature of rotors obtained by practical recording. The great similarity is obvious, with the exception that the graph in [8] is presented for positive and negative frequencies and the echo signal is symmetrical about the frequency 0 Hz. This graph from [8] is presented in Fig. 11. It is periodic: the number of periodical changes is 18 during 1 s. According to this characteristic, the graph is most similar to the graph in Fig. 3. The frequency where the signal echo rapidly decreases is about 100 Hz.
Let us further suppose that the drone elevation angle, i.e. the cosine of the elevation angle, can be determined by some other technique. The final element to determine is then the length of the rotor blade/blades (L). Under the assumption that the elevation angle is the same as in Fig. 3, we obtain L=0.25 m. But if the drone is situated approximately vertically above the FMCW radar (i.e. the elevation angle tends to 90°) and the spectrogram is without significant changes, the corresponding L quickly grows.

The graph in Fig. 10 is similar to the graph from [8] which corresponds to the drone in the hovering state. This graph from [8] is presented in Fig. 12. There is no obvious periodicity in the recorded characteristic. Such a graph is a clear sign that there is a higher number of rotors, probably with more than one blade.

The summary of conditions for spectrogram calculation in Figures 2-10 is presented in Table 1. The main specifics describing the obtained spectrograms for each combination of conditions (i.e. each figure) are also presented in Table 1.

Fig. 11 Practical rotor micro-Doppler record [8]

Fig. 12 Practical drone micro-Doppler record [8]

Table 1 Summary of figure characteristics

| Figure | Conditions for spectrogram calculation | Output spectrogram description |
|---|---|---|
| 2 | 1 rotor, 1 blade, L=0.24 m, ωrot=30/s, h=30 m, R0=100 m, f=24 GHz, fstep=20 kHz | Waveform repetition rate 30/s; attenuation 40 dB at 144 Hz |
| 3 | Figure 2 with ωrot=20/s | Waveform repetition rate 20/s; attenuation 40 dB at 96 Hz |
| 4 | Figure 2 with L=0.12 m | Waveform repetition rate 30/s; attenuation 40 dB at 72 Hz |
| 5 | Figure 2 with h=70 m (cosine of elevation angle lower by a factor of 1.335) | Waveform repetition rate 30/s; attenuation 40 dB at 109 Hz |
| 6 | Figure 2 with two blades | Waveform repetition rate 60/s; attenuation 40 dB at 144 Hz |
| 7 | Figure 2 with L=0.12 m and ωrot=60/s | Waveform repetition rate 60/s; attenuation 40 dB at 144 Hz |
| 8 | Figure 2 with two blades | Amplitude oscillations peak-to-peak 17 dB at 0 Hz |
| 9 | Figure 2 with L=0.12 m and ωrot=60/s | Amplitude oscillations peak-to-peak 2.5 dB at 0 Hz |
| 10 | Figure 2 with four rotors and two blades | Echo frequencies constant in time, signal periodicity less obvious |

5. Conclusions

A calculation method for drone micro-Doppler signature determination is presented in this paper. The influence of various drone parameters (number of rotors, number of blades forming a rotor, blade rotation rate, blade length) on the spectrogram shape is analyzed. Special attention is devoted to how it is possible to distinguish some combinations of drone characteristics which give very similar spectrograms. All results are presented for an FMCW radar which operates at the frequency of 24 GHz. The method and the results from the paper may be used when measurement results are not available. The results of calculation are compared to similar examples from measurements, and the similarity of the results from these two groups is verified by several practical examples.

The results from this paper are related only to the hovering drone.
Our plan for future investigation is to try to develop the calculation method for drones in other flying modes (flying, take-off, etc.). Micro-Doppler spectrograms are applicable for drone detection, identification and classification by artificial intelligence algorithms. Our other planned development direction is to use calculated spectrograms for training neural networks in the first phase of constructing such networks, when numerous practical records of various drone types are still not available.

References

[1] V. Matić, V. Kosjer, A. Lebl, B. Pavić and J. Radivojević, "Methods for drone detection and jamming", in Proceedings of the 10th International Conference on Information Society and Technology (ICIST), Kopaonik, 2020, pp. 16–21.
[2] N. Eriksson, Conceptual study of a future drone detection system countering a threat posed by a disruptive technology. Master thesis in product development, Chalmers University of Technology, Gothenburg, Sweden, 2018.
[3] Advanced Protection Systems, Ctrl+Sky drone detection and neutralization system, 2017, http://apsystems.tech/wp-content/uploads/2018/01/aps_broszura_web.pdf.
[4] DroneShield, "Product information", 2018.
[5] H. Liu, F. Qu, Y. Liu, W. Zhao and Y. Chen, "A drone detection with aircraft classification based on a camera array", in Proceedings of the 2018 IOP Conference Series: Materials Science and Engineering, vol. 322, p. 052005, 2018, pp. 1–7.
[6] X. Shi, C. Yang, C. Liang, Z. Shi and J. Chen, "Anti-drone system with multiple surveillance technologies: architecture, implementation, and challenges", IEEE Commun. Magaz., vol. 56, no. 4, pp. 68–74, 2018.
[7] V. C. Chen, The micro-Doppler effect in radar. Artech House, second edition, 2019, ISBN: 978-1-63081-546-2.
[8] C. Zhao, G. Luo, Y. Wang, C. Chen and Z. Wu, "UAV recognition based on micro-Doppler dynamic attribute-guided augmentation algorithm", Remote Sensing, vol. 13, no. 6, p. 1205, pp. 1–17, 2021.
[9] T. Šević, V. Joksimović, I. Pokrajac, R. Brusin, B. Sazdić-Jotić and D. Obradović, "Interception and detection of drones using RF-based dataset of drones", Sci. Tech. Rev., vol. 70, no. 2, pp. 29–34, 2020.
[10] S. Rahman and D. A. Robertson, "Radar micro-Doppler signatures of drones and birds at K-band and W-band", Sci. Rep., vol. 8, pp. 1–11, 2018.
[11] Y. Cai, O. Krasnov and A. Yarovoy, "Simulation of radar micro-Doppler patterns for multi-propeller drones", in Proceedings of the International Radar Conference (RADAR-2019), Toulon, 2019, pp. 1–5.
[12] W. Wang, J. Du and J. Gao, "Multi-target detection method based on variable carrier frequency chirp sequence", Sensors, vol. 18, p. 3386, pp. 1–12, 2018.
[13] A. Coluccia, G. Parisi and A. Fascista, "Detection and classification of multirotor drones in radar sensor networks: a review", Sensors, vol. 20, p. 4172, pp. 1–22, 2020.
[14] M. Daković, M. Brajović, T. Thayaparan and Lj. Stanković, "An algorithm for micro-Doppler period estimation", in Proceedings of the 20th Telecommunications Forum (TELFOR), Belgrade, 2012, pp. 851–854.
[15] P. Molchanov, Radar target classification by micro-Doppler contributions. Thesis for the degree of Doctor of Science in Technology, publication 1255, Tampere University of Technology, Finland, October 2014, ISSN 1459-2045.
[16] E. Hyun, Y.-S. Jin and J.-H. Lee, "Design and implementation of 24 GHz multichannel FMCW surveillance radar with a software-reconfigurable baseband", J. Sensors, vol. 2017, p. 3148237, pp. 1–11, 2017.
[17] B. Karlsson, Modeling multicopter radar return. Master's thesis in applied physics, Chalmers University of Technology, Department of Electrical Engineering, Gothenburg, Sweden, 2017.
[18] V. M. Milovanović, "On fundamental operating principles and range-Doppler estimation in monolithic frequency-modulated continuous-wave radar sensors", FU Elec. Energ., vol. 31, no. 4, pp. 547–570, 2018.
[19] C. Iovescu and S. Rao, The fundamental of millimeter wave radar sensors. Texas Instruments, 2020.
[20] J. Zhu, Low-cost, software defined FMCW radar for observations of drones. Master thesis, University of Oklahoma, Graduate College, 2017.
[21] M. Passafiume, N. Rojhani, G. Collodi and A. Cidronali, "Modeling small UAV micro-Doppler signature using millimeter-wave FMCW radar", Electronics, vol. 10, no. 6, pp. 1–16, 2021.
[22] J. Park, J.-S. Park and S.-O. Park, "Small drone classification with light CNN and new micro-Doppler signature extraction method based on A-SPC technique", https://arxiv.org/abs/2009.14422, pp. 1–5, 2020.
[23] T. Tang and C. Wu, Design of new frequency modulated continuous wave (FMCW) target tracking radar with digital beamforming tracking. Defense Research and Development Canada, scientific report DRDC-RDDC-2019-R175, 2019.
[24] M. Ahmadizadeh, An introduction to short-time Fourier transform (STFT). Sharif University of Technology, Department of Civil Engineering, 2014.
[25] H. A. Gaberson, "A comprehensive windows tutorial", Sound and Vibration, instrumentation reference issue, pp. 14–23, 2006.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 27, No 3, September 2014, pp. i-i

Editorial

Since my appointment as the new Editor-in-Chief of Facta Universitatis: Series Electronics and Energetics in October 2013, we have published a series of three special anniversary issues dedicated to the journal's majestic age of a quarter of a century. The papers published in the special anniversary issues not only met the goals consistent with our focused aims, but surpassed our expectations in quality and practical value. Over the past year, we have received submissions and published papers from a very broad geographical area, making Facta Universitatis: Series Electronics and Energetics a truly international journal.
However, our job is not finished yet and the journal will be improved further. This is the fun part of the job; often the journey is more enjoyable than the destination itself.

This issue, the first in the series of forthcoming regular issues, is a collection of 5 invited papers by well-known experts in the specific areas, most of them members of the Advisory Board and Editorial Board, and 7 research papers by authors from the Serbian academic environment, who present and discuss state-of-the-art issues of practical interest in the field.

On behalf of our editorial team, I promise to continue to develop and improve Facta Universitatis: Series Electronics and Energetics in order to keep it at the forefront of science and technology.

Ninoslav Stojadinović
Editor-in-Chief

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 29, No 3, September 2016, pp. 395-405
DOI: 10.2298/FUEE1603395V

A New Telerehabilitation System Based on Internet of Things

Sanja Vukićević¹, Zoran Stamenković², San Murugesan³, Zorica Bogdanović¹, Božidar Radenković¹
¹ Faculty of Organizational Science, University of Belgrade, Serbia
² IHP, Frankfurt (Oder), Germany
³ BRITE Professional Services and Western Sydney University, Australia

Abstract. Internet of Things (IoT) applied in the healthcare system has a huge potential to improve patients' quality of life. Representing a network of devices embedded with electronics and sensors, IoT enables constant monitoring of vital body functions and tracking of a person's physical activities, and aids rehabilitation physical therapy. Such an IoT-based system would allow a standalone recovery process, minimizing the need for dedicated medical personnel, and could be used in both hospital and home conditions. In this paper, we present a telerehabilitation system that uses a wearable muscle sensor and Microsoft Kinect to create interactive personalized physical therapy that can be carried out at home.
Early experiments and results of a pilot implementation validate the feasibility and effectiveness of the proposed IoT-enabled telerehabilitation system.

Key words: telerehabilitation, muscle sensor, Kinect, wearable sensor, telemedicine, physical therapy

1. Introduction

Internet of Things (IoT) is a contemporary technology with the potential to alter or replace various methods of classical medicine [1] and improve healthcare. The advantage of measuring physical parameters using IoT devices instead of conventional ones is that the connected intelligent IoT devices can carry out measurements independently and carry out a specific action based on the measurement results. Also, the results of measurements are available via the Internet and can be recorded in electronic form, enabling medical personnel to monitor a patient's state from any location at any time. The most common application of IoT in healthcare is in wellness, using devices for measuring daily activities such as walking, running or riding a bicycle.

Received June 30, 2015; received in revised form October 27, 2015
Corresponding author: Sanja Vukićević
Faculty of Organizational Science, University of Belgrade, Jove Ilića 154, 11000 Belgrade, Serbia
(e-mail: sandzii@gmail.com)

Telerehabilitation is recognized as a necessary form of treatment for numerous neurological, neuromusculoskeletal, cardiovascular and other conditions [2][3][4]. The number of people requiring telerehabilitation is rising. For example, according to a World Health Organization report [5], 5 million people survive a stroke each year and half of them remain with hemiparesis (weakness of one side of the body). Medical and rehabilitation institutions are usually limited in space and personnel, so patients are forced to continue practicing physical therapy at home.
expenses for traveling to rehabilitation centres for daily therapy are not insignificant for disabled persons, which contributes to the need for telerehabilitation. in this article, we present the design of a telerehabilitation system based on iot, which will enable effective physical therapy to be carried out remotely, give competent medical personnel insight into the recovery process from a remote location, and provide interaction between therapists and the patient via communication technologies. special attention is given to fostering the patient's motivation to repeat the same group of exercises daily through serious games.

2. related work and motivation

application of iot in healthcare spans a few different areas: physiological monitoring, ambient assisted living and well-being solutions. however, there is a lack of research and experiments on the use of iot in assisting and measuring the performance of physical therapy. iot in rehabilitation therapy should provide a wealth of information that can trigger actions based on defined algorithms [6]. different kinds of sensors designed for healthcare, such as a muscle sensor that measures muscle activation via electrical potential and devices specialized for skeleton detection and tracking, are used in this rehabilitation model of physical therapy to obtain feedback on the correctness of the performed exercises [7][8]. readings from sensors are also used for the creation of future exercises and adaptation of interactive physical therapy to the patient's needs [9].

an industrial example of a physiological monitoring system is bodyguardian by preventico, based on a band-like sensor patch placed on the patient's body. the sensor is powered by batteries, which enables mobility of the patient, and is connected to a smartphone. the smartphone transmits data to a cloud-based health platform, which further delivers data and alerts medical personnel.
cloud is a logical choice for such a system, as it does not burden patients with configuration of the telerehabilitation system [10]. another example of an iot device, developed for monitoring vital functions such as heart and pulse rate, oxygen saturation, blood pressure and skin temperature, is visi mobile by sotera wireless inc. visi mobile communicates with the e-health system over an 802.11 wireless link secured with wpa2/psk, which protects the wireless communication channel.

ambient-assisted living represents a technical system for supporting elderly people in their daily routine to allow an independent and safe lifestyle. sensors in those systems are wearable (for example, an accelerometer or gyroscope) or fixed (proximity), and they gather data in order to monitor patient activities or detect a fall in the patient's living environment [11].

one of the biggest iot growth areas is measuring individual health metrics and well-being through self-tracking wearable gadgets. the use of wearable sensors, together with suitable applications running on smartphones, enables people to track their daily activities (steps walked, running performance, calories burned, exercises performed, etc.) and provides suggestions for enhancing their lifestyle.

combining all three groups of iot applications in healthcare, it is possible to create a model of telerehabilitation designed for physical therapy. physiological measurements of interest in rehabilitation include heart rate, respiratory rate, blood pressure and blood oxygen saturation. parameters extracted from such measurements can provide indicators of patients' health status. in physical therapy, however, the focus would be more on measuring and stimulating muscle activity using muscle sensors. a muscle sensor connected to an arduino microcontroller presents a low-cost, low-power solution for gathering electromyography (hereinafter: emg) data of skeletal muscle.
emg is traditionally used for medical research and diagnosis of neuromuscular disorders.

repeating the same exercises in long-term therapy may lead to saturation and to patients skipping therapy. it is therefore important to constantly maintain the motivation of the patient. a serious game is a type of game designed for a special purpose in health, education, defence, engineering and other industries [12]. although serious games should be entertaining, their main purpose is to train or educate users. recent research [13] shows that the cognitive and motor activity required by video games engages the user's attention. in addition, users are focused on playing the game, which helps them forget that they are performing therapy.

microsoft kinect was recognized as a low-priced and clinically practical body sensing device to be applied in rehabilitation [14]. kinect can track a body part and can also reproduce the 3d space with the player in front of it, which enables the creation of virtual reality games. therefore, kinect is the basis of most interactive game-based rehabilitation solutions. physical therapy exercises are performed while playing games, which aim to facilitate the implementation of therapy [15][16].

there are several clinically tested solutions for physical therapy using the kinect sensor [17][18]. in the magic mirror neurorehabilitation clinical trial [19], kinect positively influenced the process of rehabilitation. in [20], five rehabilitation games using kinect were evaluated, also with a positive outcome. one example of the application of kinect in rehabilitation is the virtualrehab solution, developed by the spanish company virtualware, which consists of a web based control centre administrator software platform and several games designed for kinect (http://www.virtualrehab.info). the control centre is used by therapists to prepare a plan of exercises and to monitor and assess the progress of therapy.

3.
telerehabilitation system architecture

as a substitute for physical therapy conducted in medical institutions, telerehabilitation therapy should include the same scope of exercises, but without the physical presence of a physiotherapist. in order to lead a patient through a therapy session, the system must have a virtual assistant in the form of a web based application and a set of games tailored specifically for the patient.

the telerehabilitation system architecture based on iot is configured in two segments: a home based segment and a cloud based software as a service segment (see fig. 1). the home based segment requires components such as the kinect body tracking sensor and the muscle sensor to be installed and set up in the patient's living environment. also, the patient must have a personal computer with an internet connection in order to receive therapy sessions and to send data collected by the sensors.

fig. 1 components of telerehabilitation system (figure labels: home based environment with the patient's laptop, kinect and muscle sensor wired to the laptop; cloud saas rehab platform with virtual cloud servers, data storage collecting data from a user, and analytics; medical rehabilitation center with the physiatrist and the physiatrist's smartphone/tablet)

the software as a service cloud based segment serves several purposes:
• provides software for telerehabilitation therapy,
• collects feedback after performed therapy for each patient,
• analyses collected data and represents it in comparative and progressive form, and
• allows physiatrists to follow the patient's condition remotely and manage further therapies.

3.1. intelligent sensors and actuators

the role of sensors and actuators in telerehabilitation is twofold: to diagnose the patient's physical abilities based on measurements, and to use the read values for adapting and tailoring the rehabilitation game to the patient's mobility.
the sensor applied in the pilot implementation of this model is muscle sensor v3, an electromyography sensor for microcontroller applications with three electrodes, connected to an arduino microcontroller. using the muscle sensor, it is possible to measure muscle activation via the electrical potential (emg) by placing electrodes in three positions: in the middle of the muscle, at the end of the muscle and on a bony part near the muscle. before placing the electrodes, the skin must be cleaned with alcohol. this step is mandatory in order to provide a better grip of the electrodes and reduce the electrical resistance of the skin. proper placement of emg electrodes is crucial for accurate measurement of muscle contraction. unfortunately, if a muscle is covered by more body fat, the emg signal will be weaker and more difficult to record.

the motivation for using the muscle sensor in pilot telerehabilitation of physical therapy is the need to strengthen muscles and also to measure the progress of muscle strengthening, depending on the type of exercises. for example, if the patient is required to alternately contract and relax the muscle, they will experience it as an effort, compared to a situation in which they perform the same actions unconsciously while playing a game. in the second case, the patient will probably perform more repetitions of muscle contractions and relaxations because they are unaware of those actions. based on the above, the use of the muscle sensor in rehabilitation games should lead to improvements in the patient's muscle structure.

fig. 2 example of reading data from the muscle sensor with electrodes connected to the microcontroller arduino uno. relaxed biceps places blue slider to 0 (left). contracted biceps places blue slider to a specific value measured by the sensor (right)

figure 2 shows the connection of the muscle sensor with the arduino uno microcontroller.
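as a rough illustration of how such readings could be turned into a count of contraction repetitions, the sketch below converts raw arduino uno adc values (10-bit, assuming the default 5 v analog reference) to volts and counts contractions using two thresholds, so that small fluctuations around a single level are not counted twice. the threshold values and the function names are assumptions made for illustration; they are not part of the pilot implementation and would need per-patient calibration.

```python
from typing import Iterable

ADC_MAX = 1023   # Arduino Uno analog inputs are 10-bit (0..1023)
V_REF = 5.0      # assumed default analog reference, in volts


def adc_to_volts(raw: int) -> float:
    """Convert a raw 10-bit ADC reading to volts."""
    return raw * V_REF / ADC_MAX


def count_contractions(raw_samples: Iterable[int],
                       on_v: float = 1.5,
                       off_v: float = 0.8) -> int:
    """Count contractions with hysteresis.

    A contraction starts when the signal rises to on_v or above and
    ends when it falls to off_v or below. Both thresholds are
    hypothetical values chosen for this example.
    """
    contracted = False
    count = 0
    for raw in raw_samples:
        volts = adc_to_volts(raw)
        if not contracted and volts >= on_v:
            contracted = True
            count += 1
        elif contracted and volts <= off_v:
            contracted = False
    return count


if __name__ == "__main__":
    samples = [0, 50, 400, 500, 300, 100, 20, 450, 480, 90]
    print(count_contractions(samples))
```

using separate on and off thresholds is a design choice that trades a little sensitivity for robustness against the noisy emg signal discussed later in the paper.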
the sensor requires a 9 v supply, and since arduino uno provides an operating voltage of only 5 v, the muscle sensor must be powered by two 9 v batteries. in this setup, arduino uno is connected to a computer using a serial connection, but it is preferable to switch to a wireless connection using the arduino wifi shield. arduino would then use the 802.11 b/g/n protocol for communication with the application on the laptop computer and, as a result, the patient would not be limited in space.

in the setup shown in fig. 2, the muscle sensor is placed on the biceps, and values representing muscle contractions are displayed on the monitor. when the biceps muscle is relaxed, the blue slider (third rectangle on the left side of the figure) shows a voltage of 0 v. when the biceps contracts, the blue slider (third rectangle on the right side of the figure) shows a voltage higher than 0 v. the voltage values depend on the physical condition and muscle function.

3.2 body tracking sensor

the system was implemented using the kinect body tracking sensor, which consists of an rgb camera, an infrared (ir) camera and an ir projector. the rgb camera is a standard colour camera. the ir projector emits infrared rays into space, which bounce off objects and return to the ir camera, allowing the distance between kinect and the objects to be measured [21]. this feature is useful in the creation of therapy when the patient has to position the hand in front of or next to his/her body. together, the three components (rgb camera, ir camera and ir projector) allow the creation of 3d images. with the depth stream, it is possible to estimate human motion in real time. however, the acquired depth data can be quite noisy and the image can contain pixels with no depth because of multiple reflections. to cope with that, it is mandatory to perform denoising. for further information about denoising, refer to paper [22], which presents a new data-driven denoising technique.

kinect can distinguish parts of the body and can determine the position and orientation of the body. validity and reproducibility are important characteristics of this device [23], which make it applicable in telerehabilitation applications. in addition to objects, kinect can detect sound. this feature is not sufficiently exploited, although kinect can determine the source of sound very accurately.

fig. 3 home based equipment of telerehabilitation system, schematic view (figure labels: muscle sensor connected with cable to analog input of arduino microcontroller; kinect with cable adapter; patient's laptop; standalone pc; rest ws; wireless 802.11 wlan links)

fig. 3 represents a schematic view of the equipment required for the home based segment of telerehabilitation and shows the type of communication protocols and device connections.

3.3 software as a service

nowadays, cloud-based data storage, computation, software, platform and computing infrastructure are widely used for many different applications. using content and services from the cloud eliminates the time and costs of buying hardware and installing and maintaining software. with cloud infrastructure, health monitoring systems become low-cost, platform-independent and rapidly deployable. applications deployed via the cloud can be easily updated without forcing a patient to install any software on their devices, thus making system maintenance quick and cost effective.

in fig. 4 we propose the concept of a telerehabilitation platform based on the software as a service cloud model containing four services (a telerehabilitation application for the patient, a setup and analytics application for the therapist, a game session manager and a processor for sensor information), together with a database for persisting rehabilitation information and web and application servers for running the above described services.
we envisage integration of the telerehabilitation system with the medical information system and the use of information from the patient's electronic health record (ehr) for precise diagnosis.

fig. 4 proposed concept of cloud based software as a service rehabilitation platform (figure labels: patient; therapist; medical is with ehr and diagnosis; adapted rehab game; setup & preview; saas rehab system with rehabilitation software, setup and analytic software for therapist, sensor information processor, game session manager and rehab db)

in this system, kinect is used in rehabilitation therapy to determine the patient's mobility and limitations before the beginning of therapy. by testing the patient's limitations in movement (for example, the height to which he/she can raise the affected hand, or move it to the right or left, or bend it) and the actual speed of movement, a set of parameters is obtained, upon which the system may recommend a list of games. during therapy, body position is very important, and it is detected and recorded by kinect, because a patient who finds it too hard to lift the left arm tends to tilt to the right.

after the patient finishes the session, raw data gathered from the muscle sensor and kinect are sent to the cloud saas application server. the received data are filtered and transformed into meaningful information, linked to the patient and stored in cloud data storage. a new session is then offered to the user, until they decide to finish the therapy.

data stored for each performed session can be used for various purposes and benefits. analytic software for therapists may present diagrams of the patient's performance and progress. the large amount of gathered data gives an opportunity for medical data mining and opens the door to a vast source of medical data analyses.
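the filtering and transformation step is not specified further in the text, so the following is only a minimal sketch of what such server-side processing might look like: a trailing moving average smooths the raw kinect and muscle sensor samples, and a small per-session summary is built for storage. the record fields, the window size and the function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Smooth a series with a trailing moving average.

    Early samples are averaged over however many values are available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def summarize_session(patient_id: str,
                      hand_heights_m: Sequence[float],
                      emg_volts: Sequence[float],
                      window: int = 3) -> Dict[str, object]:
    """Build a hypothetical per-session record from raw samples."""
    heights = moving_average(hand_heights_m, window)
    emg = moving_average(emg_volts, window)
    return {
        "patient_id": patient_id,            # links the record to the patient
        "max_hand_height_m": max(heights),   # reach after smoothing
        "mean_emg_v": mean(emg),             # average muscle activation
        "samples": len(heights),
    }


if __name__ == "__main__":
    record = summarize_session("patient-001",
                               [0.10, 0.35, 0.60, 0.58],
                               [0.0, 1.2, 2.1, 1.9])
    print(record)
```

records of this kind, accumulated over sessions, are the sort of input that the progress diagrams and data mining mentioned above would draw on.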
finding patterns in the impact of a certain exercise on the re-establishment of lost physical function, together with classification and prediction, will create a knowledge base able to recommend a set of sessions to any new or existing patient.

in order to improve the physical and psychological condition of the patient, training sessions should be designed for a specific type of disability. based on a stroke statistics report [24] conducted in the united kingdom, 77% of post-stroke patients have upper limb disabilities and 72% of post-stroke patients have lower limb disabilities. in line with these findings, the first trial game (fig. 5) is designed for practicing motor skills and coordination of the stroke affected hand, especially the elbow and shoulder.

fig. 5 serious game for hand and elbow rehabilitation consisting of virtual box with green barrier and twenty balls

the trial game is designed to contain only essential elements, without details that could distract the patient. the elements of the game are a virtual box filled with virtual balls located at one side of the box and an adjustable barrier placed in the middle of the box, separating the box into two parts. the patient's movements are tracked using the kinect body sensor, and their task is to take a virtual ball by placing their palm at the ball position and to drag the ball to the opposite side of the box, over the virtual barrier. the barrier height is adjustable in order to match the patient's capabilities.

if a patient is able to contract any muscle of the stroke affected hand, use of the muscle sensor in the game should be encouraged, because it is expected to increase muscle strength and endurance. in the trial game with a virtual box, emg muscle signals can be used for grabbing the ball when the palm covers it and for releasing the ball after passing the barrier.

4.
patient trials and results

in order to test the telerehabilitation system in the patient's environment and the patient's reaction to the new type of therapy, we set up home based equipment containing kinect, arduino uno and the muscle sensor in the patient's home. the patient was requested to play an interactive telerehabilitation game with a virtual box, and the results of playing were recorded on the patient's computer and uploaded to the remote computer. the goal of this trial was to test one part of the proposed telerehabilitation system, the interactive telerehabilitation game.

the interactive serious game was tested on a single patient during a one month pilot trial. the patient is a 60 year old male who sustained a right hemisphere stroke a year before the trial and as a consequence has hemiparesis of the left side of the body. the mobility of his left hand is very low and the goal is to increase it. one week prior to the start of the rehabilitation trial, the patient was introduced to the basic rules of the virtual telerehabilitation game with a virtual box. the patient played the game using the stroke affected hand over a three week period, five to six days per week, one hour per day. unfortunately, the patient was unable to close the fist and therefore to contract the biceps, so the readings from the muscle sensor are omitted. instead, the virtual ball is considered captured when the patient holds the hand over the ball position for several seconds, and the ball is considered released when the patient's hand passes the barrier and a half of the box after the barrier.

the results obtained after the three week telerehabilitation period was completed are shown in fig. 6. the figure shows that transferring balls from the left to the right side takes longer, which confirms that the patient is slow to focus on the left, stroke affected side.
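the capture and release rule described above can be read as a small state machine driven by the tracked hand position. the sketch below is one possible reading, not the authors' implementation: it assumes a horizontal coordinate normalized to the box width (0 at the starting wall, 0.5 at the barrier, 1 at the far wall), movement in the direction of increasing x, and a dwell time of 2 seconds standing in for "several seconds". the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BallGrabber:
    dwell_s: float = 2.0      # assumed value for "several seconds"
    release_x: float = 0.75   # barrier at 0.5 plus half of the far side
    holding: bool = False
    over_since: Optional[float] = None

    def update(self, t: float, hand_x: float, over_ball: bool) -> str:
        """Advance the state with one tracked sample taken at time t (seconds)."""
        if not self.holding:
            if not over_ball:
                self.over_since = None   # hand left the ball, restart dwell timer
                return "idle"
            if self.over_since is None:
                self.over_since = t
            if t - self.over_since >= self.dwell_s:
                self.holding = True
                self.over_since = None
                return "captured"
            return "hovering"
        if hand_x >= self.release_x:
            self.holding = False
            return "released"
        return "carrying"


if __name__ == "__main__":
    grabber = BallGrabber()
    frames = [(0.0, 0.1, True), (1.0, 0.1, True), (2.0, 0.1, True),
              (2.5, 0.6, False), (3.0, 0.8, False)]
    for t, x, over in frames:
        print(t, grabber.update(t, x, over))
```

resetting the dwell timer whenever the hand leaves the ball means brief accidental passes over a ball do not capture it, at the cost of requiring a steady hold from the patient.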
comparing the measurements in the first five days and the last five days of the trial, the duration of the session was reduced by 27% when moving twenty balls from right to left, and by 15% when moving them from left to right.

fig. 6 progress in playing the game between the first and the last day of telerehabilitation home based trial

5. discussion

the results of the pilot trial of a three week telerehabilitation session using a serious game showed noticeable improvements in rehabilitation. after the trial, the patient showed increased concentration, faster reflexes, higher mobility of the affected hand and better focus on the left side when reading.

the trial suggests that, compared with the in-clinic rehabilitation process, the telerehabilitation process offers several benefits. in-clinic rehabilitation is by its nature repetitive and command based, which may reduce patient motivation. serious game telerehabilitation tends to demand purpose-based movements (pick the object, move the object, clean the surface, etc.) and tries to motivate the patient to achieve a better score in every game iteration. traditional rehabilitation requires one therapist per patient and both have to be in the same place. in a telerehabilitation therapy session, one therapist can lead several patients, a session can be designed in advance for each patient, and the therapist and patient can be miles apart. thus, the telerehabilitation model reduces travelling costs and reduces the time a therapist spends preparing a single patient's therapy, compared to the time spent when the therapist works with one patient, showing exercises and waiting for the patient to complete them. the distribution of therapists over the territory is usually uneven, with higher concentration in urban regions and city centres, and there is a lack of skilled therapists in rural and remote locations.
it is exactly here that the telerehabilitation model can contribute most, by providing an opportunity for each patient to be treated equally well.

however, the telerehabilitation model presented here is highly dependent on internet accessibility and on the availability of the kinect body sensor, muscle sensor and personal computers. also, patients or their caregivers should have basic computer knowledge in order to set up the telerehabilitation equipment.

validity and reliability of the kinect body sensor have already been tested and confirmed [23][25]. kinect detects the body skeleton automatically, but it requires at least two square meters of clear space. compared to kinect, the muscle sensor is not that simple to calibrate and to set properly. the emg signal is usually very weak, so measurements often have to be repeated. to improve the emg signal stability, the muscle sensor should be placed on a large muscle. a second potential problem regarding the muscle sensor is noise. interaction between the electrolytes in the skin and the metal of the detection surfaces of the electrode can produce noise. noise can be reduced by employing conductive electrolytes to improve the contact with the skin and also by removing dead skin from the surface of the skin.

6. conclusion

the telerehabilitation model described in this paper allows for a faster recovery of patients who have survived a stroke. the kinect device is used as a sensor for detection and tracking of body movements. the muscle sensor records the muscle strength. the cloud architecture model enables building a stable, scalable, reliable, cost effective and easy to use telerehabilitation system. this telerehabilitation system eliminates the need for the mandatory presence of the therapist and enables a patient to perform post-stroke rehabilitation therapy at home, reducing the cost of treatment. the therapist has access to the patient's virtual records and may check his/her activities and remotely guide the therapy at any moment.
this model, developed for research and experiment purposes, serves as a foundation for creating a product which will be widely used in post stroke telerehabilitation and evaluation of the degree of recovery.

acknowledgement: the authors would like to thank the ministry of education, science and technological development, republic of serbia, for financial support, project number 174031.

references
[1] y. jog, a. sharma, k. mhatre and a. abhishek, "internet of things as a solution enabler in health sector", international journal of bio-science & bio-technology, vol. 7, no. 2, pp. 9-24, 2015.
[2] j. langan, k. delave, l. phillips, p. pangilinan and s.h. brown, "home-based telerehabilitation shows improved upper limb function in adults with chronic stroke: a pilot study", journal of rehabilitation medicine, vol. 45, no. 2, pp. 217-220, 2013.
[3] l.r. tindall and r.a. huebner, "the impact of an application of telerehabilitation technology on caregiver burden", international journal of telerehabilitation, vol. 1, no. 1, pp. 3-8, 2009.
[4] l. piron, a. turolla, p. tonin, f. piccione, l. lain and m. dam, "satisfaction with care in post-stroke patients undergoing a telerehabilitation programme at home", journal of telemedicine and telecare, vol. 14, no. 5, pp. 257-260, 2008.
[5] the world health report 2002: "reducing risk, promoting healthy life", http://www.who.int/whr/2002/en/.
[6] m.c. domingo, "an overview of the internet of things for people with disabilities", journal of network and computer applications, vol. 35, no. 2, pp. 584-596, 2012.
[7] p. pharow, b. blobel, p. ruotsalainen, f. petersen and a. hovsto, "portable devices, sensors and networks: wireless personalized ehealth services", medical informatics in a united and healthy europe, pp. 1012-1016, 2009.
[8] b. lange, c.y. chang, e. suma, b. newman, a.s. rizzo and m.
bolas, "development and evaluation of low cost game-based balance rehabilitation tool using the microsoft kinect sensor", in proceedings of the annual international conference of the ieee engineering in medicine and biology society (embc), 2011, pp. 1831-1834.
[9] l. geurts, v. vanden abeele, j. husson, f. windey, m. van overveldt, j.h. annema and s. desmet, "digital games for physical therapy: fulfilling the need for calibration and adaptation", in proceedings of the 5th international conference on tangible, embedded, and embodied interaction (tei '11), acm new york, 2011, pp. 117-124.
[10] m. hoda, h. dong, d. ahmed and a. e. saddik, "cloud-based rehabilitation exergames system", in proceedings of the ieee international conference on multimedia and expo workshops (icmew), 2014, pp. 1-6.
[11] a. dohr, r. modre-opsrian, m. drobics, d. hayn and g. schreier, "the internet of things for ambient assisted living", in proceedings of the 7th international conference on information technology: new generations (itng), las vegas, 2010, pp. 804-809.
[12] t. susi, m. johannesson and p. backlund, "serious games: an overview", technical report, university of skövde, skövde, sweden, 2007.
[13] b. m. alcover, a. jaume-i-capó, j. varona, p. martinez-bueso and a. m. chiong, "use of serious games for motivational balance rehabilitation of cerebral palsy patients", in proceedings of the 13th international acm sigaccess conference on computers and accessibility, new york, 2011, pp. 297-298.
[14] s. c. yeh, w. y. hwang, t. c. huang, w. k. liu, y. t. chen and y. p. hung, "a study for the application of body sensing in assisted rehabilitation training", in proceedings of the international symposium on computer, consumer and control (is3c), 2012, pp. 922-925.
[15] m. f. levin, p. l. weiss and e. a. keshner, "emergence of virtual reality as a tool for upper limb rehabilitation", physical therapy, vol. 95, no. 3, pp. 415-425, march 2015.
[16] s.
vukićević, "telerehabilitation model of physical therapy using kinect and embedded systems", in proceedings of the 5th international conference on information society and technology, kopaonik, 2015, pp. 214-218.
[17] h. m. hondori and m. khademi, "a review on technical and clinical impact of microsoft kinect on physical therapy and rehabilitation", journal of medical engineering, vol. 2014, pp. 1-16, 2014.
[18] d. webster and o. celik, "systematic review of kinect applications in elderly care and stroke rehabilitation", journal of neuroengineering and rehabilitation, vol. 11, no. 1, 108, pp. 1-24, 2014.
[19] o. erazo, j. pino, r. pino and c. fernandez, "magic mirror for neurorehabilitation of people with upper limb dysfunction using kinect", in proceedings of the 47th hawaii international conference on system sciences (hicss), 2014, pp. 2607-2615.
[20] c. m. tseng, c. l. lai, d. erdenetsogt and y. f. chen, "a microsoft kinect based virtual rehabilitation system", in proceedings of the international symposium on computer, consumer and control (is3c), 2014, pp. 934-937.
[21] z. zhang, "microsoft kinect sensor and its effect", ieee multimedia, vol. 19, no. 2, pp. 4-10, 2012.
[22] y. feng, m. ji, j. xiao, x. yang, j. j. zhang, y. zhuang and x. li, "mining spatial-temporal patterns and structural sparsity for human motion data denoising", ieee trans. cybernetics, vol. 99, pp. 1-14, 2014.
[23] b. bonnechere, b. jansen, p. salvia, h. bouzahouene, l. omelina, f. moiseev and s. jan, "validity and reliability of the kinect within functional assessment activities: comparison with standard stereophotogrammetry", gait and posture, vol. 39, no. 1, pp. 593-598, 2014.
[24] state of the nation january 2015: "stroke statistics", https://www.stroke.org.uk/sites/default/files/stroke_statistics_2015.pdf
[25] r.a. clark, y.h. pua, k. fortin, c. ritchie, k.e. webster, l. denehy and a.l.
bryant, "validity of the microsoft kinect for assessment of postural control", gait & posture, vol. 36, no. 3, pp. 372-377, 2012.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 495-512 https://doi.org/10.2298/fuee2204495p © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

fuzzy-based real-coded genetic algorithm for optimizing non-convex environmental economic loss dispatch

shradha singh parihar1, nitin malik2
1gautam buddha university, greater noida, india
2the northcap university, gurugram, india

abstract. non-convex environmental economic loss dispatch (nceeld) is a constrained multi-objective optimization problem that is solved to assign generation to all the generators of the power network subject to equality and inequality constraints. the objectives considered for simultaneous optimization are emission, economic load and network loss dispatch. the valve-point loading, prohibiting operating zones and ramp rate limit issues have also been taken into consideration in the generator fuel cost. the tri-objective problem is transformed into a single objective function via the price penalty factor. the nceeld problem is simultaneously optimized using a fuzzy-based real-coded genetic algorithm (ga). the proposed technique determines the best solution from a pareto optimal solution set based on the highest rank. the efficacy of the proposed method has been demonstrated on the ieee 30-bus network with three and six generating units. the attained results are compared to existing results and found superior in terms of finding the best-compromise solution over other existing methods such as ga, particle swarm optimization, flower pollination algorithm, biogeography-based optimization and differential evolution. the statistical analysis has also been carried out for the convex multi-objective problem.
key words: multi-objective optimization, non-convex environmental economic loss dispatch, price penalty factor, pareto optimality, real-coded genetic algorithm, valve-point loading, prohibiting operating zones, ramp rate limit

received march 2, 2022; revised june 22, 2022; accepted july 6, 2022
corresponding author: nitin malik, the northcap university, sector 23a, gurugram, india, e-mail: nitinmalik77@gmail.com

list of abbreviations:
ceed: combined emission and economic dispatch
ed: emission dispatch
eld: economic load dispatch
fpa: flower pollination algorithm
frcga: fuzzy-based real-coded genetic algorithm
ga: genetic algorithm
n/w: network
nceeld: non-convex environmental economic loss dispatch
nsga: non-dominated sorting genetic algorithm
pozs: prohibiting operating zones
ppf: price penalty factor
pso: particle swarm optimization
rcga: real-coded genetic algorithm
rrl: ramp rate limit
vpl: valve point loading

1. introduction

1.1. motivation

electrical power networks were traditionally operated to minimize the total generation fuel cost, with little concern for the harmful emissions generated in the network [1-3]. after the us clean air act of 1990 (amended in 2010) and similar legislation in several other countries, public concern about pollutants such as cox, so2 and nox produced by thermal power plants has grown. this, in turn, forces the utilities to deliver power to the consumers while simultaneously minimizing the total generator fuel cost and the total emission level [4-22]. a high degree of non-linearity and complexity is present in the modern generator's cost curve function because of the valve point loading (vpl) effect and other effects. the resulting approximate solutions lead to a considerable revenue loss over time, which is also affected by the network losses.
to overcome this, the optimal generated power of the thermal units is to be determined by minimizing emission, loss and cost simultaneously while satisfying all practical constraints, which yields a large-scale, highly constrained, non-linear multi-objective optimization problem.

1.2. literature survey

the economic load dispatch (eld) [1-3] is a real-world problem that originally considered only the minimization of the generator fuel cost. emission dispatch (ed) was considered for the very first time in [4]. hence, both generator fuel cost and harmful environmental emissions should be treated as competing objectives. the combined emission and economic dispatch (ceed) minimizes harmful emissions and generating unit cost simultaneously to obtain the optimal generation for each network (n/w) unit while satisfying various practical constraints. in [5-9], the authors presented weighted-sum or price penalty factor (ppf) based methods in which all the considered objectives are combined into a single function. conventional genetic algorithm (ga) and differential evolution have been presented in [10] and [11], respectively, to demonstrate the effect of vpl on the generator cost function, but ga requires large cpu time for the optimization. a fast initialization approach has been presented in [12] to solve the non-convex economic dispatch problem, but it is usually stuck in local minima. a new whale optimization approach has been presented in [13] and has high computational efficiency. a flower pollination algorithm (fpa) is demonstrated in [14] for solving the eld and ceed problems in larger n/w. many evolutionary algorithms such as non-dominated sorting genetic algorithm (nsga) [15], squirrel search algorithm [16], evolutionary programming [17] and nsgaii [18] have been proposed for solving the bi-objective problem.
the evolutionary programming has a slow convergence rate for large problems. a mine-blast algorithm has been developed in [19] to incorporate the valve point loading effect for solving the environmental economic load dispatch problem. a new global particle swarm optimization (pso) is developed in [20] to solve the bi-objective problem without and with transmission losses. a fuzzified pso technique [21], harmony search [22] and cuckoo search [23] have been applied to optimize the solution for the ceed problem. the pso approach suffers from the problem of partial optimism.

1.3. paper contributions

a) as most of the research has been carried out considering only two objectives (fuel cost and emissions), the authors have incorporated an additional objective (network loss) to make the problem formulation more comprehensive, and have merged two soft-computing techniques (rcga and fuzzy) to find the best-compromised solution out of the obtained pareto solutions. moreover, the exhaustive literature review shows that the non-convex multi-objective optimization problem formulation with simultaneous minimization of three objective functions (emission, fuel cost and network loss) at different load demands has not been explored before.
b) the different non-linearities, namely valve-point loading, prohibiting operating zones (pozs) and ramp rate limit (rrl), are considered in this article for three conflicting objectives.
c) as all the considered objectives are competitive, the method generates multiple non-dominated pareto optimal solutions rather than a single best solution, from which the best-compromised solution is selected based on the highest fuzzy membership function value.
d) to validate the proposed methodology, three test cases have been considered at different load demands and the results are compared with already published methods based on ga [25], pso [25, 26], fpa [27], biogeography-based optimization [28] and differential evolution [29].

2.
mathematical modeling

the practical non-convex eeld problem has three conflicting objectives, which aim to minimize the generating cost, the amount of harmful emissions and the losses of the complex and nonlinear network. the objectives and operating constraints used to formulate the non-convex eeld problem are given below.

2.1. non-convex economic load dispatch

for fossil fuel-based generators, it is more practical to model the steam valve-point loading effect of the turbine by adding a rectified sinusoidal term to the quadratic cost equation, which leads to a non-smooth and non-convex function having multiple minima [10]. the total generator fuel cost based on active power output can be represented as [14]

\[ \text{minimize } f_1 = F_T = \sum_{i=1}^{N} \left[ \left( a_i P_i^2 + b_i P_i + c_i \right) + \left| e_i \sin\left( f_i \left( P_i^{min} - P_i \right) \right) \right| \right] \tag{1} \]

where \(P_i\) represents the output power generation of the \(i\)th unit and \(a_i\), \(b_i\), \(c_i\), \(e_i\) and \(f_i\) are the generator fuel cost coefficients.

2.2. emission dispatch (ed)

the goal of ed is to minimize the total environmental degradation due to fossil fuel burning to produce power. the total pollution level of the environment that needs to be minimized is given as [14]:

\[ \text{minimize } f_2 = E_T = \sum_{i=1}^{N} \left[ 10^{-2} \left( \alpha_i + \beta_i P_i + \gamma_i P_i^2 \right) + \xi_i \exp\left( \lambda_i P_i \right) \right] \tag{2} \]

where \(\alpha_i\), \(\beta_i\), \(\gamma_i\), \(\xi_i\) and \(\lambda_i\) represent the pollution coefficients of the \(i\)th generating unit.

2.3. loss dispatch

the loss dispatch aims to minimize power loss without considering the generator cost and harmful emission of the network. the loss to be minimized is [14]

\[ \text{minimize } f_3 = P_L = \sum_{i=1}^{N} \sum_{j=1}^{N} P_i B_{ij} P_j + \sum_{i=1}^{N} B_{io} P_i + B_{oo} \tag{3} \]

where \(B_{ij}\), \(B_{io}\) and \(B_{oo}\) represent the line loss coefficients.

2.4. non-convex environmental economic loss dispatch (nceeld)

the nceeld problem is formulated with economy, harmful emissions and losses of the network as competing objectives. the proposed complex problem can be written as

\[ \text{minimize } C = f_1 + p_{fe} \, f_2 + p_{fl} \, f_3 \tag{4} \]

where \(p_{fe}\) and \(p_{fl}\) are the ppf for emission and loss, respectively.
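the three objectives in (1)-(3) and the combined objective in (4) can be sketched in a few lines of python. this is only an illustrative sketch: the function names are hypothetical, the coefficient lists are assumed to be indexed by generating unit, and the valve-point and exponential terms are assumed to be summed over the units together with the quadratic terms.

```python
import math


def fuel_cost(P, a, b, c, e, f, P_min):
    """Eq. (1): quadratic cost plus rectified-sine valve-point term, summed over units."""
    return sum(
        a[i] * P[i] ** 2 + b[i] * P[i] + c[i]
        + abs(e[i] * math.sin(f[i] * (P_min[i] - P[i])))
        for i in range(len(P))
    )


def emission(P, alpha, beta, gamma, xi, lam):
    """Eq. (2): scaled quadratic emission plus exponential term, summed over units."""
    return sum(
        1e-2 * (alpha[i] + beta[i] * P[i] + gamma[i] * P[i] ** 2)
        + xi[i] * math.exp(lam[i] * P[i])
        for i in range(len(P))
    )


def network_loss(P, B, B0, B00):
    """Eq. (3): loss from the B-coefficients (quadratic, linear and constant parts)."""
    n = len(P)
    quadratic = sum(P[i] * B[i][j] * P[j] for i in range(n) for j in range(n))
    linear = sum(B0[i] * P[i] for i in range(n))
    return quadratic + linear + B00


def combined_objective(f1, f2, f3, pfe, pfl):
    """Eq. (4): single objective obtained with the two price penalty factors."""
    return f1 + pfe * f2 + pfl * f3
```

each function takes one candidate dispatch `P` (a list of unit outputs), so the same helpers can be reused when the objectives are evaluated one at a time or together.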
\(f_1\) represents the total generator fuel cost, \(f_2\) the total emission and \(f_3\) the total n/w loss. the ratio of the maximum value of \(f_1\) to the maximum value of \(f_2\) gives the ppf for emission, whereas the ratio of the maximum value of \(f_1\) to the maximum value of \(f_3\) of the corresponding generator gives the ppf for loss. the procedure for finding the ppf for emission and loss is as follows:

(a) the generator fuel cost ($/hr) is calculated at its maximum output using (1) for the convex and non-convex problems.
(b) the emission release from every generator (lb/hr or kg/hr) is calculated at its maximum output using (2).
(c) the losses of each generator are calculated at its maximum output using (3).
(d) \(p_{fe}[i]\) and \(p_{fl}[i]\) (\(i = 1, 2, \ldots, N\)) for each generator are determined as in (5) and (6).

\[ p_{fe}[i] = \frac{\sum_{i=1}^{N} \left[ \left( a_i (P_i^{max})^2 + b_i P_i^{max} + c_i \right) + \left| e_i \sin\left( f_i \left( P_i^{min} - P_i^{max} \right) \right) \right| \right]}{\sum_{i=1}^{N} \left[ 10^{-2} \left( \alpha_i + \beta_i P_i^{max} + \gamma_i (P_i^{max})^2 \right) + \xi_i \exp\left( \lambda_i P_i^{max} \right) \right]} \quad (\$/lb) \tag{5} \]

\[ p_{fl}[i] = \frac{\sum_{i=1}^{N} \left[ \left( a_i (P_i^{max})^2 + b_i P_i^{max} + c_i \right) + \left| e_i \sin\left( f_i \left( P_i^{min} - P_i^{max} \right) \right) \right| \right]}{\sum_{i=1}^{N} \sum_{j=1}^{N} P_i^{max} B_{ij} P_j^{max} + \sum_{i=1}^{N} B_{io} P_i^{max} + B_{oo}} \quad (\$/pu) \tag{6} \]

where \(P_i^{max}\) is the maximum capacity of the unit.

(e) \(p_{fe}[i]\) and \(p_{fl}[i]\) (\(i = 1, 2, \ldots, N\)) are sorted in ascending order.
(f) \(P_i^{max}\) is added starting from the generator unit with the smallest \(p_{fe}[i]\) for harmful emissions and the generator unit with the smallest \(p_{fl}[i]\) for the loss until \(\sum P_i^{max} \geq P_D\).
(g) the \(p_{fe}[i]\) and \(p_{fl}[i]\) linked with the last generator unit are the ppf for emission and loss, respectively, for a given load \(P_D\).
(h) the \(p_{fe}[i]\) and \(p_{fl}[i]\) for the particular load are determined.

eq. (4) is optimized subject to constraints in the case of the tri-objective minimization problem. for the convex eed problem, the \(p_{fe}\) selected is 43.55981 $/kg and 44.07915 $/kg [27] for the three-generator network at 400 mw and 500 mw, respectively.
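steps (d) to (g) of the procedure above can be sketched as follows. the sketch assumes that the cost and the emission (or loss) of each unit at its maximum output have already been computed, and that the ratio is formed per unit; the function name `select_ppf` and its arguments are hypothetical.

```python
def select_ppf(cost_at_max, objective_at_max, p_max, demand):
    """Pick the price penalty factor for a given load demand.

    cost_at_max[i]      : fuel cost of unit i at its maximum output
    objective_at_max[i] : emission (or loss) of unit i at its maximum output
    p_max[i]            : maximum capacity of unit i
    """
    # step (d): one ratio per generating unit
    ratios = [c / o for c, o in zip(cost_at_max, objective_at_max)]
    # step (e): visit the units in ascending order of their ratio
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    covered = 0.0
    for i in order:
        # step (f): accumulate maximum capacities until the demand is covered
        covered += p_max[i]
        if covered >= demand:
            # step (g): the ratio of the last unit added is the ppf
            return ratios[i]
    raise ValueError("total capacity is below the load demand")
```

calling the function once with emissions and once with losses as `objective_at_max` gives the two factors used in (4).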
for the non-convex problem considering the standard ieee 30-bus network, the \(p_{fe}\) and \(p_{fl}\) calculated for a load \(P_D\) of 2.834 p.u are 5932.9377 $/lb and 10445.0680 $/p.u, and for a load \(P_D\) = 4.32 p.u are 10949.4251 $/lb and 19612.6323 $/p.u, respectively, using the method given in reference [8]. the optimization process is subject to the following constraints:

a) the active power output of a generating unit is constrained by its bounds for a stable operation and is given as:

\[ P_i^{min} \leq P_i \leq P_i^{max}, \quad i = 1, 2, \ldots, N \tag{7} \]

b) the total generated power balances the sum of the active power loss (\(P_L\)) and the total load demand (\(P_D\)). therefore,

\[ \sum_{i=1}^{N} P_i - (P_D + P_L) = 0 \tag{8} \]

where \(P_L\) is computed from the b-coefficients. the error in loss coefficients is considered to be constant as in ref [14].

c) generator ramp rate limits: the inclusion of ramp rate limits changes the operating limits of the generator as [24]

\[ \max\left( P_i^{min}, P_i^{o} - DR_i \right) \leq P_i \leq \min\left( P_i^{max}, P_i^{o} + UR_i \right) \tag{9} \]

where \(P_i^{o}\) is the previous operating point of the \(i\)th generator and \(DR_i\) and \(UR_i\) are the down and up ramp rate limits, respectively.

d) prohibited operating zones: if any power plant works in these zones, some faults might occur for the machines or accessories such as pumps or boilers. therefore, to prevent these faults, the power generation limits must be changed so that they satisfy the poz constraint. this feature can be included in the non-convex multi-objective problem formulation as [24]

\[ P_i \in \begin{cases} P_i^{min} \leq P_i \leq P_{i,1}^{l} \\ P_{i,k-1}^{u} \leq P_i \leq P_{i,k}^{l}, & k = 2, \ldots, z_i \\ P_{i,z_i}^{u} \leq P_i \leq P_i^{max} \end{cases} \tag{10} \]

here \(z_i\) is the number of prohibited zones in the curve of the \(i\)th generator, \(k\) is the index of a prohibited zone of the \(i\)th generator, \(P_{i,k}^{l}\) is the lower limit of the \(k\)th prohibited zone, and \(P_{i,k-1}^{u}\) is the upper limit of the \((k-1)\)th prohibited zone of the \(i\)th generator.

3. solution methodology

the paper implements frcga on three- and six-generator networks to identify the best-compromised solution amongst the available set of pareto optimal solutions.
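as an illustration of how the operating constraints (7), (9) and (10) of section 2 might be enforced on a single unit, the sketch below tightens the generation limits with the ramp rates and checks a candidate output against the prohibited zones. the repair rule (moving an infeasible output to the nearest feasible zone edge) is an assumption made for the example and is not taken from the paper; all function names are hypothetical and the zones are assumed not to overlap.

```python
def ramp_limited_bounds(p_min, p_max, p_prev, down_ramp, up_ramp):
    """Eq. (9): effective lower and upper limits once ramp rates are included."""
    return max(p_min, p_prev - down_ramp), min(p_max, p_prev + up_ramp)


def in_prohibited_zone(p, zones):
    """True if p lies strictly inside any (lower, upper) prohibited zone of eq. (10)."""
    return any(lo < p < hi for lo, hi in zones)


def repair_output(p, lower, upper, zones):
    """Clamp p to [lower, upper], then move it out of a prohibited zone.

    Assumed repair rule: jump to the nearest zone edge that still respects
    the bounds; raise if no such edge exists.
    """
    p = min(max(p, lower), upper)
    for lo, hi in zones:
        if lo < p < hi:
            edges = [edge for edge in (lo, hi) if lower <= edge <= upper]
            if not edges:
                raise ValueError("no feasible output outside the prohibited zone")
            p = min(edges, key=lambda edge: abs(edge - p))
    return p
```

for example, with the unit 1 limits of table 15 (0.05 to 0.5 p.u, ramp rate 0.08 p.u/h) and a hypothetical previous operating point of 0.2 p.u, `ramp_limited_bounds` narrows the range to roughly 0.12 to 0.28 p.u, and an output of 0.13 p.u inside the zone [0.10, 0.15] listed for unit 1 in table 17 would be moved to 0.15 p.u.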
the techniques used in the algorithm are as follows:

3.1. pareto optimality

pareto optimality is a measure of efficiency in multi-objective and multi-criteria solutions. it represents a condition in which economic resources and their output have been assigned in such a manner that no objective can be made better without making another worse. there is no way to improve one part of a pareto optimal solution set without making another part worse. a state u dominates a state v if u is superior to v in at least one objective function and not worse with regard to the other objective functions. a decision vector \(u\) dominates another vector \(v\) (written \(u \prec v\)) if

\[ f_j(u) \leq f_j(v) \quad \forall \, j = 1, 2, 3, \ldots, i \tag{11} \]

and

\[ f_j(u) < f_j(v) \quad \text{for at least one } j \tag{12} \]

where \(i\) is the total number of objectives considered for simultaneous optimization. a reduction in the fuel cost of the generators increases the environmental emissions and vice-versa. as the considered objectives are conflicting in nature, a set of non-dominated (pareto-optimal) solutions is obtained instead of a single optimal solution.

3.2. real-coded genetic algorithm

in a real-coded genetic algorithm (rcga) for optimization, the output of each generator in the system is represented as a floating-point number rather than a binary number, resulting in a high-precision solution [30]. for discontinuous, non-differentiable and discrete objective functions, the algorithm has proved to be effective and superior to the binary-coded genetic algorithm. the outputs of all the generating units form a solution string known as a chromosome. the initial population is randomly generated in a given search space. the rcga loop comprises pre-processing, three genetic operations and post-processing. it performs a global optimization to identify the best solution to the formulated problem and iterates until the convergence criterion is met.
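a minimal sketch of such a real-coded loop is given below. it is not the authors' implementation: truncation selection, uniform-reset mutation and the function names are assumptions made for the illustration, while single-point crossover and the selection and mutation rates (0.3 and 0.2) follow the values reported later in the paper. each chromosome is a list of floating-point generator outputs and `fitness` is any function to be minimized.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two chromosomes at a random cut point (needs >= 2 genes)."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def mutate(chromosome, bounds, rate, rng):
    """With probability `rate`, replace a gene by a fresh sample inside its bounds."""
    return [rng.uniform(lo, hi) if rng.random() < rate else gene
            for gene, (lo, hi) in zip(chromosome, bounds)]


def rcga_minimize(fitness, bounds, pop_size=20, generations=50,
                  selection_rate=0.3, mutation_rate=0.2, seed=0):
    """Real-coded GA sketch: genes are floats, one per generating unit."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    n_keep = max(2, int(selection_rate * pop_size))
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:n_keep]  # assumed truncation selection: keep the fittest
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in single_point_crossover(a, b, rng):
                children.append(mutate(child, bounds, mutation_rate, rng))
        pop = (parents + children)[:pop_size]  # parents survive unchanged (elitism)
    return min(pop, key=fitness)
```

because the selected parents are carried over unchanged, the best fitness in the population never gets worse from one generation to the next.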
the fitness value of each individual is estimated to optimize the nceeld problem given by (4) for a given load while satisfying the limits shown in (7) and (8):

\[ \min C = \left( f_1 + \alpha \left[ \sum_{i=1}^{N} P_i - (P_D + P_L) \right]^2 \right) + p_{fe} \left( f_2 + \alpha \left[ \sum_{i=1}^{N} P_i - (P_D + P_L) \right]^2 \right) + p_{fl} \left( f_3 + \alpha \left[ \sum_{i=1}^{N} P_i - (P_D + P_L) \right]^2 \right) \tag{13} \]

where \(\alpha\) represents the penalty parameter that applies if the n/w load demand is not satisfied. this guarantees that a feasible solution gets higher fitness as compared to an infeasible solution.

3.3. fuzzy approach based on min-max proposition

optimizing three conflicting objectives (fuel cost, emission and n/w loss) simultaneously is a tedious task, as there is no single criterion to judge the merit of the available non-dominated solutions. due to the conflicting nature of the objectives, it is hard to find the best solution. every objective is assigned a degree of satisfaction based on the membership functions provided by the fuzzy method. the membership functions represent the degree of membership in fuzzy sets in the range [0,1]. \(\mu(f_i)\) is a monotonically decreasing function given as [9]:

\[ \mu(f_i) = \begin{cases} 1, & f_i \leq f_i^{min} \\ \dfrac{f_i^{max} - f_i}{f_i^{max} - f_i^{min}}, & f_i^{min} < f_i < f_i^{max} \\ 0, & f_i \geq f_i^{max} \end{cases} \tag{14} \]

where \(f_i^{min}\) represents the expected minimum value and \(f_i^{max}\) represents the expected maximum value of objective function \(i\). the membership function value signifies how much a solution satisfies \(f_i\) on a scale of 0 to 1. the fuzzy min-max proposition to nominate the best solution amongst many solutions can be given as [9]

\[ \mu_{bestsolution} = \max\left\{ \min\left[ \mu(F_j) \right]^{k} \right\} \tag{15} \]

where \(k\) is the number of pareto-optimal solutions. each objective is expected to attain higher satisfaction for each solution. the best-compromised solution is identified based on the highest rank among the \(k\) solutions.
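the dominance test of (11)-(12) and the fuzzy min-max selection of (14)-(15) can be sketched together, since the second operates on the non-dominated set produced by the first. the function names are hypothetical; each solution is assumed to be a tuple of objective values to be minimized, and `limits` holds the (minimum, maximum) pair of every objective.

```python
def dominates(u, v):
    """Eqs. (11)-(12): u is no worse than v everywhere and strictly better somewhere."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


def membership(value, f_min, f_max):
    """Eq. (14): linear, monotonically decreasing membership in [0, 1]."""
    if value <= f_min:
        return 1.0
    if value >= f_max:
        return 0.0
    return (f_max - value) / (f_max - f_min)


def best_compromise(solutions, limits):
    """Eq. (15): index and rank of the solution with the largest minimum membership."""
    ranks = [min(membership(v, lo, hi) for v, (lo, hi) in zip(sol, limits))
             for sol in solutions]
    best = max(range(len(ranks)), key=lambda k: ranks[k])
    return best, ranks[best]
```

applied to the five (cost, emission, loss) triples of table 7 with the limits of table 6, `best_compromise` picks the fifth solution with a rank of about 0.674, in line with the 0.6742 tabulated there; the small difference comes from the rounding of the tabulated inputs.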
the pseudo-code to solve the nceeld problem is shown below:

step i: initialise the cost coefficients, generator limits, load demand and the min-max values of each objective.
step ii: create a random population of generator outputs within the specified limits.
step iii: evaluate the fitness of the constrained tri-objective problem of the network with prohibiting operating zones and ramp rate limits.
step iv: use single-point crossover for pairing and mating of the selected chromosomes.
step v: create a mutant on a random basis.
step vi: create new chromosomes and offspring for the convergence check.
step vii: select the fittest individual for the next generation.
step viii: check the convergence criteria. if the maximum counter is reached, jump to step ix; otherwise, return to step iv.
step ix: calculate the membership value of the pareto optimal solutions using (14). the fmin and fmax values of each objective are determined by optimizing all the objectives independently to determine the endpoints of the obtained pareto front.
step x: use the degree of satisfaction attained for each objective to find the best-compromise solution based on the min-max proposition given in (15).

4. results and discussion

to validate the performance, frcga has been employed to solve the nceeld problem on two networks having 3 and 6 generators, satisfying all the operational network constraints at various power demands. the network data for 3 and 6 generating units is given in the appendix (table 13, table 14, table 15 and table 16). a program to simulate the results for both test n/w is written in matlab 7.10. the standard ieee-30 bus network with six generator units is presented in fig. 1.

fig. 1 one-line diagram of 30-bus network

to demonstrate the superiority of the frcga, three different test cases have been identified at different network complexity. the convergence test was carried out employing the same evaluation function for the same no. of iterations for the convex case. the results for one trial of 250 iterations are shown in fig. 2, fig. 3 and fig. 4 for the optimized cost, emission and loss function, respectively. it can be seen that frcga converges faster for the population size of 500.

fig. 2 convergence characteristic for best fuel cost solution for different pop sizes (x-axis: no. of iteration; y-axis: fuel cost; curves: popsize = 200, 300, 400, 500)

fig. 3 convergence characteristic for best emission solution for different pop sizes (x-axis: no. of iteration; y-axis: emission; curves: popsize = 200, 300, 400, 500)

fig. 4 convergence characteristic for best n/w loss solution for different pop sizes (x-axis: no. of iteration; y-axis: system loss; curves: popsize = 200, 300, 400, 500)

hence, the optimal settings for both cases are the same, with the exception of population size, and are mentioned in table 1.

table 1 frcga parameters for different case studies
parameters | selected value
population size | 200 (case 1), 500 (case 2 & 3)
selection rate | 0.3
mutation rate | 0.2
trials | 60
iterations | 250

4.1. environmental economic dispatch

three and six generator networks have been tested without considering the effect of vpl in the network. table 2 illustrates the best cost and emission linked with the network at two different power demands of 400 mw and 500 mw. when cost minimization is performed, the generating fuel cost and n/w emissions are 20792.88 $ and 206.3426 kg, respectively, but the cost of the generator increases to 20846.60 $, and the network harmful emission reduces to 200.1578 kg in the ed case at a power demand of 400 mw.
for 500 mw, the generator cost and n/w emissions are 25453.26 $ and 319.5089 kg when cost minimization is performed, but the cost rises to 25500.40 $ and emission reduces to 311.0776 kg. using min and max values of each objective function, the membership value of the non-dominated solutions is determined.

table 2 best solution for eld and ed of 3-unit n/w at pd=400 mw and 500 mw
load demand | 400 mw, eld | 400 mw, ed | 500 mw, eld | 500 mw, ed
p1 (mw) | 81.4957 | 106.4685 | 103.5167 | 130.8372
p2 (mw) | 175.8190 | 151.1246 | 217.1612 | 190.1187
p3 (mw) | 149.8137 | 149.7724 | 190.9736 | 190.7181
fuel cost ($) | 20792.88 | 20846.60 | 25453.26 | 25500.40
emission (kg) | 206.3426 | 200.1578 | 319.5089 | 311.0776
loss (mw) | 7.5560 | 7.3865 | 11.9239 | 11.6800

the simultaneous optimization of the environmental emission and the generator fuel cost is carried out to determine a best-compromise solution. in table 3 and table 4, five intermediate pareto solutions are listed from the attained pareto solution set using the presented approach with its membership values. solution 5 is selected as the best solution having the highest rank of 0.1584 and 0.1110 at 400 mw and 500 mw respectively.
table 3 pareto optimal solutions for the convex-eed problem at pd=400 mw (3-unit n/w)
solution number | cost ($) | emission (kg) | µ1 | µ2 | µmin
1 | 20845.74 | 203.7849 | 0.0160 | 0.4135 | 0.0160
2 | 20843.59 | 200.6626 | 0.0560 | 0.9184 | 0.0560
3 | 20812.80 | 205.3911 | 0.6293 | 0.1539 | 0.1539
4 | 20838.31 | 200.3850 | 0.1544 | 0.9633 | 0.1544
5 | 20838.09 | 200.2123 | 0.1584 | 0.9912 | 0.1584

table 4 pareto optimal solutions for the convex-eed problem at pd=500 mw (3-unit n/w)
solution number | cost ($) | emission (kg) | µ1 | µ2 | µmin
1 | 25497.79 | 312.3221 | 0.0553 | 0.8524 | 0.0553
2 | 25497.63 | 311.0877 | 0.0586 | 0.9988 | 0.0586
3 | 25497.56 | 312.2660 | 0.0602 | 0.8590 | 0.0602
4 | 25496.93 | 311.1103 | 0.0737 | 0.9961 | 0.0737
5 | 25495.17 | 311.1194 | 0.1110 | 0.9950 | 0.1110

the summarized result for a best-compromised solution for three generating unit network is tabulated in table 5 and is compared with the other methods such as ga [25], pso [25] and fpa [27].

table 5 best solution for the convex-eed problem at pd=400 mw and 500 mw (3-unit n/w)
best-compromised solution | 400 mw, frcga | 400 mw, ga [25] | 400 mw, pso [25] | 400 mw, fpa [27] | 500 mw, frcga | 500 mw, ga [25] | 500 mw, pso [25] | 500 mw, fpa [27]
p1 (mw) | 102.8514 | 102.617 | 102.612 | 102.4468 | 129.3252 | 128.997 | 128.984 | 128.8074
p2 (mw) | 154.0217 | 153.825 | 153.809 | 153.8341 | 192.4745 | 192.683 | 192.645 | 192.5906
p3 (mw) | 150.5278 | 151.011 | 150.991 | 151.1321 | 189.8764 | 190.11 | 190.063 | 190.2958
fuel cost ($) | 20838.09 | 20840.10 | 20838.30 | 20838.10 | 25495.17 | 25499.40 | 25495.00 | 25494.70
emission (kg) | 200.2123 | 200.256 | 200.221 | 200.2238 | 311.1194 | 311.273 | 311.15 | 311.155
loss (mw) | 7.4090 | 7.41324 | 7.41173 | 7.4126 | 11.6882 | - | - | -
total cost ($) | 29559.59 | 29563.20 | 29559.90 | 29559.81 | 39209.7 | 39220.10 | 39210.20 | 39210.15

the comparison depicts that the total generation cost incurred in solving eed problem from the frcga approach is lower than that incurred using other optimization approaches in both test cases.
thus, frcga succeeds in obtaining the global minimum solution and performs better than these algorithms in respect of all parameters. the total network losses for the best-compromised solution are 7.4090 mw and 11.6882 mw for power demands of 400 mw and 500 mw, respectively. for the 30-bus n/w, the best-compromised solution attained has values of 0.1999 lb/hr and 619.90 $/hr for harmful environmental emission and cost, respectively, at a load demand of 2.834 p.u and is in close agreement with 0.1969 lb/hr and 623.87 $/hr as mentioned in [20]. fig. 5 shows the pareto front drawn between the fuel cost and the emission points, which reveals an inverse relationship between the two objectives.

fig. 5 pareto front between generator fuel cost ($/hr) and emission (lb/hr) for convex eed

4.2. environmental economic loss dispatch with valve-point loading

the performance of the frcga on the nceeld problem is examined for the first time on the ieee 30-bus network at two different loading conditions. three objectives (fuel cost, environmental emission and losses) are simultaneously considered and optimized to obtain the minimum network generation cost. the total generation cost comes out to be 1810.10 $/hr at 2.834 p.u load demand, which is found to be superior to published results at 2.834 p.u. the minimum-maximum limits for fuel cost with vpl effect, harmful environmental emissions and losses for load demands of 2.834 p.u and 4.32 p.u are given in table 6. for the load of 2.834 p.u, the values attained for cost and emission are 608.02 $/hr and 0.1938 lb/hr, which are lower than 626.96 $/hr & 0.2110 lb/hr [26], 613.342 $/hr & 0.2028 lb/hr [28] and 613.338 $/hr & 0.1953 lb/hr [29], respectively. the membership values of all the pareto optimal solutions for the nceeld problem are obtained. five intermediate solutions are tabulated in table 7 and table 8 for pd=2.834 p.u and pd=4.32 p.u respectively.
table 6 min-max limit for fuel cost with vpl effect, emission and loss at 2.834 p.u and 4.32 p.u
load (p.u) | 2.834 | 4.32
cost ($/hr), minimum | 608.02 | 965.93
cost ($/hr), maximum | 646.19 | 980.67
emission (lb/hr), minimum | 0.1938 | 0.2263
emission (lb/hr), maximum | 0.2211 | 0.2422
loss (p.u), minimum | 0.0209 | 0.0514
loss (p.u), maximum | 0.0379 | 0.0612

table 7 pareto optimal set of nceeld problem with vpl effect for load pd=2.834 p.u
solution number | cost ($/hr) | emission (lb/hr) | loss (p.u) | µ1 | µ2 | µ3 | µmin
1 | 622.74 | 0.1973 | 0.0262 | 0.6144 | 0.8704 | 0.6894 | 0.6144
2 | 622.62 | 0.2001 | 0.0228 | 0.6174 | 0.7697 | 0.8862 | 0.6174
3 | 621.53 | 0.2022 | 0.0228 | 0.6461 | 0.6911 | 0.8863 | 0.6461
4 | 619.84 | 0.2021 | 0.0268 | 0.6905 | 0.6961 | 0.6558 | 0.6558
5 | 614.99 | 0.2027 | 0.0255 | 0.8174 | 0.6742 | 0.7284 | 0.6742

table 8 pareto optimal set of nceeld for load pd=4.32 p.u
solution number | cost ($/hr) | emission (lb/hr) | loss (p.u) | total cost ($/hr) | µ1 | µ2 | µ3 | µmin
1 | 973.38 | 0.2326 | 0.0555 | 4490.2 | 0.4944 | 0.6013 | 0.5774 | 0.4944
2 | 972.85 | 0.2329 | 0.0563 | 4515.4 | 0.5301 | 0.5831 | 0.4959 | 0.4959
3 | 972.96 | 0.2335 | 0.0541 | 4489.12 | 0.5228 | 0.5451 | 0.7296 | 0.5228
4 | 972.76 | 0.2332 | 0.0545 | 4475.29 | 0.5365 | 0.5666 | 0.6788 | 0.5365
5 | 972.22 | 0.2335 | 0.0543 | 4497.8 | 0.5728 | 0.5480 | 0.7045 | 0.5480

the results reveal that the best-compromise solution for load demand of 2.834 p.u is 2099.20 $/hr and for load pd=4.32 p.u is found to be 4497.82 $/hr with the highest rank of 67.42% and 54.80% respectively depending upon its membership value of each objective. fig. 6 depicts the convergence criteria of 30-bus network on two different loads which reveal that the convergence of load pd= 2.834 p.u and pd= 4.32 p.u is attained faster even for the complex multi-objective minimization problem.

fig. 6 convergence characteristic for total generation cost for different load conditions

4.3.
environmental economic loss dispatch with valve-point loading, pozs and rrl

for this test case, all the mentioned practical constraints and non-linear characteristics of the non-convex multi-objective problem are considered, which makes this test case more complex than the other test cases considered above. data for the ramp rate limits and pozs has been taken from the appendix (table 15 and table 17). the generator ramp rate limit needs to be satisfied because the generator output cannot change (increase or decrease) arbitrarily to any value; the change has to be within the up/down ramp rate limits. the inclusion of ramp rate limits changes the operating limits of the generator. the minimum-maximum limits of fuel cost, emission and loss evaluated for the six-unit system with pozs and rrl are given in table 9 for a load demand of 2.834 pu. table 10 provides the intermediate solutions obtained using rcga. the best solution is ranked on the basis of its performance for all the objectives considered. therefore, the overall rank for the extreme points is zero. the rank of the best solution is found to be 0.6685, which indicates that all three objectives are satisfied at least 66.85 % for the load of 2.834 p.u.

table 9 min-max limit for fuel cost with vpl effect, emission and loss with pozs and rrl at 2.834 p.u
| cost ($/h) | emission (lb/h) | loss (pu)
minimum | 611.2998 | 0.1942 | 0.0256
maximum | 645.3562 | 0.2073 | 0.0358

table 10 pareto optimal set of nceeld with pozs and rrl for load pd=2.834 p.u
| cost ($/h) | emission (lb/h) | loss (pu) | µ1 | µ2 | µ3 | µmin
sol.1 | 624.7335 | 0.1975 | 0.0257 | 0.6055 | 0.7473 | 0.9901 | 0.6055
sol.2 | 623.7816 | 0.1989 | 0.0242 | 0.6335 | 0.6421 | 1.0000 | 0.6335
sol.3 | 623.3781 | 0.1979 | 0.0291 | 0.6453 | 0.7208 | 0.6589 | 0.6453
sol.4 | 621.4747 | 0.1987 | 0.0283 | 0.7012 | 0.6598 | 0.7379 | 0.6598
sol.5 | 620.3646 | 0.1985 | 0.0256 | 0.7338 | 0.6685 | 1.0000 | 0.6685

the results clearly show that all the constraints, such as vpl effect, pozs, rrl, generation limits and power balance constraints, were fully satisfied for all considered test cases of the tri-objective optimization problem. due to the non-convexity constraints introduced in the test system, the cost increases from 608.0296 $/hr to 611.2998 $/hr, emission increases from 0.1938 lb/hr to 0.1942 lb/hr and system loss from 0.0209 p.u to 0.0256 p.u.

4.4. statistical analysis

table 11 lists the comparison of different approaches for cost and emission minimization in terms of their minimum, maximum, mean and median values, respectively, for the ieee 30-bus n/w. the cost minimum (cmin), cost mean (cmean), cost median (cmedian), emission minimum (emin), emission mean (emean) and emission median (emedian) values obtained for the eld and ed problem, respectively, are found to be lowest as compared to other published work. the statistical comparison of the ceed problem has also been shown in table 12 in terms of their mean and standard deviation. the values of cmean and emean obtained from solving the convex ceed problem also demonstrate the superiority of the method.
the values of cost standard deviation (cstd) and emission standard deviation (estd) attained from the proposed frcga approach are 7.127 and 0.0057, respectively, which are less than those obtained from other approaches. this clearly shows that the obtained results lie closer to their mean value as compared to other published methods.

table 11 statistical comparison of eld and ed minimization for ieee 30-bus n/w at load pd=2.834 p.u
[1] fuel cost minimization | cmin | cmax | cmean | cmedian
proposed approach | 601.31 | 610.07 | 603.20 | 602.23
gqpso [31] | 606.38 | 611.86 | 609.49 | 609.66
saiwpso [32] | 605.99 | 606.00 | 605.99 | 605.99
ngpso [20] | 605.99 | 605.99 | 605.99 | 605.99
[2] emission minimization | emin | emax | emean | emedian
proposed approach | 0.1938 | 0.2295 | 0.1941 | 0.1940
gqpso [31] | 0.1942 | 0.1946 | 0.1944 | 0.1944
saiwpso [32] | 0.1941 | 0.1941 | 0.1941 | 0.1941
ngpso [20] | 0.1941 | 0.1941 | 0.1941 | 0.1941

table 12 statistical comparison of ceed minimization for ieee 30-bus n/w at load pd=2.834 p.u
| cmean | cstd | emean | estd
proposed approach | 622.62 | 7.127 | 0.2012 | 0.0057
gqpso [31] | 644.09 | 12.2 | 0.2109 | 0.0095
saiwpso [32] | 623.76 | - | 0.1970 | -
ngpso [20] | 623.86 | - | 0.1969 | -

5. conclusion

the fuzzy-based rcga is demonstrated to solve the multi-objective environmental economic loss dispatch problem considering a non-convex and non-smooth fuel cost function. the multi-objective minimization problem is transformed into a constrained single-objective problem by the use of the price penalty factor, which blends all competing objectives (generator cost, environmental emission and system losses). because the objectives are inversely related, a set of pareto optimal solutions is attained rather than a single optimal solution for a given objective. furthermore, a fuzzy approach is exploited to extract the best-compromised solution as per the highest rank based on their membership values.
the convergence of the nceeld problem at different load demand is also analyzed considering the different practical operating limits (pozs, rrl and vpl) of the network. the total generation cost of the network attained from the proposed method for different test cases has been compared to the other techniques which validate the solution to nceeld problem for small and large networks. the statistical analysis also validates the frcga approach. the percentage reduction in cstd and estd values are 41.5% and 40% as compared to ref. [31]. the proposed work can further be extended for the study of integration of renewable energy sources and for practical transmission networks considering dynamic non-convex ceeld problem.

appendix

table 13 generator cost, emission coefficients & generation constraints for three generating unit network
| g1 | g2 | g3
cost coefficients
ai | 0.03546 | 0.02111 | 0.01799
bi | 38.30553 | 36.32782 | 38.27041
ci | 1243.5311 | 1658.5696 | 1356.6592
emission coefficients
αi | 0.00683 | 0.00461 | 0.00461
βi | -0.54551 | -0.5116 | -0.5116
γi | 40.2669 | 42.89553 | 42.89553
unit limits
pmin (p.u) | 35 | 130 | 125
pmax (p.u) | 210 | 325 | 315

table 14 b-coefficients for three generating unit network
bij * 0.0001
0.71 | 0.3 | 0.25
0.3 | 0.69 | 0.32
0.255 | 0.32 | 0.8

table 15 generator fuel cost, emission coefficients and n/w generation constraints for 30-bus n/w
| g1 | g2 | g3 | g4 | g5 | g6
cost coefficients
ai | 100 | 120 | 40 | 60 | 40 | 100
bi | 200 | 150 | 180 | 100 | 180 | 150
ci | 10 | 10 | 20 | 10 | 20 | 10
ei | 200 | 200 | 200 | 200 | 200 | 200
fi | 0.0050 | 0.0060 | 0.0010 | 0.0009 | 0.0009 | 0.0015
emission coefficients
αi | 4.091 | 2.543 | 4.258 | 5.326 | 4.258 | 6.131
βi | -5.554 | -6.047 | -5.094 | -3.550 | -5.094 | -5.555
γi | 6.490 | 5.638 | 4.586 | 3.380 | 4.586 | 5.151
ζi | 0.0002 | 0.0005 | 0.00001 | 0.002 | 0.000001 | 0.00001
λi | 2.857 | 3.333 | 8.000 | 2.000 | 8.000 | 6.667
generator unit constraints
pmin (p.u) | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05
pmax (p.u) | 0.5 | 0.6 | 1.0 | 1.2 | 1.00 | 0.60
ramp rate limits
dri(up)/h | 0.08 | 0.11 | 0.15 | 0.18 | 0.15 | 0.18
dri(dn)/h | 0.08 | 0.11 | 0.15 | 0.18 | 0.15 | 0.18

table 16 b-coefficients for six generating unit network
bij
0.1382 | -0.0299 | 0.0044 | -0.0022 | -0.0010 | -0.0008
-0.0299 | 0.0487 | -0.0025 | 0.0004 | 0.0016 | 0.0041
0.0044 | -0.0025 | 0.0182 | -0.0070 | -0.0066 | -0.0066
-0.0022 | 0.0004 | -0.0070 | 0.0137 | 0.0050 | 0.0033
-0.0010 | 0.0016 | -0.0066 | 0.0050 | 0.0109 | 0.0005
-0.0008 | 0.0041 | 0.0066 | 0.0033 | 0.0005 | 0.0244
bo
-0.0107 | 0.0060 | -0.0017 | 0.0009 | 0.0002 | 0.0030
boo
0.00098573

table 17 pozs of units for ieee-30 bus n/w
unit | 1 | 2 | 5
poz | [0.10 0.15] | [0.25 0.30] | [0.50 0.55]

references

[1] j. c. dodu, p. martin, a. merlin and j. pouget, "an optimal formulation and solution of short-range operating problems for a power system with flow constraints", proc. ieee, vol. 60, no. 1, pp. 54-63, 1972.
[2] m. modiri-delshad, s. h. a. kaboli, e. taslimi-renani and n. a. rahim, "backtracking search algorithm for solving economic dispatch problems with valve-point effects and multiple fuel options", energy, vol. 116, pp. 637-649, 2016.
[3] m. pradhan, p. k. roy and t. pal, "grey wolf optimization applied to economic load dispatch problems", int. j. electr. power energy syst., vol. 83, pp. 325-334, 2016.
[4] m. r. gent and w. l.
john, "minimum-emission dispatch", ieee trans. power syst., vol. 90, pp. 2650–2660, 1971.
[5] k. t. chaturvedi, m. pandit and l. srivastava, "modified neo-fuzzy neuron-based approach for economic & environmental optimal dispatch", appl. soft comput., vol. 8, no. 4, pp. 1428-1438, 2008.
[6] s. zaoui and a. belmadani, "solution of combined economic and emission dispatch problems of power systems without penalty", appl. artif. intell., p. 1976092, 2021.
[7] a. chatterjee, s. p. ghoshal and v. mukherjee, "solution of combined economic and emission dispatch problems of power system by an opposition-based harmony search algorithm", int. j. electr. power energy syst., vol. 39, no. 1, pp. 9-20, 2012.
[8] c. palanichamy and k. srikrishna, "economic thermal power dispatch with emission constraint", j. institution of eng., vol. 72, pp. 11-18, 1991.
[9] s. s. parihar and n. malik, "multi-objective optimization with non-convex cost functions using fuzzy mechanism based continuous genetic algorithm", in proceedings of the ieee 4th international conference on electrical, computer and electronics, 2017, pp. 457-462.
[10] d. c. walters and g. b. sheble, "genetic algorithm solution of economic dispatch with valve point loading", ieee trans. power syst., vol. 8, no. 3, pp. 1325-1332, 1993.
[11] d. zou, s. li, g. g. wang, z. li and h. ouyang, "an improved differential evolution algorithm for the economic load dispatch problems with or without valve-point effects", appl. energy, vol. 181, pp. 375-390, 2016.
[12] w. t. el-sayed, e. f. el-saadany, h. h. zeineldin and a. s. al-sumaiti, "fast initialization methods for the nonconvex economic dispatch problem", energy, vol. 201, p. 117635, june 2020.
[13] s. m. abd elazim and e. s. ali, "optimal network restructure via improved whale optimization approach", int. j. commun., vol. 34, no. 1, e. 4617, 2021.
[14] a. y. abdelaziz, e. s. ali and s. m. abd elazim, "flower pollination algorithm to solve combined economic and emission dispatch problems", eng. sci. technol. int. j., vol. 19, no. 2, pp. 980-990, 2016. https://www.sciencedirect.com/journal/engineering-science-and-technology-an-international-journal/vol/19/issue/2
[15] m. a. abido, "a novel multi-objective evolutionary algorithm for environmental/economic power dispatch", int. j. electr. power system res., vol. 65, no. 1, pp. 71–91, 2003.
[16] v. p. sakthivel, m. suman and p. d. sathya, "combined economic and emission power dispatch problems through multi-objective squirrel search algorithm", appl. soft comput., vol. 100, p. 106950, march 2021.
[17] n. sinha, r. chakrabarti and p. k. chattopadhyay, "evolutionary programming techniques for economic load dispatch", ieee trans. evol. comput., vol. 7, no. 1, pp. 83-94, 2003.
[18] m. basu, "dynamic economic emission dispatch using nondominated sorting genetic algorithm – ii", int. j. electr. power energy syst., vol. 30, no. 2, pp. 140-149, 2008.
[19] e. s. ali and s. m. abd elazim, "mine blast algorithm for environmental economic load dispatch with valve loading effect", neural comput. appl., vol. 30, pp. 261-270, 2018.
[20] d. zou, s. li, z. li and x. kong, "a new global particle swarm optimization for the economic emission dispatch with or without transmission losses", energy convers. manag., vol. 139, pp. 45-70, 2017.
[21] l. wang and c. singh, "environmental / economic power dispatch using fuzzified multi-objective particle swarm optimization algorithm", int. j. electr. power syst. res., vol. 77, no. 12, pp. 1654-1664, 2007.
[22] s. sivasubramani and k. s. swarup, "environmental/economic dispatch using multi-objective harmony search algorithm", electr. power syst. res., vol. 81, no. 9, pp. 1778-1785, 2011.
[23] l. benyekhlef, s. abdelkader, b. houari and a. a. n. el-islam, "cuckoo search algorithm to solve the problem of economic emission dispatch with the incorporation of facts devices under the valve-point loading effect", fu: elec. energ., vol. 34, no. 10, pp. 569-588, 2021.
[24] q. quande, c. shi, c. xianghua, l. xiujuan and s. yuhui, "solving non-convex/non-smooth economic load dispatch problems via an enhanced particle swarm optimization", appl. soft comput., vol. 59, pp. 1-24, 2017.
[25] a. l. devi and o. v. krishna, "combined economic and emission dispatch using evolutionary algorithms – a case study", arpn j. eng. appl. sci., vol. 3, no. 6, pp. 28-35, 2008.
[26] s. hemamalini and s. p. simon, "emission constrained economic dispatch with valve point effect using particle swarm optimization", in proceedings of the ieee region 10 conference (tencon), 2008, vol. 1, pp. 1-6.
[27] a. y. abdelaziz, e. s. ali and s. m. abd elazim, "combined economic and emission dispatch solution using flower pollination algorithm", int. j. electr. power energy syst., vol. 80, pp. 264-274, 2016.
[28] a. bhattacharya and p. k. chattopadhyay, "application of biogeography-based optimization for solving multi-objective economic emission load dispatch problem", electr. power compon. syst., vol. 38, no. 3, pp. 826-850, 2010.
[29] a. bhattacharya and p. k. chattopadhyay, "solving economic emission load dispatch problems using hybrid differential evolution", appl. soft comput., vol. 11, no. 2, pp. 2526-2537, 2011.
[30] r. l. haupt and s. e. haupt, practical genetic algorithm, 2004. (book)
[31] s. agrawal, b. k. panigrahi and m. k. tiwari, "multiobjective particle swarm algorithm with fuzzy clustering for electrical power dispatch", ieee trans. evol. comput., vol. 12, no. 5, pp. 529-541, 2008.
[32] m. a. c. silva, c. e. klein, v. c. mariani and l. s. coelho, "multiobjective scatter search approach with new combination scheme applied to solve environmental/economic dispatch problem", energy, vol. 53, no. 5, pp. 14-21, 2013.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp.
571-585 https://doi.org/10.2298/fuee2204571s © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

fast doa estimation of the signal received by textile wearable antenna array based on ann model*

zoran stanković, olivera pronić-rančić, nebojša dončov
university of niš, faculty of electronic engineering, niš, serbia

abstract. an mlp_doa module intended for fast doa estimation, which is an integral part of the smart twaa doa subsystem, is proposed. a multilayer perceptron network is used to create the mlp_doa module, which provides the radio gateway location in the azimuthal plane at its output when a spatial correlation matrix, found by receiving the radio gateway signal with a two-element textile wearable antenna array, is brought to its input. the mlp_doa network is trained while its generalization capabilities are monitored on a validation set of samples. the accuracy of the proposed modeling approach is compared with the classical approach to mlp_doa module training previously developed by the authors. the presented ann model is also compared with the root music algorithm in terms of accuracy and program execution time.

key words: ann, mlp, doa, twaa, root music

1. introduction

wearable wireless systems play an integral role in the fifth generation (5g) networks, which operate with higher bit rates, lower latency, and lower outage probabilities in smaller microcells and picocells covering broader areas than 4g or older technologies. in addition, beam reconfigurability and beamforming are expected to facilitate spectral and energy efficiency at both the mobile device and base station levels. besides mobile communications, wearable wireless systems find numerous applications in areas such as health care, security, ambient assisted living, sports, etc. [2]-[7]. wearable antennas are among the most important elements of wearable wireless systems [8]-[15].
they are usually integrated within the clothing by any of the current state-of-the-art fabrication methods (fabric-based embroidered antennas, polymer-embedded antennas, microfluidic antennas with injection alloys, inkjet printing, screen printing and photolithography, 3d-printed antennas, etc.) [9]. depending on the type of application, it is vital to choose a suitable antenna form, as a one-design-fits-all approach often does not meet all requirements.

received march 24, 2022; revised may 15, 2022; accepted june 5, 2022
corresponding author: zoran stanković, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia
e-mail: zoran.stankovic@elfak.ni.ac.rs
* an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]

in health care monitoring (hcm), wireless technology enables a significant reduction in the cost of health services, while at the same time providing the necessary quality of service. a combination of biosensors placed on the patient's body and antennas integrated into garments, which transmit/receive signals to and from the remote wireless monitoring point, can allow patients to receive the needed assistance while continuing to live in their own homes [15]. a single textile wearable antenna with an omnidirectional radiation pattern is commonly used in health care monitoring systems. it avoids signal level fluctuations between the antenna and a radio gateway (rg) of the hcm system due to wearer movements. however, the range of the single antenna is significantly reduced both outdoors and indoors due to its small gain. classic antenna arrays provide significantly higher gain but have narrow and spatially invariant radiation patterns, and therefore cannot overcome the problem of signal fluctuations due to the movement of the antenna wearer.
textile wearable antenna arrays (twaa) with adaptive beamforming (smart twaa), on the other hand, keep the main lobe of the antenna array radiation pattern directed towards the rg at all times [15]. direction-of-arrival (doa) estimation of the rg signal is a crucial factor in adaptive beamforming [16]. usually, doa estimation requires intensive matrix calculations as it is based on super resolution algorithms such as music, esprit, and their modifications. their real-time implementation therefore requires powerful hardware platforms [17]-[20], which makes them unsuitable for the small mobile platforms used to realise smart twaa. on the other hand, artificial neural networks (anns) for doa estimation do not require complex matrix calculations and can be easily implemented on modest mobile hardware platforms [21]-[26]. further, our previous research showed that they have approximately the same modelling accuracy as super resolution algorithms, but significantly higher calculation speed [1], [24]-[26].

this paper is a continuation of the research presented in [1], where the basic version of the doa module based on the multilayer perceptron (mlp) network (mlp_doa module) was proposed. that module is an integral part of the smart twaa doa subsystem with two textile antennas, which performs fast doa estimation of the rg signals and determines the rg location in the azimuthal plane. the research conducted within this paper relates to further development and improvement of the mlp_doa module, as well as to the examination of its performance in a working environment with a wide range of signal-to-noise ratio changes.
unlike the classical approach to mlp_doa module training applied in [1], which did not include mechanisms for controlling the achieved generalization capabilities of the mlp network, in this paper the network is trained while its generalization capabilities are monitored on a validation set of samples, which prevents overlearning. the proposed ann approach to doa estimation of the rg signal is compared with the classical approach based on the root music algorithm in terms of accuracy and program execution time.

the paper is organized as follows. after the introduction, a brief description of the proposed smart twaa doa subsystem is given in section 2. the architecture, training, and testing of the mlp_doa network are presented in section 3. the most illustrative numerical results are presented in section 4, and concluding remarks are given in section 5. in order to facilitate interpretation of the material presented in these sections, the list of used acronyms is given in table 1.

table 1 list of used acronyms

term                                                      acronym
health care monitoring                                    hcm
radio gateway                                             rg
textile wearable antenna arrays                           twaa
direction-of-arrival                                      doa
artificial neural networks                                anns
multilayer perceptron                                     mlp
mean square error                                         mse
maximum validation failures                               mvf
worst case error                                          wce
average test error                                        ate
pearson product moment correlation coefficient            rppm
signal-to-noise ratio                                     snr

2. smart twaa doa subsystem

the architecture of the smart twaa doa subsystem is shown in fig. 1. it consists of a two-element twaa, narrowband filters, a/d convertors, an fpga module and a doa module. the distance between the antenna elements is $d = c/(2f)$, where $c$ is the speed of light. the twaa, filters and a/d convertors perform the rg signal sampling at frequency $f$.
based on the samples provided by the twaa, the fpga module calculates the spatial correlation matrix ($\mathbf{c}$). this matrix is then sent to the input of the doa module, which determines the azimuth position of the radio gateway ($\theta$). anns are proposed for the realization of the doa module (ann based doa module).

fig. 1 architecture of the smart twaa doa subsystem [1]

in the absence of antenna noise, the vector of signals induced on a twaa whose elements have an omnidirectional radiation pattern in the azimuth plane is $\mathbf{x}_s(t) = [x_{s1}(t)\;\; x_{s2}(t)]^{T}$, where $x_{s1}(t)$ and $x_{s2}(t)$ are the signals induced on the first and second antenna element, respectively. accordingly, the correlation matrix of the signals induced on the elements can be expressed as [16]

$$\mathbf{c}_s = \mathrm{E}[\mathbf{x}_s(t)\,\mathbf{x}_s^{H}(t)] = p\,\mathbf{s}\mathbf{s}^{H} = \begin{bmatrix} p & p\,e^{-j\beta d\sin\theta} \\ p\,e^{j\beta d\sin\theta} & p \end{bmatrix} \tag{1}$$

where $\mathrm{E}[\cdot]$ denotes the expectation operator, $\mathbf{s} = [1\;\; e^{j\beta d\sin\theta}]^{T}$ is the steering vector, $\beta$ is the phase constant ($\beta = 2\pi/\lambda$), and $p$ is the power of the signal induced on one omnidirectional antenna element.

in the initial state, when the twaa wearer does not move and the textile is not deformed, the gains of the antenna elements are mutually equal, $g(\theta) = g_1(\theta) = g_2(\theta)$. in the general case, the twaa wearer moves, textile deformations occur, and consequently the orientation of the antenna elements and their effective apertures change. therefore, the gains of the antenna elements in the direction of the rg change over time and, in the general case, can have different values at the same moment,

$$g_1 = g_1(\theta, t) \neq g_2 = g_2(\theta, t), \quad \text{for most values of } t. \tag{2}$$

here, we assume that creasing of the textile does not lead to a significant change in the distance between the antenna elements, i.e., this change can be neglected. therefore, equation (1), defining the correlation matrix of the signals received by the mobile twaa, must be modified as follows

$$\mathbf{c}_s = \begin{bmatrix} g_1 p & \sqrt{g_1 g_2}\,p\,e^{-j\beta d\sin\theta} \\ \sqrt{g_1 g_2}\,p\,e^{j\beta d\sin\theta} & g_2 p \end{bmatrix}. \tag{3}$$

when antenna noise is present and there is no external rg signal, the noise vector induced on the antenna elements can be represented as $\mathbf{n}(t) = [n_1(t)\;\; n_2(t)]^{T}$, where $n_1(t)$ and $n_2(t)$ are the random noise components on the first and second antenna element, respectively. for uncorrelated noise, e.g., white gaussian noise, the noise correlation matrix is obtained as

$$\mathbf{c}_n = \mathrm{E}[\mathbf{n}(t)\,\mathbf{n}^{H}(t)] = \begin{bmatrix} \sigma_n^2 & 0 \\ 0 & \sigma_n^2 \end{bmatrix}. \tag{4}$$

the spatial correlation matrix at the twaa output, $\mathbf{c}$, can be obtained as a superposition of the correlation matrix of the signals and the noise correlation matrix,

$$\mathbf{c} = \mathrm{E}[\mathbf{x}(t)\,\mathbf{x}^{H}(t)] = \mathbf{c}_s + \mathbf{c}_n = \begin{bmatrix} g_1 p + \sigma_n^2 & \sqrt{g_1 g_2}\,p\,e^{-j\beta d\sin\theta} \\ \sqrt{g_1 g_2}\,p\,e^{j\beta d\sin\theta} & g_2 p + \sigma_n^2 \end{bmatrix} \tag{5}$$

where $\mathbf{x}(t) = \mathbf{x}_s(t) + \mathbf{n}(t)$ is the vector at the twaa output. the signal-to-noise ratio (snr) is defined with respect to the power of the signal received by the first element of the antenna array,

$$snr = \frac{p\,g_1}{\sigma_n^2}. \tag{6}$$

therefore, equation (5) can be expressed as follows

$$\mathbf{c} = \begin{bmatrix} p g_1 + \dfrac{p g_1}{snr} & \sqrt{g_1 g_2}\,p\,e^{-j\beta d\sin\theta} \\ \sqrt{g_1 g_2}\,p\,e^{j\beta d\sin\theta} & p g_2 + \dfrac{p g_1}{snr} \end{bmatrix}. \tag{7}$$

normalization of the matrix $\mathbf{c}$ does not change the results obtained by the music algorithm for doa estimation [25]. normalizing the matrix $\mathbf{c}$ with respect to the element $c_{11}$ gives a normalized matrix, $\mathbf{c}'$, that is invariant to the signal strength $p$; to determine it, it is not necessary to know the gains of both antennas but only their ratio $g_2/g_1$,

$$\mathbf{c}' = \begin{bmatrix} 1 & \sqrt{\dfrac{g_2}{g_1}}\,\dfrac{snr}{snr+1}\,e^{-j\beta d\sin\theta} \\ \sqrt{\dfrac{g_2}{g_1}}\,\dfrac{snr}{snr+1}\,e^{j\beta d\sin\theta} & \dfrac{1 + snr\,\dfrac{g_2}{g_1}}{1 + snr} \end{bmatrix}. \tag{8}$$

with the introduction of the variables $g$ (root gain ratio), $g = \sqrt{g_2/g_1}$, and the distance between antenna elements expressed in wavelengths, $d$, eq.
(8) becomes

$$\mathbf{c}' = \begin{bmatrix} 1 & g\,\dfrac{snr}{snr+1}\,e^{-j2\pi d\sin\theta} \\ g\,\dfrac{snr}{snr+1}\,e^{j2\pi d\sin\theta} & \dfrac{1 + g^{2}\,snr}{1 + snr} \end{bmatrix} \tag{9}$$

where $d$ is now measured in wavelengths, so that the product $\beta d$ of eq. (8) is replaced by $2\pi d$.

in a real scenario, the twaa wearer moves and the textile is crumpled, so it is exceedingly difficult to determine the parameters $g$ and $snr$ at each time point. also, the angle $\theta$ is unknown, so the spatial correlation matrix cannot be determined directly by applying the above formula. instead, the spatial correlation matrix is estimated from a large number of twaa output samples taken in a short time interval (twaa snapshots) using fast a/d converters, with the matrix elements calculated on the fpga module using the approximate formula

$$\mathbf{c} \approx \frac{1}{n_s}\sum_{s=1}^{n_s} \mathbf{x}_s\,\mathbf{x}_s^{H}, \tag{10}$$

where $\mathbf{x}_s$ is the sample of the $s$-th snapshot at the twaa output and $n_s$ is the number of snapshots. an example of a measuring point and the necessary laboratory equipment for obtaining the elements of a correlation matrix by measurement are presented in [26].

3. ann based doa module

the ann based doa module consists of a single mlp neural network (mlp_doa) that estimates the angle of arrival of the rg signal on the twaa based on the signal information contained in the spatial correlation matrix. this can be represented as follows

$$\theta = f_{mlp\_doa}(\mathbf{c}). \tag{11}$$

the first row of a normalized spatial correlation matrix without the autocorrelation element is sufficient for estimating the angular positions of em radiation sources [1], [25]. the real and the imaginary parts of the elements in the first row, without the autocorrelation element, are brought separately to the neurons in the input layer of the mlp network. in this way, a model is obtained that is more suitable for implementation and training than one in which the complex values of these elements are taken at the input of the mlp network [23]. accordingly, for the two-element twaa, eq.
(11) can be written in the form

$$\theta = f_{mlp\_doa}(\mathbf{c}) = f_{mlp\_doa}(\mathrm{re}\{c'_{12}\},\, \mathrm{im}\{c'_{12}\}), \tag{12}$$

where $\mathbf{c}$ here denotes the vector of the input variables of the mlp neural network, $\mathbf{c} = [\mathrm{re}\{c'_{12}\}\;\; \mathrm{im}\{c'_{12}\}]$, and $c'_{12}$ is the off-diagonal element in the first row of the normalized spatial correlation matrix.

3.1. architecture of mlp_doa network

the architecture of the mlp_doa network is shown in fig. 2. it consists of a total of $L$ layers of neurons: one input layer, one output layer, and a total of $L-2$ hidden layers between them.

fig. 2 architecture of mlp_doa network.

the signal propagation from the input to the output of the mlp network and the corresponding transfer function of the mlp_doa network (eq. 12) can be described by the output vectors of each network layer. the input layer is a buffer layer and, according to eq. (12), has two neurons. thus, the output vector of the input layer is $\mathbf{y}_1 = \mathbf{c} = [\mathrm{re}\{c'_{12}\}\;\; \mathrm{im}\{c'_{12}\}]$. the output vector of the $l$-th layer (except the input layer) can be expressed as

$$\mathbf{y}_l = f_l(\mathbf{w}_l\,\mathbf{y}_{l-1} + \mathbf{b}_l), \quad l = 2, 3, \ldots, L \tag{13}$$

where $\mathbf{y}_{l-1}$ represents the output of the $(l-1)$-th layer. in eq. (13), $\mathbf{w}_l$ is the connection weight matrix between the $(l-1)$-th and the $l$-th layer, whose element $w^{l}_{i,j}$ represents the connection weight between the $j$-th neuron of the $(l-1)$-th layer and the $i$-th neuron of the $l$-th layer; $\mathbf{b}_l$ is the vector containing the biases of the $l$-th layer, whose element $b^{l}_{i}$ represents the bias of the $i$-th neuron of the $l$-th layer; and $f_l(\cdot)$ is the activation function of the $l$-th layer neurons. the hyperbolic tangent sigmoid transfer function was used as the activation function of the hidden layers,

$$f_l(u) = \frac{e^{u} - e^{-u}}{e^{u} + e^{-u}}, \quad l = 2, 3, \ldots, L-1. \tag{14}$$

the output layer has one neuron with the linear activation function $f_L(u) = u$, and its output is given as

$$\theta = y_L = f_L(\mathbf{w}_L\,\mathbf{y}_{L-1} + \mathbf{b}_L) = \mathbf{w}_L\,\mathbf{y}_{L-1} + \mathbf{b}_L. \tag{15}$$

the weight matrices $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_L$ and bias vectors $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_L$ form the set $w$ of the trainable parameters of the mlp network.
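the layer-by-layer propagation of eqs. (13)-(15) can be illustrated with a minimal sketch in plain python. it is not the authors' matlab implementation: the function name `mlp_forward`, the nested-list layout of the weights, and the toy weight values are all assumptions made for illustration only.

```python
import math


def mlp_forward(x, weights, biases):
    """Propagate input x through an MLP given as per-layer weight rows and biases.

    Hidden layers use tanh (eq. 14); the last layer is linear (eq. 15).
    weights[k][i][j] links neuron j of the previous layer to neuron i of the next.
    """
    y = list(x)  # the input (buffer) layer passes the input vector through
    last = len(weights) - 1
    for k, (w, b) in enumerate(zip(weights, biases)):
        # u = w_l * y_(l-1) + b_l, computed row by row (eq. 13)
        u = [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) + b_i
             for row, b_i in zip(w, b)]
        y = u if k == last else [math.tanh(v) for v in u]
    return y[0]  # single output neuron: the estimated angle


# Hypothetical toy network (not trained): 2 inputs -> 2 hidden neurons -> 1 output
toy_weights = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
toy_biases = [[0.0, 0.0], [0.5]]
print(mlp_forward([1.0, -1.0], toy_weights, toy_biases))
```

once the weights are fixed, an estimate costs only a few small matrix-vector products and tanh evaluations, with no eigendecomposition or polynomial rooting involved.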
the values of the elements of this set are adjusted during network training so that the mapping expressed by eq. (12) is realized with the desired accuracy. the general architecture of this mlp_doa neural network is represented by the notation mlp$h$-$n_1$-…-$n_i$-…-$n_h$, where $h$ is the total number of hidden layers in the mlp architecture ($h = L-2$) and $n_i$ is the total number of neurons in the $i$-th hidden layer.

3.2. training and testing of mlp_doa network

mlp_doa network training is performed on a set of training samples $p = \{(\mathbf{c}_1, \theta_1^{d}), (\mathbf{c}_2, \theta_2^{d}), \ldots, (\mathbf{c}_s, \theta_s^{d}), \ldots, (\mathbf{c}_{n_p}, \theta_{n_p}^{d})\}$, where $\theta_s^{d}$ is the desired value of the network output when the sample $\mathbf{c}_s$ is brought to its input and $n_p$ is the total number of training samples. to monitor the achieved degree of network generalization, a validation set $v$ is applied, containing $n_v$ samples of the same format as the samples of the training set $p$.

during network training, the samples from the training set are brought to the network input and the weights and biases from the set $w$ are changed iteratively in accordance with the chosen training algorithm. the goal is to minimize the mean square error (mse) of the network output relative to the desired output values. with regard to the network performance on the training set, training is stopped either when the target mse on the training set ($e_p^{target}$) is reached or when the maximum number of iterations, $n_i^{max}$, is reached. during training, the mse of the network output on the validation set, $e_v(w)$, is also monitored, and when its minimum value ($e_v^{min}$) is reached, training is stopped even if the above conditions for terminating the training are not met. once $e_v^{min}$ is achieved, any further training of the mlp_doa network leads to overfitting and deterioration of its generalization abilities.
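the three stopping conditions described above can be sketched as a check over recorded error curves. this is a simplified illustration, not the authors' training code: the function name `stopping_point` is hypothetical, and the rule used to detect the validation minimum (a counter of successive iterations without improvement, limited by the mvf parameter listed in table 1) is an assumption about how the check is realised in practice.

```python
def stopping_point(train_mse, val_mse, ep_target=1e-6, ni_max=1000, mvf=20):
    """Return (stop_iteration, best_iteration, reason) for recorded error curves.

    train_mse[i-1] and val_mse[i-1] are the errors after iteration i.
    best_iteration is where the validation error was lowest; the weights
    saved at that iteration would be the ones kept.
    """
    best_val = float("inf")
    best_iter = 0
    failures = 0  # successive iterations without a new validation minimum
    i = 0
    for i, (ep, ev) in enumerate(zip(train_mse, val_mse), start=1):
        if ev < best_val:
            best_val, best_iter, failures = ev, i, 0
        else:
            failures += 1
        if failures >= mvf:
            return i, best_iter, "validation"
        if ep <= ep_target:
            return i, best_iter, "target"
        if i >= ni_max:
            return i, best_iter, "max_iterations"
    return i, best_iter, "curves_exhausted"


# Made-up curves: validation error turns upward after iteration 3
print(stopping_point([5, 4, 3, 2, 1, 0.5], [5, 4, 3, 4, 5, 6], mvf=3))
```

in this made-up example the training error keeps falling, yet training ends at iteration 6 and the parameters from iteration 3 are the ones that would be retained.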
in other words, the problem of finding the optimal breakpoint of the neural network training comes down to finding the values of the network weights and biases from the set $w$ for which the network has the minimum mean square error on the validation set (eq. 16),

$$e_v^{min} = \min_{w} e_v(w) = \min_{w}\left[\frac{1}{2 n_v}\sum_{s=1}^{n_v}\left(\theta(\mathbf{c}_s, w) - \theta_s^{d}\right)^{2}\right]. \tag{16}$$

if, during the iterative training of the mlp_doa network, the error on the validation set, after a period of continuous decline, grows in each of the next mvf (maximum validation failures) successive iterations, the minimum error is considered to have been reached and the training is stopped. the mvf value is set before the start of network training. in the example of mlp_doa network training presented in this paper, the test set intended for checking the generalization performance of the trained network was also used as the validation set in the training process ($v = t$).

each sample used for neural network training or testing was obtained by establishing an inverse doa mapping according to eq. (9) and averaging a large number of consecutive twaa snapshots according to eq. (10). the training and test sets contain ordered triplets of the format $(\mathrm{re}\{c'_{12}(\theta^{d}\,[^\circ],\, g\,[\mathrm{db}],\, snr\,[\mathrm{db}])\},\; \mathrm{im}\{c'_{12}(\theta^{d}\,[^\circ],\, g\,[\mathrm{db}],\, snr\,[\mathrm{db}])\},\; \theta^{d}\,[^\circ])$, where the samples are generated for different values of the angle $\theta^{d}$ and the parameter $g$.
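one such triplet can be sketched directly from the closed form of eq. (9), i.e. from the expected value of the normalized matrix rather than from averaged snapshots. the function name `training_triplet` is hypothetical, and the decibel conventions are assumptions, since the text does not state them: snr is converted as a power ratio (10·log10) and the root gain ratio g as an amplitude-type ratio (20·log10).

```python
import cmath
import math


def training_triplet(theta_deg, g_db, snr_db, d=0.5):
    """Return (re{c'12}, im{c'12}, theta) from the closed form of eq. (9).

    d is the element spacing in wavelengths.
    """
    g = 10.0 ** (g_db / 20.0)      # assumption: g treated as an amplitude-type ratio
    snr = 10.0 ** (snr_db / 10.0)  # assumption: snr treated as a power ratio
    phase = -2.0 * math.pi * d * math.sin(math.radians(theta_deg))
    c12 = g * snr / (snr + 1.0) * cmath.exp(1j * phase)
    return c12.real, c12.imag, theta_deg


# Broadside source, equal gains, snr = 20 db: c'12 is real and equals 100/101
print(training_triplet(0.0, 0.0, 20.0))
```

the angle enters only through the phase of $c'_{12}$, while $g$ and $snr$ scale its magnitude, so the network has to learn to read the angle from the phase across the whole range of magnitudes.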
the mlp_doa network training set is generated by a uniform distribution of the variables $\theta^{d}$ and $g$ as

$$p^{(snr)} = \left\{ \left(\mathrm{re}\{c'_{12}(\theta^{d}, g, snr)\},\; \mathrm{im}\{c'_{12}(\theta^{d}, g, snr)\},\; \theta^{d}\right) \;\middle|\; \theta^{d} \in [\theta^{d}_{min} : \theta^{d}_{step} : \theta^{d}_{max}],\; g \in [g_{min} : g_{step} : g_{max}] \right\} \tag{17}$$

where $\theta^{d}_{min}$, $\theta^{d}_{step}$ and $\theta^{d}_{max}$ are the minimum value, step, and maximum value of the angle $\theta^{d}$, respectively, and $g_{min}$, $g_{step}$ and $g_{max}$ are the minimum value, step, and maximum value of the parameter $g$ in the training set, respectively.

in order to assess the quality of network training and the quality of generalization of the trained network, and to make the final choice of the mlp_doa network architecture to be used for the implementation of the doa module, each trained network was tested on a test set that does not contain samples used in the training process. similar to the training set, the test set was generated by a uniform distribution of the variables $\theta^{d}$ and $g$ as

$$t^{(snr)} = \left\{ \left(\mathrm{re}\{c'_{12}(\theta^{d}, g, snr)\},\; \mathrm{im}\{c'_{12}(\theta^{d}, g, snr)\},\; \theta^{d}\right) \;\middle|\; \theta^{d} \in [\theta^{dt}_{min} : \theta^{dt}_{step} : \theta^{dt}_{max}],\; g \in [g^{t}_{min} : g^{t}_{step} : g^{t}_{max}] \right\} \tag{18}$$

where $\theta^{dt}_{min}$, $\theta^{dt}_{step}$ and $\theta^{dt}_{max}$ are the minimum value, step, and maximum value of the angle $\theta^{d}$ in the test set, respectively, and $g^{t}_{min}$, $g^{t}_{step}$ and $g^{t}_{max}$ are the minimum value, step, and maximum value of the parameter $g$ in the test set, respectively.

the following metrics were used in the neural network testing process: worst case error (wce), average test error (ate) and pearson product moment (ppm) correlation coefficient (rppm) [22].
worst case error is calculated as

$$wce = \max_{s=1,\ldots,n_t} \frac{\left|\theta(\mathbf{c}_s, w) - \theta_s^{d}\right|}{\theta^{d}_{max} - \theta^{d}_{min}}, \tag{19}$$

where $n_t$ is the total number of test set samples, $\theta(\mathbf{c}_s, w)$ is the output of the mlp_doa network when the sample $\mathbf{c}_s$ is brought to its input, and $\theta^{d}_{max}$ and $\theta^{d}_{min}$ are the maximum and minimum desired values of the angle $\theta$ in the test set, respectively. average test error is calculated as

$$ate = \frac{1}{n_t}\sum_{s=1}^{n_t} \frac{\left|\theta(\mathbf{c}_s, w) - \theta_s^{d}\right|}{\theta^{d}_{max} - \theta^{d}_{min}}. \tag{20}$$

the ppm correlation coefficient is calculated as

$$r^{ppm} = \frac{\sum_{s=1}^{n_t}\left(\theta(\mathbf{c}_s, w) - \bar{\theta}\right)\left(\theta_s^{d} - \bar{\theta}^{d}\right)}{\sqrt{\left[\sum_{s=1}^{n_t}\left(\theta(\mathbf{c}_s, w) - \bar{\theta}\right)^{2}\right]\left[\sum_{s=1}^{n_t}\left(\theta_s^{d} - \bar{\theta}^{d}\right)^{2}\right]}}, \tag{21}$$

where $\bar{\theta} = \frac{1}{n_t}\sum_{s=1}^{n_t}\theta(\mathbf{c}_s, w)$ represents the average value of the neural network output and $\bar{\theta}^{d} = \frac{1}{n_t}\sum_{s=1}^{n_t}\theta_s^{d}$ represents the average value of the expected output values.

4. modeling results

simulation of the twaa doa subsystem operation, generation of training and testing samples, and development and testing of the mlp_doa modules were performed in the matlab environment. the reference computer configuration used to implement the doa module and for all simulations was an intel core i7-9700f cpu @ 3 ghz with 16 gb ram.

the following modeling scenario was considered: the rg has a radiation power of 1 w (0 dbw) and its distance from the twaa is 100 m. the twaa wearer moves in the azimuth plane and the wearer's position in relation to the rg changes from -60° to +60°. as the wearer moves, the textile creases and the root gain ratio changes from -10 to 10 db. the antenna elements are at a constant distance $d = 0.5$ wavelengths and the number of snapshots is $n_s = 300$. for the development and testing of the mlp_doa module, training and test sets are formed using eqs. (17) and (18).
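the three test metrics of eqs. (19)-(21) can be sketched for plain lists of estimated and desired angles as follows. the function name `doa_test_metrics` is hypothetical; the values are returned as fractions of the desired angular span, and the sketch assumes that neither list is constant (otherwise the correlation denominator is zero).

```python
import math


def doa_test_metrics(predicted, desired):
    """Return (wce, ate, r_ppm) following eqs. (19)-(21).

    predicted: network outputs for the test samples
    desired:   reference angles of the same samples
    """
    n = len(desired)
    span = max(desired) - min(desired)  # theta_d_max - theta_d_min
    abs_err = [abs(p - d) for p, d in zip(predicted, desired)]
    wce = max(abs_err) / span
    ate = sum(abs_err) / (n * span)
    mean_p = sum(predicted) / n
    mean_d = sum(desired) / n
    num = sum((p - mean_p) * (d - mean_d) for p, d in zip(predicted, desired))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in predicted)
                    * sum((d - mean_d) ** 2 for d in desired))
    return wce, ate, num / den


# Made-up example: one of three estimates is off by 6 degrees over a 120-degree span
print(doa_test_metrics([-60.0, 6.0, 60.0], [-60.0, 0.0, 60.0]))
```

if the tabulated wce and ate values are percentages of the angular span, as the "(%)" column headings of table 2 suggest, the first two returned values would be multiplied by 100 before comparison.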
the training set, $p^{(20)}$, is formed for snr = 20 db, and test sets are formed for the following signal-to-noise ratio values: snr ∈ {20 db, 15 db, 10 db, 5 db, 0 db, -5 db} (denoted as $t^{(20)}$, $t^{(15)}$, $t^{(10)}$, $t^{(5)}$, $t^{(0)}$, $t^{(-5)}$). the following parameter values in eq. (17) were used to generate the training set: $\theta^{d}_{min} = -60°$, $\theta^{d}_{step} = 0.5°$, $\theta^{d}_{max} = 60°$, $g_{min} = -10$ db, $g_{step} = 1$ db and $g_{max} = 10$ db. in this way, a training set containing 5061 samples was generated. the following parameter values in eq. (18) were used to generate the test sets: $\theta^{dt}_{min} = -60°$, $\theta^{dt}_{step} = 0.7°$, $\theta^{dt}_{max} = 60°$, $g^{t}_{min} = -10$ db, $g^{t}_{step} = 1.3$ db, and $g^{t}_{max} = 10$ db. in this way, 2752 samples were generated for each test set.

the development phase of the doa module includes training and testing of a number of different mlp_doa networks, as well as selection of the mlp network with the best test characteristics for the implementation of the doa module. during this phase, it is assumed that the antenna environment is almost ideal in terms of noise, i.e., that the signal-to-noise ratio is snr = 20 db. therefore, the sets $p^{(20)}$ and $t^{(20)}$ were used to train and test the different mlp_doa networks.

for the implementation of the mlp_doa module, mlp architectures with two hidden layers ($h = 2$) and a variable number of neurons in them were considered. a number of different mlp networks having $n_1$ ≤ 8 neurons and $n_2$ ≤ 22 neurons were trained and tested. the levenberg-marquardt algorithm [22] was chosen to train the mlp_doa networks while tracking the achieved degree of network generalization on the validation set. during the training of the mlp_doa networks, the $t^{(20)}$ test set was used as the validation set. the following values of the training parameters were selected: $e_p^{target} = 10^{-6}$, $n_i^{max} = 1000$ and mvf = 20. testing of all trained mlp_doa networks was performed with the $t^{(20)}$ test set.
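the sample counts quoted above follow from the sizes of the two grids in eqs. (17) and (18). the short check below assumes that the ranges behave like a matlab-style start:step:stop range, i.e. that the last grid point never exceeds the stated maximum; the helper names `grid` and `sample_count` are hypothetical.

```python
import math


def grid(start, step, stop):
    """Values start, start+step, ... that do not exceed stop (start:step:stop)."""
    n = int(math.floor((stop - start) / step + 1e-9)) + 1
    return [start + k * step for k in range(n)]


def sample_count(theta_grid, g_grid):
    """One sample per (angle, root gain ratio) pair."""
    return len(theta_grid) * len(g_grid)


n_train = sample_count(grid(-60.0, 0.5, 60.0), grid(-10.0, 1.0, 10.0))
n_test = sample_count(grid(-60.0, 0.7, 60.0), grid(-10.0, 1.3, 10.0))
print(n_train, n_test)
```

the training grid has 241 angles and 21 gain values (241 × 21 = 5061), and each test grid has 172 angles and 16 gain values (172 × 16 = 2752), which matches the set sizes given in the text. because the test steps (0.7° and 1.3 db) differ from the training steps, most test points fall between the training grid points.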
worst case error (wce), average test error (ate) and correlation coefficient (rppm) were monitored during the test procedure in order to find the mlp_doa network capable of providing the angle of arrival of the rg signal on the twaa with the best accuracy. the eight mlp_doa networks with the best test statistics are shown in table 2. it can be seen that the mlp2-18-16 neural network has the lowest values of wce and ate and the highest value of rppm. therefore, this neural network was chosen for the implementation of the mlp_doa module. the test statistics obtained by the presented modelling approach are significantly better than the corresponding ones presented in [1], where the selected mlp_doa module (mlp2-10-5) had the following statistics: wce=2.7949, ate=0.3699 and rppm=0.9998546. this shows that the approach to training and selecting the appropriate mlp network architecture presented here significantly improves the accuracy of doa estimation compared to the classical approach to mlp_doa module training presented in [1]. the scattering diagram of the selected mlp2-18-16 neural network is shown in fig. 3. in this case, a very high accuracy of doa estimation can be observed.

since the mlp network of the mlp_doa module was trained and tested in almost ideal noise conditions (snr=20 db), it was necessary to test the mlp_doa module in an environment with increased noise in order to investigate the impact of noise on its accuracy. therefore, the mlp_doa module was tested in a noisy environment with an snr of 15 db, 10 db, 5 db, 0 db, and -5 db using the t(15), t(10), t(5), t(0), and t(-5) test sets, respectively. in order to compare the accuracy of the proposed ann approach to doa estimation of the rg signals with the classical approach based on super-resolution algorithms, a doa module based on the root music algorithm was also implemented (root music doa module).
testing of the root music doa module was performed under the same conditions and with the same test sets as in the case of the mlp_doa module.

table 2 testing results for mlp_anns with the best test statistics

mlp_doa network    wce (%)    ate (%)    rppm
mlp2-18-16         0.3466     0.0344     0.9999993
mlp2-14-11         0.3479     0.0432     0.9999980
mlp2-12-12         0.3735     0.0495     0.9999975
mlp2-15-11         0.3971     0.0596     0.9999964
mlp2-17-11         0.4368     0.0627     0.9999958
mlp2-22-10         0.4685     0.0488     0.9999974
mlp2-14-12         0.4837     0.0439     0.9999978
mlp2-14-11         0.5366     0.0571     0.9999965

fig. 3 scattering diagram of mlp2-18-16 neural network (snr = 20 db)

based on the test results, the accuracy of both modules was examined and compared for different snr values. the worst case errors, average test errors and correlation coefficients obtained by the mlp_doa module and by the root music doa module versus signal-to-noise ratio are shown in figs. 4-6. it is evident that both modules have very high accuracy in a low noise environment (snr=20 db, 15 db, 10 db). with increasing noise, i.e., decreasing snr, the accuracy of both modules decreases, and the decrease becomes significant for snr values below 5 db. however, in the case of increased noise, the proposed mlp_doa module achieves better results.

fig. 4 worst case error versus snr obtained by mlp_doa module and by the root music doa module

fig. 5 average test errors versus snr obtained by mlp_doa module and by the root music doa module

fig. 6 ppm correlation coefficient obtained by mlp_doa module and by the root music doa module

the scattering diagrams of both modules in the case of extremely high noise, snr = -5 db, are shown in figs. 7 and 8. fig. 7 shows the scattering diagram of the mlp_doa module. in this case, the following test statistics were obtained: wce=79.9515, ate=5.5533 and rppm=0.9175. fig.
8 shows the scattering diagram of root music doa module. in this case, the following test statistics were obtained: wce=116.0627, ace=6.5200 and rppm=0.8376. comparing the scattering diagrams of both modules, similar conclusions can be drawn as in the previous case. both modules show significant deviation of the output values from the referent (desired) ones for a large number of samples, however, the scattering in the case of the mlp_doa module is less than the scattering of the root music module, therefore, the mlp_doa module shows less accuracy reduction in conditions of intense noise than the root music module. fast doa estimation of the signal received by textile wearable antenna array based on ann model 583 fig. 7 scattering diagram obtained by the mlp_doa module in conditions with high noise level (snr = -5 db, solid line line of ideal value matching, dashed lines boundaries of the scattering area) in addition, the average program execution time, measured on the test set with 2752 samples, for the mlp_doa module is 0.008054 seconds and for the root music doa module is 0.366337 seconds (table 3). obviously, the mlp based doa module performs doa estimation significantly faster compared to the root music doa module (approximately 45 times faster). fig. 8 scattering diagram obtained by the root music doa module in conditions with high noise level (snr = -5 db, solid line line of ideal value matching, dashed lines boundaries of the scattering area) 584 z. stanković, o. pronić-rančić, n. dončov table 3 comparison of doa estimation speed of the mlp_doa module and the root music module measured on test set (intel core i7-9700f cpu @ 3 ghz, 16 gb ram) doa module run time @ 2752 samples (s) mlp_doa module 0.008054 root music doa module 0.366337 5. conclusion an improved mlp_doa module for fast doa estimation of the rg signal arrival angle on two-element textile wearable antenna array has been proposed. 
the multilayer perceptron network, which was used to create this module, learned to accurately determine the position of the radio gateway in the azimuth plane from the spatial correlation matrix obtained by sampling the rg signal at twaa. since the classical approach in mlp_doa module training, did not include mechanisms to control the achieved generalization capabilities of the mlp network, in this paper the training of mlp network was performed by monitoring the generalization capabilities on the validation set of samples. the obtained mlp_doa module has an extremely high accuracy of doa estimation in low noise conditions, i.e., better modelling accuracy was achieved compared to the results obtained by the classical approach in the training of the mlp_doa module. in addition, the proposed module was compared with the root music algorithm in terms of accuracy and execution time of the program. the selected mlp_doa module was shown to have approximately the same accuracy as the root music doa module in the case of low noise conditions and less degradation of the model accuracy in a very noisy environment. besides, mlp_doa module performs doa estimation approximately 45 times faster compared to the root music doa module. creasing of textiles can cause the center frequencies of the antenna elements of twaa to shift, as well as change the distance between the antenna elements. this leads to the effect of changing the phase difference of the signals received by the antennas regardless of the change in the angular position of the rg. this effect limits the accuracy of the mlp_doa module. therefore, further research will be aimed at increasing the accuracy of the mlp_doa module by developing the methods to reduce this effect. one of the methods that will be applied is the training of mlp_doa network with the samples of rg signals emitted at two different frequencies. 
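the creasing effect described above can be made explicit with the standard far-field relation for a two-element array. this is a textbook sketch rather than the authors' model: the symbols $d$ (element spacing), $f$ (signal frequency), $c$ (propagation speed), $\theta$ (azimuth angle, assumed here to be measured from the array broadside) and $\delta d$ (spacing change caused by creasing) are introduced only for illustration.

$$\Delta\varphi = \frac{2\pi f d}{c}\,\sin\theta$$

$$\Delta\varphi' = \frac{2\pi f\,(d + \delta d)}{c}\,\sin\theta = \Delta\varphi + \frac{2\pi f\,\delta d}{c}\,\sin\theta$$

for a fixed $\theta \neq 0$, a change in spacing alone therefore shifts the phase difference, and a network trained for the nominal spacing $d$ would read that shift as a change in the angular position of the rg.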
also, during further research, an mlp_doa module for twaa with more than two antenna elements will be developed.

acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia (grant no. 451-03-9/2021-14/200102).

references
[1] z. stanković, o. pronić-rančić and n. dončov, "ann based doa estimation of the signal received by two-element textile wearable antenna array", in proceedings of the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks), 2021, pp. 86-91.
[2] cisco white paper, "cisco visual networking index: global mobile data traffic forecast update, 2016-2021 white paper", march 2017.
[3] z. lin et al., "a low-power, wireless, real-time, wearable healthcare system", in proceedings of the ieee mtt-s international wireless symposium (iws), 2016, pp. 1-4.
[4] t. liang and y. j. yuan, "wearable medical monitoring systems based on wireless networks: a review", ieee sensors j., vol. 16, no. 23, pp. 8186-8199, dec. 2016.
[5] c. lin et al., "wireless and wearable eeg system for evaluating driver vigilance", ieee trans. biomed. circuits syst., vol. 8, no. 2, pp. 165-176, april 2014.
[6] v. misra et al., "flexible technologies for self-powered wearable health and environmental sensing", proc. ieee, vol. 103, no. 4, pp. 665-681, april 2015.
[7] s. saponara, "wearable biometric performance measurement system for combat sports", ieee trans. instrum. meas., vol. 66, no. 10, pp. 2545-2555, oct. 2017.
[8] n. f. m. aun, p. j. soh, a. a. al-hadi, m. f. jamlos, g. a. e. vandenbosch and d. schreurs, "revolutionizing wearables for 5g: 5g technologies: recent developments and future perspectives for wearable devices and antennas", ieee microw. mag., vol. 18, no. 3, pp. 108-124, 2017.
[9] b. mohamadzade, r. m. hashmi, r. b. v. b. simorangkir, r. gharaei, s.
ur rehman and q. h. abbasi, "recent advances on fabrication methods for flexible antennas in wearable devices: state of the art", sensors, vol. 19, no. 10, p. 2312, 2019.
[10] a. sabban, "small new wearable metamaterials antennas for iot, medical and 5g applications", in proceedings of the 14th european conference on antennas and propagation (eucap), 2020, pp. 1-5.
[11] h. lee, j. tak and j. choi, "wearable antenna integrated into military berets for indoor/outdoor positioning system", ieee antennas wirel. propag. lett., vol. 16, pp. 1919-1922, 2017.
[12] s. m. saeed, c. a. balanis, c. r. birtcher, a. c. durgun and h. n. shaman, "wearable flexible reconfigurable antenna integrated with artificial magnetic conductor", ieee antennas wirel. propag. lett., vol. 16, pp. 2396-2399, 2017.
[13] s. su and y. hsieh, "integrated metal-frame antenna for smartwatch wearable device", ieee trans. antennas propag., vol. 63, no. 7, pp. 3301-3305, july 2015.
[14] m. virili, h. rogier, f. alimenti, p. mezzanotte and l. roselli, "wearable textile antenna magnetically coupled to flexible active electronic circuits", ieee antennas wirel. propag. lett., vol. 13, pp. 209-212, 2014.
[15] p. j. soh et al., "a smart wearable textile array system for biomedical telemetry applications", ieee trans. microw. theory techn., vol. 61, no. 5, pp. 2253-2261, may 2013.
[16] l. c. godara, "application of antenna arrays to mobile communications, ii: beamforming and direction-of-arrival considerations", proc. ieee, vol. 85, pp. 1195-1245, 1997.
[17] m. i. miller and d. r. fuhrmann, "maximum likelihood narrow-band direction finding and the em algorithm", ieee trans. acoust., speech signal processing, vol. 38, no. 9, pp. 1560-1577, 1990.
[18] r. schmidt, "multiple emitter location and signal parameter estimation", ieee trans. antennas propag., vol. 34, no. 3, pp. 276-280, 1986.
[19] r. roy and t. kailath, "esprit-estimation of signal parameters via rotational invariance techniques", ieee trans.
acoust., speech signal process., vol. 37, no. 9, pp. 984-995, 1989.
[20] v. v. reddy, m. mubeen and b. poh ng, "reduced-complexity super-resolution doa estimation with unknown number of sources", ieee signal process. lett., vol. 22, no. 6, pp. 772-776, 2015.
[21] s. haykin, neural networks, new york, ieee press, 1994.
[22] q. j. zhang and k. c. gupta, neural networks for rf and microwave design, boston, artech house, 2000.
[23] a. hirose, complex-valued neural networks: advances and applications, wiley, 2013.
[24] z. stanković, n. s. dončov, i. milovanović and b. milovanović, "1d doa estimation of mobile stochastic em sources with a high level of correlation using mlp-based neural model", electromagnetics, vol. 38, no. 8, pp. 500-516, 2018.
[25] z. stanković, n. dončov, i. milovanović, b. d. milovanović, "doa estimation of mobile stochastic em sources with variable radiation powers using hierarchical neural model", int. j. rf microwave computer-aided eng., vol. 29, no. 10, p. e21901, pp. 1-17, 2019.
[26] m. agatonović, z. stanković, i. milovanovic, n. s. dončov, l. sit, t. zwick, b. d. milovanović, "efficient neural network approach for 2d doa estimation based on antenna array measurements", prog. electromagn. res., pier 137, vol. 137, pp. 741-758, 2013.

facta universitatis
series: electronics and energetics vol. 30, no 4, december 2017, pp. 557-570
doi: 10.2298/fuee1704557j

information system for the centralized display of the transport comfort information *

željko jovanović 1, ranko bačević 1, radoljub marković 1, siniša ranđić 1, dragan janković 2
1 university of kragujevac, faculty of technical sciences, čačak, serbia
2 university of niš, faculty of electronic engineering, niš, serbia

abstract. this paper introduces an information system for presenting a road comfort map. the map is generated based on the conducted transportations.
as a basis for the information system and the source of the comfort information, a previously developed android application is used. it calculates comfort parameters using three-axis accelerometer values. the calculated data are recorded into files in a defined format. recorded files are uploaded to the information system to be viewed and analyzed. as a final result of all recorded transportations, it is possible to generate a road comfort map. the paper presents the current functionality of the system and the current road comfort map of the covered roads in serbia. based on the collected data, about 50% of transportation intervals were comfortable, 44% were moderately uncomfortable, and 6% were uncomfortable.

key words: android, transport comfort, gis, comfort map

1. introduction

the term transport comfort cannot be strictly defined, but it is of great importance in the assessment of transport quality. the problem is that the feeling of comfort is subjective and different for every person. comfort depends on many factors, such as acceleration (vibration), noise, temperature, compartment space, etc. if only mechanical effects are of interest, then the acceleration and vibration that passengers feel during the ride generally have the greatest impact on passenger comfort. vibrations are caused by three factors: vehicle condition, driver skills (driving style), and road condition. as for vehicle condition, the suspension system and tires are the vehicle parts that affect vibration the most.
nowadays, some vehicle suspension systems have active suspension control for better comfort and vehicle stability. driver skills and driving style are also important. on the same road and with the same vehicle, two different drivers could provide different comfort for their passengers. sharp turning, sudden braking, and accelerating are usually marked as uncomfortable actions.

received november 14, 2016; received in revised form march 2, 2017
corresponding author: željko jovanović, university of kragujevac, faculty of technical sciences, čačak, serbia (e-mail: zeljko.jovanovic@ftn.kg.ac.rs)
* an earlier version of this paper received best paper award in computer science section at 60th conference on electronics, telecommunications, computers, automation and nuclear engineering (etran 2016), june 13-16, zlatibor, serbia [1]

road conditions are probably the most important for the passenger's comfort and safety. they can be categorized as static and dynamic factors. static factors are commonly associated with a location, like road bumps and potholes. dynamic factors appear suddenly, like rain, snow, or landslides. the impact of other traffic participants is also a significant dynamic factor. from the above it is evident that many factors affect comfort. it is very important to make transportation as comfortable as possible. uncomfortable transportation affects the mental and physical condition of even healthy passengers. the impact on passengers with health problems is even greater, since uncomfortable driving could worsen their medical condition. since the discomfort location is one of the most important pieces of information for assessing transport comfort, the use of geographic information systems (gis) is of great importance. nowadays, there are several commonly used gis systems, e.g. openlayers, arcgis, openstreetmaps, geomedia, and googlemaps.
usability and the possibility of integration into other application increase their popularity. the aforementioned gis systems are under constant development and new functionalities are implemented almost every day. its interactivity with the users in real time provides an increasing amount of information. the dynamics of development and user interaction information can be seen in the example of latest googlemaps gis novelty. the route from point a to the point b is colored according to the traffic jams detected on presented location. information for appropriate road color marking is gathered from numerous users of google maps navigation. as a basis for generating road maps of comfort, which are presented in this paper, features of googlemaps gis are used. the aim of the information system presented in this paper is to generate road comfort maps according to the information gathered from the users of the client android application. client android application calculate comfort parameters based on the accelerometer and gps data. the paper is organized as follows. related work is presented. android application functionalities and implemented calculations for transport comfort are demonstrated. after that, functionalities and usage of the developed web-based information system are demonstrated. at the end, overall information gathered by developed information system is presented. 2. related work in 1972, the international organization for standardization (iso) issued a standard: "a guide to the evaluation of human exposure to whole-body vibration" [2] which is still in general use. it is used for the evaluation of working conditions and exposure to the vibrations. the effect of vibration on health, at work, sitting, and other life situations is described in the paper [3]. higher vibration exposure has negative health effects. for vehicles, transport comfort is most affected by tires, suspension, shock absorbers, seats, etc. 
the impact of the suspension system on passenger comfort on various types of roads is presented in papers [4-6]. they show that the suspension system has a great positive effect on comfort but cannot eliminate discomfort completely. besides vibrations, the noise produced by tires may also affect the passenger's acoustic comfort, as presented in [7].

gathering information from nodes into a centralized unit is a current trend. wireless sensor networks allow data collection and centralized processing, as in paper [8]. this is sometimes called swarm intelligence [9]. the role of smartphones is increasing in this area of research. the reason is that smartphones, equipped with sensors such as an accelerometer, gyroscope, and gps, keep increasing their processing capabilities. some phones have almost as much processing power as classic computers. the paper [10] presented a system based on mobile phones to detect potholes on the roads. the phones were placed in taxi vehicles and recorded the locations of detected discomfort. for detection, only the vertical (z) axis was used. in the paper [11], smartphones are used to monitor conditions during transport; potholes, bumps, and siren sounds are detected.

comfort calculations are usually based on accelerometer signal processing. an accelerometer detects dynamic movements and is also affected by static gravity. for appropriate dynamic calculations, it is necessary to eliminate the static gravity influence from the accelerometer signal values. this is usually done by implementing a signal filter. the authors of [12, 14] developed automotive real-time observers and an attitude estimation system based on an extended kalman filter (ekf). the authors of [15] used a high-pass filter for road pothole detection. the duration of vibration exposure and the interaxial influence also need to be addressed.
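the gravity-elimination step mentioned above can be sketched with a simple first-order filter: a low-pass stage tracks the slowly varying gravity component, and subtracting it from each sample leaves the dynamic part. this is a generic illustration, not the filter used in the cited works or in the application described in this paper; the function name `remove_gravity`, the smoothing factor `alpha` and its default value are assumptions.

```python
def remove_gravity(samples, alpha=0.8):
    """High-pass filter three-axis accelerometer samples.

    samples: iterable of (ax, ay, az) tuples in m/s^2, gravity included.
    alpha:   smoothing factor of the gravity estimate (illustrative value).
    Returns a list of (x, y, z) tuples with the slowly varying part removed.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    gravity = None
    dynamic = []
    for sample in samples:
        if gravity is None:
            # Assumption: the first sample is taken as the initial gravity estimate.
            gravity = list(sample)
        else:
            # Low-pass stage: exponential moving average of the raw signal.
            gravity = [alpha * g + (1.0 - alpha) * s
                       for g, s in zip(gravity, sample)]
        # High-pass output: raw sample minus the estimated gravity.
        dynamic.append(tuple(s - g for s, g in zip(sample, gravity)))
    return dynamic
```

a larger `alpha` makes the gravity estimate slower and lets more low-frequency motion through to the output; the right value depends on the sampling rate and on which vibrations should be kept.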
in [16], the authors did not observe any statistically significant differences in discomfort between 10-, 15- and 20-second vibration exposures. in [17], the authors showed that single-axis vertical vibrations were typically associated with less discomfort than multi-axis vibrations. also, different sensitivity for different axes was detected for similar ranges of vibration. accordingly, data from all axes need to be collected for appropriate comfort level classification. although the vertical axis is the most influential, the others cannot be ignored.

the use of artificial intelligence is increasing in this field of research. in [18], a neural network was used to analyze the quality of public transport. in [19], a bayesian network was used for recognizing the mode of transport. an artificial intelligence implementation requires a large amount of training data.

the work presented in this paper is based on the android application whose development and use are presented in the paper [20]. comfort is measured using values obtained from the triaxial accelerometer, which help determine the level of passenger comfort. in addition to the accelerometer, gps is used to detect the discomfort location. accelerometer signals are passed through a high-pass filter to eliminate the static gravity influence. it is set up to cut off 10% of low frequency signals. the decision time interval is set to 10 s, and all three axes are used for comfort level classification. the developed information system will be used to collect a large amount of information that could be used for an artificial intelligence implementation in future work.

3. information system architecture

the developed information system is realized as a client-server system. the client part is an android application which measures comfort parameters during transportation. the server part is a java web application with google maps support for data presentation.
the block diagram of the developed client-server information system usage is presented in fig. 1.

fig. 1 block diagram of the developed information system usage

as presented, information gathered from all client users is stored in one database (in the server part of the client-server system), which allows transportation and road comfort analysis. the longer the information system is in use, the more significant the collected information becomes and the more detailed the analyses that can be performed. since smartphones are widely used nowadays, there is a large number of potential users for the developed information system.

4. android application – client application

the android application is based on three-axis accelerometer data calculations in the standard ten-second time interval. the development of the main application functionalities was presented in [20], using rxjava [21] for accelerometer calculations, gps monitoring, and the main application thread. to classify transport comfort, it was necessary to determine the comfort levels.

4.1. comfort levels

the iso standard [2] is in general use for comfort level determination. it assumes that acceleration magnitude, frequency spectrum, and duration represent the principal exposure variables, which account for the potentially harmful effects. at the national level (serbia), there is the standard ics 13.160 (srps iso 2631-1:2014 mechanical vibration and shock: evaluation of human exposure to whole-body vibration, part 1: general requirements), which is a translated version of the iso standard [2]. in [2-3], the authors have shown that the sensitivity of the human body at different frequencies depends on the intensity of acceleration. the effective value of the acceleration (arms) for a discrete system is calculated according to equation (1):

$$a_{zrms} = \sqrt{\frac{1}{n}\left(a_{z1}^{2} + a_{z2}^{2} + \dots + a_{zn}^{2}\right)} \qquad (1)$$

where $a_{zi}$ is the i-th z-axis (vertical axis) acceleration sample and n=200 is the number of samples. in real use of the android application, the accelerometer is sampled 20 times per second and the decision interval is set to 10 s. according to [2], comfort levels are defined as presented in table 1. only the vertical axis is used for this comfort level classification.

table 1 comfort levels according to iso standard 2631-1 [2]

| arms [m/s^2] | comfort level |
|---|---|
| 0-0.315 | not uncomfortable |
| 0.315-0.63 | a little uncomfortable |
| 0.5-1 | fairly uncomfortable |
| 0.8-1.6 | uncomfortable |
| 1.6-2.5 | very uncomfortable |
| > 2 | extremely uncomfortable |

in the developed android application, the arms values of all axes are used for the comfort level classification. three comfort levels are chosen and defined as presented in table 2.

table 2 used comfort levels in the developed android application

| arms [m/s^2] | comfort level |
|---|---|
| 0-0.315 | comfortable |
| 0.315-1 | little uncomfortable |
| > 1 | uncomfortable |

by comparing table 1 and table 2, it can be seen that the first comfort level (comfortable) from table 2 is the same as the first comfort level (not uncomfortable) from table 1. the little and fairly uncomfortable levels from table 1 are merged into one level (little uncomfortable) in table 2. also, the very and extremely uncomfortable levels from table 1 are merged into one level (uncomfortable) in table 2.

4.2. all implemented calculations

the android application is designed to calculate parameters in standard time intervals. accumulated vibrations are calculated according to equation (1). this calculation is performed for all three axes (rms_x, rms_y, and rms_z). besides these values, for every acceleration sample, the application calculates the magnitude of all three-axis accelerations (2):

$$a_{irms} = \sqrt{a_{xi}^{2} + a_{yi}^{2} + a_{zi}^{2}} \qquad (2)$$

where $a_{xi}$, $a_{yi}$, $a_{zi}$ are the x-, y-, and z-axis accelerations in the i-th sample, and $a_{irms}$ is the magnitude of all three-axis accelerations.
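equations (1) and (2) and the thresholds of table 2 can be sketched as follows. this is a minimal illustration, not the application's actual code: the function names are hypothetical, and the handling of values that fall exactly on a threshold (0.315 and 1 m/s^2) is an assumption, since table 2 does not say which level they belong to.

```python
import math


def axis_rms(values):
    """Equation (1): effective (rms) value of one axis over a decision interval."""
    values = list(values)
    if not values:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(v * v for v in values) / len(values))


def magnitude(ax, ay, az):
    """Equation (2): magnitude of one three-axis acceleration sample."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def comfort_level(a_rms):
    """Map an rms acceleration in m/s^2 to the three levels of table 2.

    Assumption: boundary values are assigned to the lower level.
    """
    if a_rms < 0:
        raise ValueError("rms acceleration cannot be negative")
    if a_rms <= 0.315:
        return "comfortable"
    if a_rms <= 1.0:
        return "little uncomfortable"
    return "uncomfortable"
```

for a ten-second interval sampled 20 times per second, `axis_rms` would be called with the 200 samples of each axis, and `magnitude` once per sample.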
according to these, arms for the decision time interval is calculated according to equation (3):

$$a_{rms} = \sqrt{\frac{1}{n}\left(a_{1rms}^{2} + a_{2rms}^{2} + \dots + a_{nrms}^{2}\right)} \qquad (3)$$

where $a_{irms}$ is the i-th magnitude of all three-axis accelerations, and n is the number of samples. for the most uncomfortable sample, the maximum magnitude value (apeak) and its three-axis values (apeak_x, apeak_y, and apeak_z) are calculated. for the detected apeak, the gps data (latitude, longitude, speed, and time) are stored. besides these, the maximum accelerations on all three axes (max_x, max_y, and max_z) are calculated. as a result of the decision time interval calculations, the following values are saved:
- idt: location marker id
- rms_x: x-axis calculation according to equation (1)
- rms_y: y-axis calculation according to equation (1)
- rms_z: z-axis calculation according to equation (1)
- arms: acceleration magnitude according to equation (3)
- apeak: acceleration magnitude maximum value in the values calculated by equation (2)
- apeak_x: x-axis acceleration value in apeak
- apeak_y: y-axis acceleration value in apeak
- apeak_z: z-axis acceleration value in apeak
- latitude: gps data for apeak
- longitude: gps data for apeak
- time: gps data for apeak
- speed: gps data for apeak
- max_x: x-axis real value where absolute maximum x-axis value is detected
- max_y: y-axis real value where absolute maximum y-axis value is detected
- max_z: z-axis real value where absolute maximum z-axis value is detected
- comfort: (0=comfortable; 1=a little uncomfortable; 2=uncomfortable)

4.3. file formats

the developed android application saves the final result in files on the mobile device. measurement data are encoded in the file name, separated by the symbols "--" (two hyphens). these are: the time when the measurement was taken, the title and description of the measurement, a unique user id, and the measurement id. below is the example of the file name format.
    // format
    date--title--description--userid--measurementid.txt
    // example
    2016-04-02-09-02-52--transport--mladenovac to belgrade--13--183.txt

the data that need to be stored in the database are also separated with "--" symbols. each file row holds the calculated data for one decision interval. the order of the values in a file row needs to match the column order in the database for a successful insert. the symbol ";" is used as the row delimiter. below is the file row format and a sample data row.

    // format
    measurementid--userid--rmsx--rmsy--rmsz--arms--apeakx--apeaky--apeakz--apeak--latitude--longitude--time--speed--maxx--maxy--maxz--description;
    // example
    183--13--0.0627--0.033--0.0917--0.1159--0.4161---0.0571--0.0059--0.42--44.4534864--20.6799459--2016-04-02-09-03-01--0--0.4161---0.2515--0.295--merenje;

a value preceded by three hyphens, such as ---0.0571, is the "--" delimiter followed by a negative number.

5. web application for centralized data processing – server application

the server part of the developed client-server information system (vibromap) is realized as a java web application. it is developed as a multilayer application in the model view controller (mvc) architecture. for data presentation, the view part of the mvc, java server pages (jsp) are used. the controller part is developed using servlets, while the model is based on java beans and data access object (dao) classes for communication with the mysql database. the developed information system allows its users to preview and analyze saved transportations (created by the developed client android application) in a gis form. the server part of the client-server information system has two types of users: registered users and administrators. the functionalities available to a vibromap registered user are presented by the use-case diagram shown in fig. 2.

fig. 2 the vibromap use-case diagram

after registration and successful login, the registered user can upload the created measurements, preview them in gis or chart form, and also delete them. the list of measurements can be filtered by date (today or a date range). the main functionality is the measurement upload. the measurement upload algorithm is presented in fig. 3.

fig. 3 the measurement upload algorithm

files successfully created with the developed android application can be uploaded to vibromap by registered users by choosing the measurement upload functionality. a choose file window appears and the created file needs to be selected. the selected file is sent to the controller part of vibromap, to the servlet called servletuploadfile. it creates a temporary file from the uploaded file and checks its name format and content format. after a successful check, the database is checked for the measurementid of the uploaded file. if the id doesn't exist, the file data are stored in the database. if the id exists, then the number of rows in the file and in the database (rows with the uploaded measurementid) are compared. if the number of file rows is higher than the number of database rows, then the database rows are deleted. this functionality is implemented for future work on real-time data upload from the android application, because the loss of internet connectivity could lead to incomplete data recording in the database. after a successful upload, the data are presented in gis format using google maps. every file row is presented with an appropriate marker on the map. in the case of an error in any of the presented validations, an appropriate error message is displayed and the data upload is canceled. copying the data from the file into the database is performed by an sql query in the following format.

    load data local infile file_location
    into table table_name
    fields terminated by '--'
    lines terminated by ';'
    (column names, ...);

file data are first stored in the temporary table.
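the file name format, the row format and the upload decision described above can be sketched as follows. this is an illustration only, not the servlet's actual code: the field names in `ROW_FIELDS`, the function names and the action labels "insert", "replace" and "reject" are hypothetical. in particular, the text only states that the database rows are deleted when the file has more rows; storing the file afterwards and rejecting the upload otherwise are assumptions.

```python
from datetime import datetime

# Assumed field order, following the row format given in the text.
ROW_FIELDS = (
    "measurement_id", "user_id", "rms_x", "rms_y", "rms_z", "arms",
    "apeak_x", "apeak_y", "apeak_z", "apeak", "latitude", "longitude",
    "time", "speed", "max_x", "max_y", "max_z", "description",
)
TIME_FORMAT = "%Y-%m-%d-%H-%M-%S"


def parse_file_name(file_name):
    """Split 'date--title--description--userid--measurementid.txt'."""
    if not file_name.endswith(".txt"):
        raise ValueError("expected a .txt file")
    parts = file_name[:-4].split("--")
    if len(parts) != 5:
        raise ValueError("expected 5 parts separated by '--'")
    date, title, description, user_id, measurement_id = parts
    return {
        "date": datetime.strptime(date, TIME_FORMAT),
        "title": title,
        "description": description,
        "user_id": int(user_id),
        "measurement_id": int(measurement_id),
    }


def parse_rows(text):
    """Parse ';'-terminated rows whose values are separated by '--'."""
    rows = []
    for raw in text.split(";"):
        raw = raw.strip()
        if not raw:
            continue
        # 'a---b' splits into 'a' and '-b', so negative values survive.
        values = raw.split("--")
        if len(values) != len(ROW_FIELDS):
            raise ValueError("unexpected number of values in row")
        row = dict(zip(ROW_FIELDS, values))
        for key in ("measurement_id", "user_id"):
            row[key] = int(row[key])
        for key in ROW_FIELDS[2:12] + ROW_FIELDS[13:17]:
            row[key] = float(row[key])
        row["time"] = datetime.strptime(row["time"], TIME_FORMAT)
        rows.append(row)
    return rows


def upload_action(file_row_count, db_row_count):
    """Decide what to do with an uploaded measurement.

    db_row_count is None when the measurement id is not in the database.
    """
    if db_row_count is None:
        return "insert"
    if file_row_count > db_row_count:
        return "replace"  # assumption: delete old rows, then store the file
    return "reject"       # assumption: nothing new to store
```

one limitation of such a delimiter-based format is that a title or description containing "--" or ";" would break the parsing, so free-text fields would need to be validated before the file is written.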
then, depending on the marker type, data are copied to the appropriate table (marker or analysis). after a successful copy, the uploaded file and the temporary data are deleted.

besides the registered user type, vibromap has an administrator user type. an administrator can preview the data of all registered users and generate the overall road comfort map, which is the most important functionality presented in this paper. this functionality is realized using googlemaps gis. google released the googlemaps api in june 2005, which allowed its integration into third-party applications. the integration is performed by client-side scripting using javascript and ajax. as the default map center, the latitude and longitude of the city of čačak, serbia, are chosen, with a zoom value of 8. this location is chosen since most of the measurements were conducted in its surroundings. to take advantage of gis, data with location parameters like latitude and longitude need to be passed to the map. these data are usually stored in a server-side application database. the googlemaps api is used from a client-side language (javascript). server-side java functionalities are included to perform the database queries. by combining the server-side java programming language and the client-side javascript programming language, the appropriate data arrays for gis presentation are created. every array member is presented by one marker on the map. as additional functionality, actions can be added to the markers by defining the appropriate marker listener using the google.maps.event.addListener function. this enables an info window preview on the marker click action, in which all calculated data for the clicked location are presented.

5. presentation of the usage scenario of the developed information system

with every measurement uploaded to the web application, new data about transportation comfort are stored in the database. the road comfort map becomes more complete and more comprehensive.
the presented system is in usage since december 2015. first upload was on december 7, with measurement created on relation ĉaĉak-užice. until competition of this paper, 38 successful file uploads were performed. fig. 4 presents all collected data (zoomedout). 566 ž. jovanović, r. baĉević, r. marković, s. ranđić, d. janković fig. 4 the complete road comfort map (zoomed-out) the presented information system is mostly developed on faculty of technical sciences in ĉaĉak. therefore, most of the measurements were conducted from the city of ĉaĉak, serbia, to the other cities. at this stage, comfort map could be analyzed on several destinations: ĉaĉak-užice, ĉaĉak-beograd, ĉaĉak-kraljevo, ĉaĉak-kragujevac, and the part of mladenovac-beograd destination. on the zoomed-out map it is not easy to detect the comfort value (marker color) for certain location. since the map is interactive, it could be zoomed into the desired location for a detailed preview. the road comfort map of ĉaĉak main roads is presented in fig. 5. fig. 5 the road comfort map, city of ĉaĉak, serbia since the developed android application is most widely used for intercity relations, only the main roads are marked. the marker color presents first level of information for measured comfort level: information system for generating road comfort maps 567  green comfortable interval,  yellow semi-uncomfortable interval,  red – uncomfortable interval. every presented marker contains information for ten seconds driving interval around its location. marker location is a location with highest detected discomfort value (max apeak) in ten seconds comfort decision interval. detailed information (second level of information) are presented by click on the desired marker in the info window form. this is shown in fig. 6. fig. 6 the preview of all calculated data for the desired location, presented by clicking on the desired marker in the info window form as presented in fig. 
6, besides the gps data (latitude, longitude, time, and speed), all parameters calculated from the accelerometer signals are displayed. thanks to these, it is possible to perform additional analyses of the comfort, or of the cause of discomfort. as the number of uploaded measurements increases, the map becomes more complete and provides more information. it is important to mention that some routes were measured only once, while some have several conducted measurements. fig. 7 presents the part of the čačak-beograd route which was measured three times. by analyzing fig. 7, it can be seen that some locations from different measurements are marked with markers of different colors, and some with markers of the same color. in the case of same-color markers, every measurement detected the same comfort level. in the case of different-color markers, the comfort level detected by different measurements differed. there are several causes for this situation. the first cause is the time elapsed between measurements. it is possible that a location that was comfortable developed a problem in the meantime (the appearance of potholes, bumps, landslides ...) and became uncomfortable in the next measurement. it is also possible that an uncomfortable location was repaired in the meantime by the road maintenance department and therefore became comfortable. the second cause lies in the variety of vehicles used for the measurements. to be precise, on the same road different vehicles provide different comfort levels for their passengers, according to the vehicle class and condition. in any case, the presented comfort is detected inside the vehicle.

fig. 7 the preview of the route with multiple conducted measurements

the third cause is the driver and his or her driving skills. a driver with more experience adjusts the driving to the road conditions.
this could result in comfortable driving even though the road conditions are not satisfactory. the opposite scenario is also possible, when a less experienced driver makes sharp turns or brakes hard because of not being able to adapt to the road conditions. the fourth cause lies in the influence of the other traffic participants. their activities, like overtaking or braking, could result in a lower comfort level. based on all measurements performed so far, it is possible to analyze the transportation comfort statistics. the statistics for all 38 performed measurements are shown in table 3.

table 3 the statistics of performed measurements
title                          value (%)
number of measurements         38
total number of markers        6923
comfortable markers            3500 (50.56)
semi-uncomfortable markers     3012 (43.50)
uncomfortable markers          411 (5.94)

since the developed android application is used only in intercity transportation, the presented statistics demonstrate the transportation comfort for a part of serbia's primary roads. according to the results, around 6% of driving intervals were very uncomfortable, while almost 44% had some cause of discomfort. overall, only about 50% of intervals were comfortable.

6. conclusion

in this paper, a web-based information system for generating transport comfort maps is presented. the measurement of transportation comfort is conducted using the android application presented in [20]. the presented information system provides centralized data functionalities and a gis preview in the form of interactive googlemaps. the advantage of the presented information system is its web-based approach, whereby the data are available to all users of the system worldwide. the information that the system provides can be of great importance when choosing the route for the next transportations.
if some of the already recorded paths are going to be used, reviewing the previous transportations on the route before setting off on the trip can provide information about the expected road conditions. the implementation of the googlemaps api enables an efficient and interactive display of the measured data. the benefits of the developed information system increase with every newly conducted measurement. as future work, some improvements are planned. the first one is the grouping of close markers into a cluster of markers, providing one overall piece of information based on the data of all gathered markers. this would provide road comfort information based on all conducted transportations at the presented location. the second one is communication between the client android application and the server part of the information system. the comfort values detected by the android application could be transferred to the developed information system in real time. this would allow an insight into the current status of each vehicle with an active android app in real time. the third one is the implementation of artificial intelligence in comfort level recognition, where the collected data would be used for its training.

acknowledgment: the work presented in this paper was funded by grant no. tr32043 for the period 2011-2016 from the ministry of education and science of the republic of serbia.

references
[1] z. jovanovic, r. bacevic, r. markovic, s. randjic and d. jankovic, "information system for generating road comfort maps", in proceedings of the 60th international conference etran, zlatibor, serbia, 2016, rt 5.6.
[2] iso 2631-1:1997 mechanical vibration and shock - evaluation of human exposure to whole-body vibration - part 1: general requirements.
[3] m. j. griffin, handbook of human vibration, elsevier, 1996.
[4] f. yi and s. zhang, "ride comfort simulation under random road based on multi-body dynamics", in 3rd international workshop on intelligent systems and applications, ieee, 2011, pp. 1-3.
[5] j. sun and q. yang, "advanced suspension systems for improving vehicle comfort", in proceedings of the international conference on automation and logistics, ieee, 2009, pp. 1264-1267.
[6] s.a. abu bakar, p.m. samin and a.a. azhar, "modelling and validation of vehicle ride comfort model", applied mechanics and materials, vol. 554, pp. 515-519, 2014.
[7] j. ahmad kadri, w.m. wan zuki azman, m.n. zulkifli, m.j.m. nor, a. kamal ariffin and m. hosseini fouladi, "a study on the effects of tyre to vehicle acoustical comfort in passenger car cabin", in proceedings of the 3rd international conference on computer research and development, ieee, 2011, pp. 342-345.
[8] nikolic, n. neskovic, r. antic and a. anastasijevic, "industrial wireless sensor networks as a tool for remote on-line management of power transformers’ heating and cooling process", facta universitatis, series: electronics and energetics, vol. 30, no. 1, pp. 107-119, 2017.
[9] f. elfouly, r. ramadan, m. mahmoud and m. dessouky, "swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network", facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 339-355, 2016.
[10] mednis, g. strazdins, r. zviedris, g. kanonirs and l. selavo, "real time pothole detection using android smartphones with accelerometers", in proceedings of the international conference on distributed computing in sensor systems and workshop, dcoss’11, 2011.
[11] p. mohan, v.n. padmanabhan and r. ramjee, "trafficsense: rich monitoring of road and traffic conditions using mobile smartphones", in proceedings of the 6th acm conference on embedded network sensor systems, acm, 2008, pp. 323-336.
[12] j. cuadrado, d. dopico, j. perez, et al., "automotive observers based on multibody models and the extended kalman filter", multibody system dynamics, vol. 27, no. 1, pp. 3-19, 2012.
[13] j. lee, e. park, s. robinovitch, "estimation of attitude and external acceleration using inertial sensor measurement during various dynamic conditions", ieee transactions on instrumentation and measurement, vol. 61, no. 8, pp. 2262-2273, 2012.
[14] j. blanco-claraco, j. torres-moreno, a. giménez-fernández, "multibody dynamic systems as bayesian networks: applications to robust state estimation of mechanisms", multibody system dynamics, vol. 34, no. 2, pp. 103-128, 2015.
[15] j. eriksson, l. girod, b. hull, r. newton, s. madden and f. balakrishnan, "the pothole patrol: using a mobile sensor network for road surface monitoring", in proceedings of the 6th international conference on mobile system, applications and services, 2008, pp. 29-39.
[16] j. dickey, m. oliver, p. boileau, t. eger, l. trick and a. edwards, "multi-axis sinusoidal whole-body vibrations: part i how long should the vibration and rest exposures be for reliable discomfort measures?", journal of low frequency noise, vibration and active control, vol. 25, no. 3, pp. 175-184, 2006.
[17] j. dickey, t. eger, m. oliver, p. boileau, l. trick and a. edwards, "multi-axis sinusoidal whole-body vibrations: part ii relationship between vibration total value and discomfort varies between vibration axes", journal of low frequency noise, vibration and active control, vol. 26, no. 3, pp. 195-204, 2009.
[18] c. garrido, r. de oña, j. de oña, "neural networks for analyzing service quality in public transportation", expert systems with applications, vol. 41, no. 15, pp. 6830-6838, 2014.
[19] g. xiao, z. juan, c. zhang, "travel mode detection based on gps track data and bayesian networks", computers, environment and urban systems, vol. 54, pp. 14-22, 2015.
[20] z. jovanovic, r. bacevic, r. markovic, s. randjic, "android application for observing data streams from built-in sensors using rxjava", in proceedings of the 23rd telecommunications forum telfor, ieee, 2015, pp. 918-921.
[21] reactivex, (n.d.).
http://reactivex.io/ (accessed october 5, 2015).

facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 627-638 doi: 10.2298/fuee1704627a

a novel supply voltage compensation circuit for the inverter switching point

alexandru-mihai antonescu, lidia dobrescu
faculty of electronics, telecommunication and information technology, university “politehnica” of bucharest, romania

abstract. the present work proposes an innovative circuit that is able to compensate the variation of the inverter switching point voltage caused by supply voltage changes. the circuit is designed to work for a 1.6v to 2v supply voltage range. the operation principle includes the back gate effect and an original transistor switching scheme.

key words: adaptive threshold, inverter, back gate effect

1. introduction

high working-frequency precision and low current consumption are important prerequisites for a modern circuit design. logic gate delays are used for periodic signal generation and time synchronization. at high speed, the gate delay approach is worth considering, due to its low power consumption, reduced area, simplicity of design and suitability for large-scale integration. when logic gates are used as delays, the gate delay changes with the supply voltage variation. in the case of a ring oscillator, the frequency lowers as the supply voltage rises, due to the fact that the stage capacitors are charged to a higher voltage. the inverter schematic is shown in figure 1. the usual approach is to design the inverter switching point to be half of the supply voltage. the switching point of the inverter is denoted vsp. the transfer characteristic for the case stated above is shown in figure 2.
received january 24, 2017; received in revised form may 18, 2017
corresponding author: lidia dobrescu, faculty of electronics, telecommunication and information technology, university “politehnica” of bucharest, 1-3 iuliu maniu boulevard, district 6, bucharest, romania (e-mail: lidiutdobrescu@yahoo.com)

fig. 1 cmos inverter schematic

fig. 2 cmos inverter transfer characteristic [1]

in region 1 of the inverter transfer characteristic, m2 is in the on state and m1 in the off state. as the input voltage rises above the m1 threshold voltage, m1 enters the conduction state and m2 remains in the on state. this is the entry point into region 2. as the input voltage further increases, m1 turns on completely. the switching point, marked c, lies in the middle of the second region. in the second region, both transistors are in the saturation state. as the circuit leaves the second region, the m2 transistor starts to turn off and, eventually, reaches the fully off state as the circuit enters the third region. for a design with the switching point at half the supply voltage, both devices have the same drain current. under this assumption, equation 1 can be written [1]:

$\frac{\beta_n}{2}(V_{sp} - V_{thn})^2 = \frac{\beta_p}{2}(V_{dd} - V_{sp} - V_{thp})^2$ (1)

after some processing, one obtains equation 2 [1], which gives the switching point voltage of the inverter; βn and βp are the nmos and pmos transistor transconductances, and vthp and vthn are the pmos and nmos device thresholds:

$V_{sp} = \frac{\sqrt{\beta_n/\beta_p}\,V_{thn} + (V_{dd} - V_{thp})}{1 + \sqrt{\beta_n/\beta_p}}$ (2)

the ratio between the pmos and nmos device geometries can be estimated using equation 3, where x is a factor between 2 and 4, depending on the technology used:

$(W/L)_p = x\,(W/L)_n$ (3)

2. controlling the inverter switching point

if the inverter is at the switching point, rp1 and rp2 form a resistor divider.
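as a reading aid, equation 2 can be evaluated numerically. the following is a minimal python sketch, not taken from the paper: the function name, the threshold values (0.5v) and the transconductance ratios are illustrative assumptions, and vthp is entered as a magnitude.

```python
import math

def switching_point(vdd, vthn, vthp, beta_ratio):
    """Inverter switching point according to equation 2.

    vdd        -- supply voltage [V]
    vthn       -- NMOS threshold [V]
    vthp       -- PMOS threshold magnitude [V]
    beta_ratio -- beta_n / beta_p (dimensionless)
    """
    k = math.sqrt(beta_ratio)
    return (k * vthn + (vdd - vthp)) / (1.0 + k)

# Assumed, illustrative device values (not from the paper).
VTHN = 0.5
VTHP = 0.5

for vdd in (1.6, 1.8, 2.0):
    balanced = switching_point(vdd, VTHN, VTHP, 1.0)   # beta_n == beta_p
    skewed = switching_point(vdd, VTHN, VTHP, 4.0)     # stronger NMOS
    print(f"vdd={vdd:.1f} V  balanced vsp={balanced:.3f} V  skewed vsp={skewed:.3f} V")
```

with equal transconductances and equal threshold magnitudes the sketch returns half the supply voltage, and in every case the switching point moves with the supply, which is the variation the proposed circuit is meant to compensate.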
the output resistances of the nmos and pmos devices are given by equations 4 and 5, where λ is the channel-length modulation parameter:

$r_{on} = \frac{1}{\lambda\,\frac{\beta_n}{2}(V_{gs} - V_{thn})^2}$ (4)

$r_{op} = \frac{1}{\lambda\,\frac{\beta_p}{2}(V_{sg} - V_{thp})^2}$ (5)

continuing with the resistor divider analogy, the inverter switching point is given by the following equation:

$V_{sp} = V_{dd}\,\frac{r_{on}}{r_{tot}}, \quad r_{tot} = r_{on} + r_{op}$ (6)

a decrease of the threshold of the nmos device results in an increase of the drain current and a decrease of the device resistance. likewise, a decrease of the absolute value of the pmos threshold results in an increase of the device current and a decrease of the device resistance. the decrease of the pmos resistance lowers rtot while ron is unchanged, so the value of vsp rises. on the nmos side, the decrease of the device resistance lowers the ratio ron/rtot, so vsp goes down. these statements are expressed in a simplified manner in equations 7 and 8:

$V_{thn}\downarrow \;\Rightarrow\; I_{dn}\uparrow \;\Rightarrow\; r_{dsn}\downarrow \;\Rightarrow\; V_{sp}\downarrow$ (7)

$|V_{thp}|\downarrow \;\Rightarrow\; I_{dp}\uparrow \;\Rightarrow\; r_{dsp}\downarrow \;\Rightarrow\; V_{sp}\uparrow$ (8)

the transistor threshold can be modified using the back gate effect, by regulating the bulk voltage accordingly and sensing the threshold on an inverter with the output tied to the input. as stated in [2], the transistor threshold is influenced by the bulk voltage according to equation 9, where γ is the body effect coefficient and φf the fermi potential:

$V_{th} = V_{th0} + \gamma\left(\sqrt{|2\phi_F + V_{sb}|} - \sqrt{|2\phi_F|}\right)$ (9)

in figure 3, the inverter schematic with the bulk control voltages and the intrinsic device diodes is shown.

fig. 3 back gate controlled cmos inverter

in normal operation, the bulk is tied to the source and diodes dp1 and dn2 are shorted out. the other two diodes, dp2 and dn1, are reverse biased [5] and only a small current crosses them. when the source-bulk voltage is applied, diodes dp1 and dn2 become forward biased and, if a sufficiently large voltage is applied to the bulk of the transistor, the diodes can enter conduction.
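the bulk-voltage dependence of the threshold (equation 9) can be illustrated with a short python sketch. it uses the standard textbook form of the body effect, with a positive source-bulk voltage meaning reverse bias; the values of vth0, γ and φf below are assumptions chosen only for illustration and do not come from the paper.

```python
import math

def threshold(vth0, gamma, phi_f, vsb):
    """Body-effect threshold, textbook form of equation 9.

    vth0  -- zero-bias threshold [V]
    gamma -- body effect coefficient [V^0.5]
    phi_f -- Fermi potential [V]
    vsb   -- source-bulk voltage [V]; negative values forward-bias the bulk diode
    """
    return vth0 + gamma * (math.sqrt(abs(2 * phi_f + vsb)) - math.sqrt(abs(2 * phi_f)))

# Assumed, illustrative parameters (not from the paper).
VTH0, GAMMA, PHI_F = 0.5, 0.4, 0.35

# Sweep the forward back bias up to the 550 mV design limit mentioned in the text.
for forward_bias_mv in (0, 100, 250, 400, 550):
    vsb = -forward_bias_mv / 1000.0
    print(f"forward bias {forward_bias_mv:3d} mV -> vth = {threshold(VTH0, GAMMA, PHI_F, vsb):.3f} V")
```

under these assumptions a forward bulk bias lowers the threshold and a reverse bias raises it, which is the handle the regulation loops use to move the switching point.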
this situation is illustrated in figure 4.

fig. 4 bulk diode forward biasing

as figure 4 shows, the maximum back bias voltage should not exceed 600mv. for the sake of a safe design, the back bias voltage is designed not to exceed 550mv. the downside of bulk biasing is that a small current flows across the forward biased bulk diode no matter how small the bulk voltage is. also, the nmos device has to be isolated in a separate well. the pmos transistor is already isolated by the nwell in which it is constructed. the good part is that the diodes start to conduct only with a drop of at least 0.6v. also, from the point of view of area, all the isolated nmos devices can share the same well [5].

3. adaptive threshold circuit

3.1. modified inverter schematic

first, the variation of the inverter switching point with the supply voltage must be evaluated. then, the supply voltage range is split into two domains, symmetrical around the centre value. for this particular case, the centre value of the supply voltage is 1.8v, the minimum and maximum being 1.6v and 2v, respectively. the switching point voltage is increased for the low range of the supply and decreased for the high range. with reference to the original inverter schematic, the original switching point characteristic and the adjusted one cross at the middle of the supply range. simulations lead to the conclusion that the supply range is too large to be adjusted by bulk regulation alone. therefore, the approach of switching a part of the original transistors for each range has been taken into consideration. the new inverter schematic is shown in figure 5.

fig. 5 modified inverter schematic

a similar principle is proposed in [3]. this is done by using multiple fingers for the nmos transistor.
these transistors are of the floating gate type and need to have their threshold programmed. by programming or erasing, one or several nmos devices are added or removed. this method, although very effective, cannot be used to adjust the inverter switching point continuously. the inverter is made out of transistors p1, p3, n1, and n2; without switching (p2 and n3 are conducting), p1 and p3 act as an equivalent pmos transistor of 4w channel width, and n1 and n2 as an nmos transistor of 2w channel width. the p2 and n3 transistors are in charge of disconnecting transistors p3 and n2, respectively, in the regions where the inverter threshold cannot be adjusted by bulk regulation alone. the two transistors used as switches are controlled by the signals disc_p and disc_n. p3 represents ¼ of the total pmos width and n2 ½ of the nmos width. the bulks are common for the p1, p3 pair and for the n1, n2 pair, and are to be controlled externally. by disconnecting the p3 pmos transistor (p2 is off), the inverter threshold is lowered. in the same manner, by disconnecting the n2 nmos transistor (n3 is off), the inverter threshold rises. the centre of the supply span is taken as reference. this is the reason why the 1.8 volt supply is the point where the circuit switches from increasing the switching point voltage to decreasing it. that means that the circuit switches from regulating the pmos bulk voltage to regulating the nmos bulk voltage. regulating both transistor bulks at once would not bring any functional improvement, because the effects are in opposite directions. switching p3 off (which has ¼ of the total pmos width) decreases the switching point voltage. on the other hand, switching n2 off (which has ½ of the total nmos width) increases the switching point voltage. this is the reason for the two extra sub-domains that control the switching of p3 and n2; this brings a rough adjustment where the body effect can no longer tune the vsp variation. to better describe the functionality of the circuit, the voltage domains are shown in figure 6.

fig. 6 threshold adjustment principle

perfect symmetry of the domains is desired, but it would be very hard to obtain. the goal is to lay out the complementary pmos and nmos transistors as a number of identical fingers having the same dimensions. also, the width of the composing finger may be a fractional number, but with only one digit after the decimal point. there is no use in trying to obtain high precision when switching the extra transistors, because their effective length and width will change due to process variation. as a result, the maximum bulk voltage will be different between the pmos and nmos devices. this is also due to the different mobility and geometry of the complementary devices. in the end, the purpose of this circuit is to obtain a threshold with a low variation and to keep the bulk voltages under the maximum limits. the circuit needs two regulation loops, one for the pmos bulk and one for the nmos bulk, and a supply voltage monitoring block. the supply monitor is composed of 3 independent circuits, each signalling one of the voltage intervals shown in figure 6.

3.2. adaptive threshold circuit

the proposed adaptive threshold circuit is shown in figure 7.

fig. 7 adaptive threshold circuit

the bulk voltages are generated using regulated current to voltage converters [6]. the centre inverter, which contains the schematic shown in figure 5, is used as the switching point reference. the pmos bulk voltage is regulated using the operational amplifier oa1, nmos transistor n3, resistor r1 and the ilim1 current source (the latter being a client of the general biasing mirror). oa1 compares the switching point voltage of inverter inv with the vthr_adj voltage.
if vsp is lower, the vbulk_p voltage is increased and the threshold goes up. on the other hand, the nmos bulk voltage is regulated by oa2, pmos transistor p6 and current source ilim2. oa2 compares the vsp value with the vthr_adj voltage and regulates the vbulk_n node voltage in order to decrease the threshold. both ilim values are set to 5 µa so that the voltage drop across r1 and r2 is limited to about 500mv. with a 10% variation of the bias current, the maximum bulk voltage will not exceed the safe operation area depicted in figure 4. the value of the vthr_adj voltage is the inverter switching point at the middle of the supply domain; in this case, for a supply of 1.8v, it is 0.75v. setting this parameter correctly is critical in order to obtain the lowest variation. the supply monitor block has three outputs, comp_1v7, comp_1v8 and comp_1v9. comp_1v7 controls the dis_n signal of the inverter, disconnecting the extra nmos transistor for a supply voltage under 1.7v and connecting it when the supply exceeds this limit. the comp_1v9 signal controls the dis_p signal, keeping the extra pmos connected for supply voltages under 1.9v and disconnecting it when this voltage limit is exceeded. comp_1v8 controls which of the pmos or nmos bulks is regulated. for supply voltages under 1.8v, the pmos bulk is regulated, and over 1.8v, the nmos bulk is regulated. the operational amplifiers have basically the same schematic; the only difference is that oa1 pulls its output down when disabled and oa2 pulls it up. this is done in order to disconnect n3 and p6 when the opamps are off. in this manner, the pmos bulk is pulled up, and the nmos bulk pulled down, by resistors r1 and r2. the supply monitor schematic is shown in figure 8.

fig. 8 supply monitoring circuit

the resistor divider has been drawn as four independent resistors (normally it contains several resistor fingers, each finger having the same number of squares and the same resistance value); the bias for the operational amplifiers is omitted in the picture. each of the amplifiers uses an external 1 µa bias current. the resistor divider, composed of r1, r2, r3, r4 and the disabling p1 transistor, scales each supply voltage threshold down to the value of the system voltage reference vbg=1.2v. the three opamps, oa1, oa2 and oa3, compare the two values (the respective tap of the resistor divider and the bandgap voltage reference) and switch the output to logic 1 when the resistor tap voltage exceeds the bandgap voltage value. the p1 transistor has the role of switching off the resistor divider, in order to reduce the current consumption when the circuit is disabled. in simulation, the reference voltage is provided by an ideal voltage source. the schematic of the comparator is depicted in figure 9.

fig. 9 comparator schematic

the comparator uses a trans-admittance amplifier. the topology was chosen for its capability of driving capacitive loads. the p2 and p3 pmos transistors act as active loads for the differential input stage. the input offset is optimised by providing high matching between the differential pair bias currents. this is done by using a topology that offers high precision biasing for the differential input stage of the comparator. to enhance the schematic even further, the nmos current mirror (n7-n8), which regulates the current between the active load current mirrors (p1-p2 and p3-p4), is cascoded with transistors n5 and n6. the cascode topology also enhances the output resistance of the block.

3. simulation, results and discussion

figure 10 shows the circuit operation. the source-bulk and bulk-source voltages define the two operation modes.

fig. 10 inverter switching point and bulk voltages simulation

the regions where the back bias increases (pmos) and decreases (nmos) rapidly represent the switching points of the extra transistors. according to the simulation, the inverter switching point can be tuned by the body effect alone only between 1.7v and 1.9v supply voltage. for the nmos bulk regulation, the bulk voltage almost reaches the maximum safe operation voltage. the simulation was done for a 1.6-2v supply voltage sweep, using cadence spectre. the temperature was set to 27ºc, and the typical corner was used for the simulation. finer adjustments can be made. for example, the extra nmos that is disconnected can have a slightly lower width. although this adjustment is possible, it would require working with transistor fingers of 0.5w or smaller, reaching the minimum technological size. comparing the original threshold with the adjusted one, it can be concluded that the circuit operates as intended. figure 11 presents the difference between the original inverter, with all transistors working and no bulk regulation, and the same inverter with both bulk regulation and transistor switching.

fig. 11 comparison between the original and improved inverter switching point

the two characteristics come close to meeting at the 1.8v supply voltage value. there is a small offset due to the fact that the original inverter had a switching point slightly higher than 0.75v at vdd=1.8v, while the regulated value was chosen to be 0.75v. also, when reaching the upper part of the supply range, the nmos bulk regulation can no longer cope with the variation. the same thing happens around half the designed supply voltage, but there the threshold decreases.
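the four supply domains set by the supply monitor outputs (section 3.2) can be summarised as a small decision function. this is a hedged python sketch of the described behaviour only: the function and key names are hypothetical, and the handling of a supply exactly equal to a limit (strict comparison) is an assumption, since the text does not specify it.

```python
VBG = 1.2  # bandgap reference [V], as stated in the text

def tap_ratio(limit_v):
    """Divider ratio that scales a supply limit down to the bandgap reference."""
    return VBG / limit_v

def supply_domain(vdd):
    """Decode the supply monitor outputs into the inverter configuration.

    Each comparator output goes to logic 1 when its divider tap exceeds VBG,
    i.e. when vdd exceeds the corresponding limit (strict '>' is an assumption).
    """
    comp_1v7 = vdd * tap_ratio(1.7) > VBG
    comp_1v8 = vdd * tap_ratio(1.8) > VBG
    comp_1v9 = vdd * tap_ratio(1.9) > VBG
    return {
        "extra_nmos_connected": comp_1v7,       # disconnected below 1.7 V
        "extra_pmos_connected": not comp_1v9,   # disconnected above 1.9 V
        "regulated_bulk": "nmos" if comp_1v8 else "pmos",
    }

for vdd in (1.65, 1.75, 1.85, 1.95):
    print(vdd, supply_domain(vdd))
```

the sketch makes the symmetry of the scheme visible: bulk regulation alone covers the two inner domains, while the outer domains add the coarse step of disconnecting one extra transistor.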
the values of the relevant signals for the nominal 25ºc simulation are listed in table 1:

table 1 circuit relevant signals absolute values for the nominal corner
                  vsp_adj   vsp_orig   vsb_p    vbs_n
nom (vdd=1.8v)    0.7504    0.7572     0.0743   0
min               0.7433    0.6801     0        0
max               0.7601    0.8368     0.2902   0.5282
delta             0.0168    0.1567
delta [%]         2.24      20.7

the switching point deltas in the table represent the total variation relative to the value at the central 1.8v supply voltage (0.75v). the adjusted switching point variation is almost ten times lower than the original one. the overall technological corner variation is depicted in table 2.

table 2 inverter switching point overall corner variation

detailed corner simulation results are given in table 3 and figure 12.

table 3 inverter switching point technological corners simulation results
corner    temp. [°c]   min. [v]   max. [v]   delta [v]   delta [%]
fast      -40          0.7409     0.7667     0.0258      3.44
fast_hh   -40          0.7326     0.7595     0.0269      3.59
fast_hl   -40          0.7326     0.7595     0.0269      3.59
fast_lh   -40          0.7484     0.7736     0.0252      3.36
fast_ll   -40          0.7484     0.7736     0.0252      3.36
fast      25           0.7389     0.7633     0.0244      3.25
fast_hh   25           0.7389     0.7545     0.0156      2.08
fast_hl   25           0.7389     0.7545     0.0156      2.08
fast_lh   25           0.7478     0.7717     0.0239      3.19
fast_ll   25           0.7478     0.7717     0.0239      3.19
fast      85           0.7474     0.7636     0.0162      2.16
fast_hh   85           0.7375     0.7531     0.0156      2.08
fast_hl   85           0.7375     0.7531     0.0156      2.08
fast_lh   85           0.7481     0.7739     0.0258      3.44
fast_ll   85           0.7481     0.7739     0.0258      3.44
slow      -40          0.7472     0.773      0.0258      3.44
slow_hh   -40          0.7391     0.7658     0.0267      3.56
slow_hl   -40          0.7391     0.7658     0.0267      3.56
slow_lh   -40          0.7491     0.7797     0.0306      4.08
slow_ll   -40          0.7491     0.7797     0.0306      4.08
slow      25           0.7487     0.7679     0.0192      2.56
slow_hh   25           0.7434     0.7593     0.0159      2.12
slow_hl   25           0.7434     0.7593     0.0159      2.12
slow_lh   25           0.749      0.7761     0.0271      3.61
slow_ll   25           0.749      0.7761     0.0271      3.61
slow      85           0.7483     0.7665     0.0182      2.43
slow_hh   85           0.7401     0.7563     0.0162      2.16
slow_hl   85           0.7401     0.7563     0.0162      2.16
slow_lh   85           0.7487     0.7765     0.0278      3.71
slow_ll   85           0.7487     0.7765     0.0278      3.71
typ_hh    -40          0.7356     0.7622     0.0266      3.55
typ_hl    -40          0.7356     0.7622     0.0266      3.55
typ_lh    -40          0.7488     0.7763     0.0275      3.67
typ_ll    -40          0.7488     0.7763     0.0275      3.67
typ_hh    25           0.7408     0.7565     0.0157      2.09
typ_hl    25           0.7408     0.7565     0.0157      2.09
typ_lh    25           0.7483     0.7735     0.0252      3.36
typ_ll    25           0.7483     0.7735     0.0252      3.36
typ_hh    85           0.7384     0.7542     0.0158      2.11
typ_hl    85           0.7384     0.7542     0.0158      2.11
typ_lh    85           0.7484     0.7747     0.0263      3.51
typ_ll    85           0.7484     0.7747     0.0263      3.51

fig. 12 inverter switching point – technological corners simulation results

4. conclusions

the present work proposes a compensation technique for the variation of the inverter switching point with the supply voltage. it is based on bulk voltage regulation and variable inverter geometry. the circuit has a maximum switching point variation over the power supply range of 4% (simulated in technological corners, including temperature variation). this represents the difference between the maximum and minimum vsp values, and it is computed using the 0.75v switching point value (at 1.8v supply voltage) as reference. compared to the original inverter (simulated in the nominal corner at 25ºc), the compensation scheme brings an improvement of 16% in vsp variation (or a 5 times lower variation). the compensation technique implies a variation of the nodal capacitances according to the circuit operation and the supply voltage variation. for this reason, precautions are needed when using the proposed circuit for generating time delays. an oscillator circuit is a typical application for the proposed circuit. a schematic is proposed in [9].
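the variation figures quoted above can be cross-checked from the tabulated values. the python sketch below is an illustrative check, not part of the original work; it assumes that the delta[%] entries of table 1 are taken relative to each column's nominal value at vdd=1.8v (which reproduces the printed 20.7), and that the worst corner delta of table 3 is referred to 0.75v as stated in the text.

```python
def variation(v_min, v_max, v_ref):
    """Return (delta in volts, delta in percent of v_ref)."""
    delta = v_max - v_min
    return delta, 100.0 * delta / v_ref

# Table 1, nominal corner: adjusted and original switching point.
adj_delta, adj_pct = variation(0.7433, 0.7601, 0.7504)
orig_delta, orig_pct = variation(0.6801, 0.8368, 0.7572)

# Largest delta listed in table 3 (0.0306 V), referred to the 0.75 V target.
worst_corner_pct = 100.0 * 0.0306 / 0.75

print(f"adjusted: {adj_delta:.4f} V ({adj_pct:.2f} %)")
print(f"original: {orig_delta:.4f} V ({orig_pct:.1f} %)")
print(f"nominal improvement factor: {orig_delta / adj_delta:.1f}")
print(f"worst corner: {worst_corner_pct:.2f} %")
print(f"worst corner vs original: {orig_pct - worst_corner_pct:.1f} points, "
      f"factor {orig_pct / worst_corner_pct:.1f}")
```

under these assumptions the nominal-corner improvement comes out slightly above nine ("almost ten times"), while comparing the worst corner with the original nominal variation gives roughly 16 percentage points, or a factor of about five, in line with the conclusions.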
the additional circuitry for bulk voltage regulation and transistor switching brings both additional area and additional power consumption. for the typical application presented in [9], the frequency accuracy increased almost 6 times for, at most, 20% higher power consumption.

acknowledgement: the paper includes the results of a distinct research work at the starting point of the first author’s phd thesis.

references
[1] r. jacob baker, “circuit design, layout and simulation”, ieee press series on microelectronic systems, 2002.
[2] yannis tsividis, “operation and modeling of the mos transistor, second edition”, oxford university press, 2003.
[3] j. segura, j. l. rosselló, j. morra, and h. sigg, “a variable threshold voltage inverter for cmos programmable logic circuits”, ieee journal of solid-state circuits, vol. 33, no. 8, august 1998.
[4] p. gray, r. meyer, p. hurst, s. lewis, “analysis and design of analog integrated circuits”, john wiley & sons, inc., 2009.
[5] a. hastings, “the art of analog layout”, prentice hall, 1997.
[6] d. johns, k. martin, “analog integrated circuit design”, john wiley & sons, inc., 2011.
[7] g. streel, d. bol, “study of back biasing schemes for ulv logic from the gate level to the ip level”, journal of low power electronics and applications, 2014.
[8] c. toumazou, g. moschytz, b. gilbert, “trade-offs in analog circuit design”, kluwer academic publishers, 2004.
[9] a. antonescu, l. dobrescu, “70 mhz oscillator circuit based on constant threshold inverters”, in proceedings of the 10th international symposium on advanced topics in electrical engineering, bucharest, march 25-27, 2017.

facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 639-646 doi: 10.2298/fuee1704639d

exact analytical solutions of continuously graded models of flat lenses based on transformation optics

mariana dalarsson 1, raj mittra 2
1 department of physics and electrical engineering, linnaeus university, växjö, sweden
2 emc laboratory, department of electrical engineering, the pennsylvania state university, university park, pa, usa

abstract. we present a study of exact analytic solutions for electric and magnetic fields in continuously graded flat lenses designed utilizing transformation optics. the lenses typically consist of a number of layers of graded index dielectrics in both the radial and longitudinal directions, where the central layer in the longitudinal direction contributes the bulk of the phase transformation, while the other layers act as matching layers and reduce the reflections at the interfaces of the middle layer. such lenses can be modeled as compact composites with continuous permittivity (and, if needed, permeability) functions which asymptotically approach unity at the boundaries of the composite cylinder. we illustrate the proposed procedures by obtaining the exact analytic solutions for the electric and magnetic fields for one simple special class of composite designs with radially graded parameters. for this purpose we utilize the equivalence between the helmholtz equation of our graded flat lens and the quantum-mechanical radial schrödinger equation with coulomb potential, furnishing the results in the form of kummer confluent hypergeometric functions. our approach allows for a better physical insight into the operation of our transformation optics-based graded lenses and opens a path toward novel designs and approaches.

key words: flat lenses, graded permittivity and permeability models, transformation optics, exact analytical solutions

1.
introduction

flat-lens designs based on transformation optics (to) and using left-handed (negative refractive index) metamaterials have been discussed in a number of recent publications ([1], [2]). basically, using the electromagnetic design, one is able to design a lens with the full functionality of a conventional lens, but compressed in space and possibly having additional functionalities. it is possible to do this in a wide range of operating frequencies, including microwave, terahertz and optical. however, the metamaterial composites proposed for such designs may be difficult to manufacture, especially when the required values of relative magnetic permeability and relative dielectric permittivity are less than unity, as argued in [3].

received april 4, 2017; received in revised form may 21, 2017
corresponding author: mariana dalarsson, department of physics and electrical engineering, linnaeus university, 351 95 växjö, sweden (e-mail: mariana.dalarsson@lnu.se)

in order to avoid problems with fabrication of metamaterials with suitable values of magnetic permeabilities, it is possible to fix the value of $\mu$ and to vary only $\varepsilon$ to create the desired refractive index $\sqrt{\varepsilon\mu}$, but at the cost of decreasing the efficiency of the composite lenses [4]. in [4] a plano-concave lens has been designed with metamaterials to obtain a gain above 13 db in the frequency band between 10 and 12 ghz. such a lens has a narrow bandwidth typical for a majority of designs using metamaterials. the conventional flat lens designs, using the ray optics (ro) approach, avoid the above-mentioned difficulties with to designs, but they do not have the same flexibility to control the phase and amplitude of the fields within the lens structure. an approach to remedy the drawbacks of both to and ro designs is the field manipulation (fm) method, described in [3].
the studies of the flat-lens design approaches mentioned above, however, generally require a direct numerical approach in solving the field equations. in the present paper, we use an alternative approach and investigate the possibilities to identify and study some special designs that allow for exact solutions of the field equations, analogous to those obtained in studying various planar and cylindrical metamaterial structures [5]–[11]. the main motive for pursuing analytical solutions of the problems involving flat lenses is that the detailed knowledge of the analytical structure of the field solutions may provide additional insights leading to improved or even entirely new designs. we apply our approach to a specific case of a gradient-index (grin) flat lens.

2. problem formulation and field equations

the graded index (grin) approach to the design of a flat lens is based on the concept of field transformation, similar to that proposed by luneburg for the design of spherical lenses [12]. similarly to luneburg's approach, a desired field distribution in the output port (the exit aperture) is specified and the medium parameters of the intervening medium are determined such that the given field distribution in the input port (input aperture) is transformed to the desired field distribution in the exit plane. in many practical cases, this can be performed by tracing rays through a designed inhomogeneous medium. the design parameters of the lens include center frequency, focal length, thickness, and gain. the physical size (diameter d) of the lens will depend on the gain and the radial model function (e.g. radial dependence of the permittivity). one typical design layout is shown in fig. 1. the design goal is to maximize the performance of the lens, and for that purpose we want to realize the desired phases on the face b of the lens while simultaneously maximizing the transmission coefficient over a broad frequency band.
the problem is typically solved using a multi-layer structure, with the desired phase at the center frequency and a transmission coefficient as close to one as possible over a broad frequency band for each of the ten rings shown in fig. 1. in fig. 1 the following symbols are used:
t – thickness of the lens
f – focal length of the lens
i – phase of the plane wave incident from the left on the face a
a, b – the notation for the two faces (a and b) of the lens

fig. 1 flat grin lens. left: cross section (side view) of the lens showing layers; right: top view of the lens

the middle layer performs the majority of the phase transformation, while the other layers act as matching layers to maximize the transmission of the waves incident from either side (graded antireflection structure). in the present approach we model the discrete structure shown in fig. 1 by a cylindrical composite structure with the electric permittivity and permeability being continuous spatial functions

$\varepsilon = \varepsilon(r, z), \quad \mu = \mu(r, z)$ (1)

where $(r, \varphi, z)$ is the set of cylindrical coordinates and the structure is centered around the z-axis. we consider a case of te-wave propagation through the structure, so that the electric and magnetic fields are

$\mathbf{E} = E(r, z)\,\mathbf{e}_\varphi, \quad \mathbf{H} = H_r(r, z)\,\mathbf{e}_r + H_z(r, z)\,\mathbf{e}_z$ (2)

here we note that the choice of te-waves is by no means a restriction, and writing an analogous procedure for tm-waves is straightforward. in the case of te-waves as described by (2), maxwell's equations for the scalar field components reduce to the relations (3), which express $H_r(r, z)$ and $H_z(r, z)$ through derivatives of $E(r, z)$, and to the relation (4). substituting equations (3) into (4), we obtain the helmholtz equation (5) for the electric field $E(r, z)$ or, introducing a new function, the equivalent equation (6). the displayed forms of equations (3)–(6) did not survive in this copy of the text. the equation (5), or (6), is quite general.
after choosing suitable model functions for $\varepsilon(r, z)$ and $\mu(r, z)$, if we can determine the analytic solution for the electric field $E(r, z)$, then using (3) we can readily obtain the magnetic field components $H_r(r, z)$ and $H_z(r, z)$ as well. the challenge is therefore to find suitable model functions for $\varepsilon(r, z)$ and $\mu(r, z)$ that provide a reasonable resemblance of actual design structures like the one described in table 1 and fig. 2 of [3].

3. analytics of a simple model of composite designs

at this stage, we need to restrict the form of the functions (1) to allow for a suitable analytical solution. let us here consider the simple model (7), with a radially dependent permittivity $\varepsilon(r)$. in (7) we require that at large distances ($r \to \infty$) the composite permittivity $\varepsilon(r)$ becomes unity, which describes the gradual transition to the free space outside the structure. this is simultaneously the condition for the antireflective behavior of the lens surface and thus the maximum input electromagnetic flux. utilizing (7) and separating variables, with the field written as a product of a function of $r$ and a function of $z$, the equation (6) gives rise to two ordinary differential equations, (8) and (9), for these two functions. as indicated in (8), the solutions for the z-dependent factor are simple plane waves propagating in the z-direction, and we only need to solve the radial equation (9). introducing a radial function rescaled by $\sqrt{r}$, the equation (9) becomes (10). let us now introduce two constants, whereby the equation (10) becomes the well-known radial schrödinger equation (11), where we notice the analogy (12) between the parameters of the electromagnetic equation (11) and the parameters of the usual quantum-mechanical radial schrödinger equation. since we require that the potential term vanishes when $r \to \infty$, the simplest model that we can adopt is the coulomb potential (13), where α is a constant that must be chosen to provide the best fit to the presented graded model. the displayed forms of equations (7)–(13) did not survive in this copy of the text.
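the chain of substitutions outlined above can be retraced in a few lines. the following is only a sketch under stated assumptions: unit relative permeability, a purely radial relative permittivity $\varepsilon(r)$, and $k = \omega/c$. the symbols $R$, $Z$, $u$, $k_z$ and $l$ are our own notation and need not coincide with the authors' equations (5)–(12).

```latex
% assumptions (ours): \mu = 1, \varepsilon = \varepsilon(r), k = \omega / c,
% E(r,z) is the \varphi-component of the electric field of the te-wave
\begin{align}
 &\frac{\partial^2 E}{\partial r^2} + \frac{1}{r}\frac{\partial E}{\partial r}
  - \frac{E}{r^2} + \frac{\partial^2 E}{\partial z^2} + k^2 \varepsilon(r)\,E = 0, \\
 &E(r,z) = R(r)\,Z(z): \quad
  \frac{d^2 Z}{dz^2} + k_z^2 Z = 0, \quad
  \frac{d^2 R}{dr^2} + \frac{1}{r}\frac{dR}{dr}
  + \left[ k^2 \varepsilon(r) - k_z^2 - \frac{1}{r^2} \right] R = 0, \\
 &R(r) = \frac{u(r)}{\sqrt{r}}: \quad
  \frac{d^2 u}{dr^2}
  + \left[ k^2 \varepsilon(r) - k_z^2 - \frac{3}{4 r^2} \right] u = 0 .
\end{align}
% comparison with the radial schroedinger equation
%   u'' + [ (2m/\hbar^2)(E_s - V(r)) - l(l+1)/r^2 ] u = 0
% gives l(l+1) = 3/4, i.e. l = 1/2, together with
%   (2m/\hbar^2) E_s  <->  k^2 - k_z^2,
%   (2m/\hbar^2) V(r) <->  k^2 (1 - \varepsilon(r)).
```

in this form the requirement that the permittivity approaches unity at large $r$ is the same statement as the potential vanishing at large $r$, which is why a coulomb-type term is the simplest admissible choice.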
such a choice of the potential introduces an unphysical singularity of the permittivity function for $r \to 0$, but with a proper choice of boundary conditions it can provide a sufficiently accurate model of the realistic graded permittivity structures. substituting the potential from (13) into (11) we obtain the equation (14), which has an exact analytical solution (15): a linear combination, with constants c1 and c2, of two whittaker functions that can be expanded in terms of kummer confluent hypergeometric functions $_1F_1$ and $U$. based on the asymptotic behavior of the whittaker functions for $r \to 0$ and $r \to \infty$, and the physical requirements on the behavior of the electric field function $E(r, z)$, we see that we must choose c2 = 0, which gives the radial solution (16), and for waves propagating in the positive z-direction we can write the field in the form (17). it is here convenient to express the result (17) in terms of kummer confluent hypergeometric functions, in order to further clarify the mathematical properties of the electric field intensity function. thus, we finally obtain the result (18). the displayed forms of equations (14)–(18) did not survive in this copy of the text.

the result (18) for the electric field intensity function $E(r, z)$ refers to the φ-component of the electric field due to the assumed te-wave as defined in (2). it should however be noted that the assumption of the te-wave is by no means limiting the generality of the results obtained in the present paper. the case of the tm-wave is fully analogous to the case of the te-wave, and the only difference is that the result (18) is then valid for the magnetic field intensity function $H(r, z)$, which refers to the φ-component of the magnetic field. the electric field components $E_r(r, z)$ and $E_z(r, z)$ are then readily obtained using the tm-wave analogues of the equations (3). the choice of the te-wave in the present paper was made for illustration purposes.
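to make the role of the whittaker and kummer functions concrete, the sketch below evaluates the regular radial solution numerically with the python standard library only. it rests on assumptions that are ours, not the paper's: the profile is taken as $\varepsilon(r) = 1 + \alpha/r$ with unit permeability, the names `kummer_1f1`, `whittaker_m`, `radial_field` and `bessel_j1` are hypothetical helpers, and the normalization of α may differ from the one used in (13). the only identity relied upon is the standard relation $M_{\kappa,\mu}(x) = e^{-x/2} x^{\mu+1/2}\,{}_1F_1(\mu-\kappa+\tfrac12;\,1+2\mu;\,x)$.

```python
import cmath
import math


def kummer_1f1(a, b, z, tol=1e-15, max_terms=500):
    """Kummer function 1F1(a; b; z) from its power series (adequate for moderate |z|)."""
    term = 1.0 + 0.0j
    total = term
    for n in range(max_terms):
        # ratio of consecutive series terms: (a+n) z / ((b+n)(n+1))
        term *= (a + n) * z / ((b + n) * (n + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def whittaker_m(kappa, mu, z):
    """Whittaker M_{kappa,mu}(z) expressed through the Kummer function."""
    return cmath.exp(-z / 2) * z ** (mu + 0.5) * kummer_1f1(mu - kappa + 0.5, 1 + 2 * mu, z)


def radial_field(r, k, kz, alpha):
    """Regular radial factor for the ASSUMED profile eps(r) = 1 + alpha / r.

    Equal to M_{kappa,1}(2j q r) / sqrt(r) up to a constant factor, rescaled so
    that alpha = 0 reproduces the free-space solution J_1(q r).
    """
    q = math.sqrt(k * k - kz * kz)         # transverse wavenumber, requires kz < k
    kappa = -1j * k * k * alpha / (2 * q)  # Whittaker index for this profile
    return 0.5 * q * r * cmath.exp(-1j * q * r) * kummer_1f1(1.5 - kappa, 3, 2j * q * r)


def bessel_j1(x, terms=40):
    """Bessel J_1(x) from its power series, used here only as a cross-check."""
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + 1)) * (x / 2) ** (2 * m + 1)
        for m in range(terms)
    )
```

a quick sanity check of such a sketch is the homogeneous limit: with `alpha = 0` the radial factor should coincide with `bessel_j1(q * r)`, the φ-component of a te cylindrical wave in free space. the series evaluation of `kummer_1f1` loses accuracy for large arguments, so a dedicated special-function library would be preferable for production use.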
following the approach in [3], the relative permittivities are here assumed to be real functions and no dielectric losses are taken into account. it should however be noted that there is nothing in the present theory that limits the values of the relative permittivities to be real. it is fully feasible to use the present model with complex relative permittivities as well. this will be the subject of our future studies.

4. study of a specific numerical case

let us now turn to the specific case of a grin lens studied in [3], where we have a structure with radially graded permittivities for the middle layer, as listed in table 1.

table 1 radially graded permittivities of the middle layer of a grin lens.
layer   r̄         ε(r̄)
1       1.5875    25.5
2       4.7625    24.5
3       7.9375    22.3
4       11.1125   18.5
5       14.2875   14.55
6       17.4625   10.5
7       20.6375   7.65
8       23.8125   5.5
9       26.9875   3.5
10      30.1625   1.65

using the model function (7) with (13), we obtain the fitting graph shown in fig. 2, where we have chosen the parameter α to be equal to 0.36.

fig. 2 fitting of grin lens relative permittivity data using coulomb function with α = 0.36.

the cross section of the solution (18) for a constant z is shown in fig. 3.

fig. 3 cross section of the electric field function e(r, z) for given constant z (z = 0), with c1 = 1, f = 30 ghz, k = 2π f/c and kz = 0.8 k.

finally, a three dimensional plot of the solution (18) is shown in fig. 4.

fig. 4 electric field function e(r, z).

from fig. 4 we readily see how the wave is radially focused while moving along the z-direction, as expected. the size of the wave amplitudes is not normalized with respect to any starting position, and does not reflect any specific initial electric field strength. even though the coulomb function is far from the optimum fit for the grin lens data, the obtained results can be used to describe the chosen lens simply and with sufficient accuracy.
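the parameters quoted in the caption of fig. 3 fix the wavenumbers that enter the solution. as a small worked example, the sketch below evaluates them; the function name `wavenumbers` and the derived transverse wavenumber `q` are our own additions for illustration, and only f = 30 ghz, k = 2πf/c and kz = 0.8 k come from the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavenumbers(f_hz, kz_over_k):
    """Return (k, kz, q) in rad/m, where q = sqrt(k^2 - kz^2) is the transverse part."""
    k = 2 * math.pi * f_hz / C0
    kz = kz_over_k * k
    q = math.sqrt(k * k - kz * kz)
    return k, kz, q


k, kz, q = wavenumbers(30e9, 0.8)
print(f"free-space wavelength: {1e3 * 2 * math.pi / k:.2f} mm")
print(f"k = {k:.1f} rad/m, kz = {kz:.1f} rad/m, q = {q:.1f} rad/m")
```

with kz = 0.8 k the transverse wavenumber is 0.6 k, so the radial oscillations of the field have a longer period than the free-space wavelength, which is roughly 1 cm at 30 ghz.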
it should be noted here that our choice of the model function (coulomb function) has been made based on the well known analytical solutions for that function. there are a number of other functions that also allow exact analytic solutions of the problem at hand, in particular if the model is extended to allow a graded permeability of the lens layers. the studies of other models involving such more accurate model functions will be the subject of our coming papers.

5. conclusions

the possibility to find exact analytic solutions for the electric and magnetic fields in continuously graded flat lenses has been studied. the flat lenses are modeled as compact composites with continuous permittivity and permeability functions which asymptotically approach unity at the boundaries of the composite cylinder. in order to illustrate the present approach, we obtain an exact analytic solution for the electric field intensity for an fm composite lens with constant magnetic permeability $\mu(r, z)$ and radially dependent dielectric permittivity. in our coming research efforts, we see the need to look for possible models with exact analytical (or at least perturbational and/or wkb) solutions for graded profiles of some more complex flat-lens designs studied in the literature.

references
[1] r. yang, w. tang, and y. hao, "a broadband zone plate lens from transformation optics", optics express, vol. 19, no. 13, pp. 12348–12355, 2011.
[2] d. a. roberts, n. kundtz, and d. r. smith, "optical lens compression via transformation optics", optics express, vol. 17, no. 19, pp. 16535–16542, 2009.
[3] s. jain, m. abdel-mageed and r. mittra, "flat-lens design using field transformation and its comparison with those based on transformation optics and ray optics", ieee antennas and wireless propagation letters, vol. 12, pp. 777–780, 2013.
[4] t. driscoll, g. lipworth, j. hunt, n. landy, n. kundtz, d. n. basov, and d. r.
smith, "performance of a three-dimensional transformation-optical flattened luneburg lens", optics express, vol. 20, no. 12, pp. 13264–13273, 2012.
[5] m. dalarsson and p. tassin, "analytical solution for wave propagation through a graded index interface between a right-handed and a left-handed material", optics express, vol. 17, no. 8, pp. 6747–6752, 2009.
[6] m. dalarsson, m. norgren, and z. jaksic, "lossy gradient index metamaterial with sinusoidal periodicity of refractive index: case of constant impedance throughout the structure", journal of nanophotonics, vol. 5, no. 1, pp. 051804-1–8, 2011.
[7] m. dalarsson, m. norgren, n. doncov, and z. jaksic, "lossy gradient index transmission optics with arbitrary periodic permittivity and permeability and constant impedance throughout the structure", journal of optics, vol. 14, no. 6, pp. 065102-1–7, 2012.
[8] m. dalarsson, m. norgren, t. asenov, n. doncov, and z. jaksic, "exact analytical solution for fields in gradient index metamaterials with different loss factors in negative and positive refractive index segments", journal of nanophotonics, vol. 7, no. 1, pp. 073086-1–13, 2013.
[9] m. dalarsson, m. norgren, t. asenov, and n. doncov, "arbitrary loss factors in the wave propagation between rhm and lhm media with constant impedance throughout the structure", pier, vol. 137, pp. 527–538, 2013.
[10] m. dalarsson, m. norgren, and z. jaksic, "exact analytical solution for fields in a lossy cylindrical structure with linear gradient index metamaterials", pier, vol. 151, pp. 109–117, 2015.
[11] m. dalarsson, and z. jaksic, "exact analytical solution for fields in a lossy cylindrical structure with hyperbolic tangent gradient index metamaterials", optical and quantum electronics, vol. 48, no. 3, pp. 1–6, 2016.
[12] r. k. luneburg, "mathematical theory of optics", university of california press, berkeley, 1964.

facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp.
653–674 doi: 10.2298/fuee1604653p

high performance digital current control in three phase electrical drives

ljiljana s. peric, slobodan n. vukosavic
university of belgrade, dept. of electrical engineering, belgrade, serbia

abstract. the majority of contemporary static power converters make use of three-phase, pwm controlled igbt inverters. typical applications include electrical drives and grid connected converters. in both cases, a high closed loop bandwidth is highly desirable. the bandwidth is constrained by the problems of the feedback acquisition. the feedback errors are caused by the noise, parasitic phenomena and by the current ripple at the pwm frequency. the errors can be reduced by deriving the average value of the output current within the past pwm period. this feedback acquisition method reduces the noise, but it also introduces delay into the feedback lines. along with the delays brought in by the digital pwm, the feedback delay reduces the range of stable gains and limits the closed loop bandwidth. the effects of the delay can be reduced by conveniently placing the control interrupt and adopting an optimum parameter setting which meets both the bandwidth requirements and the robustness against the noise and the parameter changes. experimental verification proves that the proposed current controller achieves a response speed and a robustness against the noise which outperform the competitive solutions.

key words: current control, high-performance control, signal acquisition, ac motor drives, three phase inverters

1. introduction

electrical drives are frequently used to control the speed or the position of the work piece or the tool. in such cases, the speed and position controllers are used as the outer control loop, which provides the torque reference. the latter determines the desired currents that have to be injected into the stator windings in order to obtain the desired torque. digital current controllers are the inner loop of the drive [1], [2].
the bandwidth of the current loop determines the torque response time. therefore, it determines the overall performance of the drive [3], [4], such as the closed loop bandwidth of the speed or position loop. for the proper operation of the drive, it is essential to decouple the flux control loop and the torque control loop. the basic prerequisite for the decoupled flux and torque control is a fast and robust current controller [5]-[7].

received august 28, 2015; received in revised form november 13, 2015
corresponding author: slobodan n. vukosavic, university of belgrade, dept. of electrical engineering, 11000 belgrade, serbia (email: boban@ieee.org)

in high speed drives, the fundamental frequency ff of the stator currents and voltages can reach considerable values. in some cases, ff can reach a considerable fraction of the sampling frequency fspl of the current controller [8]-[11]. in such cases, it is of particular importance to have a high closed loop bandwidth of the digital current controller. although the objective of this paper is to maximize the closed loop performances in the presence of delays, and it is reasonable to expect that the devised control measures would also improve the closed loop performances with very high fundamental-to-switching-frequency ratios, we neither discuss nor verify such performances in this paper. a high bandwidth of the current controller is also important in grid-connected static power converters, such as the three phase inverters that regenerate into the grid, used in conjunction with wind-power plants and solar-power plants. in order to inject undistorted, sinusoidal currents into the grid, the closed loop current controllers have to overcome the nonlinearities such as the lockout time, reduce the line harmonics, and secure a very low total harmonic distortion (thd) factor.
to achieve this, digital current controllers should have a quick response and a high disturbance rejection. several imperfections and nonlinearities make this requirement difficult to meet. performance enhancement requires the proper modeling and accounting for such imperfections and nonlinearities [8], [10], [11]. the dead-beat and predictive controllers [13] have gained a certain popularity due to their capacity to cope with transport delays, but their wider use is hindered by a pronounced sensitivity to changes in system parameters. digital current controllers are frequently implemented in the synchronous dq frame. this is done in order to achieve constant steady-state references, instead of the sinusoidal steady-state references that would have to be tracked with controllers located in the stationary coordinate frame. synchronous controllers can achieve steady state performance with zero phase error and zero amplitude error. the controller structure includes proportional and integral action. when used with elevated fundamental frequencies ff, the current controller has to be enhanced by the dq decoupling actions [10], [11], [14]. in cases where the pwm delays and imperfections are negligible, and where the current ripple does not impair the feedback acquisition process, conventional pi controllers can be tuned by using well known tuning procedures [11], [12], [14], derived by applying the imc concept [5]. neglecting the imperfections, the synchronous frame current controllers can reach the bandwidth frequency of 0.11∙fspl. with stationary frame controllers, it is possible to achieve the bandwidth frequency of 0.07∙fspl [12]. in cases where the pwm ripple of the output current is not negligible, as well as in cases where the pwm delays and lockout time delays impair the sampling process and contribute to parasitic alias components, it is not possible to use the conventional structure and parameter setting of digital current controllers.
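to put the quoted bandwidth factors into numbers, the following sketch converts a pwm frequency into the achievable bandwidth. the 10 khz pwm frequency is purely an assumed example value, the function name `current_loop_bandwidth` is hypothetical, and the relation fspl = 2∙fpwm presumes the double-update mode in which the controller runs twice per pwm period.

```python
# bandwidth factors quoted in the text for ideal conditions (imperfections neglected)
BANDWIDTH_FACTOR = {"synchronous": 0.11, "stationary": 0.07}


def current_loop_bandwidth(f_pwm_hz, frame="synchronous", double_update=True):
    """Closed loop bandwidth in Hz as a fixed fraction of the sampling frequency."""
    f_spl = 2.0 * f_pwm_hz if double_update else f_pwm_hz
    return BANDWIDTH_FACTOR[frame] * f_spl


# assumed example: 10 kHz carrier, double update, hence 20 kHz sampling
for frame in ("synchronous", "stationary"):
    print(frame, round(current_loop_bandwidth(10e3, frame)), "Hz")
```

for the assumed 10 khz carrier this gives about 2.2 khz in the synchronous frame and 1.4 khz in the stationary frame, which are upper bounds that the delays discussed next will reduce.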
the current controller designed in this paper reduces the sampling errors by taking the average value of the feedback signals over the past pwm period. in essence, the well known technique of oversampling is applied within the current controller environment, paced by the pwm carrier, and implemented on an industrial dsp as a time-skewed one-pwm-period-averaging. the method uses an automated, dma-driven oversampling. the feedback signal is calculated from a large number of equally spaced samples collected within the past two sampling periods. with double-update mode, the two sampling periods correspond to one pwm period. although this approach reduces the feedback errors caused by the noise and the ripple, it also introduces delay into the feedback lines. the feedback delay adds to the delay contributed by the digital pwm. in a conventional implementation, where the control interrupt gets triggered by the zero-count and the period-count of the pwm carrier, the equivalent transport delay encountered with the proposed one-pwm-period-averaging reaches 2.5 sampling periods, thus reducing the range of stable gains and limiting the closed loop bandwidth. the effects of the delay can be reduced by conveniently shifting the control interrupt and adopting an optimum parameter setting which meets both the bandwidth requirements and the robustness against the noise and the parameter changes. in order to deal with transport delays in digital current controllers, and to provide an error-free feedback acquisition, the authors recently designed and tested several solutions. the most important previous results are given in [20] and [21]. like this paper, [20] and [21] deal with digital current controllers. therefore, it is of interest to clarify what has been done in [20], [21], and what is the contribution proposed in this paper.
in [20], delay compensation is performed by introducing a differential control action with the proper d-gain setting. in this paper, we take a different approach. we do not use the differential action. instead, we rely on the time-skewed acquisition window of fig. 9, which permits rescheduling of the interrupt as an effective way of coping with delays. it is also of interest to compare the feedback acquisition technique used in [20], [21], and in this paper. the approach used in this paper is similar, but not identical to the one used in [20], while the approach of [21] is quite different from both. in [20], the impact of the lockout time, the motor cable capacitance and the switching noise on the feedback errors incurred with the conventional regular-sampling-double-update approach is experimentally verified, suggesting the need for the use of oversampling. the thorough experimental evidence of [20] is obtained with a variable length of the motor cable, and it proves that the pwm-period-based oversampling and decimation results in a considerable reduction of the sampling errors. the method is termed one-pwm-period-averaging, and it takes the average of the samples acquired within one tpwm window encircled by the interrupt ticks. while the method used in this paper also uses the oversampling-decimation, the current samples are acquired within a different acquisition window. in fig. 9, the new acquisition window is shifted by texe with respect to the interrupt ticks. the time shift texe corresponds to the execution time of the control interrupt. the skew between the corresponding sampling windows can be observed by comparing [20, fig. 5] and fig. 9. in [21], the authors deal with current controllers which do not use the oversampling/decimation, but rely instead on the conventional regular-sampling-double-update approach. this approach is applicable to noise-free sampling cases, such as the one where the inverter is integrated within the motor housing.
the structure of the current controller of [21] uses p, i, and d actions, and it is different from the controller considered in this paper. in [21], the parameter setting deals with the three gains, and it takes into account the controller capability to suppress the impact of the electromotive-force disturbances. in this paper, the parameter setting focuses on p and i gains, and it does not consider the disturbance rejection capability. the objective of the parameter setting rules in [21, page 7, criterion function q] is different from the objective of the parameter setting rules in section 5 of this paper. the former results in optimum p-i-d gains, while the latter deals with a p-i controller and searches for the optimum p gain while maintaining a constant p/i ratio. the feedback acquisition technique, the structure of the current controller, and the goal of the parameter setting procedure in [21] are different from those proposed in this paper. this paper is organized as follows. the system with the three phase igbt-based inverter, the typical load and the digital controller is restated in section 2. the feedback acquisition system is discussed in section 3, with the proposal of the error-free sampling scheme which operates with a minimum indispensable delay. in section 4, the two competitive structures of the digital current controller are analyzed and discussed. the optimum parameter setting is proposed in section 5, focused on achieving quick response, robustness, and high rejection of the input disturbances. experimental results obtained with the proposed current controller are included in section 6. conclusions are given in section 7.

2. synchronous-frame digital current controller

most 3-phase digital current controllers are either employed in electrical drives or in grid connected static power converters.
the former have the task of controlling the stator currents of 3-phase ac machines, while the latter control the current injected into the 3-phase ac grids. an igbt inverter, used to supply an ac machine, is shown in fig. 1. the ac line voltage is rectified to obtain the dc voltage e. the voltage e feeds the 3-phase inverter. by means of the pulse width modulation (pwm), the inverter generates the variable frequency, variable amplitude voltages required for the proper current control. in fig. 2, the two igbt inverters are used to take the energy from the wind turbine and pass it into the ac grid. one of the inverters (on the right) has a function rather similar to the one in ac drives, and it controls the stator current of the ac generator, thus controlling the flux and torque of the machine. the inverter on the left in fig. 2 takes over the energy that comes through the dc link circuit and passes the energy into the grid. it has to control the current injected into the grid. the grid current has to be sinusoidal, with a low thd, and with the power factor which depends on the active power and reactive power commands. in both fig. 1 and fig. 2, the three phase inverters are used as the voltage actuators, which supply the voltages required for the proper current control.

fig. 1 three phase inverter as the voltage actuator within an electrical drive.

the three phase inverters of figs. 1 and 2 are nonlinear voltage actuators that cannot supply a continuously changing voltage. they are linearized by means of the pulse width modulation, illustrated in fig. 3. in each switching period tpwm, the output phases are connected to the upper rail of the dc bus during the conduction interval ton, and then switched to the lower rail of the dc bus during the conduction interval toff = tpwm − ton. in this way, the average value within the switching period is uav = e∙ton/tpwm, where e is the dc voltage across the dc bus, while the voltage uav is referred to the minus rail of the dc bus.
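the averaging relation uav = e∙ton/tpwm can be checked with a few lines of code. this is a minimal sketch with assumed example values (600 v dc bus, 100 µs switching period); the helper names `on_time_for_average` and `average_of_pulse` are hypothetical and the switches are treated as ideal, with no lockout time.

```python
def on_time_for_average(u_av, e_dc, t_pwm):
    """Conduction interval ton that yields the average voltage u_av (minus-rail reference)."""
    if not 0.0 <= u_av <= e_dc:
        raise ValueError("the average voltage must lie between 0 and the dc bus voltage")
    return t_pwm * u_av / e_dc


def average_of_pulse(e_dc, t_on, t_pwm, n=100_000):
    """Numerically average one switching period: e_dc during ton, zero during toff."""
    dt = t_pwm / n
    return sum(e_dc if (i + 0.5) * dt < t_on else 0.0 for i in range(n)) / n


E_DC, T_PWM = 600.0, 100e-6                    # assumed example values
t_on = on_time_for_average(150.0, E_DC, T_PWM)
print(f"ton = {t_on * 1e6:.1f} us")            # a quarter of the period for 150 V out of 600 V
print(f"uav = {average_of_pulse(E_DC, t_on, T_PWM):.1f} V")
```

the numerical average recovers the commanded value, which is the sense in which the switching bridge behaves as a linear voltage actuator when observed over one switching period.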
in the described way, the switching bridge as a voltage actuator is linearized. in each switching interval tpwm, it provides the average voltage which can be adjusted by altering the value of ton.

fig. 2 the use of the 3-phase inverters as the voltage actuators within the power conversion system which takes over the energy from a variable speed generator (right) and recuperates the energy into the ac grid.

the left side of fig. 3 illustrates the asymmetrical pwm technique, while the symmetrical pwm is given on the right. due to its inferior performance, asymmetrical pwm is not used in 3-phase inverters. in both electrical drives and grid side power converters, the 3-phase inverters use the symmetrical pwm technique. other techniques which are also used have the same sequences as the carrier-based symmetrical pwm. in cases where the output voltage of the 2-level 3-phase inverter is generated by the space vector modulation, the resulting pwm pattern can be proved equal to the pattern obtained with symmetrical pwm where the modulating signal is conveniently changed.

fig. 3 pulse width modulation with asymmetrical (left) and symmetrical carrier. the waveform of the line-to-line voltage uab is given in lower right.

in fig. 3, the intervals where the digital controller executes the control algorithm are designated by exe. one execution occurs during the rising edge of the carrier, while the successive execution takes place during the falling edge. in other words, there are two executions in each tpwm. therefore, the sampling time of the current controller is tspl = tpwm/2. each execution instant calculates a new value of the conduction intervals ton for the phases a, b, and c. due to uav = e∙ton/tpwm, the calculated intervals ton actually represent the voltage commands. the ratio m = ton/tpwm represents the modulation index. in fig. 3, the modulation indices are denoted by ma, mb, and mc.
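the carrier-based symmetrical pwm described above can be sketched as a comparison between the modulation index and a triangular carrier. the sketch below is an illustration under our own assumptions: a carrier normalized to the range 0 to 1, ideal switching, and a comparison polarity chosen so that the pulse is centered on the carrier peak; the function names are hypothetical.

```python
def triangular_carrier(t, t_pwm):
    """Symmetrical carrier: rises from 0 to 1 in the first half period, then falls back."""
    x = (t % t_pwm) / t_pwm
    return 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)


def gate_signal(t, m, t_pwm):
    """True while the phase is tied to the upper rail; the pulse is centered on the carrier peak."""
    return triangular_carrier(t, t_pwm) > 1.0 - m


def duty_and_center(m, t_pwm, n=20_000):
    """Measure the duty ratio and the mid-point of the on-pulse within one period."""
    dt = t_pwm / n
    on_times = [(i + 0.5) * dt for i in range(n) if gate_signal((i + 0.5) * dt, m, t_pwm)]
    duty = len(on_times) / n
    center = sum(on_times) / len(on_times) if on_times else None
    return duty, center


duty, center = duty_and_center(0.3, 100e-6)
print(f"duty = {duty:.3f}, pulse center at {center * 1e6:.1f} us of a 100 us period")
```

with this polarity the rising edge of the pulse falls on the rising slope of the carrier and the falling edge on the falling slope, so each half period (one sampling period tspl) controls exactly one edge. changing m moves both edges symmetrically and leaves the pulse center in place.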
considering the execution instant between b0 and a1 in fig. 3, it calculates the values of ton (m) that cannot be applied before the instant a1, when the carrier reaches the period count and starts to decline. therefore, the effects of the calculated ton (m) take place between a1 and b1, thus affecting the falling edges of the phase voltage pulses. in the same way, the execution instant between a1 and b1 produces ton and m that would be applied only after the instant b1, when the carrier reaches zero, thus affecting the rising edge of the phase voltage pulses between b1 and a2. delays are introduced due to the hardware properties of the pwm peripheral units. in order to avoid multiple commutations within a single period tpwm, the values of the modulation signals ma, mb, and mc are reloaded into the pwm comparators only at instants where the pwm carrier reaches either zero or the period count. the intrinsic delay of the pwm peripheral unit introduces a transport delay of tspl/2 into the voltage actuator. this delay has to be taken into account when designing the structure and deciding the parameters of the digital current controller. a simplified schematic of the inverter supplied stator winding is given in fig. 4, made by adopting the subsequent assumptions. for both the induction motors and synchronous motors, it is reasonable to assume that the magnetic flux within the machine exhibits very slow changes, compared to the desired dynamics of the loop. the same assumption holds for the rotor speed. at the same time, the electromotive force emf is the product of the flux and the rotor speed. therefore, it is reasonable to assume that the electromotive force emf has the role of a slow, external disturbance within the current control system.

fig. 4 simplified schematic of the inverter supplied stator winding. this schematic does not reflect the coupling between phases in an electric machine.

in cases where the inverter of fig.
1 generates a three phase system of symmetrical, sinusoidal voltages, the average value of the three output phase voltages corresponds to the center-point of the dc bus, denoted by the ground symbol in fig. 4. if the three phase winding of the ac machine is symmetrical, and the electromotive forces are symmetrical and balanced, then the star connection of the stator winding remains at the potential of the ground, as denoted in fig. 4. in such cases, the stator winding can be represented by the simplified schematic of fig. 4. similar considerations apply to grid side connected power converters with a series l filter. the only difference is that the phase voltages of the ac grid replace the electromotive forces of the stator winding, while the rl parameters of the output filter and the grid replace the elements r and l in fig. 4.

in order to perform the current control task, it is necessary to acquire the feedback signals, namely, the value of the stator current. characteristic waveforms are given in fig. 5. due to the pulsed nature of the stator voltage, the current has the fundamental component and a superimposed ripple. the ripple comprises the spectral component at the pwm frequency fpwm and a certain spectral content at integer multiples of fpwm. with tspl = tpwm/2, most of the ripple energy resides at the nyquist frequency and impairs the sampling process. with an assumed linear change of the ripple (curve a in fig. 5), it would be possible to acquire the samples at the center of each voltage pulse, whether positive or negative, and obtain a ripple-free feedback (i1) (regular-sampling-double-update). due to the rl nature of the winding impedance, the waveform of the ripple assumes the form of curve b in fig. 5. moreover, the center of the voltage pulses gets affected by unpredictable effects of the lockout time and gating signal delays.
therefore, an attempt to acquire a single sample in each half-period of the pwm would result in sampling errors, denoted by i2 in fig. 5. one way of avoiding such errors is to respect the limits imposed by the kotelnikov sampling theorem, that is, to filter out any spectral content above fspl/2 = fpwm. considering the limited resolution of the analog-to-digital converter (adc), it is usually considered sufficient to reduce any spectral content in the forbidden area below the level of 1 lsb of the adc. even so, heavy low pass filtering would be required to complete the task, thus introducing unacceptable delays, phase errors and amplitude errors. an alternative way of securing error-free sampling without introducing considerable delays is explained in the following section.

fig. 5 pwm-related ripple and the fundamental component of the stator current.

3. an advanced feedback acquisition system

the main problem in acquiring the current feedback is the presence of the current ripple, caused by the pulsed nature of the inverter voltages. the ripple has a triangular form, illustrated in fig. 6. most of the spectral energy of the ripple resides at the pwm frequency, with some minor components located at integer multiples of fpwm. the consequential sampling errors, illustrated in fig. 5, can cause considerable deterioration of the current controller performance.

in an ac drives environment, the single-sample feedback acquisition of fig. 5 is prone to sampling errors [17], [18]. with relatively large dv/dt values at the inverter output, the switching causes parasitic oscillations of the voltage and current [17]. the frequency of such oscillations is well above the nyquist frequency. the parasitic lc elements that give rise to poorly damped parasitic oscillations are present even with a rather short inverter-motor cable [17].
in all the sampling schemes where the feedback is obtained from a single sample in each sampling period, the oscillations above the nyquist frequency introduce sampling errors. in a 3-phase ac controller, the switching instants continuously change according to the voltage command, and their position relative to the sampling instant is variable. the consequential sampling noise is larger when the switching comes close to the sampling instants, where the switching-excited parasitic oscillations may contribute to considerable feedback errors [18]. the regular-sampling-double-update approach introduces sampling errors even in cases where the switching-noise oscillations are much lower, such as ac drives with an integrated motor-inverter and no cable. the feedback errors are introduced whenever the sampling instants slide away from the zero-crossing instants of the current ripple. any imperfection, parasitic effect or delay that moves the ripple zero-crossing away from the sampling instant introduces sampling errors. one way of reducing such errors is finding the average value of the current in each pwm period by oversampling.

fig. 6 sampling scheme with pwm-period averaging.

the oversampling technique aided with digital filtering/averaging is well known and widely used [16], [20]. in order to reduce the measurement errors in an industrial current control environment, we implemented the oversampling to perform time-skewed one-pwm-period averaging of the feedback signals on an industrial dsp. the devised technique is rather simple. a similar technique has been introduced and tested in [20]. the thorough experimental evidence of [20] proves that the proposed oversampling and decimation technique, also called one-pwm-period averaging, results in a considerable reduction of the sampling errors.
while the feedback acquisition proposed in [20] takes the average of the samples acquired between the interrupt events, the feedback acquisition proposed in this paper (fig. 9) uses a different acquisition window. it is skewed by texe, where texe corresponds to the execution time of the control interrupt. in [20], the feedback delay is compensated by extending the controller with a differential control action. in this paper, we avoid the differential action and rely on the time-skewed acquisition window (fig. 9), which enables an effective reduction of time delays. the basic approach taken in [20, fig. 5] is similar, but not identical, to the time-skewed approach illustrated in fig. 9, where the oversampling window is shifted by texe. the general description of the oversampling/decimation is found in [20] in a brief, eight-line paragraph above (12). at the same time, neither this specific implementation nor the analysis of consequential delays has been published by other authors. a more detailed description of the one-period averaging is restated in this section, along with the necessary information and the model of the transport delay, which is required for the proper understanding of the next steps and for the further analysis. later on, the time-skewed sampling window of fig. 9 is emphasized and discussed, as it represents the difference between the feedback acquisition of [20] and the feedback acquisition used in this paper.

the feedback averaging is implemented on a low cost dsp controller. the implementation of the time-skewed one-pwm-period averaging on an industrial dsp requires some skill, as the resources are cost-limited and the programming is not trivial.
we are also aware of the possibility to implement the relevant algorithm on the laboratory control platform dspace microautobox ii, which has hardware support for the synchronization of the a/d and pwm processes, and which also supports oversampling. this opens the possibility to implement and verify the time-skewed one-pwm-period averaging in a laboratory environment, with much lower effort, and without the need to change the dsp code of the actual industrial drive.

the current ripple can be reduced by the sampling scheme outlined in fig. 6. the feedback signal at instant (n+1)tspl is calculated from a number of equidistant samples, acquired within the past pwm period, starting from (n-1)tspl and ending with (n+1)tspl. the number of equidistant samples acquired within each tpwm = 2tspl interval can be very large. with contemporary digital signal processors, it is possible to scan all the adc channels every 1 µs. hence, with a typical fpwm = 10 khz, it is possible to acquire up to 100 successive samples of all the analog channels. for practical reasons, the number of samples is usually adjusted to $2^n$, hence, either 32 or 64 samples. the samples are collected automatically, by an internal dma machine, without an additional load on the cpu. the collected samples are automatically stored in a designated region of the internal ram, and thus made ready for further processing. during each control interrupt, it is necessary to find the sum of the samples acquired over the past pwm period, and to calculate their average value by dividing the sum by the number of samples. in cases where the oversampling factor is equal to a power of two ($2^n$), the division can be replaced by a simple right-shift of the sum. after this scaling, the sum of the samples corresponds to the average value of the output current in the stationary coordinate frame.
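the averaging step described above can be sketched in a few lines of python. this is only an illustration of the arithmetic, not the dsp code of the drive: the function names are hypothetical, the sample buffer is an ordinary list standing in for the dma-filled ram region, and the adc codes are assumed to be integers.

```python
def oversampling_shift(t_pwm_us, t_scan_us):
    """Return n such that 2**n is the largest power of two not exceeding
    the number of ADC scans that fit into one PWM period."""
    available = int(t_pwm_us // t_scan_us)
    if available < 1:
        raise ValueError("the PWM period is shorter than one ADC scan")
    return available.bit_length() - 1


def one_period_average(samples, n_bits):
    """Average 2**n_bits integer ADC samples; the division by the number
    of samples is replaced by a right shift of the sum."""
    if len(samples) != 1 << n_bits:
        raise ValueError("expected exactly 2**n_bits samples")
    return sum(samples) >> n_bits


if __name__ == "__main__":
    # fpwm = 10 kHz (Tpwm = 100 us) and one scan per 1 us allow up to 100
    # samples, so the largest power-of-two factor is 64 (n = 6).
    n = oversampling_shift(100, 1)
    buffer = [2048] * (1 << n)      # stand-in for the DMA-filled buffer
    print(n, one_period_average(buffer, n))
```

the right shift truncates toward minus infinity, so the result can differ from the exact average by less than one lsb.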
transformation into the synchronous frame requires the proper angle between the two frames. the averaged feedback signals have to be transferred into the synchronous frame by using the average angle within the same pwm period where the actual samples have been acquired. in both direct and inverse park transformations, the angle has to be time-synchronized with the samples that are being transformed.

with a considerable number of samples, it is reasonable to assume that the average value of the collected samples corresponds to the average value of the sampled current within the preceding tpwm period. therefore, the value of $i^f_{n+1}$ at instant (n+1)tspl can be expressed in terms of the current samples $i_{n-1}$, $i_n$ and $i_{n+1}$, representing the instantaneous value of the output current at instants (n-1)tspl, ntspl, and (n+1)tspl. it is of interest to notice that the samples $i_{n-1}$, $i_n$ and $i_{n+1}$ are not actually acquired, and they are not available in the dsp ram. they are mentioned in an effort to relate the feedback signals to the output response of the actual system. namely, the dsp controller does not acquire the samples at instants (n-1)tspl, ntspl, and (n+1)tspl, and it does not have the information on the actual output ($i_{dq}$ in fig. 7). the feedback loop is closed by using the signal $i^f_{dq}$ in fig. 7, obtained by one-pwm-period averaging. for the purposes of the subsequent analysis, it is necessary to find an appropriate model of the delay, and to express the feedback $i^f_{n+1}$ in terms of the samples $i_{n-1}$, $i_n$ and $i_{n+1}$. the samples $i_{n-1}$, $i_n$ and $i_{n+1}$ coincide with the zero-count and the period-count of the pwm carrier (figs. 3 and 5). with regular-sampling-double-update, tpwm = 2tspl, and the average value of the inverter voltage is changed in each tpwm/2. with l/r >> tspl, and neglecting the current ripple, the remaining ripple-free component of the output current has a quasi-linear change between (n-1)tspl and ntspl, as well as between ntspl and (n+1)tspl.
with this assumption, the average value of the output current from (n-1)tspl to ntspl is roughly equal to $(i_{n-1}+i_n)/2$, while the average value from ntspl to (n+1)tspl is equal to $(i_n+i_{n+1})/2$. therefore, the average value on the interval [(n-1)tspl .. (n+1)tspl] becomes the average value of $(i_{n-1}+i_n)/2$ and $(i_n+i_{n+1})/2$,

$$ i^f_{n+1} = \frac{i_{n-1} + 2\,i_n + i_{n+1}}{4}. \qquad (1) $$

both the samples of the stator current, taken at instants such as (n-1)tspl, ntspl, and (n+1)tspl, and the samples of the feedback $i^f$ can be transformed into the z domain and represented by the corresponding complex images $i^f(z)$ and $i(z)$. the former and the latter are related by the transfer function of the feedback path $W_F$,

$$ W_F(z) = \frac{i^f(z)}{i(z)} = \frac{z^2 + 2z + 1}{4z^2}. \qquad (2) $$

with tspl = tpwm/2, the transfer function $W_F$ has an infinite attenuation at the switching frequency fpwm. the attenuation is also infinite at integer multiples of fpwm. therefore, the feedback signal acquired from (1) does not get affected by the ripple. the average transport delay introduced by (1) and (2) is equal to tspl, hence, considerably lower than the delay of conventional anti-aliasing filters that can be used instead of the proposed averaging.

it is of interest to compare the frequency response of the pulse transfer function (2) and the actual one-pwm-period averaging. with a large number of current samples within each period, the transfer function of the feedback acquisition system is very close to the analog-implemented average over the past tpwm, which has an infinite attenuation at the switching frequency and its integer multiples. the delay model of one-pwm-period averaging with n samples requires the modified z-transform with the fractional sampling period ratio of n=32 or n=64, which is less convenient, less instructive, and hardly suitable for the analysis, design and the parameter setting of the controller. for that reason, we adopted the approximation (2).
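the properties claimed for the approximation (2) can be checked numerically. the short python sketch below evaluates $W_F$ on the unit circle; the function names are hypothetical and the frequency is expressed as a fraction of the sampling frequency fspl = 2fpwm, so that the switching frequency fpwm corresponds to the argument 0.5.

```python
import cmath
import math


def w_f(z):
    """Feedback-path transfer function (2): (z^2 + 2z + 1) / (4 z^2)."""
    return (z * z + 2 * z + 1) / (4 * z * z)


def response(f_rel):
    """Frequency response of (2) at f = f_rel * fspl (0 <= f_rel <= 0.5)."""
    return w_f(cmath.exp(2j * math.pi * f_rel))


if __name__ == "__main__":
    for f_rel in (0.0, 0.05, 0.1, 0.25, 0.5):
        h = response(f_rel)
        print(f_rel, abs(h), cmath.phase(h))
```

the magnitude is equal to one at zero frequency and drops to zero at fpwm, while the phase equals minus the normalized angular frequency over the whole range, which corresponds to a constant delay of one sampling period tspl.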
this approximation is verified by the considerable similarity between the simulations and the experimental results. thus, all the further design phases and parameter setting procedures use the delay approximation of (2). the purpose of the $W_F(z)$ approximation is to model the transient phenomena below the nyquist frequency, which is the frequency range of interest when it comes to designing and tuning the digital current controller. in the frequency range well above the nyquist frequency, there are differences between one-pwm-period averaging and the transfer function (2), but they do not have any meaningful influence on the setting of the feedback gains. the validity of the above approximations is justified by the experiment.

4. the structure of the current controller

the current controller provides the two voltages (ud and uq) supplied to the three phase ac machine, where they affect the two output currents (id and iq). whether represented in the synchronous or in the stationary frame, the plant has two inputs and two outputs. the transient phenomena in the orthogonal axes are coupled. the coupling depends on the revolving speed of the dq frame, that is, on the revolving speed of the electrical machine. the coupling is also affected by the transport delays. direct digital design (ddc) with the imc applied in z domain [11] decouples the transient phenomena in the orthogonal axes by means of the controller with conveniently embedded proportional and integral actions. the pulse transfer function of such a controller gets multiplied by the pulse transfer function of the plant (fig. 7) to obtain the open loop transfer function which does not have any coupling terms [11, iv.e].
relying on this result, it is possible to add the filtering terms in the feedback path and perform the gain setting assuming that the coupling between the orthogonal axes does not exist; namely, assuming that the current controlled system is a single input single output system (siso). the analysis given in subsections 4.1 and 4.2 is performed under this assumption. more detailed support for such a claim is given in subsections 4.3-4.5.

the current controller executes in each tspl interval, that is, two times in each pwm period. the current controller tasks include the acquisition of the feedback signal, the execution of the control algorithm, and writing the voltage references, in the form of ton commands, into the corresponding registers of the pwm peripheral. the exact sequence of events has considerable effects on the consequential transport delay and it determines the closed loop performance.

the execution of the current control tasks takes place within an interrupt, triggered by a programmable event. the interrupts have to repeat twice in each pwm period. in fig. 8, the interrupt events are created whenever the pwm carrier reaches either zero or the period count. one such execution is denoted in fig. 8. it takes place after the period count of the pwm carrier, at t = (n-1)tspl. the interrupt calculates the feedback signal $i^f_{n-1}$ as the average value of successive current samples within the past pwm period. it can be approximated by the function of the samples $i_{n-3}$, $i_{n-2}$, and $i_{n-1}$, as $i^f_{n-1} = 0.25\,(i_{n-3} + 2\,i_{n-2} + i_{n-1})$. based upon such feedback, the current controller derives the current error, and it calculates the voltage command $u_{n-1}$, suited to drive the current error back to zero. once calculated, the voltage command $u_{n-1}$ is expressed in terms of the pulse width ton for each of the three inverter phases.
the pulse width values are reloaded into the pwm peripheral at instant t = ntspl. therefore, the voltage command $u_{n-1}$ determines the average voltage on the interval [ntspl .. (n+1)tspl]. this implies another transport delay that has to be taken into account in the controller design.

fig. 7 block diagram of the digital current controller.

fig. 8 execution of the control interrupt immediately after the pwm carrier reaches zero-count or period-count.

the transport delays can be reduced by using the multisampling technique of [18]. instead of executing the control interrupt twice per each pwm period tpwm, as is the case in most dual-update-mode solutions, the interrupt can be executed n > 1 times in each half-period tpwm/2. each time the interrupt is triggered, a new feedback sample is acquired and a new voltage reference is calculated. the voltage references affect the modulating signals, which change n times in each half-period. most of these references do not get implemented, as the pwm process permits only one switching per half-period, that is, it accepts only one crossing of the modulation signal and the pwm carrier. therefore, the pwm unit uses only one of the voltage references in each tpwm/2, while the multisampling generates considerably more references. one of the consequential drawbacks is a nonlinear relation between the voltage command and the actual output voltage. the positive side of the multisampling approach is the reduction of the transport delay. besides the nonlinear input-output relation, caused by specific insensitivity at vertical transitions, the multisampling picks up the current ripple, which has to be reduced by introducing dedicated digital filtering, proposed in [18]. in order to suppress the impact of the switching transients on critical samples, it is of vital interest to skip any sampling immediately after the pwm switching.
in order to avoid possible multi-switching conditions, the multisampling requires specialized pwm logic which prevents the system from making more than one commutation in each half-period of the pwm.

another possibility of sequencing the current control tasks is denoted in fig. 9, where the zero-count events and period-count events of the pwm carrier take place at (n-1)tspl, ntspl, and (n+1)tspl. the interrupts occur texe before each counter event. the value of texe should be larger than the worst-case execution time of the control interrupt. in this case, the control interrupt completes before the successive event of the pwm counter. with recent dsp controllers, the interrupt execution time does not exceed 4 µs. hence, the interval texe is considerably shorter than the sampling period tspl. the interrupt which completes just before t = ntspl calculates the feedback signal $i^f_n$ as the average value of successive current samples within the past pwm period. with texe << tspl, the feedback signal can be approximated by the function of the samples $i_{n-2}$, $i_{n-1}$, and $i_n$, as $i^f_n = 0.25\,(i_{n-2} + 2\,i_{n-1} + i_n)$. the current controller derives the current error and calculates the voltage command $u_n$, expressed in terms of the pulse widths ton in the corresponding phases. the values are ready before ntspl, and they are reloaded into the pwm peripheral at instant t = ntspl. in this way, the transport delay is reduced, as $u_n$ determines the average voltage on the interval [ntspl .. (n+1)tspl]. both the schedule of fig. 8 and the schedule of fig. 9 are considered in this section.

4.1. the schedule with the control interrupt executed after the counter event

in this section, the schedule of fig. 8 is considered, where the current controller collects the feedback $i^f_{n-1}$ as $0.25\,(i_{n-3} + 2\,i_{n-2} + i_{n-1})$ and calculates the voltage command $u_{n-1}$, which, in turn, determines the average voltage on the interval [ntspl .. (n+1)tspl].
the 3 output currents (ia, ib, and ic) can be converted into the αβ frame of reference and expressed in terms of their components $i_\alpha$ and $i_\beta$. the current can be expressed as a vector $i_{\alpha\beta} = i_\alpha + j\,i_\beta$. by introducing $\beta = \exp(-R\,T_{spl}/L)$, and assuming that the slowly changing emf can be neglected, the difference equation which describes the change of the output current becomes

$$ i_{n+1} = \beta\, i_n + \frac{1-\beta}{R}\, u_{n-1}. \qquad (3) $$

introducing the complex images $i(z)$ and $u(z)$ by

$$ i(z) = \sum_{k=0}^{\infty} i_k z^{-k}, \qquad u(z) = \sum_{k=0}^{\infty} u_k z^{-k}, $$

the transfer function $W_P(z)$ of the plant can be obtained as

$$ W_P(z) = \left.\frac{i(z)}{u(z)}\right|_{EMF=0} = \frac{1-\beta}{R}\,\frac{1}{z\,(z-\beta)}. \qquad (4) $$

the structure of the current controller can be determined by applying the imc principle on $W_P$: the controller $W_C(z)$ is built from the inverse of the plant model, $1/W_P$, a term q which is adjusted to make $W_C$ feasible, and a design parameter that determines the response speed. applied to (4), the imc concept results in a pi controller, with both p- and i-gains determined by the design parameter. decoupled variation of the gains provides an additional degree of freedom which helps in meeting the desired performance. the current controller with the proportional action $K_P$ and the integral action $K_I$ can be described by the following transfer function,

$$ W_C(z) = K_P + K_I\,\frac{z}{z-1}. \qquad (5) $$

with transfer functions (2)-(5), the block diagram of the current controller is given in fig. 7, where the emf is assumed to be a slow, external disturbance, the effects of which are reduced by the integral action of the controller.

fig. 9 execution of the control interrupt just before the pwm carrier reaches the next zero-count or period-count. the interrupt must start at least texe before the rollover of the pwm carrier. the worst case execution of the interrupt should not exceed texe.

the open loop transfer function $W_{OL} = W_C W_P W_F$ is equal to

$$ W_{OL}(z) = \frac{(K_P + K_I)\,z - K_P}{z-1}\cdot\frac{1-\beta}{R\,z\,(z-\beta)}\cdot\frac{z^2+2z+1}{4z^2}. \qquad (6) $$

introducing the relative gains p and i,

$$ p = K_P\,\frac{1-\beta}{4R}, \qquad i = K_I\,\frac{1-\beta}{4R}, \qquad (7) $$

the open loop transfer function becomes

$$ W_{OL}(z) = \frac{[(p+i)\,z - p]\,(z^2+2z+1)}{z^3\,(z-1)(z-\beta)}. \qquad (8) $$

the closed loop transfer function is

$$ W_{CL}(z) = \frac{i(z)}{i^*(z)} = \frac{W_C W_P}{1+W_{OL}} = \frac{4(p+i)\,z^3 - 4p\,z^2}{z^5 - (1+\beta)\,z^4 + (\beta+p+i)\,z^3 + (p+2i)\,z^2 + (i-p)\,z - p}. \qquad (9) $$

the optimum gain setting and the resulting performances are discussed in section 5.

4.2. the schedule with the control interrupt executed before the counter event

if the execution time of the control interrupt represents a negligible fraction of the sampling time, then the delay of texe can be neglected in fig. 9. in this case, the feedback signal $i^f_n$ is obtained as $0.25\,(i_{n-2} + 2\,i_{n-1} + i_n)$, and it is used to obtain the voltage command $u_n$, which gets applied on the interval [ntspl .. (n+1)tspl]. in this case,

$$ i_{n+1} = \beta\, i_n + \frac{1-\beta}{R}\, u_n, \qquad (10) $$

which leads to the transfer function $W_P(z)$ of the plant

$$ W_P(z) = \left.\frac{i(z)}{u(z)}\right|_{EMF=0} = \frac{1-\beta}{R}\,\frac{1}{z-\beta}. \qquad (11) $$

the open loop transfer function $W_{OL} = W_C W_P W_F$ of the system in fig. 9 becomes

$$ W_{OL}(z) = \frac{[(p+i)\,z - p]\,(z^2+2z+1)}{z^2\,(z-1)(z-\beta)}. \qquad (12) $$

the closed loop transfer function becomes

$$ W_{CL}(z) = \frac{i(z)}{i^*(z)} = \frac{W_C W_P}{1+W_{OL}} = \frac{4(p+i)\,z^3 - 4p\,z^2}{z^4 + (p+i-1-\beta)\,z^3 + (\beta+p+2i)\,z^2 + (i-p)\,z - p}. \qquad (13) $$

due to the reduced delays, the order of the system is reduced from the 5th down to the 4th, providing the potential to improve the bandwidth and robustness of the system.

4.3. decoupling of d-axis and q-axis

the plant has two inputs, the voltages in the d-axis and the q-axis. the two outputs are the corresponding currents. therefore, it is a multi input, multi output system (mimo).
in cases where the transient phenomena in the orthogonal axes are decoupled, it is possible to design and tune the current controller in a simplified way, considering a single input, single output system (siso). adopting the complex vector notation, where the current error is expressed as $\Delta i = \Delta i_d + j\,\Delta i_q$, while the output current is $i = i_d + j\,i_q$, the product $W_C W_P = i(z)/\Delta i(z)$ in fig. 9 is a complex number. in cases where the axes are coupled, this number has a non-zero imaginary part. in cases where the controller $W_C$ cancels the undesired dynamics of the plant $W_P$ and achieves complete decoupling, the transfer function $W_C(z) W_P(z)$, as well as the closed loop transfer function $i(z)/i^*(z)$, do not have an imaginary part. one such example is given in [11, iv.e], where the equation (12) represents the resulting closed loop transfer function. in such cases, it is possible to adopt the siso approach in parameter tuning.

some key considerations on decoupling and tuning of the current controllers are given in [8], [11], [12], [14], and [19]. the axis decoupling can be obtained by using the s-domain imc approach, as shown in [5-7]. in the absence of additional delays, the approaches of [5-7] would provide a decoupled operation of the current controller. yet, any digital implementation of the current controller is time-discrete, and it involves additional time delays. there are a number of valuable contributions that deal with the current controller in the s-domain, and they provide useful insights to readers [12], [14], [19]. in the s-domain, the transport delays have to be modeled by a rational approximation, most frequently by the padé approximation. at the same time, discrete-time integrators are represented by the tustin approximation. designing and tuning digital current controllers in the s-domain is limited by the accuracy with which the discrete-time phenomena can be represented in the s-domain. consequential errors are more emphasized in the frequency range next to the desired bandwidth.
the errors get more visible as the frequency comes close to the desired bandwidth, which is close to 20% of the pwm frequency in cases with regular-sampling-double-update. in addition, the rational approximation of delays also introduces non-minimum phase phenomena that do not correspond to the behavior of the actual system. the above mentioned problems do not exist in cases where the controller design relies on direct digital design, and in particular on the implementation of the imc concept in the z-domain. in such cases, the errors introduced by the s-domain representation of discrete-time phenomena are absent.

in [19], the current controller is designed in the s-domain, with approximation of discrete-time phenomena. the transport delay is equal to 3/2 of the sampling period, and it is approximated by the padé delay. this approximation has a phase error of 1 degree at 10% of the sampling frequency. the error rises to 8 degrees at 20% of the sampling frequency. the phase shift of the relevant vectors due to the delay is compensated by introducing a "lead" compensation which rotates the vectors by an angle of $\omega T_{delay}$. the integrators are represented by the tustin approximation. notwithstanding the decoupling measures, some cross-coupling effects remain due to delays. the remaining cross coupling is seen from the non-diagonal elements. these cross coupling effects are seen in the differential equation [19, eq. 17] and the plant transfer function [19, eq. 18]. the cross coupling is greatly reduced by the mimo design and a systematic procedure for an accurate tuning of the pi controller. the non-minimum phase due to the numerator of [19, eq. 23] causes some inaccuracy at the very beginning of transients.

4.4. direct digital design with z-domain imc

the effects caused by approximations inherent to the s-domain design are also seen in the first four controllers considered in [11].
in the design iv.e of [11], where the approximations are not used and the imc design is applied in the z-domain, the open loop transfer function $W_{OL}(z) = i_{dq}(z)/\Delta i_{dq}(z)$ and the closed loop transfer function $W_{CL}(z) = i_{dq}(z)/i^*_{dq}(z)$ do not have the cross coupling terms, and do not include delay-dependent factors. the function $W_{CL}(z)$ in [11, eq. 12] represents the ratio between the output current $i_{dq}(z) = i_d(z) + j\,i_q(z)$ and the corresponding reference. the imaginary part of this $W_{CL}(z)$ is equal to zero. therefore, the q axis output current is not affected by the d axis reference, and vice versa. the input step response obtained with the current controller of [11, iv.e] does not depend on the excitation frequency, proving the effective decoupling.

it has to be noticed at this point that the above conclusions consider the closed loop transfer function. the same does not hold for the disturbance transfer function, which relates the output to the voltage disturbance. while the imc-designed controller $W_C$ of fig. 9 gets multiplied with $W_P$ to obtain the product $W_P W_C$, which is free from any coupling terms, the electromotive force in fig. 9 acts between $W_P$ and $W_C$. this results in a disturbance transfer function which comprises the factor $W_P$ on its own, without getting multiplied by $W_C$. thus, the undesired coupling does not get canceled in the disturbance transfer function, which depends on the fundamental frequency even in systems with ddc-imc designed controllers. the same holds in all the competing current controllers [14], where the response of the output current to changes in the electromotive force depends on the fundamental frequency.

the disturbance response of the current controller is of considerable importance, but it falls out of the scope of this paper. at the same time, the electromotive force in synchronous permanent magnet motors comes as a product of the constant flux of the magnets and the revolving speed.
therefore, the changes in the electromotive force are determined by the speed changes, which are considerably slower than the current loop transients. in grid connected inverters, where the electromotive force gets substituted by the mains voltage, the disturbance transfer function is of particular importance. namely, it describes the capability of the current loop to reject the low order harmonics of the grid and prevent them from introducing distortion in the output current.

4.5. the ratio between proportional and integral gains

the imc design in the s-domain results in $W_C(s) = \alpha R/s + \alpha L + j\omega\alpha L/s$. the ratio between the proportional gain $\alpha L$ and the integral gain $\alpha R$ is defined by the electrical time constant, and it has to be maintained in order to cancel the undesired plant dynamics. direct digital design and the use of the imc method in the z-domain results in the current controller given in [11, fig. 10] and [11, eq. 11]. it has the proportional gain of $\alpha L/T_{spl}$ and the integral gain of $\alpha R$. hence, the ratio between the proportional and integral gain has to remain equal to (1/tspl)(l/r) in order to maintain the desired decoupled operation.

within the closed loop transfer function $W_{CL}$, the pulse transfer function of the controller $W_C$ is multiplied by the plant $W_P$ to obtain the product $W_C W_P = i_{dq}(z)/\Delta i_{dq}(z)$. with the ddc-imc design [11, iv.e], $W_C W_P$ does not have any coupling terms. relying on that, it is possible to add the term $W_F(z)$ in the feedback path and to perform the gain setting assuming that the coupling between the orthogonal axes does not exist. as long as the function $W_F(z)$ does not introduce coupling terms of its own, the loop gain $W_C W_P W_F$ and the closed loop transfer function $W_C W_P/(1 + W_C W_P W_F)$ will remain coupling-free.

the previous conclusions can be applied to fig.
9, where the ddc-imc designed controller wc multiplies the plant transfer function wp and provides the direct-path gain wpwc with no off-diagonal (coupling) elements. with proper implementation, the transfer function wf of (2) does not introduce any coupling elements. therefore, the analysis and the parameter setting of the system with one-pwm-period averaging in the feedback path can be performed by adopting the siso approach, as already done in the previous subsections. while applying the siso design procedure, particular attention has to be paid to the parameter setting. namely, the choice of the proportional and integral gains is not free. in order to maintain the decoupled operation, the gains have to maintain the ratio (1/tspl)(l/r).

5. the optimum parameter setting

the design and tuning of digital current controllers have attracted considerable attention. a comprehensive and instructive summary of the most relevant controllers is given in [11]. it includes several s-domain approaches to designing and tuning the digital current controllers, as well as the case of direct digital control (ddc) with z-domain implementation of the imc concept. while the other approaches introduce a number of s-domain approximations of discrete-time features, and therefore introduce additional coupling terms, the approximation-free ddc-imc concept provides complete decoupling. the controller wc comprises the basic proportional and integral actions along with several decoupling terms that include exp(jωtspl) elements. therefore, the parameter tuning procedure has to establish the proportional and integral gains that provide the desired response. in the current controller design iv.e of [11], the open loop transfer function and the closed loop transfer function do not have cross-coupling terms, and they do not include delay-dependent factors.
based upon that, we conclude that in systems with a ddc-imc designed controller, the feedback filtering and the parameter setting procedure can be performed on a simplified, more transparent and more intuitive basis. in order to keep the decoupled response, it is necessary to maintain the ratio between the proportional and integral gains, which reduces the gain tuning to the selection of a single parameter.

the performance of both closed loop transfer functions (9) and (13) depends on the closed loop gains p and i. with characteristic polynomials of the 4th and the 5th order, it is difficult to find an analytical relation between the gains and the closed loop performance. the parameter setting procedure proposed in this section comprises (a) the definition of the performance criterion, and (b) the search of the p-i plane for the point which offers the best performance.

in order to obtain a fast, high-bandwidth response with well damped waveforms, the performance criterion includes the settling time t01. the value of t01 is obtained from the closed loop step response. following the input step, the output of the system moves towards the target, and it settles on the target exponentially, or with some damped oscillations. after the time t01, the output error falls into a ±1% wide strip. following t01, the error does not leave the strip, unless another input disturbance is received. the interval t01 is called the "1% settling time", and it is of interest to keep it as small as possible. using the settling time as the performance criterion is an effective way of discarding responses which have a high bandwidth and a short rise time but are poorly damped, approaching the target value with oscillations.

in addition to the settling time, it is also of interest to evaluate the robustness of the controller.
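as a minimal sketch of the settling-time criterion described above, the following python function returns the first instant after which a sampled step response stays inside the ±1% strip. the function name, the unit target and the first-order test response are illustrative assumptions; they are not the closed loop transfer functions (9) or (13) of this paper.

```python
import numpy as np

def settling_time(t, y, target=1.0, band=0.01):
    """Return the first time after which |y - target| stays within band*|target|.

    Returns None if the response is still outside the strip at the last sample.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    outside = np.abs(y - target) > band * abs(target)
    if outside[-1]:
        return None              # not settled within the recorded window
    if not outside.any():
        return t[0]              # the response never leaves the strip
    last_outside = np.flatnonzero(outside)[-1]
    return t[last_outside + 1]   # first sample after the last excursion

# Illustrative first-order response y[k] = 1 - 0.9**k (an assumption, not the paper's system).
k = np.arange(200)
y = 1.0 - 0.9 ** k
t01 = settling_time(k, y)        # expressed in samples, because t = k
```

because the function looks for the last excursion out of the strip, an oscillatory response that re-enters and leaves the strip several times is penalized, which is the behavior the criterion is meant to capture.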
due to on-line changes in the system parameters, such as the ac grid impedances in a grid connected power converter, it is of interest to maintain the stability and the character of the response in the presence of variable parameters. the robustness of the system can be measured by the vector margin (vm), as effectively used in [11]. the vector margin is usually calculated from the open loop transfer function wol(z). for the given excitation frequency ω, the argument z is exp(jωtspl). while ω sweeps from 0 up to the nyquist frequency, the values of wol(z) are complex numbers which move in the complex plane and draw a graph. for the system to be stable, this graph must not pass through the point (-1, j0). the robustness can be judged from the minimum distance (radius) between the graph and the point (-1, j0). the value of the radius is called the vector margin. with vm < 0.5, one would expect an oscillatory response that is likely to pass into instability in the case of a significant parameter change. the motivation for using a larger vm comes from the fact that magnetic saturation in electrical machines has a considerable effect on the equivalent inductance of the stator winding.

in this paper, there are two search runs for the optimum parameters. the first search assumes that the feedback gains p and i can be changed independently, while the second search respects the need to maintain a constant p/i ratio.

5.1. parameter search in p-i plane

the parameter search performed in this subsection assumes that the feedback gains p and i can be changed independently. in other words, it is assumed that the ratio p/i does not have to be kept constant. the optimum gains p and i are searched for the closed loop transfer functions of (9) and (13). the search space is a domain in the 2-dimensional p-i plane, limited by 0 < p < 1 and 0 < i < 1.
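the vector margin used as the robustness measure in this section can be evaluated numerically by sweeping z = exp(jωtspl) along the unit circle and taking the smallest distance between wol(z) and the point (-1, j0). the python sketch below is a hedged illustration: the function names and the example loop (an integrator with one sample of delay) are assumptions chosen for brevity, and they are not the open loop transfer function of the controller considered in this paper.

```python
import numpy as np

def vector_margin(w_ol, n_points=20000):
    """Minimum distance between the Nyquist plot of w_ol and the point (-1, j0).

    w_ol maps z = exp(j*theta) to a complex gain, where theta = omega * T_spl
    runs from just above 0 up to pi (the Nyquist frequency).
    """
    theta = np.linspace(1e-4, np.pi, n_points)
    z = np.exp(1j * theta)
    return float(np.min(np.abs(1.0 + w_ol(z))))

def example_loop(z, k=0.25):
    """Hypothetical open loop: integrator with gain k and one sample of delay."""
    return k / (z * (z - 1.0))

vm = vector_margin(example_loop)
```

in a gain search, a function of this kind would be called once per candidate pair (p, i), and pairs with a vector margin below the chosen threshold would be discarded.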
the search method is rather simple. it starts by selecting a large number of equally spaced discrete gains along both axes, proceeds by calculating the performance for each pair of gains (p, i), and ends by selecting the best pair of gains according to the design criteria. the optimum gains are searched for the execution schedules of figs. 7 and 8. in both cases, the search method provided the optimum gains (p, i), the frequency f45 where the phase of wcl drops to -45°, the frequency fbw where the amplitude of wcl drops to -3 db, the vector margin vm, and the overshoot of the step response. all the results are obtained with fpwm = 10 khz. the results are summarized in table 1. in cases with vm > 2/3 (p < 0.0777 in table 1), the character of the closed loop response is maintained for a wide range of parameter changes. the step responses are compared in fig. 10.

it has to be noted in table 1 that, although the ratio p/i is not fixed, the ratio between the optimum gains remains close to (1/tspl)(l/r). the ratio p/i is equal to 119 for the schedule of fig. 8, and 131 for the schedule of fig. 9. for the given motor, the ratio (1/tspl)(l/r) required for the decoupled operation is equal to 144. hence, in a way, the search procedure finds the optimum close to the area which secures decoupled operation. it is therefore of interest to perform the search procedure where the ratio p/i is kept constant and equal to (1/tspl)(l/r).

5.2. parameter search with constant p/i ratio

although there are two gains, p and i, they have to maintain the same ratio, determined by the parameters l and r, in order to preserve the proper decoupling between the d and q axes [11], [14]. therefore, it is of interest to consider the changes of the gain p, assuming that the ratio p/i remains unaltered and equal to (1/tspl)(l/r). in this case, the search results are given in table 2.
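the gain ratios quoted in this section can be checked with a few lines of python. the motor data (l = 3.4 mh, r = 0.47 Ω) and fpwm = 10 khz are taken from the text; the sampling period tspl = tpwm/2 is an assumption, inferred from the statement in the conclusions that the sampling frequency is twice the switching frequency. function and variable names are illustrative.

```python
# Motor and inverter data quoted in the text
L = 3.4e-3       # stator inductance [H]
R = 0.47         # stator resistance [ohm]
F_PWM = 10e3     # switching frequency [Hz]

# ASSUMPTION: two samples per PWM period, so T_spl = T_pwm / 2
T_SPL = 1.0 / (2.0 * F_PWM)

def imc_ratio(l, r, t_spl):
    """Ratio P/I = (1/T_spl) * (L/R) required for decoupled operation."""
    return l / (r * t_spl)

def integral_gain(p, l=L, r=R, t_spl=T_SPL):
    """Integral gain that keeps the P/I ratio fixed for a chosen gain P."""
    return p / imc_ratio(l, r, t_spl)

# Ratios of the unconstrained optimum gains listed in table 1
table1_ratios = {
    "fig. 8": 0.0442 / 0.00037,
    "fig. 9": 0.0708 / 0.00054,
}
```

with these values, l/(r·tspl) evaluates to about 144.7, in line with the value of 144 quoted above, and the table 1 gain pairs give ratios of about 119 and 131.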
with an overshoot of 2.64%, the closed loop bandwidth reaches 20% of the switching frequency. in table 2, the gain p sweep from 100% to 150% of the optimum value makes the overshoot increase from 3.4% up to 22%, while the closed loop bandwidth increases from 21% up to 33% of the switching frequency. starting with the optimum gain setting, the gain reduction of 50% leads to an overshoot of 0%, while the closed loop bandwidth drops to 7% of the switching frequency. the stability limit is reached with the gain equal to 410% of the optimum value.

comparable state-of-the-art solutions are summarized in [11] and [14]. they do not use one-period averaging, and rely instead on regular sampling with double update, with one sample in each tpwm/2. in table 1 of [11], the closed loop bandwidth of a well damped, low overshoot response reaches 10% of the sampling frequency (20% of the switching frequency). in figs. 12-14 of [14], a well damped, low overshoot response has a rise time of 5-6 sampling periods. the corresponding bandwidth is, roughly, 0.35/(5ts) = 0.07fs (14% of the switching frequency). the solution devised in this paper has an additional delay, caused by the feedback averaging. with the devised control methods, the resulting closed loop bandwidth of table 2 is better than that of comparable solutions.

table 1 performance factors obtained with the optimum gains and with the execution schedules of figs. 7 and 8. the parameter search is performed in two-dimensional space, with unconstrained proportional and integral gains.

schedule   p        i         f45 [hz]   fbw [hz]   vm      overshoot [%]   t01
fig. 8     0.0442   0.00037   541        1177       0.677   0.84            21
fig. 9     0.0708   0.00054   994        1882       0.705   0.67            9

table 2 parameter search performed for the schedule of fig. 9, with a fixed ratio between the proportional and integral gains.

schedule   p       fbw [hz]   vm      overshoot [%]
fig. 9     0.065   1607       0.722   0.42
fig. 9     0.067   1687       0.715   0.75
fig. 9     0.071   1862       0.701   1.61
fig. 9     0.075   2005       0.689   2.64
fig. 9     0.077   2116       0.679   3.45
fig. 9     0.081   2252       0.668   4.8
fig. 9     0.086   2474       0.648   6.8
fig. 9     0.091   2618       0.636   8.4
fig. 9     0.095   2753       0.623   10.2
fig. 9     0.1     2912       0.607   12.1
fig. 9     0.116   3382       0.553   22

6. experimental results

the experimental verification of the two current controllers is performed on an experimental setup which comprises a synchronous motor with surface mounted magnets, a pwm inverter and a digital control platform. the stack length of the motor is l = 128 mm, and it has 6 poles. the rated torque is 7.3 nm while the rated current is 7.3 arms. the motor has a stator resistance of 0.47 Ω and an inductance of 3.4 mh. the pwm inverter has the dc-bus voltage of edc = 520 v and the switching frequency of 10 khz. the rated lockout time is set to 3 μs. the digital control platform uses the tms320f28335 dsp. it has an adc unit with 12-bit resolution and 16 input channels. the oversampling mechanism acquires 32 samples per base period. the sampling and storing are automated by the embedded dma machine. the anti-aliasing filters are designed as passive rc filters, using the standard procedures, and taking into account the fact that the effective sampling frequency is 32 times larger. with one-pwm-period averaging, the effective nyquist frequency is increased 32 times, as well as the desired cutoff frequency of the analog anti-aliasing filters. therefore, it is possible to design a passive anti-aliasing filter that removes any residual noise, while having the cutoff frequency considerably above the desired bandwidth. thus, the impact of such an anti-aliasing filter on phase lag, delays and the closed loop dynamics is negligible, and it has no detrimental effects on the closed loop response.

in most reports on digital ac current controllers, the experimental waveforms of the output current in the dq frame are calculated by the dsp controller, and then written to a dac or copied into a pc. a similar procedure is not feasible with the present system of fig.
9, where the output current idq(z) is filtered through the block wf(z) to obtain the feedback signal i^f_dq(z). it has to be noted at this point that the only signal available to the dsp controller is the feedback signal. for that reason, the output current idq(z) cannot be observed from the registers of the dsp controller. the only way to access the output current, instead of the averaged feedback, is direct measurement of the actual motor current. the phase current does reflect the changes in id(t) and iq(t), but it is also affected by the rotor position. therefore, the experimental traces in fig. 11 are obtained with the rotor locked in the position where the measured phase current is equal to the q-axis current. it has to be noted, though, that this approach does not allow the experimental verification of the axes decoupling at high speeds. in this regard, the authors rely on the analytical and experimental findings of [11], which considers the direct digital design and the implementation of the imc approach in the z-domain, the approach also used in this paper. the results of [11] prove that any coupling is removed, while the input-step response does not depend on the excitation (fundamental) frequency.

the execution of the control interrupt takes 3.5 μs on the selected dsp platform. therefore, for the scheduling scheme of fig. 9, the interrupts were triggered 4 μs prior to each pwm counter event. the experimental traces are given in fig. 11, showing a reasonable similarity with the simulation results shown in fig. 10. all the measurements were done at zero speed, with the rotor locked in the position where the measured phase current corresponds to the q-axis current.

fig. 10 step response with execution schedules outlined in figs. 7 and 8.

fig. 11 experimental step responses with schedules of figs. 7 and 8.

7. conclusions

this paper deals with the practical implementation of digital current controllers, which are the key elements in the majority of contemporary static power converters. high performance current controllers are required in electrical drives and also in grid connected converters. a high bandwidth and considerable robustness are of utmost importance in all applications of digital current controllers. in this paper, a novel approach to acquiring and filtering the feedback signal is proposed. the devised approach is free from sampling errors and it introduces only a minimum delay into the feedback path. we also consider the transport delays in the voltage actuation path and the transport delays in the feedback path. the proposed parameter setting procedure takes the delays into account, and it meets both the bandwidth requirements and the requirements for robustness against noise and parameter changes. the results are verified by simulation and also on an experimental setup comprising a brushless dc motor, a pwm inverter and a dsp-based control platform. for the relative gain p = 0.075, and for the gain ratio p/i determined by the imc procedure, the closed loop bandwidth reaches 20% of the switching frequency (that is, 10% of the sampling frequency) with an overshoot of 2.64% and a vector margin of 0.689. the system parameters can change by more than a factor of 4 before the system enters the instability region. the devised control solutions have the potential of reducing the noise sensitivity and improving the closed loop performance of digital current controllers applied in 3-phase ac drives and in grid connected power converters.

references

[1] e. levi, “foc: field oriented control,” in the industrial electronics handbook (power electronics and motor drives), 2nd ed., boca raton, fl, usa: crc press, 2011.
[2] d. g. holmes, b. p. mcgrath, and s. g. parker, “current regulation strategies for vector-controlled induction motor drives,” ieee trans. ind. electron., vol. 59, no. 10, pp.
3680–3689, oct. 2012.
[3] j.-w. choi and s.-k. sul, “fast current controller in three-phase ac/dc boost converter using d-q axis cross-coupling,” ieee trans. power electron., vol. 13, no. 1, pp. 179–185, jan. 1998.
[4] j.-w. choi and s.-k. sul, “new current control concept minimum time current control in the three-phase pwm converter,” ieee trans. power electron., vol. 12, no. 1, pp. 124–131, jan. 1997.
[5] l. harnefors and h. p. nee, “model-based current control of ac machines using the internal model control method,” ieee trans. ind. appl., vol. 34, no. 1, pp. 133–141, jan./feb. 1998.
[6] f. briz, m. degner, and r. lorenz, “dynamic analysis of current regulators for ac motors using complex vectors,” ieee trans. ind. appl., vol. 35, no. 6, pp. 1424–1432, nov./dec. 1999.
[7] f. briz, m. degner, and r. lorenz, “analysis and design of current regulators using complex vectors,” ieee trans. ind. appl., vol. 36, no. 3, pp. 817–825, may/jun. 2000.
[8] j.-s. yim, s.-k. sul, b.-h. bae, n. patel, and s. hiti, “modified current control schemes for high-performance permanent-magnet ac drives with low sampling to operating frequency ratio,” ieee trans. ind. appl., vol. 45, no. 2, pp. 763–771, mar./apr. 2009.
[9] j. holtz, j. quan, j. pontt, j. rodriguez, p. newman, and h. miranda, “design of fast and robust current regulators for high-power drives based on complex state variables,” ieee trans. ind. appl., vol. 40, no. 5, pp. 1388–1397, sep./oct. 2004.
[10] b.-h. bae and s.-k. sul, “a compensation method for time delay of full digital synchronous frame current regulator of pwm ac drives,” ieee trans. ind. appl., vol. 39, no. 3, pp. 802–810, may/jun. 2003.
[11] h. kim, m. degner, j. guerrero, f. briz, and r. lorenz, “discrete-time current regulator design for ac machine drives,” ieee trans. ind. appl., vol. 46, no. 4, pp. 1425–1435, jul./aug. 2010.
[12] d. g. holmes, t. a. lipo, b. mcgrath, and w.
kong, “optimized design of stationary frame three phase ac current regulators,” ieee trans. power electron., vol. 24, no. 11, pp. 2417–2426, nov. 2009.
[13] h.-t. moon, h.-s. kim, and m.-j. youn, “a discrete-time predictive current control for pmsm,” ieee trans. power electron., vol. 18, no. 1, pp. 464–472, jan. 2003.
[14] a. g. yepes, a. vidal, j. malvar, o. lopez, and j. d. gandoy, “tuning method aimed at optimized settling time and overshoot for synchronous proportional-integral current control in electric machines,” ieee trans. power electron., vol. 29, no. 6, pp. 3041–3054, jun. 2014.
[15] s. h. song, j. w. choi, and s. k. sul, “current measurements in digitally controlled ac drives,” ieee ind. appl. magazine, vol. 6, no. 4, pp. 51–62, jul./aug. 2000.
[16] l. r. carley, “an oversampling analog-to-digital converter topology for high resolution signal acquisition systems,” ieee trans. circuits sys., vol. cas-34, pp. 83–90, 1987.
[17] a. said and a. h. kamal, “a modeling technique to analyze the impact of inverter supply voltage and cable length on industrial motor-drives,” ieee trans. power electron., vol. 23, no. 2, pp. 753–762, mar. 2008.
[18] l. corradini, w. stefanutti, and p. mattavelli, “analysis of multi-sampled current control for active filters,” ieee trans. ind. appl., vol. 44, no. 6, pp. 1785–1794, nov./dec. 2008.
[19] f. d. freijedo, a. vidal, a. g. yepes, j. m. guerrero, o. lopez, j. malvar, and j. doval-gandoy, “tuning of synchronous-frame pi current controllers in grid-connected converters operating at a low sampling rate by mimo root locus,” ieee trans. ind. electron., vol. 62, no. 8, pp. 5006–5017, aug. 2015.
[20] s. n. vukosavic, s. l. peric, and e. levi, “ac current controller with error-free feedback acquisition system,” ieee trans. energy convers., accepted for publication, doi 10.1109/tec.2015.2477267.
[21] s. n. vukosavic, s. l.
peric, “a modified digital current controller with reduced impact of transport delays,” iet electric power appl., under review (epa-2015-0507).

facta universitatis
series: electronics and energetics vol. 31, no 1, march 2018, pp. 101-113
https://doi.org/10.2298/fuee1801101w

a system-on-chip 1.5 ghz phase locked loop realized using 40 nm cmos technology

weiyin wang 1, xiangjie chen 1, hei wong 2

1 institute of photonics and microelectronics, school of information sciences and electronic engineering, zhejiang university, hangzhou, china
2 department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong

abstract. this work presents the design and realization of a fully-integrated 1.5 ghz sigma-delta fractional-n ring-based pll for system-on-chip (soc) applications. several design optimizations were made to improve the performance of each functional block, such as the phase frequency detector (pfd), voltage-controlled oscillator (vco), filter and charge pump (cp), and thereby of the whole system. in particular, a time delay circuit is designed to overcome the blind zone in the pfd; an operational amplifier-feedback structure is used to eliminate the current mismatch in the cp; a 3rd-order lpf is used to suppress noise; and a current overdrive structure is used in the vco design. the design was realized with a commercial 40 nm cmos process. the core die size is about 0.041 mm2. measurement results indicate that the circuit functions well over the locked range from 500 mhz to 1.5 ghz.

key words: pll, blind zone, current mismatch, ring oscillator

1. introduction

phase-locked loops (plls) are widely used in modern digital and communications systems for frequency synthesis, clock generation, retiming, clock signal recovery, etc. [1-6].
with the fractional-n feature, variable frequency clocks can be generated for the operation of different communication and digital sub-systems in a mixed signal system-on-chip (soc) design. to meet some specific system requirements, a pll was usually realized with an analog charge pump, an lc oscillator, and a loop filter with large component values, which require a large silicon area and a non-standard digital/analog cmos fabrication process [3, 6-7]. this work presents a fully integrated fractional-n pll design solution based solely on a commercial 40 nm cmos process without significant performance tradeoff.

received april 7, 2017; received in revised form october 3, 2017
corresponding author: hei wong, department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong (e-mail: eehwong@cityu.edu.hk)

figure 1 shows the system block diagram of a fractional-n pll. it consists of a phase frequency detector (pfd), charge pump (cp), low-pass filter (lpf), voltage-controlled oscillator (vco) and multi-modulus divider (mmd) for sigma-delta modulation. the whole structure is configured as a negative feedback system that keeps the output signal frequency, fout, tracking the reference frequency, fref. if there is a difference in frequency or phase, the pfd outputs the control pulses up and dn, which are fed into a charge pump and converted into a single-ended current. a low-pass filter filters out the high-frequency components and outputs a dc voltage for the vco control. that is, the vco voltage is proportional to the phase error between fref and the vco output frequency, fout, or its fraction as divided by the mmd. a generalized transfer function describing this feedback system is given by

$$ h(s) = \frac{\dfrac{i_{cp}}{2\pi}\, h_{lpf}(s)\, \dfrac{k_{vco}}{s}}{1 + \dfrac{i_{cp}}{2\pi}\, h_{lpf}(s)\, \dfrac{k_{vco}}{s}\, \dfrac{1}{n}} \qquad (1) $$

where icp/2π is the current generated by the charge pump per unit phase angle; hlpf(s) is the transfer function of the low-pass filter.
kvco/s is the transfer function of the vco; and 1/n represents the frequency division for fractional-n operation.

fig. 1 major constituents of a fractional-n pll.

the system performance is governed by various characteristics of the building blocks. the blind zone of the pfd and the mismatch of the control currents of the charge pump will cause the vco to output an incorrect signal frequency and cause a large phase noise in the pll [8-13]. a high-order active filter would lead to a better loop and better suppression of high-order harmonics. however, it may be costly to implement, and conventional integrated plls often leave the filter as an external component, which allows users to have custom designs. the characteristics of the vco will greatly affect the overall performance of the pll [7]. within the constraints of the given design rules, available component types and values of the target 40 nm cmos process, we designed a fully integrated fractional-n pll with some special circuit configurations. in the pfd circuit, the conventional d-type flip-flop based phase detector configuration [14] was used, into which we introduced a delay line with approximately 280 ps delay to eliminate the blind zone [14]. in the cp circuit, with reference to some recently reported configurations [9-12], a feedback mechanism built around an op amp was established in order to make the up and dn currents track each other. to suppress system noise, a 3rd-order passive lpf was used. in the vco circuit, an io device was used to control the ring oscillator. preliminary results of this design have been reported in ref. [15]. this paper presents a more detailed analysis of the circuit configuration, results, circuit constraints and methodologies for further performance improvement.

2. design methodologies

2.1. phase frequency detector

the phase frequency detector (pfd) generates a pulse output that is proportional to the phase difference between the input and output frequencies/phases [4, 12]. in this work, we construct the pfd using the conventional d-type flip-flop based circuit which is typical in commercial pll designs [14]. figure 2(a) gives the specific circuit of the pfd. the d flip-flops make the pfd sensitive to the rising edges of fref or fout only. in connection with the charge pump (cp), the high output of the top d flip-flop enables the up current of the charge pump, whereas the output of the lower d flip-flop enables the dn current. at the rising edge of the reference signal fref, the up signal changes from 0 to 1. it may remain high even after the rising edge of the feedback signal fout, which changes the dn output from 0 to 1 as well. at this point, both signals are fed into the and gate, which resets the up and dn signals simultaneously. this results in a blind zone for the up and dn signals. the blind zone makes the cp insensitive to small phase errors and results in a large phase noise of the pll. thus, a measure to eliminate the blind zone needs to be introduced. here we introduce a delay line (dl) of about 280 ps in order to eliminate the blind zone. with this configuration, dn will be reset for a period of dl after the up signal and that eliminates the blind zone. the detailed operation of this circuit can be understood with the aid of the state diagram given in fig. 2(b).

figure 2(b) highlights the three states of the up and dn signal generation for the pfd. when a rising edge of fout is detected, there is a positive transition from the cp. when the system starts up, the pfd is in “state 0” (up = 0, dn = 0). when a rising edge of fref arrives, the pfd changes from “state 0” to “state 1” (up = 1, dn = 0). if a rising edge of fout then arrives, the pfd goes back to “state 0”; if another rising edge of fref is detected instead, the pfd stays in “state 1”.
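the state diagram of fig. 2(b) can be written as a small transition table. the python sketch below is an idealized behavioral model under stated assumptions: the names are hypothetical, only rising edges are modeled, and the short interval set by the delay line, during which up and dn are both high before the reset, is ignored. the “state 2” branch follows the description given in the text that continues below.

```python
from enum import Enum

class PfdState(Enum):
    STATE_0 = 0   # up = 0, dn = 0
    STATE_1 = 1   # up = 1, dn = 0
    STATE_2 = 2   # up = 0, dn = 1

# (current state, source of the rising edge) -> next state, following fig. 2(b)
TRANSITIONS = {
    (PfdState.STATE_0, "ref"): PfdState.STATE_1,
    (PfdState.STATE_0, "out"): PfdState.STATE_2,
    (PfdState.STATE_1, "ref"): PfdState.STATE_1,
    (PfdState.STATE_1, "out"): PfdState.STATE_0,
    (PfdState.STATE_2, "ref"): PfdState.STATE_0,
    (PfdState.STATE_2, "out"): PfdState.STATE_2,
}

# (up, dn) outputs associated with each state
UP_DN = {
    PfdState.STATE_0: (0, 0),
    PfdState.STATE_1: (1, 0),
    PfdState.STATE_2: (0, 1),
}

def run_pfd(edges, state=PfdState.STATE_0):
    """Return the (up, dn) pair reached after each rising edge in `edges`.

    Each element of `edges` is "ref" (rising edge of fref) or "out" (rising edge of fout).
    """
    outputs = []
    for edge in edges:
        state = TRANSITIONS[(state, edge)]
        outputs.append(UP_DN[state])
    return outputs

trace = run_pfd(["ref", "ref", "out", "out", "out", "ref"])
```

feeding the model a reference that leads the feedback keeps it toggling between “state 0” and “state 1”, so only up pulses are produced; a lagging reference gives the mirror-image behavior with dn pulses.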
when the system is in “state 0” and a rising edge of fout arrives, the pfd goes into “state 2” (up = 0, dn = 1). at this point, if a rising edge of fref arrives, it switches back to “state 0”. however, if a second rising edge of fout is detected, the pfd stays in “state 2”.

2.2. charge pump

the control signals up and dn generated by the phase detector are fed into the charge pump (cp) to control the current flow in the charge pump so as to produce a single current, which is further converted by the low-pass filter into a voltage for the vco control. the total charge, qcp, is proportional to the durations of the up and dn signals, namely

$$ q_{cp} = i_{up}\, t_{up} - i_{dn}\, t_{dn} \qquad (2) $$

where iup and idn are the up and dn currents, respectively, and tup and tdn are the pulse durations of the up and dn controls in the cp, respectively.

fig. 2 (a) schematic of phase frequency detector; and (b) state diagram showing the operation flow of the phase frequency detector.

in the ideal case, iup is always equal to idn. in practice, the currents may differ due to device mismatch, charge injection, clock feedthrough, channel-length modulation, etc. [8-13]. this issue becomes even worse in nano cmos circuits, as channel length modulation is more significant. to eliminate the mismatch between the up and dn currents, several circuit configurations such as the drain-switching charge pump, current steering charge pump, source-switching charge pump and cascade current source charge pump have been proposed [8-13]. in our work, we incorporate a comparator to keep the up and dn currents tracking each other. as shown in fig. 3, the drains of m1 and m3 are tied to the noninverting and inverting inputs of the op amp, respectively, which makes the drain voltages of m1 and m3, and then of m2 and m4, equal. it can be readily shown that the up current (id, m3) and the dn current (id, m4) are equal and are both governed by the mirror current of ibias. they are not affected by the output voltage, channel length modulation or size mismatch of the transistors.

fig. 3 schematic of the cp structure with a feedback path formed by an operational amplifier to keep the up and dn currents tracking each other.

2.3. low-pass filter

as shown in fig. 1, the low-pass filter (lpf) shapes the error current for the vco control. it governs the damping factor and natural frequency of the system. for the sake of flexibility and to save chip area, most commercial plls leave the lpf as an externally connected circuit, allowing maximum flexibility for specific application designs. for soc applications, this work implements the lpf with on-chip components and does not treat the filter as a tunable building block. to keep a reasonable system performance, the simple third-order passive rc filter shown in fig. 4 is used. the procedure for determining the component values is given in appendix a.

fig. 4 schematic of the low-pass filter circuit used in this work.

2.4. voltage-controlled oscillator

the voltage-controlled oscillator is one of the most important building blocks governing the performance of the pll. the key performance parameters of concern include:
(a) frequency tuning range: it determines the operation range of the pll.
(b) tuning gain: expressed in terms of v/hz, it indicates the change in voltage level as the frequency changes. it governs the overall gain of the whole system.
(c) phase noise level: the phase noise level of the vco is of particular importance in some applications, such as frequency synthesizers. it affects the stability of the system and is the major jitter source.
many advanced vco circuit configurations based on rc or lc structures have been proposed [1-3, 7]. in most high-performance oscillators and plls, lc structures are always favorable.
however, a high-quality-factor inductor requires a thick metal layer for implementation, which is not available in most cmos processes. it also requires a large silicon area. an rc based oscillator also requires a large chip area and is constrained in high-frequency operation. a ring-based vco has poor frequency stability and large phase noise, and it is more vulnerable to power and temperature fluctuations [7]. in addition, it was difficult to achieve very high frequency operation, and ring-based vcos were seldom used in high-performance plls. the advantage of the ring-based vco is that it is simple in circuit design: it is simply built with some cascaded inverters. a ring oscillator is very compact and can be realized with any cmos process. with the availability of nano cmos technology and some digital circuit techniques, high-performance and high-frequency ring-based plls have been obtained [6]. another factor that makes the ring-based pll more attractive is the need for a low-cost and readily available pll for full cmos system integration and soc applications. a ring-based vco together with nano cmos technology does provide a low-cost implementation of a pll with reasonable performance. in the present design, because of the available 40 nm gate length devices, high-speed operation can be readily obtained. the performance of the circuit is still acceptable with such a simple vco design, as will be demonstrated later.

in our design, we implemented the vco with five-stage cmos inverters. to achieve a higher operation frequency and better stability, the circuit is overdriven with a large-size io mosfet, m1, which operates at the high analog supply voltage avdd of 2.5 v.

fig. 5 schematic of a five-stage voltage control ring oscillator with current overdrive.

as shown in fig. 5, being biased at 2.5 v, m1 would produce a larger control current so as to generate a higher oscillation frequency. it also reduces the effects
a system-on-chip 1.5 ghz phase locked loop realized using 40 nm cmos technology 107 due to supply voltage, temperature, and process parameter fluctuations. note that the core ring oscillator, constituted by the five-stage inverters, was operated at digital supply voltage with dvdd = 1.1 v. to protect the vco output snw not to exceed dvdd, an operational amplifier op and level limiter m2 were used. if snw exceeds dvdd, transistor m2 will be turned on so as to lower the snw voltage level to a value equal to dvdd. 3. testing and validating we have realized the designed pll with a commercial 40 nm cmos process. because such short-gate length transistors are available, high-frequency operation can be readily obtained. the layout of the design is shown in fig. 6. major building blocks, pfd, cp, lpf, vco, mmd, sigma-delta modulator are highlighted. the chip size of core functional block (excluding pads and ios) is about 250 μm × 165 μm or 0.041 mm 2 .which is half size of the latest digital fractional-n pll realized using 65 nm technology [6]. the circuit is operated with both digital and analog supply voltages of 1.1 v and 2.5 v, respectively. the locked range of the chip in from 500 mhz to 1.5 ghz. the overall power consumptions are 1.5 mw and 2.8 mw at 750 mhz and 1.5 ghz, respectively. (a) (b) fig. 6 (a) layout of a system chip with embedded ppl designed in this work; (b) layout of designed pll. the size of the chip is 250 μm×165 μm or 0.041 mm 2 . it can be seen that although the low-pass filter is rather simple from the circuit configuration point of view, it occupies the largest portion of chip area for realizing the few passive capacitive and resistive components. compared to other soc designs, the chip area of lpf is larger because the use of third order filter. the size of vco is comparatively compact because the use of ring oscillator structure and minimum size inverters. designed ppl 108 w. wang, x. chen, h. 
wong the preliminary testing results of the fabricated pll chip are shown in fig.7. figure 7(a) shows the measured output phase noises for the vco output at 1.5 ghz. the lowfrequency (1 khz) and high-frequency (1 mhz) noise level are -62 dbc/hz and -81 dbc/hz, respectively. figure 7(b) depicts the output spectrum of vco at 1.5 ghz with 300 mhz frequency span. the peak value 13.42 dbm and a number of spurious peaks at various frequencies such as ±20, ±30, ±40, ±50 mhz and their multiples are found. the sources of the spurs should be due to the clock frequency of signal source and the power line frequency as well. the characteristics are not very good as compared with other integrated plls [6]. however, when the operation frequency is lowered, better characteristics are found. figure 8 plots the phase noise levels as a function of vco frequency. phase noise is smaller for smaller value of fractional n. at 500 mhz, the 1 khz phase noise reduces to -71 dbc/hz. the phase noises increase as the frequency goes up. in general, the low-frequency noise levels, less than -62 dbc/hz, are acceptable for a ring-based pll in some general applications. the high-frequency phase noises are less than -81 dbc/hz for all investigated vco frequencies. in addition, the time domain jitter noise was also measured. fig.9 plots the root-mean-square value of jitter noise as function of oscillator frequency. the peak value was about 13 ps at 500 mhz. the jitter noise levels are smaller at other frequencies. figure 10 plots the output amplitude as a function of vco frequency. as shown in fig. 10, the output amplitude was -8.6 dbm when the output is set to 500 mhz, it decreases as the frequency increases. figure 11 shows the change of primary spurs (at 20 mhz) for various vco frequencies. the spur amplitude is less than -64 dbm for 500 mhz center frequency and the largest spur was found for 1.2 ghz output spectrum. the levels of spurs in our pll are high and needs to be suppressed. 
although it does not directly show up in the present measured data, one can readily anticipate that the performance of the lpf and vco is one of the major sources of the degradation in system characteristics. this is the major performance trade-off for a compact and simple design aimed at soc applications. the loop dynamics of the present design cannot be adjusted because the lpf components are fixed, and the loop gain may be lowered by losses in the parasitic passive components. in addition, the integrated capacitors usually have a large leakage current because of their ultrathin dielectric; this is even worse in the 40 nm technology. as a consequence, a constant dc level for the low-voltage vco is hard to maintain at the lpf output, which causes some undesirable frequency drift of the vco. hence further improvement should focus on the lpf design, such as the use of an active filter to reduce the chip size while suppressing the effect of gate leakage.

the second issue that needs special attention is the stability of the vco against power supply and temperature fluctuations. ring oscillators are known to have poor power supply and temperature stability. in the 40 nm technology, the digital power supply voltage (dvdd) has been scaled down to 1.1 v. the low supply voltage makes the ring-based vco more sensitive to supply voltage and temperature fluctuations. here we use io devices operating at 2.5 v (avdd) for driving and level limiting, which should help alleviate these effects. further experimental validation and detailed characterization are under investigation.

fig. 7 phase noise characteristics and output spectrum of the ring-based vco in the fabricated pll operated at 1.5 ghz: (a) phase noise; and (b) output amplitude response.

fig. 8 plot of measured phase noise of vco output at different frequencies.

fig. 9 measured root-mean-square value of jitter for the whole frequency range of the vco (axes: vco frequency (mhz), rms jitter (ps)).

fig. 10 peak amplitudes of the vco running at different frequencies.

fig. 11 levels of primary spur observed at the vco’s output.

4. conclusion

in this work, a fully integrated compact cmos fractional-n pll was designed and realized. we adopted various strategies for most of the key functional blocks so as to improve the overall performance of the pll. in particular, a time delay circuit was introduced to the phase detector to overcome the blind zone in control signal generation, and the charge pump characteristics were improved by using an operational amplifier to mirror the up and dn currents so as to alleviate the effects of current mismatch and channel length modulation of short-channel devices. the major strategy in the performance, cost and technology trade-off is the use of a five-stage ring-based vco. a current overdrive structure was introduced by using an io device and the analogue supply voltage, which allows a better frequency range and alleviates the possible degradation due to power supply and temperature fluctuations. the design is compact in size and has been realized with a 40 nm cmos process. measurement results indicate that the circuit functions well over the locked range from 500 mhz to 1.5 ghz.

references

[1] v. ravinuthula and s. finocchiaro, “a low power high performance pll with temperature compensated vco in 65nm cmos”, in proceedings of the ieee radio frequency integrated circuits symp., 2016, pp. 31-34.
[2] d. liao, h. wang, f. f. dai, y. xu, r. berenguer, “an 802.11 a/b/g/n digital fractional-n pll with automatic tdc linearity calibration for spur cancellation”, in proceedings of the ieee radio frequency integrated circuits symp., 2016, pp. 134-137.
[3] s. ikeda, h. ito, a. kasamatsu, y. ishikawa, t. obara, n. noguchi, et al., “an 8.865-ghz -244db-fom high-frequency piezoelectric resonator-based cascaded fractional-n pll with sub-ppb-order channel adjusting technique”, in proceedings of the ieee symp. vlsi circuits, 2016, pp. 1-2.
[4] t. li, x. fan, and z. hua, “cmos phase frequency detector and charge pump for multi-standard frequency synthesizer”, in proceedings of the ieee int’l conf. microwaves, communications, antennas and electronic systems, 2015, pp. 1-4.
[5] m. ghasemzadeh, s. mahdavi, a. zokaei, and k. hadidi, “a new adaptive pll to reduce the lock time in 0.18 μm technology”, in proceedings of the 23rd international conference mixed design of integrated circuits and systems, 2016, pp. 140-142.
[6] a. elkholy, s. saxena, r. k. nandwana, a. elshazly, p. k. hanumolu, “a 2.0-5.5 ghz wide bandwidth ring-based digital fractional-n pll with extended range multi-modulus divider”, ieee j. solid-state circuits, vol. 51, pp. 1771-1784, 2016.
[7] m.-t. hsieh, j. welch, g. e. sobelman, “pll performance comparison with application to spread spectrum clock generator design”, analog integr. circ. sig. process, vol. 63, pp. 197-216, 2010.
[8] a. g. amer, s. a. ibrahim, and h. f. ragai, “a novel current steering charge pump with low current mismatch and variation”, in proceedings of the ieee international symp. circuits and systems, 2016, pp. 1666-1669.
[9] s. g. kim, j. rhim, d. h. kwon, m. h. kim, and w. y. choi, “a low-voltage pll with a current mismatch compensated charge pump”, in proceedings of the international soc design conference, 2015, pp. 15-16.
[10] m. k. hati and t. k. bhattacharyya, “a pfd and charge pump switching circuit to optimize the output phase noise of the pll in 0.13μm cmos”, in proceedings of the international conference on vlsi systems, architecture, technology and applications, 2015, pp. 1-6.
[11] n. joram, r. wolf, and f. ellinger, “high swing pll charge pump with current mismatch reduction”, electron. lett., pp. 661-663, 2014.
[12] y. he, x. cui, c. l. lee, and d. xue, “an improved fast acquisition pfd with zero blind zone for the pll application”, in proceedings of the ieee international conference on electron devices and solid-state circuits, 2014, pp. 1-2.
[13] m.-s. shiau, c.-h. cheng, h.s. hsu, h.c. wu, h.-h. weng, j.j. hou, r. c. sun, “design for low current mismatch in the cmos charge pump”, in proceedings of the international soc design conference, 2013, pp. 310-31.
[14] analog devices tutorial mt-086, fundamentals of phase locked loops (plls), analog devices, 2009.
[15] w. wang, x. chen and h. wong, “1.5 ghz sigma-delta fractional-n ring-based pll realized using 40 nm cmos technology for soc applications”, in proceedings of the international conference on electronics, information, and communications, phuket, thailand, january 11-14, 2017.

appendix a

according to fig. 1, the lpf transfer function is defined as the change in voltage at the tuning port of the vco divided by the current from the charge pump. for the third-order lpf given in fig. 4, the transimpedance function is:

$$z(s) = \frac{1 + s t_2}{s\,(a_2 s^2 + a_1 s + a_0)} \qquad \text{(a1)}$$

where $a_0 = c_1 + c_2 + c_3$, $a_1 = c_2 r_2 (c_1 + c_3) + c_3 r_3 (c_1 + c_2)$, $a_2 = c_1 c_2 c_3 r_2 r_3$, and the time constant of the zero is $t_2 = r_2 c_2$. by expressing the transfer function in terms of poles and zeroes, we have:

$$z(s) = \frac{1 + s t_2}{s\,a_0\,(1 + s t_1)(1 + s t_3)} \qquad \text{(a2)}$$

where the pole time constants $t_1$ and $t_3$ are given, respectively, by $t_1 = c_1 c_2 r_2/(c_1 + c_2)$ and $t_3 = r_3 c_3$.
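the coefficient definitions given with (a1) can be checked numerically. the following python sketch evaluates z(s) from the polynomial form and compares it with a direct nodal analysis of a ladder network. the topology used here (c1 to ground, r2 in series with c2 to ground, r3 in series to the output node, c3 to ground) is an assumption based on the usual third-order passive loop filter, and the component values are hypothetical placeholders, not the values used on the chip.

```python
import cmath


def lpf_coefficients(c1, c2, c3, r2, r3):
    """Return (a0, a1, a2, t2) as defined with (a1)."""
    a0 = c1 + c2 + c3
    a1 = c2 * r2 * (c1 + c3) + c3 * r3 * (c1 + c2)
    a2 = c1 * c2 * c3 * r2 * r3
    t2 = r2 * c2
    return a0, a1, a2, t2


def z_polynomial(s, c1, c2, c3, r2, r3):
    """Transimpedance from the polynomial form (a1)."""
    a0, a1, a2, t2 = lpf_coefficients(c1, c2, c3, r2, r3)
    return (1 + s * t2) / (s * (a2 * s**2 + a1 * s + a0))


def z_nodal(s, c1, c2, c3, r2, r3):
    """Same quantity from nodal analysis of the ASSUMED ladder topology."""
    y_c1 = s * c1                          # shunt capacitor c1
    y_r2c2 = 1 / (r2 + 1 / (s * c2))       # series r2-c2 branch to ground
    y_r3c3 = 1 / (r3 + 1 / (s * c3))       # r3 followed by c3 to ground
    v_in = 1 / (y_c1 + y_r2c2 + y_r3c3)    # input-node voltage per unit current
    divider = (1 / (s * c3)) / (r3 + 1 / (s * c3))
    return v_in * divider                  # voltage across c3 (vco tuning port)


# hypothetical component values, for illustration only
C1, C2, C3, R2, R3 = 10e-12, 100e-12, 3e-12, 10e3, 5e3


def max_relative_error(freqs_hz):
    """Largest relative mismatch between the two forms over the given frequencies."""
    worst = 0.0
    for f in freqs_hz:
        s = 2j * cmath.pi * f
        zp = z_polynomial(s, C1, C2, C3, R2, R3)
        zn = z_nodal(s, C1, C2, C3, R2, R3)
        worst = max(worst, abs(zp - zn) / abs(zn))
    return worst


if __name__ == "__main__":
    print(max_relative_error([1e3, 1e5, 1e6, 1e7, 1e8]))
```

if the two expressions disagreed for a given filter, the coefficients a0, a1 and a2 would have to be re-derived for that network before the design equations are applied.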
the phase margin can be readily determined from (a2) with:

$$\phi = 180^\circ + \tan^{-1}(\omega_c t_2) - \tan^{-1}(\omega_c t_1) - \tan^{-1}(\omega_c t_3) \qquad \text{(a3)}$$

by setting the derivative of the phase margin to zero, the following relationships can be obtained:

$$c_1 = \frac{a_2}{t_2^2}\left(1 + \sqrt{1 + \frac{t_2}{a_2}\,(t_2 a_0 - a_1)}\right) \qquad \text{(a4)}$$

$$c_3 = \frac{-t_2^2 c_1^2 + t_2 a_1 c_1 - a_2 a_0}{t_2^2 c_1 - a_2} \qquad \text{(a5)}$$

other components can be evaluated with the following equations:

$$c_2 = a_0 - c_1 - c_3 \qquad \text{(a6)}$$

$$r_2 = \frac{t_2}{c_2} \qquad \text{(a7)}$$

$$r_3 = \frac{a_2}{c_1 c_3 t_2} \qquad \text{(a8)}$$

facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 279-285 https://doi.org/10.2298/fuee1802279m

design of novel efficient full adder circuit for quantum-dot cellular automata technology

dariush mokhtari¹, abdalhossein rezai¹, hamid rashidi¹, faranak rabiei², saeid emadi³, asghar karimi¹

¹ acecr institute of higher education, isfahan branch, isfahan, iran
² institute for mathematical research, universiti putra malaysia, malaysia
³ ipe manager school, france

abstract. in this paper, novel coplanar circuits for full adder implementation in quantum-dot cellular automata (qca) technology are presented. we propose a novel one-bit full adder circuit and then utilize this new circuit to implement a novel four-bit ripple carry adder (rca) circuit in the qca technology. the qcadesigner tool version 2.0.1 is utilized to implement the designed qca full adder circuits. the implementation results show that the designed qca full adder circuits have an improvement compared to other qca full adder circuits.

key words: full adder, quantum-dot cellular automata, coplanar circuit, ripple carry adder, high-performance design

1. introduction

computer arithmetic plays an important role in information and communication applications such as cryptography and alus [1-3].
full adders have an important role in computer arithmetic. so, the efficiency of many computer arithmetic applications is primarily determined by the efficiency of the full adder implementation [1-3]. on the other hand, quantum-dot cellular automata (qca) technology is a promising technology that can continue the development predicted by moore’s law. this technology uses charge configuration rather than current to transfer information. as a result, circuit design in the qca technology has advantages over conventional technologies such as cmos in terms of small dimensions, fast operation and low power consumption [4, 5].

received june 6, 2017; received in revised form november 11, 2017
corresponding author: abdalhossein rezai, acecr institute of higher education, isfahan branch, isfahan, 84175-443, iran (e-mail: rezaie@acecr.ac.ir)

recently, several efforts have been made to improve the efficiency of the full adder implementation in the qca technology [6-15]. hänninen and takala [6] presented a qca full adder that requires 102 qca cells and 0.1 µm² area. ramesh and rani [7] designed a qca full adder that consists of 52 qca cells and 0.038 µm² area. abedi et al. [8] designed a qca full adder that requires 59 qca cells and 0.043 µm² area. hashemi and navi [9] offered a qca full adder that requires 71 cells and 0.06 µm² area. mohammadi et al. [10] presented a qca full adder that requires 38 qca cells and 0.02 µm² area. ahmad et al. [11] constructed a qca full adder that consists of 41 qca cells and 0.04 µm² area. labrado and thapliyal [12] presented a qca full adder that requires 63 qca cells and 0.05 µm² area. balali et al. [13] designed a qca full adder that requires 29 qca cells and 0.02 µm² area.
although these full adder circuits have advantages, the complexity and required area of the full adder circuit in the qca technology can be reduced with the new technique described in this paper. in this paper, an efficient circuit for a one-bit qca full adder is presented and evaluated. then, an efficient circuit is designed for a four-bit qca ripple carry adder (rca). the functionality of the designed circuits is verified using the qcadesigner tool version 2.0.1. the implementation results show that the designed circuits have advantages compared to recent one-bit qca full adder circuits and four-bit qca rca circuits.

the rest of this paper is organized as follows. the background of the developed circuits is presented in section 2. the designed circuits are presented in section 3. section 4 evaluates the designed circuits. finally, section 5 concludes this paper.

2. background

2.1. qca technology

qca technology is an emerging technology that can be utilized for the development of digital circuits in line with moore’s law. this technology uses charge configuration instead of current to transfer information. the basic element in this technology is a square cell with four dots and two free electrons. fig. 1 shows a simplified qca cell [1, 5].

fig. 1 a simplified qca cell [1, 5]

the arrangement of these free electrons is utilized to denote the logic zero and logic one states in this technology. using this cell, logic elements such as the majority gate [4] can be developed. it should be noted that other logic elements such as the or gate and the and gate can be developed using the majority gate [1, 4, 5]. moreover, complex digital circuits such as full adder circuits [6-15] and multiplexer circuits [1, 5] are developed using these logic elements.

2.2. qca full adder circuit

the full adder plays a vital role in complex digital circuits.
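as a logic-level illustration of the majority-gate idea from section 2.1, the python sketch below models a 3-input majority gate, derives and and or from it by fixing one input, and checks one well-known way of composing a one-bit full adder from three majority gates and two inverters. this is only a truth-table check under stated assumptions: it says nothing about qca cell layout or clocking, and the particular composition is assumed for illustration rather than taken from any specific circuit in [6-15].

```python
from itertools import product


def maj3(x, y, z):
    """3-input majority: 1 when at least two inputs are 1."""
    return int(x + y + z >= 2)


def and_gate(x, y):
    """AND obtained by fixing one majority input to 0."""
    return maj3(x, y, 0)


def or_gate(x, y):
    """OR obtained by fixing one majority input to 1."""
    return maj3(x, y, 1)


def full_adder(a, b, cin):
    """One-bit full adder from three majority gates and two inverters."""
    carry = maj3(a, b, cin)
    inner = maj3(a, b, 1 - cin)          # majority with inverted carry-in
    total = maj3(1 - carry, cin, inner)  # majority with inverted carry-out
    return total, carry


def full_adder_is_correct():
    """Exhaustively compare against integer addition."""
    for a, b, cin in product((0, 1), repeat=3):
        total, carry = full_adder(a, b, cin)
        if total != (a + b + cin) % 2 or carry != (a + b + cin) // 2:
            return False
    return True


if __name__ == "__main__":
    print(full_adder_is_correct())
```

an exhaustive check over the eight input combinations is enough here because the function has only three binary inputs.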
as a result, high-performance implementation of this circuit is an attractive research area. the logical function of the one-bit full adder can be expressed by the following equations:

$$\text{carry} = ab + a c_{in} + b c_{in} = \text{maj3}(a, b, c_{in}) \qquad (1)$$

$$\text{sum} = a \oplus b \oplus c_{in} = \text{maj3}\big(c_{in},\ \text{maj3}(a, b, \overline{c_{in}}),\ \overline{\text{carry}}\big) \qquad (2)$$

in (1) and (2), a and b denote the inputs of the one-bit full adder. cin and carry denote the carry input and carry output, respectively. sum denotes the sum output of the one-bit full adder. moreover, maj3 denotes the 3-input majority function, which can be implemented using a 3-input majority gate in the qca technology [1, 4, 9]. fig. 2 shows the circuit diagram for the one-bit qca full adder [6-9].

fig. 2 the circuit diagram for the one-bit qca full adder [6]

in addition, the four-bit qca rca circuit can be obtained by cascading four one-bit full adders [1, 6-9]. fig. 3 shows the four-bit qca rca circuit.

fig. 3 the four-bit qca rca circuit [6]

in this circuit, a = (a3, a2, a1, a0) and b = (b3, b2, b1, b0) are two four-bit inputs. cin and cout denote the one-bit carry input and carry output, and sum = (sum3, sum2, sum1, sum0) is the four-bit output.

3. the designed qca circuits

this section outlines a novel one-bit qca full adder circuit. then, the new four-bit qca rca circuit is designed based on the designed one-bit qca full adder circuit.

3.1. the designed qca full adder circuit

the designed circuit for the one-bit qca full adder is shown in fig. 4.

fig. 4 the designed circuit for the one-bit qca full adder

in this circuit, a and b are two one-bit inputs and c is the carry input. carry and sum denote the carry and sum outputs, respectively. this circuit consists of 46 qca cells. note that four clocking zones are utilized in this circuit as follows: white indicates clock zone 3, light blue indicates clock zone 2, violet indicates clock zone 1, and green indicates clock zone 0.

3.2. the designed qca rca circuit

the designed circuit for the four-bit qca rca is shown in fig. 5.

fig. 5 the designed circuit for the four-bit qca rca

in this circuit, a and b are two four-bit inputs and cin is the one-bit carry input. carry and sum denote the one-bit carry output and the four-bit sum output, respectively. this circuit consists of 187 qca cells.

4. implementation results and comparison

the designed circuits are implemented using the qcadesigner tool version 2.0.1. this section presents these implementation results.

4.1. the designed qca full adder circuit

fig. 6 shows the implementation results of the designed circuit for the one-bit qca full adder.

fig. 6 the implementation results for the designed one-bit qca full adder circuit

the implementation results of the designed circuit for the one-bit qca full adder confirm the correctness of this circuit. table 1 summarizes the implementation results of the designed circuit for the one-bit qca full adder compared to other one-bit qca full adder circuits in [6-13].

table 1 the comparative table for one-bit qca full adder circuits

reference     complexity (#cell)   area (µm²)   delay (clock zone)
[6]           102                  0.1          8
[9]           71                   0.06         5
[7]           52                   0.038        4
[8]           59                   0.043        4
[10]          38                   0.02         3
[11]          41                   0.04         2
[12]          63                   0.05         3
[13]          29                   0.02         2
this paper    46                   0.04         4

based on the implementation results shown in fig. 6 and table 1, the designed circuit for the one-bit qca full adder has an improvement in terms of complexity compared to the one-bit qca full adder circuits in [6-9, 12]. although the cell count of the one-bit qca full adder circuits in [10, 11, 13] is lower than that of our designed one-bit qca full adder circuit, our designed four-bit qca rca circuit, which uses this one-bit qca full adder as its basic block, has advantages compared to the four-bit qca rca circuits in [10, 13]. this is because the input/output ports in the developed one-bit qca full adder are suitably placed, so the place and route of the developed four-bit qca rca gives better results. moreover, the output cells of the one-bit qca full adder circuit in [11] are not suitably placed, so implementing a four-bit qca rca circuit with the one-bit qca full adder circuit in [11] is hard.

4.2. the designed qca rca circuit

fig. 7 shows the implementation results of the designed circuit for the four-bit qca rca.

fig. 7 the implementation results for the designed four-bit qca rca circuit

the implementation results of the designed circuit for the four-bit qca rca confirm the correctness of this circuit. table 2 summarizes the implementation results of the designed circuit for the four-bit qca rca compared to other four-bit qca rca circuits in [6-10, 12-14].

table 2 the comparative table for four-bit qca rca circuits

reference     complexity (#cell)   area (µm²)   delay (clock zone)
[6]           558                  0.85         20
[9]           442                  1            8
[7]           260                  0.28         10
[8]           262                  0.208        28
[10]          237                  0.24         6
[12]          295                  0.3          6
[13]          269                  0.37         14
[14]          339                  0.2542       7
this paper    187                  0.2          16

based on the implementation results shown in fig. 7 and table 2, the designed circuit for the four-bit qca rca has an improvement in terms of complexity and area compared to the four-bit qca rca circuits in [6-10, 12-14].

5. conclusion

full adders play an important role in computer arithmetic. so, efficient implementation of full adders can increase the efficiency of computer arithmetic circuits.
this paper presented and evaluated an efficient full adder circuit in the qca technology. in addition, we implemented a four-bit qca rca circuit based on this new one-bit qca full adder. the designed circuits have been implemented using the qcadesigner tool version 2.0.1. the implementation results confirmed that the designed circuits outperform recent one-bit qca full adder circuits and four-bit qca rca circuits in [6-9, 12] in terms of complexity and required area.

references

[1] p. balasubramanian, “a latency optimized biased implementation style weak-indication self-timed full adder,” facta universitatis, series: electronics and energetics, vol. 28, pp. 657-671, 2015.
[2] a. rezai and p. keshavarzi, “high-performance scalable architecture for modular multiplication using a new digit-serial computation,” micro. j., vol. 55, pp. 169–178, 2016.
[3] a. rezai and p. keshavarzi, “high-throughput modular multiplication and exponentiation algorithm using multibit-scan-multibit-shift technique,” ieee trans. vlsi syst., vol. 23, pp. 1710-1719, 2015.
[4] m. balali, a. rezai, h. balali, f. rabiei, and s. emadi, “a novel design of 5-input majority gate in quantum-dot cellular automata technology,” in proceedings of the ieee symp. comput. appl. indust. electr. (iscaie 2017), 2017, pp. 13-16.
[5] h. rashidi, a. rezai, and s. soltani, “high-performance multiplexer circuit for quantum-dot cellular automata,” j. comput. electr., vol. 15, pp. 968–98, 2016.
[6] i. hänninen and j. takala, “binary adders on quantum-dot cellular automata,” j. sign. process. syst., vol. 58, pp. 87–103, 2010.
[7] b. ramesh and m. a. rani, “design of binary to bcd code converter using area optimized quantum-dot cellular automata full adder,” int. j. eng., vol. 9, pp. 49-64, 2015.
[8] d. abedi, g. jaberipur, and m. sangsefidi, “coplanar full adder in quantum-dot cellular automata via clock-zone-based crossover,” ieee trans. nanotech., vol. 14, pp. 497–504, 2015.
[9] s. hashemi and k. navi, “a novel robust qca full-adder,” proc. mater. sci., vol. 11, pp. 376-380, 2015.
[10] m. mohammadi, m. mohammadi, and s. gorgin, “an efficient design of full adder in quantum-dot cellular automata (qca) technology,” microelectr. j., vol. 50, pp. 35-43, 2016.
[11] f. ahmad, g. m. bhat, h. khademolhosseini, s. azimi, s. angizi, and k. navi, “towards single layer quantum-dot cellular automata adders based on explicit interaction of cells,” j. comput. sci., vol. 16, pp. 8-15, 2016.
[12] c. labrado and h. thapliyal, “design of adder and subtractor circuits in majority logic-based field-coupled qca nano computing,” electron. lett., vol. 52, pp. 464-466, 2016.
[13] m. balali, a. rezai, h. balali, f. rabiei, and s. emadi, “towards coplanar quantum-dot cellular automata adders based on efficient three-input xor gate,” result. phys., vol. 7, pp. 1389-1395, 2017.
[14] v. pudi and k. sridharan, “low complexity design of ripple carry and brent-kung adders in qca,” ieee trans. nanotech., vol. 11, pp. 105-119, 2012.
[15] h. cho and e. e. swartzlander, “adder and multiplier design in quantum-dot cellular automata,” ieee trans. comput., vol. 58, pp. 721-727, 2009.

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp. 75-87 https://doi.org/10.2298/fuee1801075b

electronic gearing of two dc motor shafts for wheg type mobile robot

miloš božić¹, sanja antić¹, vojislav vujičić¹, miroslav bjekić¹, goran đorđević²

¹ university of kragujevac, faculty of technical sciences, čačak, serbia
² university of niš, faculty of electronic engineering, niš, serbia

abstract. this paper describes the implementation of electronic gearing of two dc motor shafts. the dc motors are the drives of a mobile robot whose wheels have a wheel-leg (wheg) configuration. a single wheel consists of two whegs (dwheg). the first dc motor drives one wheg, while the second one drives the other, independent wheg.
one motor serves as the master drive motor, while the other is the slave drive motor. as the motors are independent, it is necessary to synchronize their speed and adjust the angle between the shafts. the main contribution of this paper is the implementation of a control structure that enables the slave to follow the master drive without mechanical coupling. based on encoder measurements, the slave effectively follows the master drive for the given references of speed and angle. the speed and positioning loops are implemented on the real-time controller sbrio. a laboratory setup was created, and the realized and required angles and speeds were compared.

key words: electronic gearing, master, slave, wheg, dc motor, sbrio

received march 14, 2017; received in revised form july 27, 2017
corresponding author: miloš božić, university of kragujevac, faculty of technical sciences, svetog save 65, 32000 čačak, serbia (e-mail: milos.bozic@ftn.kg.ac.rs)

1. introduction

coupling of motion axes in industrial and robotic applications is always a challenging task. it is often necessary to couple and synchronize two or more axes performing linear [1, 2], circular or other complex movements, such as curve profiles. the task can be performed by contact, using mechanical couplings such as gear pairs, differential drives, belts, chains, etc. another possible alternative is contactless coupling. in industry, it is used in servo applications under the names electronic coupling, electronic gearing or electronic line shafting [3, 4], while in the automobile industry this technology is called drive-by-wire [5, 6]. examples of electronic coupling can also be found in haptic devices [7, 8]. the advantages of electronic over mechanical coupling are numerous. electronic gearing is used instead of mechanical assemblies, given that the latter are the weakest link in the system.
electronic coupling makes the system more efficient and flexible. typically, there is one master drive and one or more slave drives that follow references set by the master drive. the reference can be given in the form of torque (current), speed or position, so the concept can be applied to a wide variety of applications. in this paper, the master-slave drive concept is applied to a wheel-leg (wheg) drive [9, 10]. a wheg has the form of a legged wheel with n legs (spokes). in the setup employed in this paper, the wheg has four spokes. by placing two whegs next to each other and driving them independently as master and slave, a double wheg drive (dwheg) is obtained. the dwheg drive allows independent control of the two whegs. figure 1 (left) illustrates the seven elements of the dwheg master-slave drive wheel.

fig. 1 dwheg master-slave drive (left) and prototype of robot with rigid dwheg drives (right)

figure 1 (left) shows the main parts of the dwheg drive module. the parts are numbered as follows: 1) master motor, 2) slave motor, 3) coupling, 4) pulleys with a belt, 5) bearing case and shaft, 6) slave wheg, 7) master wheg. the dwheg is the authors’ original solution [9]. this type of drive is designed to increase the efficiency of a mobile walking robot intended for uneven terrain. the prototype robot during the testing phase, with the rigid type of dwheg drive, can be seen in figure 1 (right).

this study deals with practical issues of active control of the parameter α. changing the angle α provides different compliance and contact area with the ground. this allows the mobile robot to move efficiently over flat terrain when α tends to 45° as well as over rough terrain when α tends to 0°. because of the nature of the problem, this system does not require precise angle adjustment, and the response time of the angle adjustment is not critical. it is sufficient to adjust the angle over at least one full revolution of the dwheg.
a variety of synchronization methods are used in industry [2,16,17,18]. in industrial drives, speed synchronization must be achieved in the shortest possible time and a change of the master drive speed is allowed, so the cross-coupling technique is commonly used. a typical example of cross coupling is linear interpolation between two axes. in the dwheg drive, the slave drive must not influence the working speed of the master drive. slave motors represent an additional support system and allow efficient travel of the robot over different terrains. for example, with cross-coupling techniques, a loss of function of the slave drive would mean that the master motor should stop or slow down. cross-coupling synchronization would therefore affect the movement of the mobile robot and make it impossible for the robot to track the reference path.

the paper presents the implementation of master-slave synchronization with an additional position loop on the dwheg drive. in the next part of the paper, the mathematical model of the dwheg drive motor is presented. the third part shows the control structure and the synthesis of the speed and position loops. the fourth part shows the experimental setup and the obtained results. the fifth part is a brief conclusion with future steps.

2. mathematical model of dc motor

the electric circuit of the dc motor with permanent magnets is given in figure 2.

fig. 2 electric circuit of the dc motor with permanent magnets

the mathematical model of the dc motor can be described by the following equations:

$$\omega_m(t) = n\,\omega_o(t) \tag{1}$$

$$e_m(t) = k_{me}\frac{d\theta_m}{dt} = k_{me}\,\omega_m(t) = n\,k_{me}\,\omega_o(t) \quad (\Phi = \text{const.}) \tag{2}$$

$$u_r(t) = l_r\frac{di_r(t)}{dt} + r_r\,i_r(t) + e_m(t) \tag{3}$$

$$m_m(t) = k_{em}\,i_r(t) \tag{4}$$

$$m_m(t) - m_c(t) - m_l(t) = j\frac{d\omega_m(t)}{dt} + f\,\omega_m(t) \tag{5}$$

where: $u_r(t)$, $i_r(t)$ – armature voltage and current; $r_r$, $l_r$ – armature resistance and inductance; $e_m(t)$ – armature induced emf; $\theta_m$ – angular displacement of the motor shaft; $\theta_o$ – angular displacement of the gearbox output; $\omega_m(t)$ – angular speed of the motor shaft [rad/s]; $\omega_o(t)$ – angular speed of the gearbox output [rad/s]; $k_{me}$, $k_{em}$ – emf and torque constants; $m_m(t)$ – generated motor torque; $m_l(t)$ – load torque on the motor side; $m_c(t)$ – coulomb friction; $j_m$, $j_o$ – motor and load inertia; $j = j_m + j_o/n^2$ – total inertia on the motor side; $f_m$, $f_o$ – motor and load viscous friction; $f = f_m + f_o/n^2$ – total viscous friction on the motor side; $n$ – gear ratio.

the corresponding block diagram of the dc motor model is presented in figure 3.

fig. 3 block diagram of the dc motor mathematical model

to obtain a realistic mathematical model and to perform controller synthesis, it was necessary to identify the relevant parameters. parameters that were not known were determined experimentally [11]. the appendix provides a table with motor data. the measurement of the parameters showed that the mathematical model given in the form of a block diagram in figure 3 could be further simplified. firstly, coulomb friction was neglected because of its minor influence on the system behavior, given its low value [11]. the motor transfer function becomes:

$$g_m(s) = \frac{\omega_m(s)}{u_r(s)} = \frac{k_{em}}{(l_r s + r_r)(j s + f) + k_{em}k_{me}} = \frac{k_m}{t_v^2 s^2 + 2\zeta_r t_v s + 1} = \frac{19.175}{2.849\cdot 10^{-5}\,s^2 + 0.019\,s + 1} \tag{6}$$

where

$$k_m = \frac{k_{em}}{r_r f + k_{em}k_{me}}$$

is the gain,

$$t_v = \sqrt{\frac{j\,l_r}{r_r f + k_{em}k_{me}}}$$

is the time constant, and

$$\zeta_r = \frac{f\,l_r + j\,r_r}{2\sqrt{j\,l_r\,(r_r f + k_{em}k_{me})}}$$

is the relative damping factor. secondly, motor inductance could also be neglected.
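one way to see why dropping the inductance is reasonable is to compare the poles of the second-order model (6) with the single pole obtained when only the 0.019 s + 1 part of the denominator is kept. the python sketch below is a rough check that uses only the numerical coefficients quoted in (6); the variable names are illustrative and numpy is assumed to be available.

```python
import numpy as np

# denominator of the second-order model (6): 2.849e-5 s^2 + 0.019 s + 1
second_order_den = [2.849e-5, 0.019, 1.0]
poles = np.sort(np.roots(second_order_den).real)  # both poles are real here

fast_pole, slow_pole = poles          # sorted ascending: most negative first
first_order_pole = -1.0 / 0.019       # pole of 1 / (0.019 s + 1)

print(f"second-order poles: {slow_pole:.1f} rad/s and {fast_pole:.1f} rad/s")
print(f"first-order pole:   {first_order_pole:.1f} rad/s")
print(f"fast/slow pole ratio: {fast_pole / slow_pole:.1f}")
```

the slow pole (about -57.6 rad/s) lies close to the first-order pole (about -52.6 rad/s), while the second pole is roughly ten times faster, so the electrical dynamics contribute little to the response.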
this was confirmed by the relatively small value of motor inductance. therefore, the transfer function of the unloaded drive becomes first order:

$$g_m(s) = \frac{\omega_m(s)}{u_r(s)} = \frac{k_m}{t_m s + 1} = \frac{19.175}{0.019\,s + 1} \tag{7}$$

where

$$k_m = \frac{k_{em}}{r_r f_m + k_{em}k_{me}} = 19.175, \qquad t_m = \frac{j_m r_r}{r_r f_m + k_{em}k_{me}} = 19\ \text{ms} \tag{8}$$

are the gain factor and time constant of the motor. based on the above analysis, the first-order model (7) is used in what follows.

the selection of the measurement period was based on the bandwidth of the closed-loop system. the bandwidth of the motor in a closed loop in the absence of controllers is determined by the equation:

$$\omega_0 = \frac{1}{0.5\,t_m} = 105.263\ \frac{\text{rad}}{\text{s}}. \tag{9}$$

according to the sampling theorem, the sampling frequency $\omega_s$ should be at least twice the bandwidth of the system $\omega_0$ [13], which implies that the sampling period $t = 2\pi/\omega_s$ satisfies

$$t \le \frac{\pi}{\omega_0} = 0.5\,\pi\,t_m = 29.845\ \text{ms}. \tag{10}$$

this value of the sampling period represents the theoretical maximum. however, practical reasons require the sampling period to be lower than this theoretical maximum. a sampling period that is large relative to the real dynamics of the system can have a negative impact on closed-loop stability [13]. factors that determine the lower allowable value of the sampling period are the quality of reference tracking, the quality of control (measured by the error in the system response due to external disturbances), the system sensitivity to parameter variations, and errors induced by sensor noise. in real applications, for adequate reference tracking and elimination of disturbances, a sampling frequency $\omega_s = (10 \div 20)\,\omega_0$ is proposed [14]. this results in a preferred range of the sampling period

$$2.98\ \text{ms} \le t \le 5.97\ \text{ms}. \tag{11}$$

given that the introduction of the controllers additionally increases the bandwidth of the closed-loop system, the sampling periods in the speed and position loops were selected, based on (11), to be 1 ms and 5 ms, respectively.

3. system control structure

the control structure that provides electronic coupling of the two motor shafts is shown in figure 4. the block diagram shows two independent references, one for speed ($\omega_r$) and one for angle (α). the master drive contains only a speed loop, while the slave drive has a cascade structure with a position loop and a speed loop. at the summing junction in the slave loop, two references are summed. the first is the instantaneous master drive speed. the second is the speed demand for changing the shaft angle of the slave motor, which can be either positive or negative. a pi controller is selected for the speed loop and a pd controller for the position loop. the synthesis and selection of the controller parameters are explained in more detail in the sections that follow.

fig. 4 master slave loops

3.1. the synthesis of speed loops in z domain

the torque/current loop was implemented in the motor driver. in the block diagrams, the driver is represented only by its transfer function, which was obtained by recording the response of the motor speed to a step change of the reference voltage. due to the cascade control structure, the speed loop can be adjusted independently of the position loop. figure 5 shows the simulink model of the speed loop.

fig. 5 simplified model of the speed-loop

where: $t_1 = 1$ ms is the speed loop sampling period; $k_d = 8.4639\cdot 10^{-3}$ is the driver constant; $k_c = 1/30$ is the speed translation constant from rpm to impulses per interval (imp/int); $k = \frac{30}{\pi}\,k_d k_m k_c = 5.166\cdot 10^{-2}$; $n_r(t)$, $n(t)$ are the reference and the measured speed in imp/int.

the transfer function of the motor with the driver in the z domain is:

$$g(z) = \mathcal{Z}\left\{\frac{1 - e^{-s t_1}}{s}\cdot\frac{k}{1 + s\,t_m}\right\} = k\,\frac{1 - e^{-t_1/t_m}}{z - e^{-t_1/t_m}} \tag{12}$$

$$g(z) = \frac{c_1}{z - c_2} = \frac{0.002643}{z - 0.9488}. \tag{13}$$

the characteristic equation is now:

$$w(z) = 1 + \left(k_p + k_i\frac{z}{z - 1}\right)\frac{c_1}{z - c_2} = 0 \tag{14}$$

i.e.
$$z^2 + \left[(k_p + k_i)\,c_1 - c_2 - 1\right] z + c_2 - k_p c_1 = 0.$$

since the characteristic equation of a second-order system with a pseudo-periodic time response, $0 < \zeta < 1$, has the form

$$w(z) = z^2 - 2e^{-\zeta\omega_n t_1}\cos\left(\omega_n\sqrt{1 - \zeta^2}\,t_1\right) z + e^{-2\zeta\omega_n t_1} = 0 \tag{15}$$

it follows that

$$k_p = \frac{c_2 - e^{-2\zeta\omega_n t_1}}{c_1}, \qquad k_i = \frac{1 - 2e^{-\zeta\omega_n t_1}\cos\left(\omega_n\sqrt{1 - \zeta^2}\,t_1\right) + e^{-2\zeta\omega_n t_1}}{c_1}.$$

by selecting different values of the damping factor $\zeta$ and choosing the natural frequency $\omega_n$, i.e. the desired bandwidth of the closed-loop system $\omega_0$, to be $\omega_n = \omega_0 = 314$ rad/s (the closed-loop bandwidth without a controller, from (9), is 105.263 rad/s), different values of $k_p$ and $k_i$ were obtained (for example, $k_p = 45.5984$ and $k_i = 33.7229$ for $\zeta = 0.3$); they are given in table 1.

table 1 parameters of pi regulator for $\omega_n = 314$ rad/s

ζ      kp         ki
0.3    45.5984    33.7229
0.5    82.5883    31.7538
0.7    115.2122   29.9389
0.9    143.9854   28.2646

digital pi parameters were also selected with the ziegler-nichols method:

$$k_{pkr} = 737.3, \quad t_{kr} = 2\cdot 10^{-3}\ \text{s}, \quad k_p = 0.45\,k_{pkr} = 331.8, \quad k_i = 1.2\,k_p\frac{t_1}{t_{kr}} = 199.$$

fig. 6 motor speed for different methods of digital pi parameters synthesis

however, the aim was to achieve an aperiodic velocity response to the given reference with the desired dynamics. so, for the synthesis of the regulator parameters, the compensation method was applied. the closed-loop transfer function of the velocity loop is

$$w_s(z) = \frac{c_1\left(k_p(z - 1) + k_i z\right)}{c_1\left(k_p(z - 1) + k_i z\right) + (z - c_2)(z - 1)}. \tag{16}$$

the parameters of the pi controller were selected so that one pole was equal to $z_1 = c_2 = 0.9488$. in this way, the transfer function of the velocity loop was reduced to first order. then, with the second pole selected to give an aperiodic response with a time constant of 5 ms, $z_2 = e^{-t_1/(5\ \text{ms})} = 0.8187$, the parameters of the pi speed regulator were determined.
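both designs discussed above (pole placement for a chosen damping factor and the compensation method) reduce to matching the coefficients of the characteristic equation derived from (14) to a desired polynomial z^2 + a1 z + a0. the python sketch below illustrates this under the assumption that the plant coefficients are those quoted in (13); the function names are hypothetical and not taken from the authors' labview code.

```python
import math

C1, C2 = 0.002643, 0.9488   # plant coefficients quoted in (13)
T1 = 1e-3                   # speed-loop sampling period [s]

def pi_gains(a1, a0, c1=C1, c2=C2):
    """Gains that make z^2 + [(kp+ki)c1 - c2 - 1] z + c2 - kp c1
    equal to the desired polynomial z^2 + a1 z + a0."""
    kp = (c2 - a0) / c1
    ki = (1.0 + a1 + a0) / c1
    return kp, ki

def gains_for_damping(zeta, wn=314.0, t=T1):
    """Pole placement for a pseudo-periodic response, following (15)."""
    decay = math.exp(-zeta * wn * t)
    a1 = -2.0 * decay * math.cos(wn * math.sqrt(1.0 - zeta**2) * t)
    a0 = decay**2
    return pi_gains(a1, a0)

def gains_for_compensation(tau=5e-3, t=T1, c2=C2):
    """Compensation method: one pole on the plant pole c2,
    the other at exp(-t/tau) for an aperiodic response."""
    z1, z2 = c2, math.exp(-t / tau)
    return pi_gains(-(z1 + z2), z1 * z2)

for zeta in (0.3, 0.5, 0.7, 0.9):
    kp, ki = gains_for_damping(zeta)
    print(f"zeta = {zeta}: kp = {kp:.4f}, ki = {ki:.4f}")

kp_c, ki_c = gains_for_compensation()
print(f"compensation: kp = {kp_c:.4f}, ki = {ki_c:.4f}")
```

the values printed for the four damping factors should agree with table 1 to about two decimal places, and the compensation design comes out close to kp ≈ 65.08 and ki ≈ 3.51; small differences in the last digits come from rounding of the coefficients and of the second pole.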
$$k_p = \frac{c_2 - z_1 z_2}{c_1} = 65.0842 \quad\text{and}\quad k_i = \frac{1 - z_1 - z_2 + z_1 z_2}{c_1} = 3.5121.$$

the pi controller was implemented in labview code. its discrete transfer function is:

$$g_r(z) = 65.0842 + 3.5121\,\frac{z}{z - 1} = 65.0842 + 3512.1\,t_1\frac{z}{z - 1} \tag{17}$$

where $t_1 = 1$ ms was the speed loop sampling period. figure 7 shows the speed responses of the simulink model and of the real system during pi speed control using the compensation method. the measurement confirms that the desired time constant was achieved with this method, and the responses of the model and the actual system match.

fig. 7 the speed responses during pi speed control with parameters obtained using compensation method: simulink and the real system

after the adjustment of the speed loop parameters, the speed loop could be represented as a simple first-order transfer function. since the open-loop transfer function had first-order astatism, zero steady-state error is ensured for a step (constant) input. therefore, in order to provide the necessary dynamic characteristics of the response, preferably of aperiodic character, a pd controller was selected. the desired aperiodic time response was obtained with the auto-tuning option, with $k_p = 0.06$ and $k_d = 0.3$. the transfer function of the pd controller in the discrete time domain, realized in labview code, was

$$g_r(z) = 0.06 + 0.3\,(1 - z^{-1}) = 0.06 + \frac{0.0015}{t_2}\,(1 - z^{-1}), \tag{18}$$

where $t_2 = 5$ ms was the position loop sampling period. figure 8 shows the angular responses during pd position control. the matching of the model and the real system angular responses is apparent.

fig. 8 the angular responses during pd position control: simulink and the real system

4. experiment setup and results

in order to test the proposed control structure, an experimental setup was built. the setup is shown in figure 9.

fig. 9 experimental setup for realization of electronic gearing [19,20]

the main elements of the experimental setup were: 1 – computer with labview applications for monitoring, 2 – master motor with an encoder resolution of 500 ppr, 3 – slave motor with an encoder resolution of 500 ppr, 4 – real-time sbrio9636 controller, 5 – master dc driver, 6 – slave dc driver, 7 – dual power supply, 30 v dc, 5 a. a video showing the experimental setup in operation is available at the links given in [19, 20].

fig. 10 block diagram of the experimental setup

the block diagram of the experimental setup shows that the drivers run in open-loop configuration. the speed loop was realized in the fpga section of the controller, while the position loop was realized in the real-time part of the controller. the figures below show the reference and the obtained values of the input signals. figure 11 shows the angular tracking of the slave for the given master reference speed and angle shift.

fig. 11 master wheg (red) angular reference, under constant velocity, followed by slave wheg (green)

the influence of an external disturbance on the wheg drive of the master motor is shown in figure 12. satisfactory tracking by the slave drive can be observed despite the low resolution of the encoder and the short duration of the disturbance.

fig. 12 robustness demonstration on the master motor disturbance

to test the accuracy of the system over a range of speeds and angles, with the angle α being controlled, a speed–angle error matrix was formed by measurement. the speed was varied from 0 to 3000 rpm in steps of 200 rpm, and the angle from 0° to 90° in steps of 5°. at higher master speeds, a larger error in the slave was observed. the reasons for the error were increased encoder noise due to vibration and problems with fixing the encoder to the motor shaft.
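the encoder resolution limits how finely angle and speed can be measured. the following python sketch estimates both. it assumes ×4 quadrature decoding of the 500 ppr encoders, which is not stated in the paper but is consistent with the conversion constant of 1/30 from rpm to impulses per interval used in the speed loop, and it uses the 1 ms speed-loop interval. all names are illustrative.

```python
PPR = 500        # encoder pulses per revolution (from the setup description)
DECODING = 4     # assumed x4 quadrature decoding (not stated in the paper)
T1 = 1e-3        # speed-loop sampling interval [s]

counts_per_rev = PPR * DECODING
angle_resolution_deg = 360.0 / counts_per_rev

def rpm_to_counts_per_interval(rpm, t=T1):
    """Encoder counts accumulated in one sampling interval at a given speed."""
    return rpm / 60.0 * counts_per_rev * t

# speed change that corresponds to a single count per interval
rpm_per_count = 1.0 / rpm_to_counts_per_interval(1.0)

print(f"angle resolution: {angle_resolution_deg:.2f} deg per count")
print(f"counts per interval at 3000 rpm: {rpm_to_counts_per_interval(3000):.0f}")
print(f"speed quantization: {rpm_per_count:.0f} rpm per count")
```

under these assumptions one count corresponds to 0.18° of shaft angle, and a single count per 1 ms interval corresponds to 30 rpm, which gives a feel for the measurement granularity available to the loops.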
the greatest error that occurred was around 2%.

fig. 13 angle error for range of the angle and speed values

5. conclusion

a functional master–slave drive structure for electronic gearing was implemented. after the identification of parameters, the mathematical model of the motors was formed, and the complete synthesis of the digital control system was presented. the speed and position controllers were realized using the sbrio9636 real-time (rt) fpga controller. the efficiency of the controller was demonstrated, and no notable delay was observed in re-aligning the slave wheg to the master wheg velocity at a given angle shift. in practice, this means that realignment can be done within a single rotation of the dwheg, which allows a robot to adapt rapidly even to small obstacles such as stones or wet ground. future work will focus on enhancing the system performance by increasing the resolution of the encoder and realizing all loops on the fpga platform. further testing of this drive will be carried out in real conditions, on a treadmill belt (figure 14, left) and on a rotating test station (figure 14, right). the controller and driver currently used for rapid testing of the algorithm will be replaced with a cheaper microcontroller and driver, in order to integrate the complete system into the mobile robot.

fig. 14 models of treadmill force plate (left) and rotating test station (right)

references

[1] v. mhase, k.r. sudarshan, o. pardeshi, p.v. suryawanshi, "integrated speed – position tracking with trajectory generation and synchronization for 2 – axis dc motion control", international journal of engineering research and development, vol. 1, issue 6, pp. 61-66, 2012.
[2] f.j. pérez-pinal, c. núñez, r. alvarez, i. cervantes, "comparison of multi-motor synchronization techniques", industrial electronics society ieee, pp. 1670-1675, 2004.
[3] y.h. chang, w.h. chieng, c.s. liao, s.l. jeng, "a novel master switching method for electronic cam control with special reference to multi-axis coordinated trajectory following", control engineering practice, vol. 14, pp. 107-120, 2006.
[4] c.s. liao, s.l. jeng, w.h. chieng, "electronic cam motion generation with special reference to constrained velocity, acceleration, and jerk", isa transactions, pp. 427-443, 2004.
[5] p. ciáurriz, i. díaz, j.j. gil, "bimanual drive-by-wire system with haptic feedback", in proceedings of the ieee international symposium on haptic audio visual environments and games (have), 2013, pp. 18-23.
[6] o. sename, "h∞ control of a teleoperation drive-by-wire system with communication time-delay", in proceedings of the 14th mediterranean conference on control and automation, 2006, pp. 1-6.
[7] a.m. sharma, s. kumar, a. kumar, "implementation of force feedback (haptic) in master slave robotic configuration", communication, control and intelligent systems (ccis), pp. 267-271, 2015.
[8] r. antonello, r. oboe, "force controller tuning for a master-slave system with proximity based haptic feedback", in proceedings of the 40th annual conference iecon 2014, ieee industrial electronics society, 2014, pp. 2774-2779.
[9] m. fremerey, s. köhring, o. nassar, m. schöne, k. weinmeister, f. becker, g.s. đorđević, h. witte, "a phase-shifting double-wheg-module for realization of wheg-driven robots", in proceedings of the third international conference, living machines, milan, italy, 2014, pp. 97-107.
[10] m. fremerey, g. djordjevic, h. witte, "warmor: whegs adaptation and reconfiguration of modular robot with tunable compliance", in proceedings of the first international conference, living machines, barcelona, spain, 2012, vol. 7375, pp. 345-346.
[11] m. bjekic, s. antic, a. milovanovic, "permanent magnet dc motor friction measurement and analysis of friction's impact", int. rev. electr. eng., vol. 6, no. 5, pp. 2261-2269, 2011.
[12] m.j. stojčić, "design of a digital positioning system with sinusoidal change of the jerk", appl. mech. mater., vol. 474, pp. 255–260, jan. 2014.
[13] n. s. nise, control systems engineering, 6th edition international student version, wiley, 2011, isbn: 978-0-470-64612-0.
[14] k. ogata, modern control engineering, fifth edition, pearson, 2010.
[15] m. stojić, digitalni sistemi upravljanja. naučna knjiga, beograd, 1989.
[16] y. koren, "cross-coupled biaxial computer control for manufacturing systems", journal of the dynamic systems, measurement and control, vol. 102, pp. 265-272, 1980.
[17] m.b. naumović, m.r. stojić, "two distant cross-coupled positioning servo drives: theory and experiment", electronics, university of banja luka, vol. 13, no. 2, pp. 25-29, december 2009.
[18] l. feng, y. koren, j. borenstein, "cross-coupling motion controller for mobile robots", ieee control systems, pp. 35-43, december 1993.
[19] electronic gearing of two dc motors, video file part 2, laboratory for electrical machines and drives, faculty of technical sciences čačak, university of kragujevac, accessed 02.2017
[20] https://www.youtube.com/watch?v=dpx8kdulsz0
[21] electronic gearing of two dc motors, video file part 1, laboratory for electrical machines and drives, faculty of technical sciences čačak, university of kragujevac, accessed 02.2017
[22] https://www.youtube.com/watch?v=osbq7e46yu4

appendix

table 2 motor data

parameter                     value
nominal voltage               24 v
nominal current               0.9 a
nominal torque                3.8·10^-2 nm
nominal speed                 3600 rpm
friction torque at no load    0.7·10^-2 nm
no load speed                 4200 rpm
nominal power                 14.3 w
torque constant               5.14·10^-2 nm/a
terminal resistance           5.95 ω
terminal inductance           8.9 mh
gear ratio                    6.25
nominal torque                40·10^-2 nm
viscous friction              6.5·10^-6 nm/rad/s
coulomb friction              4.9·10^-6 nm

facta universitatis series: electronics and energetics vol.
36, no 1, march 2023, pp. 17-29 https://doi.org/10.2298/fuee2301017k © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

the impact of finite dimensions on the sensing performance of terahertz metamaterial absorber

anja kovačević, milka potrebić, dejan tošić
university of belgrade, school of electrical engineering, belgrade, serbia

abstract. this paper investigates the impact of a finite number of unit cells on the sensing performance of a chosen thz metamaterial absorber. sensor models with different numbers of unit cells, varying from 16 to infinite, have been created using wipl-d software. the comparison shows that as the sensor's size increases, its absorption response becomes more similar to that of an infinite sensor structure. a metamaterial absorber with 50 unit cells exhibits similar behavior to the infinite absorber in terms of the corresponding frequency and amplitude shifts when an h9n2 virus sample of variable thickness is uniformly deposited on top of the sensor's surface. an uneven distribution of the sample affects the sensor's absorption response, which has been demonstrated on the example of the sensor with 50 unit cells.

key words: thz metamaterial absorber, finite dimensions, absorption response, h9n2 virus sample

1. introduction

various metamaterials have been artificially designed to manipulate electromagnetic (em) waves in a manner that enables their functional use in a wide range of device applications such as switches, modulators, filters and sensors [1]. basically, metamaterials are structures with periodic sub-wavelength metallic [1] or dielectric [2, 3] patterns that possess em properties not found in natural materials [2]. metallic metamaterial structures inherently have dissipation losses, which can be used to enhance their absorption capabilities [1].
metamaterial absorbers (ma) are devices that can minimize the reflection and theoretically eliminate the transmission of the incident em wave [4]. they are typically designed as metal-dielectric-metal structures [1, 5–7], but other possible designs include dielectric grating-based structures [4], integrated microfluidic structures [8] and dielectric-metal structures [2]. mas can be used in solar power harvesting, material detection, thermal imaging and sensing [4].

received may 08, 2022; revised july 02, 2022; accepted july 16, 2022
corresponding author: milka potrebić, university of belgrade, school of electrical engineering, belgrade, serbia
e-mail: milka.p@mts.rs

mas that work in the terahertz (thz) domain are crucial for bio-sensing applications since the vibration resonances of biomolecules coincide with the thz range [9]. besides that, thz technology has several advantages relevant for the field of bio-sensing, such as its non-ionizing property and strong penetration capability [10]. sensors based on thz mas can be used to detect various virus subtypes with a wide range of particle sizes [11].

since a physically realizable sensor has finite dimensions, and therefore its structure cannot be fully periodic, we wanted to investigate the impact of a finite number of unit cells on the sensing performance. first, we had to come up with a proper modelling technique for both the infinite and the finite sensor structure in wipl-d software. the whole modelling process, alongside the geometrical and material properties of the chosen thz ma, is described in section 2. in section 3, the obtained results that describe the behavior of the modelled sensor structures with and without the sample are presented and thoroughly discussed, including the case when the sample is unevenly spread across the ma's surface.

2. sensor design and modelling process

for the purpose of investigating the impact of finite dimensions on sensing performance, we have selected the quad-band metamaterial absorber presented in [5]. the chosen ma is a typical planar metal-dielectric-metal structure whose quad-band absorption is achieved by introducing a slight deformation to the traditional rectangular metallic resonator rather than using multiple single-band resonators of different sizes. although there are four resonant frequencies, we focus our analysis on the range of the first resonant frequency, which is below 1 thz, but the concept can be extended to higher frequencies.

2.1. unit cell and modelling of infinite sensor structure

the unit cell structure is composed of a metallic ground layer and a perforated metallic resonator separated by a lossy polyimide dielectric spacer. the dimensions of interest are given in figure 1. both metallic layers are made of gold, whose conductivity varies with frequency, but since the frequency range of interest is below 1 thz, the fixed value of 40.9 ms/m used in [5] is sufficient for obtaining good-quality results. if the analysis is to be extended to higher frequencies, the variation of conductivity can be taken into account by using the drude model [12]. in addition, the ground layer is thicker than the skin depth in the whole frequency range of interest, which is essential for proper isolation between the substrate and the sensor itself.

fig. 1 thz ma unit cell with given dimensions (figure labels: metal; dielectric: ɛ = 3(1+j0.05); dimension values 9.5, 25, 35, 2.5, 45.5, 10, 9, 0.4, 0.4; units μm; axes x, y, z)

metamaterials are composed of a large number of meta-atoms represented by unit cells.
consequently, a proper model of an infinite ma structure implies creating an orthogonal lattice of unit cells through periodic repetition along the x- and y-axes, which can be achieved by using periodic boundary conditions (pbc). pbc are a set of boundary conditions applied for the analysis of infinite 2d em structures by using a single unit cell [13].

the modelling process of an infinite sensor structure in wipl-d software using the pbc option consists of three main steps. since the pbc option is only available in the scatterer operation mode, the first step is to choose an adequate scatterer mode. the bistatic radar cross-section (rcs) mode is more suitable for this particular structure considering the fixed position of the field generator. the second step involves setting the values that define the unit cell in terms of the occupied space and spatial repetition. the x and y values correspond to the start and end coordinates of the unit cell in the xy-plane, while the z values are recommended to be set 10% higher than the cell size determined by its geometry [13]. ports 1 and 2 have been positioned at the top and at the bottom of the structure, respectively. the last step consists of making the planar unit cell structure by defining plates and their domains determined by the materials used and, finally, specifying the source as a transverse electromagnetic (tem) plane wave normally incident on the sensor surface, together with the frequency range of interest. additionally, the quality of the planar structure model can be significantly improved by using imaging and edging.

2.2. modelling of finite sensor structure

in order to create a model of a finite sensor structure in wipl-d, the whole modelling process has to be done manually since the pbc option is no longer suitable, which results in significantly higher time consumption.
despite the introduced difficulties, the modelling of a finite sensor has some significant advantages, such as the ability to analyze the impact of the end effects that are inevitably present in a physically realizable structure, and the possibility of modelling an uneven distribution of the sample across the sensor's surface, which is demonstrated in section 3. to fully investigate the impact of dimensions on sensor performance, we have created models for different numbers of unit cells (16, 50, 100 and 400). although the expected dimensions of a metamaterial biosensor device for experimental measurements are around 12 mm x 12 mm [14], which is equivalent to 24000 unit cells of the sensor observed in this paper, the thz source is usually focused on a much smaller area of the metamaterial sensor (approximately 1 mm² [15], which is equivalent to around 167 unit cells). in order to improve the efficiency of simulations, we have exploited the symmetry of the modelled structure and the excitation by using a symmetry plane in our models, which has cut the number of unknowns roughly in half without compromising the results of the numerical calculations. the simulation frequency range was set to the frequency range of the first resonant peak of the infinite structure. an example of modelling a sensor of finite dimensions is given for the structure made of 50 cells in figure 2.

fig. 2 modelling of the sensor with 50 unit cells (the pink plane represents the used symmetry plane)

the main difficulty that occurred during the modelling process was how to adequately define the sensor ports so that the results can be compared with the previously obtained results for the infinite structure. the main goal is to determine the scattering parameters of the sensor, which can be achieved by mimicking the process that has been incorporated into the functioning of the pbc option.
due to the existence of the ground plane (figure 1), the transmission coefficients $s_{21}$ and $s_{12}$ are practically brought down to zero. for that reason, in order to reduce the complexity of the analysis, we have observed only $s_{11}$. by definition, $|s_{11}|$ is:

$$|s_{11}| = \sqrt{\frac{p_{\text{refl}}}{p_{\text{inc}}}} \tag{1}$$

where $p_{\text{refl}}$ and $p_{\text{inc}}$ are the powers of the reflected and incident waves, which had to be calculated in order to determine $s_{11}$. it should be noted that definition (1) is valid only on condition that $s_{12}$ is equal to zero, which has been fulfilled. to calculate these powers, we have simulated the near field distribution in a plane parallel to the sensor's surface. the power of the wave can be calculated by using the complex poynting vector $\underline{\mathbf{s}}$:

$$p = \mathrm{Re}\left\{\int_{s'} \underline{\mathbf{s}}\cdot d\mathbf{s}'\right\} = \mathrm{Re}\left\{\int_{s'} \left(\mathbf{e}\times\mathbf{h}^{*}\right)\cdot d\mathbf{s}'\right\} \tag{2}$$

where $\mathbf{e}$ and $\mathbf{h}$ are field vectors described by their x, y and z components and $d\mathbf{s}' = ds'\,\mathbf{i}_z$ is the vector of the infinitely small surface $ds'$. after arranging expression (2), it is necessary to discretize it, since the analysis has been conducted in a finite number of points $n = n_x \cdot n_y$:

$$p = \mathrm{Re}\left\{\Delta s' \sum_{n}\left(e_x h_y^{*} - e_y h_x^{*}\right)\right\} \tag{3}$$

where $n_x$ and $n_y$ are the numbers of points along the x- and y-axes in which the near field distribution has been calculated and $\Delta s' = s'/n$ is an elementary surface of the observed surface $s'$. we have assumed that all the surfaces $\Delta s'$ are equal and small enough that the field distribution is approximately constant within them. in order to find the optimal $n$ which supports this assumption, we have varied the total number of points in which the near field distribution is calculated from 441 to 10201. we have concluded that increasing the number of points does not lead to significant variations in the results. therefore, we have set the total number of points to 441 for the structures of 16 and 50 cells, 1681 for 100 cells and 3721 for 400 cells.
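the discretized power integral (3) is straightforward to evaluate once the tangential field components are sampled on a regular grid. the python sketch below is one possible implementation, assuming the fields are available as complex numpy arrays; the array and function names are illustrative and are not tied to any wipl-d export format. it is checked on a uniform plane wave, for which the result is known in closed form.

```python
import numpy as np

def power_through_plane(ex, ey, hx, hy, area):
    """Discretized form of (3): Re{ dS * sum(Ex*conj(Hy) - Ey*conj(Hx)) }.

    ex, ey, hx, hy : complex arrays of identical shape (nx, ny)
    area           : total area s' of the observation plane [m^2]
    """
    n = ex.size
    ds = area / n                      # equal elementary surfaces
    integrand = ex * np.conj(hy) - ey * np.conj(hx)
    return float(np.real(ds * np.sum(integrand)))

# sanity check: plane wave travelling along +z with Ex = 1 V/m, Hy = Ex / eta0
eta0 = 376.730313668                   # free-space wave impedance [ohm]
nx = ny = 21                           # 441 points in total
ex = np.ones((nx, ny), dtype=complex)
hy = ex / eta0
zeros = np.zeros_like(ex)
area = 1e-6                            # 1 mm^2 plane, illustrative value

p = power_through_plane(ex, zeros, zeros, hy, area)
print(p)                               # equals area / eta0 for this test field
```

as in (3), no factor of 1/2 is applied; any such constant cancels in the power ratio used in (1).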
in order to get the near field distribution for the incident wave, we have created a separate model with a single wire that does not have a significant impact on the field. the incident wave only has $e_x$ and $h_y$ components, hereafter denoted $e_{x0}$ and $h_{y0}$, which simplifies the power formula (3) to:

$$p_{\text{inc}} = -\mathrm{Re}\left\{\Delta s' \sum_{n}\left(e_{x0}\,h_{y0}^{*}\right)\right\} \tag{4}$$

the minus sign has been added because the incident wave enters the surface. for calculating the power of the reflected wave, we have used the field components imported from the sensor model, from which we have subtracted the field components of the incident wave to obtain the fields of the reflected wave $e_{ir}$ and $h_{ir}$ ($i = x, y$):

$$p_{\text{refl}} = \mathrm{Re}\left\{\Delta s' \sum_{n}\left(e_{xr}\,h_{yr}^{*} - e_{yr}\,h_{xr}^{*}\right)\right\} \tag{5}$$

finally, we have used (1) to determine $|s_{11}|$ for different frequencies from the operating range. considering that the selected sensor was designed as an ma, we have chosen the absorption as the reference parameter in our analysis. since the transmission through the structure is negligible due to the existence of the ground plane, the absorption of the chosen ma is fully defined through the reflection described by $s_{11}$:

$$a = 1 - |s_{11}|^{2} \tag{6}$$

where $|s_{11}|^{2}$ is the normalized reflected power.

2.3. sample

to investigate the sensing capabilities of both infinite and finite sensor structures, we have chosen a sample of the h9n2 subtype of influenza a virus (iav). iavs are respiratory viruses with an rna genome and a serious possibility of causing human epidemics or pandemics [16]. the virus sample has been modelled as a continuous dielectric layer that completely covers the top of the ma structure. the complex permittivity of the sample has been determined from the frequency-dependent dispersive refractive index ($\underline{n} = n + jk$) derived from the drude-lorentz model

$$\underline{n}^{2} = \varepsilon = 1.5 - \frac{\omega_p^{2}}{\omega^{2} - \omega_0^{2} + j\gamma\omega} \tag{7}$$

where $\omega_p = 4$ thz is the plasma frequency, $\omega_0 = 2.8\pi$ thz is the resonant frequency and $\gamma = 4$ thz is the damping coefficient [17].
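the dispersion relation (7) can be evaluated directly. the python sketch below is a minimal illustration that assumes ω = 2πf with f in thz and that the quoted values of the plasma frequency, resonant frequency and damping coefficient are all angular frequencies in units of 10^12 rad/s; this unit interpretation is an assumption made here, and the function names are illustrative.

```python
import cmath
import math

OMEGA_P = 4.0               # plasma frequency, as quoted
OMEGA_0 = 2.8 * math.pi     # resonant frequency, as quoted
GAMMA = 4.0                 # damping coefficient, as quoted
EPS_CONST = 1.5             # constant term in (7)

def sample_permittivity(f_thz):
    """Evaluate (7) at frequency f [THz], assuming omega = 2*pi*f and that all
    quoted parameters share the same angular-frequency unit (1e12 rad/s)."""
    w = 2.0 * math.pi * f_thz
    return EPS_CONST - OMEGA_P**2 / (w**2 - OMEGA_0**2 + 1j * GAMMA * w)

def refractive_index(f_thz):
    """Complex refractive index n + jk as the principal square root of (7)."""
    return cmath.sqrt(sample_permittivity(f_thz))

for f in (0.80, 0.864, 0.90):
    eps = sample_permittivity(f)
    n = refractive_index(f)
    print(f"f = {f:.3f} thz: eps = {eps:.4f}, n = {n.real:.4f}, k = {n.imag:.4f}")
```

with this sign convention the imaginary part of the permittivity is positive below the resonance, so the principal square root returns a positive k, in line with the n + jk form used above.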
the value of $\underline{n}$ calculated for a given frequency from the operating frequency range has been modified with the coefficients a and b, retrieved by thz spectroscopy for an h9n2 sample of protein concentration 0.28 mg/ml, into the complex refractive index $a\,n + j\,b\,k$, where $a = 1.2$ and $b = 1.4$ [17]. finally, the complex permittivity required by wipl-d software was calculated by squaring the corresponding complex refractive index. the whole process was repeated for each frequency used in the simulation. it should be noted that the coefficients a and b, and therefore the calculated values of the complex permittivity of the sample, refer to the specific protein concentration and may vary if it is changed. therefore, samples of different concentrations can be treated as completely different sample types. during the analysis, we have varied the thickness of the virus layer to examine the sensors' behavior with different quantities of the deposited sample. the same analysis can be conducted for a different virus type by altering the coefficients a and b in addition to the parameters of the drude-lorentz model given in (7). for example, for the iav subtypes h1n1 and h5n2, the drude-lorentz parameters remain the same as for h9n2, but the coefficients a and b have to be changed to (1, 1.4) and (1, 1), respectively [17].

3. results and discussion

the results for the selected thz ma are presented and discussed with the aim of investigating the effect that finite dimensions have on the sensor's properties and sensing capabilities.

3.1. behavior without the sample

the absorption response of a finite sensor structure obtained in the frequency range of the first peak varies significantly with the number of unit cells (figure 3). as the number of cells increases, the peak width and its resonant frequency decrease while the prominence of the peak increases, resulting in a response that becomes more similar to that of an infinite sensor structure.
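peak width and prominence can be quantified from a sampled absorption curve through the resonant frequency, the full width at half maximum (fwhm) and the quality factor q = f_resonant / fwhm. the python sketch below shows one way to extract them. the lorentzian test curve and its parameters are purely illustrative and are not simulation data from this paper, and the half-maximum level is assumed to be measured from zero absorption.

```python
import numpy as np

def peak_metrics(freq, absorption):
    """Return (f_resonant, fwhm, q) of the highest peak of a sampled curve.

    A ValueError is raised if the curve never drops below half of the
    peak value on either side, i.e. if the fwhm cannot be defined.
    """
    freq = np.asarray(freq, dtype=float)
    absorption = np.asarray(absorption, dtype=float)
    i_peak = int(np.argmax(absorption))
    half = absorption[i_peak] / 2.0

    lo = i_peak
    while lo > 0 and absorption[lo] >= half:
        lo -= 1
    hi = i_peak
    while hi < len(freq) - 1 and absorption[hi] >= half:
        hi += 1
    if absorption[lo] >= half or absorption[hi] >= half:
        raise ValueError("peak is not prominent enough to define fwhm")

    # linear interpolation of the two half-maximum crossings
    f_left = np.interp(half, absorption[lo:lo + 2], freq[lo:lo + 2])
    f_right = np.interp(half, absorption[hi - 1:hi + 1][::-1],
                        freq[hi - 1:hi + 1][::-1])
    fwhm = f_right - f_left
    return freq[i_peak], fwhm, freq[i_peak] / fwhm

# illustrative lorentzian peak: centre 0.87 thz, fwhm 0.15 thz, maximum 0.97
f = np.linspace(0.6, 1.1, 5001)
a = 0.97 / (1.0 + ((f - 0.87) / (0.15 / 2.0)) ** 2)

f_res, fwhm, q = peak_metrics(f, a)
print(f"f_res = {f_res:.3f} thz, fwhm = {fwhm:.3f} thz, q = {q:.1f}")
```

raising an error when no half-maximum crossing exists reflects the fact that a weakly pronounced peak has no meaningful fwhm or q-factor.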
all of the peaks show strong absorption, which can be attributed to the combination of two effects: the influence of the perforated metallic resonator and the fabry-pérot effect caused by the multiple reflections between the metallic layers [18]. in order to further compare the infinite and finite structures, the corresponding q-factors were calculated and are presented in table 1 together with other parameters of interest: the resonant frequency fresonant, the full width at half maximum (fwhm) and the maximal absorption value (amax). the values given in table 1 numerically confirm the conclusions drawn from figure 3. the resonant peak of the structure with 16 cells is not prominent enough to determine the fwhm and the q-factor. as the number of unit cells increases, the sensor's performance in the frequency range of the first resonant peak improves, which can be seen in the increase of the q-factor. these observations lead to an important conclusion: a finite sensor structure with a sufficient number of unit cells can potentially give results that closely approximate those obtained theoretically with the infinite sensor model.

fig. 3 absorption response of the finite structure for different numbers of unit cells in comparison with the response of the infinite structure

table 1 numerical comparison of sensor models with different number of unit cells

| number of unit cells | fresonant [thz] | amax | fwhm [thz] | q-factor |
|---|---|---|---|---|
| 16 | 0.883 | 0.983 | / | / |
| 50 | 0.878 | 0.9742 | 0.169 | 5.2 |
| 100 | 0.877 | 0.9662 | 0.166 | 5.2 |
| 400 | 0.876 | 0.9663 | 0.147 | 6 |
| infinite | 0.864 | 0.9741 | 0.127 | 6.8 |

to gain a better insight into the underlying physical mechanism of the investigated sensor structures, we calculated the distributions of the electric and magnetic fields for both the infinite sensor and the finite sensor with 50 unit cells at their first resonant frequencies.
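as a side check on table 1, the q-factor column can be recomputed from the other two columns. the section does not state the formula it used, so the sketch below assumes the common definition q = fresonant / fwhm; the dictionary and function names are hypothetical, and the numbers are copied from table 1.

```python
# (f_resonant [THz], FWHM [THz], tabulated Q) taken from Table 1.
TABLE_1 = {
    "50": (0.878, 0.169, 5.2),
    "100": (0.877, 0.166, 5.2),
    "400": (0.876, 0.147, 6.0),
    "infinite": (0.864, 0.127, 6.8),
}


def q_factor(f_res, fwhm):
    """Quality factor under the assumed definition Q = f_res / FWHM."""
    return f_res / fwhm


if __name__ == "__main__":
    for cells, (f_res, fwhm, q_tab) in TABLE_1.items():
        print(cells, round(q_factor(f_res, fwhm), 2), q_tab)
```

under this assumption the recomputed values agree with the tabulated ones to within rounding of the three-digit inputs, and they show the same trend: the q-factor grows toward the infinite-structure value as unit cells are added.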
the sensor with 50 unit cells was chosen for further analysis since it has fewer unit cells than the other models with prominent peaks, which reduced the total modelling and simulation time. the results are presented in figure 4. figure 4 (a) shows that the electric field calculated in a plane parallel and close to the sensor's surface is mainly concentrated in the area around the resonator perforation. the electric field distribution is exactly the same for all the unit cells that compose the infinite sensor structure. in contrast, figure 4 (b) shows that the field distribution on a unit cell of the finite sensor depends on its position in the structure. this phenomenon is a direct consequence of the finite dimensions of the sensor and the end effect that occurs at the borders of the structure. figure 4 (c) and (d) show that the magnetic field distribution in the cross-section of both structures is fairly similar, as the field is mainly gathered in the middle layer made of lossy dielectric. such confinement of the electromagnetic field is typical for metal–dielectric–metal structures, as shown in [8]. the field localization predominantly affects the sensing performance, as the placement of the sample should coincide with the zone of strongest wave-matter interaction in order to achieve high sensitivity. therefore, inverting the placement of the substrate and the sample has been proposed in an effort to enhance the interaction between the thz wave and the sample. mas with integrated microfluidic channels based on this approach were built and tested with solutions of ethanol, glucose and bovine serum albumin (bsa) [8, 19]. however, it should be noted that, due to the technical difficulties of placing and removing samples, these sensors may not be the most suitable candidates for applications that require a large number of consecutive sensing tests and/or have samples that are not in fully liquid form.
fig. 4 distribution of electric field [v/m] for (a) infinite and (b) finite sensor model and magnetic field [ma/m] for (c) infinite and (d) finite sensor model at the first resonant frequency

3.2. behavior with the presence of sample

an example of the absorption response of both structures in the frequency range of the first peak for three different thicknesses (d) of h9n2 is given in figure 5. both structures behave similarly in the presence of the virus sample: the resonant peak shifts to the left (toward lower frequencies) when the thickness of the sample layer is increased. consequently, the resonant frequency shift can be used not only as an indicator of the virus presence in the sample, but also to determine the sample thickness. figure 5 also suggests that there is a limit to such detection because the frequency shift of the resonant peak saturates once the sample thickness is increased beyond a certain extent. besides the frequency, the resonant peak amplitude also varies with the sample properties, as shown in figure 5. the values of both frequency and amplitude shifts for different thicknesses of the sample deposited on top of both sensor structures are presented in table 2. it should be noted that, unlike the resonant frequency, which never grows when the thickness of the sample increases, the resonant peak amplitude sometimes grows and sometimes declines. for that reason, the amplitude shifts given in table 2 are absolute values. table 2 shows that the resonant peak of the absorption response of the finite structure experiences larger frequency shifts and saturates faster than that of the infinite structure. additionally, the amplitude shifts are more dynamic for the finite structure.

fig. 5 comparison between the absorption responses of the finite sensor made of 50 unit cells with different thicknesses of h9n2 sample (full line) and the results for the infinite model (dashed line)

table 2 frequency and amplitude shifts for different thicknesses of h9n2 sample deposited on top of the infinite and finite sensor structures

| structure | thickness [µm] | fresonant [thz] | amax | frequency shift [ghz] | amplitude shift [x10^-4] |
|---|---|---|---|---|---|
| infinite | 0 | 0.864 | 0.9741 | 0 | 0 |
| infinite | 1 | 0.827 | 0.9736 | 37 | 5 |
| infinite | 5 | 0.771 | 0.9844 | 93 | 103 |
| infinite | 8 | 0.750 | 0.9828 | 114 | 87 |
| finite | 0 | 0.878 | 0.9741 | 0 | 0 |
| finite | 1 | 0.823 | 0.9268 | 55 | 473 |
| finite | 5 | 0.761 | 0.9458 | 117 | 283 |
| finite | 8 | 0.748 | 0.9840 | 130 | 99 |

the analysis conducted so far refers to a uniform distribution of the virus sample across the sensors' surface. in order to investigate the impact of an uneven sample distribution on the response, we created several models with different sample distributions based on the model of the sensor with 50 unit cells. these models were created by removing the sample from certain unit cells, thus creating "holes" in the sample layer. since the observed structure has 50 unit cells, there are 2^50 different distributions that can be analyzed (each unit cell can be covered with the sample or not). taking into account the symmetry plane used in the modelling process shown in figure 2, the number of possible distributions decreases to 2^25, which is still a considerable number to cover by analysis. in order to find representative distributions to include in our models, we set three parameters that affect the absorption response and that we wanted to characterize: the number of "holes", the separation between them and their position with respect to the field distribution given in figure 4. let us first formalize the coordinates that describe the position of a "hole" in the sample placed on top of the sensor's surface, as in figure 6.
the gray unit cells in figure 6 belong to the part of the structure that is obtained by using the symmetry plane. we can only choose the position of a "hole" among the white unit cells, and that choice automatically places another "hole" on the symmetrical gray unit cell. for example, if a "hole" is placed on (2, 3), one is inevitably also placed on (2, -3). with that in mind, in the following analysis we only state the position of the "hole" in the white part of the structure, and the position of the corresponding "hole" in the gray part is implied. the number of "holes" is thus always even.

fig. 6 the coordinates of the "holes" in the sample

first, the number of "holes" was set to two and their position and mutual distance were varied. the results presented in figure 7 indicate that placing two "holes" in the sample does lead to certain changes in the absorption response, such as small frequency and amplitude shifts and slight deformations of the resonant peak's shape. both the amplitude and the frequency of the resonant peak increase when two "holes" are introduced in the sample. the maximum increase of both parameters is achieved in the case of two connected "holes" in the center of the structure ((3, 1) and its pair); the corresponding maximal frequency and amplitude shifts are 3 ghz and 0.0074. the changes of the resonant frequency and amplitude are the smallest when the "holes" are further away from the center, whether the "holes" are connected ((5, 1) and its pair) or completely separated from each other ((4, 4), (2, 3) and their pairs). the differences between the absorption values for the models with "holes" in the sample and the original model with uniform distribution indicate that the shape of the resonant peak is slightly altered by the introduction of two "holes".

fig. 7 absorption response for different positions of the "holes" in the 8 µm thick sample in the case of two "holes"

next, the number of "holes" was increased to six and three different distributions were observed. the obtained results are shown in figure 8. the changes in the absorption response are more pronounced than with two "holes". the maximal frequency shift of 8 ghz is achieved when six consecutive "holes" form a 1x6 rectangular "hole" near the center of the structure ((2, 1 – 3) and their pairs). the peak amplitude in that case has the maximal decrease of 0.0221, which is about three times the absolute value of the corresponding shift for two "holes". in the other two cases, the peak amplitude is slightly increased, but significantly less than for two "holes" in the sample.

fig. 8 absorption response for different positions of the "holes" in the 8 µm thick sample in the case of six "holes"

4. conclusion

we have investigated the impact of finite dimensions on the sensing performance of a thz metamaterial absorber based on the typical planar metal-dielectric-metal structure. the results have shown that, as the number of unit cells increases, the absorption response approaches that of an infinite structure, which is numerically reflected in the decreased width of the resonant peak and the increased q-factor. the calculated electric field distribution has indicated that the field is mainly localized around the rectangular perforation regardless of the number of unit cells. unlike the infinite structure, the structure with a finite number of unit cells has shown a dependence of the field distribution on the position of the unit cell due to the end effect. the electromagnetic field was primarily confined in the lossy dielectric layer for both the infinite and the finite structure, which is typical for metal-dielectric-metal based structures.
the behavior of the infinite and finite sensors in the presence of the h9n2 virus sample was examined. first, the sample was evenly distributed across the sensors' surfaces. the results have shown that the resonant peak of the finite structure experiences greater frequency shifts and saturates more quickly with increasing virus layer thickness than that of the infinite structure. finally, we investigated the effect of uneven sample distribution on the finite sensor structure by removing the sample from the top of certain unit cells. the analysis has shown that creating "holes" in the sample does lead to changes in the absorption response, such as frequency and amplitude shifts and slight deformations of the resonant peak's shape. among the parameters considered, the number of "holes" in the sample contributed the most to these changes.

acknowledgment: this research was supported in part by the ministry of education, science and technological development of the republic of serbia, project no. 2022/200103, and by the innovation fund of the republic of serbia. the authors would like to acknowledge the contribution of the eu cost action ca18223.

references

[1] b. x. wang, w. q. huang and l. l. wang, "ultra-narrow terahertz perfect light absorber based on surface lattice resonance of a sandwich resonator for sensing applications", rsc advances, vol. 7, pp. 42956-42963, 2017.
[2] d. hu, t. meng, h. wang, y. ma and q. zhu, "ultra-narrow-band terahertz perfect metamaterial absorber for refractive index sensing application", results in phys., vol. 19, p. 103567, pp. 1-5, 2020.
[3] y. wang, d. zhu, z. cui, l. hou, l. lin, f. qu, x. liu and p. nie, "all-dielectric terahertz plasmonic metamaterial absorbers and high-sensitivity sensing", acs omega, vol. 4, pp. 18645-18652, 2019.
[4] f. yan, q. li, h. tian, z. wang and l. li, "ultrahigh q-factor dual-band terahertz perfect absorber with dielectric grating slit waveguide for sensing", j. phys. d: appl. phys., vol. 53, p. 235103, pp. 1-9, 2020.
[5] q. xie, g. dong, b. wang and w. huang, "design of quad-band terahertz metamaterial absorber using a perforated rectangular resonator for sensing applications", nanoscale res. lett., vol. 13, p. 137, pp. 1-8, 2018.
[6] m. janneh, a. de marcellis, e. palange, a. t. tenggara and d. byun, "design of a metasurface-based dual-band terahertz perfect absorber with very high q-factors for sensing applications", optics commun., vol. 416, pp. 152-159, 2018.
[7] w. yin, z. shen, s. li, l. zhang and x. chen, "a three-dimensional dual-band terahertz perfect absorber as a highly sensitive sensor", front. phys., vol. 9, p. 665280, pp. 1-10, 2021.
[8] x. hu, g. xu, l. wen, h. wang, y. zhao, y. zhang, d. r. s. cumming and q. chen, "metamaterial absorber integrated microfluidic terahertz sensors", laser photonics rev., vol. 10, pp. 962-969, 2016.
[9] l. cong, s. tan, r. yahiaoui, f. yan, w. zhang and r. singh, "experimental demonstration of ultrasensitive sensing with terahertz metamaterial absorbers: a comparison with the metasurfaces", appl. phys. lett., vol. 106, p. 031107, pp. 1-7, 2015.
[10] a. kovačević, m. potrebić and d. tošić, "sensitivity analysis of possible thz virus detection using quad-band metamaterial sensor", in proceedings of the ieee 32nd international conference on microelectronics (miel), niš, serbia, 2021, pp. 107-110.
[11] n. akter, m. m. hasan and n. pala, "a review of thz technologies for rapid sensing and detection of viruses including sars-cov-2", mdpi biosensors, vol. 11, p. 349, pp. 1-21, 2021.
[12] n. shen, p. tassin, t. koschny and c. soukoulis, "comparison of gold- and graphene-based resonant nano-structures for terahertz metamaterials and an ultra-thin graphene-based modulator", phys. rev. b, vol. 90, no. 11, p. 115437, pp. 1-8, 2014.
[13] wipl-d pro 17, 3d electromagnetic solver, wipl-d d.o.o., belgrade, serbia, 2021. available online: http://www.wipl-d.com (accessed on 29 april 2022).
[14] g. wang, f. zhu, t. lang, j. liu, z. hong and j. qin, "all-metal terahertz metamaterial biosensor for protein detection", nanoscale res. lett., vol. 16, p. 109, pp. 1-10, 2021.
[15] s. j. park, s. h. cha, g. a. shin and y. h. ahn, "sensing viruses using terahertz nano-gap metamaterials", biomed. opt. express, vol. 8, pp. 3551-3558, 2017.
[16] b. dadonaite, b. gilbertson, m. l. knight, s. trifković, s. rockman, a. laederach, l. e. brown, e. fodor and d. l. v. bauer, "the structure of the influenza a virus genome", nat. microbiol., vol. 4, no. 11, pp. 1781-1789, 2019.
[17] m. amin, o. siddiqui, h. abutarboush, m. farhat and r. ramzan, "a thz graphene metasurface for polarization selective virus sensing", carbon, vol. 176, pp. 580-591, 2021.
[18] b. wang, a. sadeqi, r. ma, p. wang, w. tsujita, k. sadamoto, y. sawa, h. r. nejad, s. sonkusale, c. wang et al, "metamaterial absorber for thz polarimetric sensing", in proceedings of the spie, terahertz, rf, millimeter, and submillimeter-wave technology and applications xi, san francisco, ca, usa, 2018, vol. 10531, pp. 1-7.
[19] f. lan, f. luo, p. mazumder, z. yang, l. meng, z. bao, j. zhou, y. zhang, s. liang, z. shi et al, "dual-band refractometric terahertz biosensing with intense wave-matter-overlap microfluidic channel", biomed. opt. express, vol. 10, pp. 3789-3799, 2019.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 189-208 https://doi.org/10.2298/fuee2302189d © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

design and implementation of fractional-order controller in delta domain

sujay kumar dolai1, arindam mondal2, prasanta sarkar3
1department of electrical engineering, dit, kolkata, west bengal, india
2department of electrical engineering, dr. bc roy engineering college, durgapur, west bengal, india
3department of electrical engineering, nitttr kolkata, west bengal, india

abstract. in this work, a fractional-order controller (foc) is designed in the discrete domain using delta operator parameterization. the foc is rationally approximated using continued fraction expansion (cfe) in the delta domain. whenever a continuous-time system is discretized, the choice of sampling time becomes the most critical parameter for obtaining accurate results. a high sampling rate cannot be used with the conventional shift operator parameterization, whereas a discrete-time system parameterized with the delta operator circumvents the problems of the shift operator parameterization at the high sampling limit. in this work, a first-order plant with delay is controlled with an foc that is implemented in the discrete delta domain. the plant model is built in matlab as well as in hardware. the fractional-order controller is tuned in the continuous domain and discretized in the delta domain to obtain the discrete delta foc. the continuous-time fractional order operator ($s^{\pm\alpha}$) is directly discretized in the delta domain to get the overall foc in the discrete domain. the designed controller is implemented using matlab simulink and a dspace board, such that the dspace board acts as the hardware-implemented foc. the step response characteristics of the closed-loop system using the delta domain foc resemble those obtained with the continuous-time controller. this shows that, at a high sampling rate, the continuous-time and discrete-time results are obtained hand in hand rather than as two separate cases. therefore, the analysis and design of an foc parameterized with the delta operator opens up a new area in the design and implementation of discrete focs, which unifies both continuous and discrete-time results.
the performance characteristics of the discrete model are evaluated in software simulation using matlab, and the results are validated through hardware implementation using dspace.

key words: continued fraction expansion, delta operator, dspace, fractional order controller

received august 01, 2022; revised october 20, 2022; accepted november 04, 2022
corresponding author: sujay kumar dolai, department of electrical engineering, dit, kolkata, west bengal, india
e-mail: dolaisujay@gmail.com

1. introduction

a fractional-order system (fos) is a system having a non-integer order differentiator and integrator. nowadays fos has become a vital research area not only in mathematics but also in system theory and control. according to the literature, most real-world systems are inherently of fractional order [1]–[3]. since the inception of fractional calculus in the year 1695, mathematicians have extended it, and it has found use in control theory [4]. over the last few decades, researchers have paid attention to the modeling, analysis, simulation and solution of differential equations in the fractional order domain in order to give a clear picture of fos [5]–[7]. control engineers nowadays use fractional-order calculus as the basis of fractional-order controllers (foc). for controlling a plant, the fractional-order controller has become an essential tool alongside the integer-order controller, and the literature reports that the performance of the fractional-order controller is better than that of the integer-order controller [8]. the electrochemical process [9], dielectric polarization [10], visco-electric materials [11], chaos electromagnetic fractional poles [12] and signal processing [13] are the primary areas where fractional order calculus has been rigorously used in the last decade.
in the case of fos, the differentiator/integrator is symbolized by an irrational operator $s^{\pm\mu}$, where s is a complex quantity known as the laplace transform variable. for μ = 1 the irrational operator becomes the integer-order operator $s^{\pm 1}$. the infinite-dimensional irrational operator $s^{\pm\mu}$ is usually converted to a rational function either in the continuous domain (s-domain) or in the discrete domain (z or δ domain). to implement fractional operators in the discrete domain, the discretization of the operator is of primary concern [14]. the most common discretization method is the tustin operator-based method. a comparative study of the different discretization methods in the z-domain is summarized in [14], together with the merits and demerits of each method. for the realization of the fractional order operator in the discrete domain, the sampling rate used for discretization should be at least 6-10 times the system bandwidth, as suggested by shannon. but when the sampling rate is increased beyond a certain extent, the corresponding z-domain transfer function becomes numerically ill-conditioned and thereby fails to provide meaningful insights. a digital controller designed in the delta domain is better than the corresponding controller designed using the shift operator [15]. the advantages of the delta operator parameterization are elaborated in [16], [17], particularly for cases where the discrete z-domain results fail at a high sampling rate. the delta operator has proven its potential in control theory [18], in system identification [19], in fault detection and network control [20], and in kalman filter-based controller design for cyber-physical systems [21]. direct discretization from the continuous-time domain to the delta domain can make the fo controller design procedure smoother, and methods for it have been proposed in [22], [23].
high-speed digital realization of the fractional order operator is possible using the properties of the delta operator parameterization [24]. moreover, the delta operator parameterization has made it possible to understand both continuous and discrete-time systems in a unified framework. for designing the fractional order controller, different realization techniques are discussed in the literature (an231e04 data sheet, 2012), [25]–[27]. the tuning of the controller parameters is a fundamental issue. several optimization techniques [28], [29] in the frequency domain [8], [30] are available. analog realizations of fractional-order pid controllers have been proposed in [31]–[34]. a digital implementation of the foc for a boost converter using shift operator parameterization has been carried out successfully in [33]. digital implementation of fractional-order controllers using fpga via shift operator parameterization in the indirect discretization domain is presented in [35], [36]. in this paper, the ds1202 dspace board is the platform on which a real-time controller in the discrete delta domain is implemented. the performance of the proposed controller is studied using both simulations and a digital hardware platform, and a comparative study is carried out. the contributions of this paper are manifold: in earlier work, fractional-order controllers were designed with different analog realization techniques. the discrete-time systems designed so far use shift operator parameterization, but shift operator parameterization fails to provide meaningful information at a high sampling rate. the real-time implementation of a controller in the digital domain needs a very high sampling rate to get a better result.
in this work, the fo controller for an integer-order plant with dead time is designed using the delta operator parameterization, and the hardware realization is made using dspace. at the fast-sampling limit, the discrete-domain results resemble the continuous-time results, which provides a unified method of foc design in the delta domain. a new direct discretization method for converting the continuous-time fractional order operator into the discrete delta domain is utilized to obtain the rational transfer function in the delta domain for implementation on the dspace board. therefore, the digital design and implementation of an foc using delta operator parameterization and dspace is a newer concept and a new direction for further research.

the paper is organized as follows: the basics of the fractional-order system and controller are discussed in section 2. in section 3, the discretization of fractional order operators using the delta operator is described. the digital realization of the fopid controller using the delta operator is demonstrated in section 4. in section 5, the implementation of the proposed controller in simulink and on the dspace board is discussed. finally, sections 6 and 7 are devoted to the result analysis and the conclusion, respectively.

2. fractional order system

2.1. fractional order calculus

in fractional calculus, non-integer order differentiation/integration is denoted by a fundamental operator ${}_{m}D_{\tau}^{\psi}$, where ψ specifies the order of the operation (differentiation or integration). this operator is known as the integro-differential operator and is mathematically represented as

$${}_{m}D_{\tau}^{\psi} = \begin{cases} \dfrac{d^{\psi}}{d\tau^{\psi}} & (\psi > 0) \\ 1 & (\psi = 0) \\ \displaystyle\int_{m}^{\tau} (d\tau)^{-\psi} & (\psi < 0) \end{cases} \qquad (1)$$

there are two popular definitions of the integro-differential operator, the grünwald-letnikov (gl) and the riemann-liouville (rl) definitions. they are given by (2) and (3), respectively.

gl definition:

$${}_{m}D_{\tau}^{\psi}\varphi(t) = \lim_{p \to 0} \frac{1}{p^{\psi}} \sum_{n=0}^{\left[\frac{\tau - m}{p}\right]} (-1)^{n} \binom{\psi}{n} \varphi(t - np) \qquad (2)$$

rl definition:

$${}_{m}D_{\tau}^{\psi}\varphi(t) = \frac{1}{\Gamma(x - \psi)} \frac{d^{x}}{dt^{x}} \int_{m}^{t} \frac{\varphi(p)}{(t - p)^{\psi - x + 1}}\, dp \qquad (3)$$

where the value of ψ varies from (x − 1) to x and Γ denotes euler's gamma function.

2.2. fractional order differential equation and transfer function

a fractional-order differential equation is used to describe the dynamics of a fractional-order system (fos). as in the case of the classical integer-order system, the laplace transform of the fractional-order differential equation generates the transfer function of the fos. the mathematical equation of a fractional-order system is described by (4).

$$a_{n} D^{\psi_{n}} y(t) + a_{n-1} D^{\psi_{n-1}} y(t) + \dots + a_{0} D^{\psi_{0}} y(t) = b_{m} D^{r_{m}} u(t) + b_{m-1} D^{r_{m-1}} u(t) + \dots + b_{0} D^{r_{0}} u(t) \qquad (4)$$

where $D^{\psi} \equiv {}_{0}D_{t}^{\psi}$ is the rl derivative or the caputo fractional derivative. the input and the output of the system are denoted by u(t) and y(t), respectively, ai (i = 0, ..., n) and bi (i = 0, ..., m) are constants, and ψi (i = 0, ..., n) and ri (i = 0, ..., m) are arbitrary real numbers. in general, the values of ψi and rj can be ordered as $\psi_{n} > \psi_{n-1} > \dots > \psi_{0}$ and $r_{m} > r_{m-1} > \dots > r_{0}$. the laplace transform of (1) gives rise to the continuous-time relation (5).

$$L\{{}_{m}D_{\tau}^{\psi}\varphi(t)\} = s^{\psi}\,\Phi(s) \qquad (5)$$

according to the definition of caputo, the lower limit m of the fractional derivative is taken equal to 0, and the laplace transform of φ(t) is denoted by Φ(s). applying the laplace transform to both sides of (4) with the help of (5) gives the transfer function of a system with y(t) as the output and u(t) as the input.

$$G(s) = \frac{Y(s)}{U(s)} = \frac{b_{m} s^{r_{m}} + b_{m-1} s^{r_{m-1}} + \dots + b_{0} s^{r_{0}}}{a_{n} s^{\psi_{n}} + a_{n-1} s^{\psi_{n-1}} + \dots + a_{0} s^{\psi_{0}}} \qquad (6)$$

where U(s) = L{u(t)} and Y(s) = L{y(t)}.

2.3. fractional order pid controller ($PI^{\lambda}D^{\mu}$)

the fractional order pid controller performs better than the integer-order pid controller owing to its greater number of degrees of freedom. in the case of the fopid controller, the orders of the integrator and differentiator (−λ < 0, μ > 0) are non-integer. therefore, by using fractional-order calculus for differentiation, integration and the laplace transform, the continuous-time transfer function of the fractional order pid controller takes the following form:

$$G_{c}(s) = \frac{U(s)}{E(s)} = K_{p} + K_{i}\, s^{-\lambda} + K_{d}\, s^{\mu}, \quad (\lambda, \mu > 0) \qquad (7)$$

where U(s) = L{u(t)} and E(s) = L{e(t)} are the output and the input of the controller, respectively. the integer-order pid controller is obtained by using λ = 1 and μ = 1 in (7). likewise, the pd controller is obtained for λ = 0 and ki = 0. it can be concluded that (7) is the generalized transfer function of the integer/fractional-order controller. the basic structure of the fopid controller is given in fig. 1.

fig. 1 fractional order $PI^{\lambda}D^{\mu}$ controller

3. direct discretization of fractional order integrator and differentiator using delta operator

3.1. relationship between s-domain and δ-domain

the shift operator parameterization is commonly used to describe a discrete-time system. the forward shift operator is usually denoted by q. the delta domain is an area where discrete-time systems are represented using the delta operator δ. the delta operator (δ) is nothing but a scaled and shifted version of the forward shift operator (q). it is related to the forward shift operator q as follows (Δ is the sampling time):

$$\delta = \frac{q - 1}{\Delta} \qquad (8)$$

at a high sampling rate (Δ → 0), the following identity is obtained when the delta operator is applied to a differentiable signal y(t):

$$\lim_{\Delta \to 0} \delta\, y(t) = \lim_{\Delta \to 0} \frac{y(t + \Delta) - y(t)}{\Delta} = \frac{d}{dt}\, y(t) \qquad (9)$$

as can be seen from (9), the continuous-time derivative is recovered from the delta-operated signal at the fast-sampling limit. the relationship between the frequency variable γ in the delta domain and the frequency variable z of the shift operator domain is given below:

$$\gamma = \frac{z - 1}{\Delta} \qquad (10)$$

replacing $z = e^{s\Delta}$ in (10) gives the relationship between the frequency variables in continuous time and in discrete delta time, which is depicted by (11).

$$\gamma = \frac{e^{s\Delta} - 1}{\Delta}, \quad \text{or} \quad e^{s\Delta} = 1 + \Delta\gamma, \quad s = \frac{1}{\Delta} \ln(1 + \Delta\gamma) \qquad (11)$$

equation (11) represents the direct relationship between the variables s and γ.

3.2. direct discretization of fractional order operator in delta domain

for the realization of an foc in the delta domain, the discretization of the fractional order operator ($s^{\pm\alpha}$) in the delta domain plays the pivotal role. from (11), the transformation of the fractional order operator from the continuous-time domain into the delta domain can be written as:

$$s^{\pm\alpha} = \left[\frac{1}{\Delta} \ln(1 + \Delta\gamma)\right]^{\pm\alpha} \qquad (12)$$

by using the trapezoidal quadrature rule [37] and cfe, the function ln(1 + x) can be approximated in closed form as follows:

$$\ln(1 + x) \approx \frac{6x + 3x^{2}}{6 + 6x + x^{2}} \qquad (13)$$

replacing x by Δγ in (13), (11) can be rewritten as

$$s = \frac{1}{\Delta} \ln(1 + \Delta\gamma) \approx \frac{6\gamma + 3\Delta\gamma^{2}}{6 + 6\Delta\gamma + \Delta^{2}\gamma^{2}} \qquad (14)$$

from (14), it is evident that at a fast sampling rate (Δ → 0), s ≈ γ, meaning that the continuous and discrete delta domains coincide; (14) thereby gives a direct relationship between the two domains.
equation (12) can be rewritten as:

$$s^{\pm r} \approx \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{\pm r} \tag{15}$$

a rational transfer function in the delta domain corresponding to any fractional order operator can be realized from (15) through the direct discretization method, as demonstrated in [23]. in the continuous-time system representation, the fractional-order differentiator (fod) and the fractional-order integrator (foi) are mathematically expressed as:

$$G_d(s) = s^{r} \qquad (0 < r < 1) \tag{16}$$

$$G_i(s) = s^{-r} \qquad (0 < r < 1) \tag{17}$$

continued fraction expansion (cfe) [38], [39] is a powerful tool that operates on the generating function to yield a rational transfer function. the cfe approximation is mathematically formulated using (18) [39]:

$$(1 + p)^{q} = 1 + \cfrac{qp}{1 + \cfrac{(1 - q)p}{2 + \cfrac{(1 + q)p}{3 + \cfrac{(2 - q)p}{2 + \cfrac{(2 + q)p}{5 + \cdots}}}}} \tag{18}$$

to bring (15) into the standard cfe form of (18), $p$ is replaced by

$$\left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right] - 1 .$$

here, (15) is used as the generating function for the integer order approximation of the fractional-order differentiator/integrator in the delta domain, as mathematically represented by (19):

$$G_{del}(\gamma) = \mathrm{CFE}\left\{ \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{\pm r} \right\} \tag{19}$$

in this work, third order approximations of the fod and the foi are considered for realization and implementation. the delta domain coefficients [23] for the third order approximation of $s^r$ are tabulated in table 1.
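the expansion (18) can be evaluated numerically by working from the innermost term outwards. the python sketch below is an illustration under stated assumptions, not the symbolic procedure of [23] that yields the tabulated coefficients: the helper names `cfe_power` and `generating_function`, the truncation depth `terms` and the sample values are hypothetical. it evaluates the truncated continued fraction for $(1 + p)^q$ and applies it to the generating function of (15), with $p$ set to the generating function minus one and $q = r$.

```python
def cfe_power(p, q, terms=7):
    """Evaluate the continued fraction (18) for (1 + p)**q, truncated after `terms` levels."""

    def a(k):
        # Partial numerators: q*p, (1-q)p, (1+q)p, (2-q)p, (2+q)p, ...
        if k == 1:
            return q * p
        j = k // 2
        return (j - q) * p if k % 2 == 0 else (j + q) * p

    def b(k):
        # Partial denominators: 1, 2, 3, 2, 5, 2, 7, ...
        if k == 1:
            return 1.0
        return 2.0 if k % 2 == 0 else float(k)

    tail = b(terms)
    for k in range(terms, 1, -1):
        tail = b(k - 1) + a(k) / tail
    return 1.0 + a(1) / tail


def generating_function(gamma, delta):
    """Bracketed rational function of (15) (before raising to the power r)."""
    return (6.0 * gamma + 3.0 * gamma**2 * delta) / (
        6.0 + 6.0 * gamma * delta + (gamma * delta) ** 2
    )


if __name__ == "__main__":
    r, delta, gamma = 0.5, 0.001, 1.2  # hypothetical sample values
    x = generating_function(gamma, delta)
    print(cfe_power(x - 1.0, r), x**r)  # truncated CFE versus direct power
```

the sketch only evaluates the truncated fraction at a single point; obtaining a rational function in $\gamma$, as in table 1, requires carrying $\gamma$ symbolically through the same recursion.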
table 1 delta-domain coefficients for third-order approximation of $s^r$

dnum: 6 5 4 3 2 3 (3 ) ( 1) (4096 26624 9472 201472 252944 331304 506955)dnum / r / r + / r + r + r r r + r += 

numerator coefficients
$h_0$: 6 3 5 2 7 4 3 (30720 454416 36096 838259 78360 4096 192000 506955) r + r r r + r r r + dnum
$h_1$: 2 3 5 6 4 3 ( 938460 1388142 723408 608640 76800 12288 12288 ) r + r + r r + r + r dnum       
$h_2$: 2 2 2 2 3 2 5 2 2 4 3 ( 465120 195900 128640 15360 714105 57600 ) r r + r r + + r dnum      
$h_3$: 3 2 3 4 3 3 ( 64320 7680 97950 )+ r + r + dnum  

denominator coefficients
$i_0$: 7 6 5 4 3 2 3 (4096 30720 36096 192000 454416 78360 838259 506955) r + r + r r r + r + r + / dnum
$i_1$: 2 3 5 6 4 3 (938460 1388142 723408 608640 76800 12288 12288 ) + r + r r + r + r + r / dnum       
$i_2$: 2 2 2 2 3 2 5 2 2 4 3 ( 465120 195900 128640 15360 714105 57600 ) + r + r r + r + + r / dnum      
$i_3$: 3 2 3 4 3 3 ( 64320 7680 97950 )+ r + r + / dnum  

from the coefficients of table 1, the 3rd order rational approximation of $s^r$ is obtained, and the 3rd order generalized transfer function is given by (20):

$$s^{r} \approx \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{r} \approx G_d(\gamma) = \frac{h_0\gamma^3 + h_1\gamma^2 + h_2\gamma + h_3}{i_0\gamma^3 + i_1\gamma^2 + i_2\gamma + i_3} \tag{20}$$

4. digital realization of fractional-order pid controller in the delta domain

the transfer function of the $PI^{\lambda}D^{\mu}$ controller in continuous time is given by (7). to realize the controller transfer function in the delta domain, fractional order operators such as $s^{-\lambda}$ and $s^{\mu}$ are to be implemented in the delta domain using (20). the $PI^{\lambda}D^{\mu}$ controller in the delta domain takes the form

$$C(\gamma) = K_p + K_i \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{-\lambda} + K_d \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{\mu} \tag{21}$$

in this work, the proposed foc, designed in the delta domain, is used to control a plant of first order with time delay [33].
the plant transfer function $G_p(s)$ is modeled through the first order padé approximation to obtain (22):

$$G_p(s) = \frac{K_p}{1 + sT} e^{-Ls} \approx \frac{K_p}{1 + sT} \left( \frac{1 - \frac{Ls}{2}}{1 + \frac{Ls}{2}} \right) \tag{22}$$

considering $T = 1$ and $L = 0.1$, the plant becomes

$$G_p(s) = \frac{K_p}{1 + s} e^{-0.1s} \approx \frac{K_p}{1 + s} \left( \frac{1 - 0.05s}{1 + 0.05s} \right) \tag{23}$$

the fopid controller in the continuous-time domain is tuned using particle swarm optimization (pso) [33] for the plant given by (23). the tuned parameters of the fopid controller are: proportional gain ($K_p$) = 0.7469, integral gain ($K_i$) = 0.874, derivative gain ($K_d$) = 0.0001, $\lambda = 1.2089$ and $\mu = 0.0603$. the fopid controller in the discrete delta domain takes the form shown in (24):

$$C(\gamma) = 0.7469 + 0.874 \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{-1.2089} + 0.0001 \left[ \frac{6\gamma + 3\gamma^2\Delta}{6 + 6\gamma\Delta + \gamma^2\Delta^2} \right]^{0.0603} \tag{24}$$

the 3rd order rational approximation of the controller in the delta domain (the sampling time is taken as $\Delta = 0.001$ second) is obtained using (20) and expressed by (25):

$$C(\gamma) = 0.7469 + \frac{9.514\gamma^3 + 0.0006938\gamma^2 + 1.293 \times 10^{-8}\gamma + 7.102 \times 10^{-14}}{\gamma^3 + 0.0009524\gamma^2 + 4.021 \times 10^{-8}\gamma + 3.909 \times 10^{-13}} + \frac{9.031 \times 10^{-5}\gamma^3 + 1.558 \times 10^{-8}\gamma^2 + 4.316 \times 10^{-13}\gamma + 3.148 \times 10^{-18}}{\gamma^3 + 0.0001543\gamma^2 + 4.106 \times 10^{-9}\gamma + 2.911 \times 10^{-14}} \tag{25}$$

4.1. realization of controller using df-ii method

in this work, the delta domain fopid controller is realized using the direct form ii (df-ii) realization method.
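a minimal sketch of how a delta-domain direct form ii structure can be computed sample by sample is given below in python. it is an assumption-laden illustration rather than the dspace/simulink implementation used in this work: the function name `delta_df2_filter`, the coefficient ordering (ascending powers of $\gamma^{-1}$, obtained by dividing numerator and denominator by the highest power of $\gamma$) and the first-order test filter are hypothetical. each inverse delta block follows from (10): since $\gamma = (z - 1)/\Delta$, a block with input $w$ and output $v$ obeys $v[k+1] = v[k] + \Delta\, w[k]$.

```python
def delta_df2_filter(num, den, delta, x):
    """Direct form II filter built from inverse-delta blocks.

    num[i] and den[i] multiply gamma**(-i); both lists must have the same
    length and den[0] must be non-zero. `x` is the input sample sequence.
    """
    if len(num) != len(den):
        raise ValueError("num and den must have the same length")
    n = len(den) - 1
    b = [c / den[0] for c in num]   # normalise so that the leading
    a = [c / den[0] for c in den]   # denominator coefficient is 1
    v = [0.0] * n                   # v[i] = gamma**-(i+1) applied to w
    y = []
    for xk in x:
        # Feedback part: intermediate signal w[k].
        w = xk - sum(a[i + 1] * v[i] for i in range(n))
        # Feed-forward part: output y[k].
        y.append(b[0] * w + sum(b[i + 1] * v[i] for i in range(n)))
        # Update the chain of inverse-delta blocks with values from step k.
        prev = w
        new_v = []
        for i in range(n):
            new_v.append(v[i] + delta * prev)
            prev = v[i]
        v = new_v
    return y


if __name__ == "__main__":
    # Hypothetical test filter F(gamma) = 1 / (gamma + 2), written as
    # gamma**-1 / (1 + 2 * gamma**-1), driven by a unit step.
    y = delta_df2_filter([0.0, 1.0], [1.0, 2.0], 0.01, [1.0] * 500)
    print(y[-1])  # approaches the dc gain 1/2
```

for the test filter the recursion reduces to $y[k+1] = (1 - 2\Delta)\,y[k] + \Delta$, so the step response is $y[k] = 0.5\,(1 - (1 - 2\Delta)^k)$.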
the foc can be realized in iir form in the z-domain as follows:

$$F(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_m z^{-m}}{a_0 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_n z^{-n}} \tag{26}$$

the foc can be realized in iir form in the γ-domain as follows:

$$F(\gamma) = \frac{Y(\gamma)}{X(\gamma)} = \frac{m_0 + m_1 \gamma^{-1} + m_2 \gamma^{-2} + \cdots + m_m \gamma^{-m}}{n_0 + n_1 \gamma^{-1} + n_2 \gamma^{-2} + \cdots + n_n \gamma^{-n}} \tag{27}$$

the functional diagram of the delta df-ii realization method, corresponding to the governing iir equation (27), is depicted in fig. 2.

fig. 2 delta direct form ii realization structure

the unit delay block ($z^{-1}$) of the discrete z-domain is rebuilt in the discrete delta domain using (10) to realize the foc in the delta domain. this is called the delta direct form ii (ddf-ii) realization. the corresponding block ($\gamma^{-1}$) in the γ-domain is represented by (28):

$$\gamma^{-1} = \frac{\Delta z^{-1}}{1 - z^{-1}} \tag{28}$$

4.1.1. delta direct form-ii realization of foi

the integrator part of (25) is considered for the ddf-ii realization. fig. 3 shows the ddf-ii realization of the integrator section.

fig. 3 delta direct form ii realization of fractional-order integrator section of fractional order controller

4.1.2. delta direct form-ii realization of fod

the differentiator part of (25) is considered for the ddf-ii realization. fig. 4 shows the ddf-ii realization of the differentiator section.

fig. 4 delta direct form ii realization of fractional-order differentiator section of fractional order controller

4.2. implementation of digital controller designed in delta domain using dspace

data acquisition and control of the prototype system with a controller is accomplished using the ds1202 dspace microlabbox, which can be reprogrammed using matlab/simulink and the dspace software.
the dspace software package integrates the real-time interface and the model-based input-output with simulink and control desk. if a continuous system is to be controlled with a digital controller having a sampling time of $\Delta$, the functional diagram shown in fig. 5 can be utilized; it also illustrates the interfacing of the system and the controller. an analog to digital converter (adc) carries the information from the sensor to the controller in dspace, and a digital to analog converter (dac) sends the control signal back.

fig. 5 real-time control structure

the selection of the sample time of the control program in dspace depends on the time constant of the physical system, which is in turn related to the dynamics of the system. the actual hardware setup for the experiment is shown in fig. 6, where the plant is designed in the continuous-time domain and the controller is designed in the delta domain (discrete-time domain) and implemented through the ds1202 dspace board.

fig. 6 actual photograph of the experimental setup

fig. 7 shows the analog realization of the fo plant [8] in the continuous-time domain. the parameters required to design the fo plant shown in fig. 7 are summarized in table 2.

table 2 component specifications for designing the fo plant

| elements | value |
|---|---|
| r1 | 40 kΩ |
| r2 | 10 kΩ |
| r3 | 500 Ω |
| c1, c2 | 15 nf |

fig. 7 analog realization of fractional order plant

fig. 8 shows the digital realization of the fopid controller, designed using the delta operator, that controls the continuous-time plant in matlab/simulink. fig. 9 shows the step response of the overall system, where the fopid controller using the delta operator is designed in matlab/simulink.

fig. 8 digital realization of fopid controller designed in the delta domain ($K_p$ = 0.25)

fig. 9 step response of the overall system with fopid controller designed in delta domain ($K_p$ = 0.25)

fig. 10 hardware implementation of the plant of first order with time delay

5. result analysis

in this work, delta operator parameterization is used to design the discrete fopid controller, which is realized by the delta direct form ii structure. the plant, a first-order system with time delay, is realized in real time. the designed delta fopid controller is implemented using the ds1202 dspace board, and the unit step responses of the overall system for different values of the dc gain $K_p$ are shown in fig. 12 to fig. 17.

fig. 12 step response characteristics of the overall system with delta fopid controller in dspace ($K_p$ = 0.25, maximum overshoot percentage mp (%) = 1.4 and ts (ms) = 1.3)

fig. 13 step response characteristics of the overall system with delta fopid controller in dspace ($K_p$ = 0.5, maximum overshoot percentage mp (%) = 9.28 and ts (ms) = 1.5)

fig. 14 step response characteristics of the overall system with delta fopid controller in dspace ($K_p$ = 1, maximum overshoot percentage mp (%) = 14.53 and ts (ms) = 1.6)

fig. 15 step response characteristics of the overall system with delta fopid controller in dspace ($K_p$ = 2, maximum overshoot percentage mp (%) = 13.59 and ts (ms) = 1.2)

fig. 16 step response characteristics of the overall system with delta fopid controller in dspace ($K_p$ = 4, maximum overshoot percentage mp (%) = 7.15 and ts (ms) = 1.14)

fig. 17 step response characteristics of the overall system with delta fopid controller in dspace ($K_p$ = 8, maximum overshoot percentage mp (%) = 2.309 and ts (ms) = 0.96)

5.1. robustness analysis for the proposed controller

to study the robustness of the developed delta domain foc, the dc gain ($K_p$) is varied and the responses of the closed loop system are measured. for each value of the dc gain, the peak percentage overshoot and the settling time are measured; neither varies considerably with the dc gain. the iso-damping property of the fractional-order system is thus satisfied by the discrete foc designed in the delta domain. a comparative analysis of the time domain parameters for variation of the dc gain ($K_p$) is summarized in table 3.

the plots in fig. 12 to fig. 17 show that the closed loop system with the delta fopid controller realized using dspace is robust against process gain (k) variations and exhibits the iso-damping property.

5.2 sensitivity analysis of the system

a perturbation (± 20 % pu) is applied to the closed loop system containing the fractional order plant and the developed delta domain fopid controller in dspace, and the steady state response is noted. the output of the closed loop system for a random variation of the step input is shown in fig. 18. from fig. 18, it is clear that the steady state error becomes zero even though a sizeable perturbation is applied at the input side. this shows that the system is robust and tracks variations in the input.

fig. 18 steady state error of the closed loop system for a random perturbation

the foc designed in the continuous and in the discrete delta domain must be stable. the pole-zero plots of the designed controller in both domains are shown in fig. 19 and fig. 20, which confirm the stability of the realized controllers.

fig. 19 pole-zero plot of discrete delta (γ) fopid controller

fig. 20 pole-zero plot of continuous time fopid controller

table 3 comparative analysis of the time domain parameters for variation of the dc gain '$K_p$'

| dc gain | parameter | s-domain realization | analog realization [33] | delta domain realization |
|---|---|---|---|---|
| $K_p$ = 0.25 | %mp | 11.2 | 4.11 | 1.4 |
| | ts (ms) | 0.86 | 0.54 | 1.1 |
| $K_p$ = 0.5 | %mp | 12.9 | 10.9 | 9.28 |
| | ts (ms) | 0.52 | 0.32 | 0.95 |
| $K_p$ = 1 | %mp | 14.23 | 14 | 14.53 |
| | ts (ms) | 0.29 | 0.2 | 0.74 |
| $K_p$ = 2 | %mp | 11.29 | 12.3 | 12.59 |
| | ts (ms) | 0.17 | 0.11 | 1.1 |
| $K_p$ = 4 | %mp | 7.3 | 7.9 | 7.1 |
| | ts (ms) | 0.07 | 0.052 | 1.14 |
| $K_p$ = 8 | %mp | 8.1 | 5.8 | 2.3 |
| | ts (ms) | 0.021 | 0.017 | 0.96 |

6. conclusion

in this paper, the design and implementation of a fractional order controller in the delta domain is presented. one of the essential properties of a fractional-order system is the iso-damping property. the fractional-order pid controller is designed in the delta domain from the corresponding continuous-time fopid controller transfer function by using the direct discretization method, and the delta fopid controller is then realized using the delta direct form-ii filter structure. the ds1202 dspace board is used in this work to implement the controller through the matlab/simulink and control desk interface of the dspace board. this approach is free of the ill-conditioning that is inherent in the shift operator parameterization. in this work, the sampling time (Δ = 0.001 sec) is chosen very close to zero to obtain a discrete-time system with a very high sampling rate. the fopid controller designed in the delta domain gives response characteristics very close to those obtained from the analog realization of the fopid controller, which is designed in the s-domain. when the dc gain "$K_p$" is varied over a specified range, the response characteristics of the overall system remain almost unaltered, meaning that the iso-damping property is satisfied.
from table 3, it is evident that the time response parameters obtained with the three methods of designing the fopid controller are very close to each other. the stability of the realized system is also verified through the pole and zero locations of the developed delta domain controller. the system response remains stable under a perturbation in the step input, as demonstrated in fig. 18. the results obtained using the delta-parameterized discrete-time system resemble those obtained with the continuous-time system at a fast sampling rate, which makes the design a unified one and a viable alternative for discrete fractional order controller design and implementation.

references

[1] i. podlubny, fractional differential equations, elsevier, 1998.
[2] m. nakagawa and k. sorimachi, "basic characteristics of a fractance device", ieice trans. fundamentals electron., commun. comput. sci., vol. 75, pp. 1814-1819, dec. 1992.
[3] a. oustaloup, la dérivation non entière, hermes science publication, 1995.
[4] r. caponetto, g. dongola, l. fortuna and i. petráš, fractional order systems: modeling and control applications, world scientific, 2010.
[5] k. b. oldham and j. spanier, the fractional calculus: theory and applications of differentiation and integration to arbitrary order, elsevier science, 1974.
[6] i. podlubny, "fractional-order systems and piλdμ-controllers", ieee trans. automatic contr., vol. 44, no. 1, pp. 208-214, jan. 1999.
[7] k. s. miller and b. ross, an introduction to the fractional calculus and fractional differential equations, john wiley & sons, july 1993.
[8] y. q. chen, i. petráš and d. xue, "fractional order control - a tutorial", in proceedings of the 2009 american control conference, pp. 1397-1411, june 2009.
[9] h. h. sun, b. onaral and y. y. tso, "application of the positive reality principle to metal electrode linear polarization phenomena", ieee trans. biomed. eng., vol. bme-31, pp. 664-674, oct. 1984.
[10] h. h. sun, a. a. abdelwahab and b. onaral, "linear approximation of transfer function with a pole of fractional power", ieee trans. automat. contr., vol. 29, pp. 441-444, may 1984.
[11] s. b. skaar, a. n. michel and r. a. miller, "stability of viscoelastic control systems", in proceedings of the 26th ieee conference on decision and control, vol. 26, pp. 1582-1587, july 1987.
[12] n. engheta, "fractional calculus and fractional paradigm in electromagnetic theory", in proceedings of the international conference on mathematical methods in electromagnetic theory (mmet 98) (cat. no.98ex114), vol. 1, pp. 43-49, june 1998.
[13] j. swarnakar, p. sarkar and l. j. singh, "a unified direct approach for discretizing fractional-order differentiator in delta-domain", int. j. model. simul. sci. comput., vol. 9, pp. 1850055:1-1850055:20, aug. 2018.
[14] j. a. t. machado, "analysis and design of fractional-order digital control systems", syst. anal. modelling simulation, vol. 27, pp. 107-122, 1997.
[15] r. h. middleton and g. c. goodwin, digital control and estimation: a unified approach, englewood cliffs, nj, prentice hall, 1990.
[16] a. khodabakhshian, v. j. gosbell and f. coowar, "discretization of power system transfer functions", ieee trans. power syst., vol. 9, no. 1, pp. 255-261, feb. 1994.
[17] g. c. goodwin, r. h. middleton and h. v. poor, "high-speed digital signal processing and control", in proceedings of the ieee, vol. 80, no. 2, pp. 240-259, feb. 1992.
[18] j. cortés-romero, a. luviano-juárez and h. j. sira-ramírez, "a delta operator approach for the discrete-time active disturbance rejection control on induction motors", math. probl. eng., vol. 2013, pp. 1-9, nov. 2013.
[19] s. ganguli, g. kaur and p. sarkar, "identification in the delta domain: a unified approach via gwocfa", soft. comput., vol. 24, no. 3, pp. 4791-4808, april 2020.
[20] y. zhao and d. zhang, "h∞ fault detection for uncertain delta operator systems with packet dropout and limited communication", in proceedings of the american control conference, 2017, pp. 4772-4777.
[21] j. gao, s. chai, m. shuai, b. zhang and l. cui, "detecting false data injection attack on cyber-physical system based on delta operator", in proceedings of the chinese control conference (ccc), 2018, pp. 5961-5966.
[22] j. swarnakar, p. sarkar and l. j. singh, "direct discretization method for realizing a class of fractional order system in delta domain – a unified approach", automatic control comput. sci., vol. 53, no. 2, pp. 127-139, june 2019.
[23] s. dolai, a. mondal and p. sarkar, "a new approach for direct discretization of fractional order operator in delta domain", fu: elec. energ., vol. 35, no. 3, pp. 313-331, sept. 2022.
[24] g. maione, "high-speed digital realizations of fractional operators in the delta domain", ieee trans. automat. contr., vol. 56, no. 3, pp. 697-702, march 2011.
[25] r. herrmann, fractional calculus: an introduction for physicists, singapore, world scientific publishing, 2011.
[26] j. zhong and l. li, "tuning fractional-order piλdμ controllers for a solid-core magnetic bearing system", ieee trans. control syst. technol., vol. 23, pp. 1648-1656, july 2015.
[27] c. a. monje, y. q. chen, b. m. vinagre, d. xue and v. feliu, fractional-order systems and control: fundamentals and applications, springer-verlag, london, 2010.
[28] b. saidi, m. amairi, s. najar and m. aoun, "bode shaping-based design methods of a fractional order pid controller for uncertain systems", nonlinear dyn., vol. 80, pp. 1817-1838, sept. 2015.
[29] r. duma, p. dobra and m. trusca, "embedded application of fractional order control", electron. lett., vol. 48, pp. 1526-1528, nov. 2012.
[30] t. n. l. vu and m. lee, "analytical design of fractional-order proportional-integral controllers for time-delay processes", isa trans., vol. 52, no. 5, pp. 583-591, sept. 2013.
[31] i. podlubny, i. petráš, b. m. vinagre, et al., "analogue realizations of fractional-order controllers", nonlinear dyn., vol. 29, pp. 281-296, july 2002.
[32] j. petrzela, r. sotner and m. guzan, "implementation of constant phase elements using low-q band-pass and band-reject filtering sections", in proceedings of the international conference on applied electronics (ae), pilsen, czech republic, 2016, pp. 205-210.
[33] c. muñiz-montero, l. v. garcía-jiménez, l. a. sánchez-gaspariano, c. sánchez-lópez, v. r. gonzález-díaz and e. tlelo-cuautle, "new alternatives for analog implementation of fractional-order integrators, differentiators and pid controllers based on integer-order integrators", nonlinear dyn., vol. 90, pp. 241-256, oct. 2017.
[34] b. m. vinagre, i. podlubny, a. hernandez and v. feliu, "some approximations of fractional order operators used in control theory and applications", j. fract. calc. appl. anal., pp. 231-248, jan. 2000.
[35] s. khubalkar, a. junghare, m. aware and s. das, "unique fractional calculus engineering laboratory for learning and research", int. j. electr. eng. education, vol. 57, no. 1, pp. 3-23, jan. 2020.
[36] m. s. monir, w. s. sayed, a. h. madian, a. g. radwan and l. a. said, "a unified fpga realization for fractional-order integrator and differentiator", electronics, vol. 11, no. 13, p. 2052, june 2022.
[37] k. s. khattri, "new close form approximations of ln (1 + x)", teaching of math., vol. 12, no. 1, pp. 7-14, dec. 2009.
[38] w. rui, s. qiuye, z. pinjia, g. yonghao, q. dehao and w. peng, "reduced-order transfer function model of the droop-controlled inverter via jordan continued-fraction expansion", ieee trans. energy conver., vol. 35, pp. 1585-1595, march 2020.
[39] y. chen, b. m. vinagre and i. podlubny, "continued fraction expansion approaches to discretizing fractional order derivatives - an expository review", nonlinear dyn., vol. 38, no. 1, pp. 155-170, dec. 2004.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 351-362 doi: 10.2298/fuee1703351s

nikola stojanović1, negovan stamenković2

received june 14, 2016; received in revised form november 18, 2016
corresponding author: nikola stojanović, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: nikola.stojanovic@elfak.ni.ac.rs)

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2

1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described, which exhibit $f_T$ and $f_{max}$ of 51 ghz and 61 ghz and an $f_T BV_{CEO}$ product of 173 ghzv, placing them among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence than standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase $BV_{CEO}$ without sacrificing $f_T$ and $f_{max}$. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology.
the maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, sub-10 ghz applications can be processed by using high-volume silicon technologies. it has been identified that the optimum solution might

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

lowpass filters approximation based on the jacobi polynomials

1university of niš, faculty of electronic engineering, serbia
2university of priština, faculty of natural science and mathematics, serbia

abstract. a case study related to the design of an analog lowpass filter using a set of orthogonal jacobi polynomials, having four parameters to vary, is considered. the jacobi polynomial has been modified in order to be used as a filter approximating function. the obtained magnitude response is more general than the response of the classical ultraspherical filter, due to one additional parameter available in orthogonal jacobi polynomials. this additional parameter may be used to obtain a magnitude response having either smaller passband ripple, smaller group delay variation or sharper cutoff slope. two methods for transfer function approximation are investigated: the first method is based on the known shifted jacobi polynomial, and the second method is based on the proposed modification of jacobi polynomials. the shifted jacobi polynomials are suitable only for odd degree transfer functions. the proposed modified jacobi polynomial filter function, however, is more general but not orthogonal. it becomes an orthogonal polynomial when the orders are equal, and then includes the chebyshev filter of the first kind, the chebyshev filter of the second kind, the legendre filter, the gegenbauer (ultraspherical) filter and many other filters as its special cases.

key words: filters, analog circuits, approximation, filter characteristic function, jacobi polynomial, orthogonal polynomials.

1 introduction

the classical orthogonal polynomials of jacobi, laguerre and hermite [1], and their special cases, i.e. gegenbauer, chebyshev and legendre, are widely used in communication theory and particularly in the synthesis of transfer functions of electric filters. the coefficients of the bessel-thomson filters, which provide maximal flatness of the group delay response in the passband without any ripple, are related to the bessel polynomials [2]. however, the bessel type polynomials are not orthogonal on an interval of the x-axis, but in certain cases are orthogonal on a unit circle.

manuscript received on june 9, 2016. corresponding author: nikola stojanović, university of niš, faculty of electronic engineering, a. medvedeva 14, 1800 niš, serbia (e-mail: nikola.stojanovic@elfak.ni.ac.rs).

apart from chebyshev polynomials, which are of utmost importance in the synthesis of filters exhibiting a sharp increase in attenuation as the frequency increases above the corner frequency, other classes of the above-mentioned orthogonal polynomials have found many useful applications in the synthesis of electrical filters. in particular, the approximation problem in the synthesis of electrical filters consists of finding a physically realizable function of frequency that meets a prescribed set of specifications with regard to its magnitude and/or group delay characteristics.
it is known that, for a given filter degree, there is always a trade-off between the magnitude and group delay characteristics. considering the whole frequency band, a better group delay characteristic is generally associated with a better time domain characteristic [3], which in turn leads to a smaller time delay or smaller overshoot in the step response. there are approximations that have a very good magnitude characteristic to the detriment of their group delay characteristic, for example butterworth [4], chebyshev [5], [6], bernstein [7], legendre [8] [9] [10] and their derivatives by ku and drubin [11]. the converse case occurs with other approximations, for example bessel [12], gauss [13], hermite [11] and least-squares monotonic [14] [15]; all of these filters present optimized characteristics at specific points.

transitional filters are alternative filter solutions that perform a trade-off between the magnitude and group delay characteristics. transitional butterworth-chebyshev [16] filters have magnitude characteristics that vary gradually from those of the butterworth filter to those of the chebyshev filter as the number of pass-band ripples (or the degree of flatness at the origin) is varied. three degrees of freedom are available for transitional butterworth-chebyshev filters: the degree n, the ripple factor ε and the degree of flatness at the origin. the smooth transition is accomplished using the method proposed by peless and murakami [17], in which each pole of the transitional butterworth-thomson filter is found as an interpolation between a pole of the butterworth filter and a corresponding pole of the thomson filter. a special class of filter functions of odd order providing a monotonic magnitude characteristic of the resulting filter was first investigated by papoulis [18] by means of legendre polynomials.
subsequently these results have been extended so as to include filters of even degree [19], [20], and some other functions leading to the same class of filtering networks, whose magnitude response is bounded to be monotonic, have been derived using a different approach based on the application of jacobi polynomials [21].

in this paper, the concept of magnitude response synthesis is extended to orthogonal jacobi lowpass filters. a simple modification of the orthogonal jacobi polynomial, suitable for continuous-time lowpass filter design, is proposed. if the degree of the filter is given, both indexes (orders) of the jacobi polynomial can be used for smoothly adjusting the filter performance. the magnitude response obtained is more general than the continuous-time response of the chebyshev filter because of two additional parameters available with the modified jacobi polynomials. it should be noted that the proposed jacobi approximation covers many of the above-mentioned all-pole filter functions.

2 filter magnitude function

in lowpass filter design, assuming all the zeros of the system function are at infinity, the squared magnitude function (insertion loss) can be written as

$$|H_n(j\omega)|^2 = \frac{1}{1 + \varepsilon^2 \phi_n^2(\omega)} \tag{1}$$

where $\omega$ is the frequency variable, $\varepsilon$ is a parameter that controls the passband attenuation tolerance, $n$ denotes the degree of the filter, and the polynomial $\phi_n(\omega)$ is the characteristic (or approximating) function of the filter, which is to be selected to give the desired magnitude characteristic. the characteristic function is normalized to unity at the pass-band edge frequency $\omega_p$, which is itself normalized to $\omega_p = 1$, so that $\phi_n(1) = 1$.
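the role of the characteristic function in (1) can be illustrated numerically. in the python sketch below, the chebyshev polynomial of the first kind is used as a stand-in for $\phi_n(\omega)$ because it satisfies the normalization $\phi_n(1) = 1$; this choice, the function names `chebyshev_t`, `squared_magnitude` and `passband_edge_attenuation_db`, and the sample values of $n$ and $\varepsilon$ are assumptions made for the example and do not represent the modified jacobi polynomial proposed in this paper.

```python
import math


def chebyshev_t(n, w):
    """Chebyshev polynomial of the first kind T_n(w), via its three-term recurrence."""
    if n == 0:
        return 1.0
    t_prev, t = 1.0, w
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * w * t - t_prev
    return t


def squared_magnitude(w, n, eps, phi=chebyshev_t):
    """Squared magnitude (1): 1 / (1 + eps^2 * phi_n(w)^2)."""
    return 1.0 / (1.0 + (eps * phi(n, w)) ** 2)


def passband_edge_attenuation_db(eps):
    """Attenuation at w = 1, where phi_n(1) = 1: 10 * log10(1 + eps^2)."""
    return 10.0 * math.log10(1.0 + eps * eps)


if __name__ == "__main__":
    n, eps = 5, 0.5  # hypothetical degree and ripple parameter
    for w in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(w, squared_magnitude(w, n, eps))
    print(passband_edge_attenuation_db(eps))
```

because of the normalization $\phi_n(1) = 1$, the attenuation at the passband edge depends only on $\varepsilon$, whatever characteristic function is passed as `phi`.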
This conventional procedure for filter design using the insertion loss method includes the design of a lumped-element LC ladder lowpass filter known as the lowpass prototype. A more modern procedure uses this network synthesis technique to design filters with a completely specified frequency response. The design is simplified by beginning with lowpass filter prototypes that are normalized in terms of impedance and frequency. Transformations are then applied to convert the prototype designs to the desired frequency range and impedance level. In filter design, the characteristic frequency used for frequency normalization is the cutoff frequency, also known as the passband corner frequency, so the normalized cutoff frequency is equal to 1.

For this application, the function $\phi_n^2(\omega)$ is required to be an even polynomial, $\psi_n(\omega^2) = \phi_n^2(\omega)$. If $\phi_n(x)$ is even or odd, then $\phi_n^2(x)$ is always even, as required. Polynomials $\phi_n(x)$ that are neither even nor odd may also be used in magnitude functions if $\phi_n(x)$ is replaced by $\phi_n(x^2)$. In either case, no terms of the form $x^{2k+1}$ may appear in $\phi_n^2(x)$. The Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$ have $n$ distinct zeros, but for $\alpha \neq \beta$ they are neither even nor odd. Polynomials of this type are not suitable as a filter characteristic function. However, Jacobi orthogonal polynomials can be adapted for use in lowpass filter magnitude functions, as will be shown in the next section.

3 Jacobi polynomial

The Jacobi polynomials [22] of degree $n$, denoted by $P_n^{(\alpha,\beta)}(x)$, are orthogonal on the interval $[-1,1]$ with respect to the Jacobi weight function $w^{(\alpha,\beta)}(x) = (1-x)^{\alpha}(1+x)^{\beta}$ when $\alpha,\beta > -1$. We shall refer to $\alpha$ and $\beta$ as the orders of the Jacobi polynomial. Namely,

$$\int_{-1}^{1} P_m^{(\alpha,\beta)}(x)\,P_n^{(\alpha,\beta)}(x)\,w^{(\alpha,\beta)}(x)\,dx = h_n^{(\alpha,\beta)}\,\delta_{n,m} \tag{2}$$

where

$$h_n^{(\alpha,\beta)} = \frac{2^{\alpha+\beta+1}}{2n+\alpha+\beta+1}\,\frac{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)} \tag{3}$$

$\delta_{n,m}$ is the Kronecker delta symbol and $\Gamma(\cdot)$ is the well-known gamma function.
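As a quick consistency check of (3), consider the Legendre case $\alpha = \beta = 0$, for which the weight function is $w(x) = 1$ and $P_1^{(0,0)}(x) = x$. The gamma functions cancel and (3) reduces to the familiar Legendre norm:

$$h_n^{(0,0)} = \frac{2}{2n+1}\,\frac{\Gamma(n+1)\,\Gamma(n+1)}{\Gamma(n+1)\,\Gamma(n+1)} = \frac{2}{2n+1}, \qquad \text{e.g.}\quad \int_{-1}^{1}\bigl[P_1^{(0,0)}(x)\bigr]^2 dx = \int_{-1}^{1} x^2\,dx = \frac{2}{3}.$$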
The Jacobi polynomials are generated by the three-term recurrence relation:

$$P_0^{(\alpha,\beta)}(x) = 1, \qquad P_1^{(\alpha,\beta)}(x) = \tfrac{1}{2}(\alpha+\beta+2)\,x + \tfrac{1}{2}(\alpha-\beta),$$
$$P_{n+1}^{(\alpha,\beta)}(x) = \bigl(A_n^{(\alpha,\beta)}x - B_n^{(\alpha,\beta)}\bigr)P_n^{(\alpha,\beta)}(x) - C_n^{(\alpha,\beta)}P_{n-1}^{(\alpha,\beta)}(x), \qquad n \ge 1 \tag{4}$$

where

$$A_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+1)(2n+\alpha+\beta+2)}{2(n+1)(n+\alpha+\beta+1)}$$
$$B_n^{(\alpha,\beta)} = \frac{(\beta^2-\alpha^2)(2n+\alpha+\beta+1)}{2(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)}$$
$$C_n^{(\alpha,\beta)} = \frac{(n+\alpha)(n+\beta)(2n+\alpha+\beta+2)}{(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)}$$

MATLAB is an inexpensive, easy-to-use and widely available commercial software package that is in widespread use in both academia and industry [23]. A MATLAB function for evaluating Jacobi polynomials using the above procedure is given in jacobipoly.m. In addition to Jacobi polynomials, the program also evaluates Gegenbauer and Legendre polynomials.

jacobipoly.m

    function P = jacobipoly(n, a, b)
    % Coefficients P of the Jacobi polynomial of degree n and orders (a, b).
    % They are stored in descending order of powers.
    %   jacobipoly(n)     uses a = b = 0 (Legendre polynomial)
    %   jacobipoly(n, a)  uses b = a (ultraspherical case, up to normalization)
    if nargin == 1, a = 0; b = 0; elseif nargin == 2, b = a; end
    P0 = 1;
    P1 = [(a+b)/2+1, (a-b)/2];
    if n == 0
        P = P0;
    elseif n == 1
        P = P1;
    else
        for k = 2:n
            % Recurrence coefficients of (4), written for the step k-1 -> k.
            % Upper-case names keep them distinct from the orders a and b.
            d = 2*k*(k+a+b)*(2*k-2+a+b);
            A = (2*k+a+b-1)*(2*k+a+b-2)*(2*k+a+b)/d;
            B = (2*k+a+b-1)*(a^2-b^2)/d;
            C = 2*(k-1+a)*(k-1+b)*(2*k+a+b)/d;
            P = conv([A B], P1) - C*[0, 0, P0];
            P0 = P1; P1 = P;
        end
    end
    end

Some properties of the Jacobi polynomials that are needed here are as follows:

$$P_n^{(\alpha,\beta)}(1) = \frac{\Gamma(n+\alpha+1)}{\Gamma(n+1)\,\Gamma(\alpha+1)} \tag{5}$$

and

$$P_n^{(\alpha,\beta)}(-1) = \frac{(-1)^n\,\Gamma(n+\beta+1)}{\Gamma(n+1)\,\Gamma(\beta+1)} \tag{6}$$

Jacobi polynomials have the symmetry

$$P_n^{(\alpha,\beta)}(-x) = (-1)^n P_n^{(\beta,\alpha)}(x) \tag{7}$$

The following important derivative relation holds:

$$\frac{d}{dx}P_n^{(\alpha,\beta)}(x) = \frac{1}{2}(n+\alpha+\beta+1)\,P_{n-1}^{(\alpha+1,\beta+1)}(x) \tag{8}$$
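The end-point values (5) and (6) give a convenient sanity check on the recurrence (4). The sketch below repeats the recurrence inline, so that it runs as a standalone script, and compares the computed polynomial with the gamma-function expressions. The orders $\alpha = -0.5$, $\beta = 0.5$, the degree $n = 4$ and all variable names are illustrative assumptions.

```matlab
% Check of the end-point values (5) and (6); values are illustrative.
al = -0.5; be = 0.5; n = 4;               % orders and degree (n >= 2 here)
P0 = 1;                                   % P_0(x) = 1
P1 = [(al+be)/2+1, (al-be)/2];            % P_1(x), descending powers
for k = 2:n
    s = 2*k + al + be;
    d = 2*k*(k+al+be)*(s-2);
    A = (s-1)*(s-2)*s/d;                  % coefficient of x
    B = (s-1)*(al^2-be^2)/d;              % constant term
    C = 2*(k-1+al)*(k-1+be)*s/d;          % weight of P_{k-2}
    P = conv([A B], P1) - C*[0, 0, P0];
    P0 = P1; P1 = P;                      % after the loop, P1 holds P_n
end
lhs = [polyval(P1, 1), polyval(P1, -1)];  % values from the recurrence
rhs = [gamma(n+al+1)/(gamma(n+1)*gamma(al+1)), ...
       (-1)^n*gamma(n+be+1)/(gamma(n+1)*gamma(be+1))];   % eqs. (5), (6)
disp([lhs; rhs]);
```

The two rows printed by the script should agree to within round-off.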
3.1 Shifted Jacobi polynomials

In order to use Jacobi polynomials on the interval $x \in [0,1]$, we define the so-called shifted Jacobi polynomials by introducing the change of variable $x \mapsto 2x-1$. Let the shifted Jacobi polynomials $P_n^{(\alpha,\beta)}(2x-1)$ be denoted by $J_n^{(\alpha,\beta)}(x)$. The shifted Jacobi polynomials are orthogonal with respect to the weight function $w_s^{(\alpha,\beta)}(x) = (1-x)^{\alpha}x^{\beta}$ on the interval $[0,1]$, with the orthogonality property:

$$\int_0^1 w_s^{(\alpha,\beta)}(x)\,J_m^{(\alpha,\beta)}(x)\,J_n^{(\alpha,\beta)}(x)\,dx = \frac{1}{2n+\alpha+\beta+1}\,\frac{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}\,\delta_{n,m} \tag{9}$$
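As a simple check of the change of variable, the first-degree shifted polynomial follows directly from $P_1^{(\alpha,\beta)}(x)$ in (4):

$$J_1^{(\alpha,\beta)}(x) = P_1^{(\alpha,\beta)}(2x-1) = \tfrac{1}{2}(\alpha+\beta+2)(2x-1) + \tfrac{1}{2}(\alpha-\beta) = (\alpha+\beta+2)\,x - (\beta+1).$$

At the end points this gives $J_1^{(\alpha,\beta)}(0) = -(\beta+1) = P_1^{(\alpha,\beta)}(-1)$ and $J_1^{(\alpha,\beta)}(1) = \alpha+1 = P_1^{(\alpha,\beta)}(1)$, in agreement with (6) and (5).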
The shifted Jacobi polynomials are generated from the three-term recurrence relation [24]:

$$J_0^{(\alpha,\beta)}(x) = 1, \qquad J_1^{(\alpha,\beta)}(x) = (\alpha+\beta+2)\,x - (\beta+1),$$
$$J_{n+1}^{(\alpha,\beta)}(x) = \bigl(A_n^{(\alpha,\beta)}x - B_n^{(\alpha,\beta)}\bigr)J_n^{(\alpha,\beta)}(x) - C_n^{(\alpha,\beta)}J_{n-1}^{(\alpha,\beta)}(x), \qquad n \ge 1 \tag{10}$$

where the recursion coefficients for the shifted polynomials are

$$A_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+1)(2n+\alpha+\beta+2)}{(n+1)(n+\alpha+\beta+1)}$$
$$B_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+1)\bigl(2n^2+(1+\beta)(\alpha+\beta)+2n(\alpha+\beta+1)\bigr)}{(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)}$$
$$C_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+2)(n+\alpha)(n+\beta)}{(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)} \tag{11}$$

The shifted Jacobi polynomial $J_n^{(\alpha,\beta)}(x)$ can be obtained in standard polynomial form as

$$J_n^{(\alpha,\beta)}(x) = \sum_{i=0}^{n} (-1)^{n-i}\,\frac{\Gamma(n+\alpha+\beta+i+1)}{\Gamma(i+1)\,\Gamma(n+\alpha+\beta+1)}\,\frac{\Gamma(n+\beta+1)}{\Gamma(n-i+1)\,\Gamma(\beta+i+1)}\,x^i \tag{12}$$

Suppose the shifted Jacobi polynomials are to be normalized so that $\phi_n(1) = 1$. According to (12), the normalization constant is $K_n^{(\alpha,\beta)} = \sum_{i=0}^{n} a_i^{(n)}$, where $a_i^{(n)}$ are the corresponding polynomial coefficients.
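The closed form (12) and the normalization constant can be evaluated in a few lines. The following MATLAB sketch does this for one case; the orders $\alpha = -0.5$, $\beta = 0.5$, the degree $m = 2$ and the variable names are assumptions chosen for illustration.

```matlab
% Sketch of eq. (12); orders and degree are illustrative assumptions.
al = -0.5; be = 0.5; m = 2;
idx = 0:m;                                  % powers of x, ascending
a = (-1).^(m-idx) ...
    .* gamma(m+al+be+idx+1) ./ (gamma(idx+1) .* gamma(m+al+be+1)) ...
    .* gamma(m+be+1) ./ (gamma(m-idx+1) .* gamma(be+idx+1));
K = sum(a);                                 % normalization constant, equal to J_m(1)
J = fliplr(a);                              % descending powers, as polyval expects
disp(a); disp(K); disp(polyval(J, 1)/K);    % the last value is 1 by construction
```

For these values the coefficients are 1.875, -7.5 and 6, that is $J_2^{(-0.5,0.5)}(x) = 6x^2 - 7.5x + 1.875$, and $K_2^{(-0.5,0.5)} = 0.375$.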
As an example, Fig. 1 shows the characteristic functions based on the shifted Jacobi polynomials for $n = 1, 2, \dots, 5$ in the form $\phi_n(x) = x^{\nu}J_m^{(\alpha,\beta)}(x^2)/K_m^{(\alpha,\beta)}$, where $m = \lfloor n/2 \rfloor$, so that $n = 2m + \nu$. The floor function $\lfloor n/2 \rfloor$ rounds $n/2$ to the nearest integer towards zero, and $\nu = 0$ for $n$ even and $\nu = 1$ for $n$ odd.

[Figure 1: plot of $\phi_n(x) = x^{\nu}J_m^{(\alpha,\beta)}(x^2)/K_m^{(\alpha,\beta)}$ versus $x$; panel title "Characteristic function, shifted Jacobi, α = −0.5, β = 0.5".]

Fig. 1. The normalized shifted Jacobi polynomials $\phi_n(x) = x^{\nu}J_m^{(\alpha,\beta)}(x^2)/K_m^{(\alpha,\beta)}$, with $\nu = 0$ for $n$ even and $\nu = 1$ for $n$ odd, used in place of the characteristic function; $\alpha = -0.5$, $\beta = 0.5$, $n = 2m + \nu$, $m = 0, 1$ and $2$.

As shown in Fig. 1, a hump at $x = 0$ occurs when the filter degree is even.
Using (6), the size of the hump for an even degree $n = 2m$ can be obtained as

$$\phi_{2m}(0) = \frac{1}{K_m^{(\alpha,\beta)}}\,P_m^{(\alpha,\beta)}(-1) = \frac{1}{K_m^{(\alpha,\beta)}}\,\frac{(-1)^m\,\Gamma(m+\beta+1)}{\Gamma(m+1)\,\Gamma(\beta+1)} \tag{13}$$

because $J_m^{(\alpha,\beta)}(0) = P_m^{(\alpha,\beta)}(-1)$. One can easily show that the size of the hump increases with the degree of the filter. For example, for $n = 4$ ($m = 2$ and $\nu = 0$), (13) gives $P_2^{(-0.5,0.5)}(-1) = 1.875$ and (12) gives $K_2^{(-0.5,0.5)} = 0.3750$, so the value of the hump is $\phi_4(0) = 5$. For $n = 6$ ($m = 3$ and $\nu = 0$), $P_3^{(-0.5,0.5)}(-1) = -2.1875$ and $K_3^{(-0.5,0.5)} = 0.3125$, so $\phi_6(0) = -7$. Thus, the shifted Jacobi polynomial is not suitable as the characteristic function of an even-degree filter.

Another definition of monic shifted Jacobi polynomials, $G_n(p,q,x)$, is given in [22, Chapter 22]. These polynomials are also orthogonal on the interval $[0,1]$, with respect to the weight function $w(x) = (1-x)^{p-q}x^{q-1}$ (with $q > 0$ and $p > q-1$), and have been used to construct the magnitude of the filter transfer function [25], [26], [27]. They are related to the Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$ as [28]

$$G_n(p,q,x) = \frac{\Gamma(n+1)\,\Gamma(n+p)}{\Gamma(2n+p)}\,P_n^{(p-q,\,q-1)}(2x-1) \tag{14}$$

It can be concluded that the shifted Jacobi polynomials $J_n^{(\alpha,\beta)}(x)$ have $n$ distinct positive real zeros in the interval $(0,1)$, but they are neither even nor odd, so they cannot be used directly as the characteristic function in (1). However, $[xJ_n^{(\alpha,\beta)}(x^2)]^2$ or $[xG_n(p,q,x^2)]^2$ could be used in (1) in place of the squared characteristic function $\phi_n^2(\omega)$.

3.2 Modified Jacobi polynomials

We propose the following modified Jacobi polynomials, based on the sum of two Jacobi orthogonal polynomials of the same degree $n$:

$$\mathcal{J}_n^{(\alpha,\beta)}(x) = P_n^{(\alpha,\beta)}(x) + P_n^{(\beta,\alpha)}(x) \tag{15}$$

where $P_n^{(\alpha,\beta)}(x)$ is the classical Jacobi orthogonal polynomial in $x$ mentioned above. One can easily show that the modified Jacobi polynomial (15) is not an orthogonal polynomial except when $\alpha = \beta$.
Since the Jacobi polynomials $P_n^{(\beta,\alpha)}(x) = (-1)^n P_n^{(\alpha,\beta)}(-x)$ are not orthogonal with respect to the weight function $w^{(\alpha,\beta)}(x)$ over the interval $[-1,1]$, the modified Jacobi polynomials (15) are not orthogonal polynomials, unlike the shifted Jacobi polynomials. However, the modified Jacobi polynomial has degree $n$ and is a purely odd or purely even polynomial in $x$, so a lowpass filter can be realized for all specifications when it is used as the characteristic function.
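The parity of the modified polynomial, written here as $\mathcal{J}_n^{(\alpha,\beta)}(x)$ for the sum in (15), follows in one line from the symmetry relation $P_n^{(\alpha,\beta)}(-x) = (-1)^n P_n^{(\beta,\alpha)}(x)$:

$$\mathcal{J}_n^{(\alpha,\beta)}(-x) = P_n^{(\alpha,\beta)}(-x) + P_n^{(\beta,\alpha)}(-x) = (-1)^n\bigl[P_n^{(\beta,\alpha)}(x) + P_n^{(\alpha,\beta)}(x)\bigr] = (-1)^n\,\mathcal{J}_n^{(\alpha,\beta)}(x).$$

For $n = 1$, for instance, the constant terms $\tfrac{1}{2}(\alpha-\beta)$ and $\tfrac{1}{2}(\beta-\alpha)$ of the two first-degree polynomials cancel, leaving $\mathcal{J}_1^{(\alpha,\beta)}(x) = (\alpha+\beta+2)\,x$.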
Many of the aforementioned polynomials are special cases of the modified Jacobi polynomials. For $\alpha = \beta$, one obtains the ultraspherical polynomials (symmetric Jacobi polynomials) [29]. For $\alpha = \beta = \mp 1/2$, one obtains the Chebyshev polynomials of the first and second kinds. For $\alpha = \beta = 0$, one obtains the Legendre polynomials. For the two important special cases $\alpha = -\beta = \mp 1/2$, the Chebyshev polynomials of the third and fourth kinds are obtained.
Finally, the constants $C_n^{(\alpha,\beta)} = \mathcal{J}_n^{(\alpha,\beta)}(1)$ have to be chosen in such a way that the normalization criterion $\phi_n(1) = 1$ is satisfied, i.e.

$$\phi_n(\omega) = \frac{\mathcal{J}_n^{(\alpha,\beta)}(\omega)}{C_n^{(\alpha,\beta)}} \tag{16}$$

where

$$C_n^{(\alpha,\beta)} = \frac{1}{\Gamma(n+1)}\left[\frac{\Gamma(n+\alpha+1)}{\Gamma(\alpha+1)} + \frac{\Gamma(n+\beta+1)}{\Gamma(\beta+1)}\right] \tag{17}$$

Modified Jacobi polynomials are symmetric with respect to the orders $\alpha$ and $\beta$, i.e. $\mathcal{J}_n^{(\alpha,\beta)}(\omega) = \mathcal{J}_n^{(\beta,\alpha)}(\omega)$. Table 1 contains the modified Jacobi polynomials for $\alpha = -0.5$ and $\beta = 0.5$ up to the tenth degree.

Table 1. The modified Jacobi polynomials $\mathcal{J}_n^{(\alpha,\beta)}(x)$ for $\alpha = -0.5$, $\beta = 0.5$ and $n = 1, \dots, 10$.

 n    $\mathcal{J}_n^{(-0.5,0.5)}(x)$
 1    $2x$
 2    $3x^2 - \frac{3}{4}$
 3    $5x^3 - \frac{5}{2}x$
 4    $\frac{35}{4}x^4 - \frac{105}{16}x^2 + \frac{35}{64}$
 5    $\frac{63}{4}x^5 - \frac{63}{4}x^3 + \frac{189}{64}x$
 6    $\frac{231}{8}x^6 - \frac{1155}{32}x^4 + \frac{693}{64}x^2 - \frac{231}{512}$
 7    $\frac{429}{8}x^7 - \frac{1287}{16}x^5 + \frac{2145}{64}x^3 - \frac{429}{128}x$
 8    $\frac{6435}{64}x^8 - \frac{45045}{256}x^6 + \frac{96525}{1024}x^4 - \frac{32175}{2048}x^2 + \frac{6435}{16384}$
 9    $\frac{12155}{64}x^9 - \frac{12155}{32}x^7 + \frac{255255}{1024}x^5 - \frac{60775}{1024}x^3 + \frac{60775}{16384}x$
 10   $\frac{46189}{128}x^{10} - \frac{415701}{512}x^8 + \frac{323323}{512}x^6 - \frac{1616615}{8192}x^4 + \frac{692835}{32768}x^2 - \frac{46189}{131072}$
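As a worked check of (16) and (17) against Table 1, take $n = 3$ with $\alpha = -0.5$ and $\beta = 0.5$:

$$C_3^{(-0.5,0.5)} = \frac{1}{\Gamma(4)}\left[\frac{\Gamma(3.5)}{\Gamma(0.5)} + \frac{\Gamma(4.5)}{\Gamma(1.5)}\right] = \frac{1.875 + 13.125}{6} = 2.5, \qquad \phi_3(\omega) = \frac{5\omega^3 - \tfrac{5}{2}\omega}{2.5} = 2\omega^3 - \omega,$$

so that $\phi_3(1) = 1$ as required. The same constant is obtained by summing the coefficients of the $n = 3$ row of Table 1, $5 - \tfrac{5}{2} = 2.5$.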
It is important to know where the roots of the modified Jacobi polynomials are located. The fastest way to calculate the zeros of the modified Jacobi polynomials is to use mathematical programs such as MATLAB, Mathematica and Maple. It can be concluded that the modified Jacobi polynomials $\mathcal{J}_n^{(\alpha,\beta)}(x)$ have $n$ simple real zeros in the closed interval $[-1,1]$. For example, the zeros of the modified Jacobi polynomial of degree 8 with $\alpha = -0.5$ and $\beta = 0.5$ are: $\{-0.9396926, -0.7660444, -0.5000000, -0.1736482, 0.1736482, 0.5000000, 0.7660444, 0.9396926\}$. The zeros of $\mathcal{J}_n^{(\alpha,\beta)}(x)$ are located symmetrically about $x = 0$ in the interval $-1 < x < 1$. Note that modified Jacobi polynomials are the only non-orthogonal polynomials which are suitable for the synthesis of the filter function given in a closed form.
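A minimal MATLAB sketch of this computation is given below. It applies the built-in function `roots` to the coefficients of the degree-8 polynomial copied from Table 1. The variable names are illustrative, and `real` is applied only to discard any round-off-level imaginary parts.

```matlab
% Degree-8 row of Table 1 (alpha = -0.5, beta = 0.5), descending powers.
c = [6435/64, 0, -45045/256, 0, 96525/1024, 0, -32175/2048, 0, 6435/16384];
z = sort(real(roots(c))).';                 % zeros as a sorted row vector
disp(z);
```

The printed values can be compared directly with the eight zeros listed above.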
The characteristic functions $\phi_n(x)$ based on the modified Jacobi polynomials $\mathcal{J}_n^{(\alpha,\beta)}(x)$ are illustrated in Fig. 2 for $x$ in $[-1,1]$ and $n = 1, 2, \dots, 5$. They satisfy the following relationships: for $|x| < 1$, the characteristic polynomial oscillates around zero and its ripples are bounded by $\pm 1$ for $\alpha,\beta \ge -0.5$; $\phi_n(0) \neq 0$ for $n$ even and $\phi_n(0) = 0$ for $n$ odd. For $|x| > 1$, the polynomial increases (or decreases) monotonically.

[Figure 2: plot of $\phi_n(x) = [P_n^{(\alpha,\beta)}(x) + P_n^{(\beta,\alpha)}(x)]/C_n^{(\alpha,\beta)}$ versus $x$ for $n = 1, \dots, 5$; panel title "Modified Jacobi polynomials".]

Fig. 2. The normalized modified Jacobi polynomials $\mathcal{J}_n^{(\alpha,\beta)}(\omega)/C_n^{(\alpha,\beta)}$ used in place of the characteristic function $\phi_n(x)$; $\alpha = -0.5$, $\beta = 0.5$, $n = 1, \dots, 5$.

An example is given in Fig. 3, which shows the frequency responses of the ninth-degree modified Jacobi lowpass filter for three choices of the orders $\alpha$ and $\beta$. As mentioned earlier, the Jacobi orthogonal polynomial corresponds to the Chebyshev polynomial if $\alpha = \beta = -0.5$, which gives 3 dB ripples in the passband. In general, passband ripples are undesirable, but a value less than 0.5 dB is acceptable in many applications. If $\alpha = -0.5$ and the order $\beta$ increases, the passband ripples become unequal and decrease smoothly in magnitude. For $\beta > 1.5$ the passband response is nearly flat, but the cutoff slope is much steeper than that of a Butterworth filter.
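The bounded-ripple property and the resulting passband attenuation can be checked numerically. The MATLAB sketch below uses the degree-9 row of Table 1 ($\alpha = -0.5$, $\beta = 0.5$); the value $\varepsilon = 1$, the frequency grid and the variable names are assumptions made for this illustration and are not taken from the figures.

```matlab
% Degree-9 row of Table 1 (alpha = -0.5, beta = 0.5), descending powers.
c    = [12155/64, 0, -12155/32, 0, 255255/1024, 0, -60775/1024, 0, 60775/16384, 0];
C9   = polyval(c, 1);                        % normalization constant, value of the polynomial at 1
epsl = 1;                                    % assumed value of epsilon
w    = linspace(0, 1, 1001);                 % passband grid, 0 <= w <= 1
phi  = polyval(c, w) / C9;                   % normalized characteristic function, eq. (16)
att  = 10*log10(1 + epsl^2 * phi.^2);        % passband attenuation in dB, from eq. (1)
fprintf('max |phi| = %.4f, att(1) = %.4f dB\n', max(abs(phi)), att(end));
```

On this grid the largest value of $|\phi_9|$ occurs at the passband edge, where $\phi_9(1) = 1$ and the attenuation equals $10\log_{10}2 \approx 3.01$ dB for the assumed $\varepsilon = 1$.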
On the other hand, for $-1 < \beta < -0.5$ the passband ripples are unequal and larger than 1 in magnitude, but these values of $\beta$ (and likewise of $\alpha$) have no practical significance. It is shown that the passband ripple can be adjusted to improve the linearity of the group delay response near $\omega = 0$.

[Figure 3: passband attenuation (dB), stopband attenuation (dB) and group delay (s) versus normalized frequency $\omega$ for the modified Jacobi filter, $n = 9$, with $\alpha = -0.5$ and $\beta = 0$, $0.5$ and $1.5$, compared with the Butterworth filter.]

Fig. 3. The frequency responses of the 9th-degree modified Jacobi filters.

In general, the modified Jacobi polynomials may also be used as the filter function in microwave applications. The most widely used filters in microwave applications are bandpass filters [30]. Under the lowpass-to-bandpass frequency transformation of a lumped-element lowpass filter, each series inductor is converted to a series resonator and each parallel capacitor to a parallel resonator. Richards' transformation can be used to emulate the inductive and capacitive behaviour of the lumped circuit elements with distributed elements consisting of transmission line sections, and Kuroda's identities can be used to facilitate the conversion between the various transmission line realizations.

Approximation of the filter magnitude function based on the Christoffel-Darboux formula for classical orthonormal Jacobi polynomials gives excellent results [31], [32]. This method cannot be applied directly to the modified Jacobi filters, because the modified Jacobi polynomials are not orthogonal.
In this case, either the sum of products of the modified Jacobi polynomials has to be generated, or the Christoffel-Darboux formula has to be applied separately to the two orthonormal Jacobi polynomials as:

$$A_{2n}(\omega^2) = \bigl[p_0^{(\alpha,\beta)}(\omega)\bigr]^2 + \bigl[p_1^{(\alpha,\beta)}(\omega)\bigr]^2 + \cdots + \bigl[p_n^{(\alpha,\beta)}(\omega)\bigr]^2 + \bigl[p_0^{(\beta,\alpha)}(\omega)\bigr]^2 + \bigl[p_1^{(\beta,\alpha)}(\omega)\bigr]^2 + \cdots + \bigl[p_n^{(\beta,\alpha)}(\omega)\bigr]^2 \tag{18}$$

where $p_i^{(\alpha,\beta)}(\omega)$, $i = 0, 1, \dots, n$, are orthonormal Jacobi polynomials with respect to the weight function $w^{(\alpha,\beta)}(\omega) = (1-\omega)^{\alpha}(1+\omega)^{\beta}$, and $p_i^{(\beta,\alpha)}(\omega)$, $i = 0, 1, \dots, n$, are orthonormal Jacobi polynomials with respect to the other weight function $w^{(\beta,\alpha)}(\omega) = (1-\omega)^{\beta}(1+\omega)^{\alpha}$.
the orthonormal jacobi polynomials are:

$$p_n^{(\alpha,\beta)}(\omega) = \sqrt{\frac{2n+\alpha+\beta+1}{2^{\alpha+\beta+1}}\,\frac{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}}\;P_n^{(\alpha,\beta)}(\omega) \tag{19}$$

and

$$p_n^{(\beta,\alpha)}(\omega) = \sqrt{\frac{2n+\alpha+\beta+1}{2^{\alpha+\beta+1}}\,\frac{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}}\;P_n^{(\beta,\alpha)}(\omega) \tag{20}$$

where $P_n^{(\alpha,\beta)}(\omega)$ and $P_n^{(\beta,\alpha)}(\omega)$ are the orthogonal jacobi polynomials, which can be evaluated by the proposed matlab program. by using the christoffel-darboux formula, equation (18) reduces to:

$$A_{2n}(\omega^2) = \frac{k_n^{(\alpha,\beta)}}{k_{n+1}^{(\alpha,\beta)}}\left[\frac{dp_{n+1}^{(\alpha,\beta)}}{d\omega}\,p_n^{(\alpha,\beta)} - \frac{dp_n^{(\alpha,\beta)}}{d\omega}\,p_{n+1}^{(\alpha,\beta)}\right] + \frac{k_n^{(\beta,\alpha)}}{k_{n+1}^{(\beta,\alpha)}}\left[\frac{dp_{n+1}^{(\beta,\alpha)}}{d\omega}\,p_n^{(\beta,\alpha)} - \frac{dp_n^{(\beta,\alpha)}}{d\omega}\,p_{n+1}^{(\beta,\alpha)}\right] \tag{21}$$

where $k_n^{(\alpha,\beta)}$ and $k_n^{(\beta,\alpha)}$ are the leading coefficients of the orthonormal jacobi polynomials $p_n^{(\alpha,\beta)}(\omega)$ and $p_n^{(\beta,\alpha)}(\omega)$, respectively.
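as a numerical cross-check of equations (18) to (20), the sketch below evaluates the orthogonal jacobi polynomials by the standard three-term recurrence, scales them with the factor in (19), and forms the sum of squares in (18). it is a minimal python illustration, not the matlab program referred to above; the function names `jacobi`, `orthonormal_jacobi` and `sum_of_squares` are assumptions introduced here, and the code assumes α, β > −1 with α + β ≠ −1 so that every gamma function is finite.

```python
import math


def jacobi(n, a, b, x):
    """Orthogonal Jacobi polynomial P_n^(a,b)(x), standard normalization."""
    p_prev = 1.0                                   # P_0
    if n == 0:
        return p_prev
    p = 0.5 * (a - b) + 0.5 * (a + b + 2.0) * x    # P_1
    for k in range(2, n + 1):                      # three-term recurrence
        c = 2.0 * k + a + b
        num = ((c - 1.0) * (c * (c - 2.0) * x + a * a - b * b) * p
               - 2.0 * (k + a - 1.0) * (k + b - 1.0) * c * p_prev)
        p_prev, p = p, num / (2.0 * k * (k + a + b) * (c - 2.0))
    return p


def orthonormal_jacobi(n, a, b, x):
    """Orthonormal polynomial of eq. (19): scale factor times P_n^(a,b)(x)."""
    scale = ((2.0 * n + a + b + 1.0) / 2.0 ** (a + b + 1.0)
             * math.gamma(n + 1.0) * math.gamma(n + a + b + 1.0)
             / (math.gamma(n + a + 1.0) * math.gamma(n + b + 1.0)))
    return math.sqrt(scale) * jacobi(n, a, b, x)


def sum_of_squares(n, a, b, w):
    """Eq. (18): squares of both orthonormal families, orders 0..n."""
    return sum(orthonormal_jacobi(i, a, b, w) ** 2
               + orthonormal_jacobi(i, b, a, w) ** 2
               for i in range(n + 1))


if __name__ == "__main__":
    # Illustrative parameters only: n = 5, alpha = -0.5, beta = 0.5.
    for w in (0.0, 0.5, 1.0):
        print(w, sum_of_squares(5, -0.5, 0.5, w))
```

swapping α and β mirrors each polynomial about ω = 0, so the sum of the two families is an even function of ω, which is why it can be written as a function of ω². the sketch does not normalize the result at the cutoff frequency.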
the following filter approximating function for n = 5, α = −0.5 and β = 0.5 is given as an example:

$$A_{10}(\omega^2) = 325.9493\,\omega^{10} - 488.9240\,\omega^{8} + 244.4620\,\omega^{6} - 40.7437\,\omega^{4} + 3.8197\,\omega^{2} + 2.5$$

according to the definition, the characteristic function should be normalized so that it equals unity, $A_{10}(1) = 1$, at the cutoff frequency $\omega_p = 1$.

4 conclusion

in this paper, we intended to illuminate the use of jacobi orthogonal polynomials in the design of continuous-time lowpass filter transfer functions. since a jacobi polynomial cannot be used directly as a filter characteristic function, we suggested shifted jacobi polynomials and proposed a simple modification of the jacobi polynomials for use as a filter characteristic function. the modified jacobi polynomials are not orthogonal, but they are suitable for the filter transfer function approximation. the filter degree, the maximum passband attenuation and the two indexes of the jacobi polynomials are the four parameters that adjust the performance of the filter. the new modified jacobi polynomials are implemented to approximate the lowpass filter transfer function in such a way that they are used directly as the filter characteristic function (as standard orthogonal polynomials:
chebyshev or legendre). these methods of approximation can be used to provide filters with adjustable passband ripple, group delay deviation or cutoff slope.

acknowledgment

this work is supported by the serbian ministry of education, science and technological development, project no. 32009tr.

references

[1] w. v. assche and e. coussement, “some classical multiple orthogonal polynomials,” journal of computational and applied mathematics, vol. 127, no. 12, pp. 317–347, jan. 2001, numerical analysis 2000. vol. v: quadrature and orthogonal polynomials. [online]. available: http://www.sciencedirect.com/science/article/pii/s0377042700005033
[2] l. storch, “synthesis of constant-time-delay ladder networks using bessel polynomials,” proceedings of the ire, vol. 42, no. 11, pp. 1666–1675, nov. 1954.
[3] b. d. rakovich and v. s. stojanovich, “on the design of equal ripple delay filters with chebyshev stopband attenuation,” radio and electronic engineer, vol. 43, no. 4, pp. 257–265, apr. 1973.
[4] s. butterworth, “on the theory of filter amplifiers,” experimental wireless and the radio engineer, vol. 7, pp. 536–541, oct. 1930.
[5] h. g. dimopoulos, “optimal use of some classical approximations in filter design,” ieee transactions on circuits and systems ii: express briefs, vol. 54, no. 9, pp. 780–784, sep. 2007.
[6] s. c. d. roy, “modified chebyshev lowpass filters,” international journal of circuit theory and applications, vol. 38, no. 5, pp. 543–549, 2010. [online]. available: http://dx.doi.org/10.1002/cta.585
[7] r. ramiz and h. sedef, “general method for designing and simulating of resistively terminated lc ladder filters,” facta universitatis, series: electronics and energetics, vol. 12, no. 3, pp. 79–94, 1999.
[8] s. prasad, l. g. stolarczyk, j. r. jackson, and e. w. kang, “filter synthesis using legendre polynomials,” proceedings of the iee, vol. 114, no. 8, pp. 1063–1064, aug. 1967.
[9] m. t. chryssomallis and j. n. sahalos, “filter synthesis using products of legendre polynomials,” electrical engineering, vol. 81, no. 6, pp. 419–424, 1999.
[10] d. živaljević, n. stamenković, and v. stojanović, “nearly monotonic passband low-pass filter design by using sum-of-squared legendre polynomials,” international journal of circuit theory and applications, vol. 44, no. 1, pp. 147–161, jan. 2016. [online]. available: http://dx.doi.org/10.1002/cta.2068
[11] y. h. ku and m. drubin, “network synthesis using legendre and hermite polynomials,” j. franklin inst., vol. 273, no. 2, pp. 138–157, feb. 1962.
[12] i. m. filanovsky, “bessel-butterworth transitional filters,” in 2014 ieee international symposium on circuits and systems (iscas), jun. 2014, pp. 2105–2108.
[13] a. dey, s. sadhu, and t. k. ghoshal, “adaptive gauss hermite filter for parameter varying nonlinear systems,” in 2014 international conference on signal processing and communications (spcom), jul. 2014, pp. 1–5.
[14] b. d. rakovich and v. b. litovski, “least-squares monotonic lowpass filters with sharp cutoff,” electronics letters, vol. 9, no. 4, pp. 75–76, feb. 1973.
[15] d. mirković, m. a. stošović, p. petković, and v. litovski, “design of iir digital filters with critical monotonic passband amplitude characteristic - a case study,” facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 269–283, 2016.
[16] a. budak and p. aronhime, “transitional butterworth-chebyshev filters,” circuit theory, ieee transactions on, vol. 18, no. 3, pp. 413–415, may 1971.
[17] y. peless and murakami, “analysis and synthesis of transitional butterworth-thomson filters and bandpass amplifier,” rca rev., vol. 18, no. 3, pp. 60–94, mar. 1957.
[18] a. papoulis, “optimum filters with monotonic response,” proceedings of the ire, vol. 46, no. 3, pp. 606–609, mar. 1958.
[19] a. papoulis, “on monotonic response filters,” proceedings of the ire, vol. 47, no. 2, pp. 332–333, feb. 1959.
[20] m. fukada, “optimum filters of even orders with monotonic response,” ire transactions on circuit theory, vol. 6, no. 3, pp. 277–281, sep. 1959.
[21] p. halpern, “optimum monotonic low-pass filters,” circuit theory, ieee transactions on, vol. 16, no. 2, pp. 240–242, may 1969.
[22] m. abramowitz and i. stegun, handbook of mathematical functions with formulas, graphs, and mathematical tables, 9th ed. new york, dover: national bureau of standards applied mathematics series 55, 1972.
[23] m. lutovac and d. tošić, “symbolic signal processing and system analysis,” facta universitatis, series: electronics and energetics, vol. 16, no. 3, pp. 423–431, 2003.
[24] a. h. bhrawy, e. h. doha, s. s. ezz-eldien, and r. a. van gorder, “a new jacobi spectral collocation method for solving 1+1 fractional schrödinger equations and fractional coupled schrödinger systems,” the european physical journal plus, vol. 129, no. 12, pp. 1–21, 2014. [online]. available: http://dx.doi.org/10.1140/epjp/i2014-14260-6
[25] c. beccari, “the use of the shifted jacobi polynomials in the synthesis of lowpass filters,” international journal of circuit theory and applications, vol. 7, no. 2, pp. 289–295, 1979.
[26] b. d. rakovich, “designing monotonic low-pass filters - comparison of some methods and criteria,” international journal of circuit theory and applications, vol. 2, no. 3, pp. 215–221, sep. 1974. [online]. available: http://dx.doi.org/10.1002/cta.4490020302
[27] d. topisirović, v. litovski, and m. andrejević stošović, “unified theory and state-variable implementation of critical-monotonic all-pole filters,” international journal of circuit theory and applications, vol. 43, no. 4, pp. 502–515, apr. 2015. [online]. available: http://dx.doi.org/10.1002/cta.1956
[28] t. v. hoang and s. tabbone, “errata and comments on ‘generic orthogonal moments: jacobi-fourier moments for invariant image description’,” pattern recognition, vol. 46, no. 11, pp. 3148–3155, nov. 2013. [online]. available: http://www.sciencedirect.com/science/article/pii/s0031320313001817
[29] d. johnson and j. johnson, “low-pass filters using ultraspherical polynomials,” ieee transactions on circuit theory, vol. 13, no. 4, pp. 364–369, dec. 1966.
[30] z. d. milosavljević and m. v. gmitrović, “realizable band-pass filter structures with optimal redundancy parameters,” facta universitatis, series: electronics and energetics, vol. 13, no. 1, pp. 131–141, 2000.
[31] v. d. pavlović and ć. b. dolićanin, “mathematical foundation for the christoffel-darboux formula for classical orthonormal jacobi polynomials applied in filters,” scientific publications of the state univ. of novi pazar, series a: appl. math. inform. and mech., vol. 3, no. 2, pp. 139–151, 2011.
[32] v. d. pavlović et al., “new class of filter functions generated most directly by the christoffel-darboux formula for classical orthonormal jacobi polynomials,” scientific publications of the state univ. of novi pazar, series a: appl. math. inform. and mech., vol. 5, no. 1, pp. 23–33, 2013.

facta universitatis
series: electronics and energetics vol. 29, no 2, june 2016, pp. 159–175
doi: 10.2298/fuee1602159a

characterization of nonlinear loads in power distribution grid

miona andrejević stošović¹, marko dimitrijević¹, slobodan bojanić², octavio nieto-taladriz², vančo litovski¹

¹ university of niš, faculty of electronic engineering, niš, serbia
² escuela técnica superior de ingenieros de telecomunicación, universidad politecnica de madrid, madrid, spain

abstract. electronic devices are complex circuits, consisting of analog, switching, and digital subsystems that require direct current (dc) for polarization. since they are connected to the mains, which delivers alternating current (ac), ac-to-dc converters must be introduced between the mains and the electronics to be fed.
a converter is an electric circuit containing several subsystems, the most important being the switch-mode power supply, drawing power from the mains in pulses hence it is highly nonlinear. that happens, in reduced amplitude, even when the electronics to be fed is switched off. the process of ac-to-dc conversion is not restricted to feeding electronic equipment only. it is more and more frequently encountered in modern smart-grid facilities giving rise to the importance of the studies referred hereafter. the converter can be studied (theoretically or by measurements) as two-port network with reactive and nonlinear port-impedances. characterization is performed after determining the port electrical quantities which are voltages and currents. based on these data power and power quality parameters – power factor and total harmonic distortionmay be extracted. when nonlinear loads are present, one should introduce new ways of thinking into the considerations due to the existence of harmonics and related power components. in that way the power factor can be generalized to total or true power factor where the apparent power, involved in its calculations, includes all harmonic components. after introducing a wide range of definitions used in contemporary literature, here we describe our measurement set-up both as hardware and a software solution. the results reported unequivocally confirm the importance of the subject of characterization of small nonlinear loads to the grid having in mind their number which is rising without saturation seen in the near and even far future. key words: smart grid, nonlinear loads, load characterization, power factor, harmonic distortions received september 29, 2015 corresponding author: miona andrejević stošović university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: miona.andrejevic@elfak.ni.ac.rs) 160 m.andrejević-stošović, m. dimitrijević, s. bojanić, o. nieto-taladriz, v. litovski 1. 
introduction

with the advent of modern diversified sources of electrical energy, the issue of power quality becomes both more ambiguous and more complicated. we will first address the new aspects that are coming to the fore thanks to the new ways of producing electrical energy, which are becoming more and more popular, and thanks to the emergence of a new paradigm known as the smart grid, which involves mutual interaction of electrical power systems and electronic systems for its proper functionality [1].

nowadays we are witnessing changes in energy demand and use, which in fact means “new” load characteristics and trends changing the nature of the aggregate utility consumption. all of that is mostly due to the electronic devices that have become ubiquitous. it is presumed that the overall household consumption of electronic appliances will rise at a rate of 6% per year, reaching 29% of the total household consumption in the year 2030. at the same time, household consumption is expected to reach 40% of the overall electricity demand. the immense rise of office consumption due to the enormous number of computers in use is also to be added. the same holds for educational, administrative, health, transport, and other public services. one may get the picture by multiplying the average consumption of a desktop computer (about 120 w) by the average number of hours per day the computer is on (about 7) and by the number of computers (billion(s)?).

electronic loads are strongly related to power quality because of the ac/dc converters, which in general draw current from the grid in bursts. the current-voltage relationship of these loads, looking from the grid side, is nonlinear, hence the name nonlinear loads. in fact, while keeping the voltage waveform almost sinusoidal, they inject pulses into the current, chopping it into a seemingly arbitrary waveform and, consequently, producing harmonic distortions.
having all this in mind, the means for characterizing the load from the nonlinearity point of view become one of the indispensable tools of quality evaluation of the smart grid. the problem is further complicated when different power generation technologies and resources are combined. new subsystems in power production, transport, and consumption emerge, named micro-grids, and the overall system is supposed to become a smart grid. for example, due to the rising number of different kinds of electricity sources, even the frequency of the grid voltage may be considered “unknown”, asking for algorithms and software to be implemented in real time to extract the frequency value [2] and, based on that, to compute the amplitudes of the harmonics [3, 4, 5].

due to the nonlinearities, however, measurement of power factor and distortion usually requires dedicated equipment. for example, a classical ammeter will return incorrect results when used to measure the ac current drawn by a nonlinear load and then to calculate the power factor. a true rms multimeter must be used to measure the actual rms currents and voltages and the apparent power. to measure the real or the reactive power, a wattmeter designed to work properly with non-sinusoidal currents must also be used.

contemporary methods and algorithms for spectrum analysis are presented in this paper. the basic definitions of parameters describing nonlinear loads are introduced. alternative definitions of reactive power and their calculation methods are also elaborated.

in our previous research we first developed a tool for efficient measurements that would allow proper and complete characterization of nonlinear loads [6, 7].
namely, we found that the tools for characterization of modern loads available on the market most frequently lack at least one of the following properties: low price, ability to implement complex data processing algorithms (versatility), ability to store and statistically analyze the measured data, and ability to communicate with the environment no matter how distant it is. all these were achieved by the system reported in [6, 7], and the measurement results demonstrated here were obtained with these tools.

next, we applied these tools to the characterization of small loads. the results obtained, as reported in [8] and [9] for example, were in some cases surprisingly different from what was expected. that holds for the power components other than the active power and for the abundance of harmonics.

in [10] and [11] we demonstrated that, based on the mains current and by proper data processing, one may deduce the activities within a computer despite the complex signal transformation between the mains and the components of the computer via the power supply chain. moreover, one may recognize the software running within the computer. such information is distributed via the grid.

here we will for the first time summarize the theoretical background of all computations that must be performed for complete characterization of small loads. then we will demonstrate our new results in applying the theory and the measurement tools to a set of nonlinear loads.

the definitions used in modern characterization of the mains current, voltage, and power, which are implemented by our system, will be listed in the second section, so that the main attention can be devoted to the set of measured results and their analysis, which will be given next. the paper is organized as follows. first, a short description of the measurement experiment will be given.
to preserve conciseness, for this purpose, we will mainly refer to our previous work.

2. parameter definitions

although power quality is a relatively ambiguous concept, limited mostly to conversations among utility engineers and physicists, as electronic appliances take over the home, it may become a residential issue as well.

2.1. linear loads with sinusoidal stimuli

a sinusoidal voltage source

$$v(t) = \sqrt{2}\,V_{\mathrm{rms}}\sin(\omega t) \qquad (1)$$

supplying a linear load will produce a sinusoidal current

$$i(t) = \sqrt{2}\,I_{\mathrm{rms}}\sin(\omega t - \varphi) \qquad (2)$$

where $V_{\mathrm{rms}}$ is the rms value of the voltage, $I_{\mathrm{rms}}$ is the rms value of the current, $\omega$ is the angular frequency, $\varphi$ is the phase angle and $t$ is the time. the instantaneous power is

$$p(t) = v(t)\,i(t) \qquad (3)$$

and it can be represented as

$$p(t) = 2\,V_{\mathrm{rms}} I_{\mathrm{rms}}\sin(\omega t)\sin(\omega t - \varphi) = p_p + p_q. \qquad (4)$$

using trigonometric transformations, we can write:

$$p_p = V_{\mathrm{rms}} I_{\mathrm{rms}}\cos\varphi\,\bigl(1 - \cos(2\omega t)\bigr) = P\,\bigl(1 - \cos(2\omega t)\bigr) \qquad (5)$$

and

$$p_q = -V_{\mathrm{rms}} I_{\mathrm{rms}}\sin\varphi\,\sin(2\omega t) = -Q\,\sin(2\omega t) \qquad (6)$$

where

$$P = V_{\mathrm{rms}} I_{\mathrm{rms}}\cos\varphi, \qquad Q = V_{\mathrm{rms}} I_{\mathrm{rms}}\sin\varphi \qquad (7)$$

represent real ($P$) and reactive ($Q$) power. it can be easily shown that the real power is the average of the instantaneous power over a cycle:

$$P = \frac{1}{T}\int_{t_0}^{t_0+T} v(t)\,i(t)\,dt \qquad (8)$$

where $t_0$ is an arbitrary time (constant) after equilibrium, and $T$ is the period (20 ms in the european and 1/60 s in the american system, respectively). the reactive power $Q$ is the amplitude of the oscillating instantaneous power $p_q$. the apparent power is the product of the root mean square value of the current and the root mean square value of the voltage:

$$S = V_{\mathrm{rms}} I_{\mathrm{rms}} \qquad (9)$$

or:

$$S = \sqrt{P^2 + Q^2}. \qquad (10)$$

power factor is simply defined as the ratio of real power to apparent power [12, 3]:

$$\mathrm{TPF} = P / S. \qquad (11)$$

for the pure sinusoidal case, using (7), (10) and (11) we can calculate:

$$\mathrm{TPF} = \cos\varphi. \qquad (12)$$

2.2.
nonlinear loads

when there is a nonlinear load in the system, it operates in non-sinusoidal conditions, and the use of well-known parameters such as the power factor, defined as the cosine of the phase difference, does not describe the system properly. in that case, traditional power system quantities such as effective value, power (active, reactive, apparent), and power factor need to be numerically calculated from sampled voltage and current sequences by performing the dft, the fft or the goertzel algorithm [3].

the rms value of a periodic physical quantity $x$ (voltage or current) is calculated according to the well-known formula [13, 14]:

$$X_{\mathrm{rms}} = \sqrt{\frac{1}{T}\int_{t_0}^{t_0+T} \bigl(x(t)\bigr)^2\,dt} \qquad (13)$$

where $x(t)$ represents the time evolution, $T$ is the period and $t_0$ is an arbitrary time. for any periodic physical quantity $x(t)$, we can give the fourier representation:

$$x(t) = a_0 + \sum_{k=1}^{\infty}\bigl(a_k\cos(k\omega t) + b_k\sin(k\omega t)\bigr) \qquad (14)$$

or

$$x(t) = c_0 + \sum_{k=1}^{\infty} c_k\cos(k\omega t - \psi_k) \qquad (15)$$

where $c_0 = a_0$ represents the dc component, $c_k = \sqrt{a_k^2 + b_k^2}$ the magnitude of the $k$-th harmonic, $\psi_k = \arctan(b_k/a_k)$ the phase of the $k$-th harmonic and $\omega = 2\pi/T$ the angular frequency. the fourier coefficients $a_k$, $b_k$ are:

$$a_0 = \frac{1}{T}\int_{-T/2}^{T/2} x(t)\,dt, \qquad a_k = \frac{2}{T}\int_{-T/2}^{T/2} x(t)\cos\!\left(\frac{2\pi k t}{T}\right)dt \qquad (16)$$

and

$$b_k = \frac{2}{T}\int_{-T/2}^{T/2} x(t)\sin\!\left(\frac{2\pi k t}{T}\right)dt. \qquad (17)$$

the rms value of the $k$-th harmonic is

$$X_{k,\mathrm{rms}} = c_k/\sqrt{2}. \qquad (18)$$

we can calculate the total rms value

$$X_{\mathrm{rms}} = \sqrt{\sum_{k=1}^{M} X_{k,\mathrm{rms}}^2} = \sqrt{X_{1,\mathrm{rms}}^2 + X_{H,\mathrm{rms}}^2} \qquad (19)$$

where $M$ is the highest order harmonic taken into calculation. index “1” denotes the first or fundamental harmonic, and index “H” denotes the contributions of higher harmonics. equations (13) – (19) need to be rewritten for voltage and current. practically, we operate with sampled values, and the integrals (16) and (17) are transformed into finite sums.
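the remark that the integrals (16) and (17) become finite sums for sampled data can be illustrated with a minimal python sketch. it assumes exactly one period of equally spaced samples and uses a hypothetical test current (a fundamental of 10 a peak plus a third harmonic of 5 a peak); the function names are illustrative and are not taken from the measurement system described in this paper.

```python
import math

def fourier_coefficients(samples, k):
    """Finite-sum counterpart of (16)-(17) for one period of equally spaced samples."""
    n = len(samples)
    a_k = 2.0 / n * sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(samples))
    b_k = 2.0 / n * sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(samples))
    return a_k, b_k

def harmonic_rms(samples, k):
    """RMS value of the k-th harmonic, eq. (18): c_k / sqrt(2)."""
    a_k, b_k = fourier_coefficients(samples, k)
    return math.hypot(a_k, b_k) / math.sqrt(2.0)

def total_rms(samples, m):
    """Total RMS value built from harmonics 1..m, eq. (19)."""
    return math.sqrt(sum(harmonic_rms(samples, k) ** 2 for k in range(1, m + 1)))

def time_domain_rms(samples):
    """Sampled version of eq. (13)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

# Hypothetical test current (assumption, not measured data):
# 10 A peak fundamental plus 5 A peak third harmonic, one period, 256 samples.
N_SAMPLES = 256
current = [10.0 * math.sin(2.0 * math.pi * i / N_SAMPLES)
           + 5.0 * math.sin(2.0 * math.pi * 3 * i / N_SAMPLES + 0.3)
           for i in range(N_SAMPLES)]

i1 = harmonic_rms(current, 1)                           # fundamental rms
ih = math.sqrt(total_rms(current, 10) ** 2 - i1 ** 2)   # higher-harmonic rms
print(f"I1 = {i1:.4f} A, IH = {ih:.4f} A, THD_I = {100.0 * ih / i1:.2f} %")
print(f"rms from harmonics = {total_rms(current, 10):.4f} A, "
      f"rms from samples = {time_domain_rms(current):.4f} A")
```

in a real acquisition the window rarely spans exactly one period, so the grid frequency would have to be estimated first or a window function applied; the sketch deliberately ignores this.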
for a single-phase system, where $k$ is the harmonic number, $\varphi_k$ the phase difference between voltage and current of the $k$-th harmonic and $M$ the highest harmonic, the total active power is given by:

$$P = \sum_{k=1}^{M} I_{k,\mathrm{rms}} V_{k,\mathrm{rms}}\cos\varphi_k = P_1 + P_H. \qquad (20)$$

the first addend in the sum (20), denoted with $P_1$, is the fundamental active power. the rest of the sum, denoted with $P_H$, is the harmonic active power [13].

in the literature, there exist a number of definitions of reactive power for non-sinusoidal conditions that serve to characterize nonlinear loads and measure the degree of the loads’ nonlinearity [14]. as a more general term, non-active power $N$ was introduced. each definition has some advantages over the others. but, although there is a tendency to generalize, there is no generally accepted definition.

the most common definition of reactive power is budeanu’s definition [15], given by the following expression for a single-phase circuit:

$$Q_B = \sum_{k=1}^{M} I_{k,\mathrm{rms}} V_{k,\mathrm{rms}}\sin\varphi_k. \qquad (21)$$

budeanu proposed that apparent power consists of two orthogonal components, active power (20) and non-active power, which is divided into reactive power (21) and distortion power:

$$D = \sqrt{U^2 - P^2 - Q_B^2}. \qquad (22)$$

it should be noted that the actual contribution of harmonic frequencies to active and reactive power is small (usually less than 3% of the total active or reactive power). the major contribution of higher harmonics to the power comes as distortion power.

the apparent power, for non-sinusoidal conditions conventionally denoted as $U$, can be written:

$$U^2 = I_{\mathrm{rms}}^2 V_{\mathrm{rms}}^2 = \underbrace{I_{1,\mathrm{rms}}^2 V_{1,\mathrm{rms}}^2}_{S_1^2} + \underbrace{I_{1,\mathrm{rms}}^2 V_{H,\mathrm{rms}}^2}_{D_V^2} + \underbrace{V_{1,\mathrm{rms}}^2 I_{H,\mathrm{rms}}^2}_{D_I^2} + \underbrace{V_{H,\mathrm{rms}}^2 I_{H,\mathrm{rms}}^2}_{S_H^2} \qquad (23)$$

where $S_1$ represents fundamental apparent power, $D_V$ voltage distortion power, $D_I$ current distortion power and $S_H$ harmonic apparent power.
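as a rough numerical illustration of (20) – (23), the following python sketch evaluates budeanu's quantities and the four terms of the apparent power from per-harmonic rms values. the harmonic amplitudes and phase angles are hypothetical (chosen only for the example), and the helper names are not part of the system described here.

```python
import math

def budeanu_powers(v_rms, i_rms, phi):
    """Return (P, Q_B, U, D) from per-harmonic rms lists (index 0 = fundamental, phi in rad)."""
    p = sum(v * i * math.cos(f) for v, i, f in zip(v_rms, i_rms, phi))    # eq. (20)
    q_b = sum(v * i * math.sin(f) for v, i, f in zip(v_rms, i_rms, phi))  # eq. (21)
    v_total = math.sqrt(sum(v * v for v in v_rms))                        # eq. (19) for voltage
    i_total = math.sqrt(sum(i * i for i in i_rms))                        # eq. (19) for current
    u = v_total * i_total                                                 # apparent power U
    d = math.sqrt(max(u * u - p * p - q_b * q_b, 0.0))                    # eq. (22)
    return p, q_b, u, d

def apparent_power_terms(v_rms, i_rms):
    """Return (S1, D_V, D_I, S_H), the four terms whose squares add up to U^2 in eq. (23)."""
    v1, i1 = v_rms[0], i_rms[0]
    v_h = math.sqrt(sum(v * v for v in v_rms[1:]))
    i_h = math.sqrt(sum(i * i for i in i_rms[1:]))
    return v1 * i1, v_h * i1, v1 * i_h, v_h * i_h

# Hypothetical load (assumed values): fundamental plus two higher harmonics.
v = [230.0, 4.0, 3.0]        # V, rms per harmonic
i = [0.10, 0.07, 0.05]       # A, rms per harmonic
phi = [0.45, 1.20, -0.80]    # rad, voltage-current phase difference per harmonic

p, q_b, u, d = budeanu_powers(v, i, phi)
s1, d_v, d_i, s_h = apparent_power_terms(v, i)
print(f"P = {p:.3f} W, Q_B = {q_b:.3f} var, D = {d:.3f} var, U = {u:.3f} VA")
print(f"S1 = {s1:.3f}, D_V = {d_v:.3f}, D_I = {d_i:.3f}, S_H = {s_h:.3f}")
```

for this assumed load the distortion power exceeds budeanu's reactive power, which is the pattern the text associates with strongly nonlinear loads.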
$S_1$ and $S_H$ are

$$S_1^2 = P_1^2 + Q_1^2, \qquad S_H^2 = P_H^2 + Q_H^2 + D_H^2 \qquad (24)$$

where $D_H$ represents harmonic distortion power. the total apparent power, denoted with $U$, is

$$U = \sqrt{P^2 + Q^2 + D^2} = I_{\mathrm{rms}} V_{\mathrm{rms}}. \qquad (25)$$

we can also define non-active power $N$, given by the equation

$$N = \sqrt{Q^2 + D^2} \qquad (26)$$

and phasor power $S$, defined in the same way as apparent power for sinusoidal conditions (10). it is obvious that for sinusoidal conditions, apparent power and phasor power are equal, and (25) reduces to (10).

the total harmonic distortions, thd, are calculated from the following formulas [12, 13]:

$$\mathrm{THD}_I = \frac{I_{H,\mathrm{rms}}}{I_{1,\mathrm{rms}}} = \frac{\sqrt{\sum_{j=2}^{M} I_{j,\mathrm{rms}}^2}}{I_{1,\mathrm{rms}}} = \frac{\sqrt{I_{\mathrm{rms}}^2 - I_{1,\mathrm{rms}}^2}}{I_{1,\mathrm{rms}}} \qquad (27)$$

and

$$\mathrm{THD}_V = \frac{V_{H,\mathrm{rms}}}{V_{1,\mathrm{rms}}} = \frac{\sqrt{\sum_{k=2}^{M} V_{k,\mathrm{rms}}^2}}{V_{1,\mathrm{rms}}} = \frac{\sqrt{V_{\mathrm{rms}}^2 - V_{1,\mathrm{rms}}^2}}{V_{1,\mathrm{rms}}} \qquad (28)$$

where $I_j$, $V_k$ ($j, k = 1, 2, \ldots, M$) stand for the harmonics of the current or voltage. it can be shown that:

$$D_I = V_{1,\mathrm{rms}} I_{H,\mathrm{rms}} = S_1\,\mathrm{THD}_I, \qquad D_V = V_{H,\mathrm{rms}} I_{1,\mathrm{rms}} = S_1\,\mathrm{THD}_V, \qquad S_H = V_{H,\mathrm{rms}} I_{H,\mathrm{rms}} = S_1\,\mathrm{THD}_I\,\mathrm{THD}_V. \qquad (29)$$

fundamental power factor or displacement power factor is given by the following formula:

$$\mathrm{PF}_1 = \cos\varphi_1 = \frac{P_1}{S_1}. \qquad (30)$$

total power factor tpf [12, 13], defined by equation (12), taking into calculation (11) and (23), is

$$\mathrm{TPF} = \frac{P}{U} = \frac{P_1 + P_H}{\sqrt{S_1^2 + D_I^2 + D_V^2 + S_H^2}} \qquad (31)$$

and substituting (29) and (30):

$$\mathrm{TPF} = \frac{\left(1 + \dfrac{P_H}{P_1}\right)\cos\varphi_1}{\sqrt{1 + \mathrm{THD}_I^2 + \mathrm{THD}_V^2 + (\mathrm{THD}_I\,\mathrm{THD}_V)^2}}. \qquad (32)$$

total power factor can be represented as the product of distortion power factor dpf and displacement power factor $\mathrm{PF}_1$, i.e. $\cos\varphi_1$:

$$\mathrm{TPF} = \mathrm{DPF}\cdot\cos\varphi_1 \qquad (33)$$

therefore, distortion power factor is [12, 13]

$$\mathrm{DPF} = \frac{1 + \dfrac{P_H}{P_1}}{\sqrt{1 + \mathrm{THD}_I^2 + \mathrm{THD}_V^2 + (\mathrm{THD}_I\,\mathrm{THD}_V)^2}}. \qquad (34)$$

in real circuits, $P_H \ll P_1$ and the voltage is almost sinusoidal ($\mathrm{THD}_V < 5\%$), leading to a simpler equation for tpf [12, 13]:

$$\mathrm{TPF} = \frac{\cos\varphi_1}{\sqrt{1 + \mathrm{THD}_I^2}}. \qquad (35)$$

2.3.
other definitions of reactive power

budeanu’s definition

the most common definition of reactive power is budeanu’s definition [16], given by the following expression for a single-phase circuit, as mentioned earlier in the text:

$$Q_B = \sum_{k=1}^{M} I_{k,\mathrm{rms}} V_{k,\mathrm{rms}}\sin\varphi_k \qquad (36)$$

budeanu proposed that apparent power consists of two orthogonal components, active power and non-active power, which is divided into reactive power (36) and distortion power:

$$D_B = \sqrt{U^2 - P^2 - Q_B^2}. \qquad (37)$$

ieee std 1459-2010 proposes reactive power to be calculated as:

$$Q_{\mathrm{IEEE}} = \sqrt{\sum_{k=1}^{M} I_{k,\mathrm{rms}}^2 V_{k,\mathrm{rms}}^2\sin^2\varphi_k} \qquad (38)$$

equation (38) eliminates the situation where the value of the total reactive power $Q$ is less than the value of the fundamental component.

kimbark’s definition

similar to budeanu’s definition, kimbark [17] proposed that apparent power consists of two orthogonal components, non-active and active power, the latter defined as average power. the non-active power is separated into two components, reactive and distortion power. the first is calculated by the equation

$$Q_K = I_{1,\mathrm{rms}} V_{1,\mathrm{rms}}\sin\varphi_1 \qquad (39)$$

it depends only on the fundamental harmonic. the distortion power is defined as the non-active power of higher harmonics:

$$D_K = \sqrt{U^2 - P^2 - Q_K^2}. \qquad (40)$$

sharon’s definition

this definition [18] introduces two quantities: reactive apparent power, $S_Q$, and complementary apparent power, $S_C$, defined as:

$$S_Q = V_{\mathrm{rms}}\sqrt{\sum_{k=1}^{M} I_{k,\mathrm{rms}}^2\sin^2\varphi_k} \qquad (41)$$

and

$$S_C = \sqrt{U^2 - P^2 - S_Q^2} \qquad (42)$$

where $U = V_{\mathrm{rms}} I_{\mathrm{rms}}$ is the apparent power (9) and $P$ the active power (8).

fryze’s definition

fryze’s definition [19] assumes instantaneous current separation into two components named active and reactive currents.
active current is calculated as

$$i_a(t) = \frac{P}{V_{\mathrm{rms}}^2}\,v(t) \qquad (43)$$

and reactive current as:

$$i_r(t) = i(t) - i_a(t). \qquad (44)$$

active and reactive powers are

$$P = V_{\mathrm{rms}} I_a, \qquad Q_F = V_{\mathrm{rms}} I_r \qquad (45)$$

where $I_a$ and $I_r$ represent the rms values of the instantaneous active and reactive currents.

kusters and moore’s power definitions

the kusters-moore definition [20] presents two different reactive power parameters, inductive reactive power:

$$Q_L = V_{\mathrm{rms}}\,\frac{\displaystyle\sum_{k=1}^{M}\frac{1}{k}\,V_{k,\mathrm{rms}} I_{k,\mathrm{rms}}\sin\varphi_k}{\sqrt{\displaystyle\sum_{k=1}^{M}\frac{V_{k,\mathrm{rms}}^2}{k^2}}} \qquad (46)$$

and capacitive reactive power:

$$Q_C = V_{\mathrm{rms}}\,\frac{\displaystyle\sum_{k=1}^{M} k\,V_{k,\mathrm{rms}} I_{k,\mathrm{rms}}\sin\varphi_k}{\sqrt{\displaystyle\sum_{k=1}^{M} k^2 V_{k,\mathrm{rms}}^2}}. \qquad (47)$$

there are other power decompositions, not considered in this paper: shepard-zakikhani [21], depenbrock [22] and czarnecki decomposition [23, 24]. a more comprehensive comparison of reactive power definitions, obtained by means of simulation, can be found in [25].

3. measurement system

in order to establish a comprehensive picture of the properties of a given load, one needs to perform a complete analysis of the current and voltage waveforms at its terminals. in that way the fundamental and the higher harmonics of both the current and the voltage may be found. more frequently, however, indicators related to the power are sought in order to quantitatively characterize the load. namely, a linear resistive load will have voltage and current in phase and will consume only real power. any other load will deviate from this characterization, and one wants to know the extent of the deviation, expressed by as many indicators as necessary to get a complete picture. all these were implemented in our measuring system, which will be briefly described next.

the solution, as described in full detail in [6, 7], is based on a real-time system for nonlinear load analysis.
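to give an idea of the kind of calculation such a system has to perform for the alternative definitions of section 2.3, here is a minimal python sketch operating on per-harmonic rms values. it is not the g code of the real-time application; the function name and the input lists are hypothetical, the harmonics are assumed to be consecutive (k = 1, 2, …, m), and fryze's reactive power is evaluated through the frequency-domain identity $Q_F = \sqrt{U^2 - P^2}$ rather than from the instantaneous currents (43) – (44).

```python
import math

def alternative_reactive_powers(v_rms, i_rms, phi):
    """Evaluate several reactive power definitions from per-harmonic rms values.

    v_rms, i_rms, phi: lists for harmonics k = 1..M (index 0 = fundamental), phi in rad.
    """
    orders = range(1, len(v_rms) + 1)
    v_tot = math.sqrt(sum(v * v for v in v_rms))
    i_tot = math.sqrt(sum(i * i for i in i_rms))
    u = v_tot * i_tot
    p = sum(v * i * math.cos(f) for v, i, f in zip(v_rms, i_rms, phi))
    q_terms = [v * i * math.sin(f) for v, i, f in zip(v_rms, i_rms, phi)]
    return {
        "Q_B": sum(q_terms),                                              # eq. (36)
        "Q_IEEE": math.sqrt(sum(q * q for q in q_terms)),                 # eq. (38)
        "Q_K": q_terms[0],                                                # eq. (39)
        "S_Q": v_tot * math.sqrt(sum((i * math.sin(f)) ** 2
                                     for i, f in zip(i_rms, phi))),       # eq. (41)
        "Q_F": math.sqrt(max(u * u - p * p, 0.0)),                        # eq. (45), assumed identity
        "Q_L": v_tot * sum(q / k for q, k in zip(q_terms, orders))
               / math.sqrt(sum((v / k) ** 2 for v, k in zip(v_rms, orders))),   # eq. (46)
        "Q_C": v_tot * sum(q * k for q, k in zip(q_terms, orders))
               / math.sqrt(sum((k * v) ** 2 for v, k in zip(v_rms, orders))),   # eq. (47)
    }

# Sanity check on a purely sinusoidal case: every definition reduces to V*I*sin(phi).
pure = alternative_reactive_powers([230.0], [0.5], [0.5])

# Hypothetical distorted load (assumed values).
distorted = alternative_reactive_powers([230.0, 4.0, 3.0], [0.10, 0.07, 0.05], [0.45, 1.20, -0.80])
for name, value in distorted.items():
    print(f"{name} = {value:.3f}")
```

for a single sinusoidal component all seven quantities coincide, which is a convenient check before applying the functions to distorted waveforms.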
the system is based on the virtual instrumentation paradigm, keeping the main advantage of legacy instruments – determinism in measurement. the system consists of three subsystems: the acquisition subsystem, the real-time application for parameter calculations, and the virtual instrument for additional analysis and data manipulation (fig. 1).

fig. 1 the system architecture

the acquisition subsystem, fig. 2, is implemented using a field-programmable gate array (pxi chassis equipped with a pxi-7813r fpga card with a virtex ii fpga) in control of data acquisition [26]. acquisition is performed using ni 9225 [27] and ni 9227 [28] c series acquisition modules connected to the pxi-7813r fpga card [26]. the a/d resolution is 24-bit, with a 50 ksa/s sampling rate and a dynamic range of ±300 v for voltages and ±5 a for currents. the fpga provides timing, triggering control, and channel synchronization, maintaining high speed, hardware reliability, and strict determinism. the fpga code is implemented in the labview development environment. the function of the fpga circuit is acquisition control.

fig. 2 connection diagram of acquisition subsystem

the software component is implemented in two stages, executing on a real-time operating system (pharlap rtos, [29, 30]) and a general purpose operating system (gpos). the described system enables real-time calculation of a number of parameters that characterize nonlinear loads, which is impossible using classical instruments. the measured quantities are calculated from the current and voltage waveforms according to the ieee 1459-2000 and ieee 1459-2010 standards [12, 13].

the real-time application (fig. 3) calculates power and power quality parameters deterministically and saves the calculated values on local storage. the application is executed on the real-time operating system.

fig. 3 part of real-time application in g code, alternative reactive power calculations

the virtual instrument, implemented in the national instruments labview [30, 31] environment, is used for additional analysis and data manipulation and represents the user interface of the described system. it runs on a general purpose operating system, physically apart from the rest of the system. communication is achieved by tcp/ip. parameters and values obtained by means of acquisition and calculations are presented numerically and graphically (fig. 4).

fig. 4 virtual instrument provides measurements of various parameters

4. measurement results

we have performed measurements on various small loads. the parameters obtained may be used for decision making of various kinds, such as verification of compliance with some standards or categorization within quality frames.

as small loads we consider here various devices: cfl and led lamps, power supply devices and battery chargers of personal communication and computing devices. these devices are ubiquitous and in everyday use, thus their cumulative effect on the power distribution grid is not negligible [32], [33]. various parameters that characterize nonlinearity, efficiency and quality are measured and calculated.

table 1 shows measured results obtained on small loads such as various compact fluorescent lamps (cfl, 7 w – 20 w), incandescent lamps (100 w and 60 w), two low-power 1 w indoor led (light emitting diode) lamps, a prototype of a 34 w street led lamp and a crt computer monitor for reference.

the compact fluorescent lamp is a good example of a nonlinear load [34]. it brings a reduction in total energy consumption (about 20%, compared to an incandescent lamp of equivalent luminosity), but with harmonic currents and increased harmonic loss on the distribution transformer.
measurements show that cfl lamps have good correction of displacement power factor, but significant distortion leading to a low total power factor (table 1). cfls are equipped with power supply units which conduct current only during a very small part of the fundamental period, so the current drawn from the grid has the shape of a short impulse.

table 1 cfl and led lamps

| type | nominal power (W) | frequency (Hz) | Vrms (V) | Irms (mA) | active power (W) | Idc (mA) | voltage THD (%) | current THD (%) | current crest | voltage crest | DPF (%) | cos(φ) | TPF (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| incandescent | 100 | 50.03 | 230.20 | 421.66 | 97.02 | 0.62 | 3.11 | 3.05 | 1.52 | 1.47 | 99.95 | 1.00 | 99.95 |
| cfl bulb | 20 | 50.03 | 231.49 | 134.87 | 18.64 | 0.24 | 2.58 | 112.17 | 3.38 | 1.41 | 66.55 | 0.90 | 59.70 |
| cfl tube | 20 | 49.95 | 231.20 | 145.89 | 19.66 | 0.25 | 2.84 | 114.01 | 4.33 | 1.44 | 65.94 | 0.88 | 58.28 |
| cfl bulb | 15 | 49.99 | 231.47 | 92.16 | 12.60 | 0.13 | 2.82 | 115.52 | 3.52 | 1.41 | 65.45 | 0.90 | 59.08 |
| incandescent | 60 | 49.97 | 231.15 | 257.88 | 59.58 | 0.42 | 2.87 | 2.84 | 1.57 | 1.41 | 99.96 | 1.00 | 99.96 |
| cfl spot | 7 | 49.97 | 232.48 | 50.86 | 7.23 | 0.19 | 2.81 | 104.24 | 3.24 | 1.40 | 69.23 | 0.88 | 61.20 |
| cfl bulb | 7 | 50.06 | 230.95 | 52.46 | 7.21 | 0.28 | 2.83 | 112.26 | 3.42 | 1.40 | 66.51 | 0.90 | 59.54 |
| cfl bulb | 9 | 50.01 | 233.20 | 60.54 | 8.25 | 0.11 | 2.87 | 116.93 | 3.60 | 1.39 | 64.99 | 0.90 | 58.44 |
| cfl tube | 11 | 50.01 | 233.17 | 84.34 | 11.66 | 0.16 | 2.79 | 112.27 | 3.37 | 1.45 | 66.51 | 0.89 | 59.28 |
| cfl tube | 18 | 50.01 | 221.32 | 135.56 | 18.40 | 0.38 | 2.82 | 107.35 | 4.52 | 1.45 | 68.16 | 0.90 | 61.32 |
| cfl tube | 11 | 50.01 | 221.14 | 115.00 | 14.06 | 0.16 | 3.01 | 119.30 | 4.06 | 1.46 | 64.24 | 0.86 | 55.41 |
| cfl helix | 11 | 50.00 | 221.83 | 76.73 | 10.23 | 0.25 | 2.96 | 109.26 | 4.90 | 1.47 | 67.51 | 0.89 | 60.09 |
| cfl bulb | 9 | 49.99 | 232.52 | 70.06 | 9.70 | 0.19 | 2.84 | 110.87 | 3.52 | 1.43 | 66.98 | 0.89 | 59.53 |
| cfl helix | 18 | 50.01 | 221.46 | 138.68 | 19.01 | 0.35 | 2.89 | 105.56 | 3.94 | 1.43 | 68.77 | 0.90 | 61.71 |
| cfl helix | 20 | 50.03 | 231.19 | 156.43 | 21.02 | 0.20 | 2.79 | 111.36 | 3.91 | 1.44 | 66.82 | 0.87 | 58.13 |
| cfl tube | 15 | 50.01 | 221.00 | 105.09 | 13.96 | 0.29 | 3.16 | 112.13 | 4.46 | 1.40 | 66.56 | 0.90 | 60.11 |
| led white | 1 | 50.00 | 217.24 | 14.96 | 0.35 | 0.09 | 2.36 | 21.14 | 1.72 | 1.38 | 97.84 | 0.11 | 10.79 |
| led cold w. | 1 | 49.94 | 217.33 | 14.95 | 0.35 | 0.08 | 2.36 | 21.14 | 1.72 | 1.38 | 97.84 | 0.11 | 10.79 |
| led street | 34 | 49.99 | 216.63 | 246.12 | 32.87 | 0.05 | 2.53 | 102.98 | 3.28 | 1.38 | 69.66 | 0.89 | 61.66 |
| crt | – | 50.03 | 232.63 | 475.86 | 107.46 | 1.60 | 2.93 | 13.24 | 1.65 | 1.49 | 99.14 | 0.98 | 97.69 |

characterization of nonlinear loads can be accomplished by analyzing reactive and distortion power. table 2 shows reactive power and distortion power values, calculated using alternative definitions, for compact fluorescent lamps, two incandescent lamps and indoor led lamps. the following values are displayed: active power (P), apparent power (U), non-active power (N), budeanu’s reactive power (QB), budeanu’s distortion power (DB), fryze’s reactive power (QF), the ieee std 1459-2010 proposed definition of reactive power (QIEEE), sharon’s reactive apparent power (SQ), kimbark’s reactive power (QK), and kusters-moore’s capacitive (QC) and inductive (QL) reactive power.

comparison of budeanu’s reactive and distortion power suggests that all examined cfl and led lamps are nonlinear loads (DB > QB). reactive power calculated from fryze’s definition (45) is equal to non-active power, $N = \sqrt{U^2 - P^2}$. kimbark’s equation (39) for reactive power, which takes only the fundamental harmonic into account, gives approximately ±3% deviation from budeanu’s formula (QB). it suggests that the actual contribution of harmonic frequencies to reactive power is small – less than 3% of the total reactive power. the ieee proposed definition always provides a value of the total reactive power that is not smaller than the value of the fundamental component.

table 2 cfl and led lamps

| no. | type | power | P (W) | U (VA) | N (VAR) | QB (VAR) | DB (VAR) | QF (VAR) | QIEEE (VAR) | SQ (VAR) | QK (VAR) | QC (VAR) | QL (VAR) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | cfl rod | | 11.56 | 17.84 | 13.58 | -6.16 | 12.10 | 13.58 | 6.16 | 10.24 | -6.16 | -4.43 | -6.11 |
| 2 | cfl bulb e27 | 20 | 17.14 | 27.72 | 21.78 | -8.43 | 20.08 | 21.78 | 8.43 | 14.48 | -8.43 | -6.46 | -8.37 |
| 3 | cfl tube e27 | 20 | 16.77 | 28.46 | 23.00 | -8.44 | 21.39 | 23.00 | 8.45 | 14.55 | -8.45 | -6.07 | -8.39 |
| 4 | cfl bulb e27 | 15 | 11.59 | 18.91 | 14.94 | -5.31 | 13.97 | 14.94 | 5.32 | 9.22 | -5.32 | -4.00 | -5.28 |
| 5 | inc e27 | 100 | 86.77 | 86.78 | 0.80 | -0.50 | 0.63 | 0.80 | 0.50 | 0.56 | -0.50 | -0.36 | -0.49 |
| 6 | cfl spot e14 | 7 | 5.87 | 9.32 | 7.25 | -2.83 | 6.67 | 7.25 | 2.81 | 4.23 | -2.81 | -2.17 | -2.80 |
| 7 | cfl bulb e27 | 7 | 6.16 | 9.86 | 7.71 | -2.64 | 7.24 | 7.71 | 2.65 | 4.83 | -2.65 | -2.03 | -2.63 |
| 8 | cfl bulb e14 | 9 | 6.46 | 10.78 | 8.63 | -2.72 | 8.19 | 8.63 | 2.72 | 5.45 | -2.72 | -2.08 | -2.70 |
| 9 | cfl tube e14 | 11 | 9.89 | 16.11 | 12.72 | -4.71 | 11.82 | 12.72 | 4.69 | 7.89 | -4.69 | -3.61 | -4.66 |
| 10 | cfl tube e27 | 18 | 17.10 | 28.86 | 23.24 | -8.73 | 21.54 | 23.24 | 8.75 | 13.27 | -8.75 | -6.64 | -8.68 |
| 11 | cfl tube e27 | 11 | 10.63 | 17.67 | 14.12 | -5.83 | 12.85 | 14.12 | 5.83 | 8.85 | -5.83 | -4.41 | -5.79 |
| 12 | cfl helix e27 | 11 | 9.58 | 16.27 | 13.16 | -4.93 | 12.20 | 13.16 | 4.95 | 8.75 | -4.95 | -3.68 | -4.90 |
| 13 | inc e14 | 60 | 55.06 | 55.06 | 0.61 | -0.37 | 0.49 | 0.61 | 0.37 | 0.37 | -0.37 | -0.27 | -0.37 |
| 14 | cfl helix e27 | 18 | 17.21 | 28.87 | 23.18 | -8.82 | 21.43 | 23.18 | 8.83 | 15.55 | -8.82 | -6.77 | -8.76 |
| 15 | cfl helix e27 | 20 | 18.41 | 30.68 | 24.54 | -9.95 | 22.43 | 24.54 | 9.93 | 16.14 | -9.93 | -7.56 | -9.86 |
| 16 | cfl tube e27 | 15 | 12.66 | 21.97 | 17.95 | -6.32 | 16.80 | 17.95 | 6.33 | 11.63 | -6.33 | -4.80 | -6.28 |
| 17 | spot e27 | 15 | 16.92 | 34.24 | 29.77 | -3.88 | 29.52 | 29.77 | 4.14 | 20.01 | -4.13 | -1.98 | -4.06 |
| 18 | spot e27 | 10 | 13.23 | 26.33 | 22.76 | -2.97 | 22.56 | 22.76 | 3.17 | 15.45 | -3.17 | -1.51 | -3.12 |
| 19 | bulb w e27 | 8 | 10.00 | 19.53 | 16.77 | -2.81 | 16.54 | 16.77 | 2.94 | 11.52 | -2.93 | -1.74 | -2.89 |
| 20 | bulb w e27 | 6 | 8.51 | 9.45 | 4.11 | 0.08 | 4.11 | 4.11 | 0.07 | 3.29 | 0.07 | 0.08 | 0.07 |
| 21 | bulb e27 | 6 | 8.69 | 9.58 | 4.04 | 0.09 | 4.04 | 4.04 | 0.08 | 3.28 | 0.08 | 0.08 | 0.08 |
| 22 | bulb e27 | 3 | 4.07 | 7.70 | 6.54 | -0.84 | 6.48 | 6.54 | 0.90 | 4.35 | -0.90 | -0.45 | -0.88 |
| 23 | rgb e27 | 3 | 1.92 | 3.17 | 2.52 | 0.01 | 2.52 | 2.52 | 0.01 | 1.39 | 0.00 | 0.05 | 0.00 |
| 24 | spot e14 | 3 | 4.00 | 8.05 | 6.99 | -0.98 | 6.92 | 6.99 | 1.04 | 4.86 | -1.04 | -0.52 | -1.02 |

further, personal devices containing rechargeable batteries, such as a tablet computer, a mobile phone, laptop computers and a cordless telephone, are analyzed with regard to operating conditions. measured results are presented in table 3. the working conditions are standby (device turned off and battery not charging), working and charging (device turned on and battery charging) and charging only (device turned off and battery charging). a standalone battery charger is also tested.

the following values are measured and shown in the table: voltage rms (V), current rms (I), frequency (f), cosine of the 1st harmonic phase difference (cos φ1), tpf – total power factor (%), dpf – distortion power factor (%), thdv – voltage total harmonic distortion (%), thdi – current total harmonic distortion (%), active power (P), budeanu’s reactive power (QB), apparent power (U), distortion power (D), non-active power (N), phasor power (S), first harmonic active power (P1) and higher harmonics active power (PH).

next we will pay some attention to the results given in table 3. let us first have a glimpse at the distortions of the current (thdi). as can be seen, even in the best cases the thdi is larger than 20%. there is a case, a mobile phone battery charger while charging, where the thdi is 154.51%, which means that the harmonics exceed the fundamental by a large margin. note that this is not an isolated case. one may observe several thdi values of similar size. to summarize, thdi exposes the nonlinear character of all small loads, some of which are extremely nonlinear, producing harmonics larger than the fundamental.

table 3 personal devices in different working conditions

| no. | device description | V (V) | I (mA) | f (Hz) |
|---|---|---|---|---|
| 1 | charger 230v 1.7a 2xaaa nicd battery charging. 850mah | 236.06 | 9.89 | 50.02 |
| 2 | tablet computer turned on. li-polymer 8220 mah battery charging | 235.70 | 80.92 | 49.98 |
| 3 | tablet computer turned off. li-polymer 8220 mah battery charging | 236.59 | 61.65 | 49.99 |
| 4 | tablet computer turned off. charger 230v/2a connected. not charging | 236.51 | 1.70 | 50.00 |
| 5 | mobile phone charger connected. not charging 230v/0.2a | 236.62 | 1.33 | 9.99 |
| 6 | mobile phone turned on. li-ion 1230 mah battery charging | 235.65 | 53.72 | 49.98 |
| 7 | mobile phone turned off. li-ion 1230 mah battery charging | 236.09 | 48.05 | 50.01 |
| 8 | laptop comp. (type 1) turned on. charger 230v. 1.7a connected, not charging | 233.49 | 22.99 | 50.01 |
| 9 | laptop comp. (type 1) turned on. li-ion 2200mah battery charging | 232.81 | 231.39 | 50.00 |
| 10 | laptop comp. (type 1) turned off. li-ion 2200mah battery charging | 233.52 | 106.52 | 49.99 |
| 11 | laptop comp. (type 2) turned on. charger 230v 1.5a connected, not charging | 233.07 | 15.71 | 49.99 |
| 12 | laptop computer (type 2) turned on. li-ion 4400mah battery charging | 232.05 | 436.60 | 49.97 |
| 13 | cordless telephone base charger 230v/40ma disconnected | 232.77 | 21.05 | 49.97 |
| 14 | cordless telephone base. 2xaaa. nicd. 550mah battery not charging | 233.68 | 21.71 | 50.00 |
| 15 | cordless telephone base. 2xaaa. nicd. 550mah battery charging | 233.55 | 25.60 | 49.99 |

| no. | TPF (%) | DPF (%) | THDV (%) | THDI (%) | P (W) | QB (VAR) | U (VA) | D (VAR) | N (VAR) | S (VAR) | P1 (W) | PH (W) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 32.93 | 70.81 | 1.70 | 94.47 | 0.77 | 1.77 | 2.33 | 1.62 | 2.20 | 1.68 | 0.78 | -0.02 |
| 2 | 57.36 | 58.15 | 1.73 | 137.76 | 10.94 | -1.74 | 19.07 | 15.53 | 15.62 | 11.08 | 11.08 | -0.14 |
| 3 | 55.12 | 55.54 | 1.70 | 146.23 | 8.04 | -0.93 | 14.59 | 12.13 | 12.17 | 8.09 | 8.17 | -0.12 |
| 4 | 21.43 | 79.20 | 1.67 | 114.80 | 0.09 | 0.18 | 0.40 | 0.35 | 0.39 | 0.20 | 0.05 | 0.00 |
| 5 | 12.64 | 101.35 | 1.69 | 59.01 | 0.04 | 0.17 | 0.31 | 0.26 | 0.31 | 0.18 | 0.02 | 0.00 |
| 6 | 52.73 | 53.66 | 1.71 | 154.51 | 6.67 | -1.18 | 12.66 | 10.69 | 10.76 | 6.78 | 6.73 | -0.05 |
| 7 | 51.18 | 51.98 | 1.77 | 161.72 | 5.81 | -0.96 | 11.34 | 9.70 | 9.75 | 5.88 | 5.87 | -0.06 |
| 8 | 7.00 | 95.18 | 1.78 | 29.07 | 0.38 | 1.38 | 5.37 | 1.61 | 5.36 | 5.12 | 0.38 | -0.01 |
| 9 | 53.67 | 54.76 | 2.00 | 147.11 | 28.91 | -6.10 | 53.87 | 45.04 | 45.45 | 29.55 | 29.65 | -0.71 |
| 10 | 47.51 | 50.62 | 1.92 | 164.35 | 11.82 | -4.64 | 24.87 | 21.39 | 21.89 | 12.70 | 12.18 | -0.28 |
| 11 | 12.69 | 99.22 | 1.94 | 40.82 | 0.46 | 1.46 | 3.66 | 1.42 | 3.63 | 3.37 | 0.43 | 0.00 |
| 12 | 96.74 | 97.30 | 1.83 | 20.90 | 98.01 | -10.67 | 101.31 | 23.32 | 25.65 | 98.59 | 97.86 | 0.02 |
| 13 | 23.50 | 90.76 | 1.80 | 43.70 | 1.15 | 4.33 | 4.90 | 1.97 | 4.76 | 4.48 | 1.16 | -0.01 |
| 14 | 47.31 | 92.64 | 1.78 | 36.64 | 2.40 | 4.09 | 5.07 | 1.81 | 4.47 | 4.74 | 2.43 | -0.01 |
| 15 | 70.29 | 92.99 | 1.82 | 37.24 | 4.20 | 3.66 | 5.98 | 2.16 | 4.25 | 5.57 | 4.23 | -0.02 |

the next very important and also interesting set of data is related to the power factor. in the early days it was known as the cos φ of the load, when only linear loads were considered, supposedly having a reactive component introducing a phase shift between the voltage and the current. the total power factor (tpf) encompasses the whole event, including the distortions of both the voltage and the current and their mutual phase shift. as can be seen from table 3, there is only one case where the tpf approaches unity, which is supposed to be its ideal value.
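the link between current distortion and power factor can be checked numerically against a few rows of table 1. the python sketch below uses the simplified relation (35), i.e. it assumes $P_H \ll P_1$ and an almost sinusoidal voltage, so that the distortion power factor reduces to $1/\sqrt{1 + \mathrm{THD}_I^2}$; the function names are illustrative only, and the cos φ values are the rounded ones printed in the table.

```python
import math

def dpf_percent(thd_i_percent):
    """Distortion power factor in percent under the simplifying assumptions of eq. (35)."""
    thd = thd_i_percent / 100.0
    return 100.0 / math.sqrt(1.0 + thd * thd)

def tpf_percent(cos_phi1, thd_i_percent):
    """Total power factor in percent, eq. (33) with the simplified DPF."""
    return cos_phi1 * dpf_percent(thd_i_percent)

# (current THD %, cos(phi), DPF % from table 1, TPF % from table 1)
table1_rows = {
    "incandescent 100 W": (3.05, 1.00, 99.95, 99.95),
    "cfl bulb 20 W": (112.17, 0.90, 66.55, 59.70),
    "led white 1 W": (21.14, 0.11, 97.84, 10.79),
}

for name, (thd_i, cos_phi, dpf_table, tpf_table) in table1_rows.items():
    print(f"{name}: DPF {dpf_percent(thd_i):.2f} % (table {dpf_table}), "
          f"TPF {tpf_percent(cos_phi, thd_i):.2f} % (table {tpf_table})")
```

under these assumptions the computed dpf reproduces the tabulated values closely, while the computed tpf differs slightly because cos φ is tabulated with two decimals only. the same shortcut should not be expected to hold for every row of table 3, where the tabulated dpf does not always follow from thdi alone.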
in many of the cases the value of tpf is smaller than 50%, meaning that the active power is smaller than half of the total power drawn from the mains which, as we could see from the previous paragraph, is mainly due to the distortions. in general, since most of the chargers are considered to be of small power (see column p1 in table 3), no power factor correction is built in, so that significant losses are allowed. that, to repeat once more, would not be a problem if the number of such devices, being attached to the mains all the time, were not in the range of billion(s).

the next column, the distortion power factor (dpf), represents the percentage of power taken by the harmonics. as we can see, except for a small number of cases where the harmonics are approximately on the level of half of the total power, in most cases they take as much power as the fundamental. note that the harmonics are unwanted not only because of efficiency problems. in fact, in the long term, the presence of harmonics on the grid can cause:
• increased electrical consumption,
• added wear and tear on motors and other equipment,
• greater maintenance costs,
• upstream and downstream power-quality problems,
• utility penalties for causing problems on the power grid,
• overheating in transformers, and similar.

a similar conclusion may be drawn by comparing the distortion power (d) and the power of the first (fundamental) harmonic (p1). there are only three cases where the latter is larger than the former.

to summarize the data from table 3, one may say that an electronic load to the grid, which in fact represents a power supply of a telecommunication or it device, is a small but highly nonlinear load. in many cases the tpf of such a load is in favor of everything but the active power to be delivered to the device.

5. conclusion

due to the changes in the nature of the electrical loads to the grid, new aspects of the characterization of the loads to the electrical grid are emerging.
these are related mainly to the nonlinearities of modern electronic loads and to the subsystems used for conversion from dc to ac and vice versa, which are becoming unavoidable in modern production and distribution systems. to qualify and quantify the properties of modern electrical power systems, new tools have to be developed that can cope with the new properties of the signals arising at the grid-to-load and grid-to-power-producing-facility interfaces. that holds both for the theoretical algorithms for computation and for the measurement equipment itself. in these proceedings we present our results in the development and implementation of a measurement system for small loads that are becoming ubiquitous and consequently of big concern for the quality of the delivered electrical energy. we also present the measurement results for a broad set of electronic loads, revealing many secrets hidden behind the prejudice that these loads are small and unimportant. our hardware and software solutions may be characterized as advanced, accurate and versatile while at the same time low-priced, which makes them very attractive for practical use, be it in laboratory or in field conditions.

acknowledgement: this research was partly funded by the ministry of education and science of republic of serbia under contract no tr32004.

references
[1] l. freeman, “the changing nature of loads and the impact on electric utilities”, tech advantage expo electronics exhibition and conference 2009, new orleans, usa, feb. 2009, www.techadvantage.org/2009conferencehandouts/2e_freeman.pdf.
[2] v. terzija, v. stanojević, “stls algorithm for power-quality indices estimation”, ieee transactions on power delivery, vol. 24, no. 2, pp. 544-552, april 2008.
[3] g. goertzel, “an algorithm for the evaluation of finite trigonometric series”, the american mathematical monthly, no. 1, vol. 65, pp. 34-35, january 1958.
[4] s.
vukosavić, “detection and suppression of parasitic dc voltages in 400 v ac grids”, facta universitatis, series: electronics and energetics, vol. 28, no 4, pp. 527-540, december 2015.
[5] l. korunović, m. rašić, n. floranović, v. aleksić, “load modelling at low voltage using continuous measurements”, facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 455-465, september 2014.
[6] m. dimitrijević, v. litovski, “power factor and distortion measuring for small loads using usb acquisition module”, journal of circuits, systems, and computers, vol. 20, no. 5, pp. 867-880, august 2011.
[7] m. dimitrijević, “electronic system for polyphase nonlinear load analysis based on fpga”, phd thesis, niš, 2012 (in serbian).
[8] m. dimitrijević, and v. litovski, “quantitative analysis of reactive power definitions for small nonlinear loads”, in proc. of the 4th small systems simulation symposium, niš, serbia, 2012, pp. 150-154.
[9] m. dimitrijević, and v. litovski, “real-time virtual instrument for polyphase nonlinear loads analysis”, in proc. of the ix int. symp. on industrial electronics, indel 2012, banja luka, b&h, november 2012, pp. 136-141.
[10] m. andrejević stošović, m. dimitrijević, and v. litovski, “computer security vulnerability as concerns the electricity distribution grid”, applied artificial intelligence, vol. 28, pp. 323–336, 2014.
[11] m. dimitrijević, m. andrejević stošović, j. milojković, v. litovski, “implementation of artificial neural networks based ai concepts to the smart grid”, facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 411-424, september 2014.
[12] -, “ieee trial-use standard definitions for the measurement of electric power quantities under sinusoidal, non-sinusoidal, balanced, or unbalanced conditions”, ieee power engineering society, ieee std. 1459-2000, 30. january 2000.
[13] ieee power engineering society: ieee trial-use standard definitions for the measurement of electric power quantities under sinusoidal, nonsinusoidal, balanced, or unbalanced conditions. ieee std. 1459-2010, 2. february 2010.
[14] l. s. czarnecki, “harmonics and power phenomena”, encyclopedia of electrical and electronics engineering, j. wiley and sons, 1999.
[15] a. e. emanuel, “power definitions and the physical mechanism of power flow”, j. wiley and sons, 2010.
[16] c. i. budeanu, “reactive and fictitious powers.” rumanian national institute, no. 2, 1927.
[17] e. w. kimbark, “direct current transmission”, j. wiley and sons, 1971.
[18] d. sharon, “reactive power definition and power-factor improvement in nonlinear systems.” in proc. of the institution of electrical engineers, vol. 120, pp. 704-706, 1973.
[19] s. fryze, et al., “elektrischen stromkreisen mit nichtsinusförmigem verlauf von strom und spannung.” elektrotechnische zeitschrift, no. 53, vol. 25, pp. 596-599, 1932.
[20] n. l. kusters, w. j. m. moore, “on the definition of reactive power under nonsinusoidal conditions.” ieee trans. power apparatus systems, no. 99, vol. 5, pp. 1845-1854, 1980.
[21] w. shepard, p. zakikhani, “power factor correction in nonsinusoidal systems by the use of capacitance”, journal of physics d: applied physics, no. 6, pp. 1850–1861, 1973.
[22] m. w. depenbrock, e. t. g. blindleistung, fachtagung blindleistung. aachen, 1979.
[23] l. s. czarnecki, “powers in nonsinusoidal networks: their interpretation, analysis and measurement”, ieee trans. instrumentation and measurement, no. 39, vol. 2, pp. 340-345, 1990.
[24] l. s. czarnecki, “physical reasons of currents rms value increase in power systems with nonsinusoidal voltage”, ieee trans. on power delivery, no. 8, vol. 1, pp. 437-447, 1993.
[25] m. e. balci, m. h. hocaoglu, “quantitative comparison of power decompositions”, electric power systems research, no. 78, pp.
318-329, 2008.
[26] -, “ni pxi-7813r r series digital rio with virtex-ii 3m gate fpga.” national instruments.
[27] -, “ni 9225 operating instructions and specifications.” national instruments.
[28] -, “ni 9227 operating instructions and specifications”, national instruments.
[29] c. jarvis, k. kinsella, p. timpanaro, “phar lap ets™ – an industrial-strength rtos white paper.”
[30] national instruments: “labview real-time.” national instruments web page. [url] http://sine.ni.com/nips/cds/view/p/lang/en/nid/2381.
[31] national instruments, “labview system design software.”
[32] d. stevanović, p. petković, “smarter power meters reduce economic losses at utility grid”, facta universitatis, series: electronics and energetics, vol. 28, no 3, pp. 407-421, september 2015.
[33] s. puzović, b. m. koprivica, a. milovanović, m. đekić, “analysis of measurement error in direct and transformer-operated measurement systems for electric energy and maximum power measurement”, facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 389-398, september 2014.
[34] m. etezadi-amoli, t. sr. florence, “power factor and harmonic distortion characteristics of energy efficient lamps”, ieee transactions on power delivery, no. 4, pp. 1965–1969, 1989.

facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 303-311 https://doi.org/10.2298/fuee1802303m

on a property of the reed-muller-fourier transform

claudio moraga
faculty of computer science, technical university of dortmund, germany

abstract. the reed-muller-fourier transform is reviewed and a new property is presented: the reed-muller-fourier transform of an n-place p-valued function preserves any permutation of the arguments. this leads to the additional result that the reed-muller-fourier spectrum of an n-place p-valued symmetric function is also symmetric.
furthermore, the reed-muller and the vilenkin-chrestenson spectra of an n-place p-valued symmetric function are also symmetric.

key words: multiple-valued switching theory, symmetric functions, reed-muller-fourier transform.

dedicated to prof. radomir stanković on the occasion of his 65th birthday

1. introduction
the fundamentals of the reed-muller transform may be found in the early work of i. zhegalkin [1], [2]. however, since his publications were in russian, they remained practically unknown to scientists not proficient in that language. the transform was rediscovered with the works of i.s. reed [3] and d.e. muller [4], and since then it carries their names. in the literature this transform is frequently referred to as the rm transform. the transform was developed to be applied to boolean functions. the later extension of the reed-muller transform to multiple-valued domains is due to d.h. green and i.s. taylor [5]. the reed-muller-fourier transform (rmf) was introduced by radomir s. stanković [6], [7], aiming to combine relevant properties of the reed-muller transform and the discrete fourier transform. in a way, this transform is another extension of the reed-muller transform to the multiple-valued domain. in the binary case, the rmf transform reduces to the reed-muller transform.

received august 3, 2017; received in revised form september 8, 2017. corresponding author: claudio moraga, faculty of computer science, technical university of dortmund, germany (e-mail: claudio.moraga@tu-dortmund.de)

an important common property of both the rm and rmf transforms is the fact that they represent bijections in the set of p–valued functions. this means that the rm spectrum or the rmf spectrum of an n–place p–valued function is again an n–place p–valued function, not necessarily different from the original one. (it has been shown that both transforms have fixed points [8], [9]). moreover, both the rm and the rmf transforms have a kronecker product structure.
(kronecker product: see e.g. [10], [11]). the rmf transform matrix is lower triangular [12] and exhibits special similarities with the pascal matrix on finite fields [13].

2. formalisms
notation: vectors and matrices will be written with upper case in bold. if $\mathbf{M}$ is a $p^m \times p^n$ matrix, it will be denoted simply as $\mathbf{M}_{m,n}$. square matrices will be assigned just one index. if not clear from the context, the length of vectors will be explicitly given. an exception to this notation is “xprmf”, which, for historical reasons [7], will be used to denote the basis of the rmf transform.

spectral techniques in a nutshell: let $V = \{0, 1, \ldots, p-1\}$ be the domain of p–valued functions and let $f : V^n \to V$ be an n-place p–valued function. to every function $f$, a value column vector $\mathbf{F}$ of length $p^n$ is associated. the elements of $\mathbf{F}$ are the values of $f$ for all the different value assignments to the arguments. the elements of $\mathbf{F}$ follow the lexicographic order of the value assignments to the arguments of $f$. let $f \leftrightarrow \mathbf{F}$ denote the association. it is obvious that $f \leftrightarrow \mathbf{I}_n\mathbf{F}$, where $\mathbf{I}_n$ denotes the identity matrix, represents a valid association. if $\mathbf{M}_n$ is a non-singular matrix, its inverse is also non-singular and well defined. moreover, since $(\mathbf{M}_n)^{-1} \cdot \mathbf{M}_n = \mathbf{I}_n$, then $f \leftrightarrow (\mathbf{M}_n)^{-1}\mathbf{M}_n\mathbf{F}$ is also a valid association and represents the basic concept of spectral transformations. since $(\mathbf{M}_n)^{-1}$ is non-singular, its columns form a linearly independent set. if the columns of $(\mathbf{M}_n)^{-1}$ are considered to represent value vectors of auxiliary functions, then $(\mathbf{M}_n)^{-1}$ constitutes a basis. $\mathbf{M}_n$, the inverse of $(\mathbf{M}_n)^{-1}$, is called a transform matrix and the product $\mathbf{M}_n\mathbf{F}$ is normally called the spectrum of $f$. the inner product of the basis and the spectrum leads to a polynomial expression of $f$. depending on the choice of $(\mathbf{M}_n)^{-1}$, different polynomial expressions on elements of the basis will be obtained.

definition 1: let $f, g : \mathbb{Z}_p \to \mathbb{Z}_p$.
the gibbs convolution product ($\ast$) of p-valued functions is calculated as follows [6]: if $x = 0$, then $(f \ast g)(0) = 0$. if $x \neq 0$, then $(f \ast g)(x) = \sum_{s=0}^{x-1} f(x-1-s)\,g(s) \bmod p$.

definition 2: the fundamental basis for the rmf transform, called xprmf, is the following [6], [7]: xprmf $= [\,x^{*0}\;\; x^{*1}\;\; \ldots\;\; x^{*(p-1)}\,]$, where $x^{*0}$ is defined to be the constant $p-1$ for all $x$, and for $1 \le j \le p-1$, the powers $x^{*j}$ are calculated as the j–fold gibbs product of $x^{*0}$ with itself.

it is simple to show that xprmf is its own inverse. therefore the basic rmf transform matrix, called $\mathbf{R}_1$, equals xprmf, and for all $n > 1$ holds: $\mathbf{R}_n = (\text{xprmf})^{\otimes n}$, where the exponent “$\otimes n$” denotes the n-fold kronecker product of xprmf with itself. since xprmf is its own inverse, it is easy to see that $\mathbf{R}_n$ will also be its own inverse.

example 1: let $n = 2$ and $p = 3$. calculating mod 3,

$\mathbf{R}_2 = \mathbf{R}_1 \otimes \mathbf{R}_1 = \begin{pmatrix} 1&0&0&0&0&0&0&0&0\\ 1&2&0&0&0&0&0&0&0\\ 1&1&1&0&0&0&0&0&0\\ 1&0&0&2&0&0&0&0&0\\ 1&2&0&2&1&0&0&0&0\\ 1&1&1&2&2&2&0&0&0\\ 1&0&0&1&0&0&1&0&0\\ 1&2&0&1&2&0&1&2&0\\ 1&1&1&1&1&1&1&1&1 \end{pmatrix}$

notice that the borders of $\mathbf{R}_2$ look different than those of $\mathbf{R}_1$. this will happen whenever $n$ is even, since for all $p$, $(p-1)^n \equiv 1 \bmod p$. if this is inconvenient for some application, then a normalized transform may be used.

definition 3: the normalized rmf transform is given by $\mathbf{R}_n = (-1)^{n+1}\,\text{xprmf}(1)^{\otimes n} \bmod p$. the factor $(-1)^{n+1}$ is introduced to preserve the value $(p-1)$ in the leftmost column of the matrix when $n$ is even, since $(-1)^{n+1}(p-1)^n \equiv (p-1)^{n+1}(p-1)^n \equiv (p-1)^{2n+1} \bmod p$. $2n+1$ will be an odd number and an odd power of $(p-1)$ equals $(p-1) \bmod p$. it is simple to see that in this case $\mathbf{R}_n$ is also self-inverse.

if for particular applications a “homogeneous and dft-like look” is desirable, then a special rmf transform may be used.

definition 4: the special rmf transform equals $(p-1)(\text{xprmf})^{\otimes n} \bmod p$. see figure 1.

fig. 1 special rmf transform matrices for p = 3, 4, and 5 when n = 1

if for any $p$, $\mathbf{R}_1$ is expressed as $[r_{i,j}]$, $i, j \in \mathbb{Z}_p$, then $r_{i,j} = (-1)^{j+1}\binom{i}{j} \bmod p$ [12].
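the construction of the basis from the gibbs product can be sketched in a few lines. this is a minimal sketch, assuming that the gibbs product is the convolution sum $\sum_{s=0}^{x-1} f(x-1-s)\,g(s) \bmod p$ and that the entries of the basic matrix follow $(-1)^{j+1}\binom{i}{j} \bmod p$; both forms are assumptions of the sketch, adopted because they produce the lower triangular, self-inverse matrix described in the text. the function names are hypothetical.

```python
from math import comb


def gibbs_product(f, g, p):
    """Gibbs convolution product of two p-valued functions given as value lists."""
    out = [0] * p  # (f * g)(0) = 0 by definition
    for x in range(1, p):
        out[x] = sum(f[x - 1 - s] * g[s] for s in range(x)) % p
    return out


def xprmf(p):
    """Basis matrix whose j-th column holds the values of the Gibbs power x^{*j}."""
    base = [p - 1] * p  # x^{*0}: the constant p - 1
    columns = [base]
    for _ in range(1, p):
        columns.append(gibbs_product(columns[-1], base, p))
    # turn the list of columns into a row-major matrix
    return [[columns[j][i] for j in range(p)] for i in range(p)]


def matmul_mod(a, b, p):
    """Matrix product of two square matrices, reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


for p in (2, 3, 4, 5):
    r1 = xprmf(p)
    identity = [[int(i == j) for j in range(p)] for i in range(p)]
    # the basis is its own inverse mod p
    assert matmul_mod(r1, r1, p) == identity
    # the entries agree with the assumed closed form
    assert all(r1[i][j] == ((-1) ** (j + 1) * comb(i, j)) % p
               for i in range(p) for j in range(p))
    print(p, r1)
```

for p = 2 the sketch returns the reed-muller matrix, in line with the remark that the two transforms coincide in the binary case.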
it may be observed that in the case when $p$ is a prime, the matrices are skew-symmetric, i.e., symmetric with respect to the diagonal with positive slope. furthermore, besides being skew-symmetric and self-inverse, starting at the lower left corner and moving along the diagonal with positive slope, a pascal triangle mod $p$ is found.

an important property of the rmf transform is the following: the rmf transform of a non-zero constant vector is an “impulse” vector, i.e. a vector with only one non-zero entry, at the 0-th position [12]. this is a well known property of the dft, which is preserved by the rmf transform.

3. theorems
theorem 1.
preliminaries: let $V = \{0, 1, \ldots, p-1\}$ be the domain of p–valued functions and let $f : V^2 \to V$, with value vector $\mathbf{F}$ of length $p^2$. moreover let $g : V^2 \to V$, such that $g(x_1, x_2) = f(x_2, x_1)$. let the value vector of $g$ be $\mathbf{G}$. furthermore, let $\mathbf{P}_2$ be a permutation matrix which, when applied to $\mathbf{F}$, induces a permutation of its components according to the reordering of the arguments of the function. hence $\mathbf{G} = \mathbf{P}_2\mathbf{F}$.
claim: the rmf transform of a p-valued function of two variables preserves the order of the arguments: $\mathbf{R}_2\mathbf{G} = \mathbf{R}_2\mathbf{P}_2\mathbf{F} = \mathbf{P}_2\mathbf{R}_2\mathbf{F} \bmod p$.
proof: let $i, j \in (\mathbb{Z}_p)^2$, with $i = i_1 i_0$ and $j = j_1 j_0$. since $\mathbf{R}$ has a kronecker product structure, then $\mathbf{R}_2 = \mathbf{R}_1 \otimes \mathbf{R}_1 \bmod p$. if $\mathbf{R}_2$ is expressed as $[r_{i,j}]$ then $r_{i,j} = \left((-1)^{j_1+1}\binom{i_1}{j_1}\right)\left((-1)^{j_0+1}\binom{i_0}{j_0}\right) \bmod p$. if $i_1$ and $i_0$ are exchanged, then the modified $r_{i,j} = \left((-1)^{j_1+1}\binom{i_0}{j_1}\right)\left((-1)^{j_0+1}\binom{i_1}{j_0}\right) \bmod p$. and if $j_1$ and $j_0$ are exchanged, then the modified $r_{i,j} = \left((-1)^{j_0+1}\binom{i_1}{j_0}\right)\left((-1)^{j_1+1}\binom{i_0}{j_1}\right) \bmod p$. it is simple to see that in both cases the modified $r_{i,j}$ takes the same value. moreover, exchanging $i_1$ and $i_0$ has the effect of exchanging (the corresponding) two rows of $\mathbf{R}_2$ and, similarly, exchanging $j_1$ and $j_0$ has the effect of exchanging (the corresponding) two columns of $\mathbf{R}_2$. exchanging $i_1$ and $i_0$ corresponds to $\mathbf{P}_2\mathbf{R}_2$, while exchanging $j_1$ and $j_0$ corresponds to $\mathbf{R}_2\mathbf{P}_2$. the assertion follows.
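the claim of theorem 1 can be checked numerically for small p. the following is a minimal sketch, assuming that the entries of the basic rmf matrix are $(-1)^{j+1}\binom{i}{j} \bmod p$ (an assumption of the sketch) and that the value vector is indexed as $p\,x_1 + x_2$ in lexicographic order; the helper names are hypothetical.

```python
from math import comb


def rmf1(p):
    """Basic RMF matrix R1, using the assumed closed form of its entries."""
    return [[((-1) ** (j + 1) * comb(i, j)) % p for j in range(p)] for i in range(p)]


def kron_mod(a, b, p):
    """Kronecker product of two matrices, reduced mod p."""
    return [[(x * y) % p for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def swap_matrix(p):
    """Permutation matrix P2 with (P2 F)[p*x1 + x2] = F[p*x2 + x1]."""
    size = p * p
    m = [[0] * size for _ in range(size)]
    for x1 in range(p):
        for x2 in range(p):
            m[p * x1 + x2][p * x2 + x1] = 1
    return m


def matmul_mod(a, b, p):
    """Matrix product of two square matrices, reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


def commutes(p):
    """True if P2 R2 == R2 P2 (mod p) for the two-variable RMF matrix."""
    r2 = kron_mod(rmf1(p), rmf1(p), p)
    p2 = swap_matrix(p)
    return matmul_mod(p2, r2, p) == matmul_mod(r2, p2, p)


for p in (2, 3, 4, 5):
    print(p, commutes(p))
```

the check relies only on the kronecker structure $\mathbf{R}_2 = \mathbf{R}_1 \otimes \mathbf{R}_1$, so it succeeds for composite p as well as for primes.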
although not explicitly needed for theorem 1, it is not difficult to construct the $\mathbf{P}_2$ matrices for different values of $p$, because of the strong regularity of their structure. they are symmetric, skew-symmetric and self-inverse. see figure 2.

fig. 2 $\mathbf{P}_2$ matrices for p = 2, p = 3, and p = 4

corollary 1.1: from $\mathbf{P}_2\mathbf{R}_2 = \mathbf{R}_2\mathbf{P}_2$ and recalling that $\mathbf{R}_2$ is self-inverse, it follows that $\mathbf{P}_2 = \mathbf{R}_2\mathbf{P}_2\mathbf{R}_2$. since $\mathbf{P}_2$ is also self-inverse, then $\mathbf{P}_2\mathbf{P}_2 = \mathbf{R}_2\mathbf{P}_2\mathbf{R}_2\mathbf{P}_2 = \mathbf{I}_2$, meaning that $\mathbf{R}_2\mathbf{P}_2$ is also its own inverse.

theorem 2. let $n \ge 2$ and $k < n$. define $f$ and $g$ to be p-valued functions of $n$ variables (i.e. n-place functions) with value vectors $\mathbf{F}$ and $\mathbf{G}$, respectively, such that for all value assignments to the arguments, $g$ equals $f$, but with transposed arguments $x_k$ and $x_{k+1}$. let $\mathbf{P}_n$ be a permutation which when applied to $\mathbf{F}$ has the effect of transposing only the two selected arguments, i.e., $\mathbf{P}_n = (\mathbf{I}_{k-1} \otimes \mathbf{P}_2 \otimes \mathbf{I}_{n-k-1})$. then $\mathbf{R}_n\mathbf{P}_n\mathbf{F} = \mathbf{P}_n\mathbf{R}_n\mathbf{F} \bmod p$.
proof: decompose $\mathbf{R}_n$ to match the structure of $\mathbf{P}_n$, i.e. $\mathbf{R}_n = \mathbf{R}_{k-1} \otimes \mathbf{R}_2 \otimes \mathbf{R}_{n-k-1}$, and apply it to both sides of the claim, taking advantage of the compatibility between kronecker and matrix products [11]:
$\mathbf{R}_n\mathbf{P}_n\mathbf{F} = (\mathbf{R}_{k-1} \otimes \mathbf{R}_2 \otimes \mathbf{R}_{n-k-1})(\mathbf{I}_{k-1} \otimes \mathbf{P}_2 \otimes \mathbf{I}_{n-k-1})\mathbf{F} = (\mathbf{R}_{k-1} \otimes \mathbf{R}_2\mathbf{P}_2 \otimes \mathbf{R}_{n-k-1})\mathbf{F} \bmod p$.
$\mathbf{P}_n\mathbf{R}_n\mathbf{F} = (\mathbf{I}_{k-1} \otimes \mathbf{P}_2 \otimes \mathbf{I}_{n-k-1})(\mathbf{R}_{k-1} \otimes \mathbf{R}_2 \otimes \mathbf{R}_{n-k-1})\mathbf{F} = (\mathbf{R}_{k-1} \otimes \mathbf{P}_2\mathbf{R}_2 \otimes \mathbf{R}_{n-k-1})\mathbf{F} \bmod p$.
it is easy to see that the claim will be satisfied if and only if $\mathbf{P}_2\mathbf{R}_2 = \mathbf{R}_2\mathbf{P}_2$. this was proven in theorem 1. the assertion follows.

example 2. let $p = 4$ and $n = 2$. calculate $\mathbf{P}_2\mathbf{R}_2$ operating mod 4. from corollary 1.1, $(\mathbf{P}_2\mathbf{R}_2)^{-1} = \mathbf{P}_2\mathbf{R}_2 = \mathbf{R}_2\mathbf{P}_2$; therefore commuting the factor matrices will give the same result.

theorem 3. let $f$ and $g$ be n-place p-valued functions with value vectors $\mathbf{F}$ and $\mathbf{G}$, respectively, such that for all value assignments to the arguments, $g$ equals $f$, but with transposed arguments $x_k$ and $x_{k+1}$ and transposed arguments $x_h$ and $x_{h+1}$. ($n > k > h > 0$).
if applied independently, let the corresponding transposition matrices be $\mathbf{P}_n^{(k)}$ and $\mathbf{P}_n^{(h)}$, respectively, leading to $\mathbf{G} = \mathbf{P}_n^{(h)}\mathbf{P}_n^{(k)}\mathbf{F}$. the following holds: $\mathbf{R}_n\mathbf{G} = \mathbf{P}_n^{(h)}\mathbf{P}_n^{(k)}\mathbf{R}_n\mathbf{F} \bmod p$.
proof: consider first one of the transpositions. let $\mathbf{G}' = \mathbf{P}_n^{(k)}\mathbf{F} \bmod p$. then from theorem 1 follows that $\mathbf{R}_n\mathbf{G}' = \mathbf{R}_n\mathbf{P}_n^{(k)}\mathbf{F} = \mathbf{P}_n^{(k)}\mathbf{R}_n\mathbf{F} \bmod p$. now let the second transposition be executed: $\mathbf{G} = \mathbf{P}_n^{(h)}\mathbf{G}'$. then from theorem 1 follows that $\mathbf{R}_n\mathbf{G} = \mathbf{R}_n\mathbf{P}_n^{(h)}\mathbf{G}' = \mathbf{P}_n^{(h)}\mathbf{R}_n\mathbf{G}' = \mathbf{P}_n^{(h)}\mathbf{P}_n^{(k)}\mathbf{R}_n\mathbf{F} \bmod p$.

theorem 4. let $f$ and $g$ be n-place p-valued functions with value vectors $\mathbf{F}$ and $\mathbf{G}$, respectively, such that for all value assignments to the arguments, $g$ equals $f$, but with permuted arguments. let $\mathbf{P}_n$ be a permutation matrix, which when applied to $\mathbf{F}$ has the same effect as permuting the corresponding arguments. then $\mathbf{R}_n\mathbf{G} = \mathbf{R}_n\mathbf{P}_n\mathbf{F} = \mathbf{P}_n\mathbf{R}_n\mathbf{F} \bmod p$.
proof: recall that any permutation of an ordered set of arguments may be obtained with an appropriate sequence of transpositions, and any transposition may be obtained with a cascade of transpositions of neighbor arguments. apply accordingly theorems 2 and 3 as many times as needed.

theorem 5. the rmf spectrum of an n-place p-valued symmetric function is symmetric.
proof: recall that a p-valued function is symmetric iff it is invariant with respect to any permutation of its arguments. (see e.g. [14], [15], [16], [17]) let $\mathbf{F}$ be the value vector of a symmetric function and let $\mathbf{P}_n$ be equivalent to an arbitrary permutation of its arguments. then $\mathbf{F} = \mathbf{P}_n\mathbf{F}$. from theorem 4, $\mathbf{R}_n\mathbf{F} = \mathbf{R}_n\mathbf{P}_n\mathbf{F} = \mathbf{P}_n\mathbf{R}_n\mathbf{F} \bmod p$. therefore $\mathbf{R}_n\mathbf{F} \bmod p$ is symmetric.

example 3: let $p = 4$ and $f : V^2 \to V$ be symmetric, such that $\mathbf{F} = [1\ 1\ 0\ 3\ 1\ 2\ 3\ 1\ 0\ 3\ 3\ 2\ 3\ 1\ 2\ 0]^T$. let $\mathbf{S} = \mathbf{R}_2\mathbf{F}$.
symmetry proof:
x2    0 1 2 3  0 1 2 3  0 1 2 3  0 1 2 3
x1    0 0 0 0  1 1 1 1  2 2 2 2  3 3 3 3
F^T   1 1 0 3  1 2 3 1  0 3 3 2  3 1 2 0
S^T   1 0 3 3  0 1 3 0  3 3 0 3  3 0 3 2
it is easy to see that $\mathbf{S}$, the spectrum of $f$, is also symmetric.
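example 3 can be reproduced with a short computation. this is a minimal sketch, assuming the closed form $(-1)^{j+1}\binom{i}{j} \bmod p$ for the entries of the basic rmf matrix and the unnormalized transform $\mathbf{R}_2 = \mathbf{R}_1 \otimes \mathbf{R}_1$; the helper names are hypothetical.

```python
from math import comb


def rmf1(p):
    """Basic RMF matrix R1, using the assumed closed form of its entries."""
    return [[((-1) ** (j + 1) * comb(i, j)) % p for j in range(p)] for i in range(p)]


def kron_mod(a, b, p):
    """Kronecker product of two matrices, reduced mod p."""
    return [[(x * y) % p for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def rmf_spectrum(values, p, n):
    """Spectrum R_n F mod p of an n-place p-valued function given by its value vector."""
    r1 = rmf1(p)
    rn = r1
    for _ in range(n - 1):
        rn = kron_mod(rn, r1, p)
    return [sum(c * v for c, v in zip(row, values)) % p for row in rn]


def is_symmetric_2place(vec, p):
    """True if the two-place function with value vector vec satisfies f(a, b) = f(b, a)."""
    return all(vec[p * a + b] == vec[p * b + a] for a in range(p) for b in range(p))


F = [1, 1, 0, 3, 1, 2, 3, 1, 0, 3, 3, 2, 3, 1, 2, 0]
S = rmf_spectrum(F, 4, 2)
print(S)  # [1, 0, 3, 3, 0, 1, 3, 0, 3, 3, 0, 3, 3, 0, 3, 2]
print(is_symmetric_2place(F, 4), is_symmetric_2place(S, 4))  # True True
```

the computed spectrum agrees with the row $\mathbf{S}^T$ of the example, which also supports the assumed closed form for p = 4.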
remark: it was shown in [18] that an analog to theorem 3 holds for spectra obtained with the reed-muller or the vilenkin-chrestenson transforms. this also includes the circular vilenkin-chrestenson spectrum.

corollary 5.1. the reed-muller and the vilenkin-chrestenson spectra of p–valued symmetric functions are symmetric.

corollary 5.2. if $f$ is a p–valued bent function [20], [19], then the function obtained after permuting the value assignment to the arguments is also bent, since the circular vilenkin-chrestenson spectrum will remain flat, i.e. all its components will have a constant absolute value equal to $p^{n/2}$.

4. conclusions
it has been shown that the rmf transform shares with the reed-muller and the vilenkin-chrestenson transforms the property of preserving any permutation of the arguments, in spite of their different structural attributes. recall that the vilenkin-chrestenson transform is complex-valued, symmetric, and unitary up to a normalizing coefficient; the reed-muller transform is integer-valued and neither symmetric nor orthogonal; and the reed-muller-fourier transform is integer-valued, lower triangular, and self-inverse.

references
[1] i.i. zhegalkin, “o tekhnyke vychyslenyi predlozhenyi v symbolytscheskoi logykye,” math. sb., vol. 34, pp. 9-28, in russian, 1927.
[2] i.i. zhegalkin, “aritmetizatiya symbolytscheskoi logyky,” math. sb., vol. 35, pp. 311-377, in russian, 1928.
[3] i.s. reed, “a class of multiple-error-correcting codes and the decoding scheme.” ire trans. on information theory pgit-4, pp. 38-49, 1954.
[4] d.e. muller, “application of boolean algebra to switching circuit design and to error correction.” ire trans. on elec. computers ec-3, vol. 3, pp. 6-12, 1954.
[5] d.h. green and i.s. taylor, “multiple-valued switching circuit design by means of generalized reed-muller expansions.” digital processes 2, pp. 63-81, 1976.
[6] r.s.
stanković, “some remarks on fourier transforms and differential operators for digital functions,” in proceedings of the 22nd international symposium on multiple-valued logic, sendai, japan, ieee press n.y., 1992, pp. 365-370.
[7] r.s. stanković, “the reed-muller-fourier transform – computing methods and factorizations”, claudio moraga: a passion for multi-valued logic and soft computing. (r. seising, h. allende-cid, eds.), springer 2017, pp. 121-151.
[8] c. moraga, s. stojković and r.s. stanković, “on fixed points and cycles in the reed muller domain.” in proceedings of the 38th international symposium on multiple-valued logic, ieee press, 2008, pp. 82-88.
[9] c. moraga, r.s. stanković, m. stanković and s. stojković, “on fixed points of the reed-muller-fourier transform.” in proceedings of the 47th international symposium on multiple-valued logic, ieee press, 2017, pp. 55-60.
[10] a. graham, kronecker products and matrix calculus with applications. ellis horwood ltd., chichester uk, 1981.
[11] r.a. horn and ch.r. johnson, topics in matrix analysis. cambridge university press, new york, 1991.
[12] c. moraga, r.s. stanković and m. stanković, “a comparative study of the reed-muller-fourier transform, the pascal matrix, and the discrete pascal transform.” research report fsc-2015-02, european centre for soft computing, mieres, asturias, spain, 2015.
[13] r.s. stanković, j.t. astola and c. moraga, “pascal matrices, reed-muller expressions, and reed-muller error correcting codes.” in logic in computer science ii, (s. ghilezan, ed.), press mathematical institute of the serbian academy of science, belgrade, serbia, 2015, zbornik radova 18 (26), pp. 145-172.
[14] e. pogossova and k. egiazarian, “reed-muller representation of symmetric functions.” j. multiple-valued logic and soft computing, vol. 10, no. 1, pp. 51-72, 2004.
[15] r.s. stanković, j.t. astola and k.
egiazarian, “remarks on symmetric binary and multiple-valued functions.” in proceedings of the 6th international workshop boolean problems, b. steinbach (ed.), 2004, pp. 83-87.
[16] j.t. butler and k. a. schueller, “worst case number of terms in symmetric multiple-valued functions.” in proceedings of the 21st international symposium on multiple-valued logic. ieee press, 1991.
[17] j.c. muzio, “concerning the maximum size of the terms in the realization of symmetric functions.” in proceedings of the 20th international symposium on multiple-valued logic, 1990, pp. 292-299.
[18] c. moraga, “permutations under spectral transforms.” in proceedings of the 38th international symposium on multiple-valued logic, ieee press, 2008, pp. 76-81.
[19] p.v. kumar, r.a. scholz and l.r. welch, “generalized bent functions and their properties.” jr. combinatorial theory series a, vol. 40, no. 1, 90-107, 1985.
[20] c. moraga, m. stanković, r.s. stanković and s. stojković, “contribution to the study of multiple-valued bent functions.” in proceedings of the 33rd international symposium on multiple-valued logic, ieee press, 2013, pp. 340-345.

facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp. 489-507 doi: 10.2298/fuee1604489b

coreless open-loop current transducers based on hall effect sensor csa-1v

marjan blagojević¹, uglješa jovanović², igor jovanović², dragan mančić², radivoje s. popović³
¹ irc sentronis ad, niš, serbia
² university of niš, faculty of electronic engineering, niš, serbia
³ epfl swiss federal institute of technology, lausanne, and senis ag, zug, switzerland

abstract. the paper provides an overview of coreless open-loop current transducers based on the hall effect sensor csa-1v. depending on the implementation method and current range, the presented transducers are divided into four groups.
the transducers can measure ac and dc currents ranging from several tens of milliamperes up to several hundreds of amperes. methods for resolving issues with the skin effect and stray magnetic fields are also presented, including the experimental test results. some of these methods are novel and have not been presented in the literature before.

key words: current measurement, current transducer, hall effect sensor, csa-1v

1. introduction
the hall effect refers to the voltage that appears across a conducting material when an electric current flowing through the conductor is influenced by a magnetic field [1], [2]. the hall effect is illustrated in fig. 1, where the current $I$ (flowing through the hall sensor in the direction shown in fig. 1) is deflected due to the magnetic flux density $B$, and thereby generates the voltage $V_H$.

fig. 1 operation principle of a hall effect sensor.

received october 6, 2015. corresponding author: marjan blagojević, irc sentronis ad, niš, serbia (email: marjan@sentronis.rs)

the equation which describes the output voltage of the hall effect sensor is:

$V_H = K_H \cdot I \cdot B$ (1)

where $K_H$ is the coefficient which defines the sensitivity of the sensor. thanks to the advantages they provide, current transducers based on hall effect sensors are used in various applications [2]. hall effect sensors are suitable for current measurement due to their small size, low price, good linearity, galvanic isolation, high bandwidth, good accuracy and the ability to measure dc current rather than only ac current [3], [4]. they can be employed to measure currents ranging from several microamperes up to several thousands of amperes. the distribution of current density in a conductor with a rectangular cross section and the equivalent schematic of this conductor are shown in fig. 2, where each color of resistor in the schematic matches the corresponding area of the rectangular conductor.
due to the skin effect, the higher the frequency the less current flows through the resistor r4 and more through the resistor r1, i.e. the less current flows through the middle of the conductor and more near the outer edges [5], [6]. current redistribution in a rectangular conductor is an important factor in current measurement applications.

fig. 2 distribution of current density and equivalent schematic of a flat conductor.

the skin effect within massive rectangular conductors may become noticeable at very low frequencies, in the order of several tens of hz. the redistribution of current density results in a redistribution of the measured magnetic flux density, which deteriorates the frequency and phase responses of current transducers. the phase response of a current transducer is very important in applications for electric energy measurements (for instance, good current transformers have a phase shift of less than 1°). immunity to stray magnetic fields is an important feature of current transducers based on hall effect sensors, because such fields can induce a false reading and measurement errors. this paper provides an overview of current transducers based on the hall effect sensor csa-1v, divided into four groups. the first group works in a similar way to pickup coils and measures current in a pcb traced conductor or in a wire. the second group is based on miniature bus bars and can measure currents up to several tens of amperes. thanks to the magnetic field increase obtained with multi-turn coils, the third group can measure very low currents in the range of milliamperes. the fourth group is based on bus bars and is designed for high current applications. the major features of the transducers, the issues that arise with current measurement and the methods to overcome them are also presented in this paper. special attention was paid to methods for resolving issues with the skin effect and stray magnetic fields.
some of the presented solutions for frequency response improvement are novel and, to the knowledge of the authors of this paper, have never been presented in the literature.

2. hall device csa-1v
csa-1v is an integrated hall effect single-axis magnetic field sensor designed for non-contact measurement of electric current. the device is manufactured using a standard cmos technology with an additional sentron’s patented ferromagnetic layer called integrated magnetic concentrator (imc) [7], [8] and it incorporates the spinning current technique. thanks to that, compared to conventional hall effect sensors, the csa-1v provides a magnetic gain contributing to a greater magnetic sensitivity, a lower magnetic offset and a lower magnetic noise [9], [10]. the device is packed in a standard soic-8 case (see fig. 3) which provides good isolation (up to 600 v) for applications with the current conductor traced on a printed circuit board (pcb) [9].

fig. 3 direction of the sensitivity vector and location of the sensing element [10].

the sensing element of the csa-1v is located approximately 0.3 mm below the top surface of the soic-8 case, as illustrated in fig. 3. a consequence of the limited controllability of the imc process is that csa-1v devices will usually not meet the datasheet specifications as fabricated [9]. for this reason, a calibration procedure is introduced during the manufacturing process, using a certain number of specifically designated memory cells [11]. the calibration memory cells are manufactured in “zener zapping” technology and can be programmed only once [12], [13]. the calibration procedure of the csa-1v is presented in detail in papers [11], [13].

3. pickup hall effect current transducers
a current transducer operating on a similar principle to a pickup coil can be realized using hall effect sensors.
in these transducers, instead of a pickup coil, the csa-1v is employed to sense the magnetic field generated by a current carrying conductor and convert it to a voltage proportional to that field. this can be performed either by employing the csa-1v to measure current in an adjacent wire or in a pcb traced conductor below the csa-1v, as shown in fig. 4 [10].

fig. 4 shape and direction of magnetic field from two different conductor types [10].

the csa-1v differential output voltage for a current carrying circular conductor (wire) located on top of the sensor can be approximated with the following equation [10]:

$V_{outdiff} \approx \dfrac{0.060 \cdot I}{d + 0.3}$ (2)

where $d$ is the distance between the csa-1v top surface and the center of the wire, given in millimeters (see fig. 4), and $I$ is the current applied in the wire. the application of the csa-1v measuring current in a current carrying wire is shown in fig. 5. if placed too close to the csa-1v, a high current carrying wire can saturate the csa-1v. therefore, the limits for electrical and magnetic saturation must be taken into account.

fig. 5 application of the csa-1v measuring current in a current carrying wire.

the csa-1v differential output voltage for a flat pcb conductor traced directly below the csa-1v can be approximated with the following equation [10]:

$V_{outdiff} \approx 40 \cdot I$ (3)

where $I$ is the current applied in a pcb traced conductor assumed to be roughly 3.2 mm wide. the sizing of the pcb trace needs to take into account the current handling capability and the total power dissipation. for this reason, the pcb trace needs to be thick enough and wide enough to handle the designated nominal current continuously. using a single pcb traced conductor, currents up to 10 a can be measured. the applications of the csa-1v measuring current in the pcb traced conductor (see fig.
6) are presented in paper [14] while the thermal analysis performed using the thermal imaging camera is presented in paper [15].

fig. 6 applications of the csa-1v measuring current in the pcb trace [14], [15].

the applications shown in fig. 6 are implemented in a photovoltaic power plant for dc current measurement of photovoltaic modules [14].

3.1. transducer with the magnetic shield

the csa-1v detects any surrounding stray magnetic field in its direction of sensitivity (across the chip), which may cause interference and degrade the measurement accuracy. the solution to this issue is to shield the csa-1v by mounting a small (roughly 1 cm² x 0.5 mm) ferromagnetic plate on the side of the pcb opposite to the one on which the csa-1v is soldered, as shown in fig. 7 [10]. the plate can be made of mu-metal since it has high permeability at low field strengths and low remanence.

fig. 7 shielding the csa-1v from stray fields [10].

the ferromagnetic shield has a double effect:
1. it concentrates the flux around the trace, thus shortening the field lines that go through the air by almost a factor of two. in this way the magnetic resistance is roughly halved, which ultimately contributes to higher induction and a greater output signal (by 30%–50%).
2. it serves as a concentrator for stray fields, at the same time deflecting them from the csa-1v as shown in fig. 7.

3.2. anti-differential configuration of hall effect sensors

the measurement error produced by stray fields can also be minimized by implementing two hall effect sensors in the anti-differential configuration shown in fig. 8.

fig. 8 anti-differential configuration of hall effect sensors [10].
implementation of this method cancels common mode magnetic fields while the output signal is doubled [10], as per the following equations:

$$u_1 = s \cdot (b + b_s) \qquad (4)$$

$$u_2 = s \cdot (b - b_s) \qquad (5)$$

$$u_1 + u_2 = 2 \cdot s \cdot b \qquad (6)$$

where $b$ is the measured magnetic field, $b_s$ is the common mode magnetic field and $s$ is the sensor sensitivity. this method works perfectly with homogeneous stray fields. since the field gradient decreases as a function of distance, surrounding nonhomogeneous stray fields can be considered homogeneous if their sources are relatively distant from the transducer. in this way, the useful signal is doubled while the noise is $\sqrt{2}$ times greater, i.e. the signal to noise ratio is $\sqrt{2}$ times better. application of the anti-differential configuration of the csa-1vs on a massive oval conductor is shown in fig. 9.

fig. 9 application of anti-differential configuration on the massive oval conductor.

4. miniature bus bar current transducers

currents greater than 10 a can be measured using the csa-1v by conducting the current through a properly shaped copper miniature bus bar (mbb) placed above the csa-1v, as illustrated in fig. 10.

fig. 10 copper mbb placed above the csa-1v.

the sizing of the mbb and its distance from the csa-1v depend on the desired current handling capability. the closer the mbb is to the csa-1v, the more accurate the readings will be, but the limits of electrical and magnetic saturation need to be taken into account. the approximate csa-1v differential output voltage can be obtained from the following equation:

$$V_{outdiff} \approx \frac{40 \cdot I}{d + 0.3} \qquad (7)$$

where $d$ is the distance between the mbb center and the csa-1v top surface, given in millimeters, and $I$ is the current in the mbb. the method illustrated in fig. 10 can easily be implemented by soldering an mbb onto a pcb above the csa-1v, as shown in fig. 11 [10].

fig. 11 application of a mbb and a pcb trace [10].
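the distance dependence in eq. (7) can be illustrated with a short numerical sketch. it assumes that eq. (7) reads as 40 times the current divided by (d + 0.3); the function name `mbb_output` and the example distances are hypothetical and are not taken from the paper. the result is in the same unit as the constant 40, which the text does not state explicitly.

```python
def mbb_output(current_a, distance_mm, k=40.0, offset_mm=0.3):
    """Evaluate the assumed form of eq. (7): k * I / (d + 0.3).

    current_a   -- current in the mbb, in amperes
    distance_mm -- distance between mbb center and sensor top surface, in mm
    k           -- constant from eq. (7); its unit is not stated in the text
    offset_mm   -- depth of the sensing element below the package surface
    """
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    return k * current_a / (distance_mm + offset_mm)


if __name__ == "__main__":
    # Illustrative distances only (not values from the paper).
    for d in (0.2, 0.7, 1.7):
        print(f"d = {d:.1f} mm -> output per ampere = {mbb_output(1.0, d):.1f}")
```

under this reading, doubling the effective distance (d + 0.3) halves the output, which is why the placement of the mbb trades sensitivity against the saturation limits mentioned above.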
when noncircular mbbs are employed in the application illustrated in fig. 11, it is necessary to take into account the frequency dependence of the transducer's sensitivity, because the skin effect forces high frequency current to flow along the outer edges of the mbb, thus changing the magnetic flux density at the site of the csa-1v. consequently, the frequency response deteriorates. the solution to this issue is to split a rectangular mbb into two parallel branches by drilling a hole in the middle of the mbb, as illustrated in fig. 12. since the current then flows through the two branches, the skin effect is minimized.

fig. 12 rectangular mbb without and with the hole in the middle.

it should be noted that a hollow mbb (see fig. 12) must be thicker than an otherwise identical mbb without a hole in order to handle the same current. to properly demonstrate the difference between transducers with a circular mbb, a rectangular mbb and a rectangular hollow mbb, it is necessary to analyze their frequency responses. for this purpose, three transducers are realized, one with each type of mbb. the transducer with the circular mbb is shown in fig. 13.

fig. 13 transducer with the circular mbb (figure labels: pcb, csa-1v, circular mbb, 0.7).

the transducer with the rectangular mbb, capable of handling currents up to 50 a, is shown in fig. 14.

fig. 14 transducer with the rectangular mbb (figure labels: pcb, csa-1v, rectangular mbb, 0.7).

the transducer with the rectangular hollow mbb is shown in fig. 15. the mbb is identical to the one employed in the transducer shown in fig. 14, with the only difference being the hole.

fig. 15 transducer with the rectangular hollow mbb (figure labels: pcb, csa-1v, hole, rectangular hollow mbb, 0.7).

the frequency responses of all three transducers are shown in fig. 16.
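as a rough orientation for when the skin effect starts to matter, the classical skin depth of copper can be computed for a few frequencies. this is only a sketch under stated assumptions: it uses the textbook thick-conductor formula and a textbook copper resistivity, neither of which is taken from the paper, and it does not model the lateral skin effect in flat conductors discussed in [17].

```python
import math

MU_0 = 4e-7 * math.pi   # vacuum permeability, H/m
RHO_COPPER = 1.68e-8    # copper resistivity near room temperature, ohm*m (textbook value)


def skin_depth_m(freq_hz, resistivity=RHO_COPPER, mu_r=1.0):
    """Classical skin depth: delta = sqrt(rho / (pi * f * mu_0 * mu_r))."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0 * mu_r))


if __name__ == "__main__":
    # Frequencies chosen for illustration only.
    for f in (20, 250, 10e3, 100e3):
        print(f"{f:>9.0f} hz -> skin depth {skin_depth_m(f) * 1e3:.2f} mm")
```

with these assumptions the skin depth is roughly 0.65 mm at 10 khz and roughly 0.2 mm at 100 khz, i.e. comparable to or smaller than the cross section of a typical small bus bar, whereas at a few tens of hz it is of the order of 10 mm and only becomes relevant for massive conductors.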
the dc sensitivity is s=34 mv/a for the transducer with the circular mbb, s=35 mv/a for the transducer with the rectangular mbb and s=28.36 mv/a for the transducer with the rectangular hollow mbb. as can be seen from fig. 16, the frequency response of the transducer with the circular mbb (see fig. 13) reaches the 3 db sensitivity attenuation (relative sensitivity equal to about 0.7) at 100 khz, which corresponds to the frequency response of the csa-1v sensor itself [9]. for the transducer with the rectangular mbb (see fig. 14), the 3 db sensitivity attenuation occurs at around 80 khz. for the transducer with the hollow rectangular mbb (see fig. 15), however, the 3 db sensitivity attenuation occurs at around 100 khz, just as for the transducer with the circular mbb. based on the measurements shown in fig. 16, the benefit of the hollow rectangular mbb is evident.

fig. 16 frequency responses for all three transducers.

an mbb carrying a high frequency ac current may be at a much higher potential than the ground of the csa-1v. this can lead to capacitive coupling between the mbb and the csa-1v. to avoid this, it is necessary to place an electrostatic shield between the mbb and the csa-1v. figure 17 shows the electrostatic shield implemented in the transducer with the rectangular hollow mbb. the electrostatic shield is mounted over the top surface of the csa-1v and soldered to the ground pad on the pcb. the electrostatic shield reduces the sensitivity; to minimize this reduction, it is necessary to employ an electrostatic shield with the shape shown in fig. 17.

fig. 17 transducer with the rectangular hollow mbb and the electrostatic shield.

the frequency response of the transducer with the rectangular hollow mbb and the electrostatic shield is shown in fig. 18.

fig.
18 frequency response of the transducer with the rectangular hollow mbb and the electrostatic shield.

comparing the frequency responses of the transducer with and without the electrostatic shield (fig. 16 and fig. 18), a slight sensitivity decrease is evident.

4.1. mbb transducer with the magnetic shield

to protect the csa-1v from stray magnetic fields it is possible to employ the magnetic shield shown in fig. 19. the shield material must be selected carefully so that it does not affect the transducer frequency response and linearity [16]. compared to the magnetic resistance of air, the magnetic resistance of the ferromagnetic shield is practically equal to zero. this means that the magnetic resistance of the magnetic circuit is reduced by a factor of two, i.e. the sensitivity is increased by a factor of two.

fig. 19 magnetic shield structure and transducer with the magnetic shield.

a side effect of the magnetic shield is that it may exhibit hysteresis and a remanent magnetization, which can cause offset. the solution to this issue is to insert a layer of vitrovac beneath the magnetic shield. vitrovac absorbs the field caused by the remanent magnetization. the frequency response of the transducer with the magnetic shield (see fig. 19) is shown in fig. 20. the sensitivity for dc current is s=57.8 mv/a.

fig. 20 frequency response of the transducer with the magnetic shield.

implementing the magnetic shield does not affect the frequency response, which can easily be seen by comparing the frequency responses shown in fig. 16 and fig. 20.

5. current transducers based on a bobbin coil

another method of developing low current transducers based on the csa-1v is to increase the magnetic field around the csa-1v using a multi-turn coil (see fig. 21).
in this way even currents of the order of several tens of milliamperes can be accurately measured. during assembly, the csa-1v is mounted in the center of a bobbin so that the sensing element inside the csa-1v lies in the middle of the bobbin, at equal distance from the top and bottom bobbin edges, as shown in fig. 21.

fig. 21 multi-turn coil and placement of the csa-1v inside the bobbin.

the transducer sensitivity depends on the coil size and the number of turns. increased sensitivity and immunity to stray fields can be gained by shielding the coil. the bobbin provides very high dielectric isolation, making this a suitable solution for high voltage power supplies with relatively low currents. to obtain the best accuracy and resolution, the output should be scaled so that the maximum output voltage is reached at the highest current to be measured.

based on this method, transducers capable of measuring currents ranging from 250 ma to 10 a are produced. the structure of these transducers is the same (see fig. 22), with the only difference being the type of coil implemented. depending on the current range, one of three types of coil is implemented in the transducer:
1. for 250 ma current with 10 v/a sensitivity, 250 turns of awg34 wire;
2. for 2.5 a current with 1 v/a sensitivity, 24 turns of awg24 wire;
3. for 10 a current with 0.25 v/a sensitivity, 6 turns of awg18 wire.

fig. 22 transducer structure: 1. shields; 2. duct tape; 3. foil; 4. bobbin; 5. csa-1v.

the components shown in fig. 22 are fitted in a cubic box and properly sealed. a photo of the realized transducer is shown in fig. 23.

fig. 23 photo of the realized transducer.

the transducer can be adjusted to output either a bipolar or a unipolar voltage. the transfer functions for both output types are shown in fig. 24.

fig. 24 transfer function for bipolar and unipolar output.
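the scaling rule for the three coil variants listed above can be checked with simple arithmetic. the sketch below uses only the ranges, sensitivities and turn counts given in the text; the dictionary layout and the function names are hypothetical.

```python
# Coil variants as listed in the text: range, sensitivity, number of turns, wire gauge.
COILS = [
    {"range_a": 0.25, "sens_v_per_a": 10.0, "turns": 250, "wire": "awg34"},
    {"range_a": 2.5, "sens_v_per_a": 1.0, "turns": 24, "wire": "awg24"},
    {"range_a": 10.0, "sens_v_per_a": 0.25, "turns": 6, "wire": "awg18"},
]


def full_scale_output_v(coil):
    """Output voltage at the top of the range: range times sensitivity."""
    return coil["range_a"] * coil["sens_v_per_a"]


def ampere_turns(coil):
    """Magnetomotive force at the top of the range: range times turns."""
    return coil["range_a"] * coil["turns"]


if __name__ == "__main__":
    for coil in COILS:
        print(
            f"{coil['wire']}: full scale {full_scale_output_v(coil):.2f} v, "
            f"{ampere_turns(coil):.1f} ampere-turns"
        )
```

all three variants reach the same 2.5 v output at their maximum current, and they drive the sensor with a similar excitation of about 60 ampere-turns (62.5 for the 250-turn coil), which is consistent with scaling the output to the highest current to be measured.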
when a transducer is inserted in a primary circuit its resistance plays an important role, because it acts as an insertion resistance and can create an undesired voltage drop. for this reason, it is important to keep the transducer resistance as low as possible. the resistances of the realized transducers are 6 Ω for 0.25 a, 0.06 Ω for 2.5 a and 0.006 Ω for 10 a.

6. bus bar current transducers

currents ranging up to a few thousand amperes can be measured in a way similar to the previous two methods. in this case, instead of employing a pcb trace or an mbb, the idea is to conduct the current through an electrolytic copper bus bar and to fit the csa-1v in the middle of the bus bar to measure the current. rather than employing only one csa-1v, effective cancellation of stray fields without magnetic cores or shielding can be achieved by employing two csa-1vs. for this reason, the bus bar transducer is realized using the anti-differential configuration of two csa-1vs shown in figs. 8 and 9. a photo of the realized bus bar transducer is shown in fig. 25.

fig. 25 copper bus bar.

the skin effect within massive rectangular conductors such as the bus bar shown in fig. 25 can manifest itself at very low frequencies, of the order of several tens of hz. the skin effect has a major impact in rectangular bus bars [17, 18], with one of the major issues being a redistribution of the magnetic flux density [19]. for this reason, it is necessary to evaluate the transducer under both dc and ac current. frequency and phase measurements are conducted using the dc current source with modulation from 1 hz to 250 hz and using the power ac current source. the measurement results are shown in fig. 26. the blue curve is obtained using the dc source while the red curve is obtained using the ac source. the frequency ranges of the two current sources partly overlap.

fig. 26 frequency and phase responses of the bus bar transducer.
based on these measurements, it is evident that the skin effect becomes significant for frequencies higher than 20 hz. therefore, it is unnecessary to use the dc current source, and every subsequent measurement is performed using the ac current source. on the phase response graph (see fig. 26), the blue curve is obtained using the dc current source, the red curve is obtained using the ac current source, and the green curve represents the phase response of the csa-1v, which has a dominant role at high frequencies. the transducer phase response at low frequencies is influenced by the bus bar and its surroundings. the issue with the frequency dependence of the transducer sensitivity can be resolved by implementing at least one of the following methods, or a combination of them:
1. unsymmetrical placement of the csa-1vs with regard to the bus bar;
2. application of a magnetic filter;
3. application of an electronic filter;
4. cutting out notches in a bus bar in order to produce a restrictive region.

6.1. unsymmetrical placement of the csa-1vs

to evaluate the effect of the position of the csa-1vs on the bus bar, a series of measurements is performed in which both csa-1vs are placed at the same distance from the middle of the bus bar, as illustrated in fig. 27.

fig. 27 csa-1v positions on the bus bar (figure labels: d, bus bar, csa-1v, csa-1v).

since the skin effect forces current to flow along the outer edges of the bus bar, the idea is to find a mounting position for the csa-1vs at which the field changes originating from the current redistribution are smallest. the measurement results of this experiment are shown in fig. 28.

fig. 28 frequency and phase responses of the bus bar transducer with unsymmetrical placement of the csa-1vs.

based on these measurements, the ideal position to mount the csa-1vs is the one at which the frequency response is the flattest.

6.2.
magnetic filter

the frequency response can be improved using a passive method based on mounting a massive flat conductor above the bus bar and the csa-1v. eddy currents induced in this conductor will cancel the primary magnetic field inside it. consequently, the magnetic field lines will bypass the conductor and will instead concentrate between the bus bar and the conductor mounted above the csa-1v. moreover, the current distribution in the bus bar with the conductor mounted above it will not be the same as in the case without the conductor, i.e. the current density in the bus bar will be higher on the side closer to the conductor. to obtain a flat frequency response, a copper magnetic filter is employed in the way shown in fig. 29.

fig. 29 application of the magnetic filter on the transducer.

the frequency and phase responses of the bus bar transducer with the magnetic filter (see fig. 29) are shown in fig. 30.

fig. 30 frequency and phase responses of the bus bar transducer with the magnetic filter.

based on the measurements shown in fig. 30, it is evident that the magnetic filter reduces the sensitivity drop caused by the skin effect, i.e. it increases the sensitivity. as the sensitivity drop caused by the skin effect is roughly 40%, the magnetic filter reduces the initial impact of the skin effect by 10%. the magnetic filter also improves the phase response.

6.3. electronic filter

fig. 31 shows the electrical schematic of the bus bar transducer. the summation of the outputs from the two csa-1vs is performed using a differential amplifier ad623 with unity gain.

fig. 31 schematic of the bus bar transducer.

the idea behind employing an electronic filter to obtain a flatter transducer frequency response is to connect a resistor and a capacitor in series in place of the gain defining resistor rg.
the resistor is selected so that the amplifier's gain compensates the output signal decrease caused by the skin effect. the capacitor is selected so that its impedance begins to decrease when the skin effect begins to have an impact, meaning that its impedance is practically zero when practically the entire current flows along the outer edges of the bus bar. on this basis, a 220 kΩ resistor and a 2.2 nf capacitor are selected. the frequency and phase responses of the bus bar transducer with the electronic filter are shown in fig. 32.

fig. 32 frequency and phase responses of the bus bar transducer with the electronic filter.

based on these measurements it is evident that the electronic filter reduces the sensitivity drop while at the same time improving the phase response.

6.4. bus bar with the restrictive region

by cutting notches in a bus bar (see fig. 33), a nearly circular cross section of the restrictive region is obtained. for conductors with a circular cross section, redistribution of the current density does not affect the distribution of the magnetic field around the conductor. in this way the lateral skin effect is minimized.

fig. 33 bus bar with the restrictive region [20].

since the ac current flows through the restrictive region of the bus bar, the magnetic flux density around the restrictive region is greater than around the rest of the bus bar. in addition, the combination of the anti-differential configuration of hall effect sensors and a notched bus bar provides better immunity to stray magnetic fields, mainly because the hall effect sensors are close to each other. however, it should be noted that the notches may cause overheating at the restrictive region [20].

6.5. optimized bus bar current transducer

in order to improve the frequency response, i.e. to obtain a flat frequency response, an optimized bus bar current transducer combining the first three of the previously presented methods is realized.
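the behaviour of the electronic filter of section 6.3 can be sketched with an idealized model. the sketch assumes the ad623 datasheet gain law, gain = 1 + 100 kΩ / rg, and simply replaces rg by the complex impedance of the series resistor and capacitor; amplifier bandwidth and the sensor response are ignored, and the function names are hypothetical.

```python
import cmath
import math

R_GAIN_CONST = 100e3  # ohms; ad623 datasheet gain law: gain = 1 + 100 kohm / rg


def compensated_gain(freq_hz, r_ohm=220e3, c_farad=2.2e-9):
    """Complex amplifier gain with a series r-c network in place of rg (ideal model)."""
    if freq_hz < 0:
        raise ValueError("frequency must be non-negative")
    if freq_hz == 0:
        # The capacitor blocks dc, so rg is effectively open: unity gain.
        return complex(1.0, 0.0)
    z = complex(r_ohm, -1.0 / (2 * math.pi * freq_hz * c_farad))
    return 1.0 + R_GAIN_CONST / z


def corner_hz(r_ohm=220e3, c_farad=2.2e-9):
    """Frequency at which the capacitor reactance equals the series resistance."""
    return 1.0 / (2 * math.pi * r_ohm * c_farad)


if __name__ == "__main__":
    print(f"corner frequency: {corner_hz():.1f} hz")
    for f in (0, 20, 100, 1e3, 10e3):
        g = compensated_gain(f)
        print(f"{f:>8.0f} hz  |gain| = {abs(g):.3f}  "
              f"phase = {math.degrees(cmath.phase(g)):.1f} deg")
```

in this model the gain is unity at dc and rises towards 1 + 100/220, about 1.45, once the capacitor reactance becomes small compared with 220 kΩ, so the resistor sets how much of the skin effect drop is compensated and the capacitor sets where the compensation starts.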
the electronic filter is composed of a 330 kΩ resistor and a 2.2 nf capacitor, the csa-1vs are mounted 5 mm away from the middle of the bus bar and the magnetic filter is applied. overall, the obtained amplitude error is less than 1%, as shown in fig. 34.

fig. 34 frequency and phase responses of the optimized bus bar transducer.

the effect of the applied methods can easily be seen in the frequency response in fig. 34, because they result in a 55% better frequency response compared to the transducer without compensation.

6.6. braid bus bar

another way to minimize the skin effect is to employ a braid bus bar, such as the one shown in fig. 35, rather than a plain bus bar. a braid bus bar, consisting of thin insulated wires, produces a spatial averaging of the current density, so that the distribution of the magnetic field around the conductor is not frequency dependent.

fig. 35 braid bus bar.

the disadvantage of this solution is that it is not easy to achieve a rigid attachment between a flexible braid and a hall effect sensor. movement of the hall effect sensor relative to the braid bus bar results in a sensitivity change. therefore, where necessary, this issue must be properly addressed.

6.7. current transducer with magnetic shielded conductor

the skin effect in rectangular bus bars can be minimized or even eliminated by partial shielding of the bus bar. the idea is to fit ferromagnetic plates, shaped like the letter "c", on the side edges of the bus bar as shown in fig. 36.

fig. 36 partial shielding of the bus bar [21].

fig. 36 illustrates the current density distribution in a bus bar without (left bus bar) and with (right bus bar) the partial magnetic shield. this method is presented in patent [21] and discussed in paper [22].
optimization of the size and shape of the ferromagnetic shields can result in a significantly better transducer frequency response while keeping the dimensions of the bus bar the same. minimization of the skin effect reduces heating of the bus bar. the magnetic structures presented in [21], [22] also minimize the ac resistance, which can be useful for some applications.

7. conclusion

this paper reviews several types of coreless open-loop current transducers based on the hall effect sensor csa-1v, capable of measuring ac and dc currents ranging from several tens of milliamperes up to several hundreds of amperes. during the development of each transducer, special attention was paid to solving problems related to the frequency response. in addition, care was taken not to degrade the linearity and to achieve satisfactory immunity to stray magnetic fields. another goal of this paper is to expand the scope of use of the realized transducers by providing practical guidelines for designers faced with the challenges of current measurement using hall effect sensors.

the first experiments were related to the mbb transducers suitable for current measurements up to several tens of amperes. with mbbs the skin effect becomes noticeable at frequencies greater than 10 khz. the issue with the skin effect has been overcome by drilling a hole in the bus bar. the issue with stray magnetic fields has been overcome by implementing a ferromagnetic shield and the anti-differential configuration of two csa-1vs.

the second set of experiments was related to the transducers based on massive copper bus bars with cross sections which can handle currents up to several hundred amperes. these solutions employ different ways of positioning the csa-1vs relative to the bus bar, the application of a magnetic filter and the application of an electronic filter. some of the presented solutions for frequency response improvement are novel and have not previously been described in the literature.
acknowledgement: the research presented in this paper is financed by the ministry of education, science and technological development of the republic of serbia under the projects tr32057 and tr33035.

references

[1] r. s. popović, hall effect devices. institute of physics publishing, bristol and philadelphia, 2004.
[2] honeywell inc., "hall effect sensing and application," micro switch sensing and control, 2002.
[3] d. r. popović, s. dimitrijević, m. blagojević, p. kejik, e. schurig, r. s. popović, "three-axis teslameter with integrated hall probe free from the planar hall effect," in proc. of the instrumentation and measurement technology conference, sorrento, italy, no. 6384, pp. 24-27, 2006.
[4] d. r. popović, s. dimitrijević, m. blagojević, p. kejik, e. schurig, r. s. popović, "three-axis teslameter with integrated hall probe," ieee transactions on instrumentation and measurement, vol. 56, issue 4, pp. 1396-1402, 2007.
[5] d. m. veličković, s. r. aleksić, "a numerical procedure for solving skin effect integral equation in thin strip conductors," facta universitatis, series: electronics and energetics, vol. 14, no. 2, pp. 253-270, 2001.
[6] m. greconici, g. madescu, m. mot, "skin effect analysis in a free space conductor," facta universitatis, series: electronics and energetics, vol. 23, no. 2, pp. 207-215, 2010.
[7] r. s. popović, z. randjelović, d. manić, "integrated hall-effect magnetic sensors," sensors and actuators a: physical, vol. 91, pp. 46-50, 2001.
[8] r. s. popović, p. m. drljača, p. kejik, "cmos magnetic sensors with integrated ferromagnetic parts," sensors and actuators a: physical, vol. 129, pp. 94-99, 2006.
[9] sentron, csa-1v datasheet, 2008.
[10] sentron, "current sensing with the csa-1v," application note, 2008.
[11] m. blagojević, d. mančić, "programator strujnih i 2d magnetnih senzora," in proc. of the infoteh-jahorina 2007, jahorina, bosnia and herzegovina, no.
e-vi-10, 2007, in serbian.
[12] m. blagojević, s. dimitrijević, "programiranje strujnih senzora csa-1v i statisticka analiza," in proc. of the xiii conference yu info 2007, kopaonik, serbia, pp. 11-14, 2007, in serbian.
[13] m. blagojević, m. radmanović, "uređaj za kalibraciju strujnih senzora," in proc. of the infoteh-jahorina 2007, jahorina, bosnia and herzegovina, no. e-vi-11, 2007, in serbian.
[14] z. petrušić, i. jovanović, lj. vračar, d. mančić, m. blagojević, "a wirelles solution of measurement-control system for photovoltaic application," in proc. of the unitech'10 international scientific conference, gabrovo, bulgaria, vol. 1, pp. 114-122, 2010.
[15] m. blagojević, z. petrušić, d. mančić, m. radmanović, "termička analiza strujne sonde bazirane na senzoru csa-1v," in proc. of the xiii međunarodni simpozijum energetska elektronika ee 2005, novi sad, serbia, no. t4-4.8, pp. 1-5, 2005, in serbian.
[16] p. ripka, "current sensors using magnetic materials," journal of optoelectronics and advanced materials, vol. 6, no. 2, pp. 587-592, 2004.
[17] v. belevitch, "the lateral skin effect in a flat conductor," philips technical rev. 32, pp. 221-231, 1971.
[18] j. zhou, a. m. lewis, "thin-skin electromagnetic fields around a rectangular conductor bar," journal of physics d: applied physics, vol. 27, pp. 419-425, 1994.
[19] i. popa, a.-i. dolan, "numerical modeling of dc busbar contacts," facta universitatis, series: electronics and energetics, vol. 24, no. 2, pp. 209-219, 2011.
[20] m. blagojević, d. mančić, i. jovanović, z. petrušić, "current ampacity of bus-bar with neck for application in current transducers," in proc. of the unitech'10 international scientific conference, gabrovo, bulgaria, vol. 1, pp. 123-127, 2010.
[21] j. s. gallina, m. brand, "magnetic structure for minimizing ac resistance in planar rectangular conductors," patent us6105236a.
[22] t. mizuno, s. enoki, t. suzuki, t. asahina, m. noda, h.
shinagawa, "reduction in eddy current loss in conductor using magnetoplated wire," ieej transactions on fundamentals and materials, vol. 127, no. 10, pp. 611-620, 2007.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 329-338 doi: 10.2298/fuee1403329s

method for integrated circuits total ionizing dose hardness testing based on combined gamma- and x-ray irradiation facilities

armen v. sogoyan 1,2, alexey s. artamonov 1,2, alexander y. nikiforov 1,2, dmitry v. boychenko 1,2
1 national research nuclear university mephi, moscow, russia
2 jsc "specialized electronic systems" (spels), moscow, russia

abstract. a method is proposed for testing the total ionizing dose hardness of microelectronic parts based on a rationally balanced combination of gamma- and x-ray irradiation facilities. the scope of this method is identified, and a step-by-step algorithm of combined testing is provided, along with a test example of the method application.

key words: microelectronics, tid effects, gamma-ray, x-ray.

1. introduction

the testing of microelectronic parts, i.e., integrated circuits (ics), semiconductor devices, solid-state microwave electronics and electronic modules, for compliance with nuclear and space radiation hardness regulations can be based on various radiation facilities that initiate total ionizing dose (tid) effects [1], [2] in devices under test (dut). ever since the problem of radiation testing of microelectronic parts first arose, tid effects have been induced in the laboratory mainly by gamma irradiation test facilities based on co-60 sources. every isotope-based gamma irradiation facility is a unique and complex installation with full-scale biological protection of personnel, commonly designed under a dedicated project. other types of tid radiation test facilities are also widely used, such as electron accelerators, other isotopic sources (cs-137) and nuclear reactors.
in all cases the radiation test installation is aimed at reproducing characteristics equivalent to real-world radiation factors and their effects.

as gamma quanta have high energy (about 1 mev), they have high penetrating power, and the total ionizing dose in the active areas depends only weakly on the design and process specifics of the dut. at the same time, in order to provide radiation safety, gamma irradiation facilities require significant signal line lengths (usually from 6 up to 25 meters) from the dut to the measuring hardware. such remote measurements usually fail to test all the necessary modes and conditions of dut operation under irradiation. moreover, a substantial part of the informative dut parameters (including those related to precision and high frequency performance) becomes totally immeasurable at such a distance. gamma irradiation facilities have low general availability due to strict radiation safety regulations, and it is impossible to use such a facility directly within the ic design and manufacturing process. as a result, this method of testing does not have a very compelling business case in its favor.

received february 28, 2014
corresponding author: aleksandr y. nikiforov, national research nuclear university mephi, moscow, russia (e-mail: aynik@spels.ru)

to overcome this downside of gamma irradiation facilities, a new tid simulation test method was developed in the late 80s and early 90s using relatively compact x-ray irradiators with low energy (10...100 kev). in tests with x-ray facilities, the intensity is tuned so as to produce a change in parameters, faults and failures of electronic components equivalent to that produced by the real-world ionization sources having the same dominant effect. x-ray testers (e.g., produced by aracor, usa or spels, russia) have been installed in many companies specialized in microelectronics research and development.
the main advantage of x-ray testers is their radiation safety (a 2 mm iron shield is enough for a 10 kev source), together with very short signal lines (less than 1 meter) and good compatibility with automated control and measurement tools (including wafer probes). the introduction of x-ray testers for microelectronics tid hardness testing was accompanied by theoretical and experimental verification and research to substantiate the equivalence of the tid effects of various types of radiation [3]-[11]. as a result, x-ray testers were incorporated into microelectronic processes and test standards [12], [13].

2. use of x-ray irradiation facilities

the main issue restraining the application of x-ray testers is their low energy and, consequently, the low penetration of x-ray radiation, as well as the substantial dependence of the tid absorbed in active areas on the design and process specifics of the dut. all of this necessitates advanced expert skills to ensure quantitative tid assessment (i.e. dosimetric evaluation) given the process diversity of microelectronic parts, the multitude of packages used, etc.

a substantial number of the microelectronic parts tested today are sophisticated chips used in modern apparatus. test customers tend to minimize the number of tested samples of each type to 3...5. many types of microelectronic parts have plastic packages. dosimetric evaluation of such samples is rather complex, because in most cases the manufacturer fails to provide data on the component design, layout, process used, chemistry of the package, etc.

therefore, in this work we have tried to overcome the disadvantages of gamma and x-ray radiation test sources, specifically for microelectronics tid research, by using the inherent benefits of both: relying on the compact and safe x-ray source and rationally limiting the use of gamma sources to the necessary cases only.

3.
scope of joint testing the joint method of tid hardness testing based on gammaand x-ray irradiation facilities has been designed to enhance precision and quality of x-ray based simulation testing defined in [13]. it covers packaged and caseless silicon-based cmos circuits (i.e., with monosilicon, epitaxial, silicon-on-sapphire and silicon-on-insulator structures), as well as bipolar and bicmos (including sige) ics. to be admitted to tests, microelectronic parts have to meet the following conditions:  number of samples: 3 or more  samples taken from the same production lot, with clearly identified samples. 4. calibration method in x-ray dosimetry the method of calibration is commonly used. the most tid sensitive parameter of the device under test is chosen as a calibration parameter and denoted as qk. it is assumed that the x-ray dose is equivalent to the  radiation dose (d), if they both produce an identical radiation-induced change in the calibration parameter under identical testing conditions (mode, temperature, time from start of irradiation till measurement): dэ(qk) = d (qk). d(qk) is called the calibration curve; it is determined based on the test results on a gamma irradiation facility. based on this curve, the tested sample tid sensitivity is "calibrated". as calibration parameter qk, we propose to choose such electrical parameter of the product, the radiation-induces change of which is determined by tid effects. additional requirements to be met by the calibration parameter are: ease of measurement, a higher sensitivity to d and a long linear or, at least, "smooth" monotonous interval with qк=qк(d), lower susceptibility to electromagnetic interference and crosstalk. 5. algorithm of combined testing microelectronics tid hardness testing procedure on gamma and x-ray facilities is based on the following algorithm. 1. predicting the level of tid hardness and selection of the most sensitive operating mode. 
the following prediction methods can be used (in descending priority):
- based on the lab's own previous experience in testing a given part type, or other products of a given manufacturer;
- based on formally published results of previous testing of a given part type or other products of a given manufacturer, provided by other test labs;
- based on formally published results of previous tests of similar parts provided by a given manufacturer, including technical specifications;
- based on results of previous tests of functionally similar parts provided by various other manufacturers;
- based on databases, articles, advertising and other informal sources.
such a prediction results in a preliminary selection of a particular calibration parameter from the various device under test (dut) parameters listed in the test procedure, as well as a selection of the most tid-sensitive electric and operating modes. if there is no technical evidence in favor of a particular electric mode, we recommend opting for the mode with the maximum supply voltage according to specifications.
2. analysis of dut design and estimation of the x-ray package (coating) attenuation ratio. the attenuation ratio is estimated based on the type, thickness and chemical composition of the package (protective coating) of the dut.
3. x-ray irradiation of a dut sample, measuring all the criterial parameters specified in the test procedure, in the selected operating mode under normal climatic conditions. to make a preliminary selection of the calibration parameter and the criterial parameters, the $q = q(d_x)$ dependency should be identified. the power of x-ray radiation absorbed on the crystal surface, based on the estimated attenuation ratio, should fall in the range of x-ray irradiation facility power used for calibration.
irradiation proceeds until the sample fails in most of the criterial parameters, or until the level of exposure at which the radiation-induced change of the pre-selected calibration parameter and criterial parameters exceeds the measurement error by a factor of 100. when choosing an irradiation mode, the following condition should be met: $t_{rad} > 10\,t_{meas}$, where $t_{rad}$ is the full exposure time and $t_{meas}$ is the total time of parameter measurement during irradiation. in case of low radiation sensitivity of the calibration parameter and other criterial parameters (the initial value changes by less than 100 times the measurement error), hardness is assessed on a smaller number of samples (but still at least 2 samples) on a gamma irradiation facility.
4. gamma irradiation of a dut sample, measuring all the criterial parameters in the selected operating mode under normal climatic conditions. to make a preliminary selection of the calibration parameter and the criterial parameters, the $q = q(d_\gamma)$ dependency should be identified. the power of gamma radiation absorbed should fall in the range 0.5...2.0 of the x-ray radiation power absorbed on the crystal surface (item 3), in view of the estimated attenuation ratio. irradiation continues until $d_0$ is reached, or the sample fails in most of the criterial parameters, or until the level of exposure at which the radiation-induced change of the pre-selected calibration parameter and criterial parameters exceeds the measurement error by a factor of 100. the tid is measured by the standard dosimetric methods of the gamma irradiation facility. when choosing an irradiation mode, the following condition should be met: $t_{rad} > 10\,t_{meas}$, where $t_{rad}$ is the full exposure time and $t_{meas}$ is the total time of parameter measurement during irradiation.
5. comparative analysis of x-ray and gamma irradiation test results. a decision is made on the feasibility and validity of x-ray tests, and the calibration factor is estimated.
6. applicability of combined testing
the method of joint testing is applicable if it is possible to build the calibration transformation
$$d_\gamma = k\,d_x, \tag{1}$$
where k is a factor for which the dependencies $q_{kx}(d_x)$ and $q_k(d_\gamma)$ are approximately similar:
$$\max_{d_\gamma}\left|\frac{q_{kx}(d_\gamma/k) - q_k(d_\gamma)}{q_k(d_\gamma)}\right| \le \sqrt{0.04 + \delta^2}, \tag{2}$$
where $\delta$ is the relative instrumental error for q (according to the measurement tool data sheet), $q_k(d_\gamma)$ is the dependence of the criterial parameter on $d_\gamma$ obtained on the gamma irradiation facility (item 4), and $q_{kx}(d_x)$ is the dependence of the criterial parameter increment on the exposure level $d_x$ obtained on the x-ray irradiation facility (item 3). the k-factor in relationship (1) can be estimated by the least squares method. condition (2) should be verified at least at two points of $d_\gamma$. when condition (2) is met, the calibration-based dosimetry method is considered applicable. lot #1 of $n_\gamma$ samples is tested on a gamma irradiation facility, and lot #2 of $n_x$ samples is tested on an x-ray irradiation facility, where $n_x > n_\gamma$. both lots are tested in an identical electric mode and under the same climatic conditions. the method to estimate the k-factor depends on the nature of the functions $q_i(d_\gamma)$, where i is the number of a sample in lot 1: $i = 1 \ldots n_\gamma$. as the calibration parameter, we recommend selecting the one with the highest relative radiation-induced increment. if there are multiple criterial parameters with close relative increment values (within 20%), the conditions outlined below apply to each parameter. if, in the tid range $0 \ldots d_0$, the $q_i(d_\gamma)$ dependency has a maximum in the neighborhood of $d_{i\max}$, it is normalized to the value $q_{il}$ measured at the dose $d_{il}$ closest to $d_{i\max}$. if, within the dose range $0 \ldots d_0$, the $q_i(d_\gamma)$ dependency has several maxima, the main maximum should be selected. if there is no maximum, the dependency is not normalized.
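the least-squares estimate of k in relationship (1) and the mismatch check behind condition (2) can be illustrated with a minimal python sketch. it is not the authors' implementation: the function names, the grid search over candidate k values and the use of linear interpolation of the x-ray curve are assumptions made for illustration, and the acceptance tolerance is deliberately left to the caller because it follows from condition (2) and the instrumental error of the measurement tool.

```python
import numpy as np

def fit_k(d_gamma, q_gamma, d_x, q_x, k_grid):
    """Grid-search least-squares estimate of k (hypothetical helper).

    Picks the k for which the x-ray curve, evaluated at the exposure
    d_gamma / k, is closest to the gamma curve in the least-squares sense.
    d_x must be sorted in increasing order (requirement of np.interp).
    """
    d_gamma = np.asarray(d_gamma, dtype=float)
    q_gamma = np.asarray(q_gamma, dtype=float)
    best_k, best_sse = None, np.inf
    for k in k_grid:
        # x-ray curve at the exposure that corresponds to each gamma dose
        q_pred = np.interp(d_gamma / k, d_x, q_x)
        sse = float(np.sum((q_pred - q_gamma) ** 2))
        if sse < best_sse:
            best_k, best_sse = float(k), sse
    return best_k

def max_relative_mismatch(d_gamma, q_gamma, d_x, q_x, k):
    """Largest relative difference between the two curves for a given k."""
    d_gamma = np.asarray(d_gamma, dtype=float)
    q_gamma = np.asarray(q_gamma, dtype=float)
    q_pred = np.interp(d_gamma / k, d_x, q_x)
    return float(np.max(np.abs(q_pred - q_gamma) / np.abs(q_gamma)))

# synthetic, made-up curves (not measured data): both are linear, true k = 0.03
d_gamma = np.arange(10.0, 70.0, 10.0)
q_gamma = 0.06 * d_gamma
d_x = np.linspace(0.0, 2000.0, 41)
q_x = 0.0018 * d_x

k_hat = fit_k(d_gamma, q_gamma, d_x, q_x, np.geomspace(0.01, 0.1, 2001))
mismatch = max_relative_mismatch(d_gamma, q_gamma, d_x, q_x, k_hat)
print(k_hat, mismatch)
```

the returned mismatch would then be compared with the tolerance of condition (2) at two or more gamma dose points; a finer grid or a scalar optimizer could replace the grid search without changing the idea.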
the calibration level $q_0$ is selected. it should be close to the value corresponding to the parameter tolerance boundary specified for the tested sample. for the j-th sample of lot 1, $j = 1 \ldots n_\gamma$, the tid value $d_{\gamma j}$ is determined from the experimental dependency $q_j(d_\gamma)$ by the condition
$$q_j(d_{\gamma j}) = q_0. \tag{3}$$
if necessary, linear interpolation of the dependency $q_j(d_\gamma)$ can be used to determine $d_{\gamma j}$ from (3). similarly, the values $d_{xi}$, $i = 1 \ldots n_x$, are defined for lot 2. then the point estimate of the calibration factor k is made:
$$k = \frac{\bar{d}_\gamma}{\bar{d}_x}, \tag{4a}$$
$$\bar{d}_x = \frac{1}{n_x}\sum_{i=1}^{n_x} d_{xi}, \tag{4b}$$
$$\bar{d}_\gamma = \frac{1}{n_\gamma}\sum_{i=1}^{n_\gamma} d_{\gamma i}. \tag{4c}$$
when there are multiple criterial parameters with close relative values of increments, the calibration parameter is the one for which
$$\frac{1}{n_x^2\,\bar{d}_x^{\,2}}\sum_{i=1}^{n_x}\left(d_{xi}-\bar{d}_x\right)^2 + \frac{1}{n_\gamma^2\,\bar{d}_\gamma^{\,2}}\sum_{i=1}^{n_\gamma}\left(d_{\gamma i}-\bar{d}_\gamma\right)^2 \tag{4d}$$
has the smallest value. the lower boundary $k_l$ of the calibration factor confidence interval is calculated:
$$k_l = k\,\frac{1-\sqrt{q_1+q_2-q_1 q_2}}{1-q_1}, \tag{5}$$
$$q_1 = t^2_{1-\alpha/2,\;n_x+n_\gamma-2}\left(\frac{1}{n_x}+\frac{1}{n_\gamma}\right)\frac{\sum_{i=1}^{n_x}\left(d_{xi}-\bar{d}_x\right)^2}{(n_x+n_\gamma-2)\,\bar{d}_x^{\,2}},$$
$$q_2 = t^2_{1-\alpha/2,\;n_x+n_\gamma-2}\left(\frac{1}{n_x}+\frac{1}{n_\gamma}\right)\frac{\sum_{i=1}^{n_\gamma}\left(d_{\gamma i}-\bar{d}_\gamma\right)^2}{(n_x+n_\gamma-2)\,\bar{d}_\gamma^{\,2}},$$
where $t_{1-\alpha/2,\,n}$ is the quantile of the student distribution with n degrees of freedom at the level $1-\alpha/2$. the confidence level $p = 1-\alpha$ is defined in the regulatory and technical documentation. if its value is not set, it is assumed to be 0.95 according to radiation test standards. the value $k = k_l$ is taken as the calibration factor. the ratio $k/k_l > 1$ plays the role of a testing norm which depends on the number of samples tested.
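the two certain steps of this procedure, finding the dose at which a sample reaches the calibration level by linear interpolation (condition (3)) and forming the point estimate of k as the ratio of the mean gamma dose to the mean x-ray exposure (relationships (4a)-(4c)), can be sketched in python as follows. the function names are hypothetical, the curve in the interpolation example is made up, and the lot values are those reported in the worked example of section 7.

```python
import numpy as np

def dose_at_level(dose, q, q0):
    """Dose at which the parameter curve first reaches q0 (linear interpolation).

    Assumes dose is increasing and q is the measured parameter increment.
    """
    dose = np.asarray(dose, dtype=float)
    q = np.asarray(q, dtype=float)
    reached = np.nonzero(q >= q0)[0]
    if reached.size == 0:
        raise ValueError("calibration level q0 is never reached")
    i = int(reached[0])
    if i == 0:
        return float(dose[0])
    # interpolate between the last point below q0 and the first point at or above it
    frac = (q0 - q[i - 1]) / (q[i] - q[i - 1])
    return float(dose[i - 1] + frac * (dose[i] - dose[i - 1]))

def calibration_factor(d_gamma, d_x):
    """Point estimate of k: mean gamma dose divided by mean x-ray exposure."""
    return float(np.mean(d_gamma) / np.mean(d_x))

# made-up curve, only to show the interpolation step
print(dose_at_level([0, 10, 20, 30], [0.0, 1.0, 2.5, 4.0], q0=3.0))

# lot values from the worked example in section 7
k = calibration_factor([51.6, 44.6], [1734, 1733, 1521, 1488, 1569])
print(round(k, 4))  # 0.0299
```

the confidence-interval step is intentionally left out of this sketch; it needs a student quantile, which could be taken from a statistics library.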
the relative dosimetry error $\delta_d$ in such a case is affected by the relative errors of gamma ($\delta_\gamma$) and x-ray ($\delta_x$) dosimetry:
$$\delta_d = (1+\delta_\gamma)(1+\delta_x) - 1. \tag{6}$$
dosimetric conformity of products is regulated by radiation test standards.
7. combined testing example
as a test example, we have chosen a typical integrated circuit, the hef4013bt, which is a dual cmos d-type flip-flop manufactured by nxp semiconductors. let us estimate the calibration factor for the hef4013bt. as the sample was irradiated, we monitored its operation and measured the acceptability criteria (uoh, ioh, iol, icch, iccl) versus the level (time) of exposure (see fig. 1).
fig. 1 experimental dependences of selected hef4013bt parameters versus exposure time: a) uoh, b) ioh, iol, c) icch, d) iccl
next, we have to assess the applicability of the method. for this purpose, we expose the circuit in a gamma irradiation facility (sample 13) and in an x-ray irradiation facility (sample 6). fig. 2 shows the matching of the dependencies of the supply current increment in the set mode for these samples. the calibration transformation factor (1) was estimated by the least squares method. at k = 0.0328, relationship (2) is valid even at δ = 0 at no fewer than three different exposure levels. therefore, we can conclude that the combined test method is applicable to this particular sample. next, the two lots of integrated circuits are irradiated. the first lot (2 samples, including sample #13) is exposed in a gamma irradiation facility, while the second lot (5 samples, including #6) is exposed in an x-ray facility. the supply current in the set state (icch) is selected as the calibration parameter. since the dependence of the parameter increment on the exposure level is monotonic, it is not normalized.
fig. 2 matching of dependencies of supply current in the set mode at exposure of hef4013bt in a gamma irradiation facility (sample 13) and an x-ray irradiation facility (sample 6) at k = 0.0328.
the calibration level $q_0 = 3$ ma is selected. for the j-th sample of lot 1, $j = 1 \ldots n_\gamma$, the tid value $d_{\gamma j}$ is determined from the experimental dependency $q_j(d_\gamma)$ by the following condition (fig. 1):
$$q_j(d_{\gamma j}) = q_0.$$
fig. 3
then the levels of exposure $d_{\gamma i}$ matching the $q_0$ criterion are determined, resulting in $d_\gamma$ = {51.6, 44.6}. similarly, the values $d_{xi}$, $i = 1 \ldots n_x$, are determined for lot 2: $d_x$ = {1734, 1733, 1521, 1488, 1569}. then the point estimate of the calibration factor k is made:
$$\bar{d}_x = 1609, \quad \bar{d}_\gamma = 48.1, \quad k = \frac{\bar{d}_\gamma}{\bar{d}_x} = 0.0299.$$
the lower boundary $k_l$ of the calibration factor confidence interval is calculated at p = 0.95: $k = k_l = 0.025$. the relative error $\delta_x$ of measuring the x-ray exposure duration under automatic source control is below 1%, therefore the dosimetry testing error is determined by the relative error of gamma irradiation dosimetry $\delta_\gamma$, which is 15% according to the dosimetric system data sheet. if the case for x-ray testing is proven, the informative parameters of electronic components that are immeasurable under gamma irradiation conditions are measured on the x-ray source; otherwise the entire test is run on the gamma irradiation facility.
8. conclusion
the method of microelectronics tid hardness assurance testing based on a combination of gamma and x-ray irradiation facilities clarifies and develops the method of x-ray test dosimetry specified in regulatory documents.
this method can improve the reliability of x-ray test dosimetry by fully combining, within a single test cycle, the capabilities and benefits of both gamma irradiation facilities, which ensure the adequacy of test effects, and x-ray irradiation facilities, which make it possible to determine all informative parameters of electronic components (including precision and performance) and to check all operating modes and conditions directly under irradiation. the newly proposed method of combined electronic component testing offers the benefit of working with small sample lots and presents clear applicability criteria.
acknowledgement: the authors wish to thank dr. yuriy bogdanov of the physics and technology institute, ras, moscow, for his great contribution to the statistical approach.
references
[1] artamonov a.s., chumakov a.i., eremin n.v., figurov v.s., kalashnikov o.a., nikiforov a.y., sogojan a.v. 'reis-ie' x-ray tester: description, qualification technique and results, dosimetry procedure // ieee radiation effects data workshop, 1998, pp. 164-169.
[2] belykov v.v., pershenkov v.s., zebrev g.i., sogoyan a.v., chumakov a.i., nikiforov a.y., skorobogatov p.k. methods for the prediction of total-dose effects on modern integrated semiconductor devices in space: a review // russian microelectronics, 2003, 32 (1), pp. 31-47.
[3] fleetwood, d.m.; winokur, p.s.; schwank, j.r. using laboratory x-ray and cobalt-60 irradiations to predict cmos device response in strategic and space environments. ieee transactions on nuclear science. vol 35, 1988, pp. 1497-1505.
[4] palkuti, leslie j.; lepage, james j. x-ray wafer probe for total dose testing. ieee transactions on nuclear science. vol 29, 1982, pp. 1832-1837.
[5] fleetwood, d. m.; beegle, r. w.; sexton, f. w.; winokur, p. s.; miller, s. l.; treece, r. k.; schwank, j. r.; jones, r. v.; mcwhorter, p. j. using a 10-kev x-ray source for hardness assurance. ieee transactions on nuclear science. vol 33, 1986, pp. 1330-1336.
[6] dozier, c. m.; brown, d. b.; throckmorton, j. l.; ma, d. i. defect production in sio2 by x-ray and co-60 radiations. ieee transactions on nuclear science. vol 35, 1985, pp. 4363-4368.
[7] oldham t.r., mcgarrity j.m. comparison of co-60 response and 10 kev x-ray response in mos capacitors // ieee trans. 1983, vol. ns-30, n 6, p. 4377.
[8] chumakov a.i., nikiforov a.y., telets v.a., sogoyan a.v. ic space radiation effects experimental simulation and estimation methods // radiation measurements, 1999, v. 30.
[9] chumakov a.i., nikiforov a.y., pershenkov v.s., skorobogatov p.k. ic's radiation effects modeling and estimation // microelectronics reliability, 2000, v. 40, #12.
[10] nikiforov a.y., chumakov a.i. simulation of space radiation effects in microelectronic parts // effects of space weather on technology infrastructure. 2004, kluwer academic publishers, netherlands.
[11] sogoyan a.v. assessment of tid hardness of cmos vlsi exposed to pulsed radiation // russian microelectronics, 2011, v. 40, #3, pp. 200-208.
[12] astm f1467 11 standard guide for use of an x-ray tester (≈10 kev photons) in ionizing radiation effects testing of semiconductor devices and microcircuits.
[13] astm e666 09 standard practice for calculating absorbed dose from gamma or x radiation.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 299-314
https://doi.org/10.2298/fuee2302299m
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
analysis of portable system for sound acquisition of vehicles powered by internal combustion engines
marko milivojčević1, emilija kisić2, dejan ćirić3
1academy of technical and art applied studies, school of electrical and computer engineering, belgrade, serbia
2metropolitan university, faculty of information technology, belgrade, serbia
3university of niš, faculty of electronic engineering in niš, niš, serbia
abstract.
in this paper a portable system for acquisition of sound generated by passenger vehicles powered by internal combustion engines is described and analyzed. the acquisition system is developed from scratch and tested in order to satisfy requirements such as high quality of audio recordings, high mobility, robustness and respect for privacy. with this acquisition system and adequate signal processing, the main goal was to collect a large amount of clear audio recordings that will form a quality dataset. in further research, this dataset will be used for machine learning model training and testing, i.e. for developing a system for automatic recognition of the type of car engine based on fuel.
key words: acoustic based acquisition system, dataset, audio signals, internal combustion engines
1. introduction
applications of artificial intelligence algorithms to audio signals are becoming more numerous over time [1-4]. sound classification, audio event detection and audio scene recognition are examples of tasks that are successfully realized in practice by applying machine or deep learning [5, 6]. in this context, machine and deep learning could be used to identify the type of internal combustion engine with regard to the fuel, based on the sound generated by the engine. namely, the sound of these engines differs depending on the fuel used: petrol (gasoline) or diesel. the human ear can recognize this difference, that is, whether it is the sound of a petrol or a diesel engine. those facts and the need to classify passenger vehicles by fuel as a result of improved environmental standards [7] have served as major pillars of the present research.
its main aim is to develop a system for automatic recognition of engine type based on the sound generated by the engine, that is, to build a machine/deep learning model that will be able to recognize the type of engine with high accuracy, where the input to the model is the engine sound. since the successful implementation of machine/deep learning requires an adequate dataset (containing, in this case, audio samples), a specialized acquisition system has been developed for this purpose. details of the development of this acquisition system for the collection of audio samples of passenger vehicles powered by internal combustion engines are presented here. the first requirement that the acquisition system should satisfy is automation, because manual collection of a large number of samples would require a lot of time and might introduce differences in conditions during the acquisition. in addition, the collected data should meet the requirements for quality, duration and invariability of environmental conditions in order to provide reliable information on the acoustic characteristics. the paper is divided into several sections. the technical characteristics of the system, the hardware configuration and selection of components, as well as the acquisition procedure and the processing of the collected audio signals, are presented in the section related to methodology.
received november 09, 2022; revised january 18, 2023; accepted february 06, 2023. corresponding author: marko milivojčević, academy of technical and art applied studies, school of electrical and computer engineering, belgrade, serbia, e-mail: markom@viser.edu.rs
the section describing the results provides a tabular presentation of the system efficiency for three cases of the time interval between the start of detection of two consecutive vehicles, as well as a presentation of audio signals in the time and spectral domain as a measure of the validity of the obtained images for further analysis with a machine or deep learning system. the paper ends with concluding remarks.
2. acquisition system and procedure description
in the earlier phases of the research, the influence of the microphone position in the area below the engine compartment on the characteristics of audio recordings was analyzed in detail [8]. it was determined that the basic characteristics of the audio signal varied only minimally with the position of the measuring microphone, as long as the microphone was directly below the engine compartment [8]. depending on the vehicle, the target area where the microphone can be placed below the engine is approximately 1.2 m by 1.2 m. because of that, it is possible to collect relevant audio samples regardless of the exact position of the vehicle when it is stopped above the microphone. based on these findings, the acquisition system uses a microphone positioned in the area below the engine compartment, chosen as the most suitable area in terms of "purity" of sound [8, 9], and audio recording begins only after the presence of the vehicle is detected. in this way, audio samples of engine operation in the idle mode are collected without the microphone itself being mounted on the vehicle. the system has been developed to be mobile, so that it can be set up independently of the availability of power sources, and it is fully designed to run on battery power. additionally, the system is designed to be autonomous, i.e., not to require human presence during operation.
as the system has limited memory space, it was necessary to develop several verification steps before the current audio sample is written to memory. specifically, this system has four levels of verification before storing the audio recording, which results in a dataset of recordings that contains only sounds of interest, i.e., engine operation. when the system is applied in real conditions involving interfering sources of noise and different engine load modes, despite a large number of successfully collected audio recordings, some recordings containing not only the desired engine mode but also other engine modes appeared in the formed dataset. so, it was necessary to develop a procedure that detects and then extracts the idling mode of the internal combustion engine. in order to give the system as much autonomy as possible, the requirement for minimum energy consumption dictated the simplest possible procedure for separating the desired engine mode. thus, the procedure for extracting the engine idle mode applied here is based on audio signal processing in the time domain, i.e., on the signal envelope. it is worth mentioning that the number of recordings containing only the engine idling mode is also affected by the minimum time period between the start of detection of two consecutive vehicles.
2.1. acquisition system
the main goal of collecting audio samples of engine operation is to make a dataset containing sounds of passenger vehicles recorded in real conditions. in this way, the future classification system will be able to work properly in such conditions, such as those at entrances to underground garages, toll plazas, gas stations, etc. the generated dataset of audio samples should preferably have characteristics that enable its usage in different machine and deep learning approaches [10].
they include support vector machine (svm) [11], k-nearest neighbors (k-nn) [12], deep forest [13] or various deep neural network architectures such as the multilayer perceptron [14] or the convolutional neural network [15]. for this purpose, the audio samples may be transformed either into a selected set of features or into images, such as spectrogram-like images, or they may be used in the existing format (raw audio signals). the entrance to an underground garage with a ramp, where the vehicle has to stop until the driver takes the card / token, was chosen as the most suitable space for collecting audio samples. during this period, the car is static and idling. even if it has a start / stop system, it will run in idle mode for a certain period of time. in addition, in such a situation the movement of the vehicle is directed so that there is no possibility of mechanical damage to the microphone and sensor that are placed on the ground in the space between the wheels. the block diagram of the system is presented in fig. 1, and the realized system in a laboratory environment is shown in fig. 2.
fig. 1 block diagram of the sound acquisition system
fig. 2 realized acquisition system in laboratory environment
the system has been developed so that the presence of vehicles is detected with ultrasonic sensors before the process of recording the engine operation sound begins (the first level of verification). ultrasonic sensors were selected primarily because, unlike widely used cameras, they do not affect user privacy. also, these sensors are among the cheapest on the market, have very low power consumption, and are accurate enough to detect vehicles. this type of vehicle detection enables the installation of the system almost everywhere, because there is no possibility of interference with any existing induction sensors at the entrance ramp or of violating the law related to user privacy.
in order to avoid detection of objects that are not vehicles of interest, two sensors are used. the sensors are positioned so that one measures the distance along the horizontal (x) axis and the other one along the vertical (y) axis. the plane formed by the ultrasound sensors is not perpendicular to the direction of vehicle movement, as shown in fig. 3. by using the sensors placed in the described way, the possibility of detecting two-wheelers and pedestrians that might also appear at the ramp is eliminated. namely, due to the position and orientation of the sensor placed on the ground, two-wheelers can only be detected if they pass directly above, i.e., over the sensor. however, even in such a case, they will not meet the distance requirement of the side sensor if they move in the intended direction of entering the garage. if the sensors were located in the same plane, it would theoretically be possible for a motorcycle to be oriented perpendicularly to the intended direction of movement of the vehicle, i.e., above the ground sensor and facing the side sensor with the front or rear wheel. with the sensors positioned in two planes, a motorcycle would have to be in an almost impossible position to enter the garage, i.e., it would need to hit the ramp in order to satisfy the condition of vehicle presence on both sensors. in a similar manner, if both sensors were in the same plane, a pedestrian above the ground sensor could move in tandem with another pedestrian who would satisfy the condition of the side sensor. however, if the distances are measured in different planes, it is more difficult and less likely to meet the condition of vehicle presence on both sensors.
fig. 3 the acquisition system positioned at the entrance to the underground garage, where the horizontal (x) and vertical (y) axes as well as the horizontal and vertical planes, the latter being also the plane formed by the axes, are presented
readily available waterproof ultrasonic distance measurement modules containing the jsn-sr04t ultrasonic sensor, whose specification is given in [16], are used in the acquisition system. these modules, that is, sensors, are controlled by a microcontroller within the arduino nano platform [17], where the distances are set for the specific measurement case. distance measurement is realized by the short-term emission of an ultrasonic signal triggered by the arduino, after which the arduino measures the time until the reflected signal appears. the distance to an obstacle is calculated from the measured time required for the signal to reach the obstacle and return, and from the speed of sound in air. since measurement of the distance to the vehicle does not require precision better than 1 cm, the best results were obtained with a trigger signal lasting 10 microseconds. thus, if both sensors detect an object (the horizontal sensor at a distance of less than 80 cm and the vertical sensor at a distance of less than 40 cm), the microcontroller registers a vehicle presence and sends this information via serial communication to the raspberry pi computer [18]. this computer is the central part of the acquisition system. the reason for using an additional microcontroller alongside the raspberry pi, which can also control and read ultrasonic sensors, is the need to detect vehicle presence continuously, i.e., in parallel with recording the audio. with both the raspberry pi and the additional microcontroller (arduino nano), the two activities that have to run in parallel, vehicle detection and audio recording, can be realized in an easier and more reliable way.
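the distance calculation and the two-sensor presence condition described above can be illustrated with a short python sketch. in the described system this logic runs on the arduino nano, whose firmware is not given here, so the function names and the assumed speed of sound (343 m/s) are illustrative only; the 80 cm and 40 cm limits are the values quoted in the text.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C

def echo_time_to_distance_cm(echo_time_us, speed_m_s=SPEED_OF_SOUND_M_S):
    """Convert a round-trip echo time in microseconds to a one-way distance in cm."""
    round_trip_m = speed_m_s * echo_time_us * 1e-6
    # the pulse travels to the obstacle and back, hence the division by two
    return round_trip_m / 2.0 * 100.0

def vehicle_present(horizontal_cm, vertical_cm,
                    horizontal_limit_cm=80.0, vertical_limit_cm=40.0):
    """Presence is registered only when both sensors see an object within their limits."""
    return horizontal_cm < horizontal_limit_cm and vertical_cm < vertical_limit_cm

# an echo that returns after 2 ms corresponds to about 34 cm
distance = echo_time_to_distance_cm(2000)
print(distance, vehicle_present(horizontal_cm=60.0, vertical_cm=distance))
```

requiring both conditions at once is what rejects a two-wheeler or pedestrian that satisfies only one of the sensors.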
each audio sample is recorded with an omnidirectional microphone that is placed on the ground in the area below the vehicle. in this way, in almost all cases, the microphone is positioned directly below the vehicle's engine after the vehicle is stopped in front of the ramp. in order to obtain the highest quality audio recordings, the akg c562cm omnidirectional microphone is used, with the specifications listed in [19] and presented in fig. 4.
fig. 4 characteristics of akg c562cm microphone: (a) frequency and (b) polar response [19]
the microphone and the ultrasonic sensor that measures the vertical distance are placed in a purpose-made cable protector (fig. 5a) made of industrial rubber with a hardness of 90 shore a. for the purpose of collecting audio recordings, the edges of this cable protector had to be cut at an appropriate angle so that the sound of a wheel crossing over it is negligible in the recordings. the angle was determined empirically and was approximately 150°. in addition to protecting the cables that connect the microphone and ultrasonic sensor to the rest of the system, the cable protector is designed to protect both the microphone and the sensor in case the vehicle wheel passes directly over them (fig. 5b). when the cable protector is placed at the measuring position, it is not necessary to fasten it to the base, because it does not slip or move thanks to the structure of the rubber and its width of 20 cm. the protector stayed at almost the same position during the acquisition regardless of how large and heavy the vehicles passing over it were.
fig. 5 cable and ultrasonic sensor protection: (a) purpose-made cable protector and (b) microphone/sensor protection
the hardware limitations of the raspberry pi computer in terms of maximum sampling frequency and number of bits for audio signal quantization, as well as the need for microphone phantom power, resulted in the insertion of an a/d converter between the microphone and the raspberry pi computer. for the purpose of a/d conversion and microphone power supply, a dedicated high-quality audio interface, the irig pre hd, is employed, which is also a battery-powered device whose specifications are given in [20]. additionally, the use of an external a/d converter enables the raspberry pi to run at lower processing power and lower power consumption. on the raspberry pi computer, the developed python code is run after the power is turned on. this code listens to the serial communication via the usb port in order to receive the information from the arduino about the vehicle presence. when a vehicle is detected, a series of processes described in the next subsection is carried out.
2.2. acquisition procedure
after the vehicle presence is detected, and in order to save the battery, the raspberry pi starts the microphone listening mode via the a/d interface. only when the detected sound level is above the set threshold does the storage of the stream in the buffer begin (the second level of verification). the threshold level was determined empirically at 74 db. in this way, an accidental excitation of the sensors that can be caused by a passing pedestrian, dog or cat is avoided. the audio recording duration is initially set to 5 s, and after this time has elapsed, the stream stops. in order to avoid an accidental excitation potentially caused by a passing motorcycle, the stream is additionally checked after stopping it.
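the level check that gates the start of buffering can be sketched in python as below. this is only an illustration under stated assumptions: the samples are taken to be floating-point values in the range [-1, 1], and `calibration_offset_db` is a hypothetical constant that maps the digital level (db relative to full scale) to the db scale on which the 74 db threshold was set; such an offset would have to be measured for the actual microphone and audio interface.

```python
import numpy as np

def level_db(samples, calibration_offset_db):
    """RMS level of a block of samples, shifted by an assumed calibration offset."""
    samples = np.asarray(samples, dtype=float)
    rms = np.sqrt(np.mean(samples ** 2))
    if rms == 0.0:
        return float("-inf")  # digital silence
    return float(20.0 * np.log10(rms) + calibration_offset_db)

def should_start_buffering(samples, calibration_offset_db, threshold_db=74.0):
    """Second verification level: buffer the stream only above the threshold."""
    return level_db(samples, calibration_offset_db) > threshold_db

# synthetic one-second blocks at 48 khz: a loud and a very quiet 100 hz tone
t = np.arange(48000) / 48000.0
loud = np.sin(2.0 * np.pi * 100.0 * t)
quiet = 0.001 * loud
print(should_start_buffering(loud, 94.0), should_start_buffering(quiet, 94.0))
```

the same function could be reused for a later check on a stored segment, since it only needs a block of samples and the threshold.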
namely, at the location where the samples were taken, as in most underground garages, motorcycles are allowed to pass next to the ramp without any obstruction, so they do not stop at the entrance. the check is simple: two seconds after the beginning of the stream, the signal level is checked to see whether it is above the threshold set in the previous step (the third level of verification). if the threshold condition is met in that segment of the stream, the stream is stored as a wav file on the sd card. the entire procedure is shown in the flowchart in fig. 6.

fig. 6 acquisition procedure flowchart

the fourth level of verification is a specially developed algorithm that extracts only the engine idle mode (stationary signal) from the existing wav file. this procedure is described in the next subsection. the initial installation of the system at the entrance ramp of the underground garage showed that the system detected only vehicles and that the audio recordings contained only signals originating from internal combustion engines. however, the waiting time of vehicles above the microphone varied considerably from case to case. for this reason, three different approaches to audio signal recording (a, b and c) were applied, differing in the activity of the ultrasound sensors. in the first one (a), after detecting an object (vehicle), the ultrasound sensors remained inactive for 5 s until the rest of the system finished recording the audio sample. in approach b, the fixed 5 s of sensor inactivity after detection was replaced by 8 s. in the third approach (c), the sensors were constantly active in order to detect when the vehicle left the space above the microphone, thus not sending a command to the rest of the system to start the next recording.
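the third level of verification described above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the code that runs on the raspberry pi: the function names, the 0.5 s analysis window and the calibration offset that maps digital full scale to sound pressure level are hypothetical, while the 74 db threshold and the 2 s check time come from the text.

```python
import numpy as np

THRESHOLD_DB = 74.0   # empirically set level quoted in the text
CHECK_AFTER_S = 2.0   # the third-level check looks 2 s into the stream


def level_db(block, calibration_db):
    """RMS level of a block in dB.

    calibration_db is a hypothetical offset mapping digital full scale
    to sound pressure level; it must come from a calibration measurement.
    """
    block = np.asarray(block, dtype=float)
    if block.size == 0:
        return float("-inf")
    rms = np.sqrt(np.mean(block ** 2))
    return 20.0 * np.log10(max(rms, 1e-12)) + calibration_db


def passes_third_check(stream, fs, calibration_db, window_s=0.5):
    """Return True if the level 2 s into the stream is still above threshold.

    window_s (length of the checked segment) is an assumed parameter.
    """
    start = int(CHECK_AFTER_S * fs)
    segment = np.asarray(stream)[start:start + int(window_s * fs)]
    return level_db(segment, calibration_db) > THRESHOLD_DB
```

a stream that is loud at the start but quiet after two seconds (for example, a motorcycle passing without stopping) fails this check and would not be written to the sd card.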
if the sensors detected the absence of a vehicle in two successive iterations separated in time by 50 microseconds, the system interpreted this as the vehicle having left the position. this is important because occasionally one of the sensors measures a greater distance to a vehicle, caused by higher-order reflections of ultrasonic waves during a long wait of the vehicle, and such a reading is interpreted as non-compliance with the presence condition. this phenomenon is attributed to the dispersion of ultrasonic waves that can occur due to the shape of the vehicle’s body. the system testing showed that this phenomenon was rare. as for the negative effects of constant exposure to ultrasonic waves, the ultrasonic sensors used are of very low power, designed to measure distances of up to 4.5 m, which means that the signal level can be negligible at longer distances due to dispersion. considering the configuration of the entrances to underground garages, the width of the passage for vehicles must be at least 3 m, so a pedestrian passage, if one exists, can only be at a distance greater than 3 m from the sensor. besides, within the few hours of the acquisition, fewer than 10 people were seen in the pedestrian passage, all of them further than 5 m from the sensors.

2.3. extraction of idling mode of operation

in order to extract the stationary part of each recorded audio signal, which corresponds to the engine idle, by the time-domain signal processing applied here, it is necessary to determine the threshold (time moment) after which the non-stationary part of the signal should be rejected. due to the nature of the problem, the stationary part of the signal always appears at the signal beginning, see figs. 7, 8 and 9 given in the next section. there are no cases where the idling occurs later (in the middle or at the end of the signal).
so, it is clear that the threshold needs to be found at a certain time point after the signal starts, i.e., at the first moment when the signal becomes non-stationary. based on the analysis of the waveforms of the recorded signals in the time domain, it is noticed that at the moment when the signal ceases to be stationary, its amplitude abruptly increases. thus, at that moment, there is a noticeable increase (jump) in the signal envelope. the idea for extracting the stationary part of a recorded signal is based on generating the signal envelope and calculating the difference between the current and previous envelope values along the envelope. while the signal is stationary, the difference between the current and previous envelope values is expected to be small. on the other hand, at the moment when the signal ceases to be stationary, the difference between the current and previous value of the envelope must be significantly greater than the difference at time instants before that moment. the first time instant, going from the beginning to the end of the signal, at which there is a significant increase in the difference between the current and the previous envelope value is a candidate for setting the threshold. this significant increase needs to be quantified. if the signal envelope is denoted as env(t) and the threshold representing the upper time limit of the stationary signal part as tl, the threshold itself can be determined as:

$$t_l = \min\{\, \mathrm{env}(t) - \mathrm{env}(t-1) > a \,\} \cdot \frac{t_s}{n_f} \qquad (1)$$

where ts denotes the duration of the signal, and nf is the number of frames in which the signal maxima are calculated in the procedure of generating the signal envelope. a is a constant having the value of 0.1 determined empirically.
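the threshold search in (1) can be sketched as follows. this is a minimal illustration rather than the authors' implementation: it assumes a non-empty signal normalized to the range [-1, 1], takes the envelope as the maximum absolute value per frame, and uses the frame size of 4000 samples and overlap of 1000 samples reported for this work as defaults; the function names are hypothetical.

```python
import numpy as np


def frame_envelope(x, frame_size=4000, overlap=1000):
    """Envelope as the per-frame maximum of |x| over overlapping frames."""
    x = np.abs(np.asarray(x, dtype=float))
    hop = frame_size - overlap
    last_start = max(len(x) - frame_size, 0)
    return np.array([x[s:s + frame_size].max()
                     for s in range(0, last_start + 1, hop)])


def idle_cutoff_time(x, fs, a=0.1, frame_size=4000, overlap=1000):
    """Return t_l in seconds according to (1), or None if no jump is found.

    The first frame t with env(t) - env(t-1) > a is scaled by t_s / n_f.
    """
    env = frame_envelope(x, frame_size, overlap)
    jumps = np.flatnonzero(np.diff(env) > a)
    if jumps.size == 0:
        return None               # no mode change: the whole signal is idle
    t_frame = jumps[0] + 1        # diff index k compares frames k and k + 1
    t_s = len(x) / fs             # signal duration in seconds
    n_f = len(env)                # number of envelope frames
    return t_frame * t_s / n_f
```

returning `None` when no jump exists mirrors the case where the engine stays at idle for the whole recording, so the entire track can be kept.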
since it is necessary to set the threshold at the first time instant after the envelope jump, looking from the signal beginning to its end, the smallest value that satisfies the condition in (1) is taken as the threshold tl. more precisely, since the time variable t is given in frames used for generating the signal envelope, the condition min{env(t)-env(t-1)>a} returns the envelope frame in which there is an envelope jump indicating a transition from the stationary to the non-stationary part of the signal. in order to obtain the exact time instant for setting the threshold, the obtained envelope frame value is scaled by ts/nf. in our case, the frame size for generating the envelope is 4000 samples with a frame overlap of 1000 samples. this means that the resolution for setting the threshold tl is determined by the frame size, which can be chosen in accordance with the nature of the analyzed signal.

3. analysis of recorded audio signals

positioning the acquisition system at the entrance of an underground garage with a ramp, where a vehicle has to stop in order to take a token, gave results above expectations in terms of the quality and number of audio recordings. these recordings have the following parameters: sampling rate of 44.1 khz, bit depth of 16 bits and fixed duration of 5 s, resulting in a file size of approximately 431 kb. this makes it possible to store approximately 67800 audio samples, assuming an effective storage space of 28 gb on the 32 gb sd card. the power supply was a power bank with a capacity of 10 ah; about 20% of the capacity was consumed in 2 hours of recording, showing that the system is able to function with this power supply for about 10 hours in a completely autonomous way. in parallel with the autonomous operation of the system, the engine type by fuel was recorded manually, meaning that the samples were labeled manually. this was done to identify a possible error, e.g.
an audio recording that would be unusable due to excessive environmental noise, which indoors typically comes from the garage ventilation. this case did not happen in practice thanks to a correctly set threshold that determines the beginning of the recording. considering the three approaches mentioned above (a, b and c), the most important finding from the analysis of the recordings is that no vehicle passed the acquisition system without triggering it to record the sound of its engine. also, events other than passing passenger vehicles did not falsely trigger the system, and no completely blank recording was obtained. table 1 provides a comparative overview of these three approaches in terms of the number of samples collected as well as the usability of the samples. it is worth mentioning that, during the collection of audio samples, a very small percentage of vehicles belonged to the older generation of vehicles. the majority of diesel vehicles belonged to the generation with common rail injection, while the majority of gasoline vehicles had multipoint indirect injection. table 1 gives the total number of audio recordings and the number of useful audio recordings. here, the latter contain the engine idling mode sounds, while the rest of the recordings still contain vehicle engine sounds, but not the idling mode of operation; instead, they contain the sound of a vehicle leaving the ramp. the large majority of recordings are useful: their share of the total number of recordings is above 90% for all three approaches and is the highest, close to 97%, for the ultrasonic detection approach c.
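the percentages quoted from table 1 follow directly from the recording counts, which can be checked with a short python sketch. the counts are the ones reported in table 1; the dictionary layout and function name are chosen only for this illustration.

```python
# Counts per detection approach, as reported in table 1.
COUNTS = {
    "a": {"vehicles": 50, "total": 111, "useful": 101, "idle_only": 69},
    "b": {"vehicles": 100, "total": 143, "useful": 133, "idle_only": 97},
    "c": {"vehicles": 100, "total": 122, "useful": 118, "idle_only": 109},
}


def summarize(c):
    """Derive the percentage rows of table 1 from the raw counts."""
    def pct(part, whole):
        return round(100.0 * part / whole, 2)

    return {
        "useful_of_total": pct(c["useful"], c["total"]),
        "useful_of_vehicles": pct(c["useful"], c["vehicles"]),
        "idle_only_of_vehicles": pct(c["idle_only"], c["vehicles"]),
        "idle_only_of_total": pct(c["idle_only"], c["total"]),
    }


for name, counts in COUNTS.items():
    print(name, summarize(counts))
```

the useful recordings of the three approaches add up to the 352 recordings for 250 vehicles that make up the final dataset.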
by comparing the three ultrasonic detection approaches from table 1, it can be noticed that approach a, with a fixed time interval of sensor inactivity of 5 s, gave the most audio samples per vehicle: the number of useful recordings is as high as 202% of the number of vehicles. this approach is primarily suitable for generating the largest possible dataset, but it is not suitable from the point of view of efficient usage of the storage resources. if the system employs this approach for detection and recognition of the engine type in real conditions, there will be cases where the same vehicle is detected more than once. strictly speaking, this increased number of recorded audio signals for some vehicles could have certain detrimental effects on the machine/deep learning due to overrepresentation of these vehicles in comparison to others. although the number of recordings for the majority of vehicles is at most two, these effects will be analyzed in the next phases of the research. besides, if necessary, the redundant recordings for the same vehicle could easily be removed from the dataset according to the time of recording.
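removing redundant recordings by time of recording could look like the following sketch. it is an assumption-laden illustration: the function name, the representation of recordings as start times in seconds and the 10 s gap used to decide that two recordings belong to the same vehicle are all hypothetical and would need to be tuned to the detection approach in use.

```python
def drop_redundant(timestamps_s, max_gap_s=10.0):
    """Keep only the first recording of each burst of closely spaced recordings.

    A recording is treated as redundant when it starts no more than
    max_gap_s (an assumed value) after the previous recording.
    """
    kept = []
    previous = None
    for t in sorted(timestamps_s):
        if previous is None or t - previous > max_gap_s:
            kept.append(t)
        previous = t
    return kept
```

keeping the first recording of each burst fits the observation that the first recording of a vehicle contains the idling mode.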
table 1 comparative overview of three different detection approaches (a, b and c) in terms of the number of samples collected as well as the usability of the samples

quantity | a (sensors inactive for 5 s after vehicle detection) | b (sensors inactive for 8 s after vehicle detection) | c (continuous detection of vehicles by sensors)
number of vehicles that passed through the acquisition system | 50 | 100 | 100
number of detected vehicles | 50 | 100 | 100
total number of audio recordings | 111 | 143 | 122
number of useful audio recordings | 101 | 133 | 118
number of idle mode records only (without any additional processing) | 69 | 97 | 109
percentage of vehicles detected | 100% | 100% | 100%
percentage of useful recordings in relation to the total number of recordings | 90.99% | 93% | 96.72%
percentage of useful recordings in relation to the number of sampled vehicles | 202% | 133% | 118%
percentage of recordings of idle mode only without additional processing in relation to the number of sampled vehicles | 138% | 97% | 109%
percentage of recordings not requiring the fourth level of verification in relation to the total number of recordings | 62.16% | 67.83% | 89.34%

approach b (time interval of sensor inactivity of 8 s) also gave good results in terms of the number of vehicles detected and the number of audio recordings. however, it has the lowest percentage of recordings of idling mode only without additional processing in relation to the number of sampled vehicles. this approach uses memory resources more efficiently than the first approach. the most complex approach (c), continuous detection with the recognition of the next vehicle, gave the fewest audio recordings in relation to the number of detected vehicles. on the other hand, this approach is the most efficient in terms of memory utilization, achieving a high percentage of clean recordings.
in this way, the lowest redundancy among samples and the highest percentage of useful recordings in relation to the total number of recordings were obtained. the latter led to the least need for additional processing (saving cpu resources) and additional power from the power supply. in all three approaches (a, b and c), one or two audio recordings per vehicle were obtained for the majority of vehicles. here, the first recording represented the engine idling stationary mode without exception, fig. 7. in most of the samples, the second recording (in some cases the last one) partially contained the engine idling mode followed by an increase in the crankshaft speed and partial engine load mode in order to accelerate the vehicle, as shown in figs. 8 and 9. there were no cases where the partial load mode of the engine appeared before the idling mode in the recordings. in these three figures (figs. 7, 8 and 9), the audio signals of approximately the same generation of vehicles are presented. here, the signals’ amplitudes are normalized; hence the focus is on differences in the signal level caused by the change in operating mode.

fig. 7 audio signal of (a) petrol and (b) diesel engine at idle, without changing the mode

fig. 8 audio signal of (a) petrol and (b) diesel engine having early operation mode change from idle to load mode (during the recording interval)

fig. 9 audio signal of (a) petrol and (b) diesel engine having late operation mode change from idle to load mode (during the recording interval)

calculation of the threshold (i.e., the time instant of the audio signal until which the engine is in the idling mode) used for extraction of the idling mode of operation is illustrated in figs. 10 and 11, where the threshold is marked with a purple vertical line.
the terms “early” and “late” refer to the cases where the operation mode change happens earlier (up to 1 s) and later (after 1 s) in the recorded signal, respectively. in the recorded signals where there is no change in the engine operation mode, the threshold (cutoff time) cannot be determined in the described way. in such a case, the entire audio track is selected as engine idle and is used for further analysis and processing.

fig. 10 waveform and envelope of the audio signal of (a) petrol and (b) diesel engine with an early change of operation mode (the threshold is marked by a vertical line)

fig. 11 waveform and envelope of the audio signal of (a) petrol and (b) diesel engine with a late change of operation mode (the threshold is marked by a vertical line)

the waveforms of the characteristic audio signals extracted in the described way are presented in figs. 12 and 13. for the presented case of the diesel vehicle where an early operation mode change occurred (almost at the very beginning of the recording), the calculated threshold (cutoff) time was also very close to the beginning of the signal (fig. 10b), which means that this recording is rejected by the function that checks the duration of the stationary mode. this duration can be set according to the requirement related to the minimal length of the signals. depending on a particular need, the signal length may be either shorter or longer. in the present case, the duration of the stationary mode is set to 0.5 s, meaning that the minimal signal length is 0.5 s.

fig. 12 audio signal of (a) petrol and (b) diesel engine at idle extracted from the recordings with a late change of operation mode

fig.
13 audio signal of (a) petrol and (b) diesel engine at idle extracted from the recordings with an early change of operation mode

as the mapping of audio signals into an adequate image format [21, 22], such as spectrogram-like images, is increasingly used in modern signal processing and deep learning, the obtained audio signals are also presented in the form of spectrograms, see figs. 14, 15 and 16. some properties are present in the spectrograms of both engine types (petrol and diesel), such as stronger components at low and mid frequencies than at high frequencies, as well as rather steady-state behavior along the time axis. however, these images also contain certain differences between the sounds of petrol and diesel engines, such as a more uniform energy distribution along the frequency axis for the petrol engine and more prominent particular frequency components for the diesel engine. a more detailed analysis of the recorded audio signals and their representations in different domains, as well as of the correlation between the signals and vehicle types by fuel, will be done in the very next phase of the research.

fig. 14 spectrogram of (a) petrol and (b) diesel engine audio signal at idle, without changing the mode and without applying the idling mode extraction

fig. 15 spectrogram of (a) petrol and (b) diesel engine audio signal at idle with a late change of operation mode

fig. 16 spectrogram of (a) petrol and (b) diesel engine audio signal at idle with an early change of operation mode

4. conclusions

considering the number of recordings containing exclusively the idling mode of the vehicles in reference to the number of sampled vehicles, it can be seen that the developed acquisition system has collected at least one such recording for each vehicle. also, the system has not recorded a single blank audio file, and it is rather robust to false triggering.
in addition, the selected amount of memory proved to be sufficient, and the most critical part of the system, the battery power, gave very satisfactory results in terms of system autonomy. since 250 vehicles in total passed over the microphone and sensors placed on the ground without any consequences for functionality, the condition of robustness has been satisfied, and the ability of unattended use has also been proven. the developed additional processing of recorded signals, which extracts exclusively the engine idle mode from the entire audio recording, has made it possible to create a dataset of audio samples containing only this target mode of operation. the acquisition system has proven to be efficient for recording the sound of a passenger vehicle at idle regardless of the type of fuel. the number of audio recordings can also be affected by the approach applied for detecting the presence of a vehicle using ultrasound sensors. this results in a larger or smaller number of recordings having higher or lower redundancy between the recordings, respectively. by using the developed acquisition system, a dataset has been created consisting of 352 audio recordings for 250 vehicles containing the sound of engines in the idling mode of operation. this acquisition system can find its application in different use-cases including control of car entrance in restricted areas of smart cities, prevention of misfueling at gas stations, optimization of road usage or noise prevention based on engine fuel type. in such cases, this proof-of-concept system could be implemented as an embedded system on a dedicated single platform. depending on a particular application and its requirements, the acquisition system might be modified to become even less demanding.
thus, taking into account the relatively high sound pressure levels at the microphone (above 74 db) and the proximity of the source, the condenser akg c562cm microphone might be replaced by an electro-dynamic microphone not requiring phantom power. since the dynamic range of the acquired signals is not expected to be that large, the bit depth might be smaller than the 16 bits used here. in addition, after developing an adequate classifier and considering the useful frequency range, it would be worthwhile to explore the option of reducing the sampling frequency. the generated dataset of audio samples will play an important role in future work on developing a system for automatic recognition of the type of engine based on the fuel used. this system will be designed by applying an adequate deep or machine learning approach for classification and employing the created dataset for model training and testing. based on the samples from the generated dataset, the spectrograms of petrol and diesel engines at idle seem to be different, which forms a solid basis for achieving high accuracy in engine type classification.

acknowledgment: this work has been supported by the ministry of education, science and technological development of the republic of serbia, contract no. 451-03-68/2022-14/200102.

references
[1] s. das, a. dey, a. pal and n. roy, "applications of artificial intelligence in machine learning: review and prospect", int. j. comput. appl., vol. 115, no. 9, pp. 31-41, april 2015.
[2] p. dhanalakshmi, s. palanivel and v. ramalingam, "classification of audio signals using svm and rbfnn", expert syst. appl., vol. 36, no. 3, part 2, pp. 6069-6075, 2009.
[3] p. dhanalakshmi, s. palanivel and v. ramalingam, "classification of audio signals using aann and gmm", appl. soft. comput., vol. 11, no. 1, pp. 716-723, 2011.
[4] h. ponce, p. ponce and a.
molina, "adaptive noise filtering based on artificial hydrocarbon networks: an application to audio signals", expert syst. appl., vol. 41, no. 14, pp. 6512-6523, 2014.
[5] z. liu, j. huang, y. wang and t. chen, "audio feature extraction and analysis for scene classification", in proceedings of first signal processing society workshop on multimedia signal processing, princeton, nj, usa, 23-25 june 1997, pp. 343-348.
[6] t. birtchnell, "listening without ears: artificial intelligence in audio mastering", big data & society, vol. 5, no. 2, july 2018.
[7] g. p. chossière, r. malina, f. allroggen, s. d. eastham, r. l. speth and s. r. h. barrett, "country- and manufacturer-level attribution of air quality impacts due to excess nox emissions from diesel passenger vehicles in europe", atmospheric environ., vol. 189, pp. 89-97, sept. 2018.
[8] m. milivojčević, f. pantelić, d. ćirić, "pozicioniranje mikrofona prilikom snimanja audio karakteristika motora putničkih vozila" (microphone positioning when recording audio characteristics of passenger car engines) in proceedings of 63rd national conference on electrical, electronic and computing engineering etran, srebrno jezero, serbia: 3-6 june 2019, pp. 58-62 (in serbian).
[9] m. milivojčević, f. pantelić and d. ćirić, "comparison of frequency characteristics of sound generated by internal combustion engines depending on fuel", in proceedings of 26th noise and vibration, niš, serbia: 6-7 december 2018, pp. 115-120.
[10] n. evans, automated vehicle detection and classification using acoustic and seismic signals. ph.d. thesis, university of york, 2010.
[11] h. frederick, a. winda and m. iwan solihin, "automatic petrol and diesel engine sound identification based on machine learning approaches", in proceedings of the international conference on automotive, manufacturing, and mechanical engineering. bali, indonesia: 26-28 september 2018, published at e3s web of conferences, vol. 130, article no.
01011.
[12] a. d. mayvan, s. a. beheshti and m. h. masoom, "classification of vehicles based on audio signals using quadratic discriminant analysis and high energy feature vectors", int. j. soft comput., vol. 6, no. 1, pp. 53-64, feb. 2015.
[13] a. wieczorkowska, e. kubera, t. słowik and k. skrzypiec, "spectral features for audio based vehicle and engine classification", j. intell. inf. sys., vol. 50, pp. 265-290, 2018.
[14] e. alexandre, l. cuadra, s. salcedo-sanz, a. pastor-sánchez and c. casanova-mateo, "hybridizing extreme learning machines and genetic algorithms to select acoustic features in vehicle classification applications", neurocomput., vol. 152, pp. 58-68, march 2015.
[15] s. d. badiger and m. uttarakumari, "vehicle classification using machine learning algorithms based on the vehicular acoustic signature", sci. tech. dev., vol. 8, no. 11, pp. 369-374, nov. 2019.
[16] ultrasonic waterproof range finder datasheet. available at: https://www.jahankitshop.com/getattach.aspx?id=4635&type=product.
[17] a. pajankar, kickstart to arduino nano. susteren, the netherlands: elektor international media, 2022.
[18] b. r. kent, science and computing with raspberry pi. san rafael, usa: morgan & claypool publishers, 2018.
[19] c562 cm specifications. available at: https://www.akg.com/microphones/boundary%20layer%20microphones/c562cm.html.
[20] digital high definition microphone interface specifications. available at: https://www.ikmultimedia.com/products/irigprehd/.
[21] s. amiriparian, m. gerczuk, s. ottl, n. cummins, m. freitag, s. pugachevskiy, a. baird and b. schuller, "snore sound classification using image-based deep spectrum features", in proceedings of interspeech 2017, stockholm, sweden, august 20–24, 2017, pp. 3512-3516.
[22] d. ćirić, z. perić, j. nikolić, n.
vučić, "audio signal mapping into spectrogram-based images for deep learning applications", in proceedings of 20th international symposium infoteh-jahorina (infoteh), east sarajevo, bosnia and herzegovina: march 17-19, 2021, pp. 1-6.

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 411-423 https://doi.org/10.2298/fuee1803411s

fem cfd analysis of air flow in kiosk substation with the oil immersed distribution transformer

stevan stanišić 1, milica jevtić 1, bhaba das 2, zoran radaković 1
1 faculty of electrical engineering, university of belgrade, belgrade, serbia
2 engineering department, etel ltd, auckland 0640, new zealand

abstract. in the practice of loading oil-immersed distribution transformers, there is a need for a lumped thermal model that requires neither large computational resources nor long computational time. one such model is presented in the international transformer loading guide (iec 60076-7), where heat transfer inside the transformer is modeled. in the case of indoor transformer operation, this model does not consider transient thermal phenomena in the room. we developed a lumped model that includes heat transfer in the transformer room. in the scope of the research, we also built a fem cfd (finite element method, computational fluid dynamics) model of air flow and heat transfer. the purpose of the fem cfd model was to gain better insight into the air flow, i.e. to study the simplifications introduced in the lumped model and suggest potential improvements. this paper presents the results achieved with fem cfd. the considered case was the transformer with natural oil and natural air flow (onan).

key words: indoor transformer station, thermal model, finite element method, computational fluid dynamics

1.
introduction

it is well-known that the temperatures at the hottest position (hot-spot) in solid insulation and of the hottest oil are the main factors which define the possible transformer load (current) at specific ambient conditions (ambient temperature). the majority of the published research relates to the modeling of the heat transfer inside the transformer tank and from the tank and coolers (radiators) to the outer cooling medium. at this point, the losses depend on the load and the average winding temperature. so, the coupled calculation of the temperatures and the losses has to be performed. nowadays, there is a strong demand for saving the space above the ground surface when placing distribution substations in urban areas. on the other hand, establishing the air flow that cools the transformer can be more difficult and very restricted for underground placement. there is a need to have a calculation method for quantifying this effect. a common solution is a prefabricated concrete transformer substation. so far, there are compact substation designs. technical issues for such substations include sizing of the ventilation openings and choosing the opening types, taking protection into consideration as well (safety of human beings and animals against touching the metallic parts under voltage, entry of animals, rain etc.). this research was initiated with the task to optimize the size and placement of ventilation openings for a compact kiosk substation produced by etel ltd, new zealand.

received september 29, 2017; received in revised form december 25, 2017
corresponding author: zoran radaković, faculty of electrical engineering, university of belgrade, 73 kralj aleksandar blvd, 11000 belgrade, serbia (e-mail: radakovic@etf.rs)
ventilation opening sizing is an old engineering problem; it was investigated some 40 years ago [1, 2, 3]. in modern engineering practice, however (especially for the smart grid concept), there is a need not only to perform the calculations in steady states, but also to develop dynamic thermal models for the estimation of possible overloading. the current loading guide iec 60076-7 considers the additional heating of the transformer due to its positioning in an enclosure in a simplified manner. the rated top oil temperature rise is corrected (increased) by the difference between the air temperature in the enclosure and the ambient air temperature. the standard contains the typical values of this temperature increase for different types of enclosure, transformer rated power and number of transformers in the enclosure. in our previous publication [4], we presented the dynamic thermal model for a prefabricated concrete enclosure typically used in the power distribution company, belgrade, serbia. that model was based on empirical data. in our recent publication [5] we published a more physically based model. as far as we know, no other research efforts similar to those presented in [5] have been made by other authors. meanwhile, we improved the lumped model announced in [5] and will publish such an upgraded model, with additional on-site tests, in a future paper. the lumped model is suitable for on-line applications, but is simplified and of limited accuracy. in recent years, fem cfd tools have become more and more present in the research and development of optimal cooling systems for power transformers. so far, the main targets of fem cfd analyses in this area are oil cooled core windings as well as both natural and forced transformer substation cooling. it is reported in [6] how the finite element approach is applied to oil filled disc type windings for the purpose of locating the hot spot.
cases regarding the optimal design of indoor substations and ventilation are treated in [7, 8, 9, 10, 11]. there is even an example where fem cfd tools were used to gain further insight into conjugate heat transfer for oil inside and air outside the radiators for both onan and onaf transformers [12, 13]. this paper presents the results of the application of fem cfd simulations, which give a detailed spatial distribution of air velocity and temperature. thus, the simplifications in the lumped model can be checked and the lumped model potentially improved. the paper presents the experience with the application of fem cfd simulations and their results.

2. geometry of simulated substation

the kiosk substation consists of transformer, hv and lv compartments. figure 1 presents the geometry of the transformer compartment used in the model (base sh = a x b = 1.56 x 1.32 = 2.06 m², height h = 1.33 m). the data about the real kiosk transformer compartment ventilation openings are specified in table 1. table 2 contains the data about the simplified openings modeled as shown in figure 1. a boundary condition with a pressure head loss coefficient is assigned to the openings. the rated power of the transformer is 500 kva and it is cooled by natural ventilation over the fins. the kiosk floor is 125 mm thick, while the kiosk walls containing the openings and the kiosk ceiling are all 25 mm thick. in order to simplify the model, only the interior surfaces of the kiosk transformer compartment were modeled, i.e. the wall thickness and the resistance to heat conduction are neglected. only the convection heat transfer is considered in the model. the convection heat transfer coefficient on the inner surfaces of the kiosk is determined from the cfd calculation. constant values of convection heat transfer coefficients on outer surfaces are determined using the equation from the theory of natural air flow near horizontal and vertical walls.
the drawing of the modeled transformer compartment is shown in figure 1, with the openings protruding 25 mm to the outside, thus accounting for all hydraulic resistances to air flow.

fig. 1 the geometry modeled with cfd fem

table 1 data about the real openings

| openings | louvre access door (a), outlet | louvre side panel (b), outlet | cutout (d), outlet | cutout (c), inlet |
|---|---|---|---|---|
| hole rows | 10 | 10 | 7 | 18 |
| hole columns | 5 | 8 | 9 | 9 |
| intake holes (per panel) | 50 | 80 | 63 | 162 |
| height of holes [m] | 0.03 | 0.03 | 0.01 | 0.01 |
| width of holes [m] | 0.07 | 0.07 | 0.07 | 0.07 |
| hole area [m²] | 0.105 | 0.168 | 0.044 | 0.113 |
| hole area reduction due to jalousies (%) | 55 | 55 | 0 | 0 |
| effective hole area [m²] | 0.047 | 0.076 | 0.044 | 0.113 |
| number of panels | 1 | 3 | 4 | 4 |
| effective total area of holes [m²] | 0.047 | 0.227 | 0.176 | 0.454 |

414 s. stanišić, m. jevtić, b. das, z. radaković

table 2 size and position of the modeled openings

| | inlet cutout (c) | louvre side panel (b, a) | top cutout (d) |
|---|---|---|---|
| area (width*height) [m²] | 0.282 x 0.632 | 0.36 x 0.602 | 0.101 x 0.632 |
| x coordinate of center [m] | 0.788 (all openings) | | |
| y coordinate of center (front side) [m] | 0.012 (all openings) | | |
| y coordinate of center (rear side) [m] | 1.307 (all openings) | | |
| z coordinate of center [m] | 0.288 | 0.956 | 1.294 |

3. assumptions in the model

we experienced convergence problems and had to make simplifications to achieve model convergence. in the final tally, 2 of 20 simulations converged with results in the expected range, while 5 of 20 simulations were stopped due to very long computational time. the 2 successful simulations took 93 hours and 15 days, respectively. that is why we did not reach the point of including all relevant physical phenomena in the model. more precisely, the fem cfd model is focused on analyzing air flow and heat transfer due to the air mass transfer and heat transfer through the walls. only the transformer compartment is modeled, while it is assumed that the temperatures in the hv and lv compartments are equal to the ambient temperature.
no thermal resistances to heat conduction through the walls, the ceiling and the floor were considered. the kiosk floor is modeled as adiabatic. these approximations have a smaller quantitative effect than the following two. first, radiation heat transfer is not considered. second, there is an approximation in the model of heat transfer along the radiators, which is important for the air buoyancy in the zone of the radiators. the basics of the buoyancy can be found in our previous publications [5] and [14]. several improvements of the lumped model from [5] have been made in the meantime and will be published as an upgraded model for the calculation of air buoyancy, based on the radiator modeled as a heat exchanger. in fem cfd, the fins are initially modeled as rectangular aluminum bars. the transformer tank is modeled as a homogeneous body with low thermal conductivity (0.11 w/(m·k)). this approximation causes the fin surface temperature to deviate from the real one, which introduces errors in the calculated heat transfer to the air and in the calculated air buoyancy. two attempts were made at modeling the air flow: the first considered laminar flow and the second the algebraic yplus turbulent flow regime. no convergence was achieved with the stationary solver for either flow regime, so transient solving was applied. the algebraic yplus turbulence model was selected for the simulation of flows inside closed areas [15]. it solves the flow everywhere and is the most robust and least computationally intensive model, with good approximations for internal flow. a multiphysics coupling of heat transfer and turbulent flow (algebraic yplus) was applied to the model. the effective size of the kiosk openings was modeled through the grille boundary condition of the turbulent fluid flow physics. this boundary condition incorporates the effect of a square mesh or louvres on the openings via a head loss coefficient.
values for this coefficient are calculated using equations from [16]. after the initial modeling of the fins as bars, they were reduced to surfaces, with a thin layer boundary condition for heat transfer and interior walls for laminar flow. the comsol built-in default meshing sequence was used with the coarser setting. the stationary solver exhibited problems for both the laminar and the turbulent algebraic yplus model. online research and exploration of the comsol blog pointed to the experience that convective cooling sometimes involves a small transient that the stationary solver is not able to solve. the transient (time domain) solver showed much better performance. after gathering this experience, we decided to set the initial temperature of the entire transformer block to 74°c (equal to the steady-state top oil temperature, as presented in [5]), and thus to shorten the computational time needed to reach the steady state (compared with starting from the cold state). the initial temperature of the fins is set to 20°c. the solver specific to the algebraic yplus turbulence model was used: transient with initialization. this solver comprises two study steps. the first step initializes the values of the wall distances. it uses a fully coupled stationary solver in which the initial values of all dependent variables (temperature, pressure, velocity field, wall distance in viscous units and reciprocal wall distance) are solved for using the iterative gmres (generalized minimum residual) method. the second step uses a segregated time-dependent solver, in which the first segregated node solves for the velocity field, pressure and temperature using iterative gmres and the second node solves for the wall distance in viscous units using the direct pardiso (parallel direct sparse solver) solver.
the time range solved for is (0, 1, 60) [min], meaning that the simulation runs from zero in steps of 1 minute until one hour has been reached. our assumption was that this period would be sufficient to reach a quasi-stationary state of the air flow and cooling process, since the entire transformer volume already had its steady-state temperature. nevertheless, because the initial temperature of the fins was set to 20°c and a low thermal conductivity was adopted for the solid material of the tank, no steady state was reached.

4. computing resources

the best available computation resource we had was a single desktop with 64-bit windows 7 enterprise os, an intel core i5-6400 cpu and 32 gb (2x16 gb) of ddr4-2400/pc4-19200 ram.

5. calculation results

the following 6 tables present the results of the post-processing of the fem cfd simulation results. tables 3 and 4 show the differential pressures (differences between the pressure of the air exiting the fins at the top and that of the air entering the fins from below), averaged on the surface between each two fins, i.e. over/under the openings. the pressures on the transformer compartment openings in table 4 are given in pa as gauge pressures, i.e. the difference between the absolute pressure and the reference atmospheric pressure; the reference pressure is 1.0133e5 pa, at the level of the kiosk ceiling. table 5 shows the temperatures averaged on 10 surfaces adjoining the fins vertically and on 2 surfaces adjoining the fins horizontally from below (z=0.508 m) and from above (z=1.308 m). an example, the bottom-most vertical surface on the front radiator, is marked in figure 2. table 6 shows the temperatures averaged on the openings. the ambient air temperature (outside the kiosk) is 20°c.
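because the gauge pressures are referenced to the level of the kiosk ceiling, a rough plausibility check on table 4 is possible. the sketch below assumes, purely for illustration, still air of uniform density (1.2 kg/m³, a value not taken from the paper), so that the gauge pressure at height z is approximately rho·g·(h − z); the opening heights come from table 2 and the compartment height from section 2. under these assumptions the estimates land close to the tabulated 12.283, 4.4086 and 0.4299 pa, which suggests that the opening pressures in table 4 are dominated by the hydrostatic column of air.

```python
# Plausibility sketch, not part of the paper's model.
# Assumption: still air of uniform density, so the gauge pressure relative
# to the ceiling-level reference is rho * g * (h_ref - z).
RHO_AIR = 1.2     # kg/m^3, assumed air density near 20 degC (illustrative)
G = 9.81          # m/s^2
H_CEILING = 1.33  # m, compartment height from section 2

# z coordinates of the opening centers [m], from table 2
Z_CENTER = {"c": 0.288, "b": 0.956, "d": 1.294}

# front-side gauge pressures [Pa], from table 4
TABLE_4_FRONT = {"c": 12.283, "b": 4.4086, "d": 0.4299}


def hydrostatic_gauge_pressure(z, rho=RHO_AIR, g=G, h_ref=H_CEILING):
    """Gauge pressure [Pa] at height z below the reference level h_ref."""
    return rho * g * (h_ref - z)


estimates = {name: hydrostatic_gauge_pressure(z) for name, z in Z_CENTER.items()}

for name in ("c", "b", "d"):
    print(f"opening {name}: estimate {estimates[name]:.3f} Pa, "
          f"table 4 {TABLE_4_FRONT[name]} Pa")
```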
tables 7 and 8 present the post-processed fem cfd results for the total air flows in g/s through the same surfaces as used for the averaged temperatures in table 5 (horizontal and vertical surfaces) and through the cross-sections of the openings. flow values are negative on surfaces where air predominantly enters the radiator and positive on the outflow surfaces. the example surface, the bottom-most vertical surface on the front radiator, is again the one marked red in figure 2.

table 3 pressure differences (pa) on the surfaces between bottoms and tops, δp = pexit - pentry

| between fins | 1-2 | 2-3 | 3-4 | 4-5 | 5-6 | 6-7 | 7-8 |
|---|---|---|---|---|---|---|---|
| front side (8 fins) | -9.3626 | -9.3623 | -9.361 | -9.3599 | -9.3601 | -9.3612 | -9.3646 |
| rear side (22 fins) | -9.3603 | -9.3619 | -9.3623 | -9.3624 | -9.3617 | -9.361 | -9.3601 |

| between fins | 8-9 | 9-10 | 10-11 | 11-12 | 12-13 | 13-14 | 14-15 |
|---|---|---|---|---|---|---|---|
| rear side (22 fins) | -9.3584 | -9.3543 | -9.3537 | -9.3575 | -9.3617 | -9.3637 | -9.3642 |

| between fins | 15-16 | 16-17 | 17-18 | 18-19 | 19-20 | 20-21 | 21-22 |
|---|---|---|---|---|---|---|---|
| rear side (22 fins) | -9.365 | -9.3656 | -9.366 | -9.366 | -9.3644 | -9.3622 | -9.3579 |

table 4 pressures on the openings (in pa as gauge pressures)

| pressure [pa] | c | b | d |
|---|---|---|---|
| front (8 fins) | 12.283 | 4.4086 | 0.4299 |
| rear (22 fins) | 12.282 | 4.4089 | 0.4294 |

table 5 an example of air temperature values [°c] over the fin height

| z-coordinate [m] | front side (8 fins) | rear side (22 fins) |
|---|---|---|
| top surface | 27.4 | 26.82 |
| 1.23 to 1.31 | 26.87 | 26.11 |
| 1.15 to 1.23 | 26.8 | 25.87 |
| 1.07 to 1.15 | 26.75 | 25.77 |
| 0.99 to 1.07 | 26.52 | 25.67 |
| 0.91 to 0.99 | 26.04 | 25.39 |
| 0.83 to 0.91 | 25.63 | 24.87 |
| 0.75 to 0.83 | 25.19 | 24.7 |
| 0.67 to 0.75 | 24.75 | 24.66 |
| 0.59 to 0.67 | 24.51 | 24.26 |
| 0.51 to 0.59 | 24 | 23.74 |
| bottom surface | 23.66 | 23.11 |

table 6 the temperatures [°c] averaged on the openings

| | c | b | d |
|---|---|---|---|
| front (8 fins) | 19.84 | 21.77 | 28.67 |
| rear (22 fins) | 19.91 | 21.69 | 28.39 |

table 7 an example of air flow values [g/s] over the fin height

| z-coordinate [m] | front side (8 fins) | rear side (22 fins) |
|---|---|---|
| top surface | 2.6394 | 5.8881 |
| 1.23 to 1.31 | 1.5214 | 5.4824 |
| 1.15 to 1.23 | 1.2132 | 4.2955 |
| 1.07 to 1.15 | 0.9277 | 3.9995 |
| 0.99 to 1.07 | 0.5038 | 3.1065 |
| 0.91 to 0.99 | 0.089 | 1.2586 |
| 0.83 to 0.91 | -0.2381 | -0.3735 |
| 0.75 to 0.83 | -0.4147 | -1.1181 |
| 0.67 to 0.75 | -0.4344 | -0.6613 |
| 0.59 to 0.67 | -0.4892 | -0.1161 |
| 0.51 to 0.59 | -0.428 | 0.3325 |
| bottom surface | -4.585 | -20.254 |

table 8 the flows [g/s] on the openings

| opening | c | b | d |
|---|---|---|---|
| front (8 fins) | 42.84 | 14.413 | 40.188 |
| rear (22 fins) | 69.67 | 18.666 | 39.236 |

fig. 2 example of the vertical surface (first out of 10 for front radiator)

table 9 presents the values of the characteristic air flows.

table 9 air flows

| flows | qc | qhp | qhtp | qhnp | qp | qnp |
|---|---|---|---|---|---|---|
| values [g/s] | 112.51 | 91.1 | 24.84 | 66.26 | 29.11 | 83.4 |

qc – inlet opening c
qhp – upward flow around the transformer through the horizontal surface with coordinate z=0.51 m (just below the bottom edges of the fins)
qhtp – upward flow through the horizontal surfaces below the fins (only the flow component entering both radiators from below)
qhnp – upward flow that does not enter the radiators: qhnp = qhp – qhtp
qp – total air flow into the radiators (qhtp increased by the air entering from the side, see table 7)
qnp – flow through the inlet openings (qc) minus the flow entering the radiators (qp): qnp = qc – qp

note: we suppose that the reason for the deviation of qhp from qc is the different mesh, i.e. the error caused by the interpolation for different meshes on the opening and on the horizontal plane below the fins. the large ratios qhp / qhtp and qnp / qp are a consequence of the air heating up on the tank surfaces that are not covered by the fins, where air buoyancy also exists and friction is small (in fact, it appears only in the velocity boundary layer). the total cooling surface is considered in the lumped model when the heat transfer coefficient (kp) is calculated.
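the definitions under table 9 can be traced back to tables 7 and 8 with a few sums. the sketch below is only a bookkeeping check under the sign convention stated in the text (negative values mean air entering the radiator); the variable names are illustrative, and qhp is taken directly from table 9 because it is not derivable from the other tables.

```python
# All values in g/s, copied from tables 7 to 9; names are illustrative only.

# table 8: flow through inlet opening c on each side of the kiosk
INLET_C = {"front": 42.84, "rear": 69.67}

# table 7: horizontal bottom surfaces (negative = air entering the radiator)
BOTTOM = {"front": -4.585, "rear": -20.254}

# table 7: the 10 vertical surfaces per radiator, listed from top to bottom
VERTICAL = {
    "front": [1.5214, 1.2132, 0.9277, 0.5038, 0.089,
              -0.2381, -0.4147, -0.4344, -0.4892, -0.428],
    "rear": [5.4824, 4.2955, 3.9995, 3.1065, 1.2586,
             -0.3735, -1.1181, -0.6613, -0.1161, 0.3325],
}

Q_HP = 91.1  # table 9: upward flow through the plane z = 0.51 m

q_c = sum(INLET_C.values())                      # total inlet flow
q_htp = -sum(BOTTOM.values())                    # flow entering radiators from below
side_inflow = -sum(f for flows in VERTICAL.values() for f in flows if f < 0)
q_p = q_htp + side_inflow                        # total flow into the radiators
q_hnp = Q_HP - q_htp                             # upward flow bypassing the radiators
q_np = q_c - q_p                                 # inlet flow not entering the radiators

for name, value in [("qc", q_c), ("qhtp", q_htp), ("qp", q_p),
                    ("qhnp", q_hnp), ("qnp", q_np)]:
    print(f"{name} = {value:.2f} g/s")
```

rounded to two decimals, these sums agree with the entries of table 9, so the table is internally consistent with tables 7 and 8.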
in the lumped model, it is approximately supposed that the entire air mass used for the calculation of the buoyancy and of the frictional pressure drop in the space between the radiator plates flows vertically and exclusively between the fins. figure 3 presents the distribution of air velocity on the kiosk walls with the openings, which was used to obtain the values in table 8. figure 4 visualizes the spatial distribution of air velocity and temperature on the side with 8 fins; it is relevant for the values in table 9.

fig. 3 distribution of the air velocity (m/s): a) on the kiosk wall with the openings, side with 8 fins on the tank; b) on the kiosk wall with the openings, side with 22 fins on the tank

fig. 4 space distribution of air flow pattern and temperature on the side with 8 fins

6. the type of the results obtained from fem cfd and lumped model

the characteristic temperatures, flows and pressures can be obtained from the lumped model. the difference with respect to the fem cfd calculation is that the lumped model delivers only one value for the pressure and the temperature at the bottom of the fins and only one value at the top of the fins. the values from fem cfd that are to be compared with those from the lumped model are averaged on the surfaces. since ideal upward air flow is assumed in the lumped model, there are no output values for air flow in and out through the vertical surfaces in the zone of the fins (tables 5 and 7). the lumped model results are presented in tables 10 (temperatures), 11 (flows) and 12 (pressures).
table 10 the values of temperatures obtained by lumped model

| temperatures | ϑc | ϑbr | ϑtr | ϑceil | ϑd | ϑb |
|---|---|---|---|---|---|---|
| values [°c] | 24.6 | 20.03 | 51.1 | 50.7 | 50.5 | 50 |

ϑc – temperature on top of inner side of opening c
ϑbr – temperature on entry to the radiator
ϑtr – temperature on exit from the radiator
ϑceil – temperature on kiosk ceiling near the kiosk wall
ϑd – temperature on top of opening d
ϑb – temperature on top of opening b

table 11 the values of flows obtained by lumped model

| flows | qc | qbr | qd | qb | qkdb | qkbc |
|---|---|---|---|---|---|---|
| values [m³/h] | 0.0322 | 0.129 | 0.0226 | 0.01 | 0.0205 | 0.0004 |

there are 4 openings c, 4 openings d and 4 openings b (the flow through opening a is practically the same as through b, so 4 openings b are assumed instead of 3 b and 1 a).

qc – flow through opening c
qbr – flow through the radiator
qd – flow through opening d
qb – flow through opening b
qkdb – flow downstream the kiosk wall (between the openings d and b)
qkbc – flow downstream the kiosk wall (between the openings b and c)

table 12 the values of pressure differences obtained by lumped model

| press. diff. | Δpc-br | Δpbr-tr | Δptr-ceil | Δpceil-d | Δpdin-dout |
|---|---|---|---|---|---|
| values [pa] | -2.662 | -8.5479 | -1.7405 | 1.7434 | 0.4671 |

| press. diff. | Δpbin-bout | Δpd-b | Δpcout-cin | Δpb-c |
|---|---|---|---|---|
| values [pa] | 0.2726 | 2.2545* (2.486)** | 0.0675 | 9.1455* (9.3594)** |

* on the inner side of the kiosk, ** on the outer side of the kiosk

Δpc-br – pressure difference between the middle of opening c and the entry to the radiator
Δpbr-tr – pressure difference between the entry to the radiator and the exit from the radiator
Δptr-ceil – pressure difference between the exit from the radiator and the kiosk ceiling
Δpceil-d – pressure difference between the kiosk ceiling and the middle of opening d
Δpdin-dout – pressure difference between the inner side and the outer side of the middle of opening d
Δpbin-bout – pressure difference between the inner side and the outer side of the middle of opening b
Δpd-b – pressure difference between the middle of opening d and the middle of opening b
Δpcout-cin – pressure difference between the outer side and the inner side of the middle of opening c
Δpb-c – pressure difference between the middle of opening b and the middle of opening c

7. conclusions

the fem cfd method is relatively new and is a powerful tool for the analysis of a wide variety of heat transfer problems that include fluid flow. nevertheless, as presented in the paper, severe convergence problems can appear when using software based on this method. from that point of view, publishing practical experience of its application is valuable. the convergence problems we encountered are clearly stated in the paper. in the end, the fem cfd results that were obtained gave us only a qualitative representation of the air flow. it was not possible to compare them with the results of the lumped model, since the output quantities were not the same. there was also no experimental record corresponding to the fem cfd simulation (the initial state is different and no steady state was reached in fem cfd). stronger computational resources could probably make it feasible to use a finer mesh and to improve convergence. another option, combined with the previous one, is to perform custom meshing in critical zones, i.e. not to use automatic mesh generation as we did.
such work is presented in [17], where a similar problem, but for a transformer placed below the ground surface, is considered using fem cfd. the approach in [17] is stricter, with fewer simplifications, but it relies on much stronger hardware resources to solve a model with a much higher number of mesh cells (and to perform grid independence verification). finally, the following conclusions about the air flow distribution were drawn from the simulations that converged; they could not have been seen in the lumped model developed and applied in our previous work:
1. a part of the air exits the radiator before it reaches the top of the radiator, streaming toward the outlet cutouts. a similar situation occurs at the air entry: part of the air flows from the inlet openings and enters the radiator through the vertical boundary surface of the air ducts between the fins.
2. there is significant upward air flow outside the zone of the fins, caused by the buoyancy in the areas of the tank surfaces that are not covered by the fins (see section 5).
these findings should be kept in mind when building lumped models, i.e. a way of considering their influence (via elements of the lumped model) should be explored.

references
[1] k. baral and i. primus, "lebensdauer eines 630 kva-transformators in einer beton-netzstation," elektrizitaetswirtschaft, vol. 8, pp. 268–276, 1979.
[2] i. primus, "temperaturen in netzstationen–wirtschaftliche bedeutung und einfluss factoren," elektrizitaetswirtschaft, vol. 16, pp. 451–460, 1976.
[3] i. primus, "temperaturen in netzstationen–messergebnisse und deutung," elektrizitaetswirtschaft, vol. 22, pp. 833–842, 1976.
[4] z. radakovic and s. maksimovic, "non–stationary thermal model of indoor transformer stations," electrical engineering (archiv fur elektrotechnik), vol. 84, no. 2, pp. 109-117, 2002.
[5] z. radakovic, m. jevtic and b.
das, "dynamic thermal model of kiosk oil immersed transformers based on the thermal buoyancy driven air flow," international journal of electrical power & energy systems, vol. 92, pp. 14-24, nov. 2017.
[6] a. k. das and s. chatterjee, "finite element method-based modelling of flow rate and temperature distribution in an oil-filled disc-type winding transformer using comsol multiphysics," iet electric power applications, vol. 11, no. 4, pp. 664-673, 2017.
[7] y. huijuan, y. tingfang, x. rui and p. chunhua, "numerical simulation of ventilation for main transformer room of indoor substations," the open automation and control systems journal, vol. 7, pp. 630-639, 2015.
[8] h. liu, y. hao, m. fu, d. wang and l. yang, "study on ventilation of indoor substation main transformer room based on comsol software," in proceedings of the 1st international conference on electrical materials and power equipment (icempe), xi'an, china, july 2017.
[9] m. r. nalamwar, d. k. parbat and d. singh, "study of effect of windows location on ventilation by cfd simulation," international journal of civil engineering and technology, vol. 8, pp. 521-531, 2017.
[10] t. yu, h. yang, r. xu and c. peng, "simulation study on ventilation & cooling for main transformer room of an indoor substation," journal of multimedia, vol. 9, no. 8, pp. 1040-1047, 2014.
[11] m. banjac, "application of computational fluid dynamics in cooling systems design for special purpose objects," fme transactions, vol. 42, pp. 26-33, 2014.
[12] s. b. paramane, w. v. d. veken and a. sharma, "a coupled internal–external flow and conjugate heat transfer simulations and experiments on radiators of a transformer," applied thermal engineering, vol. 103, pp. 961-970, june 2016.
[13] s. b. paramane, k. joshi, w. v. d. veken and a. sharma, "cfd study on thermal performance of radiators in a power transformer: effect of blowing direction and offset of fans," ieee transactions on power delivery, vol. 29, no. 6, pp. 2596-2604, dec. 2014.
[14] z. radakovic and m. sorgic, "basics of detailed thermal-hydraulic model for thermal design of oil power transformers," ieee trans. on power delivery, vol. 25, no. 2, pp. 790-802, 2010.
[15] https://www.comsol.com/blogs/which-turbulence-model-should-choose-cfd-application/.
[16] i. e. idelchik, handbook of hydraulic resistance, 3rd ed., florida: crc press inc., 1994.
[17] j.c. ramos, m. beiza, j. gastelurrutia, a. rivas, r. anton, g. s. larraona and i. de miguel, "numerical modelling of the natural ventilation of underground transformer substations," applied thermal engineering, vol. 51, no. 1-2, pp. 852-863, march 2013.

facta universitatis
series: electronics and energetics vol. 31, no 1, march 2018, pp. 141-153
https://doi.org/10.2298/fuee1801141j

trade-off between multiple criteria in smart home control system design

aleksandar janjić 1, lazar velimirović 2, miomir stanković 3, vladimir djordjević 4
1 faculty of electronic engineering niš, niš, serbia
2 mathematical institute of the serbian academy of sciences and arts, belgrade, serbia
3 faculty of occupational safety niš, niš, serbia
4 electric power industry of serbia, belgrade, serbia

abstract. the successful automation of a smart home relies on the ability of the smart home control system to organize, process, and analyze different sources of information, according to several criteria. because of the variety of key design criteria that every smart home of the future should meet, the main challenge is the trade-off between them in an uncertain environment. in this paper, a problem of smart home design has been solved using a methodology based on the multiplicative form of multi-attribute utility theory. aggregated functions describing different smart home alternatives are compared using the stochastic dominance principle.
the aggregation of different criteria has been performed through their numerical convolution, unlike the usual approach of pairwise comparison, which allows only the additive form of aggregation of individual criteria. the methodology is illustrated on the smart home controller parameter setting.

key words: maut, decision making, multi criteria analysis, smart home, stochastic dominance

received june 20, 2017; received in revised form september 20, 2017
corresponding author: lazar velimirović, mathematical institute of the serbian academy of sciences and arts, kneza mihaila 36, 11001 belgrade, serbia (e-mail: lazar.velimirovic@mi.sanu.ac.rs)

1. introduction

making a home smart means that residents move around safely and easily, economizing and using resources more efficiently. in order to accomplish these multiple tasks, a smart home must be equipped with technology that observes the residents and provides proactive services. with the increasing availability of inexpensive sensors, communication equipment and embedded processors, smart homes are equipped with a large number of sensors, use the acquired data on the activities and behaviors of their residents and consequently perform appropriate control actions [1]. the successful automation of a smart home relies on the ability of the smart home control system to organize, process, and analyze different sources of information according to different criteria defined by the user. to this end, strong and formal support for multi-criteria decision making is central to the smart home controller design and setting. as far as smart home functionality is concerned, there are at least four major key design requirements that every smart home of the future should meet [2]:
• user-friendliness: a functionality must be comfortable and helpful to (often nontechnical) home occupants.
• intelligence for the most basic and sensible functions (such as turning on lights when coming, and turning them off when leaving home), requiring complex information processing of diverse information sources.
• non-intrusiveness: the ability of the system to operate in the background, not bothering occupants with a proliferation of queries.
• security and its accompanying factor, privacy, which are extremely important for the adoption of any smart home system.
the trade-off between these criteria is necessary on all hierarchical levels of smart home design, selection and operation. we do not know what mix of sensors is optimal for a particular group or individual, or how to appropriately control, summarize and present the collected information to different stakeholders. a series of technical and social challenges need to be addressed before sensor technologies can be successfully integrated according to the occupant's attitude to different criteria. besides the presence of multiple criteria, another challenge facing intelligent building and smart home automation is the great uncertainty due to the stochastic nature of renewable energy sources. in this paper, a methodology for the discrete stochastic multiple criteria decision making problem in smart home system design, with different types of trade-offs among criteria, is applied to the smart home design selection problem. the advantage of this approach is the use of compensatory aggregation, which is more suitable for conflicting criteria and for human aggregation behavior. the proposed methodology is based on the numerical convolution of criteria probability distribution functions, according to different types of criteria aggregation. alternatives are ranked according to the stochastic dominance (sd) rules.
the contribution of this paper is the introduction of a new decision support tool that is better adapted to smart home design faced with uncertainties and with the necessary trade-off between different criteria and different stakeholders. the methodology can be used for various problems in smart home design, including sensor disposition, parameter setting, functionality selection etc. unlike previous multi-criteria approaches, compensatory aggregation adapted to human behavior has been applied. the paper is organized in the following way. after the literature review of the current state of the problem, the methodology for stochastic multi criteria decision making (smcdm) is presented, describing each step of the methodology: definition of the type of criteria aggregation, numerical convolution of aggregated utility probability distributions and the application of sd rules for the ranking of alternatives. the methodology is illustrated on the choice of the smart home control parameter settings and finally, conclusions and further research directions are presented.

2. literature review

generally, a home that is designed according to smart and sustainable home principles has to meet the occupants' needs through all stages of their life. previous work on smart home system design has generally been focused on a specific problem area such as information correlation or hardware [3], [4]. in [5], the authors review sensor technology used in smart homes, focusing on environment and infrastructure mediated sensing. in [6]-[9] smart home technology is a support for people with reduced capabilities due to aging or disability. requirements generated from considerations of social, environmental, and economic issues for highly efficient energy-saving building systems in compliance with building codes and regulations were analyzed in [10], [11].
focusing on a specific design problem, the authors did not take a holistic system and multi-criteria engineering view. in [12], a general controller system design procedure based on evolutionary multiobjective optimisation (emo) is presented, with a comprehensive review of other multiobjective design procedures. an extensive list of requirements for the composition of smart home applications has been provided in [13] and [14], where the requirements are clustered in seven categories, each consisting of three to five requirements:
• simplicity: describing the complexity of application development, involving the interaction between the system and the application developer.
• modeling: requirements that affect the way the smart home applications can be modeled.
• time: the ability to impose timing constraints.
• mobility: including both mobile devices and changes in the system.
• technical requirements for a composition solution.
• security, safety and privacy.
• miscellaneous, containing all requirements that do not match the other categories.
with the diversification of criteria and the increased number of stakeholders engaged in smart home realization, the need for a multiobjective and multicriteria approach emerged. starting from the redesign of building automation systems [15], various applications of multiobjective optimization of control systems were introduced, such as controller adjustment and controller parameter selection [16]. in [17], a fuzzy ahp multicriteria analysis of key performance indicators related to smart grid efficiency, as the key factor of any energy management system implementation, has been carried out. however, in all of the mentioned approaches the multiobjective problem is normalized and converted to a single-objective optimization with a deterministic state of nature concerning the consequences of different alternatives.
although the authors present a multi-criteria decision-making model using the analytic network process to evaluate the lifespan energy efficiency of intelligent buildings, the trade-off between different criteria has not been taken into account in any of the mentioned approaches. as stated before, the stochastic nature of renewable sources integrated in intelligent buildings requires stochastic predictors [15], [18]. however, the authors conclude that the current technology is still not mature enough for cost-effective usage in most real-world scenarios. one prominent stochastic multicriteria methodology, smcdm, is used for selecting alternatives associated with multiple criteria, where the consequences of alternatives with respect to criteria are in the form of random variables. there are three general methods to solve the smcdm problem: 1) outranking methods using confidence indices on alternative pairwise comparisons with respect to each criterion [19], 2) data envelopment analysis [20] and 3) stochastic multi-objective acceptability analysis (smaa) [21]. methods using stochastic processes and sd rules generally include two processes [22], [23]: comparison and selection. the comparison serves to identify whether there exists an sd relation for any pair of alternatives using sd rules, while the selection ranks the alternatives based on the determined sd relations using rough set theory or interactive procedures [24], [25]. in stochastic multi attribute analysis (smaa) or group decision-making analysis, both criterion values and criterion weights are uncertain, but the usage of more complex utility functions together with the correlation between attributes has remained neglected. so far, smcdm problems have been exclusively related to the additive form of utility functions, with evaluations eij taken as utility values.
in [26] a range of simulated problem settings is used to show that using an additive aggregation when preferences actually follow a multiplicative model may often have only minor impacts on results. however, for many decision problems, including the various smart home design phases, estimated parameters are inconsistent with the linear additive case and strongly favor the multiplicative functional form. furthermore, decision makers tend to partially compensate between criteria, instead of trying to satisfy them simultaneously, emphasizing the need for the multiplicative functional form. in [27], a new methodology for multidimensional risk assessment, based on stochastic multiattribute theory, has been presented. this methodology encompasses simultaneously the multi criteria decision problem, the stochastic nature of criteria outcomes and the trade-off between them depending on decision maker preferences, making it a candidate for smart home controller design problems.

3. methodology

the main challenge in smart home control system design is the presence of a great number of different stakeholders, with different and often opposite preferences. for the sake of illustration, suppose that seven persons evaluate different alternatives for indoor temperature setting (e.g. 20 °c) over the set of three criteria: comfort (c1), ecology (c2) and energy costs (c3), on a scale of ten (1 the worst, 10 the best). the evaluations of the i-th alternative are expressed in the form of discrete probability distributions, as shown in table 1.

table 1 evaluation distribution of three criteria for an indoor temperature setting value

| scores | c1 | c2 | c3 |
|---|---|---|---|
| 1 | 0 | 2/7 | 0 |
| 2 | 0 | 0 | 1/7 |
| 3 | 3/7 | 0 | 0 |
| 4 | 0 | 1/7 | 1/7 |
| 5 | 2/7 | 2/7 | 0 |
| 6 | 0 | 1/7 | 3/7 |
| 7 | 0 | 0 | 0 |
| 8 | 1/7 | 1/7 | 1/7 |
| 9 | 0 | 0 | 0 |
| 10 | 1/7 | 0 | 1/7 |

the graphical representation of the corresponding cumulative distribution functions is given in figure 1.
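the cumulative distribution functions follow directly from table 1 by accumulating the probabilities over the scores. the short sketch below does this with exact fractions; the dictionary layout and function names are illustrative choices, not part of the original methodology.

```python
from fractions import Fraction
from itertools import accumulate

N_EVALUATORS = 7
SCORES = list(range(1, 11))  # grades 1 (worst) to 10 (best)

# number of evaluators (out of 7) giving each score 1..10, from table 1
COUNTS = {
    "c1": [0, 0, 3, 0, 2, 0, 0, 1, 0, 1],  # comfort
    "c2": [2, 0, 0, 1, 2, 1, 0, 1, 0, 0],  # ecology
    "c3": [0, 1, 0, 1, 0, 3, 0, 1, 0, 1],  # energy costs
}


def pmf(counts):
    """Discrete probability of each score."""
    return [Fraction(c, N_EVALUATORS) for c in counts]


def cdf(counts):
    """Cumulative probability P(score <= s) for s = 1..10."""
    return list(accumulate(pmf(counts)))


cdfs = {name: cdf(counts) for name, counts in COUNTS.items()}

for name, values in cdfs.items():
    print(name, [str(v) for v in values])
```

each list ends at 1, confirming that the three columns of table 1 are proper probability distributions over the seven evaluators.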
fig. 1 the cumulative distribution functions of three criteria evaluations (curves c1, c2 and c3; horizontal axis: grades, vertical axis: probability)

the problem is how to make a trade-off between these criteria and how to choose the required temperature to satisfy all occupants' preferences. furthermore, on other levels of smart home design or operation, the same problem of multi-criteria decision analysis in the presence of a group of decision makers, or of an uncertain environment, still exists. the methodology proposed in this paper for solving this problem is based on multi-attribute utility theory (maut) and numerical convolution of probability distributions. the reader is referred to [27] for a detailed explanation of the methodology, but the key points are explained below. a decision problem consists of n alternatives denoted by ai, i ∈ {1,...,n}, each evaluated on m criteria denoted by cj, j ∈ {1,...,m}. let eij be the evaluation of ai in terms of criterion cj, according to some suitable performance measure. we focus on decision making situations in which the values of eij for each i are not known with certainty for all j, but follow some distribution function f(eij). this formulation is known as the alternatives, attributes (criteria), evaluators (aae or ace) model. the process of selecting the optimal smart home design is performed in the following steps:
• identification of different alternatives and criteria.
• formation of individual criteria probability distribution functions.
• formation of the aggregated probability distribution by numerical convolution of the marginal probability distributions.
• sd evaluation on the aggregated probability functions.

3.1. criteria aggregation

the following three types of aggregation of criteria are used most commonly in decision making: conjunctive, disjunctive and compensatory.
conjunctive aggregation implies simultaneous satisfaction of all decision criteria, while disjunctive aggregation implies full compensation amongst them. compensatory aggregation is more suitable for human aggregation behavior. among the great number of different compensatory aggregation operators, the multiplicative multi-attribute utility function has proved to be the most suitable for practical engineering applications. it is shown that if the additive independence condition is verified, a multi-attribute comparison of two actions can be decomposed into one-attribute comparisons. if mutual utility independence exists, the multi-attribute utility function is of the following form [28]:

$$u(x_1, x_2, \ldots, x_n) = \frac{1}{k}\left[\prod_{i=1}^{n}\bigl(1 + k\,k_i\,u_i(x_i)\bigr) - 1\right] \qquad (1)$$

here, $u_i(x_i)$ is the single-attribute utility value for attribute i with value $x_i$ (it ranges from 0 to 1), $k_i$ is a parameter from the trade-off for component i, for all i, and $k$ is a normalization constant, ensuring that the utility values are scaled over the component range space between 0 and 1. one method to determine the multiplicative function (1) is to measure each $u_i(x_i)$, determine the $k_i$ values, and find the $k$ value by iteratively solving (2).

$$1 + k = \prod_{i=1}^{n}\bigl(1 + k\,k_i\bigr) \qquad (2)$$

parameter $k$ is related to the parameters $k_i$ as follows:

if $\sum_{i=1}^{n} k_i > 1$, then $-1 < k < 0$, (3)
if $\sum_{i=1}^{n} k_i = 1$, then $k = 0$, and the additive model holds, (4)
if $\sum_{i=1}^{n} k_i < 1$, then $k > 0$. (5)

the overall utility function actually reflects three different types of interactions between individual criteria. in the compensatory case, the performance of one criterion makes up for the lack of performance of other criteria, while in the additive case, it does not interact with the value of the other criteria. in the complementary case, a good performance by one criterion is less important than balanced performance across the criteria.

3.2. smcdm with compensatory aggregation

the main idea of the proposed methodology is to compare different alternatives using a pragmatic aggregation function for combining the single-utility functions from each of the system components. this comparison is possible because of the equivalence of rules for the multivariate utility function $u = u(x_1, x_2, \ldots, x_n)$ and the univariate utility function defined on the multivariate outcome space, $u = u^s(p(x_1, x_2, \ldots, x_n))$. in order to make the ranking of alternatives more practical, the convolution of these probability distributions is proposed, so that only one distribution function per alternative has to be compared. after the new, aggregated probability distribution has been built for every alternative, the ranking of alternatives is performed by the sd rules explained in the appendix. different uncertainty types, like outcomes and weighting factors, can be handled simultaneously by the convolution principle. the four step methodology of alternative ranking is based on the multiplicative utility function as a combination of the suggested criteria and the decision maker's attitude towards risk, on numerical convolution of individual distribution functions and on the sd principle.

3.3. aggregation of utility distribution functions

let $X$ and $Y$ be two independent integer-valued random variables, with distribution functions $f_X$ and $f_Y$ respectively. then the convolution of $f_X$ and $f_Y$ is the distribution function $f_Z$ given by:

$$f_Z(j) = \sum_{k} f_X(k)\, f_Y(j - k) \qquad (6)$$

for $j = -\infty, \ldots, +\infty$. the function $f_Z(j)$ is the distribution function of the random variable $Z = X + Y$. in [29], an efficient algorithm for computing the distributions of sums of discrete random variables is presented. however, the multiplicative form of the utility function requires another type of convolution.
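as a minimal sketch of the convolution (6), the following python function computes the distribution of the sum of two independent discrete variables. it assumes that the distribution functions in (6) are read as probability mass functions stored in lists indexed from 0; the function name `convolve_sum` and the example values are hypothetical.

```python
def convolve_sum(pmf_x, pmf_y):
    """Return the pmf of Z = X + Y for independent X and Y.

    pmf_x[k] is P(X = k) and pmf_y[k] is P(Y = k) for k = 0, 1, 2, ...
    """
    pmf_z = [0.0] * (len(pmf_x) + len(pmf_y) - 1)
    for k, p_x in enumerate(pmf_x):
        for m, p_y in enumerate(pmf_y):
            # contributes to f_Z(j) with j = k + m, i.e. f_X(k) * f_Y(j - k)
            pmf_z[k + m] += p_x * p_y
    return pmf_z


# illustrative values only: X and Y each take the values 0 or 1
pmf_z_example = convolve_sum([0.5, 0.5], [0.25, 0.75])
print(pmf_z_example)
```

the double loop visits every pair of outcomes once, so the cost grows with the product of the two support sizes; this is the additive case only, and a different aggregation rule is needed for the multiplicative utility.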
in the proposed methodology, the computational procedure is extended to different forms of the aggregating function and is sped up by reducing the dimensions of arrays p and z to the number of evaluation grades, according to the following algorithm. for n criteria and m evaluation grades, the dimension of the output array is reduced to m instead of m x n. the discrete convolution algorithm is given below:

input: f(x1,...,xn) – multi-attribute utility function; m – number of evaluation grades; p(xi = j) – probability that variable i takes the value j, j = 1,...,m.
for i = 1 to m
  for j = 1 to m
    …
      for n = 1 to m
        calculate f(x1 = i, x2 = j, ..., xn = n)
        z = integer(f) [discretization of f]
        p(z) = p(z) + [p(x1) · p(x2) · ... · p(xn)]
output: z [dimension m]

the cumulative distribution function of the aggregated random variable $U$ is given by (7).

$$F_U(x) = P(U \le x) = \sum_{u \le x} P(U = u) = \sum_{u \le x} f(u) \qquad (7)$$

the comparison of different cdfs corresponding to the aggregated utility function is now possible with the sd principle. the first step is the formation of the aggregation function based on the suggested criteria and the dm's attitude towards risk. in the second step, using the numerical convolution of the individual criterion probability distribution functions, an aggregated probability distribution is derived. in the third step, using sd rules and sd degree values, a dominance matrix is formed. the final step in this methodology is the alternative ranking based on the results of the dominance matrix. two types of dominance matrices will be used in this methodology: the first one is obtained by the three types of stochastic dominance. using the first, second or third degree stochastic dominance rule, the appropriate type of the dominance matrix is obtained, where the elements of the dominance matrix are defined in the following way:

$$sd_{ij} = \begin{cases} 1, & \text{if } F_{a_i}\ SD_h\ F_{a_j}, \\ 0, & \text{otherwise,} \end{cases} \qquad h = 1, 2, 3.$$
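the nested-loop algorithm above, combined with the multiplicative form (1) and the normalization condition (2), can be sketched in python as follows. this is an illustration under stated assumptions, not the authors' implementation: the single-attribute utilities are assumed linear in the grade, the discretization maps the aggregated utility back to grades 1..m by rounding, the weights are hypothetical, and all function names are introduced only for this example. the input data are the three columns of table 1.

```python
from itertools import accumulate, product
from math import prod


def solve_k(weights, iterations=200):
    """Nonzero root k of 1 + k = prod(1 + k * k_i), found by bisection.

    Assumes at least two weights, each strictly between 0 and 1.
    """
    total = sum(weights)
    if abs(total - 1.0) < 1e-9:
        return 0.0  # additive case, relation (4)

    def g(k):
        return prod(1.0 + k * w for w in weights) - (1.0 + k)

    if total > 1.0:
        lo, hi = -1.0, -1e-6  # relation (3): -1 < k < 0
    else:
        lo, hi = 1e-6, 1.0  # relation (5): k > 0
        while g(hi) <= 0.0:  # widen the bracket until the sign changes
            hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0.0) == (g(lo) > 0.0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def multiplicative_utility(utilities, weights, k):
    """Equation (1); falls back to the additive form when k = 0."""
    if k == 0.0:
        return sum(w * u for w, u in zip(weights, utilities))
    terms = prod(1.0 + k * w * u for w, u in zip(weights, utilities))
    return (terms - 1.0) / k


def aggregate_pmf(pmfs, weights, m):
    """Aggregated pmf over m grades (index 0 corresponds to grade 1)."""
    k = solve_k(weights)
    out = [0.0] * m
    # itertools.product replaces the n nested loops over the grades
    for grades in product(range(1, m + 1), repeat=len(pmfs)):
        p = prod(pmf[g - 1] for pmf, g in zip(pmfs, grades))
        if p == 0.0:
            continue
        # assumption: linear single-attribute utility, grade 1 -> 0, grade m -> 1
        u = multiplicative_utility([(g - 1) / (m - 1) for g in grades], weights, k)
        z = min(m - 1, max(0, int(u * (m - 1) + 0.5)))  # discretization of u
        out[z] += p
    return out


# table 1 columns (c1, c2, c3) as numerators of x/7
TABLE_1 = [
    [0, 0, 3, 0, 2, 0, 0, 1, 0, 1],
    [2, 0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 3, 0, 1, 0, 1],
]
pmfs = [[n / 7 for n in column] for column in TABLE_1]
weights = [0.4, 0.3, 0.5]  # hypothetical trade-off parameters k_i

aggregated = aggregate_pmf(pmfs, weights, m=10)
aggregated_cdf = list(accumulate(aggregated))  # equation (7)
print(aggregated_cdf)
```

the output array has m entries regardless of the number of criteria, which is the dimension reduction described in the text; the loop itself still visits every combination of grades, so the running time grows with the number of criteria.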
the methodology will be illustrated on the example of smart home controller parameter selection concerning four criteria explained in the introductory section.

4. case study

we consider one of many possible smart home functions: blackout prevention for the smart house, where the smart meter measures the real-time power levels of appliances and sends this information to the smart home control system. the control system calculates the remaining available power and sends this information to the appliances, but with a time delay.

table 2. experts' evaluation of alternatives

criterion  score  a1   a2   a3   a4   a5   a6   a7   a8   a9   a10
c1         1      0    0    0    1/7  0    1/7  1/7  1/7  0    0
c1         2      3/7  1/7  0    0    0    0    0    2/7  0    1/7
c1         3      1/7  0    0    0    1/7  0    0    2/7  0    2/7
c1         4      0    2/7  0    0    0    0    0    1/7  0    2/7
c1         5      2/7  1/7  3/7  1/7  0    0    3/7  1/7  2/7  1/7
c1         6      0    2/7  1/7  0    2/7  0    1/7  0    1/7  0
c1         7      1/7  0    1/7  0    2/7  1/7  0    0    3/7  1/7
c1         8      0    1/7  2/7  1/7  0    4/7  1/7  0    1/7  0
c1         9      0    0    0    4/7  2/7  0    0    0    0    0
c1         10     0    0    0    0    0    2/7  1/7  0    0    0
c2         1      0    1/7  1/7  0    0    0    1/7  3/7  0    0
c2         2      2/7  0    0    0    0    0    3/7  3/7  0    1/7
c2         3      1/7  0    0    1/7  0    4/7  1/7  0    1/7  0
c2         4      0    0    0    1/7  0    0    0    1/7  1/7  0
c2         5      2/7  0    0    0    1/7  0    1/7  0    0    0
c2         6      0    1/7  1/7  1/7  2/7  0    1/7  0    1/7  0
c2         7      0    1/7  0    0    1/7  1/7  0    0    4/7  2/7
c2         8      1/7  1/7  2/7  3/7  2/7  2/7  0    0    0    3/7
c2         9      1/7  3/7  1/7  1/7  1/7  0    0    0    0    0
c2         10     0    0    2/7  0    0    0    0    0    0    1/7
c3         1      0    0    1/7  0    1/7  0    0    2/7  0    1/7
c3         2      0    0    0    0    0    0    3/7  1/7  0    2/7
c3         3      1/7  0    0    1/7  0    0    1/7  4/7  1/7  0
c3         4      3/7  0    0    0    0    1/7  1/7  0    2/7  0
c3         5      0    1/7  0    0    0    1/7  2/7  0    2/7  0
c3         6      1/7  0    0    0    0    0    0    0    0    2/7
c3         7      0    1/7  0    1/7  0    0    0    0    2/7  2/7
c3         8      1/7  2/7  0    2/7  3/7  2/7  0    0    0    0
c3         9      1/7  3/7  2/7  1/7  1/7  1/7  0    0    0    0
c3         10     0    0    4/7  2/7  2/7  2/7  0    0    0    0
c4         1      0    1/7  0    1/7  0    0    0    2/7  0    0
c4         2      0    0    0    0    0    0    0    0    1/7  0
c4         3      3/7  0    0    0    0    0    1/7  0    0    0
c4         4      0    0    0    0    0    0    0    1/7  1/7  0
c4         5      2/7  0    0    0    0    1/7  1/7  2/7  0    0
c4         6      0    0    0    0    1/7  1/7  0    1/7  3/7  3/7
c4         7      0    0    1/7  0    1/7  1/7  0    0    0    1/7
c4         8      1/7  2/7  4/7  0    3/7  2/7  3/7  1/7  1/7  1/7
c4         9      0    2/7  0    1/7  1/7  1/7  1/7  0    0    1/7
c4         10     1/7  2/7  2/7  5/7  1/7  1/7  1/7  0    1/7  1/7

let us suppose that we can build 10 alternatives with different combinations of appliances and times for their disconnection, directly affecting all four criteria concerning the smart home functionality requirements. in the problem, the set of ten alternatives is (a1, a2, ..., a10) and the criteria considered include: user friendliness c1, intelligence complexity c2, non-intrusiveness c3 and security c4. suppose that seven persons provide evaluations of the alternatives with respect to the criteria on a scale of ten (1 the worst, 10 the best). the complete table of probability distributions of the experts' evaluations is presented in table 2. a similar problem, which served as a basis for our analysis, is given in [23], [25], [31]. the proposed method is illustrated with the multiplicative utility function of the four existing criteria. using expression (1), the aggregated utility function is obtained with the supposed weighting factors: k1 = 0.5, k2 = 0.2, k3 = 0.57, k4 = 0.09, k = -0.686. applying the numerical convolution of the four criteria probability functions, ten aggregated probability distributions are obtained, represented in figure 2.

fig. 2. aggregated probability distributions for ten different alternatives (curves a1 to a10; horizontal axis: grades, vertical axis: probability)

using the stochastic dominance degree, the dominance matrix (8) is obtained. as explained in the appendix, the premise of calculating the sdd on a pair of alternatives is that there must be an sd relation on the pair of alternatives. the matrix element sdd(i,j) represents the degree of dominance of alternative i over alternative j.

$$SDD = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0.05 & 0.30 & 0 & 0 \\
0.33 & 0 & 0 & 0 & 0 & 0 & 0.37 & 0.53 & 0.24 & 0.32 \\
0.46 & 0.19 & 0 & 0.03 & 0.05 & 0.06 & 0.49 & 0.62 & 0.35 & 0.45 \\
0.44 & 0.16 & 0 & 0 & 0.02 & 0.03 & 0.47 & 0.61 & 0.34 & 0.43 \\
0.43 & 0.14 & 0 & 0 & 0 & 0.01 & 0.46 & 0.60 & 0.31 & 0.42 \\
0.42 & 0.13 & 0 & 0 & 0 & 0 & 0.45 & 0.60 & 0.24 & 0.41 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.26 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.17 & 0 & 0 & 0 & 0 & 0 & 0.22 & 0.42 & 0 & 0.16 \\
0.01 & 0 & 0 & 0 & 0 & 0 & 0.07 & 0.31 & 0 & 0
\end{pmatrix} \qquad (8)$$

as the final step, the ranking of alternatives is performed based on the values from the dominance matrix.

a3 ≻ a4 ≻ a5 ≻ a6 ≻ a2 ≻ a9 ≻ a10 ≻ a1 ≻ a7 ≻ a8 (9)

the power and flexibility of the proposed method are illustrated on the same example, with an additive utility function of the four existing criteria and the criterion weight vector w = [0.09; 0.55; 0.27; 0.09], as proposed in the original example in [23]. the comparison of the alternative ranking obtained in this way with the three already mentioned methods is given in table 3.

table 3. different alternative ranking methods comparison

method               ranking
proposed method      a3 ≻ a5 ≻ a4 ≻ a2 ≻ a6 ≻ a10 ≻ a9 ≻ a1 ≻ a7 ≻ a8
zhang et al.         a3 ≻ a2 ≻ a5 ≻ a4 ≻ a6 ≻ a10 ≻ a9 ≻ a1 ≻ a7 ≻ a8
zaras and martel's   a3, a4 ≻ a2, a5 ≻ a6, a10, a9 ≻ a1, a7 ≻ a8
nowak                a3 ≻ a2 ≻ a4, a5 ≻ a6 ≻ a9, a10 ≻ a1 ≻ a7 ≻ a8

the proposed method gives a ranking close to that of the method of zhang et al. [31], differing only in the position of a2. however, instead of a pairwise comparison of alternatives for each individual criterion, the result is obtained in only three steps explained above. the simulation is performed on an intel(r)xeon(r) cpu e526670 @ 2.90 ghz processor with 32 gb ram. the total time for the simulation was 1.3 s, which indicates the suitability of the method for real-time smart home applications.

5. concluding remarks

proper smart home design depends to a great extent on human judgment. in many practical applications, criteria in different stages of smart home design can be presented as random variables with appropriate discrete probability density functions.
these applications include, but are not limited to, the scheduling of appliances in the presence of stochastic renewable production, control parameter selection and the choice of control strategy in an uncertain environment. in this paper, the problem of optimal design alternative selection has been solved with an enhanced smcdm methodology, based on numerical convolution of criteria probability distribution functions, according to the multiplicative aggregation form. the methodology is based on the multiplicative form of multi-attribute utility theory, which has proved to be suitable for modeling human behavior when facing opposing criteria. the ranking of alternatives is performed by the stochastic dominance degree. because of the variety of key design criteria that every smart home should meet, and the trade-off between them in an uncertain environment, this method proved to be efficient, unlike the usual approach of pairwise comparison, which allows only the additive form of aggregation of individual criteria. in previous methodologies, the decision maker's risk attitude is taken into account only at the individual level of criterion comparison, while here this attitude can be directly incorporated in the model through the different compensatory aggregators. together with the multiple uncertainties of evaluations and weighting factors, the problem of group decision making in smart home applications will be the focus of further research on the possible application of this methodology.

acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia through mathematical institute sasa under grant iii 44006 and grant iii 42006.

references
[1] i. cardei, b. furth, and l. bradely, "design and technologies for implementing a smart educational building: case study", facta universitatis series: electronics and energetics, vol. 29, no. 3, pp. 325–338, 2016.
[2] j. xiao and r. boutaba, "the design and implementation of an energy-smart home in korea", journal of computing science and engineering, vol. 7, no. 3, pp. 204–210, 2013.
[3] j. y. son, j. h. park, k. d. moon, and y. h. lee, "resource aware smart home management system by constructing resource relation graph", ieee transactions on consumer electronics, vol. 57, no. 3, pp. 1112–1119, 2011.
[4] d. m. han and j. h. lim, "smart home energy management system using ieee 802.15.4 and zigbee", ieee transactions on consumer electronics, vol. 56, no. 3, pp. 1403–1410, 2010.
[5] d. ding, r. a. cooper, p. f. pasquina, and l. fici-pasquina, "sensor technology for smart homes", maturitas, vol. 69, no. 2, pp. 131–136, 2011.
[6] d. h. stefanov, z. bien, and w. c. bang, "the smart house for older persons and persons with physical disabilities: structure, technology arrangements, and perspectives", ieee transactions on neural systems and rehabilitation engineering, vol. 12, no. 2, pp. 228–250, 2004.
[7] m. chan, e. campo, d. esteve, and j. fourniols, "smart homes – current features and future perspectives", maturitas, vol. 64, no. 2, pp. 90–96, 2009.
[8] t. gentry, "smart homes for people with neurological disability: state of the art", neuro rehabilitation, vol. 25, no. 3, pp. 209–225, 2009.
[9] g. demiris and b. k. hensel, "technologies for an aging society: a systematic review of smart home applications", imia yearbook of medical informatics, vol. 3, no. 1, pp. 33–40, 2008.
[10] h. alwaera and d. j. clements-croomeb, "key performance indicators (kpis) and priority setting in using the multi-attribute approach for assessing sustainable intelligent buildings", building and environment, vol. 45, no. 4, pp. 799–807, 2010.
[11] z. chen, d. clements-croome, j. hong, h. li, and q. xu, "a multicriteria lifespan energy efficiency approach to intelligent building assessment", energy and buildings, vol. 38, no. 5, pp. 393–409, 2010.
[12] g. reynoso-mesa, x. blasco, j. sanchis, and m. martinez, "controller tuning using evolutionary multiobjective optimisation: current trends and applications", control engineering practice, vol. 28, pp. 58–73, 2014.
[13] b. davidovic and a. labus, "a smart home system based on sensor technology", facta universitatis series: electronics and energetics, vol. 29, no. 3, pp. 451–460, 2016.
[14] c. beckel, h. serfas, e. zeeb, g. moritz, f. golatowski, and d. timmermann, "requirements for smart home applications and realization with ws4d-pipesbox", in proceedings of the 16th conference on emerging technologies & factory automation (etfa), toulouse, france, ieee, 2011.
[15] m. levin, a. andrushevich, a. klapproth, "improvement of building automation system", in proceedings of the international conference on industrial, engineering and other applications of applied intelligent systems iea/aie 2011: modern approaches in applied intelligence, pp. 459–468.
[16] p. stewart, j. c. zavala, and p. fleming, "automotive drive by wire controller design by multi-objective techniques", control engineering practice, vol. 13, no. 2, pp. 257–264, 2005.
[17] janjic, s. savic, g. janackovic, m. stankovic, and l. velimirovic, "multi-criteria assessment of the smart grid efficiency using the fuzzy analytic hierarchy process", facta universitatis series: electronics and energetics, vol. 29, no. 4, pp. 631–646, 2016.
[18] m. prýme, a. horák, l. prokop, s. misak, "smart home modeling with real appliances", in proceedings of the international joint conference soco'13-cisis'13-iceute'13, pp. 369–378.
[19] j. martel and g. d'avignon, "projects ordering with multicriteria analysis", european journal of operational research, vol. 10, no. 1, pp. 56–69, 1982.
[20] d. wu and d. l. olson, "a comparison of stochastic dominance and stochastic dea for vendor evaluation", international journal of production research, vol. 46, no. 8, pp. 2313–2327, 2008.
[21] r. lahdelma and p. salminen, "stochastic multicriteria acceptability analysis using the data envelopment model", european journal of operational research, vol. 170, no. 1, pp. 241–252, 2006.
[22] durbach, "the use of the smaa acceptability index in descriptive decision analysis", european journal of operational research, vol. 196, no. 3, pp. 1229–1237, 2009.
[23] zaras and j. martel, multiattribute analysis based on stochastic dominance, models and experiments in risk and rationality, kluwer academic publishers, dordrecht, 1994, pp. 225–248.
[24] zaras, "rough approximation of a preference relation by a multi-attribute dominance for deterministic, stochastic and fuzzy decision problems", european journal of operational research, vol. 159, no. 1, pp. 196–206, 2004.
[25] nowak, "aspiration level approach in stochastic mcdm problems", european journal of operational research, vol. 177, no. 3, pp. 1626–1640, 2007.
[26] t. stewart, "simplified approaches for multicriteria decision making under uncertainty", journal of multi-criteria decision analysis, vol. 4, no. 4, pp. 246–258, 1995.
[27] janjic, a. andjelkovic, m. docic, "multi-attribute risk assessment using stochastic dominance", international journal of economics and statistics, vol. 1, no. 3, pp. 105–112, 2013.
[28] r. keeney and h. raiffa, decisions with multiple objectives: preferences and value tradeoffs, john wiley & sons, new york, 1976.
[29] r. williamson and t. downs, "probabilistic arithmetic: numerical methods for calculating convolutions and dependency bounds", international journal of approximate reasoning, vol. 4, no. 1, pp. 89–158, 1990.
[30] y. zhang, z. p. fan, and y. liu, "a method based on stochastic dominance degrees for stochastic multiple criteria decision making", computers and industrial engineering, vol. 58, no. 1, pp. 544–552, 2010.
[31] c. c. huang, d. kira, i. vertinsky, "stochastic dominance rules for multi-attribute utility functions", the review of economic studies, vol. 45, no. 3, pp. 611–615, 1978.

appendix

stochastic dominance

in order to determine whether a relation of stochastic dominance holds between two distributions, the distributions are characterized by their cumulative distribution functions, or cdfs. suppose that we consider two distributions a and b, characterized respectively by cdfs $F_A$ and $F_B$. then distribution b dominates distribution a stochastically at first order if, for any argument $y$, $F_A(y) \ge F_B(y)$. the sd rules can be fundamentally classified into two groups for two classes of utility functions. the first group is for increasing concave utility functions and includes first degree stochastic dominance, second degree stochastic dominance and third degree stochastic dominance.
these rules can be applied for modeling risk averse preferences.

definition 1. let $a$ and $b$ ($a < b$) be two real numbers, $X$ and $Y$ be two random variables, and $F(x)$ and $G(x)$ be the cumulative distribution functions of $X$ and $Y$, respectively. let $U_1$ include all the utility functions $u$ for which $u' \ge 0$, $U_2$ include all the functions $u$ for which $u' \ge 0$ and $u'' \le 0$, and $U_3$ include all the functions $u$ for which $u' \ge 0$, $u'' \le 0$ and $u''' \ge 0$. let $E_F$ and $E_G$ be the two expectations or the means, respectively. let $SD_1$, $SD_2$ and $SD_3$ denote first, second and third degree stochastic dominance, respectively. the sd rules are:

$F(x)\ SD_1\ G(x)$ if and only if $E_F(u(X)) \ge E_G(u(Y))$ for all $u \in U_1$ with strict inequality for some $u$, or $F(x) \le G(x)$ for all $x \in [a, b]$ with strict inequality for some $x$;

$F(x)\ SD_2\ G(x)$ if and only if $E_F(u(X)) \ge E_G(u(Y))$ for all $u \in U_2$ with strict inequality for some $u$, or $\int_a^x F(t)\,dt \le \int_a^x G(t)\,dt$ for all $x \in [a, b]$ with strict inequality for some $x$;

$F(x)\ SD_3\ G(x)$ if and only if $E_F(X) \ge E_G(Y)$ and $E_F(u(X)) \ge E_G(u(Y))$ for all $u \in U_3$ with strict inequality for some $u$, or $\int_a^x \int_a^t F(z)\,dz\,dt \le \int_a^x \int_a^t G(z)\,dz\,dt$ for all $x \in [a, b]$ with strict inequality for some $x$.

the second group of sd rules is for increasing convex utility functions and includes first degree stochastic dominance, second inverse stochastic dominance, third inverse stochastic dominance of the first type and third inverse stochastic dominance of the second type. these rules are equivalent to the expected utility maximization rule for risk-seeking preferences.

definition 2. in [30], an sd degree is defined in the following way: if $F(x)\ SD_h\ G(x)$, $h \in \{1, 2, 3\}$, then the stochastic dominance degree sdd of $F(x)\ SD_h\ G(x)$ is given by:

$$sdd\bigl(F(x)\ SD_h\ G(x)\bigr) = \frac{\int_{\Omega} \bigl[G(x) - F(x)\bigr]\,dx}{\int_{\Omega} G(x)\,dx}, \qquad h \in \{1, 2, 3\}, \quad \Omega = \{x \mid x \in [a, b]\}.$$

both sd rules and sd degrees are used in the proposed methodology.
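for discrete grade distributions, the sd rules and the sd degree can be checked numerically. the python sketch below is an illustration only: integrals are replaced by running sums over unit-spaced grades, the sd degree is taken as the sum of g minus f divided by the sum of g over the grades (one reading of definition 2 that should be checked against [30]), and the function names and example distributions are hypothetical.

```python
from itertools import accumulate


def cdf(values):
    """Running sums; applied to a pmf this gives the cdf."""
    return list(accumulate(values))


def mean(pmf):
    """Expected grade, with grades numbered 1, 2, ..., len(pmf)."""
    return sum(grade * p for grade, p in enumerate(pmf, start=1))


def dominates(f_pmf, g_pmf, degree=1, eps=1e-12):
    """True if the distribution f dominates g at the given degree (1, 2 or 3)."""
    f, g = cdf(f_pmf), cdf(g_pmf)
    for _ in range(degree - 1):
        # each extra degree integrates the cdfs once more (discrete analogue)
        f, g = cdf(f), cdf(g)
    if degree == 3 and mean(f_pmf) < mean(g_pmf) - eps:
        return False  # mean condition of the third degree rule
    weak = all(a <= b + eps for a, b in zip(f, g))
    strict = any(a < b - eps for a, b in zip(f, g))
    return weak and strict


def sd_degree(f_pmf, g_pmf):
    """Dominance degree of f over g, or 0.0 when no sd relation holds."""
    if not any(dominates(f_pmf, g_pmf, h) for h in (1, 2, 3)):
        return 0.0
    f, g = cdf(f_pmf), cdf(g_pmf)
    # assumed discrete form of definition 2: sum(G - F) / sum(G)
    return sum(b - a for a, b in zip(f, g)) / sum(g)


def dominance_matrix(pmfs):
    """Matrix whose element (i, j) is the degree of dominance of i over j."""
    n = len(pmfs)
    return [
        [0.0 if i == j else sd_degree(pmfs[i], pmfs[j]) for j in range(n)]
        for i in range(n)
    ]


# illustrative distributions over three grades
better = [0.0, 0.5, 0.5]
worse = [0.5, 0.5, 0.0]
matrix = dominance_matrix([better, worse])
print(matrix)
```

the premise stated in the case study is respected here: a degree is computed only for pairs on which some sd relation holds, and all other matrix elements stay at zero.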
according to [29], the classes $U_i$ ($i = 1, 2, 3$) are identical to the following classes:

$$U_i^{*} = \bigl\{\, u(x_1, x_2, \ldots, x_n) = u^{s}\bigl(p(x_1, x_2, \ldots, x_n)\bigr),\ u^{s} \in U_i \text{ and } p \in U_i \,\bigr\},$$

for each $i = 1, 2, 3$, where $u^{s}$ is a single attribute utility function and $p(x_1, x_2, \ldots, x_n)$ is a multivariate function, and $U_i^{*} = U_i$ for $i = 1, 2, 3$.

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 389-400
https://doi.org/10.2298/fuee1803389k

the concept for the “smart home” controlled by a smartwatch

miloš kosanović, slavimir stošović
college of applied technical sciences in niš, serbia

abstract. in this paper a “smart home” solution is proposed in which power plugs in a remote room can be controlled by a smartwatch, an android mobile device or a php web app. communication between these devices takes place in real time via a server using node.js technology. an electrical circuit for determining current and voltage on the plugs sends the measured values to the server via an arduino wi-fi module, and the electrical energy consumption in each time interval can be determined from these values. all the measured values are stored in a mysql database and used for the creation of appropriate reports. the smartwatch app enables remote plugging and unplugging. in addition, limits can be set for the electrical energy consumption on each plug, as well as for the power of the consumer device that can be plugged in. exceeding the allowed values leads to automatic unplugging.

key words: iot, smart home, smart watch, power consumption

1. introduction

in recent times, the world has seen an exponential rise in the number of devices connected to the internet. in order to automate business processes, but also to improve the comfort of life, computers connected to the internet have become a part of our daily routine.
the concept of connecting embedded computer devices within the existing internet infrastructure is popularly called the internet of things (iot) [1]. that concept should enable connecting different devices, systems and services in ways that go beyond the present communication between two machines. engineers are faced with the task of implementing different protocols and applications for a wide spectrum of devices such as air conditioning devices, washing machines, biochips or wireless sensors. this leads to the development of different systems with wide application possibilities, like context aware systems, ambient assisted living systems, smart homes, smart cities etc. considering the diversity of the devices, there are many challenges that an engineer needs to overcome when designing and implementing such a system.

received september 7, 2017; received in revised form february 12, 2018
corresponding author: miloš kosanović, college of applied technical sciences in niš (e-mail: milos.kosanovic@vtsnis.edu.rs)

main research directions and challenges are thoroughly described in [2]. developing the architecture, designing or choosing hardware and sensors, deciding on the operating system or systems, choosing the communication protocol and integrating “things” into the web by using web services are only some of the problems that we had in mind when we implemented the iot solution described in this paper. one of the challenges is certainly the development of an architectural design that will enable scalability, interoperability and security. as trillions of things (objects) are connected to the internet, it is necessary to have an adequate architecture that permits easy connectivity, control, communications, and useful applications. several solutions have been proposed, like diat, marm, mosden, cloudthings and others [3]. an in-depth analysis of different context aware systems and smart home architectures and technologies can be found in [4] and [5].
another challenge is the design of universal operating systems that would work on different hardware with similar success. one of the most popular and most common operating systems is android, an open source operating system. in recent years, tizen os has also gained popularity, due to its applicability on various devices and its support by the samsung company. several communication protocols are used for communication between iot devices. the most relevant is probably the snmp protocol, which forms part of the ip stack and is universally supported. on the application level, coap (constrained application protocol) and mqtt (message queuing telemetry transport) are often used [6]. in the paper [7] jabeur goes further by explaining that the integration of real world things, or rwts, into the web leads to a more advanced perspective, where these things are abstracted into reusable web services and are not only viewed as simple web pages. this leads to the subset of iot called the web of things, or wot. restful web services are based on representational state transfer (rest) [8], which is lightweight, simple, loosely coupled, flexible and easy to integrate into the web using the http application protocol. from a design perspective, and compared to the traditional client-server architecture, the wot has a flat architecture that should integrate the rwts into the web and make them mutually interoperate and fuse into complex web services. jabeur further introduces artificial intelligence techniques as well as the ideas of social networking into the iot. a smart home represents one context aware iot system. it consists of home appliances, sensors, actuators and data processors and analyzers [9]. home automation of appliances can be either wired or wireless. the idea of a smart home integrates many different aspects of iot, energy efficiency and software engineering.
for example, [10] describes connecting wireless sensors to the internet, [11] describes a platform for a smart learning environment, which enables acquiring data from sensors distributed within a university building, and [12] describes a smart home system based on sensor technology. home devices are usually controlled from other smart devices, such as tablets, mobile phones, smartwatches, smart wristbands, etc. several smart home systems have been proposed based on different architectures and different communication technologies like zigbee [13], bluetooth [14], gsm [15] and wi-fi [16]. as far as we know, the first paper that mentions the possibility of using a smartwatch as a remote-control device in a smart home is [17]. however, that paper was written at a time when smartwatches were not yet widely available and were quite limited in comparison to today's smart devices. the paper [15] describes a system where an android smartphone is used for monitoring a home security system. power monitoring systems in home environments are described in [18], [19]. the first paper uses neither a smartwatch nor a smartphone for monitoring, and does not implement real time communication, as the gui needs to be refreshed so that the data can be shown. the second paper does not use the web and cannot be considered scalable or easy to integrate. in this paper, a smart home solution is described in which an android mobile device, a php server, and a tizen smartwatch app are used to monitor and control house power consumption. all three apps communicate over wi-fi and, regardless of the development technology, have identical functionalities and communicate among themselves via a common nodejs server, using the websocket and http protocols. as a connection between the hardware on one side and the server app on the other, an arduino open source computer platform, based on a simple board with i/o pins and an atmel microcontroller, is used.
this solution continues the development of remote control in the samsung apps lab and the vtš apps team at the college of applied technical sciences in niš, and it extends the functionality of the apps for the lab access control system [20]. it also adds to the work already described in [12] by introducing smartwatch control and a system for monitoring and controlling energy consumption. considering the growing popularity of smartwatches and the specifics of developing apps for them, special focus is put on smartwatch control of the proposed smart home solution. section 2 presents the system architecture, and the subsequent sections describe every part of the system. sections 3 and 4 describe the hardware for measuring electrical energy and voltage on the plugs and the operation of the arduino module, and section 5 describes in detail the server part of the system, the mobile app and the web app.

2. system architecture design

setting up a system for energy consumption control in the home environment requires the following hardware components:
- microcomputer: an arduino board with the wi-fi module, the sparkfun ftdi basic breakout board and the lcd display
- main server for sharing the data between clients; it can be a desktop computer or a service provided by a hosting provider
- smartphone, smartwatch or pc
- wi-fi network and wireless router
- customized power plug

the working principle of the whole system is shown in figure 1. all users of the smart home module, regardless of whether they control it from a smartwatch or an android device, must be registered in the database of lab members. the smartwatch application makes it possible to turn the power plug on and off remotely, to set a timer, and to set limits on energy consumption as well as the maximum power that may be connected to any power plug.
in this way, the watch on the wrist becomes a kind of personal home remote control. besides the smartwatch app, an android mobile application and a php web application with similar functionality were also developed. communication between all these applications happens in real time through the node.js server. the connection between the power plug and the electrical circuits for measuring current and voltage is implemented with an arduino microcontroller board. two power plugs on one power strip were used to test the proposed solution. all measured values are saved in a mysql database and are used for energy consumption and cost calculations. all previously developed modules (the laboratory access control system, the php library of the vtš apps team and the vtš explorer app [2]) use the same database, which is stored on the server. the first step for the administrator is therefore to create a user account by filling out the appropriate web form in the php app. in that way, every member of the lab is given a single username and password for accessing all of the above-mentioned apps (modules).

fig. 1 the block diagram of the system software and hardware architecture

when an account is created, the php app checks whether the username entered for the new user is available. at the same time, the form is validated: mandatory fields are checked, and the data must be entered in a valid format. if the account meets all the requirements and security checks, the app stores it in the database. two plugs are placed in the laboratory and are connected to the arduino microcontroller via the electrical circuit described in the following section. the arduino measures voltage and electrical energy and sends the measured values to the server app via the wi-fi module. systems based on wi-fi have the advantage of using a technology that is present, or will be present, in almost any modern electronic device.
its main feature is the existing wide support, alongside the fact that it is an upper layer protocol which allows communication over the internet without needing a protocol translator [5]. furthermore, a wi-fi network exists today in almost every home with internet access. the server app accepts the measured values and stores them in the database together with the date and time of reception. based on the measured and stored values, a chart of the current power on the plugs is drawn in real time. the server app sends these data to the smartwatch or android device apps via the appropriate service. communication between the server and these devices is two-way, because the devices can send instructions for switching a plug off immediately, as well as the times at which a plug should be switched on or off.

3. plug design

many wireless power plugs are available on the market, with prices ranging from 15 to 50 eur per plug. to use such a plug, the customer needs to install a proprietary mobile app. integration with third-party apps is usually not possible, nor is access to real-time power consumption data. the price, the inaccessible api and the unavailability of real-time power consumption readings are the reasons why we decided to design and construct custom power plugs. the interior of an existing power strip was modified so that the plugs could be physically separated. figure 2 shows the circuit for measuring electrical energy. the calculated value represents the active power of the consumption device and is obtained as the product of the effective (rms) values of the voltage and the current of the receiver. the active power of home appliances is also given in their technical specifications, along with their other characteristics. the information on electrical energy consumption is acquired with a dl-ct1005a current sensor, through which the conductor carrying the alternating current of the consumption device is passed.
that alternating current induces a proportional current in the current sensor's coil. a 39 Ω resistor is connected in parallel with the current sensor; the alternating current flowing through it causes an alternating voltage drop (around 1 v).

fig. 2 a circuit implementation scheme for measuring electrical energy and circuit implementation scheme for relay control

the resulting voltage is amplified with an lm358n operational amplifier to a level at which it can be converted into a direct voltage (figure 2). schottky diodes are used for rectification because they have a lower forward voltage drop than silicon diodes. for comparing voltages at the input of the a/d converter, the arduino uses its internal reference voltage of 1.1 v, so the maximum voltage received from the amplifying and rectifying circuit (for a consumption device drawing 10 a) is 1.1 v. the 5 v supply is not used as the reference because the display and the wi-fi module are connected to it, so it varies by several mv and is not nearly as precise as the reference generated inside the arduino cpu. since the system monitors the power consumption of two independent devices, two current sensors are required (sk1 and sk2 in figure 2), together with amplifiers and rectifiers for both current sources. at the output of the rectifier stage, an rc network (a 10 kΩ resistor and a 47 µf capacitor) serves as a low-pass filter, so a more stable voltage is obtained at the input of the arduino. the arduino filters this signal further by repeating the measurement over a certain time interval and taking the mean. both consumption devices can be switched on and off remotely: the instruction is sent from the mobile app, via the server, to the arduino, which switches the appropriate relay on or off.
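the current-measurement chain described above (1.1 v internal reference, rc low-pass filter, averaging of repeated readings) can be illustrated with a short python sketch. it is not the authors' arduino firmware. it assumes a 10-bit a/d converter, a linear mapping in which 1.1 v corresponds to 10 a, and a resistive load for which active power is simply the product of the rms voltage and current; all function names are hypothetical.

```python
import math
from statistics import mean

ADC_MAX = 1023        # assumption: 10-bit converter, full-scale count
V_REF = 1.1           # internal reference voltage in volts (from the text)
I_FULL_SCALE = 10.0   # amperes that correspond to V_REF (from the text)


def adc_to_volts(count):
    """Convert a raw ADC count to the voltage at the analog input."""
    return count / ADC_MAX * V_REF


def adc_to_amps(count):
    """Map the input voltage linearly to load current (assumed linear chain)."""
    return adc_to_volts(count) / V_REF * I_FULL_SCALE


def averaged_current(counts):
    """Software filtering: mean of repeated readings taken over an interval."""
    return mean(adc_to_amps(c) for c in counts)


def rc_cutoff_hz(r_ohm, c_farad):
    """Cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def active_power(u_rms, i_rms):
    """Active power in watts, assuming a purely resistive load."""
    return u_rms * i_rms


if __name__ == "__main__":
    readings = [510, 512, 514, 511, 513]   # hypothetical raw ADC counts
    current = averaged_current(readings)
    print(f"current: {current:.2f} A")
    print(f"power at 225 V: {active_power(225.0, current):.0f} W")
    print(f"RC cutoff: {rc_cutoff_hz(10e3, 47e-6):.3f} Hz")
```

with the 10 kΩ and 47 µf values from the text, the cutoff frequency is far below the 50 hz mains frequency, which is why the rectified signal reaches the a/d input as a nearly constant voltage.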
a consumption device and a relay are connected in series to the mains voltage, and the relay's coil is driven by a transistor switch built with a pn2222a npn transistor (figure 2). when the arduino's digital pin drives the base through a 510 Ω resistor, the transistor saturates and the consumption device is switched on. diodes suppress the back electromotive force that occurs when the relay is switched off. the circuit for measuring voltage is shown in figure 3. the mains voltage is the same on both consumption devices, so only one readout circuit is necessary. the logic is galvanically isolated from the mains voltage by a transformer that delivers 12 v on its secondary. a graetz bridge, built with schottky diodes because of their lower losses, generates the dc voltage. this voltage is too high to be fed directly to an analog input of the arduino, so a voltage divider is formed with 100 kΩ and 1 kΩ resistors. a 2.2 kΩ potentiometer, connected in series with the 1 kΩ resistor, is used to fine-tune the output voltage (around 800 mv for a mains voltage of 225 v). at the output, an rc network consisting of a 10 kΩ resistor and a 10 µf capacitor serves as a low-pass filter.

fig. 3 a circuit implementation scheme for measuring voltage

4. arduino and server communication module

the arduino microcontroller board is an open source development platform based on an 8-bit atmel avr or a 32-bit atmel arm microcontroller. there are several models of arduino boards with different features, but all boards contain standard microcontroller components such as a crystal oscillator, which sets the period of the processor clock, and 5 v and 3.3 v voltage regulators. for this project an arduino pro mini board with an atmega 328 microcontroller and an esp8266 esp-01 wi-fi module were used; they are shown in figure 4. the final look of the arduino module and the power plug is shown in figure 5.

fig.
4 a) arduino pro mini, b) wi-fi module esp8266 esp-01, c) ftdi cable, d) sparkfun ftdi basic breakout board

the development environment needed for programming the board is completely free and open source, and can be downloaded for the mac, windows or linux operating systems. the processing power of the arduino is modest, as it is built to be cheap and accessible, so the code must be optimized for the system to work without noticeable latency. this is extremely important in a real-time system. in this project, the arduino has several responsibilities:
1. it measures the electrical current and voltage for each power plug, as described in section 3;
2. it writes the measured values on the lcd display;
3. it communicates with the server by sending the measured values via wi-fi;
4. it receives from the server the power consumption limits for each power plug.

communication between the arduino and the php server is implemented with the esp8266 esp-01 wireless module (figure 4b). the module supports the 802.11 b/g/n standard and the tcp/ip protocol stack. its possible operating modes are station, access point, and both combined. it is also possible to set a static ip address or port number and to specify an explicit web address, as the module has the dns capability to translate it into an ip address. the arduino pro mini has 14 digital input/output pins, six of which can be used as pwm (pulse width modulation) channels. it also has six analog inputs and a reset button. this platform is categorized as an entry-level model and is designed for use where small board dimensions are required. for this reason the board does not have a usb connector. it therefore cannot be connected to a personal computer directly with a usb cable, which leaves two other possibilities:
1.
direct connection via an ftdi (future technology devices international) cable (figure 4c), which is used with the 16 mhz arduino pro mini module with a 5 v power source;
2. connection through an additional sparkfun ftdi basic breakout board (figure 4d), which is used with the 8 mhz arduino pro mini module with a 3.3 v power source. we decided to use this solution.

each of the 14 available pins on the arduino board can be used as an input or an output; the maximum current per pin is limited to 40 ma. additionally, the rx (receive) and tx (transmit) pins are available for connecting a bluetooth or wireless module, and they can also be used for uart ttl serial communication.

5. implementation of the server, web and mobile application

the server application is implemented with node.js and php. node.js is a server-side javascript environment based on the google chrome v8 engine and is mainly used for implementing fast, simple, scalable network applications. it is very efficient for developing real-time applications, distributed applications, and applications that need a large number of data transactions or http requests, or full-duplex communication over the websocket protocol. communication between the arduino module in the power plug and the smartwatch or the android device takes place in real time through web sockets, which are implemented in node.js with the socket.io library. the arduino monitors the current and voltage of the consumption device connected to the power plug and forwards the measured values to the node.js server in real time. node.js saves the measured values in the mysql database and forwards them to the client applications: the smartwatch, android and web applications. with any change in the state of the plug, the current or voltage values are immediately synchronized on all the client applications. all client apps communicate with the server via wi-fi.
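the synchronization pattern described above, in which one server keeps the shared plug state and forwards every change to all connected clients, can be sketched in a few lines. the real system uses node.js with socket.io over web sockets; the python class below is only an in-memory illustration of the same broadcast idea, and the names `Hub`, `connect` and `publish` as well as the state keys are hypothetical.

```python
class Hub:
    """In-memory stand-in for the server that relays state to all clients."""

    def __init__(self):
        self._clients = {}   # client name -> callback taking a state snapshot
        self.state = {}      # last known state of the plugs

    def connect(self, name, on_update):
        """Register a client and send it the current state immediately."""
        self._clients[name] = on_update
        on_update(dict(self.state))

    def publish(self, sender, change):
        """Apply a change from any client or from the plug, then broadcast it."""
        self.state.update(change)
        for callback in self._clients.values():
            callback(dict(self.state))


if __name__ == "__main__":
    hub = Hub()
    hub.connect("smartwatch", lambda s: print("smartwatch sees", s))
    hub.connect("android", lambda s: print("android sees", s))
    # a measurement arriving from the plug reaches every client at once
    hub.publish("arduino", {"plug1_power_w": 450})
    # a command sent from the watch is also mirrored on the phone
    hub.publish("smartwatch", {"plug1": "off"})
```

every client receives a full snapshot after each change, which keeps the sketch simple; a real deployment would more likely send only the changed values and persist them in the database first.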
when any change occurs, on the server or the client side, the data are sent to the server, which forwards the information to all connected applications, and these immediately update the ui.

fig. 5 the image of arduino module and the custom power plug

fig. 6 design and functionalities of the web application

the web application was developed in the php programming language with the mysql database and has the same functionality as the android and smartwatch apps; it is shown in figure 6. the android application was developed in android studio.

6. implementation of the smartwatch application

today's mobile device market has many different smartwatch manufacturers and many different operating systems that power their devices. the application for smart home control in this paper was developed for the samsung gear s smartwatch, which runs the tizen operating system. tizen is an open and flexible operating system developed to run on different devices such as mobile phones, cars, bracelets, watches, television sets and other devices. it was made by an open source software development community and is still open to anyone who wants to contribute to the project. there are different versions of the system, such as tizen mobile, tizen tv and tizen wearable, and they are all compatible with the html5 standard. because of the small screen size and low resolution of the watch (360x480), designing the user interface (ui) is challenging. the ui of this type of app consists of several screens. since the watch has only one physical button, gestures must be used for navigation between screens. swiping from the top of the application to the bottom closes the application. swiping in the other direction, from the bottom to the top, returns the user to the previous screen.
when the physical button is pressed, the app continues to run in the background, so when it is activated again, the user can continue where they left off.

fig. 7 the smartwatch application screenshots annotated as 7a, 7b, 7c and 7d

the smartwatch can turn the power plug on and off autonomously by using timers. at the bottom of the screen, the next scheduled task is shown with its scheduled time. to set a timer, the user taps this task and the screen in figure 7b is shown. on this screen, the user can see all tasks scheduled for the future, sort them, delete them, and add new ones. tapping the green button at the bottom opens a new screen for adding a task (figure 7c). to schedule a new task, the user needs to choose the power plug, the type of action, and the time and date. tapping the middle of the application, on the part that shows the current power of the consumption device, opens the screen with the consumption data for that plug (figure 7d). the data are sorted according to the current tariff system in serbia (blue, green and red tariff). the user can also see the price in serbian dinars. figure 8 shows the measured power consumption on a monthly scale for the two power plugs used in the laboratory.

fig. 8 power consumption report

7. conclusion

in this paper, we proposed and described an architecture model based on wi-fi that controls and monitors power consumption from different devices. compared with a standard home, such a system has multiple advantages, such as better control of home appliances, convenience, better awareness of power consumption, and energy efficiency. our solution differs from other solutions proposed in the scientific literature by implementing a smartwatch application that can monitor and control power plugs in real time. all the controlling devices exchange data and synchronize in real time.
the custom power plug was designed with an arduino board and wi-fi connectivity. the plugs can be controlled through the server as web services that can be accessed from any web-capable device and are integrated into the “web of things”. finally, rich and interactive tizen, android and php applications were developed, which access these web services and are used to control devices and show the monitored data in a user-friendly way. the system was constructed from widely available hardware components (arduino, wi-fi module, smartwatch) and from some that were custom-made (power plug). the total price of all the components is not more than 40 euros. real-world application in the home environment is possible for selected devices, as it does not require changes to the electrical installation. however, it is not recommended for all wall power plugs, as better solutions exist: the electrical installation can be changed so that all plugs connect to one microcomputer, or power consumption can be measured at the main circuit breaker. the focus when building the system was on accessibility, usability, openness, scalability, easy integration and real-time data exchange; other aspects such as security, robustness and system optimization were also considered but were not the prime focus of the research. verbose data caused by protocol overheads increase the power consumed by the device and may make it less efficient. emerging protocols such as coap can be considered to provide better performance. in practical tests the smart house system showed easy control, good stability and scalability, and easy integration into larger systems, so it can serve as a reference for the future design of smart home hardware and architecture.
special attention in this paper has been given to software development, especially to the smartwatch application, because of the popularity and the innovative concept that this technology represents. other controlling devices and other sensors can easily be integrated into the proposed system, and this is supported by the software architecture. additional work can be done on integrating a smart tv as a kind of smart house hub that can monitor and control other house devices. furthermore, intelligent systems that detect and learn about our behavior and help enrich and simplify our daily routines are becoming more common. artificial intelligence and machine learning techniques can be used to optimize and personalize energy consumption and to adapt device usage to personal preferences. the proposed system can simplify the collection of the data needed for training such systems.

acknowledgement: the authors would like to mention that the software and hardware were implemented by the members of the vtš apps team in the college of applied sciences in niš. we are grateful to miloš milić, miloš segić, dimitrije dimitrijević and all the members of the vtš team who were involved and have contributed significantly to this paper.

references
[1] o. vermesan, p. friess, internet of things: converging technologies for smart environments and integrated ecosystems, river publishers, 2013.
[2] john a. stankovic, "research directions for the internet of things," ieee internet of things journal, vol. 1, pp. 3-9, 2014.
[3] abhirup khanna, "an architectural design for cloud of things," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 357-365, september 2016.
[4] charith perera, arkady zaslavsky, peter christen, dimitrios georgakopoulos, "context aware computing for the internet of things: a survey," ieee communications surveys & tutorials, vol. 16, no. 1, pp. 414-454, 2014.
[5] gabriele lobaccaro, salvatore carlucci, erica löfström, "a review of systems and technologies for smart homes and smart grids," energies, 2016.
[6] m. savic, "bridging the snmp gap: simple network monitoring the internet of things," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 475-487, september 2016.
[7] nafaâ jabeur, hedi haddad, "from intelligent web of things to social web of thing," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 367-381, september 2016.
[8] r. t. fielding, architectural styles and the design of network-based software architectures, ph.d. dissertation, university of california, irvine, 2000.
[9] h. ghayvat, s. mukhopadhyay, x. gui, n. suryadevara, "wsn- and iot-based smart homes and their extension to smart buildings," sensors, vol. 15, pp. 10350-10379, 2015.
[10] mirko r. kosanović, mile k. stojčev, "connecting wireless sensor networks to internet," facta universitatis, series: mechanical engineering, vol. 9, no. 2, pp. 169-182, 2011.
[11] konstantin simić, marijana despotović-zrakić, živko bojović, branislav jovanić, đorđe knežević, "a platform for a smart learning environment," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 407-417, september 2016.
[12] boban davidović, aleksandra labus, "a smart home system based on sensor technology," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 451-460, september 2016.
[13] xian-jun yi, min zhou, jian liu, "design of smart home control system by internet of things based on zigbee," in proceedings of the 2016 ieee 11th conference on industrial electronics and applications (iciea), hefei, 2016, pp. 128-133.
[14] z. yufeng and j.
ruqiao, "design and realization of the smart home control system based on the bluetooth," in proceedings of the 2015 international conference on intelligent transportation, big data and smart city, halong bay, 2015, pp. 286-289.
[15] s. morsalin, a. m. j. islam, g. r. rahat, s. r. h. pidim, a. rahman and m. a. b. siddiqe, "machine-to-machine communication based smart home security system by nfc, fingerprint, and pir sensor with mobile android application," in proceedings of the 3rd international conference on electrical engineering and information communication technology (iceeict), dhaka, 2016.
[16] r. k. kodali, s. soratkal and l. boppana, "iot based control of appliances," in proceedings of the 2016 international conference on computing, communication and automation (iccca), noida, 2016, pp. 1293-1297.
[17] l. de russis, d. bonino, f. corno, "the smart home controller on your wrist," homesys workshop, 2013.
[18] edwin chobot, daniel newby, renee chandler, nusaybah abu-mulaweh, chao chen, carlos pomalaza-ráez, "design and implementation of a wireless sensor and actuator network for energy measurement and control at home," international journal of embedded systems and applications (ijesa), vol. 3, no. 1, 2013.
[19] chia-hung lien, hsien-chung chen, ying-wen bai, and ming-bo lin, "power monitoring and control for electric home appliances based on power line communication," in proceedings of the i²mtc 2008 – ieee international instrumentation and measurement technology conference.
[20] n. živković, m. milojević, n. nikolić, b. majkić, s. stošović, "system for access and working time control implemented with raspberry pi platform," in proceedings of the ieeestec conference, niš, serbia, 2014, pp. 201-206.

facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp.
103-111 doi: 10.2298/fuee1501103g

thomas-fermi method for computing the electron spectrum and wave functions of highly doped quantum wires in n-si

volodymyr grimalsky, outmane oubram, svetlana koshevaya, christian castrejon-martinez
ciicap and the faculty of sciences on chemistry and engineering, autonomous university of state morelos (uaem), av. universidad 1001, 62209 cuernavaca, mor., mexico

abstract. the application of the thomas-fermi method to calculating the electron spectrum in quantum wells formed by highly doped n-si quantum wires is presented for finite temperatures, with many-body effects such as exchange taken into account. the electron potential energy is first calculated from a single equation. then the electron energy sub-levels and the wave functions within the potential well are computed from the schrödinger equation. for axially symmetric wave functions the shooting method has been used. two methods have been applied to solve the schrödinger equation in the case of the anisotropic effective electron mass: the variation method and the iteration procedure for the eigenvectors of the hamiltonian matrix.

key words: quantum wires, thomas-fermi method, exchange, wave functions, variation method, inverse matrix iterations

1. introduction

investigations of the electron spectrum of low-dimensional and highly doped structures are central to many nanotechnology applications [1,2]. quantum devices based on silicon have recently been the subject of concentrated interest, both experimental and theoretical, including recent proposals on quantum computing [1-4]. the infrared transitions between the electron sub-levels within δ-doped quantum wells are promising for use in optoelectronics, especially for infrared modulators, detectors and lasers [5,6]. the electron spectrum of δ-doped quantum wells can be calculated by solving the schrödinger equation jointly with the poisson equation (sp) [5,6].
received july 8, 2014; received in revised form november 22, 2014
corresponding author: volodymyr grimalsky, ciicap and the faculty of sciences on chemistry and engineering, autonomous university of state morelos (uaem), av. universidad 1001, 62209 cuernavaca, mor., mexico (e-mail: v_grim@yahoo.com)

there are several difficulties in simulating quantum structures in silicon, namely the anisotropy of the effective electron mass and the slow convergence of the sp method in the case of an arbitrary initial approximation. the investigation of δ-doped quantum structures is possible with a simpler approach based on the statistical thomas-fermi (tf) method [6-8]. its advantage is the separation of the complex problem into sequential ones, where the wave functions are computed after the potential energy has been found. in the case of axially symmetric high doping, the electron potential energy depends on the radius only. a combined method can also be applied, in which the final result of the tf simulation is used as the starting approximation for sp [9]. moreover, a comparison between the sp and tf methods shows that the simple tf method gives a good approximation for the electron energy sub-levels and the total electron concentration within δ-doped quantum wells [9]. in this paper the application of the tf method to calculating the electron spectrum in the quantum wells formed by highly doped n-si quantum wires is presented for finite temperatures $T$, and many-body effects such as exchange are taken into account [5,7]. the electron potential energy and the total electron concentration are calculated from a single equation solved by the newton method. then the wave functions and the values of the electron energy sub-levels are computed from the schrödinger equation, where two possible orientations of electron valleys are considered.
the peculiarities of solving the schrödinger equation in the case of the anisotropic electron mass are pointed out.

2. basic equations

consider a single δ-doped electron quantum wire within n-si. below, atomic units are used for distances, $a_0^* = \varepsilon\hbar^2/(m_c e^2) \approx 0.52$ nm, and for energy, $Ry^* = e^2/(2\varepsilon a_0^*) \approx 0.12$ ev, where $m_c = \nu^{2/3}(m_\perp^2 m_\parallel)^{1/3} \approx 1.06\,m_e \approx 10^{-27}$ g and $\nu = 6$ is the number of the lowest electron valleys in si. in n-si the lowest valleys are lateral and the effective mass is anisotropic: $m_\parallel$, $m_\perp$. with non-dimensional variables the basic equation of the tf method for the δ-doped electron quantum wire is [8]:

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dV}{dr}\right) = -8\pi\left\{ n[V] - N_{D0}\left(2\exp\left(\frac{\mu - E_D - V}{T}\right) + 1\right)^{-1} - N_{1D}(r)\right\}; \quad \frac{dV}{dr}(r=0) = 0, \quad V(r\to\infty) = 0; \tag{1}$$

where

$$n[V] = \frac{T^{3/2}}{4\pi^{3/2}}\,F_{1/2}\left(\frac{\mu - V - V_x}{T}\right); \quad V_x = V_x(n),\ n > n_c; \quad V_x = 0,\ n \le n_c;$$
$$F_{1/2}(v) = \frac{2}{\pi^{1/2}}\int_0^{\infty}\frac{u^{1/2}\,du}{1 + \exp(u - v)}; \quad N_{1D}(r) = N_{1D0}\exp\left(-(r/r_0)^2\right); \quad n_c = 10^{17}\ \text{cm}^{-3}. \tag{2}$$

here $V(r)$ is the electrostatic electron energy, $n$ is the total electron concentration; $N_{1D}$ and $N_{D0}$ are the 1d and 3d donor concentrations, respectively. $V_x$ is the many-body correction to the electron energy due to the exchange [7]. eq. (1) is the poisson equation for the electrostatic electron energy, where the electron concentration $n[V]$ is calculated from the equilibrium statistical fermi distribution. note that at the 1d donor concentrations $N_{1D0} \ge 10^{21}$ cm$^{-3}$ the exchange energy $V_x$ is comparable with the electrostatic one. the donor levels are assumed shallow and single charged. the concentration of 1d donors is high, $N_{1D0} \ge 10^{20}$ cm$^{-3}$; they are fully ionized. the 1d doping is localized at the distances $r \sim r_0 \approx 1$-$5$ nm.
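the quantities that enter the tf equation can be evaluated numerically with very little code. the sketch below is an illustration, not the authors' program: it computes the fermi integral $F_{1/2}$ in the normalization $\frac{2}{\sqrt{\pi}}\int_0^\infty u^{1/2}(1+e^{u-v})^{-1}du$ by the midpoint rule, the gaussian doping profile, and the electron concentration in dimensionless units. the prefactor $T^{3/2}/(4\pi^{3/2})$ is an assumption that follows from the effective atomic units with $m_c$ taken as the density-of-states mass, and the function names and integration parameters are hypothetical.

```python
import math


def fermi_half(v, u_max=60.0, steps=6000):
    """F_1/2(v) = (2/sqrt(pi)) * integral_0^inf sqrt(u) / (1 + exp(u - v)) du.

    Midpoint rule on [0, max(v, 0) + u_max]; the integrand decays
    exponentially beyond u ~ v, so the truncated tail is negligible.
    """
    upper = max(v, 0.0) + u_max
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.sqrt(u) / (1.0 + math.exp(u - v))
    return 2.0 / math.sqrt(math.pi) * total * h


def doping_profile(r, n_1d0, r0):
    """Gaussian 1D doping profile N_1D(r) = N_1D0 * exp(-(r / r0)^2)."""
    return n_1d0 * math.exp(-((r / r0) ** 2))


def electron_concentration(mu, v, t, v_x=0.0):
    """Dimensionless n[V]; the prefactor is an assumption stated above."""
    return t ** 1.5 / (4.0 * math.pi ** 1.5) * fermi_half((mu - v - v_x) / t)


if __name__ == "__main__":
    # non-degenerate limit: F_1/2(v) approaches exp(v) for v << 0
    print(fermi_half(-5.0), math.exp(-5.0))
    # degenerate limit: F_1/2(v) approaches 4 v^(3/2) / (3 sqrt(pi)) for v >> 1
    print(fermi_half(30.0), 4.0 * 30.0 ** 1.5 / (3.0 * math.sqrt(math.pi)))
```

the two printed pairs check the routine against the known non-degenerate and degenerate limits of the fermi integral, which is a cheap sanity test before the function is used inside an iterative solver.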
the finite size of the highly doped region $r_0$ is considered, because the distance unit $a_0^*$ in silicon is comparable with the size of the lattice cell. moreover, 1d doping cannot be approximated by the δ-function directly, because this approximation leads to the logarithmic singularity of the electron potential energy at $r \sim 0$. the results of simulations do not depend on the value of the critical electron concentration when $n_c \le 10^{18}$ cm$^{-3}$. the position of the fermi level $\mu$ has been obtained from the condition of the total neutrality of the semiconductor [6]:

$$n[V=0] = N_{D0}\left(2\exp\left(\frac{\mu - E_D}{T}\right) + 1\right)^{-1} \tag{3}$$

here $E_D$ is the donor energy with respect to the bottom of the conduction band. eq. (1) has been solved by the newton method [8]. with $N_D^+[V] \equiv N_{D0}\left(2\exp\left((\mu - E_D - V)/T\right) + 1\right)^{-1}$ denoting the concentration of ionized volume donors that enters eq. (1), the iterations are:

$$V^{s+1} = V^s + q\,\varphi; \quad \frac{1}{r}\frac{d}{dr}\left(r\frac{d\varphi}{dr}\right) + 8\pi\left(\frac{\partial n}{\partial V} - \frac{\partial N_D^+}{\partial V}\right)\varphi = -\left\{\frac{1}{r}\frac{d}{dr}\left(r\frac{dV^s}{dr}\right) + 8\pi\left\{n[V^s] - N_D^+[V^s] - N_{1D}(r)\right\}\right\};$$
$$\frac{d\varphi}{dr}\Big|_{r=0} = 0, \quad \varphi\big|_{r=R} = 0. \tag{4}$$

note that in the derivative $(\partial n/\partial V)$ the exchange correction $V_x$ does not vary. in the boundary conditions the parameter $R$ is a sufficiently large radius. in eq. (4) the parameter $q \le 1$ is chosen to provide better convergence [9]. the rapid convergence of the method has been demonstrated, even when the exchange energy has been taken into account. after calculation of the electron potential energy $V(r)$, the energy sub-levels $E_j$, the wave functions (wf) $\psi_j(x,y)$ of the discrete spectrum of the well, and then the electron concentration $n$ in each electron sub-level within the quantum well have been computed from the following schrödinger equations:

$$\hat H_1\psi_j^{(1)} \equiv -\frac{m_c}{m_\perp}\left(\frac{\partial^2\psi_j^{(1)}}{\partial x^2} + \frac{\partial^2\psi_j^{(1)}}{\partial y^2}\right) + \left[V(r) + V_x\right]\psi_j^{(1)} = E_j^{(1)}\psi_j^{(1)}; \tag{5a}$$

$$\hat H_2\psi_j^{(2)} \equiv -\frac{m_c}{m_\perp}\frac{\partial^2\psi_j^{(2)}}{\partial x^2} - \frac{m_c}{m_\parallel}\frac{\partial^2\psi_j^{(2)}}{\partial y^2} + \left[V(r) + V_x\right]\psi_j^{(2)} = E_j^{(2)}\psi_j^{(2)}; \tag{5b}$$

$$\iint \psi_j^2\,dx\,dy = 1; \quad r = (x^2 + y^2)^{1/2}.$$

wf can be chosen as real, because the hamiltonians are real. there are two different orientations of electron valleys in silicon, as seen from eqs. (5). namely, eq.
(5a) is for the isotropic case of the effective mass components, eq. (5b) is for the anisotropic case.

3. simulations of potential energy and electron concentration

the results of simulations of the electron potential energy v(r) and the electron concentration n(r) are presented in figs. 1 and 2. for all cases the volume doping is nd0 = 1·10^16 cm^-3 and r0 = 2a0* ≈ 1.04 nm. the previous simulations [10] demonstrated that the exchange correction is important for the doping levels n1d ≥ 10^21 cm^-3. nevertheless, the total electron potential energy w = v + vx and the electron concentration n are practically the same as without this many-body correction. the potential energy depends on the temperature t, as seen in fig. 2. this is due to the partial ionization of the volume donors nd0 at low temperatures, as seen from eq. (3). the electron concentration n, in contrast, practically does not depend on t; some difference appears only at the periphery r >> r0.

fig. 1 part a) is the dependence of the electron potential energy jointly with the exchange energy, v + vx, on the radius r. part b) is the dependence of the total electron concentration n(r). the values of the maximum doping are n1d0 = 3·10^21 cm^-3 (curve 1), 10^21 cm^-3 (curve 2), 3·10^20 cm^-3 (curve 3), 5·10^19 cm^-3 (curve 4); t = 300 k, r0 = 2a0* ≈ 1.04 nm. the corresponding exchange energies vx are also presented in the upper part of part a).

fig. 2 part a) is the dependence of the electron potential energy jointly with the exchange energy, v + vx, on the radius r for different temperatures t. part b) is the dependence of the electron concentration for different temperatures; part c) is the same as b) in detail. curve 1 is for t = 300 k, 2 is for 200 k, 3 is for 150 k, 4 is for 100 k, 5 is for 50 k, 6 is for 20 k. the maximum doping is n1d0 = 3·10^21 cm^-3.

4.
wave functions and energy sub-levels

after calculating the electron potential energy it is possible to simulate the electron energy sub-levels in the quantum well and the corresponding wf. to compute wf for the isotropic case (5a), where the effective masses are the same, the shooting method has been applied [11]. the axially symmetric wf ψ(r) for the isotropic case, eq. (5a), are presented in fig. 3, a. the maximum doping level is n1d0 = 3·10^21 cm^-3, t = 300 k, as in fig. 1, a, curve 1. the dependence of the two lowest electron energy sub-levels e1,2 on the maximum doping is presented in fig. 3, b, for t = 300 k. the dependence of the electron energy sub-levels on temperature is given in fig. 3, c. one can see that the difference e2 – e1 depends on the temperature t there.

fig. 3 part a) is the axially symmetric wf for the case of isotropic effective mass; part b) is the dependence of the energy sub-levels on the maximum doping, t = 300 k; part c) is the dependence of the energy sub-levels on the temperature t for the maximum doping n1d0 = 3·10^21 cm^-3.

the anisotropic case, eq. (5b), with different effective masses has been solved by two simple methods, which are realized in the cartesian coordinate frame xoy. the first one is the variation method [12]. namely, the problem of the minimization of the functional of the electron energy is considered:

$$e = \min_{\psi} \frac{\iint \psi\, \hat{h}_2 \psi \, dx\, dy}{\iint \psi^{2} \, dx\, dy} \tag{6}$$

wf possesses different types of symmetry or antisymmetry in the plane xoy, due to the symmetry of the hamiltonian, eq. (5b). the probing functions for the symmetric case ψ(±x, ±y) = ψ(x, y) are chosen as:

$$\psi_1 = \exp\left(-\left(\frac{x}{x_{01}}\right)^{2} - \left(\frac{y}{y_{01}}\right)^{2}\right); \quad \psi_2 = \exp\left(-\left(\frac{x}{x_{02}}\right)^{2} - \left(\frac{y}{y_{02}}\right)^{2}\right)\left(1 + a\left(\frac{x}{x_{02}}\right)^{2} + b\left(\frac{y}{y_{02}}\right)^{2}\right) \tag{7}$$

in the case of antisymmetry with respect to x, ψ(−x, ±y) = −ψ(x, y), the probing functions are:

$$\psi_1 = x \exp\left(-\left(\frac{x}{x_{01}}\right)^{2} - \left(\frac{y}{y_{01}}\right)^{2}\right); \quad \psi_2 = x \exp\left(-\left(\frac{x}{x_{02}}\right)^{2} - \left(\frac{y}{y_{02}}\right)^{2}\right)\left(1 + a\left(\frac{x}{x_{02}}\right)^{2} + b\left(\frac{y}{y_{02}}\right)^{2}\right) \tag{8}$$

analogously, it is possible to write down the probing functions for other types of symmetry or antisymmetry, i.e. with the multipliers y or xy. therefore, for the lowest wf there are two variation parameters x01, y01. for the second wf there are 3 independent variation parameters, because of the imposed orthogonality relation:

$$\iint \psi_1 \psi_2 \, dx\, dy = 0. \tag{9}$$

the second method is the search of the eigenvalues of the matrix of the hamiltonian by means of the iteration procedure [13,14]. for this purpose, wf has been expanded in a truncated fourier series. zero boundary conditions have been used at the periphery x = ±lx, y = ±ly, where the boundaries lx, ly are chosen sufficiently large. namely, wf is represented by the vector, or the column of the coefficients of the expansion; the hamiltonian is represented by the matrix. then the following iteration procedure has been applied [13,14]:

$$\psi_{s+1} = (\hat{h}_2 - e_0)^{-1} \psi_s; \quad e_{s+1} = e_0 + \frac{(\psi_s, \psi_s)}{(\psi_{s+1}, \psi_s)} \tag{10}$$

here $(\psi_{s+1}, \psi_s)$ is the scalar product of vectors, s is the iteration number, e0 < 0 is the parameter that has been chosen from the condition of maximum convergence. usually, e0 is close to the lowest energy sub-level computed from the variation method. it is important that the matrix inversion can be realized in a simple manner, because of the diagonal domination of the shifted hamiltonian matrix $(\hat{h}_2 - e_0)$. when the second wf is searched, it should be kept orthogonal to the first wf $\psi^{(1)}$: $\psi_{s+1} \to \psi_{s+1} - \psi^{(1)} (\psi^{(1)}, \psi_{s+1}) / (\psi^{(1)}, \psi^{(1)})$. after each iteration it is better to normalize the vector: $(\psi_s, \psi_s) = 1$. the initial values of the vectors $\psi_{s=0}$ can be chosen as the ones found earlier from the variation method.
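the shifted inverse iteration of eq. (10), together with the orthogonalization to the first wf, can be sketched on a toy problem. the block below is only an illustration under stated assumptions: it uses a one-dimensional finite-difference hamiltonian for a particle in a box instead of the two-dimensional fourier representation described in the text, and all function names (`inverse_iteration`, `box_hamiltonian`, etc.) are hypothetical. because the toy spectrum is positive, the shift e0 = 0 lies below the lowest level here, whereas in the quantum wire e0 < 0.

```python
import math

def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    """Scale a vector so that (u, u) = 1."""
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

def project_out(u, basis):
    """Remove from u its components along already found, normalized states."""
    for b in basis:
        coeff = dot(b, u)
        u = [a - coeff * c for a, c in zip(u, b)]
    return u

def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a symmetric tridiagonal matrix with a constant off-diagonal."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def inverse_iteration(diag, off, e0, psi0, lower_states=(), steps=100):
    """Iterate psi_{s+1} = (H - e0)^(-1) psi_s and estimate the energy from the overlap."""
    shifted = [d - e0 for d in diag]
    psi = normalize(project_out(psi0, lower_states))
    energy = e0
    for _ in range(steps):
        new = solve_tridiagonal(off, shifted, psi)
        new = project_out(new, lower_states)   # keep orthogonal to lower states
        energy = e0 + 1.0 / dot(new, psi)      # (psi_s, psi_s) = 1 after normalization
        psi = normalize(new)
    return energy, psi

def box_hamiltonian(n, length=1.0):
    """Finite-difference -d^2/dx^2 with zero boundary conditions (toy model)."""
    step = length / (n + 1)
    return [2.0 / step**2] * n, -1.0 / step**2, step

n = 50
diag, off, step = box_hamiltonian(n)
start = [float(i + 1) for i in range(n)]   # initial guess overlapping both states
e1, psi1 = inverse_iteration(diag, off, 0.0, start)
e2, psi2 = inverse_iteration(diag, off, 0.0, start, lower_states=[psi1])
print(e1, e2)
```

the linear system is solved at every step instead of forming the inverse matrix explicitly; for the shifted matrix this is cheap and stable as long as e0 stays below the level that is searched.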
the iterations with the direct hamiltonian matrix $(\hat{h}_2 - e_0)$ diverge and cannot be applied. the profiles of wf for the two lowest sub-levels are presented in fig. 4 for the temperature t = 300 k and the maximum doping n1d0 = 3·10^21 cm^-3. the dependencies of the two lowest energy sub-levels in the quantum well on the maximum doping concentration for t = 300 k and on the temperature t for the maximum doping n1d0 = 3·10^21 cm^-3 are given in fig. 5 for symmetric wf.

fig. 4 wave functions computed for the case of anisotropic effective masses, t = 300 k, n1d0 = 3·10^21 cm^-3. part a) is the first symmetric wave function; the left panel is computed from the matrix iteration method, the right panel is from the variation method. part b) is the same for the second symmetric wave function. parts c) and d), e) and f), g) and h) are the first and the second wave functions correspondingly computed from the variation method for different types of symmetry or antisymmetry.

fig. 5 the dependence of the two lowest energy sub-levels in the quantum well on the maximum doping concentration for t = 300 k (a) and on the temperature for n1d0 = 3·10^21 cm^-3 (b). the case of anisotropic effective mass, eq. (5b), symmetric wf, is considered. the solid lines are the data obtained from the matrix iteration method, the dotted lines are the ones from the variation method.

the variation method yields accurate values of the energy sub-levels. for instance, at t = 300 k and n1d0 = 3·10^21 cm^-3 the values of the energy for the symmetric case calculated from the iteration procedure are e1 = -4.09 ry*, e2 = -1.94 ry*. the same values computed from the variation method are e1 = -4.075 ry*, e2 = -1.895 ry*.
the profiles of wf are calculated from the variation method only approximately; there is some difference at the periphery from those computed from the matrix iteration procedure. in the report [10] the electron spectrum has been calculated from the shooting method applied in the polar coordinate system. the energy sub-levels coincide with the data presented here, but that numerical realization of the shooting method is more complicated. the difference of the lowest energy sub-levels e2 – e1 does not depend on the temperature for the anisotropic case, symmetric wf, see fig. 5, b. for the isotropic case, eq. (5a), this is not valid, see fig. 3, c. this can be explained by higher values of the electron sub-levels |e1,2| for the anisotropic case. it is also possible to calculate wf more accurately by means of standard simulators based on finite element methods, like comsol multiphysics [15].

5. conclusions

an application of tf method to the electron spectrum of quantum wires in n-si can be subdivided into two problems. the first one is the simulation of the electron potential energy from the simple ordinary differential equation. the iteration procedure demonstrates rapid convergence even when the many-body effects, like exchange, are taken into account. then it is possible to solve the schrödinger equations for the wave functions and the energy sub-levels. because of the anisotropy of the effective electron mass in silicon, this problem is generally two-dimensional. two simple methods have been proposed. the variation method yields accurate values of the energy sub-levels, whereas the profiles of the electron wave functions are approximate at the periphery. the method based on the inverse matrix iterations is more accurate both for the eigenvalues and the eigenfunctions.

acknowledgement: the authors would like to thank sep-conacyt (mexico) for partial support of this work.

references
[1] d. w. drumm, a. budi, m. c. per, s. p. russo, and l. c. l. hollenberg, “ab initio calculation of valley splitting in monolayer δ-doped phosphorus in silicon”, nanoscale research lett., vol. 8, no 1, pp. 111121, jan. 2013.
[2] b. weber, s. mahapatra, h. ryu, s. lee, a. fuhrer, t. c. g. reusch, d. l. thompson, w. c. t. lee, g. klimeck, l. c. l. hollenberg, and m. y. simmons, “ohm’s law survives to the atomic scale”, science, vol. 335, no 6064, pp. 64-67, jan. 2012.
[3] f. j. ruess, w. pok, t. c. g. reusch, m. j. butcher, kuan eng j. goh, l. oberbeck, g. scappucci, a. r. hamilton, and m. y. simmons, “realization of atomically controlled dopant devices in silicon”, small, vol. 3, pp. 563-567, apr. 2007.
[4] d. k. ferry, s. m. goodnick, and jonathan bird, transport in nanostructures, cambridge: cambridge univ. press, 2009.
[5] l. ramdas ram-mohan, finite element and boundary element applications in quantum mechanics, oxford: oxford university press, 2002.
[6] y. fu and m. willander, physical models of semiconductor quantum devices, dordrecht: kluwer, 1999.
[7] l. m. gaggero-sager, “exchange and correlation via functional of thomas-fermi in delta-doped quantum wells”, modelling simul. mater. sci. eng., vol. 9, pp. 1-5, jan. 2001.
[8] v. grimalsky, l. m. gaggero-s., s. koshevaya, and a. garcia-b., “electron spectrum of δ-doped quantum wells by thomas-fermi method at finite temperatures”, in proc. 27th international conference on microelectronics (miel 2010), nis, serbia, 2010, pp. 119-122.
[9] c. castrejon-martinez, v. grimalsky, l. m. gaggero-sager, and s. koshevaya, “combined method for simulating electron spectrum of delta-doped quantum wells in n-si with many-body corrections”, progress in electromagnetics research m, vol. 31, pp. 215-229, aug. 2013.
[10] v. grimalsky, o. oubram, s. koshevaya, and c. castrejon-m., “thomas-fermi method for computing the electron spectrum of highly doped quantum wires in n-si”, in proc. 29th international conference on microelectronics (miel 2014), belgrade, serbia, 2014, paper 049.
[11] v. a. ilyina and p. k. silaev, numerical methods for theoretical physicists, vol. 2, moscow: institute for computing research publ., 2004 (in russ.).
[12] k. t. hecht, quantum mechanics, springer, n.y., 2000.
[13] w. h. press, s. a. teukolsky, w. t. vetterling, and b. p. flannery, numerical recipes in fortran, cambridge university press, cambridge, 1997.
[14] j. stoer and r. bulirsch, introduction to numerical analysis. n.y.: springer, 2002.
[15] s. m. musa, ed., computational finite element methods in nanotechnology. boca raton, fl: crc press, 2013 (www.comsol.com).
facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 631–638 doi: 10.2298/fuee1404631j

efficiency limits in photovoltaics – case of single junction solar cells

marko jošt, marko topič
university of ljubljana, faculty of electrical engineering, ljubljana, slovenia

abstract. the conversion efficiency of solar energy into electrical energy is the most important parameter when discussing solar cells, photovoltaic (pv) modules or pv power plants. many papers have addressed the limiting efficiency of solar cells, i.e. the theoretical maximum conversion efficiency an ideal solar cell could achieve. however, most of these studies modelled the sun’s spectrum as a blackbody, which does not represent a realistic case. in this paper we have calculated the limiting efficiency as a function of the absorber’s band gap at standard test conditions using the solar spectrum am1.5. in addition, the other key solar cell performance parameters (open-circuit voltage, short-circuit current density and fill factor) are evaluated, while the intrinsic losses in the solar cells are also explained and presented in light of the cell temperature.

key words: efficiency limit, single junction solar cell, loss mechanisms, am1.5 spectrum.

1. introduction

the conversion efficiency is one of the most important parameters when discussing photovoltaic or any other energy conversion devices, telling us the ratio between output and input energy. besides the actual state of the art average or record efficiencies, the theoretical limiting efficiency is also very important since it indicates how much room for progress is still left. in photovoltaics, the first limiting efficiency for solar cells was calculated by shockley and queisser in 1961 [1]. in their work they assumed the detailed balance principle based on the second law of thermodynamics.
they calculated the limiting efficiency to be 30% for single junction solar cells, modelling the sun as a blackbody with t = 6000 k. later, other attempts in this field were also reported [2]–[4]. different techniques were examined in an attempt to achieve or even exceed the limiting efficiency. the most popular are light management through optical light scattering [7], tandem solar cells and concentrator solar cells. other approaches, such as multiple exciton generation [8] and up- [9] and down-conversion [10], are also researched. here we will focus on single-junction solar cells under terrestrial conditions. in the existing studies most of the researchers used the sun’s blackbody radiation spectrum. this, however, does not represent a realistic situation for terrestrial conditions, as part of the sun’s spectrum is absorbed in the atmosphere. in this paper we will use the standard solar spectrum am1.5 [11]. the two spectra are shown in fig. 1. the difference is clear, thus an analysis has to be done separately. we will determine the maximum efficiency under am1.5 one sun illumination for a single junction solar cell at standard testing conditions together with the key performance parameters of the solar cells. in the second part of this paper intrinsic losses will be explained and calculated. also, a comparison between results under the am1.5 and the blackbody radiation spectrum will be shown.

fig. 1 comparison between blackbody radiation spectrum at ts = 6000 k (g = 1550 w m^-2) and solar spectrum am1.5 (g = 1000 w m^-2). same energy span for both spectra was considered here.

received july 28, 2014; received in revised form october 23, 2014
corresponding author: marko jošt, university of ljubljana, faculty of electrical engineering, tržaška cesta 25, 1000 ljubljana, slovenia (e-mail: marko.jost@fe.uni-lj.si)

2. efficiency limit

according to the detailed balance principle [1], in equilibrium everything that is absorbed has to be emitted.
radiative recombinations are therefore the only recombination mechanism present in the solar cell and as such necessary and unavoidable. all other recombinations diminish the efficiency significantly. the emission from the solar cell in equilibrium follows the planck equation where the emission spectrum is described by the solar cell temperature tc: ( ) ( ) (1) e is the energy of the emitted photon, h is planck’s constant, c speed of light and k boltzmann’s constant. emissivity ε = 1 was used in the following calculations as black body was assumed. under the illumination the system is no longer in equilibrium. due to chemical potential between quasi-fermi levels the photon emission increases by ( ) following the normalised np product, where µ = q*voc is a chemical potential, q is elementary charge. absorbed incident photons create electron-hole pairs. at the open-circuit voltage voc generated electrons cannot be extracted as no load is connected to the solar cell. in the ideal efficiency limits in photovoltaics – case of single junction solar cells 633 solar cell they are all radiatively recombined and emitted. open-circuit voltage can therefore be derived by equalling recombinations rext (equation 2) and generations ppump (equation 3): ( ) ∫ ∫ ( ) (2) ∫ ∫ ( ) (3) (4) (5) eg denotes band gap, s(e) is spectrum of the solar radiation while ω and θ stand for solid and polar angle, respectively. if non-radiative recombinations rnr do occur, they can be described with external fluorescence efficiency: ηext<1. voc is then: ∬ ( ) ∬ ( ) (6) the current in the solar cell is the difference between the generated electrons from solar radiation and the recombined electrons, radiatively or non-radiatively, and is voltage v dependent. this gives us: ( ) ( ) ∬ ( ) ( ) ∫ ( ) (7) here, we consider only radiative recombinations since we assumed an ideal solar cell, where only loss in the solar cell is emission loss due to radiative recombination. therefore ηext = 1. 
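the balance between absorbed and emitted photon fluxes described above can be sketched numerically for the blackbody case. the block below is a minimal sketch and not the authors' code: it assumes a 6000 k blackbody sun seen under the solid angle 6.8221e-5 sr, emission into an effective solid angle of π, only radiative recombination (ηext = 1), and the boltzmann approximation for the emission of the cell. the function names `bose_tail` and `limiting_efficiency` are hypothetical, and the common prefactor 2/(h³c²) is dropped because it cancels in the efficiency.

```python
import math

K_B = 8.617333e-5       # Boltzmann constant in eV/K
OMEGA_ABS = 6.8221e-5   # solid angle of absorption (sun disc), sr
OMEGA_EMIT = math.pi    # effective solid angle of emission, sr

def bose_tail(x_g, terms=80):
    """Integral of x^2 / (exp(x) - 1) from x_g to infinity.

    Uses the series 1/(e^x - 1) = sum_n exp(-n x), integrated term by term.
    """
    total = 0.0
    for n in range(1, terms + 1):
        poly = x_g**2 / n + 2.0 * x_g / n**2 + 2.0 / n**3
        total += math.exp(-n * x_g) * poly
    return total

def limiting_efficiency(e_gap, t_sun=6000.0, t_cell=298.15, dv=0.0005):
    """Detailed-balance efficiency for band gap e_gap (eV), blackbody sun (assumption)."""
    kt_s = K_B * t_sun
    kt_c = K_B * t_cell
    # Absorbed photon flux above the band gap and total incident power
    absorbed = OMEGA_ABS * kt_s**3 * bose_tail(e_gap / kt_s)
    p_in = OMEGA_ABS * kt_s**4 * math.pi**4 / 15.0
    # Emitted flux at voltage v: Boltzmann approximation, valid for e_gap - v >> kT
    y_g = e_gap / kt_c
    emitted_0 = OMEGA_EMIT * kt_c**3 * (y_g**2 + 2.0 * y_g + 2.0)
    best = 0.0
    v = 0.0
    while v < e_gap:
        emitted = emitted_0 * math.exp(-(e_gap - v) / kt_c)
        best = max(best, v * (absorbed - emitted))   # power = voltage * net current
        v += dv
    return best / p_in

for gap in (0.7, 1.1, 1.3, 1.5, 2.5):
    print(gap, round(limiting_efficiency(gap), 3))
```

the maximum over the voltage scan plays the role of the maximum power point; replacing `absorbed` and `p_in` by integrals over tabulated am1.5 data would give the terrestrial case, which is not attempted here.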
the short circuit current jsc can easily be calculated from equation 7 by inserting v = 0. maximum power can be obtained as a product of jmpp and vmpp, the current and voltage in the maximum power point, while efficiency, the most important factor when discussing solar cells, is the ratio between the generated electrical power pel and the incoming power from the sun pin:

$$\eta = \frac{p_{el}}{p_{in}} = \frac{j_{mpp}\, v_{mpp}}{p_{in}} = \frac{ff \cdot j_{sc}\, v_{oc}}{p_{in}} \tag{8}$$

where ff is the fill factor, calculated by the following equation:

$$ff = \frac{j_{mpp}\, v_{mpp}}{j_{sc}\, v_{oc}} \tag{9}$$

the discussed parameters are presented in fig. 2 as functions of the band gap. the first graph shows the famous sq limit for the two spectra. the peak is 33.8% at 1.34 ev for am1.5 and 31.4% at 1.29 ev for blackbody radiation. voc increases linearly with band gap while jsc decreases due to fewer photons absorbed at higher band gap energies. the discussed parameters of c-si and gaas record solar cells are also inserted in the graphs.

fig. 2 graphical presentation of solar cell parameters for am1.5 (blue line) and blackbody radiation (green line) at tc = 25°c with inserted points for record c-si [12] and gaas [13] solar cells

since most of the papers are based on blackbody radiation we decided to show the comparison between blackbody radiation at ts = 6000 k and the am1.5 spectrum for all the basic parameters of the solar cell. to calculate the parameters with blackbody radiation, only the am1.5 spectrum data has to be replaced with the blackbody formula. while there is not much difference in voc and ff, a clear difference between the generated currents can be observed. this is a result of a higher number of photons if we consider the blackbody radiation. the efficiency of the solar cell is higher at am1.5 despite less current, due to the lower incident power density. in table 1 we present the comparison between theoretical efficiency limits and the achieved record efficiencies for crystalline silicon (c-si) and crystalline gallium arsenide (gaas) solar cells.
the columns 3 and 4 show that the material properties are very important in determining the limiting efficiency of solar cells, while record solar cells are still some way below the theoretical limits. the j-v characteristics for the record and the ideal solar cell for both c-si and gaas are presented in fig. 3. since there is no j-v data for the 28.8% gaas solar cell, the data of the 27.6% solar cell [14] is used instead. the record c-si cell exhibits better utilization of incident photons (higher jsc relative to jsc_ideal), while the record gaas cell exhibits better utilization of photovoltage (higher voc relative to voc_ideal). in both cases, and in particular in thin-film solar cells, there is room for further improvements [11].

table 1 comparison between efficiencies for si and gaas solar cells, solar spectrum am1.5

material   eg        limit at eg   limit for the material   record cell [13]
c-si       1.12 ev   33.2%         29.8%                    25.6%
gaas       1.42 ev   33.4%         33.4%                    28.8%

fig. 3 i-u characteristics for the ideal and record c-si (a) and gaas (b) solar cell at stc (am1.5, tc = 25°c)

3. loss mechanisms

as shown in the previous section the efficiency limit is 33.8%. where is the rest of the power lost? in this section we will explain intrinsic losses in a solar cell. since the amount of absorbed incident photons is strongly related to the band gap, the biggest losses are spectral losses. other losses, such as emission, carnot and boltzmann losses, also contribute to the lower efficiency of the solar cells.

3.1. spectral losses

the band gap is the most important parameter when determining efficiency. the photons with energy below the band gap do not have enough energy to generate an electron-hole pair and are therefore transmitted and not absorbed. such losses are named below band gap losses. they can be calculated by the following equation where we integrate the am1.5 spectrum for all the energies below the band gap.
∫ ( ) ∫ ( ) (10) the photons with energy above the band gap are absorbed in the active layer, creating free electron-hole pairs. the excessive energy, difference between photon’s energy and the band gap, however, is lost in a thermalization process where the generated electron thermalizes from the conductive band to its edge. such losses are named above band gap losses or thermalization losses. ∫ ( ) ∫ ( ) (11) 636 m. jošt, m. topič 3.2. emission loss the free electrons generated by the incoming photons are not stable and eventually drop back to the valence band where they recombine. recombination results in a phonon if the recombination is non-radiative or in photon if it is radiative. here we assumed only radiative recombinations occur in the solar cell as they are unavoidable due to detailed balance principle where everything absorbed has to be emitted. the emission loss can be calculated as a radiation from a blackbody at the maximum power point. ( ) ∫ ( ) (12) 3.3. energy less than band gap ideally the open circuit voltage would be equal to the band gap. however, in application open circuit voltage is lower and corresponds to the potential difference between quasifermi levels while voltage in the maximum power point is even lower. this is a result of carnot and boltzmann factor. carnot factor appears as the conversion from thermal to electrical work needs some energy [15], while boltzmann factor is a consequence of unequal solid angles of absorption and emission. (13) (14) the symbol ωe denotes solid angle of emission and ωa is solid angle of absorption. their values are π and 6.8221e-5, respectively. (a) (b) fig. 4 losses in solar cell for am1.5 at tc=0 k (a) and tc = 298.15 k = 25°c (b) the structures of the losses are presented in fig. 4. first we assumed the temperature of the solar cell to be 0 k. by observing equations 1, 12, 13 and 14, we see that at tc = 0 k the emission, carnot and boltzmann losses all equal to 0. 
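the vanishing of the carnot and boltzmann terms at tc = 0 k can be checked with a few lines of code. the sketch below assumes the forms commonly used for these two voltage drops, as in hirst and ekins-daukes [4]: eg·tc/ts for the carnot term and k·tc·ln(ωe/ωa) for the boltzmann term, with ts = 6000 k as an assumed sun temperature. the function names are hypothetical and the forms are an assumption, not a transcription of equations 13 and 14.

```python
import math

K_B = 8.617333e-5   # Boltzmann constant in eV/K

def carnot_drop(e_gap, t_cell, t_sun=6000.0):
    """Carnot voltage drop in volts for a band gap in eV (assumed form Eg * Tc / Ts)."""
    return e_gap * t_cell / t_sun

def boltzmann_drop(t_cell, omega_emit=math.pi, omega_abs=6.8221e-5):
    """Boltzmann voltage drop in volts (assumed form k * Tc * ln(omega_e / omega_a))."""
    return K_B * t_cell * math.log(omega_emit / omega_abs)

for t_cell in (0.0, 298.15):
    print(t_cell, carnot_drop(1.34, t_cell), boltzmann_drop(t_cell))
```

both terms scale linearly with the cell temperature, so they drop to zero together at tc = 0 k and grow with tc at a fixed band gap.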
the only loss mechanism present is then the spectral loss, which is not temperature dependent; therefore the efficiency increases. the maximum efficiency is now 49.1% at 1.14 ev. such a state is shown in fig. 4 a. second, we calculated the losses for stc conditions, which demand the solar cell temperature to be 25°c (298.15 k). the result is shown in fig. 4 b. the spectral losses represent most of the losses. thermalization losses decrease with the band gap while below band gap losses increase due to fewer photons absorbed. the emission loss represents only a small fraction, while carnot and boltzmann losses are not insignificant. the maximum efficiency at tc = 25°c is 33.8% at 1.34 ev.

conclusions

we have calculated the efficiency limit of single junction solar cells for the standard solar spectrum am1.5 under stc conditions. the peak efficiency is 33.8% at 1.34 ev. the basic solar cell parameters – efficiency, jsc, voc and ff – were derived and shown as functions of the band gap. the comparison between the blackbody radiation and the solar spectrum am1.5 was also shown to emphasize the difference between the two spectra. in addition, intrinsic losses in solar cells were explained and discussed. spectral losses, due to unabsorbed photons with energy below the band gap or the thermalization process of absorbed photons, contribute to an over 50% drop of efficiency. attention was also paid to the losses that are present in the solar cell at tc = 0 k, where only spectral losses would be present, increasing the efficiency limit under the am1.5 spectrum to 49.1%.

acknowledgement: we would like to thank j. r. sites and b. lipovšek for fruitful discussions. the work has been funded by the slovenian research agency under the research programme p2-0197.

references
[1] w. shockley and h. j. queisser, “detailed balance limit of efficiency of p‐n junction solar cells,” j. appl. phys., vol. 32, no. 3, pp. 510–519, mar. 1961.
[2] g. araujo and a.
marti, “absolute limiting efficiencies for photovoltaic energy-conversion,” sol. energy mater. sol. cells, vol. 33, no. 2, pp. 213–240, jun. 1994. [3] t. tiedje, e. yablonovitch, g. d. cody, and b. g. brooks, “limiting efficiency of silicon solar cells,” ieee trans. electron devices, vol. 31, no. 5, pp. 711–716, may 1984. [4] l. c. hirst and n. j. ekins-daukes, “fundamental losses in solar cells,” prog. photovolt. res. appl., vol. 19, no. 3, pp. 286–293, may 2011. [5] c. h. henry, “limiting efficiencies of ideal single and multiple energy gap terrestrial solar cells,” j. appl. phys., vol. 51, no. 8, pp. 4494–4500, aug. 1980. [6] o. d. miller, e. yablonovitch, and s. r. kurtz, “intense internal and external fluorescence as solar cells approach the shockley-queisser efficiency limit,” ieee j. photovolt., vol. 2, no. 3, pp. 303–311, jul. 2012. [7] e. yablonovitch and g. d. cody, “intensity enhancement in textured optical sheets for solar cells,” ieee trans. electron devices, vol. 29, no. 2, pp. 300–305, feb. 1982. [8] a. j. nozik, m. c. beard, j. m. luther, m. law, r. j. ellingson, and j. c. johnson, “semiconductor quantum dots and quantum dot arrays and applications of multiple exciton generation to thirdgeneration photovoltaic solar cells,” chem. rev., vol. 110, no. 11, pp. 6873–6890, nov. 2010. [9] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by up-conversion of sub-bandgap light,” j. appl. phys., vol. 92, no. 7, pp. 4117–4122, oct. 2002. [10] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by down-conversion of highenergy photons,” j. appl. phys., vol. 92, no. 3, pp. 1668–1674, aug. 2002. [11] “solar spectral irradiance: air mass 1.5.” [online]. available: http://rredc.nrel.gov/solar/spectra/am1.5/. [accessed: 19-mar-2014]. [12] “panasonic hit(r) solar cell achieves world’s highest energy conversion efficiency of 25.6% at research level | headquarters news | panasonic global.” [online]. 
available: http://panasonic.co.jp/corp/news/official.data/data.dir/2014/04/en140410-4/en140410-4.html. [accessed: 08-jul-2014].
[13] m. a. green, k. emery, y. hishikawa, w. warta, and e. d. dunlop, “solar cell efficiency tables (version 44),” prog. photovolt. res. appl., vol. 22, no. 7, pp. 701–710, jul. 2014.
[14] b. m. kayes, h. nie, r. twist, s. g. spruytte, f. reinhardt, i. c. kizilyalli, and g. s. higashi, “27.6% conversion efficiency, a new record for single-junction solar cells under 1 sun illumination,” in 2011 37th ieee photovoltaic specialists conference (pvsc), 2011, pp. 000004–000008.
[15] e. fermi, thermodynamics. courier dover publications, 1956.

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 329-342 https://doi.org/10.2298/fuee1803329r

development of an iot system for students' stress management

branka rodić-trmčić 1, aleksandra labus 2, zorica bogdanović 2, marijana despotović-zrakić 2, božidar radenković 2
1 medical college of applied studies in belgrade, serbia
2 faculty of organizational science, university of belgrade, serbia

abstract. this paper shows the development of an iot system for students' stress management. the iot system is developed in an open architecture and is an integral part of the educational ecosystem. the system is composed of two elements: one that enables measurement of vital parameters for identifying stress in students, and the other for stress control. the system for stress control consists of a mobile health application featuring relaxation content. such a system should minimize excitement and help reduce future stress. the iot system for stress management was evaluated in a real environment, during students’ thesis defense at the faculty of organizational sciences, university of belgrade.
the results show that time spent using mobile health application with relaxing content can reduce students’ physiological arousal and excitement during thesis defense.

key words: internet of things, wearable computing, stress management, education, students

received november 23, 2017
corresponding author: božidar radenković, faculty of organizational science, university of belgrade, jovana ilića 154, 11010 belgrade, serbia (e-mail: boza@elab.rs)

1. introduction

the internet of things (hereinafter: iot) is the paradigm of the modern world in which people and devices are linked and communicate with each other. the human dimension of the internet of things leads to its role in the healthcare sector and it can change the health behavior for the better. until present, different technological solutions have been developed with the aim of improving healthcare and conducting preventive actions [1], in various areas, including the field of stress management. stressful situations can generate arousal and anxiety. when a person is aroused, the body is under the stress. stress activates the sympathetic nervous system, and its activation causes different reactions in the human body, such as the production of sweat, a heart rate increase and muscle tension. although stress plays a positive role in performance, too much stress or repeated stress can have negative effects. stress has been recognized as one of the leading problems in healthcare and has a high impact on people’s health. long-term repetitions of stress manifestations in any population can be used as a predictor of other health conditions and disorders. research shows that the most pronounced sources of stress are in the domain of self-cognition and school life [2]. there is a frequent occurrence of stress in students during studies.
it can be caused by changes in habits, separation from family members, or excitement when taking an exam [3]. many students face a variety of stressful situations during an exam, which can negatively affect the exam result. at the same time, poor results frequently do not indicate lower intelligence or less knowledge [4]. to cope with stressful moments and prevent stress from recurring, it is important for students to manage situations that can produce negative arousal. providing teachers with real-time information about students' arousal is also an important part of stress prevention. such biofeedback enables teachers to adapt classes or exam methods and thus create an environment that is more favorable for the student.

stress management in this domain refers to using different technologies to measure and control a person’s level of stress. the presence of stress can be identified by measuring different vital parameters of a user’s body with various iot and wearable systems. commercial iot systems do not come with open apis and can hardly be programmed or customized, which makes them hard to integrate with other e-education services. to make this integration possible, we have chosen an approach based on open architecture. an open architecture ensures that the developed iot system for stress management becomes an integral part of an educational ecosystem. in addition, an important part of the developed system is a mobile health application with relaxation content that should minimize students’ excitement and help reduce future stress. the pilot project was evaluated in a real environment, at the faculty of organizational sciences, university of belgrade, during students’ thesis defense.

besides students' stress management, this work is also motivated by an educational aspect.
the educational aspect means advocating that students be provided with the fundamentals of iot technologies, applications, and devices. further, the emphasis is on introducing the application potential of iot, so that students are able to use and further develop the iot system.

the rest of the paper is organized as follows: section 2 summarizes previous work in the field of stress management. section 3 is dedicated to the presentation of the development and implementation of the iot system for stress management. section 4 discusses the research methods for experiment setup and data collection. experimental results and discussion are provided in section 5. section 6 concludes the paper.

2. related work and motivation

in recent years, iot technologies have been widely applied in many projects in various health areas. technologies based on iot, like mobile technologies and wearable computing, are frequently used for self-measurement of health metrics. those measurements are used for health promotion, well-being, and stress management, where they can detect changes in vital parameters and play a role in improving human behavior.

different kinds of wearables and sensors, used individually and in combination, are designed for tracking vital parameters that indicate stress or arousal symptoms, such as heart rate, galvanic skin response (gsr), blood pressure, and others. heart rate is one of the health status indicators and a manifestation of arousal. heart rate sensors are often used in sport [5][6] and fitness [7] equipment. in everyday life, a heart rate sensor can improve the identification of stress indicators and help to provide stress management [8] [9].

smart phones play a big part in stress management. some devices have embedded sensors for detecting and monitoring behavior indicative of stress or depression [10]. others have implemented biofeedback, i.e.
therapeutic tools for stress and anxiety [11] [12]. ahtinen et al. proposed a mobile application for stress management that contains four intervention modules for mental wellness training [13]. the study, which recruited 15 participants from a university, showed significant improvements in stress and life satisfaction in respondents. one commercial solution for stress management is biosync technology. the patient’s heart rate, galvanic skin response, and movements are measured. the data are automatically processed to determine the patient’s stress level and, in addition, the patient is provided with preventive measures to have a better life [14].

a number of papers have looked into measuring the psychological state and physical reactions in students [15] and their academic success, as well as calculating their correlation [16] [17], or into the improvement of students’ mental health [8]. the research carried out by shen, wang & shen [18] used physiological signals to predict emotions. they investigated the presence of different emotions in the studying process and proposed a sensible e-learning model. the data were acquired using three sensors: a skin conductivity sensor measuring electrodermal activity, a photoplethysmograph sensor measuring blood pressure, and an eeg sensor measuring brain activity. measurements were taken over several weeks on one subject in a natural environment, as close as possible to the everyday environment. in the study [19], a heart rate sensor was implemented together with a skin conductivity sensor, an accelerometer and a temperature sensor in a natural context, the public appearance of phd students in front of an audience, where significant variations in the values of measured vital parameters were observed. feedback to the speaker was implemented in the form of a talk assistant that sends information about heart rate to trigger relaxation.
in this research, we want to design and implement a system to identify the psychophysiological signals indicating stress during students’ thesis defense. most of the research uses different kinds of devices to recognize stress in students, but few studies deal with stress management in a real environment [20] [21]. our goal is to provide a solution that enables monitoring of bodily manifestations of arousal and helps students to manage their stress during thesis defense. a mobile application with relaxation content should minimize arousal in students and help reduce stress during thesis defense. the influence of the mobile application on the presence of stress and anxiety, and on changes in students' behavior in certain contexts, will be analyzed.

3. an iot system for stress management

the use of sensors and smart phones in monitoring health status usually implies: collecting data from the sensors; providing support to the user through a display with the measured values; sharing the information; and ensuring low power consumption, wearability, precision, longevity and reliability of devices.

the iot system architecture for stress measurement consists of three parts: a wearable system for monitoring vital parameters, a mobile health application featuring relaxation content, and a cloud platform. in addition, we have developed services for connecting the components, hosts, and users (see fig. 1). the wearable system requires components such as a heart rate sensor, a microprocessor to process the obtained data, and a wireless internet connection that allows the participants to move freely during their activities. the second component, the mobile application featuring relaxation content, can be installed on any android device that has a wireless connection to the internet.
the cloud platform collects sensor data from the wearable system, analyzes the collected data, and follows the browsing history from the mobile application.

fig. 1 components of the iot system for stress management

3.1. intelligent devices

one of the main challenges is designing an iot system that is wearable, low-power, reliable and precise. commercial solutions often satisfy these characteristics. one such solution is the xiaomi mi band (shown in fig. 2), which is completely wearable. the device has a bluetooth module that enables communication with the appropriate mobile application. like most commercial solutions, it comes with no open apis and is thus neither programmable nor customizable. accordingly, there is no possibility to gather sensor data and store them in a database for real-time monitoring, further analytics, biofeedback, or integration with other educational services.

fig. 2 commercial wearable wristband xiaomi mi band 2

the wearable system developed in this pilot project is not as completely wearable as commercial solutions, but it provides open apis that enable adjustment and integration with other systems in accordance with our needs. fig. 3 shows the components of the wearable system for heart rate measurement.

fig. 3 schematic view of components of a wearable system for heart rate measurement

in the implementation, we used pulse sensor amped, a plug-and-play heart rate sensor for arduino. the sensor hardware includes added amplification and noise cancellation circuitry [22]. it works with either a 3v or a 5v lilypad arduino. a heart rate sensor records the user’s heartbeats, and heart rate is an important parameter in evaluating arousal or exposure to stress. the measurement technology is usually based on two beams of light of different wavelengths that are focused on the human nail tip. the measured signal can then be obtained by a photosensitive element.
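to illustrate how a heart rate value can be derived from recorded heartbeats, the following python sketch converts beat timestamps into beats per minute. it is not the authors' implementation: the function name, the millisecond timestamp format and the sample values are assumptions made for illustration only.

```python
from statistics import mean


def bpm_from_beat_times(beat_times_ms):
    """Return the average heart rate (beats per minute) from beat timestamps in ms.

    Hypothetical helper: the input format is an assumption, not the sensor's API.
    """
    if len(beat_times_ms) < 2:
        raise ValueError("need at least two beats to form an interval")
    # inter-beat intervals between consecutive beats, in milliseconds
    intervals = [b - a for a, b in zip(beat_times_ms, beat_times_ms[1:])]
    if any(i <= 0 for i in intervals):
        raise ValueError("beat timestamps must be strictly increasing")
    # one minute is 60000 ms, so heart rate = 60000 / mean inter-beat interval
    return 60000.0 / mean(intervals)


# made-up example: beats 800 ms apart correspond to 75 beats per minute
print(bpm_from_beat_times([0, 800, 1600, 2400]))
```

averaging the intervals before converting, rather than averaging per-beat rates, keeps a single short interval from dominating the result.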
the heart rate sensor, in turn, is connected to an arduino lilypad (atmega32u4) and a raspberry pi microcomputer. a usb cable was used to connect the raspberry pi and the arduino lilypad. the system is packaged in a plastic box with two bracelets that make the device easy to wear. the heartbeat sensor is attached to the fingertip, which enables more reliable measurement without excess sensor movement (see fig. 4).

fig. 4 position of wearable device during project evaluation

the software was implemented using the python programming language. in the created prototype, the communication layer consists of communication and networking connections with wi-fi access integrated into devices with the support of the required software [23]. the obtained sensor data are transmitted to the cloud using the raspberry pi with a wi-fi module (802.11b/g/n). the wi-fi module and the mobile phone with the health application are connected to a local secured wi-fi network. the mobile application connects to web services on the remote cloud through mobile wi-fi access with low power consumption, so that the user can carry the device for a longer period of time. the power supply is provided via a power bank battery attached to the plastic box of the device. the charged battery provides about 2 hours of continuous measurement.

3.2. mobile health application

the android application with relaxation content was created in the android studio 1.2.1 programming environment, using the java programming language. content that could have a relaxing effect was implemented in the app: funny sports scenes, beautiful nature photos, and relaxing natural sounds [24] [25]. the content was taken from youtube. the content watched by users is recorded on the cloud platform. fig. 5 shows some screens of the application content.

fig. 5 respondent relaxing with mobile health application

3.3.
cloud platform

the cloud platform is mainly responsible for data classification and storage. the raspberry pi reads each piece of data from the arduino processor through the serial port. if there are no errors, the raspberry pi sends the data to the microsoft azure iot hub platform. raw data are analyzed using a microsoft azure stream analytics processing job. the result of the analytics process is a stream of aggregated data, which is stored in an mssql database on the cloud.

4. research methods

the aim of this evaluation is to identify the psychophysiological signals indicating arousal during students’ thesis defense. in addition, we examine and compare the presence of arousal in respondents who used the mobile health application with relaxation content (experimental group) and in respondents who did not use it (control group). statistical significance should indicate whether stress during thesis defense can be managed with relaxing content delivered through the mobile application.

4.1. instruments

for the purpose of evaluation, a general, non-anonymous questionnaire was created. it consisted of demographic questions and was filled out by students before the first phase of testing. in order to evaluate the anxiety level before testing, after the test, and after relaxation following the completed test, spielberger’s [26] test anxiety inventory (stai) test was used. the test is a self-report instrument for measuring anxiety and consists of 20 questions. the answers were provided on a 4-point likert scale, namely: 4 – very much so, 3 – moderately, 2 – a little, 1 – not at all, and respondents rated the extent to which each statement was true for them. the instrument referred to the student’s state at the given time, not to an earlier period. the students’ stai test scores were categorized into low (20 to 40), moderate (41 to 50) and high (51 to 80) [17].

4.2.
participants

the evaluation sample consists of students of the faculty of organizational sciences, university of belgrade. the participants were master's students at the department of e-business. all participants had previously attended theoretical and practical courses in e-business, internet technologies, mobile technologies, and iot, so they were familiar with sensors and the use of wearable devices. in total, 26 students successfully finished the evaluation of the pilot project. all of the participants were informed about the testing and signed a consent form. the participants were 23 to 30 years old.

there is evidence that health-risk behaviors (smoking cigarettes, lack of physical activity, etc.) contribute to higher levels of stress in students [27]. in the sample, most of the respondents were physically active regularly or at least occasionally (a couple of times a week, for at least half an hour) and most were non-smokers. about half of the students had a job. table 1 shows descriptive statistics of the sample.

table 1 descriptive statistics of the sample

characteristic      gradation      frequency   percentage (%)
sex                 male           10          38
                    female         16          62
smoker              yes            2           8
                    no             24          92
physical activity   regularly      12          46
                    occasionally   12          46
                    never          2           8
employed            yes            14          54
                    no             12          46

4.3. experimental design

each experiment session lasted approximately 45 minutes per participant. each respondent was seated in a pleasant classroom and connected to the wearable system after arrival. respondents wore the wearable device on the left hand, with the pulse sensor placed on the index finger. there were no sensors on the right hand, so writing was possible. at the beginning, respondents filled out a general questionnaire and an stai test and signed a consent to participate in testing. the sample was divided into two groups: experimental and control.
after completing the questionnaires, respondents from the experimental group were given a tablet with a pre-installed android application with relaxation content and short instructions on how to use it. each respondent had 15 minutes to relax and use the application content. during the use of the mobile application, the duration of use and the browsed content were tracked and saved in the cloud. the obtained values of the respondent's vital parameters were also recorded. the respondents from the control group followed the same procedure, except that they relaxed without the content of the mobile application. the scheme of the experimental protocol is shown in fig. 6.

fig. 6 protocol and testing phases

4.4. statistical analysis

the data obtained from the heart rate sensor were averaged for each of the three phases individually. thus, we got three heart rate values for each respondent. for further analyses, the averaged pre-test measures were subtracted from the averaged post-test measures to determine the difference between the two measures. data analyses were performed using spss (v20.0). in all analyses, results were considered statistically significant at the p ≤ 0.05 or p ≤ 0.001 level.

in the first step of the analysis, we tested differences between arousal occurrences in students’ heart rate through three phases: pre-test, test and post-test. differences between two phases of measurement, regardless of group, were tested with the wilcoxon signed-rank test. in the second step, we tested statistical differences in heart rate difference between the control and experimental groups through the phases. for that purpose, we used student's t-test. normality was tested with the shapiro-wilk test, and equality of variances with levene’s test. where normality or equality of variances was not met, data were tested with the mann-whitney u test [28].

5. results and discussion

5.1.
arousal through testing phases

we investigated whether there was a statistical difference in students’ heart rate between the pre-test phase and the test phase in which they defended their thesis. the results show that heart rate in the test phase is significantly higher (z=-2.66, p<0.05) than in the pre-test phase. this is an expected result, as a defense is a stressful event.

it was expected that, after increased arousal during the pre-test and test phases, respondents’ heart rate would become lower in the post-test phase, as the stressful event was finished. the heart rate values measured in the post-test phase are significantly lower than the heart rate values in the pre-test phase (z=-3.92, p<0.001) as well as in the test phase (z=-4.01, p<0.001). the post-test phase can be considered the most calming phase since the beginning of the test. figure 7 shows the distribution of differences of averaged heart rate between the pre-test and post-test phases among the experimental and control groups. the mean difference (shown as a vertical line in a box) is larger in magnitude in the control group (-9.46) than in the experimental group (-5.92).

5.2. stress management in experimental and control group through testing phases

there were statistical differences (t(24) = -3.72, p<0.05) in arousal, measured as the difference in heart rate between the pre-test and test phases, between the control and the experimental groups, regarding the use of the mobile health application with relaxation content in the pre-test phase. according to the results, the arousal of the experimental group was decreased, probably because they were treated with relaxation content during the pre-test phase.

also, there is a strong statistical difference (mann–whitney u=17.5, p<0.05) in heart rate during the test and post-test phases between the control and experimental groups. the distribution of averaged heart rate values between the two phases is shown in fig. 8. the experimental group’s values do not deviate from the test to the post-test phase as much as they deviate in the control group (shown as the difference on the figure). the greater deviation, shown as difference, points to higher arousal in the test phase.

fig. 7 distribution of differences in measured averaged heart rate values between pre-test and post-test phase

fig. 8 distribution of averaged heart rate values during test and post-test phase and difference between them in experimental and control group

since students used the mobile health application with relaxation content in the post-test phase, this could explain the significant difference in arousal between the two groups. students’ heart rate values were significantly lower in the post-test phase than during the pre-test phase, but there is no statistical significance in the differences between the two phases among groups. regardless of the relaxation content on the mobile health application, students’ arousal decreased in the post-test phase.

table 2 presents the participants’ average time spent on different contents of the mobile health application with relaxation content. before and after the defense, participants spent the longest total time watching fun sports clips; the longest average continuous time was also spent watching fun sports clips, followed by listening to relaxing music.

table 2 distribution of time spent at different contents of the mobile health application with relaxation content

content       total of time spent on content (hh:mm:ss)   average time on content (hh:mm:ss)
fun sport     05:46:18                                    00:06:56
relax music   04:10:21                                    00:04:49
relax photo   03:02:31                                    00:04:21

5.3. the results of the stai test

table 3 shows the distribution of participants by stai test points in both experimental and control groups.
table 3 participants distribution by stai test results

                     anxiety gradation
test phase/group     low   moderate   high
pre-test – total     20    5          1
  experimental       12    1          0
  control            8     4          0
test – total         22    3          1
  experimental       12    1          0
  control            10    2          1
post-test – total    25    1          0
  experimental       12    1          0
  control            13    0          0

almost 20% of students reported moderate anxiety in the pre-test phase, i.e., a moment before the thesis defense. after the test, only 11.5% of respondents reported moderate anxiety, and after relaxation following the defense, in the post-test phase, just one respondent (3.8%) had moderate anxiety. high anxiety was present in the pre-test and test phases in 3.8% of the sample, from the control group. in the pre-test and test phases there were more respondents with moderate and high anxiety in the control group than in the experimental group.

there is a statistically significant change in stai scores: post-test scores are lower than pre-test scores (z=-3.902, p<0.001). also, stai post-test scores are lower than stai test scores (z=-3.137, p<0.05), but there is no significant change between stai pre-test and stai test scores. figure 9 shows the distribution of stai scores in the pre-test, test and post-test phases. the median of the stai scores (shown as a horizontal line in the boxes) and the mean (x mark in the box) are highest in the pre-test phase and lowest in the post-test phase.

fig. 9 distribution of scores of stai test through three phases

there was no correlation between the stai test results and the level of heart rate through the phases. low anxiety in participants during thesis defense could be explained either by overly subjective answers on the stai test, because the test was not anonymous, or by the fact that the students were well prepared and relaxed with their professors.

6.
conclusion

the solution proposed in this paper demonstrates one way of integrating the concepts of electronic health, mobile health, the internet of things and wearable computing. the implemented iot system allows monitoring of respondents’ vital parameters. a mobile health application with relaxation content can have an impact on reducing arousal in respondents during the thesis defense.

the proposed solution is applicable in education environments. unlike the conventional educational system, the proposed solution enables biofeedback. information about students’ arousal is sent to professors, who can use it to adapt the exam flow. in addition, the iot solution proposed in this paper can be applied for stress monitoring in different life situations.

the values obtained in the research indicate that the use of a mobile health application with relaxation content had a significant effect on decreasing students’ arousal before and during thesis defense. there is no evidence that the application can help students to calm down after they defend their thesis.

the major contributions of this paper can be summarized as follows:
• a way of realizing stress measurement and stress control through the iot concept and mobile health in an education environment.
• development of the iot infrastructure for stress measurement and control.
• introduction of a new smart healthcare service into the education system, providing biofeedback to subjects involved in the education process.

the main limitation of the study is the small and homogeneous sample. also, there is no evidence as to whether students in either the control or the experimental group were feeling stressed before the test, which could have affected their baseline heart rate. in addition, future work should include students from different departments, where there may be a different attitude towards professors.
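the heart rate preprocessing described in section 4.4 and the stai score categories given in section 4.1 can be sketched in python as follows. this is an illustrative sketch under stated assumptions, not the authors' spss workflow: the function names, the dictionary layout and the sample readings are hypothetical, and only the category boundaries (20 to 40, 41 to 50, 51 to 80) are taken from the paper.

```python
from statistics import mean

PHASES = ("pre-test", "test", "post-test")


def phase_averages(readings):
    """Average the heart rate samples of each phase for one respondent."""
    return {phase: mean(readings[phase]) for phase in PHASES}


def phase_difference(averages, later, earlier):
    """Difference between two phase averages (later minus earlier)."""
    return averages[later] - averages[earlier]


def stai_category(score):
    """Map a total STAI score (20 items, 1 to 4 points each) to a category."""
    if not 20 <= score <= 80:
        raise ValueError("STAI score must be between 20 and 80")
    if score <= 40:
        return "low"
    if score <= 50:
        return "moderate"
    return "high"


# made-up readings (beats per minute) for a single hypothetical respondent
readings = {
    "pre-test": [88, 92, 90],
    "test": [101, 99, 100],
    "post-test": [80, 82, 81],
}
averages = phase_averages(readings)
print(averages)
# a negative value means the heart rate dropped from pre-test to post-test
print(phase_difference(averages, "post-test", "pre-test"))
print(stai_category(38), stai_category(45), stai_category(63))
```

the per-respondent differences produced this way are the quantities that the paper then compares between phases and between groups with the statistical tests listed in section 4.4.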
acknowledgement: the authors are thankful to the ministry of education, science and technological development, republic of serbia, grant 174031.

references
[1] b. rodic-trmcic, a. labus, s. mitrovic, v. buha and g. stanojevic, “internet of things in e-health: an application of wearables in prevention and well-being,” in emerging trends and applications of the internet of things, igi global, 2017, pp. 191-197.
[2] q. li, y. xue, l. zhao, j. jia and l. feng, “analyzing and identifying teens stressful periods and stressor events from a microblog,” ieee journal of biomedical and health informatics, vol. pp, no. 99, pp. 1-1, 2016.
[3] n. sohail, “stress and academic performance among medical students,” journal of the college of physicians and surgeons pakistan, vol. 23, no. 1, pp. 67-71, 2013.
[4] m. s. ali and m. n. mohsin, “test anxiety inventory (tai): factor analysis and psychometric properties,” iosr journal of humanities and social science (iosr-jhss), vol. 8, no. 1, pp. 73-81, 2013.
[5] y. fu and j. liu, “system design for wearable blood oxygen saturation and pulse measurement device,” in proceedings of the 6th international conference on applied human factors and ergonomics (ahfe 2015) and the affiliated conferences, ahfe 2015, 2015.
[6] suunto, “suunto foot pod mini,” 01 november 2015. [online]. available: http://www.suunto.com.
[7] basis science, 02 november 2015. [online]. available: http://www.mybasis.com/.
[8] a. millings, j. morris, a. rowe, s. easton, j. k. martin, d. majoe and c. mohr, “can the effectiveness of an online stress management program be augmented by wearable sensor technology?,” internet interventions, vol. 2, no. 3, pp. 330-339, 2015.
[9] a. parnandi and r. gutierrez-osuna, “physiological modalities for relaxation skill transfer in biofeedback games,” ieee journal of biomedical and health informatics, vol. 21, no. 2, pp. 361-371, 2017.
[10] s. saeb, m. zhang, c. karr, s. schueller, m. corden, k. kording and d.
mohr, “mobile phone sensor correlates of depressive symptom severity in daily-life behavior: an exploratory study,” j med internet res, vol. 17, no. 7, p. e175, 2015.
[11] h. al osman, h. dong and a. el saddik, “ubiquitous biofeedback serious game for stress management,” ieee access, pp. 1274-1286, 2016.
[12] m. a. zafar, b. ahmed and r. gutierrez-osuna, “playing with and without biofeedback,” in proceedings of the ieee 5th international conference on serious games and applications for health (segah), 2017, perth, 2017.
[13] a. ahtinen, e. mattila, p. välkkynen, k. kaipainen, t. vanhala, m. ermes, e. sairanen, t. myllymäki and r. lappalainen, “mobile mental wellness training for stress management: feasibility and design implications based on a one-month field study,” jmir mhealth uhealth, vol. 1, no. 2, p. e11, 2013.
[14] s. falan, “wearable technology: examples from sweden,” 2016. [online]. available: http://salusdigital.net/wearable-technology-examples-sweden/.
[15] f. arriba-pérez, m. caeiro-rodríguez and j. m. santos-gago, “towards the use of commercial wrist wearables in education,” in proceedings of the 4th experiment@international conference (exp.at'17), 2017, faro, 2017.
[16] t. dragon, i. arroyo, b. p. woolf, w. burleson, r. kaliouby and h. eydgahi, “viewing student affect and learning through classroom observation and physical sensors,” in intelligent tutoring systems, berlin, springer berlin heidelberg, 2008, pp. 29-39.
[17] l. t. ping, k. subramaniam and s. krishnaswamy, “test anxiety: state, trait and relationship with exam,” malaysian journal of medical sciences, vol. 15, no. 2, pp. 18-23, 2008.
[18] l. shen, m. wang and r. shen, “affective e-learning: using “emotional” data to improve learning in pervasive learning environment,” educational technology & society, vol. 12, no. 2, pp. 176-189, 2009.
[19] m. kusserow, o. amft and g.
tröster, “monitoring stress arousal in the wild,” pervasive computing, ieee, vol. 12, no. 2, pp. 28-37, 2013.
[20] r. kocielnik, m. pechenizkiy and n. sidorova, “stress analytics in education,” in proceedings of the 5th international conference on educational data mining, chania, greece, 2012.
[21] a. joshi, r. kiran and a. n. sah, “an experimental analysis to monitor and manage stress among engineering students using galvanic skin response meter,” work, vol. 56, no. 3, pp. 409-420, 2017.
[22] world famous electronics llc., “pulse sensor amped,” 2017. [online]. available: https://pulsesensor.com/.
[23] m. chen, y. zhang, y. li, m. m. hassan and a. alamri, “aiwac: affective interaction through wearable computing and cloud technology,” ieee wireless communications, vol. 22, no. 1, pp. 20-27, 2015.
[24] j. j. alvarsson, s. wiens and m. e. nilsson, “stress recovery during exposure to nature sound and environmental noise,” int j environ res public health, vol. 7, no. 3, p. 1036–1046, 2010.
[25] w.-c. wang, “a study of the type and characteristics of relaxing music for college students,” in proceedings of meetings on acoustics, providence, rhode, 2014.
[26] d. c. spielberger, “test anxiety inventory: preliminary professional manual,” consulting psychologists press, 1980.
[27] m. y. kwan, k. p. arbour-nicitopoulos, e. duku and g. faulkner, “patterns of multiple health risk–behaviours in university students and their association with mental health: application of latent class analysis,” health promot chronic dis prev can, vol. 36, no. 8, p. 163–170, 2016.
[28] e. saperova and d. dimitriev, “effects of smoking on heart rate variability in students (545.6),” the faseb journal, vol. 28, no. 1, pp. 545-6, 2014.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp.
375-387 doi: 10.2298/fuee1403375d

user-awareness and adaptation in conversational agents

vlado delić 1, milan gnjatović 1,2, nikša jakovljević 1, branislav popović 1, ivan jokić 1, milana bojanić 1
1 faculty of technical sciences, university of novi sad, serbia
2 graduate school of computer sciences, megatrend university, belgrade, serbia

abstract: this paper considers the research question of developing user-aware and adaptive conversational agents. the conversational agent is a system which is user-aware to the extent that it recognizes the user identity and his/her emotional states that are relevant in a given interaction domain. the conversational agent is user-adaptive to the extent that it dynamically adapts its dialogue behavior according to the user and his/her emotional state. the paper summarizes some aspects of our previous work and presents work-in-progress in the field of speech-based human-machine interaction. it focuses particularly on the development of speech recognition modules in cooperation with the modules for emotion recognition and speaker recognition, as well as the dialogue management module. finally, it proposes an architecture of a conversational agent that integrates those modules and improves each of them by exploiting synergies among them.

key words: conversational agent, user-awareness, adaptation, speech recognition, emotion recognition, speaker recognition, dialogue management

1. introduction

context-awareness is certainly one of the most fundamental requirements for advanced conversational agents. recognition and interpretation of the user’s dialogue acts and dialogue management are always situated in a particular context. this is primarily due to the fact that many inherently present dialogue phenomena are context-dependent.
thus, nonlinguistic contexts shared between the user and the system (e.g., graphical displays) may strongly influence the language of the user, in particular the frequency of “irregular” utterances (elliptical and minor utterances, utterances containing anaphora and exophora, etc.) [1]. in addition, the user’s dialogue acts may fall outside the system’s domain, scope and semantic grammar, or contradict his/her earlier dialogue acts. this is even more the case when we consider users in non-neutral emotional states. forcing users to follow a preset grammar or interaction scenario is too restrictive, if possible at all, and would not be well accepted [2]. in such cases, the system needs a considerable amount of stored contextual knowledge to enable it to advance the conversation in spite of miscommunication and to maintain the dialogue’s consistency. however, the requirement for habitable natural language interfaces goes beyond pragmatics. another reason relates to the technology. speech recognition technology is still not accurate enough to deal with flexible, unrestricted language. in realistic settings, average word recognition error rates are 20–30%, and they go up to 50% for non-native speakers [3]. in certain conditions, speech recognition accuracy may degrade so dramatically that systems become unusable even for cooperative users [4]. researchers generally agree that conversational agents need to incorporate dialogue context models in order to maintain a consistent dialogue and overcome technical deficiencies. yet, context is a complex construct and can be considered from different aspects.

received april 30, 2014
corresponding author: vlado delić, faculty of technical sciences, university of novi sad, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: vlado.delic@uns.ac.rs)
in this paper, we consider a restricted research question of how user-awareness may help in improving dialogue management. this paper summarizes some aspects of our previous work and presents work-in-progress. in the reported approach, we differentiate between two research lines:
• user-awareness. the system is user-aware to the extent that it recognizes the user and his/her emotional states that are relevant in a given interaction domain.
• user-adaptation. the system is user-adaptive to the extent that it dynamically adapts its dialogue behavior according to the user and his/her emotional state.
at the methodological level, these two lines of research are fundamentally different. the first line relates to a statistical approach to the research problems of automatic speech recognition (asr), emotional speech recognition (esr), and speaker recognition. the speech signal encodes not only information about the lexical content of the speaker’s dialogue act, but also information about the speaker’s voice characteristics that may be used for recognition of the speaker and his/her emotional state [5], [6]. the basic idea is to use data derived from both speech and language corpora, and apply automated analysis methods. although speech/speaker/emotion recognition technologies have a common foundation, they are usually developed and applied separately. we build upon our previous work [7]-[13], and investigate the possibilities of combining these technologies rather than applying them separately. sections 2 and 3 discuss this in more detail. the second research line relates to a representational approach to natural language processing and dialogue management. in previous work, we introduced a representational model of attentional information in human-machine interaction that provides a framework for more robust natural language understanding and designing adaptive dialogue strategies [2], [14]-[17].
section 4 discusses the application of this model to designing user-adaptive conversational agents.

2. acoustic information-based approach to user-awareness

2.1. speech recognition

the task of automatic speech recognition is to translate spoken words into text. in order to accomplish this task, the reported speech recognizer exploits information about acoustic representations of phonemes, encapsulated in an acoustic model, and information about syntactic rules, encapsulated in a language model. the relation between words and phonemes is captured in a pronunciation dictionary where each word is segmented into at least one sequence of phonemes. since each phoneme has several acoustic representations, we use a context-dependent phone, referred to as a triphone, as the basic modeling unit. the acoustic model is based on hidden markov models and gaussian mixture models. in order to reduce the computational complexity of the model and to achieve robust parameter estimation, similar states of triphones share parameters. the tree-based clustering procedure presented in [18] is performed to find those similar states. the gaussians are modeled using full covariance matrices, since these yield a more accurate acoustic representation than models with diagonal covariance matrices [19]. however, in this variant the computational complexity of the log-likelihood computation is significantly increased. to overcome this problem, several approaches have been developed and applied [20], [21], [22]. the system uses feature vectors consisting of 15 mel-frequency cepstral coefficients (mfcc), normalized energy and their first derivatives. the feature vectors are extracted from 30 ms speech segments, every 10 ms. the training set for the acoustic model contains recordings of both scripted and spontaneous utterances produced by several dozen speakers, with a total duration of about 200 hours [23].
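the framing described above (30 ms segments taken every 10 ms) and the resulting feature dimension can be sketched as follows. this is only an illustrative sketch: the 16 khz sampling rate and the function names are assumptions made here, not details given in the paper, and the dimension of 32 is inferred from the description (15 mfccs plus energy, each with a first derivative).

```python
def frame_starts(num_samples, sample_rate, win_ms=30, hop_ms=10):
    """Return the start sample of every full analysis window.

    win_ms and hop_ms follow the text (30 ms windows, 10 ms shift);
    the sample rate is an assumption supplied by the caller.
    """
    win = sample_rate * win_ms // 1000
    hop = sample_rate * hop_ms // 1000
    if num_samples < win:
        return []  # signal shorter than one window
    return list(range(0, num_samples - win + 1, hop))


def feature_dim(num_mfcc=15, with_energy=True, with_delta=True):
    """Size of one feature vector: static coefficients plus first derivatives."""
    static = num_mfcc + (1 if with_energy else 0)
    return static * (2 if with_delta else 1)


if __name__ == "__main__":
    # one second of audio at an assumed sampling rate of 16 kHz
    starts = frame_starts(num_samples=16000, sample_rate=16000)
    print(len(starts), "frames of dimension", feature_dim())
```

because the windows overlap by 20 ms, neighbouring feature vectors are strongly correlated, which is one reason why first derivatives are appended to capture the local dynamics.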
language modeling is a particular issue for highly inflected languages, since language models have to cover a range of grammatical categories (including tense, aspect, mood, case, etc.) and morphological derivations that involve the addition of prefixes and suffixes. in the currently predominant statistically-based approach to asr, language models are trained on large text corpora. however, simple n-gram based language models do not suffice for morphologically more complex languages without significant modifications [24]. our language model is a combination of three n-gram models. the first model is based on tokens (surface forms), the second on lemmata, and the third on classes [23]. the size of the vocabulary causes data sparsity problems, so significantly larger language corpora are needed to obtain a robust language model. the training set for the language model consists of text content from various newspapers, scientific articles and books, with a total volume of about 16 million words (178865 lemmata). splitting words into phoneme sequences is relatively simple for the serbian language, due to the fact that it has phonemic orthography. however, there are some exceptions in word pronunciation (e.g., dvanaest is usually pronounced as dvanajst) and our phonetic inventory distinguishes stressed and unstressed variants of vowels; thus, for mapping words into phones, the system uses the pronunciation dictionary developed for speech synthesis [23]. the size of the search space is determined by the following factors: the number of words which are expected to be recognized, the number of their pronunciation variants, and the number of hidden markov model states in the acoustic model. for real-time recognition, it is important to reduce the search space, which can be a significant problem for highly inflected languages, where many derived forms may exist for a single lemma.
the standard way to cope with this problem is pruning, i.e., discarding the less probable hypotheses. for this purpose, a system should rely not only on an acoustic model, but also on a language model and information about word pronunciation. our system uses a decoder based on the token-passing algorithm (a variant of the viterbi algorithm in which the information about the path and score is stored at the word level instead of the trellis state level). a detailed description of the decoder can be found in [25].

2.2. emotion recognition

emotional speech recognition is concerned with the task of automatically identifying the emotional states of the speaker, based upon the analysis of his/her speech. prosodic and spectral features are the ones most frequently used for this task, while the less frequently used features include voice quality features (e.g., harmonic-to-noise ratio, jitter, shimmer). prosodic features, also referred to as paralinguistic features, include specific changes in pitch patterning, the energy of the voice signal, and changes in speech rate. the positions and bandwidths of formants, and a cepstral representation of the spectrum are usually selected as spectral features for emotional speech recognition. this is in line with the findings that the distribution of the spectral energy across the speech range of frequency is a possible measure of the emotional content of speech. in [11], we show that a feature set containing both the prosodic and the spectral features achieves high recognition accuracy (i.e., 91.5 %) of the basic emotional states (i.e., anger, joy, fear, sadness, and neutral). the feature vector was obtained by applying statistical functionals to the spectral/prosodic feature contours, where the most relevant functionals, ranked in descending order, are: moments, extrema, and regression coefficients [12].
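to make the notion of statistical functionals concrete, the sketch below maps a variable-length feature contour (e.g., a pitch track) to a fixed-length vector using the three families named above: moments, extrema, and linear regression coefficients. the particular functionals chosen and the function name are assumptions for illustration; the exact set used in [11], [12] is not reproduced here.

```python
from statistics import fmean, pvariance


def functionals(contour):
    """Summarize a feature contour by moments, extrema and a linear fit.

    contour: sequence of per-frame values (e.g., pitch in Hz), length >= 2.
    """
    n = len(contour)
    if n < 2:
        raise ValueError("need at least two frames to fit a regression line")

    # moments: mean and (population) variance
    mean = fmean(contour)
    var = pvariance(contour, mu=mean)

    # regression coefficients: least-squares line value = slope * t + intercept,
    # with t = 0, 1, ..., n-1 as the frame index
    t_mean = (n - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - mean) for t, y in enumerate(contour))
    slope = sxy / sxx
    intercept = mean - slope * t_mean

    return {
        "mean": mean,
        "variance": var,
        "min": min(contour),   # extrema
        "max": max(contour),
        "slope": slope,
        "intercept": intercept,
    }


if __name__ == "__main__":
    rising_pitch = [100.0, 110.0, 120.0, 130.0]
    print(functionals(rising_pitch))
```

the point of this design is that utterances of different durations all yield a vector of the same length, which a standard classifier can then consume.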
in many speech-based applications, it is beneficial to conceptualize the user’s emotional states in a given interaction domain as positive or negative (e.g., for the purpose of detecting a frustrated or satisfied call-centre customer). therefore, in our previous work, we also investigated the perspective of dimensional emotion models that describe emotional content in terms of valence (positive/negative emotion) and arousal (active/passive emotion). we conducted a comparative study of two acted emotion corpora to investigate possibilities for classification of discrete basic emotions in the valence-arousal space [26]. the first conclusion of this study was that the prosodic-spectral feature set proposed in [11] is almost equally effective in modeling emotions in the valence-arousal space as compared to modeling discrete emotional states. the second conclusion was that the discrimination of emotional states according to the arousal level is more successful than their discrimination according to the valence level [26]. our research on acoustic information-based emotion recognition was primarily supported by the gees corpus of emotional and attitude-expressive speech in serbian [27]. it contains recordings of acted speech-based emotional expressions. six drama students (3 female, 3 male) were engaged to produce emotionally colored utterances. they were given a set of textual entries (32 isolated words, 30 short sentences, 30 long sentences, and one passage of 79 words) and asked to express each entry in five emotional states (anger, joy, fear, sadness and neutral). the perception test demonstrated that the corpus contains acoustic variations that are indicative of emotional expression of the five target emotional states.

2.3. speaker recognition

our research on speaker recognition centers on text-independent speaker recognition based on the feature set that contains mel-frequency cepstral coefficients (mfcc) and their first and second derivatives.
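a feature set of static mfccs plus first and second derivatives can be assembled as in the sketch below. it uses the common regression formula for derivatives with edge frames replicated; the choice of 13 static coefficients and a regression width of 2 are assumptions made here for illustration and are not stated in the paper, although 13 static coefficients with two derivative orders would give a 39-dimensional vector.

```python
def deltas(frames, width=2):
    """Regression-based derivatives of a sequence of feature vectors.

    d[t] = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2), k = 1..width,
    with the first and last frames replicated at the edges.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]  # clamp at the last frame
                prv = frames[max(t - k, 0)][d]      # clamp at the first frame
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def add_derivatives(frames, width=2):
    """Concatenate static features with first and second derivatives."""
    d1 = deltas(frames, width)
    d2 = deltas(d1, width)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]


if __name__ == "__main__":
    # ten frames of 13 hypothetical static coefficients, rising linearly in time
    static = [[float(t)] * 13 for t in range(10)]
    full = add_derivatives(static)
    print(len(full), "frames of dimension", len(full[0]))
```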
the research was primarily supported by a corpus containing recordings of 121 native serbian speakers (61 female, 60 male). each speaker produced 14 audio recordings: one recording of the speaker uttering his/her first name and family name, two recordings of the speaker uttering a sequence of digits, and eleven recordings of the speaker uttering a sequence of syntactically unrelated words. to reduce the dimensionality of the standard mfcc, we applied the technique of principal component analysis (pca). the reported experimental results [9] suggest that this technique is appropriate to reduce the dimensionality without reducing the recognition accuracy. with the applied automatic speaker recognizer, a 14-dimensional pca feature space already reaches the same recognition accuracy as the 39-dimensional mfcc feature space. mfccs depend on the energy in an observed speech frame. therefore, the distribution of a speaker's feature vectors depends on the lexical content and expressed emotions. to decrease the text dependency of the covariance matrices used for speaker modeling, we apply an algorithm of model elements weighting introduced in [10]. the basic idea may be formulated as follows: the importance of an element of the speaker model in the decision making processes decreases as its time variability increases. in accordance with this, an element of the speaker model that has the highest time variability will be assigned the smallest weight. in real applications, it can be the case that, for some speakers, the automatic speaker recognizer has only one model determined during the training phase. thus, the recognizer cannot observe the time variability of model elements. the time variability of speaker models depends primarily on the largest model elements.
by applying a nonlinear function, such as the sigmoid function, on the largest model elements, the time variability of the speaker models is decreased, and consequently, the recognition accuracy is increased. also, mfccs depend on the assumed shape of auditory critical bands. when the mfccs are determined under the assumption that the auditory critical bands have exponential shape based on the lower part of the exponential function, the automatic speaker recognizer shows more accurate performance than in the case when the rectangular or triangular auditory critical bands are applied [10]. it should be noted that emotional speech may significantly affect the accuracy of speaker recognition. however, not all emotions are equally critical for speaker recognition. preliminary experiments conducted on the gees database confirmed that, e.g., the emotion of anger changes the speaker’s voice (i.e., timbre) to a greater extent than the emotion of sadness. in the next sections, we discuss how a combination of different knowledge sources may improve the recognition accuracy.

2.4. interplay between speech, emotion and speaker recognition

acoustic features and language information contained within the acoustic, pronunciation and language models may be efficiently combined and used for speech, emotion and speaker recognition [5]. high-level features, e.g., phones, idiolect, semantic, accent and pronunciation, reveal speaker characteristics, such as socio-economic status, language background, personality type, and environmental influence [6]. for speech recognition systems based on hidden markov models in combination with gaussian mixture models, numerous techniques have been developed for model adaptation to a specific speaker and acoustic conditions [28]-[31]. they can be grouped into two classes based on maximum a posteriori (map) and maximum likelihood (ml) estimation, respectively.
a map-based adaptation interpolates the original prior parameter values with parameters obtained from the adaptation data, and thus the estimated parameters converge asymptotically to the adaptation domain as the amount of adaptation data increases [28]. however, in the case of sparse adaptation data, many model parameters remain unchanged [32]. ml-based methods assume that there is a set of linear transformations which can map the existing model parameters into new adapted model parameters. since they use linear transformations to map parameters, these methods are referred to as ml linear regression (mllr). mllr can be applied only to the gaussian mean vectors or to both mean vectors and covariance matrices. a special case of mllr where the mean vector and covariance matrix of a gaussian have the same transformation matrix is called constrained mllr. while the use of mean mllr adaptation has the greatest positive impact, the use of variance mllr adaptation may also bring a slight improvement in recognition accuracy [29]. the major advantage of mllr over map adaptations is evident in the case of sparse adaptation data, where the same transformation can be applied to all gaussians in the same acoustic class [32]. alternatively, speaker adaptation can be achieved by transformation of features instead of model parameters. the common procedure is vocal tract length normalization [33], [34]. the basic idea is to find warp scales of the frequency axis for each speaker such that the spectrum fits the spectrum of the universal speaker with a standard vocal tract length, and to apply that transformation to the used features. in this way, within-class scattering and the overlapping between classes are reduced. it is interesting to note that the constrained mllr can be treated as feature transformation, and that it is commonly used for speaker adaptive training.
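the adaptation schemes discussed above can be summarized with their textbook forms for a single gaussian. the notation is assumed here rather than taken from the paper: \mu_0 and \Sigma_0 are the prior (speaker-independent) mean and covariance, x_t are adaptation frames, \gamma_t is the occupation probability of the gaussian at frame t, \tau is a prior weight, and (A, b) is the affine transform shared within a regression class.

```latex
% map update of a gaussian mean: interpolation between prior and data
\hat{\mu}_{\mathrm{MAP}} =
  \frac{\tau\,\mu_0 + \sum_{t}\gamma_t\,x_t}{\tau + \sum_{t}\gamma_t}

% mean mllr: one affine transform shared by all gaussians of a class
\hat{\mu}_{\mathrm{MLLR}} = A\,\mu_0 + b

% constrained mllr: the same matrix transforms mean and covariance
\hat{\mu} = A\,\mu_0 + b, \qquad \hat{\Sigma} = A\,\Sigma_0\,A^{\top}

% which is equivalent to transforming the features instead of the model
\mathcal{N}\bigl(x;\, A\mu_0 + b,\, A\Sigma_0 A^{\top}\bigr)
  = \lvert \det A \rvert^{-1}\,
    \mathcal{N}\bigl(A^{-1}(x - b);\, \mu_0,\, \Sigma_0\bigr)
```

in the map update, a gaussian that receives no adaptation data (\sum_t \gamma_t = 0) keeps its prior mean, while the estimate approaches the data mean as \sum_t \gamma_t grows; this matches the behaviour described for sparse and abundant adaptation data. the last identity shows why constrained mllr can be implemented as a feature transformation.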
models trained with speaker adaptive training may achieve higher recognition accuracy [35]. additionally, the accuracy of an asr system can be improved by the adaptation of the language model in terms of reducing the search space and confusion between words [36], [37]. it is widely acknowledged that the speaker’s emotional states affect the speech production system at several levels – from the higher levels of linguistic coding (word selection and sentence structure) to the lower levels of articulator movements (phoneme/word production). this, in turn, may significantly degrade the performance of asr systems. in general, asr performance and prosodic properties of an utterance are related. variations in speaking style and speaking rate, relative to asr training conditions, may have a negative impact on the performance of an asr system [38]. prosodic features reflect those variations, and some studies show that prosody itself can be used to re-rank asr hypotheses so as to separate the correctly recognized utterances from incorrectly recognized ones [39], [40]. it can be expected that an asr system using acoustic models trained on neutral speech will have reduced performance when it operates under the conditions of emotional speech. reference [41] shows that training asr models on neutral speech, and subsequently adapting them on emotional speech samples, does have a positive impact on the recognition performance under such conditions. in [11] and [42], we discuss how the same prosodic and spectral features can be employed for the purpose of speech recognition, emotion recognition and speaker recognition. fig. 1 illustrates how knowledge from different sources is intended to be used in the reported speech processing module. the relationship between these technologies goes beyond prosodic and spectral features. in the next section, we discuss how emotion recognition can employ lexical and discourse information provided by an asr system.
fig. 1 combining knowledge from different sources in the speech processing module

3. emotion recognition based on linguistic information

emotion recognition can also be based on lexical and discourse information [43], e.g., a semantic analysis of an output hypothesis of an asr engine [44]. in line with this, one line of our research focuses on recognition and tracking of emotional states of the user from lexical information and other linguistic features. as part of previous work [1], a substantial refinement of the wizard-of-oz technique was proposed in order that a scenario designed to elicit affected speech in human-machine interaction could result in realistic and useful data. the nimitek corpus of affected speech in human-machine interaction was produced during a refined wizard-of-oz simulation. ten healthy native german speakers participated in the study (7 female, 3 male, ages 18 to 27, mean 21.7). the corpus contains 15 hours of audio and video recordings. the number of the subjects’ dialogue turns is 1,847, the average number of words per turn is 17.19 (with standard deviation 24.37), and the subjects’ lexicon contains about 900 lemmata. the evaluation of the corpus with respect to the perception of its emotional content demonstrated that it contains recordings of emotions that were overtly signaled, and that the subjects’ utterances are indicative of the way in which untrained, non-technical users probably like to converse with conversational agents [1]. the transcribed version of the nimitek corpus was used to conduct a corpus-based examination of various linguistic features that may carry affect information [13]. for the purpose of this contribution, we illustrate the following linguistic features: key words and phrases, lexical cohesive agencies, dialogue act sequences, and negations. the most obvious way of recognizing an emotional state is to detect key words and phrases in users’ utterances.
examples from the nimitek corpus are given in table 1. however, expressions of emotions are not necessarily limited to a single dialogue act, but can also map over a range of mutually related dialogue acts. for example, the choice of lexical items made to create cohesion in the dialogue can signal an emotion-related state, both at the lexical level (e.g., repetitions), as well as at the semantic level (e.g., reformulations). table 2 contains examples of repetitions and reformulations that signal negative emotional states. in contrast to this, another form of anaphoric cohesion in a dialogue is achieved by ellipsis-substitutions. the typical meaning of ellipsis-substitutions is not one of co-reference – there is always some significant difference between the second instance and the first [45]. to illustrate this, let us observe a typical example from the nimitek corpus: please do it! (bitte tu das!). this utterance does not explicitly provide information about what the system is expected to do, but contains an elliptical substitution (verb do) which is used for signaling that the action the system performed is not the same as the action instructed by the user (indicated by the anaphoric reference it). in general, ellipsis-substitutions may signal a potential problem in communication.

table 1 examples of key words and phrases that relate to emotional states (adopted and adjusted from [13])

emotional state | examples of the subjects’ key words and phrases
annoyed | sh*t (sche*ße), stupid (blöd), do what i say (tu was ich sage), oh … something like this i hate just like the plague. (oh... so was hasse ich doch wie die pest.)
retiring | i don't understand it (ich versteh’ das nicht), it's not working at all (das geht doch gar nicht).
indisposed | i am going now (ich geh’ gleich), oh man (oh man), god (gott), i don't feel like doing any more. (ich hab’ kein’ bock mehr.)
offending | you think, doll. (denkst du, puppe)
satisfied | super (super), awesome (geil), i am good, am i not? (bin gut, was?)

table 2 examples of lexical cohesive agencies that relate to negative emotional states (adopted and adjusted from [13])

lexical cohesive agencies | examples of the subjects’ dialogue acts
repetition | it just cannot be. it just … it just cannot be. (das kann doch nicht sein. das ist doch … das kann doch nicht sein.)
reformulation | not true at all. that’s definitely wrong. (gar nicht wahr. das stimmt gar nicht.)
ellipsis-substitution | please do it.

based on this study, a prototypical automatic annotator for recognition and tracking of the user’s emotional states from linguistic information was implemented [13]. it should be noted that the emotional states in the nimitek corpus [1] were conceptualized within the data-driven 6-class emotion model arisen (annoyed, retiring, indisposed, satisfied, engaged, neutral). in addition, the subjects’ expressions in the nimitek corpus often contain mixed emotions, and the human evaluators were allowed to assign more than one emotion label to each of the subjects’ utterances. thus, the automatic annotator was implemented to annotate mixed emotions, i.e., to attribute zero, one or more labels from the arisen model to each of the subjects’ utterances. the results of the automatic annotation were compared with the results of the human evaluators. for the given 6-class emotional model arisen, the annotator showed the following performance: 31.70% of subject emotional states were correctly recognized, 34.35% of subject emotional states were not recognized, and 33.92% of subject emotional states were incorrectly recognized. furthermore, the arisen model was down-sampled to a model that differentiates between 3 emotional states, i.e., negative (including annoyed, retiring and indisposed emotional states), neutral, and positive (including satisfied and engaged emotional states).
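the multi-label annotation and the reduction from six arisen classes to three can be sketched as follows. this is a deliberately simplified illustration: the tiny lexicon is taken from the examples in table 1, plain substring matching is an assumption made here, and the annotator reported in [13] draws on further linguistic features that are not modeled.

```python
# mapping from the 6-class arisen model to the 3-class model described above
ARISEN_TO_3CLASS = {
    "annoyed": "negative",
    "retiring": "negative",
    "indisposed": "negative",
    "neutral": "neutral",
    "satisfied": "positive",
    "engaged": "positive",
}

# hypothetical mini-lexicon built from the key phrases listed in table 1
KEY_PHRASES = {
    "annoyed": ["blöd", "tu was ich sage"],
    "retiring": ["das geht doch gar nicht"],
    "indisposed": ["oh man"],
    "satisfied": ["super", "geil"],
}


def annotate(utterance):
    """Return the set of arisen labels whose key phrases occur in the utterance.

    The result may be empty or contain several labels (mixed emotions).
    """
    text = utterance.lower()
    return {
        label
        for label, phrases in KEY_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    }


def collapse(labels):
    """Map a set of arisen labels to the 3-class (negative/neutral/positive) model."""
    return {ARISEN_TO_3CLASS[label] for label in labels}


if __name__ == "__main__":
    for utt in ["das ist doch blöd", "oh man, das geht doch gar nicht", "hallo"]:
        labels = annotate(utt)
        print(utt, "->", sorted(labels), "->", sorted(collapse(labels)))
```

collapsing merges confusions among the negative classes (e.g., annoyed versus retiring) into a single label, which is one reason why fewer states count as incorrectly recognized in the 3-class setting.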
for this 3-class problem, the annotator showed the following performance: 51.20% of subject emotional states were correctly recognized, 33.67% of subject emotional states were not recognized, and 17.26% of subject emotional states were incorrectly recognized. when interpreting these results, it should be kept in mind that the automatic annotation was based only on lexical information, while the human evaluators were influenced by prosody as well.

4. user-adaptive dialogue management

the main idea underlying the conversational agent’s adaptation is that its dialogue behavior is dynamically adapted according to the user and his/her emotional state. in this respect, the dialogue management module is the central component of the conversational agent. it consists of two components: dialogue context model and adaptive dialogue control [46].

4.1. dialogue context model

the dialogue context model keeps track of information relevant to the dialogue. for the purpose of this contribution, it includes the following knowledge sources:
• lexical and propositional content of the user’s dialogue act,
• attentional state,
• emotional state of the user,
• information about the user.
among these sources, attentional state deserves further discussion. at the conceptual level, attentional state contains information about the dialogue entities that are most salient at any given point. its purpose is twofold [2], [47]. first, it summarizes information from previous dialogue acts that is necessary for processing subsequent ones, and allows for processing spontaneously produced users' dialogue acts. this is an important characteristic of the system, not just because it enables a more natural dialogue, but also because forcing users to follow a preset grammar or interaction scenario is hardly acceptable for users in negative emotional states.
the second purpose of attentional state is that it allows for predicting the dialogue behavior of the user, i.e., it forms the basis for expectations about the succeeding dialogue acts. this information is important both for automatic speech recognition, as a means of reducing a set of asr hypotheses, and for adaptive dialogue control, for taking initiative in a dialogue. in [2], we introduced a representational model of attentional information in human-machine interaction that provides a framework for more robust natural language processing and dialogue management. this model integrates neurocognitive understanding of the focus of attention in working memory, the notion of attention related to the theory of discourse structure in the field of computational linguistics, and investigation of the nimitek corpus. to the extent that it is computationally appropriate, it was successfully adapted and applied in several prototypical conversational agents with diverse domains of interaction [14], including the dialogue management module reported in this paper.

fig. 2 the intended architecture of a conversational agent

4.2. adaptive dialogue control

the dialogue control component implements dialogue strategies of the conversational agent. in general, a dialogue strategy involves deciding what to do next once the user’s input has been received and interpreted, e.g., prompting the user for more input, clarifying the user’s previous input, outputting information to the user, etc. [46]. we recall that the reported conversational agent is adaptive to the extent that it dynamically adapts its dialogue strategies according to the current user and his/her emotional state. therefore, an adaptive dialogue strategy is specified by means of a set of rules that take information about the current dialogue context into account.
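a rule-based strategy of this kind can be sketched as an ordered list of (condition, action) pairs evaluated against the dialogue context. everything in the sketch is hypothetical: the field names, the rule set and the action labels are assumptions chosen to mirror the knowledge sources and example actions mentioned in the text, not the rules of the reported system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class DialogueContext:
    """Hypothetical container for the knowledge sources of the context model."""
    user_act: Optional[str]        # interpreted content of the last dialogue act, None if not understood
    focus: Optional[str]           # most salient entity in the attentional state, None if empty
    emotion: str = "neutral"       # "negative", "neutral" or "positive"
    user_id: Optional[str] = None  # information about the user, if recognized


# a rule pairs a condition on the context with the name of a system action
Rule = Tuple[Callable[[DialogueContext], bool], str]

DEFAULT_STRATEGY: List[Rule] = [
    # a frustrated user who was not understood: the system takes the initiative
    (lambda c: c.user_act is None and c.emotion == "negative", "take_initiative"),
    (lambda c: c.user_act is None, "clarify_previous_input"),
    (lambda c: c.focus is None, "prompt_for_more_input"),
    (lambda c: True, "output_information"),
]


def next_action(ctx: DialogueContext, strategy: List[Rule] = DEFAULT_STRATEGY) -> str:
    """Return the action of the first rule whose condition holds."""
    for condition, action in strategy:
        if condition(ctx):
            return action
    return "prompt_for_more_input"  # fallback if no rule fires


if __name__ == "__main__":
    ctx = DialogueContext(user_act=None, focus="ring", emotion="negative")
    print(next_action(ctx))
```

keeping the strategy as plain data rather than hard-coded control flow means that a different rule list can be passed to `next_action` without touching the dialogue manager itself.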
we build upon previous work on emotion-adaptive dialogue strategies, and end-user design of adaptive dialogue strategies. it is important to note that the reported dialogue management module allows the end-user to design dialogue strategies. this makes two levels of adaptation possible. the dialogue behavior is not only dynamically adapted according to the current dialogue strategy, but also the dialogue strategy itself can be redefined by the user. for detailed discussion, the reader may consult [16], [17].

5. concluding remarks

this paper summarizes some aspects of our previous work and presents work-in-progress on developing user-aware and adaptive conversational agents. the intended architecture of a conversational agent is given in fig. 2. the speech recognition module and the dialogue management module (integrated with the natural language processing modules) are fully implemented, while emotion recognition and speaker recognition modules are implemented at a prototype level. current and future prospects of our research in this field include (but are not limited to): further investigation of the interplay between speech recognition, emotion recognition and speaker recognition, investigation of linguistic cues for early recognition of negative dialogue developments, further development of dialogue strategies for preventing and handling negative dialogue development, and investigation of more complex user models and alternative models of emotions.

acknowledgement: the presented study was performed as part of the project “development of dialogue systems for serbian and other south slavic languages” (tr32035), funded by the ministry of education, science and technological development of the republic of serbia.

references

[1] m. gnjatović and d. rösner, “inducing genuine emotions in simulated speech-based human-machine interaction: the nimitek corpus”. ieee transactions on affective computing, vol. 1, no.
2, pp. 132-144, july-dec. 2010, doi: 10.1109/t-affc.2010.14
[2] m. gnjatović, m. janev, v. delić, “focus tree: modeling attentional information in task-oriented human-machine interaction”. applied intelligence, vol. 37, no. 3, pp. 305-320, 2012, doi: 10.1007/s10489-011-0329-5
[3] d. bohus and a. rudnicky, “sorry, i didn’t catch that! an investigation of non-understanding errors and recovery strategies”. in recent trends in discourse and dialogue, vol. 39 of text, speech and language technology, pp. 123–154, springer, 2008.
[4] c.h. lee, “fundamentals and technical challenges in automatic speech recognition”. in proc. of the 12th international conference speech and computer, specom 2007, pp. 25–44, moscow, russia, 2007.
[5] b. schuller, g. rigoll, m. lang, “speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture”. in proc. of icassp 2004, vol. 1, pp. i-577-580, 2004, doi: 10.1109/icassp.2004.1326051
[6] t. kinnunen and l. haizhou, “an overview of text-independent speaker recognition: from features to supervectors”. speech communication, vol. 52, pp. 12-40, 2010, doi: 10.1016/j.specom.2009.08.009
[7] v. delić, m. sečujski, n. jakovljević, m. gnjatović, i. stanković, “challenges of natural language communication with machines”. chap. 19 in daaam international scientific book 2013, pp. 371-388, 2013, doi: 10.2507/daaam.scibook.2013.19
[8] n. jakovljević, d. mišković, m. janev, m. sečujski, v. delić, “comparison of linear discriminant analysis approaches in automatic speech recognition”. electronics and electrical engineering, vol. 19, no. 7, pp. 76-79, 2013, doi: 10.5755/j01.eee.19.7.5167
[9] i. jokić, s. jokić, z. perić, m. gnjatović, v. delić, “influence of the number of principal components used to the automatic speaker recognition accuracy”. electronics and electrical engineering, vol. 18, no. 7, pp. 83-86, 2012, doi: 10.5755/j01.eee.123.7.2379
[10] i. jokić, s. jokić, v.
delić, z. perić, “towards a small intra-speaker variability models”. electronics and electrical engineering, vol. 20, 2014 (in press). [11] v. delić, m. bojanić, m. gnjatović, m. sečujski, s.t. jovičić, “discrimination capability of prosodic and spectral features for emotional speech recognition”. electronics and electrical engineering, vol. 18, no. 9, pp. 51-54, 2012, doi: 10.5755/j01.eee.18.9.2806 [12] m. bojanić, v. delić, m. sečujski, “relevance of the types and the statistical properties of features in the recognition of basic emotions in speech”. facta universitatis, series: electronics and energetics, vol. 27, 2014 (in press). [13] m. gnjatović, m. kunze, x. zhang, j. frommer, d. rösner, “linguistic expression of emotion in human-machine interaction: the nimitek corpus as a research tool”. in proceedings of the 4th int. workshop on human-computer conversation, bellagio, italy, no pagination, 2008. [14] m. gnjatović and v. delić, “a cognitively-inspired method for meaning representation in dialogue systems”. in proc. of the 3rd ieee int. conf. coginfocom-2012, košice, slovakia, pp. 383-388, 2012. [15] m. gnjatović and v. delić, “electrophysiologically-inspired evaluation of dialogue act complexity”. in proc. of the 4th ieee int. conf. coginfocom 2013, budapest, hungary, pp. 167-172, 2013. [16] m. gnjatović and v. delić, “cognitively-inspired representational approach to meaning in machine dialogue”. knowledge-based systems, doi: 10.1016/j.knosys.2014.05.001, 2014. 386 v. delić, m. gnjatović, n. jakovljević, b. popović, i. jokić, m. bojanić [17] m. gnjatović, “therapist-centered design of a robot's dialogue behavior”. cognitive computation, special issue: the quest for modeling emotion, behavior and context in socially believable robots and ict interfaces, springer, doi: 10.1007/s12559-014-9272-1 (in press). [18] s. j. young, j. odell, p. c. woodland, “tree-based state tying for high accuracy acoustic modelling”. 
in proceedings of the workshop on human language technology, pp. 307-312, 1994, doi: 10.3115/ 1075812.1075885 [19] n. jakovljević, d. mišković, e. pakoci, t. grbić and v. delić, “poređenje performansi nekoliko varijanata gmm u sistemima za prepoznavanje govora”. in proc. of 21th telecommunications forum, telfor 2013, belgrade, serbia, pp. 466-469, 2013. [20] m. janev, d. pekar, n. jakovljević, v. delić, “eigenvalues driven gaussian selection in continuous speech recognition using hmms with full covariance matrices”. applied intelligence, vol. 33, no. 2, pp. 107-116, 2010, doi: 10.1007/s10489-008-0152-9 [21] b. popović, m. janev, d. pekar, n. jakovljević, m. gnjatović, m. sečujski, v. delić “a novel split-andmerge algorithm for hierarchical clustering of gaussian mixture models”. applied intelligence, vol. 37, no. 3, pp. 377-389, 2012, doi: 10.1007/s10489-011-0333-9 [22] n. jakovljević, primena retke reprezentacije na modelima gausovih mešavina koji se koriste za automatsko prepoznavanje govora, phd thesis, university of novi sad, march 2014. [23] v. delić, m. sečujski, n. jakovljević, d. pekar, d. mišković, b. popović, s. ostrogonac, m. bojanić, d. knežević, “speech and language resources within speech recognition and synthesis systems for serbian and kindred south slavic languages”. in proc. of the specom 2013, pilsen, czech republic, lncs, vol. 8113, springer, pp. 319-326, 2013, doi: 10.1007/978-3-319-01931-4_42 [24] s. ostrogonac, m. sečujski, v. delić, d. mišković, n. jakovljević, n. vujnović sedlar, a mixed-structure ngram language model, axon inteligentni sistemi, novi sad, serbia. international patent pening: pct/ rs2013/000009 [25] n. jakovljević, d. mišković, m. janev, d. pekar, “a decoder for large vocabulary speech recognition”. in proc. of 18th international conference on systems, signals and image processing, iwssip 2011, sarajevo, bosnia and herzegovina, pp. 287-290, 2011. [26] m. bojanić, m. gnjatović, m. sečujski, v. 
delić: “application of dimensional emotion model in automatic emotional speech recognition”. in proc. of the 11th ieee int. symp. on intelligent systems and informatics, sisy 2013, subotica, serbia, pp. 353-356, 2013, doi: 10.1109/sisy.2013.6662601 [27] s.t. jovičić., z. kašić, m. djordjević, m. rajković, “serbian emotional speech database: design, processing and evaluation”. in proc. of specom 2004, st peterburg, pp.77–81, 2004. [28] j. gauvain and c. h. lee, “maximum a posteriori estimation for multivariate gaussian mixture observations of markov chains”. ieee trans. on speech and audio process., vol. 2, no. 2, pp. 291-298, apr. 1994, doi: 10.1109/89.279278 [29] m.j.f. gales, “maximum likelihood linear transformations for hmm-based speech recognition”. computer speech & language, vol. 12, no. 2, pp. 75-98, 1998, doi: 10.1006/csla.1998.0043 [30] m.j.f. gales and p.c. woodland, “mean and variance adaptation within the mllr framework”. computer speech & language, vol. 10, no. 4, pp. 249-264, 1996, doi: 10.1006/csla.1996.0013 [31] d. povey and g. saon, “feature and model space speaker adaptation with full covariance gaussians”. in proc. interspeech 2006, paper 2050-tue2bup.14, 2006. [32] m.j.f. gales and s. young, “the application of hidden markov models in speech recognition”. foundations and trends in signal processing, vol. 1, no. 3, pp. 195-304, 2008, doi: 10.1561/2000000004 [33] n. jakovljević, d. mišković, m. sečujski, d. pekar, “vocal tract normalization based on formant positions”. in proc. inter. language technologies conference is-ltc 2006, ljubljana, pp. 40-43, 2006. [34] n. jakovljević, m. sečujski, v. delić, “vocal tract length normalization strategy based on maximum likelihood criterion”. in proc. eurocon 2009, st. petersburg, pp. 417-420, 2009, doi: 10.1109/eurcon. 2009.5167662 [35] g. saon and j.t. chien, “large-vocabulary continuous speech recognition systems”. ieee signal processing magazine, vol. 29, no. 6, pp. 12-33. nov. 
2012, doi: 10.1109/msp.2012.2197156 [36] j.m. lucas-cuesta j. ferreiros, f. fernandez-martinez, j.d. echeverry, s. lutfi, “on the dynamic adaptation of language models based on dialogue information”. expert systems with applications, vol. 40, no. 4, pp. 1069-1085, 2013, doi: 10.1016/j.eswa.2012.08.029 [37] w. kim, language model adaptation for automatic speech recognition and statistical machine translation, phd thesis, johns hopkins university, 2005. [38] l. ten bosch, “emotions: what is possible in the asr framework”. itrw on speech and emotion, northern ireland, uk, pp. 189-194, 2000. [39] j. hirschberg, d. litman, m. swerts, “prosodic and other cues to speech recognition failures”. speech communication, vol. 43, pp. 155-175, 2004. user-awareness and adaptation in conversational agents 387 [40] d. litman, j. hirschberg, m. swerts, “predicting automatic speech recognition performance using prosodic cues”. in proc. of the 1 st north american chapter of the association for computational linguistics, naac, seattle, pp. 218-225, 2000. [41] b. vlasenko, d. prylipko, a. wendemuth, “towards robust spontaneous speech recognition with emotional speech adapted acoustic models”. s. wölfl (ed.), poster and demo track of the 35th german conference on artificial intelligence, ki-2012, saarbrucken, germany, pp. 103-107, 2012. [42] b. popović, i. stanković, s. ostrogonac, “temporal discrete cosine transform for speech emotion recognition”. in proc. of the 4th ieee int. conf. coginfocom 2013, budapest, hungary, pp. 87-90, 2013. [43] c.m. lee and s.s. narayanan, “toward detecting emotions in spoken dialogs”. ieee transactions on speech and audio processing, vol. 13, no. 2, pp. 293-303, 2005, doi: 10.1109/tsa.2004.838534 [44] r. müller, b. schuller, g. rigoll, “enhanced robustness in speech emotion recognition combining acoustic and semantic analyses”. in proc. of the workshop from signals to signs of emotion and vice versa, santorini, greece, 2004. [45] m. 
halliday, an introduction to functional grammar, edward arnold, london new york, second edition, 1994. [46] k. jokinen and m. mctear, spoken dialogue systems. synthesis lectures on human language technologies, morgan and claypool, 2009. [47] b. grosz and c. sidner, “attention, intentions, and the structure of discourse”. comput linguist, vol. 12, no 3, pp. 175-204, 1986. instruction facta universitatis series: electronics and energetics vol. 32, n o 1, march 2019, pp. 51-63 https://doi.org/10.2298/fuee1901051d reduction of susceptibility from electromagnetic interference in sensorless foc of ipmsm * lindita dhamo 1 , aida spahiu 1 , mitja nemec 2 , vanja ambrozic 2 1 polytechnic university of tirana, faculty of electrical engineering, tirana, albania 2 university of ljubljana, faculty of electrical engineering, ljubljana, slovenia abstract: this paper presents main problems of practical implementation of field oriented control (foc) developed for an interior permanent magnet synchronous motor (ipmsm). the main sources of electromagnetic interferences (emi) noises are discussed and practical aspects when a position sensor is used are presented. the control system is based on the dsp processing unit, together with inverter and encoder. the main problem addressed in this paper is reduction of vibrations in torque and speed response in a real system by re-placing a hardware device of control system very susceptible to emi noises, like encoder, with a soft block in control unit like sliding mode observer, less sensitive to emi. the experimental results with this control structure show considerable ripple reduction at steady state in torque, speed and current, as a consequence of reduction of sensitivity to emi noises. key words: emi, ipmsm, sensorless foc. 1. introduction pmsm has become really competitive to an induction motor in terms of lifetime cost. 
this motor has recently become quite attractive due to its many advantages be-cause magnets, instead of windings, are used for rotor magnetization [2]. phase inductance of pmsm is lower than that of the induction motor. thus, in pmsm, the effect of electromagnetic noise is greater when compared to the induction motor [3]. electro magnetic interference (emi), the appropriate term when referring to lower frequencies, or radio frequency interference (rfi), the appropriate term when referring to higher frequencies, is unwanted electrical noise that can interfere with signaling or communication equipment. drives with 8 khz or higher switching frequency have many harmonic frequencies, which produce problematic emissions affecting sensitive equipment. received february 3, 2018; received in revised form october 12, 2018 corresponding author: aida spahiu polytechnic university of tirana, faculty of electrical engineering, sheshi “nene tereza” nr.4, 1000, tirana, albania (e-mail: aida.spahiu@fie.upt.al) * an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 30 september 01, 2017, in niš, serbia [1]. 52 l. dhamo, a. spahiu, m. nemec, v. ambrozic reducing the pwm carrier frequency reduces the effects and lowers the risk of common mode noise interference. higher carrier frequencies are less efficient for the drive, but lower carrier frequencies are less efficient for the motor. in general, restricting the propagation of electrical noise as close to the noise source as possible is the best way to protect sensitive devices from emi. there are many studies about the reduction of emi on ac drive systems. random pwm technique has been developed to suppress emi in power converters [4]–[8] and have shown that with this method it is possible to reduce acoustic noise and mechanical vibration. 
random pwm is carried out in various ways, such as random switching frequency, random pulse position and random switching techniques. it was shown that acoustic noise and emi were suppressed by using the random pwm technique in the svpwm algorithm [9]–[10]. methods with varying switching frequencies, like random or chaotic pwm, are generally applied to induction motors. a chaotic signal is obtained more easily than a random signal and it is also simpler to apply [11]. various techniques are available and discussed in the literature, such as chaotic sinusoidal pwm, chaotic pulse position pwm, hybrid chaotic spwm and chaotic sv-pwm methods [12]–[14].

emi can create adverse effects in electrical components in the motor control panel, contributing to a loss of serial communication, nuisance drive trips and disturbance of control signals. emi not only degrades the performance of electrical equipment but also decreases the lifetime of components and increases the financial cost of equipment maintenance. this paper deals with electromagnetic interference (emi) and its prevention through the design of the control system. it presents the case in which the device sensitive to emi, or noise receiver, is replaced with a soft block in the control scheme that is less sensitive to emi, in order to reduce the negative effect of the emi propagation present in the system.

2. mathematical model of an ipmsm with system uncertainties

2.1 dynamic model of an ipmsm

applying kirchhoff’s voltage law (kvl) to the dq-axis equivalent circuits of a three-phase ipmsm yields the following voltage equations in the synchronously rotating d-q reference frame:

$$v_{qs} = R_s i_{qs} + L_{qs}\,\frac{di_{qs}}{dt} + \omega L_{ds} i_{ds} + \omega \lambda_m \qquad (1)$$

$$v_{ds} = R_s i_{ds} + L_{ds}\,\frac{di_{ds}}{dt} - \omega L_{qs} i_{qs} \qquad (2)$$

where $v_{ds}$ and $v_{qs}$ are the dq-axis voltages, $i_{ds}$ and $i_{qs}$ are the dq-axis currents, $R_s$ is the stator resistance, $L_{ds}$ and $L_{qs}$ are the dq-axis inductances, $\omega$ is the electrical rotor speed, and $\lambda_m$ is the magnetic flux.
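as a rough numerical illustration of (1) and (2), the sketch below evaluates the dq-axis voltages for given currents, current derivatives and electrical speed. the function name, the argument names and the operating point are assumptions made for illustration only; the motor constants are the values listed in table 1.

```python
# Sketch of the dq-axis voltage equations (1) and (2).
# Function/argument names and the operating point are illustrative assumptions.

R_S = 0.06          # stator resistance [ohm] (table 1)
L_DS = 0.068e-3     # d-axis inductance [H] (table 1)
L_QS = 0.086e-3     # q-axis inductance [H] (table 1)
LAMBDA_M = 0.0373   # magnet flux linkage [Wb] (table 1)


def dq_voltages(i_ds, i_qs, di_ds_dt, di_qs_dt, omega):
    """Return (v_ds, v_qs) in volts for electrical speed omega [rad/s]."""
    v_qs = R_S * i_qs + L_QS * di_qs_dt + omega * L_DS * i_ds + omega * LAMBDA_M
    v_ds = R_S * i_ds + L_DS * di_ds_dt - omega * L_QS * i_qs
    return v_ds, v_qs


if __name__ == "__main__":
    # Steady state (derivatives are zero), i_ds = 0 A, i_qs = 10 A, omega = 100 rad/s.
    v_ds, v_qs = dq_voltages(0.0, 10.0, 0.0, 0.0, 100.0)
    print(f"v_ds = {v_ds:.3f} V, v_qs = {v_qs:.3f} V")
```

with $i_{ds} = 0$ the d-axis voltage consists only of the cross-coupling term $-\omega L_{qs} i_{qs}$, which is why the two axes cannot be treated as independent circuits.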
in addition, the electromagnetic torque can be obtained from the following electrical and mechanical equations:

$$T_e = \frac{3}{2}\,\frac{p}{2}\left[\lambda_m i_{qs} + (L_{ds} - L_{qs})\, i_{ds} i_{qs}\right] \qquad (3)$$

$$T_e = T_L + \frac{2}{p} B \omega + \frac{2}{p} J \dot{\omega} \qquad (4)$$

experimental evaluation of torque ripple reduction in a sensorless foc of ipmsm drive... 53

where $T_e$ and $T_L$ are the electromagnetic and load torques, $p$ is the number of poles (so that $p/2$ in (3) and (4) is the number of pole pairs), $B$ is the viscous friction coefficient, and $J$ is the rotor inertia. substituting (3) into (4) yields the following speed dynamic equation:

$$\dot{\omega} = \frac{3}{2}\,\frac{p^2}{4}\,\frac{\lambda_m}{J}\, i_{qs} - \frac{B}{J}\,\omega - \frac{p}{2J}\, T_L + \frac{3}{2}\,\frac{p^2}{4}\,\frac{L_{ds} - L_{qs}}{J}\, i_{ds} i_{qs} \qquad (5)$$

2.2 the extraction of rotor position

the rotor position is extracted from the magnitudes of the αβ back emf components using the inverse tangent method. in this method the rotor position angle is determined as follows:

$$\hat{\theta} = -\tan^{-1}\!\left(\frac{\hat{e}_\alpha}{\hat{e}_\beta}\right) \qquad (6)$$

however, the position calculated by this method depends on the quality of the estimated back emf. because of the low sampling frequency, the estimated back emf will have both phase and magnitude shifts, which will bring oscillations and a phase shift to the estimated position. in order to mitigate the oscillation of the estimated position, an estimated speed feedback algorithm is used to improve the inverse tangent method for position calculation, as shown in fig. 2 and expressed by (7):

$$\hat{\theta}_2[k] = \hat{\theta}[k-1] + \omega[k-1]\, T_s \qquad (7)$$

the block diagram of the algorithm for improving the inverse tangent method for rotor position calculation is shown in fig. 2. a selection logic for the rotor position compares two candidate positions for the k-th time step: $\hat{\theta}_1[k]$, which is obtained from the smo, and $\hat{\theta}_2[k]$, which has been calculated at the end of the (k-1)-th time step. the error $\varepsilon[k]$ between $\hat{\theta}_1[k]$ and $\hat{\theta}_2[k]$ is calculated as their difference at the beginning of the k-th time step.
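the extraction (6), the prediction (7) and the selection between the two candidate angles can be sketched as follows. this is a minimal sketch under stated assumptions, not the authors' dsp code: the function names are hypothetical, the four-quadrant `atan2` form assumes back emf components of the form $e_\alpha = -\omega\lambda_m\sin\theta$, $e_\beta = \omega\lambda_m\cos\theta$, and the wrapping of angles and of the error to one revolution is an added assumption. the selection rule itself (take the smo angle when the error is below a predetermined margin, otherwise the predicted angle) is the one stated in the text.

```python
import math

TWO_PI = 2.0 * math.pi


def angle_from_back_emf(e_alpha, e_beta):
    """Inverse-tangent extraction (6) in four-quadrant form, wrapped to [0, 2*pi)."""
    return math.atan2(-e_alpha, e_beta) % TWO_PI


def predict_angle(theta_prev, omega_prev, t_s):
    """Prediction (7): previous angle advanced by previous speed times T_s."""
    return (theta_prev + omega_prev * t_s) % TWO_PI


def select_angle(theta_1, theta_2, margin):
    """Return theta_1 (from the SMO) if it lies within `margin` of theta_2,
    otherwise return the predicted angle theta_2."""
    # Wrapped difference in [-pi, pi]; the wrapping is an assumption of this sketch.
    error = math.atan2(math.sin(theta_1 - theta_2), math.cos(theta_1 - theta_2))
    return theta_1 if abs(error) < margin else theta_2


if __name__ == "__main__":
    t_s = 0.00015                       # sample time [s], as given on the figure axes
    theta_2 = predict_angle(1.0, 100.0, t_s)
    theta_1 = angle_from_back_emf(-math.sin(1.016), math.cos(1.016))
    print(select_angle(theta_1, theta_2, margin=0.05))
```

the margin trades noise rejection against tracking: a small margin rejects more outliers from the back emf estimate but leans more heavily on the speed estimate used in the prediction.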
if the generated error $\varepsilon[k]$ is smaller than the predetermined position error margin, then $\hat{\theta}[k] = \hat{\theta}_1[k]$; otherwise, $\hat{\theta}[k] = \hat{\theta}_2[k]$. in implementation, this method of extracting the rotor position has shown good speed control performance for the ipmsm, and the oscillations in the estimated rotor position are mitigated.

[figure: block diagram with the blocks atan2, compensation of phase, delay and selection of rotor position; signals $\hat{e}_\alpha[k]$, $\hat{e}_\beta[k]$, $\hat{\theta}_1[k]$, $\hat{\theta}_2[k]$, $\hat{\theta}[k]$, $\hat{\theta}[k-1]$, $\omega[k-1]$, $T_s$]
fig. 2 block diagram to improve the inverse tangent method for position calculation.

3. emi noises and effects

electromagnetic compatibility (emc) has grown in importance at a rapid pace because of the increasing frequency of use of digital electronics in today’s world and the virtually worldwide imposition of governmental limits on the radiated and conducted noise emissions of digital electronic products [15]. there are three ways to prevent interference: suppress the emission at its source, make the coupling path as inefficient as possible, or make the receptor less susceptible to the emission. although all three alternatives should be kept in mind, the “line of defense” in this work is to make the receptor less susceptible to the emission. the paper shows the effect of replacing a device of the control system (the absolute encoder) with a soft block (sliding mode observer) in the control scheme, in order to reduce the disturbances caused by emi. the experimental results confirm the effectiveness of sensorless field oriented control of the ipmsm by sliding mode observer in decreasing the sensitivity of the control system to emi noises.

3.1. emi noise transmission path

each type of interference problem includes a source, a receptor, and a transmission path between the source and the victim, or receptor, that suffers from the emi noise. conducted emi is defined as interference that uses conductors as a path from a source to a receptor.
for example, a motor encoder grounded to a noisy connection would conduct noise to the drive encoder interface. the conducted noise could cause the drive encoder interface to receive inexact voltage signals precluding the motor drive from reading the rotor position and speed correctly thus causing drive faults. at the beginning, it may be supposed that the root cause for the drive operational malfunctions are related to incorrect parameter setting or possible a faulty drive interface board. closer inspection reveals the culprit to be poor grounding of the encoder cable. radiated emi is defined as interference that uses a wireless path from a source to the receptor. this is commonly seen in motor control panels with ac motor wires are laid in parallel next to low-voltage control wiring. the result is coupling between the wires causing disturbances on the data transmission line. for example, if the motor wires were laid in close proximity to a serial link between the motor controller and the drive, the coupling of the signals may corrupt the data packets being transferred between the controller and drive. 3.2. emi noise sources the motor drive system (mds) in industrial applications has become a new noise source because its switching frequency, operation voltage and current variations have been increased, causing unwanted effects such as common-mode (cm) noise and electromagnetic interference (emi) [16]. hence, the analysis of the noise propagation paths is necessary for understanding and improving the system reliability. noise propagation paths are mainly composed of an inverter, a three phase cable, a ball-bearing, an electric motor, and multiple ground nodes. especially the electric motor is an electric active load of inverter and a mechanical power source of vehicle as well. therefore, unwanted current flows to whole vehicle body through the electric motor by capacitive coupling in both the electric components and the mechanical parts. 
emc of electronic circuits is to a great extent determined by the way the components are laid out and interconnected. signal lines with their corresponding return line form an antenna, which is able to radiate electromagnetic energy, where the magnitude is determined by current amplitude, frequency and the geometrical area of the current loops. there are three typical sources of emi: power supply lines, signal lines carrying high frequency, and oscillator circuits.

an important source of electromagnetic interference noise is crosstalk. this essentially refers to the unintended electromagnetic coupling between wires and pcb lands that are in close proximity. crosstalk is distinguished from antenna coupling in that it is a near-field coupling problem. crosstalk between wires in cables or between lands on pcbs concerns the intrasystem interference performance of the product; that is, the source of the electromagnetic emission and the receptor of this emission are within the same system. thus this reflects the third concern in emc: the design of the product such that it does not interfere with itself. with clock speeds and data transfer rates in digital control systems steadily increasing, crosstalk between lands on pcbs is becoming a significant mechanism for interference in modern digital systems.

3.3. receptors of emi

in a real digital control system, there are several devices sensitive to emi, like encoders, tachometers, analog signals and measurement devices, communication networks and devices, microprocessor devices etc. each of them demonstrates specific symptoms when affected by emi noises. for encoders, symptoms may include encoder counts jumping around when the shaft is still and non-repeatable positioning when moving. for tachometers, symptoms may include incorrect speed reporting or unexpected speed fluctuations.
for analog signals and measurement devices, symptoms may include unexpected voltage spikes, ripple, or jitter on the analog signal, causing incorrect and non-repeatable readings. for communication networks and devices, symptoms almost always include loss of communication or errors in reading or writing data. for microprocessor devices, symptoms can include loss of communications, faults or failure in the processor, digital inputs or outputs triggering unexpectedly, and analog inputs or outputs reporting incorrect values. all of the devices listed above are important and irreplaceable, except the encoder. in the sensorless control system that we have developed, eliminating one of the receivers most sensitive to emi noise significantly reduces the negative effects, such as ripple in the analog signals: torque, speed and current. in this paper, the effect of replacing the encoder with an observer of sliding mode type is investigated.

4. ipmsm sensorless control

4.1. control unit

the control system for sensorless foc of the ipmsm with sliding mode observer, developed in this study, is composed of three main blocks: control unit, power module and measurement unit. the control unit is based upon a piccolo f28069 controlstick dsp by ti [17]. it consists of an adc converter, pwm channels and a floating point central processing unit. the stator windings of the ipmsm are supplied from a conventional 3-phase power module made up of 6 mosfets, operated as switches for break control.

4.2. measurement unit

the measurement unit is a determinative part of the closed loop control system, and the encoder has a crucial role since the performance of foc depends directly on accurate rotor position information. in this study, only the effect of the encoder on emi noises is considered. the idea has been realized through a soft block added to the control unit block.
instead of the absolute encoder, a sliding mode observer is designed to calculate the rotor position and rotor speed that are needed for the foc algorithm. the results for important quantities of the control system are then compared.

4.3. modular philosophy of digital motor control

although a standardized platform, the modular ti piccolo f28069 controlstick dsp provides a smooth way for customers to quickly port the reference software to customized hardware. ti’s modular philosophy, which clearly separates modules into cpu and peripheral-dependent (drivers) categories, greatly simplifies the porting process. the ipmsm speed controller and the speed calculator from position information is the appropriate partitioning point in this system due to its complexity and reusability. this modular philosophy of ti’s platforms has encouraged and allowed us to develop and modify the standard dmc system to a sensorless one. figure 1 shows an overall block diagram of the proposed observer-based nonlinear sliding mode speed control system.

[figure: block diagram of the drive: ac/dc converter, dc bus, 3-phase inverter, gate drivers, power supplies, analog conditioning and motor; piccolo f28069 with 12-bit adc, epwm module, eqep, serial interface and gpio; software blocks for phase current reconstruction, bus over-voltage protection, speed calculator, pi regulators, foc and svpwm, together with the added blocks observer smo and speed estimator]
fig. 1 control scheme of sensorless smo ipmsm drive.

the blocks “speed estimator” and “observer smo” are added to the existing control scheme in order to calculate the rotor position and rotor speed from the stator voltages and currents, digitized and transformed by the clarke transformation (to uα, uβ, iα and iβ). this removes the need for the encoder, which otherwise supplies the essential rotor position information.
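the clarke transformation mentioned above, followed by the park rotation that foc applies next, can be sketched as below. this is a generic textbook form written for illustration, not code taken from the ti library: the function names are hypothetical, the amplitude-invariant scaling is an assumption, and the motor is assumed to be a balanced three-wire load so that two measured phase currents are sufficient.

```python
import math


def clarke(i_a, i_b):
    """Amplitude-invariant Clarke transform for a balanced three-wire motor
    (i_a + i_b + i_c = 0), so only two phase currents are needed."""
    i_alpha = i_a
    i_beta = (i_a + 2.0 * i_b) / math.sqrt(3.0)
    return i_alpha, i_beta


def park(i_alpha, i_beta, theta):
    """Rotate the stationary-frame vector into the dq frame at electrical angle theta."""
    i_d = i_alpha * math.cos(theta) + i_beta * math.sin(theta)
    i_q = -i_alpha * math.sin(theta) + i_beta * math.cos(theta)
    return i_d, i_q


if __name__ == "__main__":
    theta = 0.7                                        # electrical angle [rad], arbitrary
    i_a = 2.0 * math.cos(theta)                        # phase a current [A]
    i_b = 2.0 * math.cos(theta - 2.0 * math.pi / 3.0)  # phase b current [A]
    i_alpha, i_beta = clarke(i_a, i_b)
    print(park(i_alpha, i_beta, theta))                # current vector aligned with the d axis
```

in the sensorless scheme the angle passed to the park rotation comes from the observer instead of the encoder, so any error in the estimated angle appears directly as an error in the computed dq currents.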
by means of a soft key, the control scheme provides either sensored or sensorless operation, so that experimental results for quantities like torque, speed and currents can be compared and analyzed.

5. experimental setup

in order to verify the performance and the effectiveness in emi noise reduction of the proposed observer-based non-linear sliding mode controller, experiments are carried out with a prototype ipmsm drive system based on a piccolo f28069 controlstick dsp. figure 2 shows the experimental setup of the sensorless smo ipmsm drive. the hardware consists of an ipmsm, a product of the slovenian company mahle-letrika dedicated to electric power steering systems, a three-phase inverter with 6 mosfets (irfp4410), a control board with a f28069 controlstick dsp (floating-point), an absolute optical encoder (hengstler ad35, 22 bit), two hall-effect current sensors (lts15np), and a pmsm motor as load in a back-to-back configuration. table 1 shows the parameters of the ipmsm used in the experiment. the dc-link voltage (295 vdc) is obtained from the utility (ac 230 v/50 hz) using a single-phase full-bridge rectifier. the two phase currents (ia, ib) are measured by the lts15np hall sensors and then converted into digital form using two 12-bit a/d converters. in addition, the rotor position (θ), which is used to execute the coordinate transformation for foc, is measured by the absolute encoder and fed to the texas instruments piccolo f28069 controlstick dsp via a 32-bit qep. note that the rotor speed (ω) required to perform the feedback control can be easily obtained by differentiating θ with respect to time.

table 1 ipmsm parameters.

parameter            symbol   unit    value
rated power          Pn       W       600
rated speed          ωn       rpm     1250
stator resistance    Rs       Ω       0.06
d-axis inductance    Ld       mH      0.068
q-axis inductance    Lq       mH      0.086
total linkage flux   λpm      Wb      0.0373
pole pairs           p        -       3
inertia              J        kg·m²   0.0001682

fig. 2 experimental setup of sensorless smo of ipmsm drive.

6. experimental results and discussions

a variety of experiments have been performed. results for sensor and sensorless mode are compared in order to evaluate the reduction of emi noises obtained by replacing the absolute encoder with the sliding mode observer. since the emi noises are due to many complex and coupled factors, the effect of removing only the position sensor is checked in all electrical and mechanical quantities that are important for the quality of control, like torque, speed, and currents. the nonlinear control used in our experiments, sliding mode control, suffers from chattering when implemented in real-time control of the ipmsm drive. the chattering is superimposed on the speed and torque ripples, which worsens the situation. however, the encoder is the hardware part of the drive most susceptible to emi, and replacing that hardware with software, the smo, reduces the possibility of emi affecting the drive operation. experimentally, the torque ripples are reduced by up to 50%. furthermore, the existence of an unobservable zone at very low motor speeds is a weak point of operation for the ipmsm drive. the results for speed response in steady state are therefore taken at two different regimes: rated speed and low speed, 15 rev/s and 3.5 rev/s, respectively. the figures are presented in an appropriate scale to compare the amplitudes of ripples for both sensor and sensorless operation (the reference signal is shown for speed). experimental results show that sensorless control exhibits fewer ripples in electromagnetic torque.

[figure: torque comparison [nm] versus time [samples]; traces: torque, sensored; torque, sensorless]
fig. 3 experimental results for comparison of electromagnetic torque with sensor and sensorless control.
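one simple way to quantify a ripple comparison of this kind is sketched below. the paper does not state how the ripple amplitude was measured, so the peak-to-peak metric, the function names and the sample values are all assumptions; the numbers are synthetic and are not the measured data.

```python
def peak_to_peak(samples):
    """Peak-to-peak amplitude of a steady-state record."""
    return max(samples) - min(samples)


def ripple_reduction_percent(sensored, sensorless):
    """Reduction of the peak-to-peak ripple relative to the sensored record."""
    reference = peak_to_peak(sensored)
    return 100.0 * (reference - peak_to_peak(sensorless)) / reference


if __name__ == "__main__":
    # Synthetic steady-state torque samples [Nm], for illustration only.
    torque_sensored = [-0.60, -0.52, -0.64, -0.50, -0.58]
    torque_sensorless = [-0.58, -0.55, -0.60, -0.53, -0.57]
    print(ripple_reduction_percent(torque_sensored, torque_sensorless))
```

a peak-to-peak figure is sensitive to single outliers; an rms value of the deviation from the mean would be a more robust alternative for records containing isolated spikes.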
from figure 3 it is clear that the electromagnetic torque during sensorless operation is more stable and has fewer ripples. the ripple amplitude of the torque during sensorless operation is reduced by up to 50% relative to the ripple amplitude during sensor operation. this is not an isolated or accidental result: replacing the encoder with the smo soft block directly addresses one of the receivers of emi noise, namely the position sensor. in general, the “first line of defense” is to suppress the emission as much as possible at the source, but it is a valid strategy to make the control system “deaf” to a part of the emi noises.

[figure: speed comparison versus time [samples]; traces: speed calculated from encoder, speed estimated from smo, speed reference]
fig. 4 experimental results for comparison of speed response in steady state with sensor and sensorless control during rated speed regime.

figure 4 shows the experimental results for speed response at steady state for both sensor and sensorless operation near rated speed. the results show very clearly that sensorless operation provides smooth speed control, almost equal to the reference speed. compared with the speed response of sensor operation, the accuracy of speed estimation during sensorless operation is very high and the speed error is approximately zero. so, from the emi noise point of view, the rotor speed shows a great benefit of using a sensorless scheme for vector control of the ipmsm. another important quantity for the field oriented control algorithm is the direct current id. in order to verify the validity of our strategy, we have to check the effect on other important quantities in field oriented control, like rotor speed and direct current id.
the currents id and iq are variables calculated by the clarke and park vector transformations of the real currents flowing into the stator of the ipmsm, sensed with hall effect sensors and digitized with the adc converter. the direct current id is a flux-producing component that, during execution of the foc algorithm, is forced to zero in order to achieve the maximum torque production for a given stator current. because of this importance, the results for current id during sensor and sensorless operation are put together in figure 5, where it is clearly shown that the amplitude of ripples of current id is reduced by 50% during sensorless operation.

[figure: id comparison [a] versus time [samples]; traces: id-sensored, id-sensorless]
fig. 5 experimental results showing comparison between direct current id in sensored and sensorless control.

sensorless drive systems based on state observers suffer from unobservability in the area of very low speed.

[figure: rotor speed comparison [rev/s] versus time [samples]; traces: reference speed, speed calculated from encoder, speed estimated from smo]
fig. 6 experimental results for comparison of speed response in steady state during very low speed regime with sensor and sensorless control.

since this is the lower boundary of functionality of the observer, where all quantities tend to be uncontrollable, it is useful to compare the speed responses for both sensor and sensorless operation. figure 6 shows a better behavior with the observer at very low speed than with the encoder. the speed fluctuations during sensorless operation are less than 50% of those observed in sensor operation. the transient response is quite important when analyzing the behavior of a device. figure 7 shows the transient response for torque during sensor and sensorless operation.
it is clearly shown that sensorless operation has a smaller overshoot and needs less time (about half) to stabilize.

fig. 7 experimental results for comparison of transient response for torque during sensor and sensorless operation (torque [nm] versus time in samples; traces: with sensor, sensorless).

figure 8 shows the transient responses of the speed during sensor and sensorless operation. sensorless operation has a smaller overshoot (approximately 7%) and needs almost the same time to stabilize.

fig. 8 experimental results for comparison of transient response for rotor speed during sensor and sensorless operation (rotor speed versus time in samples, 1 sample = 0.00015 s; traces: calculated speed from encoder, estimated speed from smo, reference speed).

figure 9 shows the results of angle estimation by the sliding mode observer during sensorless operation. the accuracy of the angle estimation is crucial for control performance, because the angle estimated by the smo block is used to calculate the rotor speed and to carry out the clarke and park vector transformations that generate the correct voltage value in the svpwm block.

fig. 9 experimental results for rotor position estimation by smo and encoder (rotor position [rad] versus time in samples, 1 sample = 0.00015 s; traces: angle from encoder, angle estimated with smo, angle estimation error).
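the role of the estimated angle in the clarke and park transformations mentioned above can be sketched as follows. this is a minimal illustration, not the authors' implementation: it assumes the amplitude-invariant form of the clarke transform and a balanced three-phase current set, and the names `clarke`, `park`, `phase_currents` and `theta_true` are hypothetical.

```python
import math

def clarke(i_a, i_b, i_c):
    """Amplitude-invariant Clarke transform: (a, b, c) -> stationary (alpha, beta)."""
    i_alpha = (2.0 * i_a - i_b - i_c) / 3.0
    i_beta = (i_b - i_c) / math.sqrt(3.0)
    return i_alpha, i_beta

def park(i_alpha, i_beta, theta):
    """Park transform: stationary (alpha, beta) -> rotating (d, q), theta in rad."""
    i_d = i_alpha * math.cos(theta) + i_beta * math.sin(theta)
    i_q = -i_alpha * math.sin(theta) + i_beta * math.cos(theta)
    return i_d, i_q

def phase_currents(amplitude, theta):
    """Hypothetical test signal: balanced currents whose vector lies on the q axis."""
    angle = theta + math.pi / 2.0  # current vector leads the d axis by 90 degrees
    return (amplitude * math.cos(angle),
            amplitude * math.cos(angle - 2.0 * math.pi / 3.0),
            amplitude * math.cos(angle + 2.0 * math.pi / 3.0))

theta_true = 1.0                      # true electrical rotor angle [rad]
i_abc = phase_currents(2.0, theta_true)

# transformation with the exact angle: i_d vanishes, i_q equals the amplitude
i_d, i_q = park(*clarke(*i_abc), theta_true)

# transformation with an angle underestimated by 0.1 rad: current leaks into i_d
i_d_err, i_q_err = park(*clarke(*i_abc), theta_true - 0.1)
```

with the exact angle the sketch gives id close to zero and iq equal to the current amplitude; with an angle error part of the current appears in id, which is why the accuracy of the angle estimate matters for the control.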
as a summary of the experimental results, one may conclude that eliminating one of the receptors of emi noise (the encoder) makes the control system less susceptible to it. this strategy, although ranked third, is very useful in achieving a better overall performance of the control system. a careful look at all results confirms that the sensorless mode of operation has many benefits. this kind of nonlinear control (sliding mode) suffers from the chattering phenomenon when implemented in real-time control of ac drives, ipmsm drives included. the chattering is superimposed on the speed and torque ripple, which worsens the situation. however, the encoder is the hardware part of the drive most susceptible to emi, and replacing that hardware with software, the smo block, reduces the possibility that emi affects the drive operation. experimentally, the torque ripple is reduced in total by up to 50%.

5. conclusions

the paper presents the main problems of practical implementation of sensorless foc of an ipmsm. electromagnetic disturbances have a pernicious influence on the calculations performed in the control unit as well as on the operation of the absolute encoder. eliminating one of the receivers of emi noise, by replacing it with a sliding mode observer, provides a noticeable improvement in the speed and torque response of the control system. this improvement is reflected in a reduced effect of emi, higher efficiency, less vibration, and better overall performance. different experiments were performed on sensor and sensorless foc, and the measurements of currents, torque and rotor speed are compared.

acknowledgement: this paper presents a part of the work supported by the research program of erasmus mundus/basileus iv (2013-2014), in the laboratories of the department of mechatronics, faculty of electrical engineering, university of ljubljana, slovenia.

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 447-460 https://doi.org/10.2298/fuee1803447s

channel capacity of the macrodiversity sc system in the presence of kappa-mu fading and correlated slow gamma fading

marko m. smilić 1, branimir s. jakšić 2, dejan n. milić 3, stefan r. panić 1, petar ć. spalević 2
1 university of priština, faculty of natural sciences and mathematics, kosovska mitrovica, serbia
2 university of priština, faculty of technical sciences, kosovska mitrovica, serbia
3 university of niš, faculty of electronic engineering, niš, serbia

abstract.
in this paper, a macrodiversity system consisting of two microdiversity sc (selection combining) receivers and one macrodiversity sc receiver is analyzed. independent κ-μ fading and correlated slow gamma fading are present at the inputs to the microdiversity sc receivers. for this system model, analytical expressions for the probability density of the signal at the output of the macrodiversity sc receiver and for the channel capacity at the output of the macrodiversity sc receiver are derived. the obtained results are presented graphically to show the impact of the rician κ factor, the shadowing severity of the channel c, the number of clusters μ and the correlation coefficient ρ on the probability density of the signal and on the channel capacity at the output of the macrodiversity system. based on the obtained results it is possible to analyze the real behavior of the macrodiversity system in the presence of κ-μ fading.

key words: joint probability density, channel capacity, macrodiversity sc receiver, correlation coefficient, rician κ factor.

1. introduction

radio signals generally propagate according to three mechanisms: reflection, diffraction, and scattering. as a result of these three mechanisms, radio propagation can be roughly characterized by three nearly independent phenomena: path loss variation with distance, slow log-normal shadowing, and fast multipath fading. each of these phenomena is caused by a different underlying physical principle and each must be accounted for when designing and evaluating the performance of a cellular system [1].

received october 23, 2017; received in revised form february 16, 2018
corresponding author: marko m. smilić, faculty of natural sciences and mathematics, university of pristina, lole ribara br. 29, 38220 kosovska mitrovica, serbia (e-mail: marko.smilic@pr.ac.rs)

fast fading is caused by the propagation of the signal in multiple directions.
the interaction of the waves with objects located between the transmitter and the receiver (reflection, diffraction and scattering) causes a large number of copies of the transmitted signal to arrive at the input of the receiver. the environment through which the wave propagates can be linear or nonlinear. when the reflected waves are correlated with factor ρ, the environment is nonlinear [1], [2]. slow fading occurs due to the shadow effect, which is created by various objects between the transmitter and the receiver [3]. in most cases, the slow fading is correlated. the signal envelope varies due to the fast fading, and the power of the signal envelope varies due to the slow fading [4], [5].

the statistical behavior of signals in such systems can be described by different distributions: rayleigh, rician, nakagami-m, weibull or κ-μ [2], [6], [7]. the κ-μ distribution can be used to describe the variation of the signal envelope in linear environments with a dominant component, where there are several clusters in the propagation environment and the in-phase and quadrature components have equal power. the κ-μ distribution has two parameters. the parameter κ is the rician factor, equal to the ratio of the power of the dominant component to the power of the scattered component [8]. the parameter μ is related to the number of clusters in the propagation environment. the κ-μ distribution is a general distribution, from which other distributions can be obtained as special cases [9], [10].

a variety of diversity techniques are used to reduce the impact of fast and slow fading on system performance. in diversity techniques, several replicas of the same information signal are combined. the most commonly used combining techniques are mrc (maximal ratio combining), egc (equal gain combining) and sc (selection combining) [1], [9]. the sc diversity receiver is easy to realize in practice because processing is done on only one diversity branch.
the sc receiver selects the branch with the highest signal-to-noise ratio for further processing of the signal [11]. if the noise power is the same in all branches, the sc receiver selects the branch with the strongest signal [6]. the performance of the sc receiver is worse than that of mrc and egc receivers. for the sc receiver, however, it is relatively easy to determine the probability density and the cumulative distribution of the signal at the receiver output.

spatial diversity techniques are the most commonly used. they are realized with multiple antennas placed at the receiver. spatial diversity increases the reliability of the system and the channel capacity without increasing the transmitter power or expanding the frequency range. several spatial diversity combining techniques can be used to reduce the influence of fading and co-channel interference on system performance.

with regard to analytical methods, the channel capacity is expressed using the well-known meijer g-function [12]. it is also shown how changes of particular parameters influence the channel capacity. there are two types of channel capacity: the shannon capacity and the capacity with outage. the shannon capacity is the maximum data rate that can be sent over the radio channel with asymptotically small error probability; it is also called the ergodic capacity. the capacity with outage is the maximum data rate that can be transmitted over a channel with some outage probability, that is, the percentage of data that cannot be received correctly due to deep fading [13], [14], [15], [16], [17].

in many papers [18], [19], [20], [21], statistical characteristics of the signal for macrodiversity systems are presented. for the system considered here, results have not been presented yet.
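the two capacity notions distinguished above can be illustrated numerically. the sketch below is only an illustration under simplifying assumptions that are not part of this paper's system model: a single rayleigh fading branch, for which the instantaneous snr is exponentially distributed with mean `mean_snr`. the function names, the integration limits and the outage probability of 10% are hypothetical choices.

```python
import math

def ergodic_capacity_rayleigh(mean_snr, upper=60.0, steps=200_000):
    """E[log2(1 + snr)] for exponentially distributed snr (trapezoidal rule)."""
    # integrand: log2(1 + g) * pdf(g), with pdf(g) = exp(-g / mean_snr) / mean_snr
    h = upper * mean_snr / steps
    total = 0.0
    for n in range(steps + 1):
        g = n * h
        f = math.log2(1.0 + g) * math.exp(-g / mean_snr) / mean_snr
        total += f if 0 < n < steps else 0.5 * f  # half weight at both end points
    return total * h

def outage_capacity_rayleigh(mean_snr, p_out):
    """Capacity with outage: rate that fails only when snr drops below its p_out quantile."""
    snr_min = -mean_snr * math.log(1.0 - p_out)  # P(snr < snr_min) = p_out
    return (1.0 - p_out) * math.log2(1.0 + snr_min)

c_erg = ergodic_capacity_rayleigh(1.0)      # per unit bandwidth, bit/s/Hz
c_out = outage_capacity_rayleigh(1.0, 0.1)  # per unit bandwidth, bit/s/Hz
```

for a mean snr of 1 the ergodic value is about 0.86 bit/s/hz, while the rate that can be guaranteed with 10% outage is considerably lower, which shows how strongly deep fades limit the capacity with outage.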
based on the results obtained in this paper, it is possible to optimize the parameters of the wireless system and the transmitted signal power. using these results, it is possible to predict the behavior of various system implementations for various mobile transmission scenarios and propagation environments, which enables mobile system designers to arrive at rational system solutions for the desired system performance.

2. system model

in this paper we discuss a macrodiversity system with one macrodiversity sc (selection combining) receiver and two microdiversity sc receivers. independent κ-μ fading and slow gamma fading are present at the inputs of the microdiversity sc receivers. the slow fading is correlated, and the correlation coefficient decreases as the distance between the antennas increases. the microdiversity sc receivers reduce the impact of fast fading on system performance, while the macrodiversity sc receiver reduces the impact of slow fading. the macrodiversity system discussed here can be used in a single cell of a cellular mobile radio system. microdiversity receivers are installed at the base stations serving mobile users in a single cell. the macrodiversity system uses signals from multiple base stations positioned in a single cell or in two or more cells. the system under discussion is shown in figure 1.

fig. 1 macrodiversity system with one macrodiversity sc receiver and two microdiversity sc receivers.

the signals at the inputs of the first microdiversity sc receiver are denoted by x11 and x12, and the signal at its output by x1. the signals at the inputs of the second microdiversity sc receiver are denoted by x21 and x22, and the signal at its output by x2. the signal at the output of the macrodiversity system is denoted by x. the average powers of the signals at the inputs of the microdiversity receivers are denoted by Ω1 and Ω2.
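the two-level selection described in this section can be sketched in a few lines. this is a minimal illustration under the selection rule used in this paper (the macrodiversity receiver takes the output of the microdiversity receiver with the larger input signal power); the function names and the numerical values are hypothetical.

```python
def micro_sc(envelopes):
    """Microdiversity selection combining: keep the strongest branch envelope."""
    return max(envelopes)

def macro_sc(receivers):
    """Macrodiversity selection combining.

    receivers: list of (input_power, branch_envelopes), one entry per
    microdiversity receiver. The receiver with the larger input power is
    selected and its micro-SC output is returned.
    """
    _, envelopes = max(receivers, key=lambda r: r[0])
    return micro_sc(envelopes)

# hypothetical snapshot: (power, (x_i1, x_i2)) for the two microdiversity receivers
receivers = [
    (1.2, (0.8, 1.1)),  # first receiver: larger power
    (0.7, (1.5, 0.4)),  # second receiver: larger instantaneous envelope
]
x = macro_sc(receivers)
```

in this snapshot the output is taken from the first receiver because its power is larger, even though the second receiver momentarily has the larger envelope; the macro-level decision follows the slow fading, while the micro-level decision follows the fast fading.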
the signal x at the output of the macrodiversity sc receiver is equal to the signal at the output of the microdiversity sc receiver whose input signal power is greater than the input signal power of the other microdiversity sc receiver [2].

3. the probability density of the signal

the probability density of the κ-μ signals x1i at the inputs of the first microdiversity sc receiver is given by [22]:

$$p_{x_{1i}}(x_{1i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}}\, x_{1i}^{\mu}\, e^{-\frac{\mu(\kappa+1)}{\Omega_1}x_{1i}^{2}}\, I_{\mu-1}\!\left(2\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\,x_{1i}\right), \quad i=1,2 \tag{1}$$

the probability density of the κ-μ signals x2i at the inputs of the second microdiversity sc receiver is given by [22]:

$$p_{x_{2i}}(x_{2i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_2^{\frac{\mu+1}{2}}}\, x_{2i}^{\mu}\, e^{-\frac{\mu(\kappa+1)}{\Omega_2}x_{2i}^{2}}\, I_{\mu-1}\!\left(2\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\,x_{2i}\right), \quad i=1,2 \tag{2}$$

the parameter μ represents the number of clusters through which the signal propagates, κ is the rician factor, Ω1 and Ω2 are the average powers of the signals at the first and second microdiversity sc receiver, respectively, and $I_n(\cdot)$ is the modified bessel function of the first kind and order n [23]. after using the series expansion of the bessel function, the probability density of x1i becomes

$$p_{x_{1i}}(x_{1i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_1+\mu-1}\frac{1}{i_1!\,\Gamma(i_1+\mu)}\, x_{1i}^{2i_1+2\mu-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega_1}x_{1i}^{2}} \tag{3}$$

where $\Gamma(\cdot)$ denotes the gamma function, x1i are the signal envelopes at the inputs of the first microdiversity sc receiver and x1 is the signal envelope at its output [23]. in a similar way we get the probability density of the signal at the inputs of the second microdiversity sc receiver:

$$p_{x_{2i}}(x_{2i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_2^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\right)^{2i_1+\mu-1}\frac{1}{i_1!\,\Gamma(i_1+\mu)}\, x_{2i}^{2i_1+2\mu-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega_2}x_{2i}^{2}} \tag{4}$$

the cumulative probability of x1i, i = 1, 2, is

$$F_{x_{1i}}(x_{1i}) = \int_0^{x_{1i}} p_{x_{1i}}(t)\,dt = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}} \sum_{i_2=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_2+\mu-1}\frac{1}{i_2!\,\Gamma(i_2+\mu)} \int_0^{x_{1i}} t^{2i_2+2\mu-1} e^{-\frac{\mu(\kappa+1)}{\Omega_1}t^{2}}\,dt \tag{5}$$

after solving the integral by the use of [23], we have the expression for the cumulative probability of the signal at the inputs of the first microdiversity sc receiver:

$$F_{x_{1i}}(x_{1i}) = \frac{\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}} \sum_{i_2=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_2+\mu-1}\frac{1}{i_2!\,\Gamma(i_2+\mu)} \left(\frac{\Omega_1}{\mu(\kappa+1)}\right)^{i_2+\mu} \gamma\!\left(i_2+\mu,\ \frac{\mu(\kappa+1)}{\Omega_1}x_{1i}^{2}\right) \tag{6}$$

where $\gamma(\cdot,\cdot)$ represents the lower incomplete gamma function [23]. by applying the same procedure, the cumulative probability of the signal x2i at the inputs of the second microdiversity sc receiver can also be obtained. the probability density of the signal at the output of the first microdiversity sc receiver is

$$p_{x_1}(x_1) = p_{x_{11}}(x_1)F_{x_{12}}(x_1) + p_{x_{12}}(x_1)F_{x_{11}}(x_1) = 2\,p_{x_{11}}(x_1)F_{x_{12}}(x_1) \tag{7}$$

where $p_{x_{1i}}$ is given by (3), while $F_{x_{1i}}$ is given by (6). after the substitution of (3) and (6) into (7) we have

$$p_{x_1}(x_1) = \frac{4\mu^{2}(\kappa+1)^{\mu+1}}{\kappa^{\mu-1} e^{2\kappa\mu}\,\Omega_1^{\mu+1}} \sum_{i_1=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_1+\mu-1}\frac{1}{i_1!\,\Gamma(i_1+\mu)}\, x_1^{2i_1+2\mu-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega_1}x_1^{2}} \sum_{i_2=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_2+\mu-1}\frac{1}{i_2!\,\Gamma(i_2+\mu)} \left(\frac{\Omega_1}{\mu(\kappa+1)}\right)^{i_2+\mu} \gamma\!\left(i_2+\mu,\ \frac{\mu(\kappa+1)}{\Omega_1}x_1^{2}\right) \tag{8}$$

the probability density of the signal at the output of the second microdiversity sc receiver is

$$p_{x_2}(x_2) = \frac{4\mu^{2}(\kappa+1)^{\mu+1}}{\kappa^{\mu-1} e^{2\kappa\mu}\,\Omega_2^{\mu+1}} \sum_{i_1=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\right)^{2i_1+\mu-1}\frac{1}{i_1!\,\Gamma(i_1+\mu)}\, x_2^{2i_1+2\mu-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega_2}x_2^{2}} \sum_{i_2=0}^{\infty}\left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\right)^{2i_2+\mu-1}\frac{1}{i_2!\,\Gamma(i_2+\mu)} \left(\frac{\Omega_2}{\mu(\kappa+1)}\right)^{i_2+\mu} \gamma\!\left(i_2+\mu,\ \frac{\mu(\kappa+1)}{\Omega_2}x_2^{2}\right) \tag{9}$$

the probability density of the signal at the output of the macrodiversity sc receiver is equal to the probability density of the signal at the output of the microdiversity sc receiver whose input signal power is greater than the input signal power of the other microdiversity sc receiver [23]. based on this, the probability density of the signal at the output of the macrodiversity sc receiver is equal to

$$p_x(x) = \int_0^{\infty} d\Omega_1 \int_0^{\Omega_1} d\Omega_2\; p_{x_1}(x/\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty} d\Omega_2 \int_0^{\Omega_2} d\Omega_1\; p_{x_2}(x/\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = I_1 + I_2 \tag{10}$$

where $p_{x_1}(x/\Omega_1)$ and $p_{x_2}(x/\Omega_2)$ are given by (8) and (9), respectively. the joint probability density of the powers Ω1 and Ω2 is given by [6]:

$$p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = \frac{1}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}} \sum_{i_3=0}^{\infty} \left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_3+c-1} \frac{1}{i_3!\,\Gamma(i_3+c)}\, \Omega_1^{i_3+c-1}\, \Omega_2^{i_3+c-1}\, e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho^{2})}} \tag{11}$$

where c is the shadowing severity, ρ is the correlation coefficient and Ω0 is the average power. the integral I1 is equal to

$$\begin{aligned} I_1 &= \int_0^{\infty} d\Omega_1 \int_0^{\Omega_1} d\Omega_2\; p_{x_1}(x/\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \\ &= \frac{4\mu^{2}(\kappa+1)^{\mu+1}}{\kappa^{\mu-1} e^{2\kappa\mu}} \sum_{i_1=0}^{\infty} \frac{\left(\mu\sqrt{\kappa(\kappa+1)}\right)^{2i_1+\mu-1}}{i_1!\,\Gamma(i_1+\mu)}\, x^{2i_1+2\mu-1} \sum_{i_2=0}^{\infty} \frac{\left(\mu\sqrt{\kappa(\kappa+1)}\right)^{2i_2+\mu-1}}{i_2!\,\Gamma(i_2+\mu)\left(\mu(\kappa+1)\right)^{i_2+\mu}} \\ &\quad\times \frac{1}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}} \sum_{i_3=0}^{\infty} \left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_3+c-1} \frac{1}{i_3!\,\Gamma(i_3+c)} \\ &\quad\times \int_0^{\infty} d\Omega_1\, \Omega_1^{\,i_3+c-i_1-\mu-1}\, e^{-\frac{\mu(\kappa+1)x^{2}}{\Omega_1}-\frac{\Omega_1}{\Omega_0(1-\rho^{2})}}\, \gamma\!\left(i_2+\mu,\ \frac{\mu(\kappa+1)x^{2}}{\Omega_1}\right) \int_0^{\Omega_1} d\Omega_2\, \Omega_2^{\,i_3+c-1}\, e^{-\frac{\Omega_2}{\Omega_0(1-\rho^{2})}} \end{aligned} \tag{12}$$

by the use of the method for calculating the integral I1 (appendix a, expressions a1, a2 and a3), the integral I2 is also solved:

$$I_2 = \int_0^{\infty} d\Omega_2 \int_0^{\Omega_2} d\Omega_1\; p_{x_2}(x/\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \tag{13}$$

4. channel capacity of macrodiversity

having obtained the expression for the probability density of the output signal, we can calculate the channel capacity at the output of the macrodiversity system shown in figure 1. the maximum data rate can be achieved after the channel has experienced all possible fading states during a sufficiently long transmission time. it is expressed in bits per second as follows, where B denotes the channel bandwidth in hz [24], [25]:

$$C = B \int_0^{\infty} \log_2(1+x)\, p_x(x)\, dx \tag{14}$$

substituting the expression (10) for the probability density into the expression (14) for the channel capacity, with I1 and I2 taken in their closed series form from appendix a, we get:

$$\frac{C}{B} = \frac{1}{\ln 2} \int_0^{\infty} \ln(1+x)\left[I_1 + I_2\right] dx \tag{15}$$

in each term of the resulting series, the integration variable x appears only in ln(1+x), in a power of x and in a modified bessel function of the second kind. to simplify the solution of the integral in (15), we first move in front of the integral all the constants, i.e. everything that does not depend on the integration variable.
secondly, we can show the logarithmic function through meijer g-function (appendix b, expression b1) as well as bessel function (appendix b, expressions b2 and b3), with the aim to get more convenient and simpler expression for resolution. the general form of our expression for the channel capacity would be:   0 ln(1 ) r v c r dxx x k x b    (16) where r represents a constant, or an expression in front of the integral, r represents the argument of degree variable by which the expression is solved and v represents the argument of bessel function. replacing (b5) into (b4), we get the expression for the solution of channel capacity at the output of the macrodiversity system:     1 2 1 2 3 2 3 1 1 1 2 2 1 2 1 2 1 0 01 1 2 22 2 1 2 1 2 0 0 3 3 30 3 0 3 1 0 2 ( 1) 1 1 2 ( ) ( 1) ( 1) 2 ! ( ) ! ( ) 1 1 1 1 (1 ) ! ( )( )(1 )( ( 1)) ( )! 1 ( )! ( z i i i ik i c i c c i j c k k k k k b i i i i k e i i c i cck i c i c j                                                        2 2 1 2 1 2 3 1 2 22 02 2 2 2 3 2 1 622 0 46 1 10 ( )!1 ( ( 1)) ( ( 1) ) ( )!(1 )) ( ,1 ), ( ,1 )4 ( 1) (2 ( 1) (1 )) | , , ( ,1 ), ( ,1 )(1 ) j i j j i i i j j c s v m t i k k x i i j l d l dk k g b b l c l c                                            (17) in table 1, the number of terms to be summed in order to achieve accuracy at the desired significant digit is depicted. as we can see from the table, how increases the correlation coefficient increases the number of terms to be summed in order to achieve accuracy at the 4th significant digit. for higher values of parameter c, smaller number of terms to achieve accuracy at the 4th significant digit is required [2], [26]. 454 m. m. smilić, b. s. jakšić, d. n. milić, s. r. panić, p. ć. 
table 1 number of terms that need to be summed in the expression for the cumulative distribution function to achieve accuracy at the significant digit given in brackets (x = 1, κ = 1, Ω0 = 1).

          c = 1 (4th)   c = 1.5 (4th)   c = 2 (4th)
ρ = 0.2   148           132             118
ρ = 0.4   152           138             121
ρ = 0.6   154           140             125
ρ = 0.8   155           141             126

5. numerical results

using (12) and (13), figure 2 shows the probability density of the signal x at the output of the macrodiversity system for different numbers of clusters μ through which the signal propagates. the average signal power is Ω0 = 1, the correlation coefficient is ρ = 0.5, the shadowing severity is c = 1.5 and the rician factor is κ = 1. the figure shows that the maximum values of the probability density are obtained for higher values of the parameter μ. as the number of clusters μ decreases, the probability density of the signal decreases more slowly.

also using (12) and (13), figure 3 shows the probability density of the signal x at the output of the macrodiversity system for different values of the rician κ factor and the number of clusters μ. the average signal power is Ω0 = 1, the correlation coefficient is ρ = 0.5, and the channel shadowing severity is c = 1.5. for higher values of the parameters κ and μ, the probability density has a more pronounced peak and decreases faster.

fig. 2 the probability density of the signal at the output of macrodiversity sc receiver for different values of μ number of clusters.

using (17), figure 4 shows the channel capacity at the output of the macrodiversity system as a function of the correlation coefficient, for various numbers of clusters μ and values of the channel shadowing severity c. the average signal power is Ω0 = 1 and the rician factor is κ = 1.
the figure shows that the channel capacity at the output of the macrodiversity system decreases as the correlation coefficient increases. for lower values of the correlation coefficient, the highest channel capacity is obtained for higher values of the number of clusters μ and the channel shadowing severity c, but for these values the channel capacity decreases faster than for lower values of μ and c.

figure 5 shows the channel capacity at the output of the macrodiversity system as a function of the rician κ factor for different values of the number of clusters μ. the average signal power is Ω0 = 1 and the channel shadowing severity is c = 1. the channel capacity at the output of the macrodiversity system decreases as the rician κ factor increases, especially at its lower values, while for higher values of κ the channel capacity is constant and approximately equal regardless of the number of clusters. the channel capacity decreases faster for lower numbers of clusters.

fig. 3 the probability density of the signal at the output of macrodiversity sc receiver for different values of rician κ factor and μ number of clusters.

fig. 4 channel capacity per unit bandwidth at the output of the macrodiversity sc receiver for different values of the channel shadowing severity c and μ number of clusters.

fig. 5 channel capacity per unit bandwidth at the output of macrodiversity system depending on the rician κ factor.

6.
conclusion

in this paper we discussed a diversity system with two microdiversity sc receivers and one macrodiversity sc receiver. at the inputs of the microdiversity sc receivers there are independent κ-μ fading and correlated slow gamma fading. the microdiversity sc receivers reduce the impact of fast fading on system performance, while the macrodiversity sc receiver reduces the impact of slow fading. for this system, the probability density function and the channel capacity at the output of the macrodiversity system are calculated. the probability density function of the signal is an important statistical characteristic from which other first- and second-order statistical characteristics can be calculated. when the parameter μ decreases, the influence of fading severity increases, and when μ increases, it decreases. when the influence of fading severity increases, system performance deteriorates. the influence of fading severity is greater when the rician κ factor is smaller. for lower values of the correlation coefficient, the highest channel capacity is obtained for higher values of the number of clusters μ and the channel shadowing severity c. channel capacity at the output of the macrodiversity system decreases with the increase of the rician κ factor, especially at its lower values, while for higher values the mean number of axial cross-sections is constant and approximately equal, regardless of the number of clusters. the analysis presented in this paper has a high level of generality and applicability, because the propagation scenarios are modeled using the κ-μ model, which includes a large number of known signal propagation models (nakagami-m, rayleigh, rician, etc.) as special cases.

appendix a

after using [21] for solving the second integral in (12), i1 becomes:

[equation (a1): illegible in the source]

after the development of the gamma function

[equation (a2): illegible in the source]

and by the use of [21] we have

[equation (a3): illegible in the source]

where kn(x) is the modified bessel function of the second kind, of order n and argument x [21]. the infinite series above converge rapidly, with only a few terms needed to achieve accuracy at the 5th significant digit.

appendix b

by applying [27] we have:

$$\ln(1+x) = G^{1,2}_{2,2}\left( x \,\middle|\, \begin{matrix} 1, 1 \\ 1, 0 \end{matrix} \right) \tag{b1}$$

and by applying the formula (8.4.23/1) from [24] we have:

$$K_{\nu}(x) = \frac{1}{2}\, G^{2,0}_{0,2}\left( \frac{x^2}{4} \,\middle|\, \begin{matrix} - \\ \frac{\nu}{2}, -\frac{\nu}{2} \end{matrix} \right) \tag{b2}$$

specifically for our case the application of (20) would be:

[equation (b3): illegible in the source]

after replacing (b1) and (b3) into (15) and after arrangement, we get the expression:

[equation (b4): illegible in the source]

the integral in (b4) is solved by applying the formula (2.24.1/1) from [24]. we get that the integral is equal to:

[equation (b5): illegible in the source]

references

[1] g. l. stüber, principles of mobile communication, 2nd ed. new york: kluwer academic publishers, 2002.
[2] s. panic, m. stefanovic, j. anastasov, and p. spalevic, fading and interference mitigation in wireless communications, 1st ed. boca raton, fl, usa: crc press, inc., 2013.
[3] m. k. simon and m.-s. alouini, digital communication over fading channels, 2nd ed. new york: john wiley & sons, inc., 2005.
[4] s. r. panić, d. m. stefanović, i. m. petrović, m. č. stefanović, j. a. anastasov, and d. s. krstić, “second order statistics of selection macro-diversity system operating over gamma shadowed κ-μ fading channels,” eurasip j. wirel. commun. netw., vol. 2011, no. 151, pp. 1–7, 2011.
[5] m. d. yacoub, “the κ-μ distribution and the η-μ distribution,” ieee antennas propag. mag., vol. 49, no. 1, pp. 68–81, 2007.
[6] n. djordjević, b. s. jakšić, a. matović, m. matović, and m. smilić, “moments of microdiversity egc receivers and macrodiversity sc receiver output signal over gamma shadowed nakagami-m multipath fading channel,” j. electr. eng., vol. 66, no. 6, pp. 348–351, 2015.
[7] a. v. marković, z. h. perić, d. b. đošić, m. m. smilić, and b. s. jakšić, “level crossing rate of macrodiversity system over composite gamma shadowed alpha-kappa-mu multipath fading channel,” facta universitatis, ser. autom. control robot., vol. 14, no. 2, pp. 99–109, 2015.
[8] d. krstic, v. doljak, m. stefanovic, and b. jaksic, “second order statistics of macrodiversity sc receiver output signal over gamma shadowed k-μ multipath fading channel,” in proceedings of the 2016 international conference on broadband communications for next generation networks and multimedia applications (cobcom), 2016, pp. 1–6.
[9] j. proakis, digital communications, 4th ed.
new york: mcgraw-hill, 2001.
[10] p. m. shankar, “analysis of microdiversity and dual channel macrodiversity in shadowed fading channels using a compound fading model,” aeu int. j. electron. commun., vol. 62, no. 6, pp. 445–449, jun. 2008.
[11] p. c. spalevic, b. s. jaksic, b. p. prlincevic, i. dinic, and m. m. smilic, “signal moments at the output from the macrodiversity system with three mrc micro diversity receivers in the presence of k-μ fading,” in proceedings of ieee conference telsiks 2015, 2015, pp. 271–274.
[12] “wolfram functions site.” [online]. available: http://functions.wolfram.com. [accessed: 10-jun-2016].
[13] j. li, a. bose, and y. q. zhao, “rayleigh flat fading channels’ capacity,” in proceedings of the 3rd annual communication networks and services research conference, 2005, vol. 2005, pp. 214–217.
[14] p. varzakas, “average channel capacity for rayleigh fading spread spectrum mimo systems,” int. j. commun. syst., vol. 19, no. 10, pp. 1081–1087, 2006.
[15] w. hu, l. wang, g. cai, and g. chen, “non-coherent capacity of m-ary dcsk modulation system over multipath rayleigh fading channels,” ieee access, vol. 5, no. 1, pp. 956–966, 2017.
[16] p. yang, y. wu, and h. yang, “capacity of nakagami-m fading channel with bpsk/qpsk modulations,” ieee commun. lett., vol. 21, no. 3, pp. 564–567, 2017.
[17] j. m. romero-jerez and f. j. lopez-martinez, “fundamental capacity limits of spectrum-sharing in hoyt (nakagami-q) fading channels,” in ieee vehicular technology conference, 2017.
[18] d. b. djosic, d. m. stefanovic, and c. m. stefanovic, “level crossing rate of macro-diversity system with two micro-diversity sc receivers over correlated gamma shadowed α–µ multipath fading channels,” iete j. res., vol. 62, no. 2, pp. 140–145, 2016.
[19] s. r. panić, d. m. stefanović, i. m. petrović, m. č. stefanović, j. a. anastasov, and d. s.
krstić, “second-order statistics of selection macro-diversity system operating over gamma shadowed κ-μ fading channels,” eurasip j. wirel. commun. netw., vol. 2011, no. 1, p. 151, oct. 2011.
[20] p. s. bithas and a. a. rontogiannis, “mobile communication systems in the presence of fading/shadowing, noise and interference,” pp. 1–14, 2014.
[21] m. stefanović, s. r. panić, n. simić, p. spalević, and č. stefanović, “on the macrodiversity reception in the correlated gamma shadowed nakagami-m fading,” teh. vjesn., vol. 21, no. 3, pp. 511–515, 2014.
[22] b. jaksic, m. stefanovic, d. aleksic, d. radenkovic, and s. minic, “first-order statistical characteristics of macrodiversity system with three microdiversity mrc receivers in the presence of κ-μ short-term fading and gamma lon,” j. electr. comput. eng., vol. 2016, pp. 1–9, 2016.
[23] i. s. gradshteyn and i. m. ryzhik, table of integrals, series, and products, 5th ed. san diego: academic press.
[24] m. s. alouini and a. j. goldsmith, “capacity of rayleigh fading channels under different adaptive transmission and diversity-combining techniques,” ieee trans. veh. technol., vol. 48, no. 4, pp. 1165–1181, 1999.
[25] n. y. ermolova, “capacity analysis of two-wave with diffuse power fading channels using a mixture of gamma distributions,” ieee commun. lett., vol. 20, no. 11, pp. 2245–2248, 2016.
[26] b. s. jakšić, “level crossing rate of macrodiversity sc receiver with two microdiversity sc receivers over gamma shadowed multipath fading channel,” facta universitatis, ser. autom. control robot., vol. 14, no. 2, pp. 87–98, mar. 2015.
[27] a. p. prudnikov and j. a. brychkov, integrals and series, 2nd ed. moscow: fizmatlit, 2003.

facta universitatis series: electronics and energetics vol. 31, no 4, december 2018, pp.
613-626 https://doi.org/10.2298/fuee1804613k

calibration of ac induction magnetometer

branko koprivica, marko šućurović, alenka milovanović
university of kragujevac, faculty of technical sciences čačak, čačak, serbia

abstract. the aim of this paper is to describe a procedure and an experimental setup for calibration of an ac induction magnetometer. the paper presents an overview of the previous research and the results of measurement of the magnetic flux density inside a large-diameter multilayer solenoid. this solenoid is the magnetising coil of the magnetometer. the paper also describes a system of five smaller coils of the magnetometer, which are placed inside the large solenoid. three small coils are pickup coils, accompanied by two compensation coils; one of the pickup coils is an empty coil for magnetic field measurement. the experimental results of calibration of this coil system are presented, and all the presented results are discussed.

key words: induction magnetometer, calibration, measurement uncertainty, hall sensor, labview.

1. introduction

iron loss in induction motors can reach up to 20% of the total losses [2]. large efforts have been made to improve the production of electrical steel and to reduce losses. amorphous materials have also been used because of their lower losses. measurement of their magnetic characteristics has become of great importance in order to obtain reliable data on the power loss of these materials. ac induction magnetometers have proved to be a powerful tool for characterisation of ferromagnetic materials [3, 4]. an induction magnetometer uses a long solenoid for magnetisation of the sample [3, 4]. a pickup coil is placed inside the long solenoid and used for measurement of the magnetic flux density in the sample of ferromagnetic material.
the lateral dimension of the sample is not large, usually of the order of several millimetres (diameter of a wire or bar, or width of a strip), while its length usually amounts to several centimetres and may be up to 10-15 cm. another pickup coil may also be placed inside the long solenoid, without the sample, and used for measurement of the magnetic field.

received february 25, 2018; received in revised form july 5, 2018
corresponding author: branko koprivica, university of kragujevac, faculty of technical sciences, svetog save 65, 32000 čačak, serbia (e-mail: branko.koprivica@ftn.kg.ac.rs)
* an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 31 to september 01, 2017, in niš, serbia [1].

in general, it is easy to control the time waveform of the magnetic field created by the long solenoid. therefore, the magnetic field has the desired shape, while the shape of the magnetic flux density depends on the material response. when the magnetometer is used along with a personal computer, as a digital measurement setup, it enables very complex experiments to be performed, such as those for measurement of first-order reversal curves [3, 4]. this measurement method is based on two basic laws of magnetism: ampere’s law for the magnetising coil and faraday’s law for the pickup coil. when a long solenoid with a small diameter is used as the magnetising coil, according to ampere’s law, the magnetic field is homogeneous and has a longitudinal direction and constant amplitude [4]. however, this is not easy to achieve in practice, and in many cases the magnetic field is homogeneous only in the middle zone of the solenoid. this limits the length of the pickup coil. because of this inhomogeneity, the magnetic field cannot always be accurately calculated according to ampere’s law using the measured current of the solenoid.
faraday’s law applied to the pickup coil may also give an inaccurate result if the pickup coil is placed in a zone where the magnetic field is not homogeneous. therefore, the whole system needs to be calibrated. in this paper, a calibration procedure is described and performed on the coil systems of an old ac induction magnetometer, ferrotester 2738/s-3. as an initial step in the calibration of the magnetometer, the homogeneity of the magnetic field inside the large-diameter multilayer solenoid (large solenoid) was investigated in prior research [1]. it was found that the homogeneity zone covers only one third of the solenoid (its central part). the variation of the magnetic field in this zone was less than 1 % of its maximum value. the inner coil system of this magnetometer contains five smaller coils: one coil for measurement of the magnetic field (empty coil) and two pairs of coils in mutual opposition for measurement of the magnetic flux density and magnetisation. the calibration of this coil system, along with the calibration of the magnetising solenoid, is presented in this paper. the calibration of the magnetometer has been performed using a pc based measurement setup. a hall sensor has been used for measurement of the magnetic flux density in the homogeneity zone of the large solenoid. the supply voltage and the electric current of the large solenoid have also been measured. five voltages from the system of pickup coils have been measured: one on the empty coil, two on the coils in opposition and two compensated voltages. measurements have been performed using an ni usb 6009 data acquisition card and an application created in labview software. the ratio of the magnetic flux density maximum and the electric current maximum gives the calibration constant of the large solenoid. the calibration constant of each pickup coil has been calculated from the corresponding voltage maximum.
this paper gives a detailed description of the ac induction magnetometer, all information on the measuring equipment and the calibration procedure, as well as the results obtained during the calibration. it also gives a detailed calculation of the measurement uncertainty and a discussion of the results obtained, and explains how to calculate the magnetic field, the magnetic flux density and the magnetisation using the obtained calibration constants. moreover, some practical comments on measurements at various frequencies of the magnetising current are given in the paper.

2. ac induction magnetometer

a photo of the ac induction magnetometer is given in fig. 1. it was a part of the equipment of ferrotester 2738/s-3. it has a 360 mm long magnetising coil (large solenoid) of inner diameter 2r0=65 mm, fig. 1a. its number of turns is unknown (it is not given in the user manual). it also has a system of five pickup coils, each 100 mm long with an inner diameter of 15 mm, fig. 1b. their number of turns is also unknown. a cross-section of the magnetometer is presented in fig. 2a and an electrical scheme of connections of the pickup coils is given in fig. 2b.

fig. 1 photos of the induction magnetometer: a) large solenoid, b) pickup coils inside large solenoid

the five pickup coils are placed inside the large solenoid l0 in its central part (z∈(−5, 5) in fig. 3) at the same distance r1=20 mm from its axis (x=0 in fig. 3). as presented in [1], this is a zone with a homogeneous magnetic field in which the variation of the magnetic field intensity is less than 1 % of its maximum. since the whole system has axial symmetry, the pickup coils are exposed to the same magnetic field. coil l1 is used for measurement of the magnetic field and its interior should not contain samples of ferromagnetic material. coils l2 and l′2 are two identical coils wound in opposite directions and connected in mutual opposition, as given in fig. 2b.
coils l3 and l′3 are identical, wound in opposite directions, and connected in mutual opposition (fig. 2b). a small difference between coil systems 2 and 3 has been observed. a sample of the ferromagnetic material under test can be placed in any of these four pickup coils and the magnetic flux density can be measured. a resistor (45 kω) is connected in series with the two coils in opposition to reduce the electric current in the coils, fig. 2b.

fig. 2 induction magnetometer: a) cross-section, b) electrical scheme of connections of pickup coils

fig. 3 total magnetic flux density distribution inside large solenoid

fig. 2a also shows the position of the hall sensor (hs) used for direct measurement of the magnetic flux density inside the large solenoid. this sensor (type ss49e) is inserted in the vertical gap of a cylindrical plastic holder (perpendicular to its axis). along with the holder, the sensor is placed inside the coil l1 in the middle of its length, perpendicular to the longitudinal axis of the coil and perpendicular to the magnetic field. a photo of the sensor and the 3d printed plastic holder is presented in fig. 4.

fig. 4 hall sensor and its position inside plastic holder

usually, the magnetic field $H$ generated by a long solenoid is calculated from the measured electric current $I$ using expression (1) [5], according to ampere’s law:

$$H = \frac{N}{l}\, I = \frac{N}{l}\,\frac{U}{R}, \tag{1}$$

where $N$ is the number of turns of the large solenoid l0, $l$ is its length, $R$ is the resistor connected in series with the solenoid and $U$ is the voltage measured at the ends of $R$. however, this cannot be used since $N$ is unknown for l0. in general, expression (1) cannot be used with good accuracy in the case of a large-diameter multilayer solenoid l0.
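the limitation of expression (1) can be illustrated with a small numerical sketch. it compares the ideal long-solenoid value μ0(n/l)i with the textbook on-axis formula for a thin, single-layer solenoid of finite length. the geometry (360 mm length, 65 mm inner diameter) is taken from the text; the turn density and current are hypothetical placeholders, since n is unknown, and neither the real multilayer winding nor the off-axis position of the pickup coils is modeled. the ±5 cm zone assumes that z in fig. 3 is given in centimetres.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of vacuum, H/m


def b_ideal(turns_per_m, current):
    """Flux density of an infinitely long solenoid, i.e. mu0 times expression (1)."""
    return MU0 * turns_per_m * current


def b_axis_finite(turns_per_m, current, length, radius, z=0.0):
    """On-axis flux density of a thin single-layer solenoid of finite length.

    z is the distance from the solenoid centre along its axis (metres).
    """
    a = length / 2 + z
    b = length / 2 - z
    return 0.5 * MU0 * turns_per_m * current * (
        a / math.hypot(a, radius) + b / math.hypot(b, radius)
    )


LENGTH = 0.360        # m, from the text
RADIUS = 0.065 / 2    # m, inner radius from the text (winding thickness ignored)
TURNS_PER_M = 1000.0  # hypothetical, the real value is unknown
CURRENT = 1.0         # A, hypothetical

ideal = b_ideal(TURNS_PER_M, CURRENT)
ratio_centre = b_axis_finite(TURNS_PER_M, CURRENT, LENGTH, RADIUS) / ideal
ratio_edge = b_axis_finite(TURNS_PER_M, CURRENT, LENGTH, RADIUS, z=0.05) / ideal

print(f"centre / ideal          : {ratio_centre:.4f}")
print(f"z = 5 cm / ideal        : {ratio_edge:.4f}")
print(f"variation within +-5 cm : {100 * (1 - ratio_edge / ratio_centre):.2f} %")
```

even in this simplified model the field at the centre is below the ideal value, which is one reason why a measured calibration constant is preferable to expression (1).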
if the ferromagnetic sample is placed inside the pickup coil, according to faraday’s law, the voltage $u_L$ induced in the pickup coil is equal to [4]:

$$u_L = -N_p \frac{\mathrm{d}\Phi}{\mathrm{d}t} = -\mu_0 N_p \left( S_s \frac{\mathrm{d}M}{\mathrm{d}t} + S_p \frac{\mathrm{d}H}{\mathrm{d}t} \right), \tag{2}$$

where $\mu_0$ is the permeability of vacuum, $N_p$ is the number of turns of the pickup coil, $S_p$ is the cross-section area of the pickup coil, $S_s$ is the cross-section area of the sample, $\Phi$ is the magnetic flux and $M$ is the magnetisation. the voltage measured at the non-common ends of the pickup coils in mutual opposition is equal to the first term in expression (2). if no sample is placed inside the pickup coil in mutual opposition, the voltage induced in that pickup coil is equal to the second term in expression (2). if no sample is placed inside the pickup coils, the induced voltages are equal and the resulting voltage is equal to zero or very close to zero. according to the iec standard for the epstein frame [6], the resulting voltage should be smaller than 0.1 % of the individual voltages. expression (2) cannot be applied to the pickup coils of the described magnetometer since these coils are multilayer coils and their number of turns is unknown.

3. calibration procedure and results

the calibration of the induction magnetometer is performed in three steps:
1. calibration of hall sensor,
2. calibration of large solenoid and
3. calibration of pickup coils.

the calibration is necessary because the numbers of turns of all coils are unknown for the induction magnetometer used. the large solenoid generates a homogeneous magnetic field only in the central zone. even if this is not the case, it is always better to perform the calibration and to compare the obtained results with calculations. all measurements are performed with controlled sinusoidal excitation voltage and current, at the frequency of 50 hz. an ni usb 6009 data acquisition card is used in all measurements [7].
three simple labview applications are developed, one for each step of the calibration. in all measurements averaging is used to reduce the noise [8].

3.1. calibration of hall sensor

the hall sensor ss49e has a linear output voltage in the range of magnetic flux density from −100 mt to 100 mt [9]. therefore, it is suitable for measurement of the magnetic flux density generated by the large solenoid l0. however, it needs to be calibrated in order to determine its sensitivity. the calibration is performed using a long solenoid (l=340 mm) with a small diameter (25 mm). varnished copper wire 1.8 mm thick is used for winding the n=190 turns of this solenoid. the electric current i(t) of the solenoid is measured using a shunt resistor (20 a, 75 mv). the magnetic flux density is calculated as follows:

$$B(t) = \mu_0 \frac{N\, i(t)}{l}. \tag{3}$$

the sensor characteristic uhmax=f(bmax) is obtained from the maximum of the voltage uh measured at the output of the sensor and the maximum of the magnetic flux density calculated using expression (3). measurements are performed in the increasing and the decreasing direction in order to examine the linearity of the characteristic. the sensitivity of the sensor is calculated from the slope of the obtained characteristic. characteristics of the sensor (blue lines) obtained from two measurements for maximal magnetic flux densities of 10 mt (green squares) and 20 mt (red circles) are presented in fig. 5.

fig. 5 calibration of hall sensor: measurement results and linear fit (uhmax [v] versus bmax [mt])

linear fit y = a + bx | slope (b) | stand. deviation | adj. r-square
--- | --- | --- | ---
20 mt | 0.02991 | 2.57e-4 | 0.99957
10 mt | 0.03026 | 1.25e-4 | 0.99889

the figure also presents a table with the calculated slopes, standard errors and adjusted r-squares.
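the evaluation of the sensor characteristic can be sketched as follows. the sketch converts peak solenoid currents into flux density with expression (3) and fits a straight line by ordinary least squares, returning the slope, its standard error and the adjusted r-square. the solenoid data (n=190, l=340 mm) come from the text; the currents, the voltages and the small scatter are synthetic values chosen for illustration, and the fit statistics use textbook definitions that may differ in detail from those of the fitting software used by the authors.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of vacuum, H/m


def flux_density_mT(i_max, turns=190, length=0.340):
    """Expression (3): peak flux density of the calibration solenoid, in mT."""
    return MU0 * turns * i_max / length * 1e3


def linear_fit(x, y):
    """Least-squares fit y = a + b*x.

    Returns intercept a, slope b, standard error of the slope and adjusted R^2.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    tss = sum((yi - y_mean) ** 2 for yi in y)                    # total sum of squares
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    adj_r2 = 1.0 - (rss / (n - 2)) / (tss / (n - 1))
    return a, b, slope_se, adj_r2


# Synthetic data: peak currents [A] and Hall sensor output maxima [V]
currents = [0.0, 3.5, 7.0, 10.5, 14.0]
b_max = [flux_density_mT(i) for i in currents]
scatter = [0.001, -0.001, 0.0, 0.001, -0.001]  # artificial measurement scatter [V]
u_h_max = [2.5 + 0.03 * b + e for b, e in zip(b_max, scatter)]

intercept, slope, slope_se, adj_r2 = linear_fit(b_max, u_h_max)
print(f"slope = {slope:.5f} V/mT, std. error = {slope_se:.2e}, adj. R^2 = {adj_r2:.5f}")
```

the intercept of the fit absorbs the quiescent output voltage of the sensor, so only the slope is interpreted as the sensitivity.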
for a given dataset (xi, yi), i=1, 2, …, n, the standard deviation (error) ε of the slope and the adjusted r-square $\bar{R}^2$ of the linear model y=a+bx can be calculated as [10]:

$$\varepsilon = \sqrt{\frac{\dfrac{1}{n-2}\sum_{i=1}^{n}\left(y_i-(a+bx_i)\right)^2}{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}}, \tag{4}$$

$$\bar{R}^2 = 1-\frac{\dfrac{1}{n-2}\sum_{i=1}^{n}\left(y_i-(a+bx_i)\right)^2}{\dfrac{1}{n-1}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2}; \qquad \bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i, \tag{5}$$

where $\bar{x}$ is the mean value of xi, defined in the same way as $\bar{y}$.

according to fig. 5, the sensitivity of the hall sensor is equal to s=0.03 v/mt. it is interesting to notice that both measurements in fig. 5 show a dispersion of results. this effect has been examined more thoroughly. it has been found that this behaviour comes from the heating and cooling of the shunt resistor during the measurement of the electric current. for lower values of the supply current (up to 10 a, rms) this effect is not very pronounced (green squares) and it can be neglected. therefore, it can be concluded that the sensor output is linear with a constant sensitivity. however, the second measurement (red circles) shows significant dispersion of the results and the calculated slope is lower by 1.16 % than in the first measurement. such a difference can be comparable to, or even higher than, the overall measurement uncertainty of the experiment (discussed in detail in section 4). therefore, attention needs to be paid to this effect and its influence on the calculated results.

3.2. calibration of large solenoid

the calibration of the large solenoid is performed with the calibrated hall sensor. at this step, the dependence of the magnetic flux density generated by the large solenoid on its electric current is examined. the final result of the calibration is the bmax=f(imax) characteristic of the large solenoid. the electric current of the large solenoid is measured using a shunt resistor. the magnetic flux density is calculated by dividing the measured output voltage of the hall sensor by the sensitivity s obtained in the previous step.
the hall sensor is placed in the middle of the large solenoid, so that the sensor surface is perpendicular to its longitudinal axis and the magnetic field. measurements are performed with increasing and decreasing electric current in order to examine the linearity of the characteristic. the result of the calibration of the large solenoid is presented in fig. 6. according to the linear fit of the measured results, the slope of the b=f(i) characteristic of the large solenoid is around 19.77 mt/a. the linearity of this characteristic confirms the conclusion from the calibration of the hall sensor that the dispersion of measurement results comes from the variation of temperature of the shunt resistor. the obtained slope can be used in further measurements for the calculation of the magnetic flux density generated by the large solenoid from the measured current.

fig. 6 calibration of large solenoid: measurement results and linear fit (bmax [mt] versus imax [a], measured with increasing and decreasing current; linear fit y = a + bx: slope (b) 19.768, stand. deviation 0.0244, adj. r-square 0.999)

3.3. calibration of pickup coils

in the third step of the calibration, the voltages induced in the pickup coils l1, l′2 and l′3 (fig. 2b) and the voltages at the ends of the mutual pickup coils l2-l′2 and l3-l′3 are measured in order to calculate the calibration constants for all pickup coils. all voltages have been measured in relation to the ground terminal of the data acquisition card. additionally, the voltage supplied to the large solenoid ul0, as well as the electric current i and the magnetic flux density b (hall sensor), are measured. thus, all quantities of interest for the calibration of the magnetometer are measured simultaneously. measurements have been performed at six different magnetising currents up to around imax=1.2 a.
each signal (its time waveform) has been measured 1600 times and the averaged signal has been calculated. this reduces the noise in all signals to negligible levels [8]. because this is the most important calibration step, it has been repeated five times. finally, the mean value of all measurements has been calculated. signals measured at imax=0.39 a are presented in fig. 7. it can be noticed that the signals that represent the magnetising current and the magnetic flux density are in phase, while the signal ul1 from the pickup coil l1 lags by π/2. the signals ul′2 and ul′3 from the pickup coils l′2 and l′3 are opposite in phase to the signal ul1. the maximum of the voltage induced in a pickup coil can be derived as:

$$U_{L\max} = k_L\, \omega\, B_{\max} = k_L\, \omega\, \mu_0 H_{\max}, \tag{6}$$

where $k_L$ is a constant proportional to the product of the number of turns and the cross-section area of the pickup coil and $\omega$ is the angular frequency. since the numbers of turns are unknown for the pickup coils of the calibrated magnetometer, this product can be calculated from the measured maximums of the induced voltage and the magnetic flux density. thus, the calibration constant of the pickup coil can be obtained as kl=ulmax/(ωbmax).

fig. 7 calibration of pickup coils: measured signals (ul0 [v], i [a] and b [mt] versus t [s]; ul1, ul′2 and ul′3 [v] versus t [s])

table 1 presents the results of the calibration of the pickup coils obtained for different magnetising currents. it contains the maximums of the supply voltage and the electric current of the large solenoid, the maximum of the magnetic flux density measured with the hall sensor, the maximum of the voltage induced in the pickup coil l1 and the calculated calibration constants for all pickup coils.
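as a worked check of the relation kl=ulmax/(ωbmax), the sketch below recomputes the calibration constant of coil l1 from the maxima reported in table 1 (bmax in mt and ul1max in v), assuming the 50 hz excitation stated in section 3. the variable names are illustrative only.

```python
import math

FREQUENCY = 50.0                 # Hz, excitation frequency stated in the text
OMEGA = 2 * math.pi * FREQUENCY  # angular frequency, rad/s

# Values from Table 1: maxima of flux density [mT] and of the L1 voltage [V]
b_max_mT = [3.855, 7.718, 11.596, 15.455, 19.321, 23.161]
u_l1_max = [1.267, 2.540, 3.815, 5.088, 6.368, 7.641]


def calibration_constant(u_max, b_max_tesla, omega=OMEGA):
    """Pickup coil constant k_L = U_Lmax / (omega * B_max), in m^2."""
    return u_max / (omega * b_max_tesla)


k_l1 = [calibration_constant(u, b * 1e-3) for u, b in zip(u_l1_max, b_max_mT)]
k_l1_mean = sum(k_l1) / len(k_l1)

for row, k in enumerate(k_l1, start=1):
    print(f"measurement {row}: k_L1 = {k:.3f} m^2")
print(f"average      : k_L1 = {k_l1_mean:.3f} m^2")
```

the recomputed values agree with the kl1 column of table 1 to within rounding.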
the last row of table 1 gives the averaged value of the ratio of the maximums of the magnetic flux density and the magnetising current, which is the calibration constant of the large solenoid l0 (step two), and the averaged values of the calibration constants of all pickup coils. the obtained calibration constant for the large solenoid is in accordance with the result given in fig. 6.

table 1 results of pickup coils calibration

no. | ul0max [v] | imax [a] | bmax [mt] | bmax/imax [mt/a] | ul1max [v] | kl1 [m²] | kl′2=kl2 [m²] | kl′3=kl3 [m²]
--- | --- | --- | --- | --- | --- | --- | --- | ---
1. | 31.65 | 0.195 | 3.855 | 19.738 | 1.267 | 1.046 | 1.225 | 1.170
2. | 63.41 | 0.390 | 7.718 | 19.780 | 2.540 | 1.048 | 1.226 | 1.171
3. | 95.22 | 0.586 | 11.596 | 19.801 | 3.815 | 1.047 | 1.225 | 1.170
4. | 127.14 | 0.782 | 15.455 | 19.767 | 5.088 | 1.048 | 1.226 | 1.171
5. | 159.22 | 0.979 | 19.321 | 19.738 | 6.368 | 1.049 | 1.228 | 1.172
6. | 191.37 | 1.176 | 23.161 | 19.685 | 7.641 | 1.050 | 1.229 | 1.174
average | | | | 19.751 | | 1.048 | 1.227 | 1.171

4. discussion of results

an important part of the calibration procedure is the estimation of the uncertainty of the performed measurements [11]. the calibration of the hall sensor is found to be complex and influenced by many variables. also, the uncertainty of this calibration influences the uncertainty of the other two calibrations. therefore, it is analysed thoroughly in this section. in order to achieve the lowest possible uncertainty, the calibration of the hall sensor is repeated with another data acquisition card (ni 9205), which has adjustable voltage ranges and better absolute accuracy than the ni usb 6009. in the measurements made at the magnetic flux density of 10 mt, the voltage range for measurement of the hall sensor voltage is set to 5 v and the voltage range for measurement of the voltage at the ends of the shunt resistor is set to 200 mv.
the hall sensor sensitivity is calculated as:

$$S = \frac{U_{H\max} - 2.5}{B_{\max}} = \frac{\left(U_{H\max} - 2.5\right) l R}{\mu_0 N_1 U_{\max}}, \tag{7}$$

where: $U_{H\max}$=2.799 v is the maximum of the hall sensor voltage, 2.5 v is the quiescent output voltage of the hall sensor, $B_{\max}$ is the maximum of the magnetic flux density, $l$=340 mm is the length of the solenoid, $R$=0.00375 ω is the shunt resistance, $\mu_0$=4π·10⁻⁷ h/m is the magnetic permeability of vacuum (air), $N_1$=190 is the number of turns of the solenoid and $U_{\max}$=0.05369 v is the maximum of the shunt voltage. the combined uncertainty of the sensitivity $S$ is calculated from the sensitivity coefficients, which are obtained as partial derivatives of the sensitivity $S$ expressed by (7), and the absolute uncertainties of each independent variable in (7) [11, 12], as:

$$u_c(S) = \sqrt{\sum_i \left(\frac{\partial S}{\partial x_i}\right)^2 u_{B,x_i}^2} = \sqrt{\sum_i u_{x_i}^2}, \tag{8}$$

where $x_i \in \{U_{H\max}, l, R, N_1, U_{\max}\}$, $u_{x_i} = (\partial S/\partial x_i)\, u_{B,x_i}$ and $u_{B,x_i}$ is a type b standard uncertainty evaluated from the absolute uncertainty $U_{x_i}$ of the variable as $u_{B,x_i} = U_{x_i}/\sqrt{3}$ for a rectangular distribution or as $u_{B,x_i} = U_{x_i}/1.960$ for a normal distribution with the confidence level of 95 %. the sensitivity coefficients are calculated using the values of $U_{H\max}$, $l$, $R$, $N_1$, $\mu_0$ and $U_{\max}$. their values are given in table 2. the absolute uncertainties of the voltages are calculated according to the specification for the ni 9205 given by the manufacturer [13], taking into account three components of error: error of full scale, error of reading and noise error. the absolute uncertainty of the length of the solenoid is taken as one half of the measuring unit. the absolute uncertainty of the resistance is given by the manufacturer as 0.5 % of its rated value. the absolute uncertainty of the number of turns is equal to one turn. all values are given in table 2.
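the uncertainty budget described by (7) and (8) can be reproduced with a short sketch. the input values, the absolute uncertainties and the divisors (√3 for a rectangular and 1.96 for a normal distribution) are those stated in the text and in table 2; evaluating the partial derivatives by central finite differences is a choice made here for brevity and is not necessarily how the authors obtained them. because the published inputs are rounded, the computed sensitivity differs slightly from the reported 29.77 v/t.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of vacuum, H/m


def sensitivity(u_h, length, r_shunt, turns, u_shunt, u_quiescent=2.5):
    """Expression (7): Hall sensor sensitivity in V/T."""
    return (u_h - u_quiescent) * length * r_shunt / (MU0 * turns * u_shunt)


# Measured values stated in the text (SI units)
values = {"u_h": 2.799, "length": 0.340, "r_shunt": 0.00375,
          "turns": 190.0, "u_shunt": 0.05369}
# Absolute uncertainties from Table 2
abs_unc = {"u_h": 0.002138, "length": 0.5e-3, "r_shunt": 0.01875e-3,
           "turns": 1.0, "u_shunt": 0.08953e-3}
# Divisors: sqrt(3) for rectangular, 1.96 for normal (95 %) distributions
divisor = {"u_h": math.sqrt(3), "length": 1.96, "r_shunt": math.sqrt(3),
           "turns": 1.96, "u_shunt": math.sqrt(3)}


def partial(name, rel_step=1e-6):
    """Sensitivity coefficient dS/dx by central finite difference."""
    h = values[name] * rel_step
    up, down = dict(values), dict(values)
    up[name] += h
    down[name] -= h
    return (sensitivity(**up) - sensitivity(**down)) / (2 * h)


s = sensitivity(**values)
contributions = {name: partial(name) * abs_unc[name] / divisor[name] for name in values}
u_combined = math.sqrt(sum(c ** 2 for c in contributions.values()))  # expression (8)
u_expanded = 2 * u_combined                                          # coverage factor k = 2

print(f"S = {s:.2f} V/T")
for name, c in contributions.items():
    print(f"  {name:8s}: {abs(c):.5f} V/T")
print(f"combined: {u_combined:.3f} V/T ({100 * u_combined / s:.2f} %), "
      f"expanded (k=2): {u_expanded:.2f} V/T")
```

the individual contributions can be compared directly with the absolute standard uncertainty column of table 2.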
a rectangular distribution is assumed for the voltages and the resistance, with the divisor √3, and a normal distribution is assumed for the length and the number of turns, with the divisor 1.96 (confidence level of 95 % at infinite degrees of freedom). the combined uncertainty calculated using (8) has an absolute value of 0.174 v/t and a relative value of 0.58 %. therefore, the standard uncertainty is 0.58 %. moreover, the coverage factor k=2 can be used for the calculation of the expanded uncertainty. in that case, the confidence level is around 95 % and the expanded uncertainty is 0.35 v/t or 1.17 %. finally, the result of the measurement of the hall sensor sensitivity can be reported as:

$$S = 29.77\ \mathrm{V/T} \pm 0.35\ \mathrm{V/T}. \tag{9}$$

table 2 type b uncertainty for hall sensor calibration at 10 mt

variable | absolute uncertainty | sensitivity coefficient | distribution | absolute standard uncertainty ub [v/t] | relative uncertainty [%]
--- | --- | --- | --- | --- | ---
uhmax | 0.002138 v | 99.46 1/t | rectangular | 0.12277 | 0.412
l | 0.5·10⁻³ m | 87.555 v/(t·m) | normal (95%) | 0.02233 | 0.075
r | 0.01875·10⁻³ ω | 7938.32 v/(t·ω) | rectangular | 0.08593 | 0.289
n1 | 1 | −0.1567 v/t | normal (95%) | 0.07994 | 0.268
umax | 0.08953·10⁻³ v | −554.455 1/t | rectangular | 0.02866 | 0.096

the main contribution to the measurement uncertainty comes from the measurement of the output voltage of the hall sensor. the reason is the relatively high value of the measured voltage, as well as the high measuring range. the absolute uncertainty ub,uhmax depends on both values. the calculated uncertainty refers only to the measurements performed at 10 mt. the whole calculation described needs to be repeated to obtain the uncertainty for other values of the magnetic flux density. the results of the calibration can be used in further measurements with the magnetometer to shorten the time needed for measurement and calculation. they can be used in different ways.
first, the calibration constant of the large solenoid can be used for calculation of the time waveform of the magnetic field generated by the large solenoid from the measured electric current i(t), as h(t)=19.77i(t)/μ0. as a consequence, the hall sensor may be excluded from the measurement setup. additionally, the magnetic field can also be calculated from the integral of the measured voltage ul1(t) of the pickup coil l1 as h(t)=−ul1(t−T/4)/(μ0ωkl1), ω=2π/T, where T is the period (in the case of sinusoidal excitation current). thus, the shunt resistor for current measurement can also be excluded from the measurement setup.

calibration constants of the other four pickup coils can be used in calculations of time waveforms of the magnetic flux density b and the magnetisation m of the ferromagnetic sample. similar to expression (2), in the case when a ferromagnetic sample is placed inside the pickup coil l2, the following expression can be used:

$$u_{l2} = -\mu_0 k_{l2} \left( \frac{s_s}{s_{l2}} \frac{\mathrm{d}m}{\mathrm{d}t} + \frac{\mathrm{d}h}{\mathrm{d}t} \right), \tag{10}$$

where ul2 is the measured voltage of the pickup coil l2, $s_{l2} = \pi r_p^2$ is the cross-section area and rp=7.5 mm is the inner radius of the pickup coil. the magnetisation of the sample can be obtained by integration of this voltage and by substituting the magnetic field expressed over the voltage ul1 as:

$$m = -\frac{s_{l2}}{s_s} \left( \frac{1}{\mu_0 k_{l2}} \int_0^t u_{l2}\, \mathrm{d}t + h \right). \tag{11}$$

on the other hand, the voltage u2 between points 2 and 4 (fig. 2b) can be obtained using only the magnetisation:

$$u_2 = -\mu_0 k_{l2} \frac{s_s}{s_{l2}} \frac{\mathrm{d}m}{\mathrm{d}t}. \tag{12}$$

the magnetisation of the sample is obtained by the integration of (12), as:

$$m = -\frac{s_{l2}}{s_s}\, \frac{1}{\mu_0 k_{l2}} \int_0^t u_2\, \mathrm{d}t. \tag{13}$$

the magnetic flux density of the sample can be calculated using (11) or (13) and previously calculated h(t), according to the well-known relation:

$$b = \mu_0 (m + h). \tag{14}$$

similar expressions can be used for other pickup coils in the case when the ferromagnetic sample is placed inside these coils.
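as an illustration of the quarter-period relation for h(t) and of relation (14), the following minimal sketch works on one full period of uniformly sampled data. it assumes a purely sinusoidal excitation and a sample count divisible by four; the function names and the numerical values of the calibration constant, period and field amplitude are hypothetical and serve only as a synthetic self-check.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic permeability of vacuum, H/m


def field_from_pickup(u_l1, k_l1, period):
    """h(t) = -u_l1(t - T/4) / (mu0 * omega * k_l1) for one period of samples.

    Valid only for sinusoidal excitation; the delay by T/4 is applied as a
    circular shift by a quarter of the samples.
    """
    n = len(u_l1)
    if n % 4 != 0:
        raise ValueError("number of samples per period must be divisible by 4")
    omega = 2 * math.pi / period
    shift = n // 4
    return [-u_l1[(i - shift) % n] / (MU0 * omega * k_l1) for i in range(n)]


def flux_density(m, h):
    """Relation (14): b = mu0 * (m + h), applied sample by sample."""
    return [MU0 * (mi + hi) for mi, hi in zip(m, h)]


# Synthetic check with hypothetical values: an empty coil driven by
# h(t) = h_peak * sin(omega * t) induces u_l1 = -mu0 * k_l1 * dh/dt.
n, period, k_l1, h_peak = 400, 0.02, 0.5, 1000.0
omega = 2 * math.pi / period
u_l1 = [-MU0 * k_l1 * omega * h_peak * math.cos(2 * math.pi * i / n)
        for i in range(n)]
h = field_from_pickup(u_l1, k_l1, period)
max_err = max(abs(h[i] - h_peak * math.sin(2 * math.pi * i / n))
              for i in range(n))
print(f"maximum reconstruction error: {max_err:.3e} A/m")
```

for non-sinusoidal excitation the shift no longer replaces the integration, and the voltage would have to be integrated numerically instead.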
however, the above analysis may lead to measurement errors caused by two effects. the first one is related to the shape of the excitation current. as soon as the ferromagnetic sample is placed inside a pickup coil, for the same excitation voltage, the magnetising current will change by some amount. this change may be large and the shape of the current may be significantly distorted from sinusoidal. the level of influence depends on material characteristics. this effect can be overcome by computer-based digital feedback, as has been discussed thoroughly in the literature [11].

the second effect may also appear after insertion of the ferromagnetic sample inside a pickup coil. the magnetic field inside the pickup coil with the ferromagnetic sample will be distorted, as well as the surrounding magnetic field. this distortion might reach other pickup coils and disrupt the air flux compensation, which would no longer be as effective as it was for the empty coils. this problem can be solved numerically in such a way that the compensating voltage is calculated using the measured magnetic field [11]. this voltage needs to be subtracted from the measured voltage induced in the pickup coil, for example ul2 in (10). consequently, the magnetic field needs to be measured accurately to obtain effective air flux compensation.

for these reasons, the analysis given by equations (10) to (14) needs to be validated through numerous experiments in which different magnetic materials should be used. also, dimensions of magnetic samples need to be varied, as well as the frequency of the excitation current and its shape (sinusoidal, triangular and other).

the main purpose of the calibrated magnetometer is to obtain instantly the hysteresis loop of a ferromagnetic sample from calculated time waveforms of the magnetic flux density (or magnetisation) and the magnetic field. it can be used for parallel comparison of the hysteresis loops of up to four samples.
samples can be made from different materials with the same dimensions or from one material with different dimensions. if all samples are made from one material and have the same dimensions, one sample can be used as a reference sample while the others can be checked against the reference and classified according to the predefined criteria.

the magnetometer can operate with different frequencies of the magnetising current and with different shapes of its time waveform. it should be taken into account that the voltage induced in pickup coils should not exceed the voltage range of the data acquisition card (usually 10 v). the induced voltage increases with increasing frequency of the magnetising current. sometimes, it is necessary to use voltage dividers to keep the induced voltage in the desired range. at very low frequencies (below 1 hz), the amplitude of the induced voltage is relatively small, which results in a deterioration of the signal-to-noise ratio. in such cases, averaging of measured signals is useful [8].

5. conclusion

the paper gives a brief description of the ac induction magnetometer, its construction and working principle, and emphasises its rising importance in complex measurements with ferromagnetic materials. the paper describes a calibration procedure of the ac induction magnetometer. it has been performed on the coil systems of ferrotester 2738/s-3. initially, measurements have been performed on the large-diameter multilayer solenoid in order to investigate the homogeneity of the generated magnetic field. it has been found that the variation of the magnetic field in the homogeneity zone was less than 1 % of its maximum value. this homogeneity zone covers one third of the solenoid volume (its central part). the calibration constant of the large solenoid has been determined using measurements. further, using numerous measurements, the calibration constants for all investigated pickup coils have been determined.

a detailed calculation of the measurement uncertainty of hall sensor sensitivity is also presented in the paper. it has been found that the relative expanded uncertainty amounts to 1.17 % at a magnetic flux density of 10 mt.

the possible ways to use a calibrated magnetometer for future measurements have been described. the method for determining time waveforms of the magnetic field and the magnetic flux density (or magnetisation) from the measured voltage induced in pickup coils and the calibration constants has been presented. a hysteresis loop of a ferromagnetic sample can be easily obtained from these waveforms. the magnetometer can be used for simultaneous testing of four samples of one material or different materials. two side effects that can produce errors in such measurements were discussed. some practical notes on the measurements with the calibrated magnetometer at different frequencies of the magnetising current have also been given in the paper.

acknowledgement: this paper has been supported by scientific project tr 33016, financed by the ministry of education, science and technological development of the republic of serbia.

references
[1] b. koprivica, m. šućurović, n. jevtić, a. milovanović, "measurement of magnetic flux density of large-diameter multilayer solenoid", in proceedings of the 13th international conference on applied electromagnetics (пес 2017), niš, serbia, 2017, p. o5-2.
[2] h. gavrila, v. manescu (paltanea), g. paltanea, g. scutaru, i. peter, "new trends in energy efficient electrical machines", procedia engineering, vol. 181, pp. 568-574, 2017.
[3] f. béron, g. soares, k. r. pirota, "first-order reversal curves acquired by a high precision ac induction magnetometer", review of scientific instruments, vol. 82, no. 6, p. 063904, june 2011.
[4] m. rivas, p. gorria, c. muñoz-gómez, j.c.
martínez-garcía, "quasistatic ac forc measurements for soft magnetic materials and their differential interpretation", ieee transactions on magnetics, vol. 53, no. 11, p. 2003606, nov. 2017.
[5] k.l. kaiser, electromagnetic compatibility handbook, crc press, new york, usa, 2004.
[6] iec 60404-2 edition 3.1, "magnetic materials – part 2: methods of measurement of the magnetic properties of electrical steel strip and sheet by means of an epstein frame", iec, geneva, switzerland, june 2008.
[7] national instruments, "ni 6009 low-cost, bus-powered multifunction daq for usb", austin, usa, 2014.
[8] s. zurek, t. kutrowski, a.j. moses, p. anderson, "measurements at very low flux density and power frequencies", journal of electrical engineering, vol. 59, no. 7/s, pp. 7-10, 2008.
[9] ss49e linear hall effect sensor. available at: https://www.addicore.com/ss49e-linear-hall-sensorp/ad316.htm.
[10] https://www.originlab.com/doc/origin-help/lr-algorithm#adj._r-square.
[11] s. zurek, characterisation of soft magnetic materials under rotational magnetisation, crc press, london, uk, 2018.
[12] jcgm 100, evaluation of measurement data, guide to the expression of uncertainty in measurement, joint committee for guides in metrology, 2008.
[13] national instruments, ni 9205, datasheet, 2015.

facta universitatis series: electronics and energetics vol. 32, no 1, march 2019, pp. 105-118 https://doi.org/10.2298/fuee1901105k

parallel overloaded cdma crossbar for network on chip

ashok kumar k, dananjayan p
department of ece, pondicherry engineering college, puducherry, india

abstract. the code division multiple access (cdma) technique has recently been used to obtain high performance in network on chip (noc), owing to its fixed communication delay, reduced area utilisation and low power consumption. the cdma system uses walsh based spreading code which improves the bandwidth efficiency. however, it is not effective when the number of nodes present in the system increases.
overloaded cdma (ocdma) is presented for such large network systems. in this paper, the ocdma crossbar is modified and advanced with parallel encoding and decoding operation using orthogonal gold codes to improve the speed of the crossbar, thereby obtaining high performance in the noc switch. a modified crossbar consisting of extra processing elements is used to enhance the performance of the noc based system on chip (soc). this work is simulated on the xilinx tool and implemented in a virtex-6 (xc6vlx760) field programmable gate array (fpga) device. the proposed work is implemented for four ports, eight ports and sixteen ports with the deterministic x-y routing algorithm in a 3×3 noc design with mesh topology. this noc switch shows 9.79% improvement in delay and 20.76% improvement in power consumption when compared to the existing cdma nocs for an 8 bit data packet.

key words: cdma, gold code, noc, arbiter, fifo buffer, fpga.

received april 24, 2018; received in revised form july 17, 2018
corresponding author: ashok kumar k, department of ece, pondicherry engineering college, puducherry, india (e-mail: kashok483@gmail.com)

1. introduction

as the end user requirements have increased, integrated circuits have scaled down over the past few decades. according to itrs [1], communication issues have evolved due to the down scaling of technology. existing communication schemes such as bus, shared bus and point to point interconnects, which achieve high performance in chip multiprocessors (cmp), have an inherent drawback when sharing the resources [2]. hence these protocols do not meet the performance requirements of system on chip (soc) [3]. network on chip (noc) is a scalable communication paradigm which provides high performance in cmp with the aid of parallel processors. however, when the number of processors increases, the design of noc becomes complicated and affects the communication latency, area occupancy and power consumption. a conventional noc switch has five input ports and five output ports (four directional and one local) and a crossbar with a control module (arbiter). the four bi-directional ports are connected with neighboring switches for transfer of data between the source and destination [4]. the local port serves the processing element (pe) and is responsible for communication between the port and the crossbar. this paper proposes a new method for noc with fixed latency, reduced system cost and power consumption.

recently, cdma has been used to transfer data between the input and output in the noc switch [5]. fig. 1 depicts the structure of the cdma noc switch. the noc switch has neither a fixed design nor a standard protocol and hence can be designed flexibly to meet the user requirements. the proposed method is implemented for a noc switch using a cdma crossbar with 2-d mesh topology, and a 3×3 noc is designed with the deterministic x-y routing algorithm. the routing of data initially searches along the x-direction of the destination router and then proceeds in the y-direction. depending on the destination availability, the distance between the source and destination switch is calculated and transferred to the neighboring switches. since each port has a fifo buffer, store and forward packet switching [6] is used for the proposed work.

the crossbar is the key module of the noc switch as it affects the switch performance and provides multiple access for the data packets. the primary multiple access technique, time division multiple access (tdma), is simple but not efficient for cmp. in tdma only one port sends a data packet at a time, leaving the other ports to wait until it releases the physical link, thereby increasing the packet latency. in space division multiple access (sdma), a dedicated path is created between the ports. cdma is another traditional multiple access technique where the spreading code enables the medium access sharing.
this method provides error-free data in cmp and reduces the multiple access interference (mai) by appropriately selecting the spreading code sequence with low cross correlation. the performance of cdma depends on its spreading code and hence choosing the sequence is crucial. recently, overloaded cdma has emerged as the most suitable medium sharing technique for cmp, as it increases the performance of the classical cdma crossbar with more available spreading codes. most of the cdma systems use walsh codes, but these codes are suitable only for noc systems with fewer processors. a walsh code generator provides n sequences, out of which only n−1 sequences can be used for spreading. on the other hand, the orthogonal gold codes are of much use for noc systems with more pes.

fig. 1 noc switch architecture with cdma crossbar of n input and n output ports

the rest of the paper is organized as follows. section 2 discusses the related work on cdma interconnects. section 3 describes the classical cdma operation with mathematical expressions. section 4 presents the generation of orthogonal gold codes. section 5 presents the noc router with parallel ocdma encoder and decoder. section 6 shows the implementation of the ocdma system, and finally, the conclusion is presented in section 7.

2. related work

recently, the cdma technique has been favored for the crossbar of the noc switch because of its fixed latency and reduced system cost. kim et al. [7] proposed and implemented a walsh based cdma crossbar. this walsh based cdma gave suitable results for the noc switch in terms of throughput and latency. a star-mesh based noc switch is suggested to control large systems which have seven resources connected to the local switch, with each local switch linked to the central switch. wang et al.
[8] proposed a cdma technique for both synchronous and asynchronous systems, such as the globally asynchronous locally synchronous (gals) scheme. a 6-node noc was simulated and the results were compared with a ptp noc. kim et al. [9] proposed the source synchronous cdma interconnect (sscdma-i), thereby reducing the system overhead compared to a tdma bus. nikolic et al. [10] presented two types of bus wrappers, i.e. a master wrapper along with an arbiter module and a slave wrapper with peripheral modules, for a cdma based shared bus architecture. the transaction delay was reduced by bundling the different connections as single, two and four to reduce the parallel lines. halak et al. [11] introduced dynamic assignment of spreading codes for cdma users and developed a novel cdma protocol (d protocol) for dynamic assignment. two different architectures were proposed for cdma: a serial cdma implementation, where the data chips from all users are arithmetically summed according to their bit position, and a parallel cdma implementation, where the data bits are transferred in parallel in the same cycle. the serial and parallel implementation schemes were compared with traditional cdma, mesh based noc and tdma bus, and it was observed that the clock frequency was improved for the parallel cdma implementation. wang et al. [12] preferred standard basis (sb) codes in place of walsh based codes. the sb method duplicates the tdma technique, as each spreading code consists of only a single chip of one while the remaining chips are zeros. this method further decreases the latency and maximizes the throughput of noc. ahmed et al. [13] presented the overloaded cdma crossbar interconnect to improve the performance of noc. two different types of overloaded cdma interconnect (oci) have been suggested, i.e. the tdma overloaded cdma interconnect (t-oci) and the parallel overloaded cdma interconnect (p-oci), and compared with bus wrappers [10] and the parallel cdma implementation [11].
by combining p-oci and t-oci, the speed of the cdma crossbar is improved, whereas the overall system gets complicated in terms of area utilization. to improve the results of ocdma, this paper proposes an encoder and decoder operated in parallel. by advancing the existing work, this paper provides better results for noc with a cdma crossbar. to the best of the authors' knowledge, this paper is the first to investigate an ocdma crossbar with orthogonal gold codes.

3. classical cdma

as cdma provides the same bandwidth for all users, it is more popular than tdma or sdma. among the various spread spectrum techniques in the literature, direct sequence spread spectrum (dsss) is the dominant method for multiple access. dsss-cdma is a method of multiplexing using unique high-frequency spreading codes. two types of spreading codes used for cdma are the orthogonal codes and the non-orthogonal codes. the orthogonal walsh-hadamard code is frequently used in cdma systems as the cross-correlation is zero and the impulse autocorrelation property is unity. pn sequence, gold code and kasami code are a few of the non-orthogonal spreading codes in use. these codes are used in encoding and decoding of the original data and in protecting it from interference in cdma.

in the cdma encoder, the input data signal is applied to the modulator with a unique spreading code, and the modulated signals are added arithmetically before transmission. the encoded signal is transmitted through the channel and received in the decoder module. the decoder demodulates the encoded signal with the same unique spreading code. the encoded multi-bit sum is accumulated either in the positive accumulator register (if the spreading code bit is 0) or in the negative accumulator register (if the spreading code bit is 1), and these accumulated values are sent to the comparison module. after comparison, if the positive accumulator value is higher, the transmitted data signal is 1; otherwise the signal is 0.
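the encoding and decoding rule described above can be illustrated with a minimal sketch. it assumes walsh codes built by the sylvester construction and represented as 0/1 bits, and it leaves out the all-zero code because the comparison of the two accumulators relies on a balanced code. the function names and the data values are hypothetical.

```python
def walsh_codes(n):
    """Return n Walsh codes of length n (n a power of two) as lists of 0/1 bits."""
    codes = [[0]]
    while len(codes) < n:
        # Sylvester step: [H H; H not(H)] written with bits instead of +/-1
        codes = [c + c for c in codes] + [c + [b ^ 1 for b in c] for c in codes]
    return codes


def encode(data_bits, codes):
    """XOR each sender's data bit with its code and sum the chips per position."""
    length = len(codes[0])
    return [sum(d ^ c[i] for d, c in zip(data_bits, codes)) for i in range(length)]


def decode(sums, code):
    """Accumulate the sums by code bit (0: positive, 1: negative) and compare."""
    positive = sum(s for s, c in zip(sums, code) if c == 0)
    negative = sum(s for s, c in zip(sums, code) if c == 1)
    return 1 if positive > negative else 0


codes = walsh_codes(8)[1:]     # drop the all-zero code, which is not balanced
data = [1, 0, 1, 1, 0, 0, 1]   # one data bit per sender
sums = encode(data, codes)
decoded = [decode(sums, c) for c in codes]
print(sums, decoded)
```

with seven senders each chip sum stays between 0 and 7, and every receiver recovers its own bit using only its own code.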
unique spreading codes are assigned to each user to avoid multiple access interference. the different spreading code protocols are reviewed and analyzed in [8]. among the several protocols, the transmitter based protocol (t protocol) gives better performance by assigning a unique spreading code to the cdma system. table 1 describes the conventional cdma operation with suitable notation.

table 1 definition of notations

| notation | description |
|---|---|
| $d_{ij}$ | data bit of jth sender for ith code sequence |
| $c_{ij}$ | orthogonal code ith sequence for jth sender |
| $e_{ij}$ | encoded chip value of ith code and jth sender |
| $s_i$ | arithmetic sum of ith code sequence |
| $pr_i$ | positive register value for ith value of code sequence |
| $nr_k$ | negative register value for kth value of code sequence |
| n | number of code sequences |

the data sent by each sender is xored with the unique code to generate the chip. these chips are added arithmetically to get the multi-bit sum. this multi-bit sum is sent to the decoder for reconstruction of the original data. the encoding process is shown mathematically in the equations below:

$$e_{ij} = d_{ij} \oplus c_{ij}, \tag{1}$$

$$s_i = \sum_{j} e_{ij}, \tag{2}$$

where $\oplus$ means the xor operation. the decoding process is expressed mathematically as below:

$$pr_i = s_i \ \text{if}\ c_i = 0 \quad \text{and} \quad nr_k = s_k \ \text{if}\ c_k = 1, \tag{3}$$

$$pac = \sum_{i} pr_i \quad \text{and} \quad nac = \sum_{k} nr_k, \tag{4}$$

where pr and nr are the positive and negative registers, and pac and nac are the positive register with an accumulator and the negative register with an accumulator.

problem statement

cdma, through its multiple access technology, enables a number of transmitters to transfer data simultaneously to a number of receivers. for efficient data transfer, spreading sequences must satisfy the orthogonality, balance and run-length properties. though an orthogonal sequence like the walsh code is used for improving bandwidth utilization, its code utilization is low, the cross correlation between some shifts is not zero and the total delay for generating the code is not fixed.
a 16 node cdma needs a 32-bit walsh code, as a 16-bit walsh code provides only 15 orthogonal sequences, and hence it leads to wastage of code sequences in the system [12]. the proposed work provides a suitable solution for this problem. the contributions of the proposed work are as follows:
(i) implementing cdma with orthogonal gold codes to increase the code utilisation and to reduce the mai.
(ii) modifying and advancing a novel approach of the ocdma system and adding extra pes to the router for improving the performance of noc [13].
(iii) simulating the proposed design in xilinx software for synthesis and comparison of the results with existing work.

4. generation of spreading code for cdma system

as described above, the walsh code is not suitable for high data rate systems. therefore, the orthogonal gold code is implemented instead of the walsh code. the gold code is generated through proper selection of pn-sequences used as initial values for linear feedback shift registers (lfsr) [14]. fig. 2 describes the generation of the gold code with a proper m-sequence. a gold sequence generator gives 2^n+1 codes with a 2^n−1 bit sequence. experiments show that orthogonal gold codes are obtained by affixing "0" to the non-orthogonal gold sequence. further, there is no wastage of code sequences in the gold code set and the sequences are utilized efficiently. hence, orthogonal gold sequences are suitable for noc systems with a large number of nodes (ports) whose data width is 16, 32, and 64 bits. with regard to ber, orthogonal gold codes provide similar performance compared to walsh codes.

fig. 2 generation of gold code sequence with correct selection of pn-sequences

5. noc router with parallel ocdma crossbar

each pe of the noc is connected with a network interface (ni), i.e. either a transmit ni or a receive ni. the input port consists of a fifo and a finite state machine (fsm) controller, and the fsm controller directs the data packets based on the fifo [16].
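the construction of orthogonal gold codes by affixing a "0" can be sketched as follows. this is an illustrative sketch only: the lfsr feedback taps (corresponding to x^3+x^2+1 and x^3+x+1), the all-ones initial state and the function names are assumptions made here, since the text does not state which polynomials or seeds were used. the family below contains one m-sequence together with its xor with every cyclic shift of the second m-sequence.

```python
def lfsr_sequence(taps, degree, length):
    """Bits of the recurrence a[t + degree] = XOR of a[t + k] for k in taps."""
    seq = [1] * degree  # any non-zero initial state works; all ones is assumed
    while len(seq) < length:
        t = len(seq) - degree
        bit = 0
        for k in taps:
            bit ^= seq[t + k]
        seq.append(bit)
    return seq[:length]


def orthogonal_gold_codes(degree, taps_a, taps_b):
    """Return 2**degree codes of length 2**degree with a trailing 0 appended."""
    period = 2 ** degree - 1
    a = lfsr_sequence(taps_a, degree, period)
    b = lfsr_sequence(taps_b, degree, period)
    family = [a]
    for shift in range(period):
        shifted_b = b[shift:] + b[:shift]
        family.append([x ^ y for x, y in zip(a, shifted_b)])
    return [code + [0] for code in family]


def hamming(x, y):
    """Number of positions in which two codes differ."""
    return sum(p != q for p, q in zip(x, y))


codes = orthogonal_gold_codes(3, taps_a=(0, 2), taps_b=(0, 1))
for code in codes:
    print("".join(str(bit) for bit in code))
```

two binary codes of length l are orthogonal when they differ in exactly l/2 positions, which is what `hamming` can be used to verify for every pair of the eight codes.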
the data is divided into packets before being transferred to the fifo from the transmit ni. the size of the fifo is decided by the width of the data packet. the distributed round robin arbiter provides grants for the packets which are ready to transmit from the fifo. the arbiter selects the input port and output port based on the fifo memories. the transmit ni asserts the request to the arbiter, and then, depending on the fifo memory, the arbiter provides the grants in round robin fashion. hence only one data packet is sent to the cdma system, thereby avoiding conflict between data packets during transmission. from the ni, the data packets are sent to the parallel to serial converter (pts) module, and the pts provides the data packets serially to the encoder of cdma. then, the serialized data is encoded with the spreading code and the bits are summed arithmetically to form the multi-bit sum. this multi-bit sum is sent to the decoder module, where the original data bits are reconstructed based on the decoder logic, and these serialized data are then forwarded to the serial to parallel converter (stp) module. the stp converts the data bits into a data packet again, and the data packets are sent to the receive ni of the output port. finally, the receive ni sends the data packets to the fifo port. store and forward packet switching suits the proposed noc system as the ports use fifos. similarly, ocdma replaces the crossbar of the noc switch configuration. the deterministic x-y routing protocol is used for data transfer from the source noc switch to the destination switch as it is straightforward, suits the 2-d mesh design and is free from deadlock. the control block, which contains the arbiter, handles the spreading sequence assignment and provides data transaction permission for the winning ports. the concept of the overloaded cdma system is implemented in wireless communication networks for increasing the number of transmitting and receiving ports without increasing the system complexity [13].
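as an illustration of deterministic x-y routing on a 2-d mesh, the sketch below lists the switches visited by a packet that first travels along the x-direction and then along the y-direction. the coordinate convention and the function name are assumptions made for this example.

```python
def xy_route(src, dst):
    """Return the list of (x, y) switches visited from src to dst.

    The packet first moves along x until the destination column is reached,
    then along y; the order is fixed, so the route is deterministic.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


# Route across a 3x3 mesh from switch (0, 0) to switch (2, 1)
print(xy_route((0, 0), (2, 1)))
```

because no packet ever turns from the y-direction back into the x-direction, cyclic channel dependencies cannot form, which is the usual argument for the deadlock freedom mentioned above.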
the difference between ocdma technology and standard cdma is in terms of code length, i.e. l>n1, where l is the code length for ocdma and n1 is the code length for classical cdma. ocdma facilitates multi-bit port transmission with minimal changes to the traditional cdma system. hence, the ocdma system needs a long sequence generator such as the gold code generator.

the proposed work is implemented for a noc switch with cdma without increasing the system complexity, with fixed latency and limited system cost. to improve the bandwidth and reduce the area overhead, an extra pe is connected to each noc switch, which reduces the requirement for more switches and also reduces the per-pe area overhead. however, increasing the number of pes per noc switch increases the communication requests, thereby increasing the inter-communication links [15]. a modification of the standard cdma system is required to achieve these objectives. the total encoding or decoding time of cdma depends on the spreading code length, which equals the number of clock cycles for one data transaction: the completion of a single transaction requires n clock cycles, which are also synchronized with the counter.

fig. 3 n input and n output ports of noc switch with ocdma crossbar

building blocks of ocdma crossbar

the ocdma crossbar is composed of three main components: (i) encoder, (ii) decoder and (iii) control block. these modules are shown in fig. 3 along with the components of the noc. the control block mainly controls the data transmission in terms of selection of the proper input port, assignment of the code sequence and the counter for measuring the clock cycles.

1) encoder module

the encoding operation is the same as in conventional cdma, but the data is encoded bit-wise in a parallel manner. the multi-bit sum of data is transferred to the decoder module in parallel [11]; therefore one clock cycle is sufficient to complete the encoding of one bit from the nodes.
the data chips from the ports are xored and added simultaneously; hence the proposed encoder requires fewer clock cycles to complete the encoding process than the standard cdma. fig. 4 shows the parallel encoding method for ocdma with orthogonal gold code. the nodes send the data bits serially to the encoder block, and then the multi-bit sum is sent in parallel to the decoder block. the cdma requires a total of 24 bits to transfer original data of 8 bits because the multi-bit sum of each bit requires 3 bits when it is added arithmetically.

fig. 4 parallel process of encoding in ocdma crossbar

2) decoder module

fig. 5 describes the parallel decoding process of ocdma. the parallel multi-bit sum is received by the decoder module through the channel and the encoded sum value first reaches the de-multiplexer stage. the encoded data bit is sent to the positive register (if the spreading code is zero) or the negative register (if the spreading code is one), and then the values of both registers are accumulated. finally, the positive accumulated values and negative accumulated values are sent to the comparison module. the original data bit is 1 if the pac is higher; otherwise the original data bit is 0. these registers are usually of length n/2 because of the balance property of the orthogonal spreading code. therefore, both registers are of the same length, which is half of the spreading code length. the decoding process is executed in parallel for each spreading code of the multi-bit sum.

fig. 5 parallel process of decoder architecture of ocdma crossbar

3) control block

at the initial stage of data transfer, the control block provides spreading code sequences for the transmitter and then the transmitter transfers the code to the receiver. the arbiter eliminates the congestion and provides the grant signal to the input port for transferring the data to the crossbar in round robin fashion [15].
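the round robin granting described for the arbiter can be sketched in a few lines. this is a plain single round robin arbiter given for illustration, not the distributed arbiter of the proposed switch; the class and attribute names are hypothetical.

```python
class RoundRobinArbiter:
    """Grant one requesting port per call, rotating the priority after each grant."""

    def __init__(self, n_ports):
        self.n_ports = n_ports
        self.next_port = 0  # port with the highest priority in the next round

    def grant(self, requests):
        """Return the index of the granted port, or None if nobody requests."""
        for offset in range(self.n_ports):
            port = (self.next_port + offset) % self.n_ports
            if requests[port]:
                # The granted port gets the lowest priority next time
                self.next_port = (port + 1) % self.n_ports
                return port
        return None


arbiter = RoundRobinArbiter(4)
requests = [1, 0, 1, 1]  # ports 0, 2 and 3 have packets waiting in their fifos
grants = [arbiter.grant(requests) for _ in range(4)]
print(grants)
```

rotating the priority pointer after each grant prevents a busy port from starving the others, which is the property the control block relies on when several fifos hold packets at the same time.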
the counter within the arbiter module initializes the spreading sequences for all the senders. the control block sends the handshake signals to verify the codes of the corresponding encoder and decoder. the code pool assigns a unique spreading code to each transmitter when it receives a request from the arbiter module. fig. 6 describes the encoding and decoding process of the 8 bit orthogonal gold code. the sender sends the data bit serially and orthogonal codes are assigned to each sender by the gold code generator. the data bit is xored with the code bit in parallel and the encoded first bit of each sender is sent to the decoder section. the first bit of the code for each sender is zero, but the multi-bit sum of the first encoded data is four after xoring with the data bits. this process continues for each sender and the multi-bit sum is calculated for each encoded bit. the multi-bit sum is sent to the accumulators depending on the code bit value. for decoding of the first bit, the positive accumulator register is more than the negative accumulator register, hence the data bit is re-constructed as zero. the process continues until the system gets the 8-bit original data.

6. implementation

the simulation and synthesis results are presented in terms of area, delay and power consumption for the parallel ocdma crossbar of the noc switch with 2 pes. the proposed work is simulated in xilinx software and implemented on a virtex-6 (xc6vlx760) fpga. the implementation of the noc switch is carried out using different ocdma crossbars with spreading code lengths n = {4, 8, 16} and a comparison is also provided with existing noc switches. the parameters used for simulation of the noc switch are tabulated in table 2.
table 2 simulation parameters

| simulation parameter | value |
|---|---|
| topology | 2d mesh |
| arbiter | distributed round robin |
| switching | store and forward |
| routing algorithm | minimal adaptive |
| crossbar | ocdma |
| data packet length | 8 bit |
| buffer | yes |
| simulator | riviera-pro |
| traffic scenario | uniform random |
| traffic distribution | poisson |

fig. 6 transmission and reception of ocdma with 8 orthogonal gold codes

the performance metrics considered are area utilization (slice registers, slice luts and lut-ff pairs), maximum clock frequency (delay) and power consumption (dynamic power). the encoder and decoder of ocdma are implemented individually and applied to the crossbar of the noc switch. fig. 7 shows the implementation results for 4, 8 and 16 nodes of the ocdma crossbar for the noc switch. from fig. 7 (a), it is evident that the area utilization increases with the number of bits because the noc switch requires a larger architecture for transmission and reception of the data packet. fig. 7 (b) shows that the maximum clock frequency decreases with increasing data packets because of the converters (stp and pts) present in the ni. the power consumption of this noc switch increases with its data packet because the transition activity is higher when the data passes through the stp/pts blocks; hence the dynamic power consumption also increases, as shown in fig. 7 (c). the throughput is calculated from the following quantities as given in (5): nc is the number of required clock cycles, nbpp is the number of bits in a packet, npe is the number of received packets at the pe and tc is the clock period for complete data transmission. from fig. 7 (d), it is inferred that the throughput increases with increasing data because more pes are receiving data packets within the specified clock period.
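the expression for the throughput is not reproduced in the text above. one form that is dimensionally consistent with the four listed quantities (bits delivered divided by the time taken) is given below purely as an assumption; the symbol names are chosen here and the expression used in the original work may differ.

```latex
% assumed form of the throughput, in bits per second:
% (received packets x bits per packet) / (clock cycles x clock period)
\[
  \text{throughput} \;=\; \frac{n_{pe}\, n_{bpp}}{n_{c}\, t_{c}}
\]
```

with this form, receiving more packets within the same number of clock cycles raises the throughput, in line with the trend described for fig. 7 (d).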
to improve the bandwidth and reduce the area overhead, an extra pe is connected to each noc switch, which reduces the number of switches required and also reduces the per-pe area overhead.

fig. 7 (a-d) implementation results in terms of area utilization, maximum clock frequency, power consumption and throughput for noc switch with ocdma crossbar of 4, 8, 16 nodes

table 3 shows the comparison results for different 8-bit cdma crossbars of the noc switch in terms of area utilization (lut-ff pairs), delay (ns) and power consumption (mw), all implemented on a virtex-6 fpga device. the proposed work provides better results than wb-cdma [7], sb-cdma [12] and ocdma [13] because the encoding and decoding processes are executed in parallel. the parallel ocdma crossbar switch requires less area because orthogonal gold codes are used for spreading and because the selector (multiplexer) and the additional non-orthogonal sequence generator of ocdma [13] are eliminated. even though the number of spreading codes is larger, the area overhead of parallel ocdma is lower than that of ocdma [13] because of pe clustering, which reduces the number of switches required for complete data transfer from source port to destination port.

table 3 comparison of parallel ocdma with existing cdma crossbars for an 8-bit data packet per switch

cdma (8-node)     area (no. of lut-ff)   delay (ns)   power consumption (mw)
wb-cdma [7]       782                    2.82         17.53
sb-cdma [12]      684                    2.71         15.21
ocdma [13]        692                    2.96         11.46
parallel ocdma    663                    2.673        9.08

fig. 8 (a-d) shows the comparison of parallel ocdma with ocdma [13] for different numbers of nodes. from the figure, it is inferred that the performance of parallel ocdma is improved compared to the existing work because of efficient code utilization and pe clustering. this crossbar switch shows a 9.79% improvement in delay over ocdma [13] with minor modifications to the simple cdma operation.
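the relative gains implied by table 3 can be recomputed directly. the short python sketch below does this for parallel ocdma against ocdma [13]; the helper name is hypothetical and the values are copied from the table. with the tabulated delay of 2.673 ns the delay gain comes out at about 9.7%, so the quoted 9.79% presumably corresponds to a delay rounded to 2.67 ns, which is only an assumption.

```python
def improvement_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


# Table 3 values: area in LUT-FF pairs, delay in ns, power in mW
ocdma_13 = {"area": 692, "delay": 2.96, "power": 11.46}
parallel_ocdma = {"area": 663, "delay": 2.673, "power": 9.08}

gains = {metric: improvement_percent(ocdma_13[metric], parallel_ocdma[metric])
         for metric in ocdma_13}

for metric, gain in gains.items():
    print(f"{metric}: {gain:.2f}% lower than ocdma [13]")
```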
the major power consumption is due to buffers in the bi-directional ports. in the proposed method, however, the encoder and decoder modules are placed in the ni of the noc, hence the buffers are not operated while the data packets are being encoded and decoded. consequently, a 20.76% improvement in power consumption over ocdma [13] is obtained.

fig. 8 (a-d) comparison of ocdma [13] with parallel ocdma for different numbers of nodes in terms of area utilization, clock frequency, power consumption and throughput

the parallel ocdma noc switch with 2 pes is extended to a 3 × 3 mesh-based noc system which consists of 9 bi-directional routers and 18 pes. for analyzing the packet latency and throughput, the mesh-based noc is simulated in riviera-pro for windows. a number of experiments are conducted with a uniform-random traffic pattern to observe these performance metrics. the data packet latency (in clock cycles) of the noc with 2 pes per switch is obtained and compared with that of the noc switch with a single pe, as shown in fig. 9(a). from this figure, it is evident that the proposed work shows reduced packet latency because of the parallel processing of the encoder and decoder during transmission of the data packet. the throughput of the noc switch with 2 pes is also obtained and compared with that of the noc switch with a single pe. from the figure, it is understood that as the number of pes increases, the latency and throughput performance improve.

fig. 9 simulation results for network latency with injection load (a) and throughput with injection load (b) in uniform-random traffic pattern

7. conclusion

this paper proposed an overloaded cdma crossbar for noc with parallel encoding and decoding, in which walsh codes are replaced by orthogonal gold codes. the parallel encoder and decoder transfer the data in the same clock cycle, hence the performance of the proposed ocdma crossbar is increased.
the results are improved with respect to latency, area usage and power consumption when compared with the existing cdma crossbars. the parallel ocdma crossbar switch showed a 9.79% decrease in delay and a 20.76% improvement in power consumption relative to ocdma [13]. in future work, the noc switch will be presented with different fault routing algorithms for handling permanent and transient faults.

references

[1] international technology roadmap for semiconductors 2012 (www.itrs.net).
[2] m. c. chiang, g. s. sohi, “evaluating design choices for shared bus multiprocessors in a throughput oriented environment,” ieee transactions on computers, vol. 41, no. 3, pp. 297-317, march 1992.
[3] d. sigüenza-tortosa, t. ahonen, and j. nurmi, “issues in the development of a practical noc: the proteo concept,” integration, the vlsi journal, vol. 38, no. 1, pp. 95-105, october 2004.
[4] t. bjerregaard and s. mahadevan, “a survey of research and practices of network-on-chip,” acm computing surveys, vol. 38, no. 1, pp. 1-50, march 2006.
[5] s. a. hosseini, o. javidbakht, p. pad, and f. marvasti, “a review on synchronous cdma systems: optimum overloaded codes, channel capacity, and power control,” eurasip journal on wireless communications and networking, vol. 1, pp. 1-22, december 2011.
[6] l. benini and d. bertozzi, “xpipes: a network-on-chip architecture for gigascale systems-on-chip,” ieee circuits and systems magazine, vol. 4, no. 2, pp. 18-31, september 2005.
[7] d. kim, m. kim, and g. e. sobelman, “cdma-based network-on-chip architecture,” in proceedings of the ieee asia-pacific conference on circuits and systems, december 2004, pp. 137-140.
[8] x. wang, t. ahonen, and j. nurmi, “applying cdma technique to network-on-chip,” ieee transactions on very large scale integration systems, vol. 15, no. 10, pp. 1091-1100, october 2007.
[9] j. kim, i. verbauwhede, and m.-c. f. chang, “design of an interconnect architecture and signaling technology for parallelism in communication,” ieee transactions on very large scale integration systems, vol. 15, no. 8, pp. 881-894, august 2007.
[10] t. nikolic, m. stojcev, and g. djordjevic, “cdma bus-based on-chip interconnect infrastructure,” microelectronics reliability, vol. 49, no. 4, pp. 448-459, april 2009.
[11] b. halak, t. ma, and x. wei, “a dynamic cdma network for multicore systems,” microelectronics journal, vol. 45, no. 4, pp. 424-434, april 2014.
[12] j. wang, z. lu and y. li, “a new cdma encoding/decoding method for on-chip communication network,” ieee transactions on very large scale integration systems, vol. 24, no. 4, pp. 1607-1611, april 2016.
[13] k. e. ahmed, r. rizk and m. m. farag, “overloaded cdma crossbar for network on chip,” ieee transactions on very large scale integration systems, vol. 25, no. 6, pp. 1842-1855, january 2017.
[14] l. hanzo and t. keller, “ofdm and mc-cdma: a primer,” john wiley & sons, ltd., isbn: 0470-03007-0, 2006.
[15] r. kumar and a. gordon-ross, “macs: a highly customizable low-latency communication architecture,” ieee transactions on parallel and distributed systems, vol. 27, no. 1, pp. 237-249, january 2016.
[16] a. k. k, p. d., “a survey for silicon on chip communication,” indian journal of science and technology, vol. 10, no. 1, january 2017.

facta universitatis series: electronics and energetics vol. 32, no 1, march 2019, pp. 91-104
https://doi.org/10.2298/fuee1901091i

characteristics of curcumin dye used as a sensitizer in dye-sensitized solar cells

stefan ilić, vesna paunović
university of niš, faculty of electronic engineering, niš, serbia

abstract. dye-sensitized solar cells are the closest mankind has come to replicating nature’s photosynthesis. the type of dye influences the efficiency of these cells.
in this paper we studied curcumin dye as a sensitizer in dye-sensitized solar cells and compared it with the most often used cyanidin. the results have shown that curcumin has higher efficiency and higher absorption in the visible part of the spectrum than cyanidin. in the simulation models, the dye molecules, curcumin and cyanidin, are deprotonated upon adsorption on the titanium dioxide surface. the energy levels obtained from the calculation indicate a higher probability of electron transition from the molecule to the titanium dioxide surface in the case of curcumin than in the case of cyanidin. based on these results, we concluded that curcumin dye has better properties as a sensitizer in dye-sensitized solar cells.

key words: solar cells, curcumin, cyanidin, titanium dioxide, density functional theory, voltage-controlled resistance

1. introduction

a solar cell is a renewable source of energy that directly converts visible light into electricity [2-4]. when exposed to light, the solar cell becomes a source of direct current. the operation principle of all solar cells is based on the photoelectric effect. there are first, second and third generations of solar cells. dye-sensitized solar cells (dssc) belong to the third generation. the major part of these cells is nanoparticle anatase titanium dioxide coated with dye molecules. the type of the dye and the way it is anchored to the tio2 directly affect the efficiency. ruthenium polypyridyl complexes are known as the most efficient pigments; they have achieved almost 12% efficiency [5]. however, these pigments contain a heavy metal which has an undesired environmental impact. a cheaper alternative is offered by natural pigments, such as anthocyanins, betalains, chlorophyll, etc. betalains are reported as the most efficient natural pigments, achieving efficiencies of more than 2% [6].
anthocyanins received february 28, 2018; received in revised form july 5, 2018 corresponding author: stefan ilić faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia. (e-mail: stefan.ilic@yahoo.com) * an earlier version of this paper was presented at the 61st national conference on electrical, electronic and computing engineering (etran 2017), june 5-8, 2017, in kladovo, serbia [1]. 92 s. ilić, v. paunović are very frequent in research papers that study natural pigments as sensitizers in dssc [7]. they give different sensitizing performances from various plants, absorb light at the longest wavelength and have widespread availability [8]. wongcharee et al. used extracts from rosella and blue pea flowers. solar cells sensitized by rosella (delphinidin and cyanidin) have been reported to achieve efficiency up to 0.37%, whereas extract from blue pea (ternatin) can achieve up to 0.05% [9]. tekerek et al. fabricated a solar cell also with rosella dye and compared it to black raspberry and black carrot dyes. they achieved efficiencies of 0.16%, 0.16% and 0.25%, respectively [10]. curcumin can also be a sensitizer, but it has not attracted significant research attention. kim et al. reported a dye-sensitized solar cell sensitized with curcumin dye, and showed 0.36% efficiency [11]. in this work we investigate two types of natural pigments: cyanidin extracted from raspberries and curcumin dye extracted from curcuma longa. the aim of this paper is to both experimentally and theoretically (simulation) confirm the thesis that curcumin is a better sensitizer in dye-sensitized solar cells than cyanidin. firstly, we measured current-voltage characteristics and absorption spectrum. after that, to confirm the experimental results, we simulated the models of anatase (tio2)16 cluster and cyanidin or curcumin molecule attached to it. 
our calculation is based on density functional theory (dft) and time-dependent density functional theory (tddft). calculations were carried out with the nwchem software [12].

2. operation principle of dsscs

the main idea of dye-sensitized solar cells is to separate the light absorption process from the charge collection process by using a dye sensitizer together with a semiconductor. this process imitates the natural light-harvesting procedure in photosynthesis [13]. that is why dye-sensitized solar cells are the closest mankind has ever come to replicating nature's photosynthesis. to separate these two processes, a semiconductor with a wide band gap, such as titanium dioxide (tio2), can be used. a dye-sensitized solar cell is composed of a photoactive electrode, an electrolyte and a counter electrode. the photoactive electrode is made of porous nanocrystalline anatase titanium dioxide deposited on fto (fluorine-doped tin oxide) conducting glass. the fto layer is 220 nm thick and is deposited on the glass. it enables transport of photo-generated charge carriers to the electrode, and it is also transparent so that light can penetrate into the solar cell. the dye is adsorbed on the tio2 layer to complete the photoactive electrode. the counter electrode is also fto glass, but it is coated with platinum to increase the conductivity. the space between the electrodes is filled with an electrolyte based on iodide and triiodide ions (fig. 1). when sunlight passes through the photoactive electrode, molecules of the dye absorb photons, and electrons go from the homo (highest occupied molecular orbital) in the ground state to the lumo (lowest unoccupied molecular orbital) in the excited state. some of the excited electrons have enough energy to jump to the conduction band of titanium dioxide and then diffuse to the electrode. dye molecules that have lost electrons are oxidized. the electrolyte donates electrons to replace the lost ones. after that, the iodide ions are oxidized.
electrons from the photoactive electrode flow through an external load to the counter electrode and recombine with the electrolyte, thus completing the circuit. hence, the operating mechanism of a dye-sensitized solar cell generates electricity without irreversible chemical changes in the cell. dye molecules play a key role in producing electricity. they need to compensate for the weak absorption of titanium dioxide by absorbing photons and exciting electrons, and in this way they increase the efficiency of the solar cell. thus, the greater the absorption of the dye, the more efficient the solar cell will be.

fig. 1 schematic structure and principle of operation of dssc.

3. dye sensitizers

a dye sensitizer absorbs energy in a dye-sensitized solar cell. when natural pigments are used as dye sensitizers, a major problem is their degradation during prolonged exposure to sunlight due to uv radiation. figure 2 shows the optimized molecular structures of cyanidin and curcumin.

fig. 2 optimized molecular structures of the cyanidin and curcumin.

anthocyanins are widespread water-soluble pigments that can be found in many flowers, fruits and leaves of angiosperms. they are responsible for different colours (red, blue and violet) depending on the ph value [14]. they have found a new application in dye-sensitized solar cells because they have significant absorption in the visible part of the spectrum. only organic dyes that contain several =o or -oh groups capable of chelating to tio2 (for example, cyanidin found in raspberry) can be used as dye sensitizers. curcumin is an active ingredient of turmeric (curcuma longa). turmeric is a rhizomatous herbaceous perennial plant of the ginger family. it is used as an indian spice, has a yellow colour and is known as the food additive e100. curcumin can exist in two tautomeric forms (keto in the solid state and enol in solution).
a molecule of curcumin has carbonyl and hydroxyl groups which can bind to the tio2 surface.

4. models and computational details

we used a (tio2)16 cluster to model the anatase tio2 slab. the cluster is obtained by correct “cutting” of the anatase slab (fig. 3). for proper cutting, three conditions must be fulfilled: all titanium atoms must be coordinated to at least four oxygen atoms, all oxygen atoms must be coordinated to at least two titanium atoms, and the ratio of the number of titanium and oxygen atoms in the cluster must be 1 : 2 [15]. after optimization, the band gap of the (tio2)16 cluster was 4.52 ev.

fig. 3 model of anatase (tio2)16 cluster before and after optimization.

for all calculations the freely available software nwchem was used to perform density functional theory and time-dependent density functional theory calculations, with the b3lyp functional and the 6-31g basis set. density functional theory is a powerful tool for solving many-body problems in quantum mechanics. it allows the complicated n-electron wave function and its associated schrödinger equation to be replaced by much simpler single-electron equations in which the electron density is determined. we used dft to calculate the band gap, homos and lumos for all structures and tddft to calculate the absorption spectra of the molecules.

5. fabrication of dsscs

fabrication of a dye-sensitized solar cell requires the preparation of a titanium dioxide film, extraction of natural pigments, electrolyte preparation and solar cell assembly [16].

5.1. preparation of tio2 film

a nanoparticle tio2 powder (p25 degussa) was used to prepare the films. water and acetic acid were added because they contribute to the mechanical properties of the films, i.e. good adhesion to the substrate and prevention of crack formation.
terpineol is added to prevent particle growth, and ethyl cellulose is added to achieve porosity of the films through its decomposition during thermal annealing. the films were deposited on fto glass with the doctor-blade technique. the doctor-blade technique is a process in which paste is spread over a surface with a razor blade, while scotch tape serves as a pattern which gives the deposited layer its shape and a uniform thickness of about 40 μm. square films with dimensions of 5 × 5 mm were made, with an initial thickness of 40 μm and a final thickness of 10-11 μm after drying and thermal annealing. after the deposition, the films were left at room temperature for a few minutes, after which each film was calcined according to the following procedure: 120°c for 10 min, 250°c for 10 min, 400°c for 10 min, 450°c for 5 min and finally 500°c for 15 min, similar to the procedure presented elsewhere [17].

5.2. natural dyes preparation and photoactive electrode formation

anthocyanins were extracted from frozen raspberries. the raspberries were crushed with a mortar and pestle until they became juicy. curcumin was extracted from commercially purchased turmeric powder. the preparation process involved dissolving 5 grams of turmeric powder in ethanol. the prepared solutions were stored at room temperature in a dark place to prevent photodegradation. photoactive electrodes were made by soaking the fto glasses with the tio2 layer in crushed raspberries or in the turmeric solution. they can stay immersed from several minutes to several hours, while dye molecules from the raspberries and turmeric naturally adsorb onto the titania particles. the tio2 layer adsorbs more dye molecules the longer it stays immersed [18]. the films were pre-warmed to 80°c during staining to prevent unwanted binding of moisture from the air to tio2. figure 4 shows the appearance of a photoactive electrode after each step, in chronological order.

fig. 4 fto glass with deposited tio2 layer (a), finalized photoactive electrode stained with raspberry (b) and finished solar cell stained with curcuma longa (c).

5.3. preparation of electrolyte

the electrolyte was prepared by dissolving 1.66 g of lithium iodide (approx. 60 mm lii) and 0.254 g of iodine (approx. 0.5 mm i2) in 20 ml of ethylene glycol at 50°c with stirring. the preparation of the iodine-based electrolyte was chosen based on reported procedures [9, 19].

5.4. solar cell assembly

after photoactive electrode formation, the films were washed carefully with ethanol and distilled water. after drying with warm air, they were coupled with the counter electrode and fastened with clips.

fig. 5 dye-sensitized solar cell assembly.

a transparent platinum electrode was prepared by doctor-blade deposition of commercially available platinum paste (platisol t/sp, solaronix) on fto glass. the counter electrode was then thermally annealed at 450°c for 30 minutes. figure 5 shows a schematic representation of the cross-section of the solar cell. after coupling the electrodes, the pressure of the clips is slightly reduced and the electrolyte is added between the electrodes with a needle and syringe, which completes the solar cell assembly (fig. 4).

6. measurement of current-voltage characteristics

when measuring the current-voltage characteristics of a solar cell, it is necessary to measure the voltage of the cell and the current passing through it for different values of resistance in the circuit while the cell is exposed to solar radiation. since a dye-sensitized solar cell gives a very weak current (microamperes or less), the current in the circuit was not measured directly with an ammeter, because this would disturb the measurement. instead, the current was determined indirectly, by measuring the resistance and voltage in the circuit.
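the indirect measurement reduces to ohm's law applied at each known load resistance. the python sketch below shows this post-processing under simple assumptions: the function names and the sample readings are hypothetical, the open-circuit voltage and short-circuit current are approximated by the largest measured voltage and current, and the default irradiance (790 w/m2) and film area (5 × 5 mm) are the values reported in this paper.

```python
def iv_curve(resistances_ohm, voltages_v):
    """Current through each known load resistance from the measured cell voltage (I = V / R)."""
    return [v / r for r, v in zip(resistances_ohm, voltages_v)]


def cell_metrics(voltages_v, currents_a, irradiance_w_m2=790.0, area_m2=25e-6):
    """Maximum power, fill factor and efficiency (as a fraction) from I-V points."""
    powers = [v * i for v, i in zip(voltages_v, currents_a)]
    p_max = max(powers)
    v_oc = max(voltages_v)   # approximation: voltage at the largest load resistance
    i_sc = max(currents_a)   # approximation: current at the smallest load resistance
    fill_factor = p_max / (v_oc * i_sc)
    efficiency = p_max / (irradiance_w_m2 * area_m2)
    return p_max, fill_factor, efficiency


# Made-up readings for illustration only (not measured data).
resistances = [1000.0, 10000.0, 100000.0]   # ohm
voltages = [0.10, 0.30, 0.40]               # volt
currents = iv_curve(resistances, voltages)
p_max, fill_factor, efficiency = cell_metrics(voltages, currents)
```

a denser sweep of resistance values, as used in the measurement, brings the approximated open-circuit voltage and short-circuit current closer to their true values.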
this is done by using a light-emitting diode and a photo-resistor facing each other in a dark, closed enclosure. in this way we obtained a so-called voltage-controlled resistance, because different voltages on the light-emitting diode give different resistances of the photo-resistor.

fig. 6 measuring equipment.

we used the multifunctional system ni usb-6008 [20]. led voltage values were applied for 1136 known resistance values of the photo-resistor (in the range of 367-250000 Ω), and then, after 10 milliseconds, the voltage of the solar cell (which was exposed to solar radiation) was measured (fig. 6). based on the known resistance and voltage in the circuit, the current is calculated. after that, the current-voltage characteristics are plotted.

7. experimental results

the analysis of the tio2 films by scanning electron microscopy confirms the presence of a developed surface and a porous structure (fig. 7).

fig. 7 sem image of the tio2 on fto glass surface (left) and its cross section (right).

the current-voltage characteristics measured for the dye-sensitized solar cells stained with curcuma longa and raspberry are shown in figure 8. all measurements were recorded at a solar radiation intensity of 790 w/m2.

fig. 8 current-voltage curve of dsscs stained with curcuma longa (black curve) and with raspberry (red curve).

the dye-sensitized solar cell stained with curcuma longa has an efficiency of 0.028% and a fill factor of 45%, while the one stained with raspberry has an efficiency of 0.017% and a fill factor of 36%. by comparing the current-voltage characteristics, we can conclude that the dye-sensitized solar cell stained with curcuma longa performs better than the one stained with raspberry. these results can be explained by the absorption spectra of curcuma longa and raspberry (fig. 9).

fig. 9 absorption spectra of curcuma longa (black) and raspberry (red).

the curcuma longa extract is active in the visible region 400-500 nm and has a peak at 429.6 nm, while the raspberry extract is active in the visible region 480-580 nm and has a peak at 544 nm, which is characteristic of anthocyanins [16]. the absorption spectra were recorded using a perkin-elmer lambda 15 uv/vis spectrophotometer. the samples did not have the same concentration in solution; turmeric has a much higher absorption than shown. for our work the most important point was to identify the absorption peaks and compare them with the simulation results.

8. simulation results

based on tddft, absorption spectra for cyanidin and curcumin were calculated (fig. 10). curcumin has an absorption peak at 420.8 nm, while cyanidin has its highest peak at 477.3 nm, which differs from the experimental result. considering that in the experiment the raspberry dye contains more than one pigment that can absorb light, the results obtained from simulation are in good agreement with the experimental values discussed above (fig. 9). curcumin has higher absorption than cyanidin, which can explain the higher efficiency of the solar cell stained with curcuma longa [21]. after optimization of the models of the cyanidin and curcumin molecules, the homo-lumo gap is 2.43 ev for cyanidin, which is in perfect agreement with the reference work [13], and 3.22 ev for curcumin.

fig. 10 absorption spectra of curcumin (black) and cyanidin (red).

a dye molecule can be anchored on the tio2 surface by a carbonyl (=o), hydroxyl (-oh) or carboxyl (-cooh) group. curcumin and cyanidin have only carbonyl and hydroxyl groups. a carboxyl group can be represented as a combination of a hydroxyl group and a carbonyl group. the adsorption modes can be bridged bidentate or monodentate. for simplicity, the adsorption modes are represented with a carboxyl group (fig. 11) [22].

fig. 11 anchoring region for bridged bidentate (a) and monodentate (b) adsorption modes. the dotted circle denotes the position of the deprotonated atom (a).

when a dye molecule binds to the titanium dioxide surface, a deprotonation process may occur. deprotonation happens when a hydrogen atom of the dye molecule is transferred to the titanium dioxide surface during anchoring. in the case of curcumin and cyanidin, the h atom is transferred from the hydroxyl group to the tio2 structure. deprotonation lowers the energy of the system. in figure 11a, we can see that the dye molecule formed a bridged bidentate adsorption after the deprotonation. note that the hydrogen atom (dotted circle) is bound to an oxygen atom of the titanium dioxide cluster. however, the dye molecule can also be adsorbed without deprotonation, as in figure 11b. in this case, a hydrogen bond may occur. of course, the hydrogen atom can also be deprotonated in this mode, which lowers the energy of the system [13].

fig. 12 optimized geometries of the cyanidin adsorbed onto the (tio2)16 model (c@tio2; the label means cyanidin anchored onto the tio2), along with their homo and lumo+7.

in our simulation, we observed three molecule/cluster systems. deprotonation was performed in each of them. dotted circles denote the positions of protons that have been transferred from the dye molecules to the (tio2)16 cluster (fig. 12, 13, 14). the figures also illustrate the homos and lumos of the molecule/cluster systems.

in the case of c@tio2 and k2@tio2 the first level above the lumo that is delocalized over the whole molecule/cluster is lumo+7 (energy -2.859 ev for c@tio2 and -3.039 ev for k2@tio2). for k1@tio2 the first such level is lumo+28, which is at a higher energy (-2.597 ev).
the excitation of electrons from the valence band to the lumo+7 and lumo+28 levels leads to direct electron injection [23] into the tio2, since these levels are delocalized along the whole system.

fig. 13 optimized geometries of the curcumin in monodentate anchoring adsorbed onto the (tio2)16 model (k1@tio2; the label means curcumin anchored onto the tio2 with one bond), along with their homo and lumo+28.

fig. 14 optimized geometries of the curcumin in bridged bidentate anchoring adsorbed onto the (tio2)16 model (k2@tio2; the label means curcumin anchored onto the tio2 with two bonds), along with their homo and lumo+7.

after optimization of the three molecule/cluster systems, the homo-lumo gaps were calculated: 2.37 ev for c@tio2, 1.93 ev for k1@tio2 and 2.34 ev for k2@tio2. we notice that the homo-lumo gaps decreased after binding the molecules onto the clusters, and that curcumin has the smallest homo-lumo gap when it is anchored onto the tio2 in the monodentate mode (k1@tio2). based on these results, an energy diagram of cyanidin, curcumin, the tio2 model and the three molecule/cluster systems was made (fig. 15). an effective dye-sensitized solar cell requires the homo of the dye molecule to reside in the tio2 band gap and its lumo to lie within the conduction band of the tio2 [13]. we noticed that the homo levels of all three molecule/cluster systems are in the band gap of the tio2, and that the lumo levels are below the cbm (conduction band minimum). the energy of the cbm is -3.835 ev. the nearest to the conduction band is the lumo level of k2@tio2 (-3.842 ev), followed by the lumo level of k1@tio2 (-3.925 ev) and finally the lumo level of c@tio2 (-4.386 ev). in all systems, all other lumo levels were found in the conduction band of the tio2 cluster.

fig. 15 schematic energy diagram of the cyanidin, curcumin, tio2 model and three molecule/cluster systems.
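the ordering of the lumo levels relative to the conduction band minimum can be checked with a few lines of python. the sketch below only uses the energies quoted above; the variable names are illustrative.

```python
CBM_EV = -3.835  # conduction band minimum of the (TiO2)16 cluster, from the text

lumo_ev = {"c@tio2": -4.386, "k1@tio2": -3.925, "k2@tio2": -3.842}

# Energy by which each LUMO lies below the CBM (positive = below the conduction band).
offset_ev = {name: round(CBM_EV - level, 3) for name, level in lumo_ev.items()}

# Systems sorted from closest to farthest from the conduction band.
ranking = sorted(offset_ev, key=offset_ev.get)

for name in ranking:
    print(f"{name}: lumo is {offset_ev[name]:.3f} ev below the cbm")
```

the offsets are 0.007 ev for k2@tio2, 0.090 ev for k1@tio2 and 0.551 ev for c@tio2, which makes the difference between the curcumin and cyanidin systems explicit.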
the results confirm that an electron has a higher probability of reaching the conduction band in the systems with curcumin than in the system with cyanidin, which indicates another reason why the solar cell with curcumin has greater efficiency.

9. conclusion

the experimental results showed that the dye-sensitized solar cell stained with curcuma longa provides greater efficiency than the dye-sensitized solar cell stained with raspberry. dft calculations showed that the lumo level of adsorbed curcumin is closer to the conduction band minimum than that of cyanidin, which indicates that an electron from curcumin has a higher probability of reaching the conduction band. we concluded that curcumin has better properties as a sensitizer than cyanidin for dye-sensitized solar cells, which is confirmed by both experimental and simulation results. it is essential to find new dye sensitizers to improve the efficiency of dye-sensitized solar cells, and curcumin could be one such sensitizer.

acknowledgement: the authors would like to thank the petnica science center and the institute of physics in belgrade for their great assistance and cooperation. the authors also gratefully acknowledge the financial support of the serbian ministry of education, science and technological development.

references

[1] s. ilić, v. paunović, “application of curcumin in dye-sensitized solar cells,” in proceedings of the extended abstracts of the 61st national conference on electrical, electronic and computing engineering (etran 2017), kladovo, serbia, june 5-8, 2017.
[2] s. abasian, r. sabbaghi-nadooshan, “introducing a novel high-efficiency arc less heterojunction dj solar cell,” facta universitatis, series: electronics and energetics, vol. 31, no. 1, pp. 89-100, 2018.
[3] m. jošt, m. topič, “efficiency limits in photovoltaics – case of single junction solar cells,” facta universitatis, series: electronics and energetics, vol. 27, no. 4, pp. 631-638, 2014.
[4] r. singh, g. alapatt, g. bedi, “why and how photovoltaics will provide cheapest electricity in the 21st century,” facta universitatis, series: electronics and energetics, vol. 27, no. 2, pp. 257-298, 2014.
[5] b. o’regan, m. gratzel, “a low-cost, high-efficiency solar cell based on dye-sensitized colloidal tio2 films,” nature, vol. 353, pp. 737-740, 1991.
[6] g. calogero, j. yum, a. sinopoli, g. di marco, m. gratzel, m. k. nazeeruddin, “anthocyanins and betalains as light-harvesting pigments for dye-sensitized solar cells,” solar energy, vol. 86, pp. 1563-1575, 2012.
[7] n. a. ludin, et al., “review on the development of natural dye photosensitizer for dye-sensitized solar cells,” renewable and sustainable energy reviews, vol. 31, pp. 386-396, 2014.
[8] m. r. narayan, “dye sensitized solar cells based on natural photosensitizers,” renewable and sustainable energy reviews, vol. 16, no. 1, pp. 208-215, 2012.
[9] k. wongcharee, v. meeyoo, s. chavadej, “dye-sensitized solar cell using natural dyes extracted from rosella and blue pea flowers,” solar energy materials and solar cells, vol. 91, no. 7, pp. 566-571, 2007.
[10] s. tekerek, a. kudret, and ü. alver, “dye-sensitized solar cells fabricated with black raspberry, black carrot and rosella juice,” indian journal of physics, vol. 85, no. 10, pp. 1469-1476, 2011.
[11] h. kim, d. kim, s.n. karthick, k.v. hemalatha, c. justin raj, sunseong ok, youngson choe, “curcumin dye extracted from curcuma longa l. used as sensitizers for efficient dye-sensitized solar cells,” int. j. electrochem. sci., vol. 8, pp. 8320-8328, 2013.
[12] m. valiev, et al., “nwchem: a comprehensive and scalable open-source solution for large scale molecular simulations,” computer physics communications, vol. 181, pp. 1477-1489, 2010.
[13] s. meng, j. ren, e. kaxiras, “natural dyes adsorbed on tio2 nanowire for photovoltaic applications: enhanced light absorption and ultrafast electron injection,” nano letters, vol. 8, no. 10, pp. 3266-3272, 2008.
[14] m. alhamed, a. isaa, w. doubal, “studying of natural dyes properties as photo-sensitizer for dye-sensitized solar cells (dssc),” journal of electron devices, vol. 16, pp. 1370-1383, 2012.
[15] p. persson, j. c. gebhardt, s. lunell, “the smallest possible nanocrystals of semiionic oxides,” the journal of physical chemistry b, vol. 107, pp. 3336-3339, 2003.
[16] i. đorđević, s. ilić, “the application of combined natural pigments in dye-sensitized solar cells,” petnica science center – selected students’ papers, vol. 73, pp. 96-105, 2014 (in serbian).
[17] s. ito, p. chen, p. comte, m. k. nazeeruddin, p. liska, p. péchy, m. grätzel, “fabrication of screen‐printing pastes from tio2 powders for dye‐sensitised solar cells,” progress in photovoltaics: research and applications, vol. 15, no. 7, pp. 603-612, 2007.
[18] from the official website solaronix [online]. available at: http://www.solaronix.com/documents/dye_solar_cells_for_real.pdf
[19] a. luque, s. hegedus, eds., handbook of photovoltaic science and engineering. john wiley & sons, 2011.
[20] multifunctional system ni usb-6008. available at: http://www.ni.com/pdf/manuals/371303n.pdf
[21] s. ilić, “dft characterization of curcumin and cyanidin as photosensitizers in dye-sensitized solar cells,” petnica science center – selected students’ papers, vol. 74, pp. 68-74, 2015 (in serbian).
[22] e. ronca, m. pastore, l. belpassi, f. tarantelli, f. de angelis, “influence of the dye molecular structure on the tio2 conduction band in dye-sensitized solar cells: disentangling charge transfer and electrostatic effects,” energy & environmental science, vol. 6, pp. 183-193, 2013.
[23] d. rocca, r. gebauer, f. de angelis, m. k. nazeeruddin, s. baroni, “time-dependent density functional theory study of squaraine dye-sensitized solar cells,” chemical physics letters, vol. 475, pp. 49-53, 2009.

facta universitatis series: electronics and energetics vol. 32, no 2, june 2019, pp. 231-238
https://doi.org/10.2298/fuee1902231l

plug-and-play transceiver with high gain and ultra low noise figure for ieee 802.15.4 application

josue lopez-leyva, miguel ponce-camacho, ariana talamantes-alvarez
center for innovation and design, cetys university, microwave street, ensenada, mexico

abstract. this paper shows the design and performance simulation of a 2.4 ghz plug-and-play transceiver based on a high speed switch for ieee 802.15.4 applications. the electrical design was optimized taking into account the scattering parameters, input-output impedance matching and minimum trace width. the simulation results show notable performance in terms of the noise figure (0.38 db) and gain (21 db) at a particular temperature in reception mode, as well as the transmission scattering parameters (s12 and s21) and the reflection scattering parameters (all remaining parameters) for both modes of operation (power amplifier and low noise amplifier).

key words: power amplifier, low noise amplifier, scattering parameters.

1. introduction

nowadays, wireless communication systems are necessary to improve and expand the variety of services for the private, public and personal sectors [1,2]. in particular, the concepts of internet of things (iot) and machine-to-machine (m2m) impose a tendency towards monitoring, control and data acquisition for different types of clients [3]. although there is a large number of wireless communication systems, they require improvements in some parameters, such as the extension of coverage (i.e. link distance), considering the trade-offs between energy consumption, the complexity of the electronic design, and cost-effectiveness.
in order to improve these parameters, the power amplifier (pa) and the low noise amplifier (lna) are suitable technical options for full-duplex high-end telecom systems; both have important features such as noise figure, gain, linearity, single/multiple narrow/wide bands and impedance matching [4,5]. however, designing and manufacturing these circuits with high performance for all parameters is a difficult task. resizing the pa and lna is a trend, but the gain-size trade-off is a notable issue [6,7]. an lna+pa circuit with higher gain and lower noise figure (nf) is required for wide coverage applications where plug-and-play transceiver systems are needed [8,9]. in terms of low data rate wireless personal area network technologies, ieee 802.15.4 is the most useful standard in use, owing to the extended device life that results from low power consumption [10,11]. wide coverage for applications based on this protocol is a crucial issue that the presented novel and optimized transceiver can address. we propose a reduced plug-and-play transceiver in comparison with the traditional transceiver. the principal objective of our proposal is to increase the distance of communication links without the digital processing performed in a traditional transceiver.

this paper is organized as follows: section 2 is dedicated to the general description of the electrical design. section 3 shows the simulation results regarding scattering parameters in both operation modes, noise figure and gain performance. section 4 concludes the paper and mentions the future work for the manufacture of the electrical board with industrial quality level.

received september 6, 2018; received in revised form november 14, 2018
corresponding author: josue lopez-leyva, cetys university, center for innovation and design, mexico (e-mail: josue.lopez@cetys.mx)

2. electronic design

fig. 1 shows the block diagram of the transceiver (pa+lna); multisim software was used for the simulation analysis. the general set-up presents the lna subsystem, where the incoming signal is received by the antenna (sma connector) and fed to a high speed rf switch. the switch provides high isolation based on an rlc circuit and two diode circuits, i.e. a dual switching diode circuit (baw56lt1) and a high shunt signal isolator / low shunt insertion loss diode (bar81w) with a switching rate up to 2 ghz. in particular, the rf switch has a control port to switch between transmission and reception modes. after the lna block, the electrical signal is fed to another rf switch to send the signal to the processing board. the signal path and processing for the pa are the same as those of the lna. in addition, two test points were established in order to measure the scattering parameters (s-parameters) [12] using a network analyzer (na) for different time slots (i.e., slot #1 for reception mode, which relates port #2 as input and port #1 as output, and slot #2 for transmission mode, which relates port #1 as input and port #2 as output).

fig. 1 block diagram of transceiver. blue trace describes the lna-pa path and red trace describes the pa-lna path with the respective measurement points at slot 1 and 2.

fig. 2a) shows the general electronic diagram for the high speed switch / rf isolator based on the diodes mentioned, for transmission mode. a mode controller is used in order to switch modes using the connection points c and d. in particular, connection point c enables or disables the pa circuit shown in fig. 2b), and connection point d controls the lna circuit shown in fig. 3b). the connection points a and b are the input and output of the pa circuit.
an input-matching-impedance-network (imn) and an output-matching-impedance-network (omn) were implemented at the input and output ports of the pa, respectively, as fig. 2 shows.

fig. 2 a) electronic diagram for high speed switch / rf isolator for transmission mode, b) electronic diagram of the pa.

fig. 3 a) electronic diagram for high speed switch / rf isolator for reception mode, b) electronic diagram of lna.

fig. 3a) shows the general electronic diagram for the high speed switch / rf isolator for reception mode. in general, the electronic diagrams shown in fig. 2a) and 3a) are similar; however, particular inductance and capacitance values are modified in order to optimize the imn and omn. in addition, the connection points e and f represent the input and output of the lna circuit. as mentioned, the pa circuit uses the bfp650 transistor; therefore, the first step of the design is to measure the current-voltage characteristics in order to choose and set the q-point (operating or quiescent point). fig. 4 shows the relation between vce and ic for different ib, where the trace corresponding to ib = 6 ma was selected for vce = 3.3 v in order to establish proper operating conditions (q-point) based on the input data signal. the same procedure was performed to determine the q-point of the bfp843f used in the lna circuit, and the same biasing voltage (vce) was chosen.

fig. 4 analysis of the transistor bfp650. blue trace describes the vce-ic relation for different ib. red trace is the optimum steady state for q point.

an important issue in the circuit design is the impedance matching between the electronic element and the transmission line in the pcb. therefore, the characteristic impedance (z0) of the microstrip line can be calculated using some physical and electromagnetic parameters, as eq. (1) shows [13].
$$z_0 = \frac{120\pi}{\sqrt{\varepsilon_{eff}}\left[\dfrac{w}{h} + 1.393 + 0.667\,\ln\left(\dfrac{w}{h} + 1.444\right)\right]} \qquad (1)$$

where w is the trace width, h is the dielectric thickness and εeff is the effective dielectric constant. in particular, eq. (1) is only suitable for microstrip lines satisfying the relation w/h > 1. however, to optimize this matching, a transmission line calculator was used, where the transmission line type, length and dielectric material characteristics were selected to produce z0 ≈ 50 Ω and a minimum capacitance and inductance (see fig. 5). due to the high power demand of the circuit, a trace width analysis was performed in order to calculate the minimum trace width based on the root mean square (rms) electric current in each electrical path. fig. 6 shows the printed circuit board (pcb) layout based on the aforementioned parameters and fig. 7 shows the three-dimensional view of the pcb. the ultiboard software was used for the pcb designs.

fig. 5 transmission line calculator used to determine the characteristic impedance based on particular physical features of the dielectric material and microstrip.

by using the matching circuits imn and omn shown in fig. 2 and 3, the input and output impedances of the pa and lna are obtained as follows: for the pa, zin = 50.1 Ω and zout = 49.48 Ω, while for the lna circuit, zin = 49.3 Ω and zout = 45.05 Ω. this good impedance matching was achieved using l-section networks (i.e. using an inductor and a capacitor); however, the trade-off between bandwidth and gain is an important consideration in the complete design.

fig. 6 pcb layout using c0402 packaging in each electronic element.

fig. 7 3d view of printed circuit board layout.

3. simulation results

fig. 8 shows the simulation results of the s-parameters for the reception mode.
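as a numerical cross-check on the section 2 design values, eq. (1) and the reported port impedances can be evaluated with a short script. the sketch below is illustrative only and is not the transmission line calculator used by the authors: the effective-permittivity estimate, the fr4 permittivity of 4.4, the 1.6 mm substrate and the 3.06 mm trace width are assumptions chosen to land near 50 Ω, and the reported impedances are treated as purely resistive. the function names `microstrip_z0` and `return_loss_db` are hypothetical.

```python
import math


def microstrip_z0(w, h, eps_r):
    """Characteristic impedance (ohms) of a wide microstrip line, Eq. (1)."""
    ratio = w / h
    if ratio <= 1.0:
        raise ValueError("Eq. (1) is only valid for w/h > 1")
    # Assumption (not given in the paper): a common quasi-static estimate
    # of the effective dielectric constant from eps_r and w/h.
    eps_eff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / (
        2.0 * math.sqrt(1.0 + 12.0 / ratio)
    )
    bracket = ratio + 1.393 + 0.667 * math.log(ratio + 1.444)
    return 120.0 * math.pi / (math.sqrt(eps_eff) * bracket)


def return_loss_db(z_load, z_ref=50.0):
    """Return loss (positive dB) of a purely resistive load against z_ref."""
    gamma = abs((z_load - z_ref) / (z_load + z_ref))
    return math.inf if gamma == 0.0 else -20.0 * math.log10(gamma)


if __name__ == "__main__":
    # Hypothetical FR4 geometry: eps_r = 4.4, h = 1.6 mm, w = 3.06 mm.
    print(f"z0 = {microstrip_z0(3.06e-3, 1.6e-3, 4.4):.1f} ohm")
    # Port impedances reported in Section 2, treated as real-valued.
    reported = [("pa zin", 50.1), ("pa zout", 49.48),
                ("lna zin", 49.3), ("lna zout", 45.05)]
    for name, z in reported:
        print(f"{name}: return loss = {return_loss_db(z):.1f} dB")
```

under these assumptions every reported port impedance corresponds to more than 20 db of return loss against 50 Ω, which is of the same order as the reflection parameters read from fig. 8; a full-wave or calculator-based result, as used in the paper, may differ.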
the s12 value means that there is a high transmission power ratio (≈ 21 db) for the complete circuit (lna + pa + high speed rf switch), while the s22 value (≈ -19 db) and the s11 value (≈ -21 db) mean that good matching is achieved at the input and output ports, respectively, in the reception operation mode. the s21 value (≈ -27 db) indicates adequate isolation between the input and output ports.

fig. 8 performance of the pa-lna scheme in the reception mode (port #2 is the input and port #1 is the output).

fig. 9 performance of the pa-lna scheme in the transmission mode (port #1 is the input and port #2 is the output).

with respect to the measurements of the s-parameters in the transmission mode (see fig. 9), s21 and s12 are the most important because they describe the forward-transmitted signal level and the reverse-transmitted (isolation) signal level (≈ 18 db and ≈ -19 db, respectively). in addition, fig. 10 shows the performance of nf and gain (g) depending on the temperature variation at 2.4 ghz. the nf is ≈ 0.6 db and the gain is ≈ 21 db at 27 °c.

fig. 10 nf and gain of the pa-lna scheme in reception mode (slot #1) at 2.4 ghz with temperature variations.

in addition, the nf and g parameters were measured at 18.8 °c (i.e. ≈ 292 k, the standard temperature). in this case, nf is ≈ 0.38 db and g is ≈ 21.5 db.

4. conclusion

this paper presented a transceiver circuit that has good performance parameters considering s-parameters, noise figure and gain, based on the detailed design of the imn and omn. the plug-and-play feature provides an easy way to extend the coverage of different traditional wireless systems based on the ieee 802.15.4 standard. it is important to clarify that the principal objective of the proposal is to increase the distance of the communication link of systems based on ieee 802.15.4. therefore, although conventional and commercial transceivers perform other processes (e.g. digital-to-analog conversion, frequency synthesis, among others), our proposal only focuses on improving the transmission and reception modes without considering modulation, synchronization, coding, encryption, among other schemes. in particular, the analysis for the pcb design is based on microstrip transmission lines, although a ground layer is added in order to improve the performance. because of this, it is possible to mistake the transmission lines shown in fig. 7 for a conventional coplanar waveguide (cpw); in fact, the impedance analysis is not performed considering a cpw. currently, we have a first prototype that uses fr4 dielectric material in order to perform some accelerated life testing (alt) and technical operating production (top). in addition, the transceiver circuit has been manufactured using a flexible dielectric material and other types of transmission lines in order to enhance the electronic performance.

acknowledgement: this work was supported by a grant from the center for innovation and design (ceid), cetys university, as an internal scientific and technical project. in addition, this article was prepared within the frame of the industrial-academic relationship of the ceid. in particular, the authors thank the native english-speaking colleagues who supported the preparation of this document.

references
[1] j. g. d. hester, j. kimionis and m. m. tentzeris, “printed motes for iot wireless networks: state of the art, challenges, and outlooks”, trans. microwave theory and techniques, vol. 65, pp. 1819–1830, may 2017.
[2] g. zheng, c. hua, r. zheng and q. wang, “toward robust relay placement in 60 ghz mmwave wireless personal area networks with directional antenna”, trans. mobile computing, vol. 15, pp. 762–773, march 2016.
[3] j. w. raymond, t. o. olwal and a. m. kurien, “cooperative communications in machine to machine (m2m): solutions, challenges and future work”, access, vol. 6, pp. 9750–9766, february 2018.
[4] j-e. baek, y. m. cho and k-c. ko, “analysis of design parameters reducing the damage rate of low-noise amplifiers affected by high-power electromagnetic pulses”, trans. plasma science, vol. 46, pp. 524–529, march 2018.
[5] h. laaouane, j. foshi and s. bri, “design of a low noise amplifier for lte radio base station receivers”, in proceedings of the international conference on wireless technologies, embedded and intelligent systems (wits), morocco, ieee, 2017, pp. 1–5.
[6] w-l. ou, y-k. tsai, p-y. tseng and l-h. lu, “a 2.4-ghz dual-mode resizing power amplifier with a constant conductance output matching”, in proceedings of the international system-on-chip conference, munich, ieee, 2017, pp. 258–261.
[7] p. qin and q. xue, “compact wideband lna with gain and input matching bandwidth extensions by transformer”, microwave and wireless components letters, vol. 27, pp. 657–659, july 2017.
[8] j. p. carmo, n. dias, p. m. mendes, c. couto and j. h. correia, “low-power 2.4-ghz rf transceiver for wireless eeg module plug-and-play”, in proceedings of the international conference on electronics, circuits and systems, nice, ieee, 2006, pp. 1144–1147.
[9] w-t. fang and y-s. lin, “highly integrated switched beamformer module for 2.4-ghz wireless transceiver application”, trans. microwave theory and techniques, vol. 64, pp. 2933–2942, sept. 2016.
[10] h-j. jeon, t. demeechai, w-g. lee, d-h. kim and t-g. chang, “ieee 802.15.4 bpsk receiver architecture based on a new efficient detection scheme”, trans. signal processing, vol. 58, pp. 4711–4719, sept. 2010.
[11] a. zolfaghari, m-e. said, m. youssef, g. zhang, t-t. liu, f. cattivelli, y-i. syllaios, f. khan, f-q. fang, j. wang, k-y. jason-li, fh. liao, d-s. jin, v. roussel, d-u. lee and f-m. hameed, “a multimode wpan (bluetooth, ble, ieee 802.15.4) soc for low-power and iot applications”, in proceedings of the symposium on vlsi circuits, kyoto, ieee, 2017, pp. c74–c75.
[12] b. lehmeyer, m. t. ivrlač and j. a. nossek, “lna noise parameter measurement”, in proceedings of the european conference on circuit theory and design, trondheim, ieee, 2015, pp. 1–4.
[13] j. w. n. rogers and c. plett, “radio frequency integrated circuit design”, norwood: artech house, 2010, chapter 4, pp. 63–93.

facta universitatis series: electronics and energetics vol. 32, no 1, march 2019, pp. 119-128 https://doi.org/10.2298/fuee1901119s

design of efficient coplanar 1-bit comparator circuit in qca technology

ahmadreza shiri, abdalhossein rezai, hamid mahmoodian
acecr institute of higher education, isfahan branch, isfahan 84175-443, iran

abstract. qca technology is an emerging and promising technology for the implementation of digital circuits at nano-scale. comparator circuits play an important role in digital circuits. in this work, a new and efficient coplanar 1-bit comparator circuit is proposed and evaluated in qca technology. the designed coplanar 1-bit qca comparator circuit is constructed from carefully designed majority, xnor and inverter gates. the functionality of the designed coplanar 1-bit qca comparator circuit is verified by using qcadesigner version 2.0.3. the obtained results indicate that the designed 1-bit qca comparator circuit requires 0.03 µm² area and 38 qca cells. it also has a delay of 0.5 clock cycles. the comparison demonstrates that the designed qca comparator circuit provides improvements in comparison with other qca comparator circuits in terms of effective area, cell count, and delay as well as cost.

key words: comparator, quantum-dot cellular automata, high-performance design, coplanar circuit

1. introduction

two important issues in vlsi design are scaling and reducing the computation time. quantum-dot cellular automata (qca) technology is an emerging and promising technology for addressing these issues at nano-scale [1].
the basic element in this technology is a square cell that has two free electrons in four dots [1-14]. the qca cell is a building block for constructing qca gates [1-14]. there are three basic gates in this technology: the inverter gate, the majority (m) gate, and the xor gate [3-4]. these gates are building blocks for constructing logic circuits such as qca multiplexers [5, 7], qca full adders [1-3, 6, 8] and qca comparators [9-12, 15-18]. on the other hand, comparator circuits play an important role in digital circuits such as microcontrollers [6, 12, 15-18]. thus, the implementation of high-performance comparator circuits has attracted a great deal of attention, and a lot of effort [10-12, 15-18] has been invested in improving the performance of qca comparator circuits. das and de [10] have presented a 1-bit qca comparator that requires 0.343 µm² area and 319 qca cells. al-shafi and bahar [11] have presented a 1-bit qca comparator, which requires 0.182 µm² area and 117 qca cells. sinha roy et al. [12] have proposed two qca comparator circuits that require 40 and 37 qca cells and 0.032 and 0.028 µm² area, respectively. ghosh et al. [16] have presented a 1-bit qca comparator circuit that requires 0.06 µm² area and 73 qca cells. akter et al. [18] have presented a 1-bit qca comparator circuit, which requires 0.11 µm² area and 87 qca cells. bhoi et al. [17] have presented a 1-bit qca comparator circuit that requires 0.23 µm² area and 220 qca cells. this study proposes an efficient coplanar 1-bit qca comparator circuit. the designed qca comparator is based on majority, xor and inverter gates. the correct functionality of the designed circuit is demonstrated by using qcadesigner version 2.0.3.
the simulation results show that the designed coplanar 1-bit qca comparator circuit provides improvements compared with other 1-bit qca comparator circuits in terms of cell count, area, delay time and cost.

the rest of this study is organized as follows: section 2 provides a review of qca technology. in section 3, the designed qca comparator circuit is presented. the results and comparison of the designed qca comparator circuit are provided in section 4. the conclusion is presented in section 5.

received may 8, 2018; received in revised form september 14, 2018
corresponding author: abdalhossein rezai, acecr institute of higher education, isfahan branch, isfahan 84175-443, iran (e-mail: rezaie@acecr.ac.ir)

2. background

2.1. qca cell

figure 1 shows the basic qca cell and its possible states. each qca cell can be considered as a square containing four quantum dots and a pair of electrons [1, 5]. the electrons settle at diagonally opposite dots due to the coulomb interaction between the electrons in each cell. each cell therefore has two stable configurations, whose polarizations are specified as -1 and +1. these polarizations denote the binary values 0 and 1, respectively [1, 5].

fig. 1. the possible states for the qca cell [5]

2.2. qca gates

there are three fundamental gates in this technology: inverter, majority, and xor gates, which are used to construct the circuits in this technology [11, 13, 19]. figure 2 shows these qca gates [3, 13].

fig. 2 qca gates: (a) corner inverter, (b) robust inverter, (c) original majority gate (omg), (d) rotated majority gate (rmg), (e) xor gate [11, 3, 13, 19]

in figure 2(a) and figure 2(b), the output of each inverter is the inverted polarization value of its input. in figure 2(c) and figure 2(d), two kinds of qca three-input majority gates are shown. figure 2(e) shows a three-input qca xor gate [6, 13].

2.3. qca comparator

comparator circuits play an important role in digital circuits [9-12, 15-18]. a comparator circuit compares its two inputs.
suppose a and b are the two inputs of the comparator circuit. the outputs of this circuit are defined as follows [15]:

$$\text{output}(a < b) = \bar{a}\cdot b \;\; \text{where } a < b, \qquad \text{output}(a > b) = a\cdot\bar{b} \;\; \text{where } a > b \qquad (1)$$

for the implementation of the comparator circuit in qca technology, equation (1) is reformulated in terms of the majority gate m as follows [15]:

$$\text{output}(a < b) = m(\bar{a}, b, 0) \;\; \text{where } a < b, \qquad \text{output}(a > b) = m(a, \bar{b}, 0) \;\; \text{where } a > b \qquad (2)$$

2.4. related works

das and de [10] have developed a qca comparator circuit by combining the functional properties of the feynman and tr gates, which is shown in figure 3.

fig. 3. the utilized qca comparator circuit in [10]

this qca comparator circuit requires 0.343 µm² area and 319 qca cells. al-shafi et al. [11] have developed a qca comparator circuit without wire-crossing, which is shown in figure 4.

fig. 4. the utilized qca comparator circuit in [11]

this qca comparator circuit requires 0.182 µm² area and 117 qca cells. sinha roy et al. [12] have developed a qca comparator circuit based on layered-t or and and gates, which is shown in figure 5.

fig. 5. the utilized qca comparator circuits in [12]

this qca multilayer comparator circuit requires 0.028 µm² area and 37 qca cells. ghosh et al. [16] have developed a qca comparator circuit, which is shown in figure 6.

fig. 6. the utilized qca comparator circuit in [16]

this qca comparator circuit requires 0.06 µm² area and 73 qca cells. bhoi et al. [17] have developed a qca comparator circuit, which is shown in figure 7.

fig. 7. the utilized qca comparator circuit in [17]

this qca comparator circuit requires 0.23 µm² area and 220 qca cells. akter et al. [18] have developed a qca comparator circuit based on tr and feynman gates, which is shown in figure 8.

fig. 8. the utilized qca comparator circuit in [18]

this qca comparator requires 0.11 µm² area and 87 qca cells.
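the majority-gate formulation in equation (2), on which all of the reviewed designs ultimately rely for their and-type terms, can be checked exhaustively in a few lines. the sketch below is only a behavioural illustration, not a qca layout or a qcadesigner simulation: it models a three-input majority gate as a vote, confirms that fixing one input at logic 0 reduces it to an and gate, and lists the comparator outputs for all input pairs. the function names are hypothetical, and the equality output is assumed to be the xnor of the inputs.

```python
from itertools import product


def majority(x, y, z):
    """Three-input majority vote: 1 when at least two inputs are 1."""
    return int(x + y + z >= 2)


def invert(x):
    """Logical inverter for a single bit."""
    return 1 - x


def compare_1bit(a, b):
    """Return (less, equal, greater) flags for one-bit inputs a and b."""
    less = majority(invert(a), b, 0)     # M(a', b, 0) = a'.b
    greater = majority(a, invert(b), 0)  # M(a, b', 0) = a.b'
    equal = invert(a ^ b)                # assumed XNOR form for a = b
    return less, equal, greater


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        # Fixing one majority input at 0 must reduce the gate to an AND.
        assert majority(a, b, 0) == (a & b)
        print(a, b, compare_1bit(a, b))
```

exactly one of the three flags is set for every input pair, which is the behaviour a 1-bit comparator is expected to show regardless of how the gates are laid out.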
although, these qca comparator circuits are suitable, the performance of the comparator can be improved as will be described in the next section. 3. the proposed qca comparator circuit the proposed qca comparator circuit has two 1-bit inputs and three 1-bit outputs. the inputs are indicated by a and b, and the outputs are indicated by l(a, b), e(a, b), and g(a, b). the relation between outputs and inputs are defined as follows: l(a, b)= ̅.b where ab as it is shown in equation (3), if the input a is less than the input b, the output l(a, b) is “1” and other outputs are “0”. moreover, if the input a is greater than the input b, the output g(a, b) is “1” and other outputs are “0”. otherwise, the inputs a and b are equal, the output e(a, b) is “1” and other outputs are “0”. figure 9 shows the designed 1-bit qca comparator circuit. (a) (b) fig. 9. the designed 1-bit qca comparator circuit (a) block diagram (b) layout 126 a. shiri, a. rezai, h. mahmoodian the designed 1-bit comparators consist of 2 original majority gates (fig.2 (c)), 3 inverter gates (fig. 2 (a)) and an xor gate (fig. 2 (e)). the majority gates in the developed 1-bit qca comparator circuit are used for implementation of and gates. as a result, one input of these majority gates is set as logic "0". the designed 1-bit qca comparator circuit requires 38 qca cells. 4. simulation results and comparison the designed 1-bit qca comparator circuit is simulated by using qcadesigner tool version 2.0.3. the following parameters are used for simulation: the number of samples: 12800, radius of effect [nm]:65.000000, the convergence tolerance: 0.00100, relative permittivity: 12.900000, clock low [j]: 3.800000e-023, clock high [j]: 9.800000e-022, clock shift: 0.000000e+000, and clock amplitude factor: 2.000000. other simulation parameters are chosen as default. figure 10 shows the simulation results of the designed 1-bit comparator circuit. fig. 10. 
the results for the designed 1-bit comparator circuit these results demonstrate that the outputs of the designed 1-bit comparator circuit are correctly obtained after 0.5 clock cycles delay. moreover, the designed 1-bit qca comparator circuit requires 0.03 µm 2 area and 38 qca cells. table 1 summarizes the simulation results of the designed 1-bit comparator circuit compared with other 1-bit comparator circuits in [10-12, 16-18]. table 1 the comparison table for 1-bit qca comparator circuit reference cell count area (μm 2 ) time delay (clock cycle) crossover cost [10] 319 0.343 3 multilayer 1.029 [11] 117 0.182 1 coplanar 0.182 [12] design1 40 0.032 1 multilayer 0.032 [12] design2 37 0.028 1 multilayer 0.028 [16] 73 0.06 1 coplanar 0.060 [18] 87 0.11 0.50 coplanar 0.055 [17] 220 0.23 0.75 coplanar 0.172 this paper 38 0.030 0.50 coplanar 0.015 design of efficient coplanar 1-bit comparator circuit in qca technology 127 in this table, area and delay are shown in terms of µm 2 and clock cycle, respectively. moreover, following equation is used to determine the cost value based on [1, 5, 7]. cost= area × delay (4) as it is shown in table 1, the designed 1-bit comparator circuit has advantages in terms of cost and area compared to [10-12, 1618]. for example, the cell count, area, delay and cost in the designed 1-bit qca comparator circuit are improved compared to 1bit qca comparator circuits in [10] by about 88%, 91%, 83% and 98%, respectively. the only 1-bit qca comparator circuit, which requires a slightly lower cell count and area than the designed qca comparator circuit is the 1-bit qca comparator circuit in [12] (design 2). however, this advantage has been resulted from the increased number of layers, not from logic design. in addition, the delay time and cost in the proposed 1-bit qca comparator circuit are reduced by about 50% and 40% compared to the 1-bit qca comparator circuit in [12] (design 2). 5. 
conclusions qca technology is a promising technology for implementation of digital circuits in nano-scale [1-6]. the comparator circuits play important role in digital circuits [9-12, 1618]. in this study, an efficient 1-bit qca comparator circuit was proposed and evaluated. the designed 1-bit qca comparator circuit was constructed based on majority gate, xnor gate and inverter gate that were designed carefully. the functionality of the designed 1-bit comparator circuit was verified by using qcadesigner version 2.0.3. the obtained results indicate that the designed 1-bit comparator circuit requires 0.03 µm 2 area and 38 qca cells. it also has 0.5 clock cycle delay. the results showed that the designed 1-bit comparator circuit provided improvements compared with other 1-bit comparator circuits in [10-12, 16-18] in terms of cell count, effective area, and delay as well as cost. references [1] h. rashidi, a. rezai, “high-performance full adder architecture in quantum-dot cellular automata,”j. eng., vol. 2017, pp. 394–402, 2017. [2] d. mokhtari, a. rezai, h. rashidi, f. rabiei, s. emadi, a. karimi, “ design of novel efficient full adder circuit for quantum-dot cellular automata technology, ” facta univ. series: electr. energy, vol. 31, no. 2, pp. 279-285, 2018. [3] i. edrisi arani, a. rezai, “novel circuit design of serial-parallel multiplier in quantum-dot cellular automata technology”, j. comput. electr., 2018. [4] m. niknejad divshali, a. rezai, a. karimi, “towards multilayer qca siso shift register based on efficient d-ff circuits”, int. j. theor. phys., 2018. [5] h. rashidi, a. rezai, “design of novel efficient multiplexer architecture for quantum-dot cellular automata,” j. nano electr. phys., vol. 9, no. 1, pp. 1-7, 2017. [6] m. balali, a. rezai, h. balali, f. rabiei, s. emadid, “towards coplanar quantum-dot cellular automata adders based on efficient three-input xor gate,” result phys., vol. 7, pp. 1389-1395, 2017. [7] h. rashidi, a. rezai, s. 
soltani, “high-performance multiplexer architecture for quantum-dot cellular automata” j. comput. electr., vol. 15, pp. 968-981, 2016. [8] m. balali, a. rezai, “design of low-complexity and high-speed coplanar four-bit ripple carry adder in qca technology,” int. j. theor. phys., pp. 1-13, 2018. [9] d. bahrepour, “a novel full comparator design based on quantum-dot cellular automata,” int. j. inf. electr. eng., vol. 15, pp. 406-410, 2015. 128 a. shiri, a. rezai, h. mahmoodian [10] j c. das, d. de, “reversible comparator design using quantum dot cellular automata,” iete j. res., vol. 62, pp. 323-330, 2016. [11] m d. abdullah-al-shafi, a n. bahar, “optimized design and performance analysis of novel comparator and full adder in nanoscale,” cogent eng., vol. 3, 2016. [12] s. sinha roy, c. mukherjee, s. panda, a. k. muchopadhyay, b. maji, “layered t comparator design using quantum-dot cellular automata,” ieee conf. dev. integ. circ. (devic), pp. 90-94, 2017. [13] a. n. bahar, s. waheed,”a novel 3-input xor function implementation in quantum dot-cellular automata with energy dissipation analysis,” alexandria eng. j., in press, 2018. [14] j. c. das, d. de, “novel low power reversible binary incrementer design using quantum-dot cellular automata,” microprocess microsyst., vol. 42, pp. 10-23, 2016. [15] a. sarker, md. badrul alam miah, “design of 1-bit comparator using 2 dot 1 electron quantum-dot cellular automata,” int. j. adv. comput. sci. appl., vol. 8, no. 3, pp. 481-485, 2017. [16] b. ghosh, sh. gupta, s kumari, “quantum dot cellular automata magnitude comparators,” ieee int. conf. electr. dev. solid state circ. (edssc), pp. 1-2, 2012. [17] b. k. bhoi, n. k. misra, m. pradhan, “a universal reversible gate architecture for designing n-bit comparator structure in quantum-dot cellular automata,” int. j. grid distr. comput., vol. 10, no. 9, pp. 33-46, 2017. [18] r. akter, n islam, s waheed, “implementation of reversible logic gate in quantum dot cellular automata,” int. j. 
comput. appl., vol. 109, pp. 41-44, 2017. [19] m. balali, a. rezai, h. balali, f. rabiei, s. emadid, “a novel design of 5-input majority gate in quantum-dot cellular automata technology,” ieee symp. comput. appl. indust. electr. (iscaie), pp. 1316, 2017. facta universitatis series: electronics and energetics vol. 32, n o 2, june 2019, pp. 239-247 https://doi.org/10.2298/fuee1902239r design and analysis of quadrifilar helical antenna for cube-sats using c-band frequency range for satellite communication pinku ranjan 1 , mihir patil 2 , amit bage 3 , brajesh kumar 2 , sandeep kumar p. 2 1 department of computer science & engineering, abv-indian institute of information technology and management, gwalior, madhya pradesh–474015, india 2 department of electronics and communication engineering, srm institute of science and technology, kattankulathur, chennai, tamil nadu– 603203, india 3 department of electronics and communication engineering, national institute of technology, hamirpur, himachal pradesh – 177005, india abstract. design and analysis of quadrifilar helical antenna are presented in this paper. the proposed antenna is designed for cube-sats in the low earth and medium earth orbits. it is a combination of four helical antennas, each separated by 90°, and excited separately at the feeding point. the antenna is designed for operation at 4.5 ghz with an impedance bandwidth of 11.11 %. design of the antenna is done in two steps. the first step being the design of a ground plane, which can make the antenna operate at 4.5 ghz. the second step is to analyze the antenna’s performance for different helix angles using the best ground plane dimensions obtained in the first step. the gain versus frequency curve has been obtained and the designed antenna is having a gain of more than 4 db at the resonant frequency of 4.5 ghz. key words: quadrifilar antenna, satellite communication, coaxial probe feed 1. 
introduction
received september 14, 2018; received in revised form february 15, 2019. corresponding author: amit bage, department of ece, srm institute of science and technology, kattankulathur, chennai, india, 603203 (e-mail: bageism@gmail.com)
because building, assembling and launching large satellites is very costly, many private institutions that wish to contribute even modestly to space exploration give priority to cube-sats and microsatellites. in modern microwave and millimeter-wave communication systems, the use of the quadrifilar helical antenna is steadily increasing. this is due to the very large beam-width and high gain provided by the antenna [1-4]. it has become a major pillar of antenna design for satellite communication. moreover, owing to the evolution of electronics and vlsi technology, small satellites can now perform fairly demanding space exploration tasks. thus, a need is felt to design small antennas suitable to fit on cube-sats. in [5], deployable helical antennas are presented for cube-sats. the deployment mechanism is used so that the antenna takes as little space as possible. it therefore requires a ground plane that is also deployable, to reflect the back lobes. in some cases this might help, but because it requires a deploying mechanism, it becomes very hard for small satellites to carry out the job reliably. it could even worsen the radiation pattern if not deployed properly and would thus be prone to errors. a dual-band quadrifilar helix antenna using stepped-width arms has been demonstrated in [6]. in [7], an omnidirectional antenna radiating circularly polarized waves is presented. the c-band is selected for the antenna operation because the antenna’s dimensions are very small in this band, which makes it suitable to fit on 1u, 2u and 3u cube-sats.
in addition, the free-space loss is much lower than in the x and ku bands, and going towards higher frequencies also leads to more atmospheric loss. the s-band is rejected because it would require an antenna of about 14 cm in height, whereas the c-band antenna needs about half that height. satellites in low earth orbit have much less time for direct line-of-sight communication, so the antenna needs to be properly oriented whenever line-of-sight communication can be established. as one qfh antenna can serve only 180°, two qfh antennas are required to cover the whole 360° view of the satellite. the antenna requires a phase difference of 90° between adjacent helices. in [8], a very cost-efficient and compact circuit is designed to provide the phase-shifted signals, which lowers the burden of generating and distributing them. the basic design of the quadrifilar helical antenna has been demonstrated in [9]. on that basis, an antenna is proposed and further optimized. gain enhancement techniques have been presented in [10]. in [11], printed-circuit discontinuities have been taken into account; these work as a resonating structure and thus allow only certain frequencies to be received by the antenna. in 2011, b. pawan. k et al. [12] presented a circularly polarized (cp) quadrifilar helix (qfh) antenna. this manuscript presents the design and analysis of a quadrifilar helical antenna. it is a combination of four helical antennas, each separated by 90°, and excited separately at the feeding point. the antenna operates at 4.5 ghz with an impedance bandwidth of 11.11 %, and it works as a circularly polarized antenna from 4.28 to 4.64 ghz. the length of the antenna is 7.5 cm, and the bottom cylinder, which is below the ground plane, is 1 cm long, while the length of the feed cylinder (above the ground plane) is 3.6 mm.
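the band comparison and the 11.11 % figure quoted above can be checked with a few lines of arithmetic. the following is a minimal sketch, not part of the original work: it uses the standard free-space path loss formula and takes the 500 mhz bandwidth reported in the paper's comparison table. the 500 km slant range and the x-band and ku-band frequencies are assumptions chosen only for illustration.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C0)


def fractional_bandwidth(bandwidth_hz: float, centre_hz: float) -> float:
    """Bandwidth expressed as a fraction of the centre (resonant) frequency."""
    return bandwidth_hz / centre_hz


SLANT_RANGE_M = 500e3  # assumed LEO slant range, not taken from the paper
BANDS_HZ = {
    "c-band, 4.5 GHz (paper)": 4.5e9,
    "x-band, 8.4 GHz (assumed)": 8.4e9,
    "ku-band, 12 GHz (assumed)": 12.0e9,
}

reference_db = fspl_db(SLANT_RANGE_M, 4.5e9)
for name, freq in BANDS_HZ.items():
    loss_db = fspl_db(SLANT_RANGE_M, freq)
    print(f"{name}: {loss_db:.1f} dB ({loss_db - reference_db:+.1f} dB vs c-band)")

# 500 MHz around 4.5 GHz, as listed in the paper's comparison table
print(f"fractional bandwidth: {100 * fractional_bandwidth(500e6, 4.5e9):.2f} %")
```

because the path loss difference between two bands depends only on the frequency ratio, the comparison holds for any assumed slant range.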
the numerical simulation analysis has been carried out using ansys high-frequency structure simulator (version 15). the manuscript is organized as follows. first, the geometry of the quadrifilar helical antenna is presented. next, the results and discussion are presented. finally, the conclusion is given.
2. antenna geometry
the pitch of the helices is 15 cm and each helix has half a turn, giving a total length of 7.5 cm, which is 1.125λ. the helix radius is 1.15 cm and the radius of the wire is 0.5 mm. in [13], the best antenna characteristics are achieved using an angle of 73°, whereas in this design the antenna has a helix angle of 81.28° to attain a better radiation pattern. all four helices have the same dimensions, and each helix is rotated by 90° with respect to the previous one. the number of segments per turn is taken as 36, which is the default value. the top part of the antenna has four cylinders whose axes are perpendicular to the z-axis. they support the antenna structure from the top and are called top cylinders. the total height of these cylinders is 11.5 mm and their radius is 1 mm. these values are chosen such that each cylinder can easily accommodate the helix. the top cylinders are made of copper, and thus no losses are introduced into the design. four metallic rods are placed above the ground plane to support the antenna structure from below. these are called bottom cylinders. all four rods have a radius of 0.9 mm and a height of 7.5 mm, such that each can easily accommodate the incoming helix. the four helices, the four top supporting cylinders and the four bottom supporting cylinders are united to form one radiating structure. copper is assigned as the material of the structure, and it is assigned a perfect e boundary condition.
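the helix dimensions quoted above can be cross-checked numerically. the sketch below is an illustration under stated assumptions rather than the authors' procedure: it converts the 7.5 cm height into wavelengths at 4.5 ghz and scales the same electrical height to an assumed s-band frequency of 2.4 ghz. the paper does not state how the helix angle is defined; as an observation only, atan(height / radius) happens to reproduce the quoted 81.28°, and that reading is an assumption.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength in metres."""
    return C0 / freq_hz


F0_HZ = 4.5e9      # operating frequency from the paper
HEIGHT_M = 0.075   # axial length of the half-turn helix (7.5 cm)
RADIUS_M = 0.0115  # helix radius (1.15 cm)

# Axial length expressed in wavelengths (the paper quotes 1.125 lambda)
electrical_height = HEIGHT_M / wavelength_m(F0_HZ)

# Assumed reading of "helix angle": atan(height / radius); not stated in the paper
angle_deg = math.degrees(math.atan(HEIGHT_M / RADIUS_M))

# Same electrical height at an assumed S-band frequency (2.4 GHz is hypothetical)
S_BAND_HZ = 2.4e9
s_band_height_m = electrical_height * wavelength_m(S_BAND_HZ)

print(f"electrical height: {electrical_height:.3f} wavelengths")
print(f"atan(height/radius): {angle_deg:.2f} degrees")
print(f"scaled s-band height: {100 * s_band_height_m:.1f} cm")
```

under these assumptions the scaled s-band height comes out near 14 cm, in line with the remark in the introduction that the c-band design needs about half the height.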
below these cylindrical rods there is a square ground plane. below the ground plane, there are four copper rods of radius 0.9 mm, called feed cylinders. each feed cylinder is surrounded by a teflon tube with an inner radius of 0.9 mm and an outer radius of 3.018 mm. the radii are chosen to give a characteristic impedance of 50 ohms, so that any 50-ohm cable can be attached to the antenna. impedance matching plays a major role in power transmission through the antenna. the height of the feed cylinder is 3.6 mm, selected so that the antenna can be easily mounted on any structure. excitation is applied to the ports on the bottom faces of the feed cylinders. each of the four ports is excited separately, with a signal phase-shifted by 90° with respect to the adjacent port, and the ports are fed in the clockwise direction. the dielectric constant of teflon is 2.1. the inner copper rod transfers the electrical signal from the wave-port to the antenna structure. the cross sections are kept as small as possible while still accommodating the incoming helices, to avoid losses. four holes of radius 0.9 mm are subtracted from the ground plane to allow the signals to pass through it. fig. 1 shows the side view of the proposed antenna. fig. 2 shows the bottom view of the ground plane. fig. 3 shows the cross-sectional view of the whole antenna. fig. 4 shows the direction of alignment for the feeds of the four ports of the antenna.
fig. 1 side view of qfh antenna
fig. 2 bottom view of the ground plane.
fig. 3 cross-section view of qfh antenna
fig. 4 feed alignment of the 4 ports.
3.
result and discussion
the antenna’s input characteristics have been analyzed for the desired operating frequency. the ground plane dimension has been swept to find the lowest reflection coefficient. from the simulations, as shown in fig. 5, the reflection coefficient is lowest for a ground plane of length 3.75 cm, which is half of the total height of the antenna. the antenna is simulated in hfss with the following settings: the maximum number of passes is 6, the maximum delta s is 0.02, and the step size is kept as 0.01 so as to depict the antenna parameters accurately. the minimum ground plane length for which the simulation is evaluated is 3 cm; below 3 cm the ground plane would not be able to support the helix structure. the minimum of the reflection coefficient is -12 db for the 3 cm ground plane and decreases further as the length increases to 3.75 cm, where it reaches about -28.8 db. beyond that, the minimum rises, reaching -16.7 db at 5 cm, and then decreases again to about -21.4 db at 6 cm. above 6 cm of ground plane length, however, the resonant frequency starts to move towards 4.4 ghz. at 7, 8, 9 and 10 cm the minimum reflection coefficient stays between -19.5 and -20.5 db, but with resonance at 4.4 ghz.
fig. 5 plot for reflection coefficient against frequency for the different lengths of the ground plane.
after this, keeping the ground plane length at 3.75 cm, further optimization is attempted by evaluating the results for different helix angles.
fig. 6 reflection coefficients for different helix angles with the constant ground plane of 3.75 cm.
thus, from fig.
6, it can be inferred that the antenna with a helix angle of 70º resonates at 4.8 ghz, and at 75º it resonates at 4.4 ghz. from 80º to 85º, the resonant frequency stays at 4.5 ghz. the antenna performs best at a helix angle of 81.3º, with a minimum reflection coefficient of -28.8041 db. the helix angles are evaluated only up to 85º because above 85º it becomes impossible to mount the coaxial feeds, as they intersect with one another. the final |s11| versus frequency curve is shown in fig. 7, from which it can be inferred that the antenna has a resonant frequency of 4.5 ghz with 11.11 % impedance bandwidth.
fig. 7 reflection coefficient for the ground plane of dimension 3.75 cm and helix angle of 81.3º.
the far-field analysis has been done for the proposed antenna at its resonant frequency (4.5 ghz). the radiation patterns for the xz-plane and xy-plane are shown in fig. 8 and fig. 9, respectively. the difference between the co- and cross-polarized components is more than 15 db. the e-plane view, shown in fig. 8, has the maximum value of e-field radiation in that cross-sectional plane. similarly, the h-plane view, shown in fig. 9, has the maximum value of h-field radiation in that cross-sectional plane. thus, the antenna shows a very promising radiation pattern.
fig. 8 radiation pattern for the optimum antenna dimension for xz-plane (e-plane).
fig. 9 radiation pattern for optimum antenna dimension for xy-plane (h-plane).
fig. 10 gain (db) vs. frequency (ghz) for phi=80º and theta=110º.
fig. 11 axial ratio of the antenna.
fig. 12 3-d radiation pattern of the proposed antenna.
the gain versus frequency curve has been analyzed for the proposed antenna and is shown in fig. 10.
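to put the reflection coefficients reported above into more familiar terms, the sketch below converts an |s11| value in db into the reflection magnitude, the vswr and the fraction of incident power that is reflected. these are standard transmission-line relations; the helper names are hypothetical, and the -10 db entry is included only as a commonly used matching threshold, not as a value from the paper.

```python
def reflection_magnitude(s11_db: float) -> float:
    """|Gamma| from a reflection coefficient given in dB (20*log10 convention)."""
    return 10.0 ** (s11_db / 20.0)


def vswr(s11_db: float) -> float:
    """Voltage standing wave ratio: (1 + |Gamma|) / (1 - |Gamma|)."""
    gamma = reflection_magnitude(s11_db)
    return (1.0 + gamma) / (1.0 - gamma)


def reflected_power_fraction(s11_db: float) -> float:
    """Fraction of incident power reflected back: |Gamma|**2."""
    return 10.0 ** (s11_db / 10.0)


# -28.8 dB (3.75 cm ground plane) and -12 dB (3 cm) are taken from the text;
# -10 dB is a conventional threshold added here for comparison only.
for s11 in (-28.8, -12.0, -10.0):
    print(
        f"s11 = {s11:6.1f} dB  |gamma| = {reflection_magnitude(s11):.4f}  "
        f"vswr = {vswr(s11):.3f}  "
        f"reflected = {100 * reflected_power_fraction(s11):.2f} %"
    )
```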
from fig. 10, it can be inferred that the antenna gives a maximum gain of 4.2254 db at phi=80º and theta=110º at the resonant frequency of 4.5 ghz. the antenna gain is constant throughout the operating frequency band. fig. 11 shows the axial ratio of the antenna; the antenna works as a circularly polarized antenna from 4.28 to 4.64 ghz. fig. 12 shows the 3-d radiation pattern of the proposed antenna, in which the maximum gain is 4.22 dbi. a radiation efficiency of 79 % is obtained at 4.5 ghz. the proposed antenna is compared with other quadrifilar helical antennas in table 1. it can be inferred from the data that the antenna has a very high bandwidth of 500 mhz compared to the other designs, together with a moderate gain. the constant gain over the bandwidth is also very helpful. thus, the novelty of this design lies in the impedance bandwidth and gain of the antenna: it supports 500 mhz of bandwidth with a constant gain of 4.22 db, which is the major advantage of the design. the design is the result of intensive optimization of the antenna’s height, the ground plane size and the diameter of the cylindrical rods. all this is possible with a very simple antenna design using metallic rods.
table 1 comparison of the proposed antenna with other antenna designs.
ref. | resonating frequency (ghz) | bandwidth (mhz) | gain (db) | length of the antenna (cm) | impedance bandwidth (fraction of resonating frequency)
[1] | 2.51 | 20 | 2.32 | 4183 | 0.0079
[11] | 0.86 | 95 | 6.4 | 16.1 | 0.110
[12] | 4.2 | 500 | 3.5 | 4.6 | 0.1190
[13] | 1.53 | 200 | 6.2 | 19.5 | 0.1307
our work | 4.5 | 500 | 4.22 | 7.5 | 0.1111
4. conclusion
the quadrifilar helical antenna has been designed at a resonant frequency of 4.53 ghz with 11.11 % impedance bandwidth (4.3 ghz to 4.8 ghz). the optimized antenna has a total height of 7.5 cm with half a turn, and it performs best with a 3.75 cm x 3.75 cm ground plane and a helix angle of 81.3º.
it gives a gain of about 4.2 db at the resonant frequency of 4.5 ghz at phi=80º and theta=110º. this paper shows that the qfh antenna is a very good candidate for omnidirectional cube-sat applications with good gain.
acknowledgment: the authors would like to thank the department of science and technology (dst), government of india, for its support through the fist project.
references [1] n. bhuma and c. himabindh, “right hand circular polarization of a quadrifilar helical antenna for satellite and mobile communication systems,” recent advances in space techn. servic. and climate change 2010 (rsts & cc-2010), chennai, pp. 307–310, 2010. [2] chapari, z. h. firouzeh, r. moini and s. h. h. sadeghi, “a low weight s-band quadrifilar helical antenna for satellite communication,” in proceedings of the 13th intern. symp. on antenna techn. and applied electromag. and the canadian radio science meeting, toronto, 2009, pp. 1-3c. [3] t. cvetković, v. milutinović, n. dončov, b. milovanović, "numerical calculation of shielding effectiveness of enclosure with apertures based on em field coupling with wire structures", facta universitatis, series: electronics and energetics, vol. 28, no. 4, pp. 585–596, 2015. [4] mengmeng and h. weina, “a printed quadrifilar-helical antenna for ku-band mobile satellite communication terminal,” in proceedings of the 17th intern. conf. on comm. techn. (icct), chengdu, 2017, pp. 755–759. [5] j. costantine, y. tawk, i. maqueda, m. sakovsky, g. olson, s. pellegrino, c. g. christodoulou, “uhf deployable helical antennas for cubesats,” ieee trans. on antennas and propag., vol. 64, no. 9, pp. 3752-3759, 2016. [6] g. byun, h. choo, s. kim, “design of a dual-band quadrifilar helix antenna using stepped-width arms,” ieee trans. on antennas and propag., vol. 63, no. 4, pp. 1858–1862, april 2015. [7] j. hou, x. sun and h. yang, “design of a high gain quadrifilar helix antenna for satellite mobile communication,” in proceedings of the china-japan joint microw.
conf., hangzhou, 2011, pp. 1-3. [8] m. s. ghaffarian, s. khajepour and g. moradi, “a quadrifilar helix antenna using low cost planar feeding circuit,” in proceedings of the 24th iranian conf. on electrical engg. (icee), shiraz, 2016. [9] adams, r. greenough, r. wallenberg, a. mendelovicz and c. lumjiak, “the quadrifilar helix antenna,” ieee trans. on antennas and propag., vol. 22, no. 2, pp. 173–178, 1974. [10] s. gao, q. luo, and f. zhu, “circularly polarized antennas,” hoboken, nj, usa: wiley, nov. 2013. [11] y. tawk, m. chahoud, m. fadous, j. costantineand c. g. christodoulou, “the miniaturization of a partially 3-d printed quadrifilar helix antenna,” ieee trans. on antennas and propag., vol. 65, no. 10, pp. 5043–5051, oct. 2017. [12] p. kumar, m. kumar, c. kumar, s. kumar, v. srinivasan, “integrated quadrifilar helix at c-band for spacecraft omni antenna system,” in proceedings of the ieee applied electromag. conf. (aemc), pp. 1–4, 2011. [13] z. y. zhang, l. yang, s. l. zuo, m. u. rehman, g. fu, c. zhou, “printed quadrifilar helix antenna with enhanced bandwidth,” iet microw. antennas & propag., vol. 11, pp. 732–736, 2017. instruction facta universitatis series: electronics and energetics vol. 28, n o 2, june 2015, pp. 263 274 doi: 10.2298/fuee1502263a design of cmos readout frontend circuit for mems capacitive microphones  daniel arbet, viera stopjaková, martin kováč, lukáš nagy, gabriel nagy slovak university of technology in bratislava faculty of electrical engineering and information technology institute of electronics and photonics department of ic design and test abstract. this paper deals with a frontend part of the readout circuit developed as an integrated circuit that after bonding together with a mems capacitive microphone (mcm) chip will be used in a noise dosimeter applicable in very noisy and harsh environment, e.g. mine. 
therefore, the main attention has been paid to the high dynamic range, low offset and low noise of the developed readout interface as well as its low power consumption. for conversion of the mcm’s capacitance variation into voltage, an approach based on the buffered input conversion stage biased by a voltage divider was used. the advantage of this approach is that the voltage divider formed by mos transistors can be connected to the high-impedance node (i.e. the output of the mcm, in this case). the whole frontend part of the readout interface was designed in a standard 0.35 µm cmos technology. finally, the achieved results are discussed and compared to other works. key words: readout interface, mems microphones, analog front-end, preamplifier
received august 29, 2014; received in revised form december 4, 2014. corresponding author: daniel arbet, department of ic design and test, slovak university of technology, ilkovicova 3, 812 19 bratislava, slovakia (e-mail: daniel.arbet@stuba.sk)
1. introduction
mems (micro-electro-mechanical-system) microphones are commonly used devices in portable electronic systems, because they offer miniaturization and integration of the whole system on a single chip. for such integration, cmos technology is rather advantageous thanks to its good compatibility with the mems manufacturing process and its relatively low price. mcms are acoustic sensors proposed to improve the integration and cost of acoustic systems by employing the features of advanced mems technologies. even though the well-known electret-condenser microphones (ecm) still represent the current market solution for most acoustic applications, mcms are considered the future choice for mobile phones, consumer electronics and a number of medical applications, e.g. hearing aids [1]. with the fast development of mems technologies, the field of possible applications for mems sensors is also getting wider.
mcms offer several advantages over the classical ecms. first of all, they are smaller in size, compatible with high-temperature automated printed circuit board (pcb) mounting process and less susceptible to mechanical shocks. furthermore, the possibility of monolithic integration of the sound sensor with cmos electronics is another major advantage towards a robust and cost-effective system, enabling both the electrical and mechanical properties of silicon. to convert the output of a mcm (capacity variations in order of ff-pf) into an appropriate electrical signal representation, dedicated circuitry, either analog or digital so-called readout interface is required [2]–[5]. this necessitates a low-noise signal conversion provided by the readout interface. additionally, in recent applications, lowpower profile is required in order to provide the system portability. in our research, a mcm-based application specific integrated circuit (asic) designed as a part of a portable low-cost noise dosimeter for very noisy and harsh environment is targeted. therefore, the main goal is to develop a novel readout circuit schemes suitable for such applications. thus, the most significant features of the developed asic are: low power, low cost and mass production aspects. moreover, a wide dynamic range of the developed analog frontend is one of the major priorities as well. thus, this paper presents an approach based on the buffered conversion input stage biased by a voltage divider that is formed using diode-connected mos transistors. a mos transistor of small size exhibits high resistance and therefore, such a voltage divider can be connected to high impedance node i.e. the microphone output. this solution ensures reliable dc bias voltage for an input buffer of the readout circuit. moreover, with high-resistance mos transistors, lowfrequency pole of the mcm with a small value of the nominal capacitance can be easily set. 2. 
preliminary work
a microphone is a device that converts sound into an electrical signal. a typical structure of a mems capacitive microphone is shown in fig. 1. acoustic sound pressure incident on the diaphragm (membrane) causes capacitance changes of the structure, which are then transformed into an electrical signal by the readout circuit. the mcm has a diaphragm and a cavity like some other mems microphones. however, compared to other types, mcms have a fixed and porous backplate that is separated from the diaphragm by an air gap. the backplate holes are used to tune the bandwidth and resonance frequency of the microphone [6]. generally, capacitive microphones can be classified as electret and condenser types. electret microphones are less used because they are biased with a stable embedded charge and therefore their fabrication is more difficult. condenser microphones are biased by an external voltage source, which makes them easier to use. accordingly, the sensitivity of the capacitive microphone depends on the size of the membrane as well as on the electric field in the air gap that is induced by the external voltage source [1].
readout interface
the main role of the readout interface (ri) is to convert the capacitance changes produced by an mcm into an electrical signal such as voltage or current. in recent years, many readout circuits based on capacitive sensing for mems sensors, along with several modifications, have been proposed [2,4,7-11]. the most important requirements for the mcm readout interface include very high input impedance, low offset and low noise. on the other hand, the design of the readout circuitry strongly depends on the features of the mcm, but it is also determined by the end application. therefore, it is not easy to arrive at an optimum readout circuit design that fulfills all the requirements.
in general, there are three basic approaches (based on a preamplifier) to readout circuit topology, depending on the method used for the microphone capacitance conversion: the constant-voltage, constant-charge and force-feedback approaches [12]. in the continuous-time constant-voltage approach, changes in the capacitance of the mcm caused by the acoustic input pressure create an ac current that can be sensed using a preamplifier with extremely high input impedance, the so-called transimpedance amplifier (tia) [8]. in this approach, the microphone output node is held at a fixed potential (termed the dc component of the sensor), which mitigates the influence of parasitic capacitances. the charge change (δq) is thus transformed by a charge preamplifier into an output voltage signal. the dc biasing of the preamplifier input in the constant-voltage approach is not so critical, and the pole of the high-pass filter can also be placed more accurately [12, 17]. in [9], this approach was supported by floating-gate circuit techniques to adapt the charge and improve snr. the constant-voltage approach can also be implemented in discrete-time form, so switched-capacitor (sc) circuits can be used to implement a readout circuit for mcms [2,8]. using sc circuits, a robust dc bias in the sensing node can be achieved and the influence of the mcm parasitic capacitance can be reduced. low input offset can be achieved by employing offset reduction techniques used for sc circuits, such as autozero, correlated double sampling or chopper stabilization [18-20]. however, the main disadvantages of an sc readout circuit are the thermal noise of the switches, the kt/c noise caused by the sampling capacitors, and the noise produced by the sampling process itself. better noise performance can be achieved with the constant-charge approach based on an impedance conversion buffer, where the capacitance change is converted into a voltage signal by proper biasing of the mcm.
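the kt/c noise mentioned above for switched-capacitor readouts can be quantified with the standard relation v_rms = sqrt(kT/C). the following is a minimal sketch; the capacitor values and the 300 k temperature are assumptions chosen only to show the order of magnitude, not values from the paper.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def ktc_noise_vrms(capacitance_f: float, temperature_k: float = 300.0) -> float:
    """RMS voltage noise sampled onto a capacitor: sqrt(k*T/C)."""
    return math.sqrt(K_B * temperature_k / capacitance_f)


# Hypothetical sampling capacitors, chosen only for illustration
for cap_f in (0.1e-12, 1e-12, 10e-12):
    noise_uv = 1e6 * ktc_noise_vrms(cap_f)
    print(f"C = {cap_f * 1e12:5.1f} pF -> kT/C noise = {noise_uv:6.1f} uV rms")
```

since the noise voltage scales with the inverse square root of the capacitance, halving it requires a sampling capacitor four times larger, which is one reason the sampling capacitors dominate area in low-noise sc readouts.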
however, in this approach, the high impedance of the preamplifier dramatically influences the frequency band and stray impedance of the mcm and also attenuates the microphone sensitivity. nevertheless, the effect of interconnects and parasitic capacitances can be reduced by bootstrapping [12-13]. preamplifier offset is also a critical feature of this approach. despite these disadvantages, the constant-charge approach still remains popular [14-16]. however, the key challenge in this technique is setting the dc bias voltage at the mcm output, which is a very high-impedance node. the approaches presented in [21-22] use diodes and a unity-gain ota (operational transconductance amplifier) for dc biasing and for current-to-voltage conversion. the force-feedback approach has been commonly used to minimize the impact of mechanical imperfections and inherent non-linearities in mems capacitive sensors through closed-loop bias voltage tuning [14, 23]. a feedback loop can be successfully exploited for offset cancellation and dynamic range enhancement [24-25]. the reset noise reduction schemes [26] also include a feedback loop that either cancels the reset noise, reduces the noise bandwidth, or controls the reset process itself. an interesting concept was presented in [27], where electro-mechanical feedback incorporates a triangle voltage wave generator. in combination with the constant-voltage approach and a transimpedance amplifier, high dynamic range was achieved. however, the power consumption, sensitivity, range of capacitance variation and lower limit of the sense capacitance make this readout design impractical for mems microphones. this concept belongs to the semi-digital methods that encode the information either in the time (pulse-width modulation) or frequency (pulse-frequency modulation) domain. therefore, its main advantages include broad dynamic range, low power consumption and robustness against environmental noise.
however, the vast majority of realizations do not meet the readout circuit requirements for mems microphones [22-27]. this paper presents a novel constant-charge approach, where a voltage divider based on diode-connected mos transistors is used for dc biasing of the input conversion lna (low noise amplifier) buffer. the advantage of this approach is that the high-resistance voltage divider does not affect the high output impedance of the mcm, and the pole of the high-pass filter can be easily placed.
3. proposed readout frontend circuit
3.1. the whole ri concept
the principal scheme of the proposed ri is shown in fig. 1. the proposed readout approach is based on the buffered conversion input stage. the whole ri was designed in a standard 0.35 µm cmos technology with a supply voltage of 3 v. generally, the readout interface for mcms can be divided into two main parts: the frontend and the backend. in our case, the frontend consists of the impedance conversion buffer (input conversion block), preamplifier and filters, while the sigma-delta modulator, which converts the output of the frontend circuitry into digital form, represents the backend of the ri.
fig. 1 concept of the proposed ri
in this paper, we present the frontend part of the proposed ri. because of the high output impedance of the mcm, a buffer with low output impedance has been connected to the microphone’s output. the capacitance variation at the mcm output is converted into voltage using a so-called sensing element that is followed by the input buffer. the sensing element is also used to provide the dc bias voltage for the input buffer. in general, the input buffer can be implemented using a source follower, a common-gate amplifier or an operational amplifier (opamp) based voltage follower. in our case, a simple two-stage opamp connected as a voltage follower was used.
usually, the sensing element requires high resistance and therefore it is implemented by a diode-connected mos transistor or by a reverse-biased diode. the sensing element is the most critical part of the ri because several significant parameters of the microphone (e.g. sensitivity, low corner frequency, dc bias voltage, etc.) can be affected by its properties. the output signal from the input buffer is then converted into differential form to improve noise immunity, and the differential signal is subsequently amplified. since the low corner frequency of the ri is set to 30 hz, two external (off-chip) capacitors (with capacitance in the order of hundreds of nf) are employed to implement the high-pass filter (hpf) with a slope of 20 db per decade. the active low-pass filter (lpf) with a slope of 40 db per decade is realized completely on chip. the output signal from the lpf is then buffered and processed by the sigma-delta modulator.
3.2. input conversion block
the most important building block of the proposed ri is the input conversion block. its main role is to convert the capacitance variations into voltage changes. the transistor-level schematic of the designed input conversion block as well as the sizes of the transistors used in the input buffer are depicted in fig. 2.
fig. 2 schematic diagram of the input conversion block
the proposed readout approach is based on the buffered input conversion stage, where the main challenge is how to provide the dc bias voltage for the input buffer. since the output impedance of the mcm is very high, another important task is to keep the input impedance of the input conversion block as high as possible. to achieve this, in our case the dc bias voltage is provided by a voltage divider implemented with diode-connected mos transistors. this represents a rather important advantage of the proposed readout approach. to achieve a high resistance of the mos transistor, the w/l ratio should be lower than 1. the main requirements for the input buffer are low voltage noise and low input offset. a two-stage opamp topology has been used for the realization of the input lna buffer (fig. 2). in order to increase the common-mode input range, the bulk of each mos transistor in the input differential pair was shorted to its source. in this way, the body effect of those transistors is eliminated. the bias voltage (vbias) is generated on chip using a mirrored reference current. the design of the lna buffer has been optimized through proper transistor sizing to obtain low noise and low input offset voltage. nevertheless, from the point of view of the whole readout frontend circuit, the input offset voltage of the lna is not a critical parameter because the dc voltage is removed by the external coupling capacitors c1 and c2 (fig. 1). since the input capacitance of the input buffer depends on the gate-source capacitance (cgs) of the input transistors used in the buffer, the sizes of the input transistors have to be optimized in order to reduce cgs. to reduce the input noise caused by the mcm, capacitors cp1 and cp2 were connected in parallel with the diode-connected mos transistors (fig. 3). on the other hand, capacitances cp1 and cp2 together with the capacitance of the microphone form a capacitive divider, so the sensitivity of the mcm might be influenced. therefore, another critical issue is minimizing the impact of capacitances cp1 and cp2, which can negatively affect the microphone sensitivity. the small-signal equivalent circuit of the input conversion block is depicted in fig.
3, where $C_{mcm}$ is the nominal capacitance of the microphone, $C_{in\_buff}$ is the input capacitance of the input buffer and $R_{ds}$ represents the resistance of the respective diode-connected mos transistor.

fig. 3 small-signal equivalent circuit of the input conversion block

the transfer function of the proposed block is expressed by eq. 1.

$$A(s) = \frac{C_{mcm}}{C_{mcm}+C_{tot}} \times \frac{1}{1+\dfrac{C_{mcm}}{C_{mcm}+C_{tot}} \times \dfrac{1}{sC_{mcm}R_{ds}}} \qquad (1)$$

capacitance $C_{tot}$ represents the sum of capacitances $C_{p1}$, $C_{p2}$ and $C_{in\_buff}$. the transfer function can be rewritten as:

$$A(s) = k \times \frac{1}{1+k \times \dfrac{1}{sC_{mcm}R_{ds}}} \qquad (2)$$

where

$$k = \frac{C_{mcm}}{C_{mcm}+C_{tot}} \qquad (3)$$

from the transfer function, one can express the gain and the corner frequency of the inverted pole. as can be seen from eq. 2, the gain of the transfer function is multiplied by the term $k$. thus, the sensitivity of the microphone depends on the ratio of capacitances $C_{mcm}$ and $C_{tot}$, expressed by eq. 3. hence, in terms of the requirements for the mcm sensitivity, the values of capacitors $C_{p1}$ and $C_{p2}$ should be at least 10 times lower than the nominal capacitance $C_{mcm}$ of the microphone. under this condition, the original value of the mcm sensitivity is maintained. the low frequency corner, formed by the input conversion block, can be expressed as:

$$f_{L\_3dB} = \frac{1}{2\pi R_{ds}\,(C_{mcm}+C_{tot})} \qquad (4)$$

from eq. 4, one can observe that capacitors $C_{p1}$ and $C_{p2}$ might cause an undesired shift of the low corner frequency of the readout interface. therefore, to maintain the original value of the low corner frequency, the values of capacitors $C_{p1}$ and $C_{p2}$ should be chosen appropriately. to conclude, we would like to underline that capacitors $C_{p1}$ and $C_{p2}$ can be used for noise reduction, but in order to maintain the original value of the mcm sensitivity and the low corner frequency of the readout circuit, their values must be carefully selected.

4.
results and discussion

to evaluate the designed ri, the main parameters were simulated in the cadence design environment. fig. 4 shows the boundary frequency responses of the proposed readout circuit obtained from corner analysis, where process and temperature variations were taken into account. the low corner frequency varies from 19.2 hz to 33.1 hz, while the high corner frequency stays in the range from 7.4 khz to 15.2 khz. the obtained gain is in the range from 24.6 db to 26.1 db. the slope of the lpf and hpf is 10 db/octave and 18 db/octave, respectively. however, the shape of the final frequency response mainly depends on the frequency response of the mems microphone itself.

another parameter, which expresses the linearity of the ri circuit, is the total harmonic distortion (thd), which depends on the amplitude of the input signal. fig. 5 shows the dependence of the thd on the input amplitude. it can be observed that the thd is lower than 0.1% for input amplitudes in the range from 100 µv to 70 mv. for input amplitudes over 70 mv, the thd rapidly increases, and as the input amplitude reaches 100 mv the thd of the whole ri is about 0.6%.

fig. 4 frequency response of the ri circuit (gain [db] versus frequency [hz], min and max curves)

the linearity of the ri can also be observed from fig. 5, where the dependence of the output amplitude on the input amplitude is shown. since the input range of the sigma-delta modulator is from 0 v to 2 v, one can observe that in the whole considered range, the proposed ri exhibits good linearity.

fig. 5 output amplitude and thd versus the input amplitude (output amplitude [v] and thd [%] versus input amplitude [v])

the main achieved parameters of the designed readout frontend circuit, obtained from schematic as well as postlayout simulations, are summarized in tab. 1. the layout of the proposed readout frontend circuit and a comparison of the frequency responses obtained from schematic and postlayout simulations are depicted in fig. 6a and fig. 6b, respectively. in fig. 6a, the input conversion block is marked. from tab. 1 and fig. 6b, one can observe that the biggest change in the frequency response occurs at higher frequencies. a high frequency corner of 9.66 khz was achieved, which is about 10% lower than the schematic simulation result. the other parameters either change slightly or maintain their original values.

table 1 main parameters of the proposed readout frontend circuit

| parameter | conditions | typical | postlayout | unit |
|---|---|---|---|---|
| maximum gain | | 25.75 | 25.7 | db |
| frequency response | low frequency -3 db point | 25.75 | 25.6 | hz |
| | high frequency -3 db point | 10.66 | 9.66 | khz |
| noise | input noise @ 1 khz | 0.293 | 0.27 | µv/√hz |
| | output noise @ 1 khz | 5.66 | 5.24 | µv/√hz |
| thd | | 0.099 | 0.096 | % |
| signal to noise (s/n) | | 68.76 | 68.31 | db |
| dynamic range (dr) | | 110.6 | 110.7 | db |
| idd consumption of selected block | input conversion block | 50 | 50 | µa |
| | single-ended to differential | 165 | 165 | µa |
| | 2x amp | 550 | 550 | µa |
| | filter | 820 | 820 | µa |
| total idd consumption | vdd = 3 v | 1.56 | 1.56 | ma |

fig. 6 ri layout and post-layout simulation results: a) layout of the readout frontend, b) postlayout vs schematic simulation results (gain [db] versus frequency [hz])

a comparison of the achieved results to other works is presented in tab. 2.
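two of the figures quoted above can be cross-checked with simple arithmetic. the sketch below is only an illustration: it recomputes the supply power from the total idd and vdd listed in tab. 1, and the relative shift of the high -3 db corner between the schematic and postlayout simulations. the function and variable names are ours and do not come from the original design flow.

```python
# Cross-check of values listed in tab. 1 (all names are illustrative).
VDD_V = 3.0                     # supply voltage from tab. 1
IDD_TOTAL_MA = 1.56             # total supply current from tab. 1
F_HIGH_SCHEMATIC_KHZ = 10.66    # high -3 dB corner, schematic simulation
F_HIGH_POSTLAYOUT_KHZ = 9.66    # high -3 dB corner, postlayout simulation


def supply_power_mw(vdd_v: float, idd_ma: float) -> float:
    """Static supply power in mW for a supply in volts and a current in mA."""
    return vdd_v * idd_ma


def relative_drop_percent(reference: float, value: float) -> float:
    """Relative decrease of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


power_mw = supply_power_mw(VDD_V, IDD_TOTAL_MA)
corner_drop = relative_drop_percent(F_HIGH_SCHEMATIC_KHZ, F_HIGH_POSTLAYOUT_KHZ)

print(f"supply power: {power_mw:.2f} mW")
print(f"high corner drop after layout: {corner_drop:.1f} %")
```

the power obtained this way matches the 4.68 mw listed for this work in tab. 2, and the computed drop of the high corner is consistent with the "about 10%" quoted in the text.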
table 2 comparison of the achieved results

| | [27] | [29] | [30] | [31] | this work |
|---|---|---|---|---|---|
| cmos process | 0.35 μm | 0.35 μm | 0.8 μm | 0.8 μm | 0.35 μm |
| supply | ± 1.65 v | 3.3 v | 5 v | 5 v | 3 v |
| power | 7.9 mw | n/a | 8.38 mw | 0.56 mw | 4.68 mw |
| area | 0.24 mm² | 12 mm² | 1.21 mm² | 0.22 mm² | 0.29 mm² |
| peak sndr/dr | 91.7 db | n/a | 60 db | n/a | 110.66 db |
| noise floor | n/a | n/a | n/a | 0.25 af/√hz (500 hz) | 0.09 af/√hz (1 khz) |
| adjustable gain | yes (by drive current) | no | no | no | no |
| sensitivity | 0.1 vnorm/1 pf | 38 μsec/pf | 9980 mv/ff | 12.42 mv/ff | 62.6 mv/ff |
| output swing | n/a | n/a | n/a | n/a | se: 1.9 v, de: 3.8 v |
| approach | cv with ac drive current | semi-digital (pwm) | sc | sc | cc |

sc: switching capacitors, se: single ended, de: differential ended, dr: dynamic range, cv: constant voltage, cc: constant charge

nevertheless, it is important to note that the other works are based on different approaches, so the comparison might not be fully relevant. thus, we also specify the approach that the respective design employs. as can be observed, in our case, the best result was achieved for the dynamic range (dr), where 110 db is achieved, as required by the target application. moreover, the sensitivity and input noise as well as the noise floor obtained by this approach are better than those presented in [31]. the proposed readout circuit requires a smaller chip area than the approaches presented in [29] and [30], while its area is 18% and 23% higher than in [27] and [31], respectively. finally, in comparison to other works, the most important improvements and features of the presented readout frontend circuitry can be summarized as follows:

• broader dynamic range
• better noise performance
• low power consumption

however, these features are achieved at the cost of an area overhead (about 20% with respect to the works presented in [27] and [31]).

5. conclusion

the frontend part of the readout interface for mcm was proposed and designed in a standard 0.35 µm cmos technology. the achieved results show that the proposed approach brings a high dynamic range and very good noise performance, which are the most important parameters required by an mcm-based noise dosimeter meant to be used in a very harsh environment. currently, the developed readout interface is being fabricated. in future research, the prototype chips will be evaluated, and possible modifications of the ri design towards further improvements are expected.

acknowledgement. this work was supported in part by the ec under fp7 ict project smac (288827), eniac ju under project e2sg (296131), and by the slovak republic under grant vega 1/0823/13.

references

[1] g. w. elko, f. pardo, d. lópez, d. bishop, and p. gammel, "capacitive mems microphones", bell labs technical journal, vol. 10, no. 3, pp. 187–198, 2005.
[2] s. a. jawed, d. cattin, m. gottardi, n. massari, r. oboe, and a. baschirotto, "a low-power interface for the readout and motion-control of a mems capacitive sensor", in 10th ieee international workshop on advanced motion control, amc 2008, pp. 122–125.
[3] c.-t. chiang, w.-c. chou, j.-c. tsai, and h.-l. lee, "a cmos readout circuit with frequency optimization for microphone sensor arrays", in proceedings of the international symposium on computer communication control and automation, 3ca 2010, vol. 2, pp. 249–252.
[4] j.-t. huang, k.-s. chen, and c.-c. chien, "a differential capacitive sensing circuit for micro-machined omnidirectional microphone", in proceedings of the ieee international conference on nano/micro engineered and molecular systems, nems 2011, pp. 948–951.
[5] j. 
van den boom, "a 50 µw biasing feedback loop with 6 ms settling time for a mems microphone with digital output", in digest of technical papers, the ieee international solid-state circuits conference, isscc 2012, 2012, pp. 200–202.
[6] brüel and kjaer, "microphone handbook", vol. 1 theory. nærum, denmark: brüel and kjaer, 1996.
[7] t. yin, h. wu, q. wu, h. yang, and j. jiao, "a tia-based readout circuit with temperature compensation for mems capacitive gyroscope", in proceedings of the ieee international conference on nano/micro engineered and molecular systems, nems 2011, 20-23 feb. 2011, pp. 401–405.
[8] s. a. jawed, m. gottardi, and a. baschirotto, "a switched capacitor interface for a capacitive microphone", in research in microelectronics and electronics 2006, ph. d., pp. 385–388.
[9] s.-y. peng, m. s. qureshi, p. e. hasler, a. basu, and f. l. degertekin, "a charge-based low-power high-snr capacitive sensing interface circuit", ieee transactions on circuits and systems i: regular papers, vol. 55, no. 7, pp. 1863–1872, aug. 2008.
[10] w. jiangfeng, g. k. fedder, and l. r. carley, "a low-noise low-offset capacitive sensing amplifier for a 50 μg/√hz monolithic cmos mems accelerometer", ieee journal of solid-state circuits, vol. 39, no. 5, pp. 722–730, may 2004.
[11] m. pastre and m. kayal, methodology for the digital calibration of analog circuits and systems: with case studies, 1st ed. springer publishing company, incorporated, 2009.
[12] a. van roermund, a. baschirotto, and m. steyaert, nyquist ad converters, sensor interfaces, and robustness: advances in analog circuit design, 2012, ser. springer link: bücher. springer, 2012.
[13] l. baxter, capacitive sensors: design and applications, ser. ieee press series on electronics technology. john wiley & sons, 1996.
[14] s. a. jawed, d. cattin, m. gottardi, n. massari, a. baschirotto, and a. 
simoni, "a 828 μw 1.8 v 80 db dynamic-range readout interface for a mems capacitive microphone", in proceedings of the 34th european solid-state circuits conference, esscirc 2008, sept 2008, pp. 442–445.
[15] i. deligoz, s. naqvi, t. copani, s. kiaei, b. bakkaloglu, s.-s. je, and j. chae, "a mems-based power-scalable hearing aid analog front end", ieee transactions on biomedical circuits and systems, vol. 5, no. 3, pp. 201–213, june 2011.
[16] s. a. jawed, j. nielsen, m. gottardi, a. baschirotto, and e. bruun, "a multifunction low-power preamplifier for mems capacitive microphones", in proceedings of esscirc, esscirc 2009, sept 2009, pp. 292–295.
[17] c. furst, "a low-noise/low-power preamplifier for capacitive microphones", in proceedings of the ieee international symposium on circuits and systems connecting the world, iscas 1996, vol. 1, may 1996, pp. 477–480.
[18] c. enz, e. vittoz, and f. krummenacher, "a cmos chopper amplifier", ieee journal of solid-state circuits, vol. 22, no. 3, pp. 335–342, jun 1987.
[19] c. enz and g. temes, "circuit techniques for reducing the effects of op-amp imperfections: autozeroing, correlated double sampling, and chopper stabilization", proceedings of the ieee, vol. 84, no. 11, pp. 1584–1614, nov 1996.
[20] m. white, d. lampe, f. blaha, and i. mack, "characterization of surface channel ccd image arrays at low light levels", ieee journal of solid-state circuits, vol. 9, no. 1, pp. 1–12, feb 1974.
[21] m. pederson, w. olthuis, and p. bergveld, "high-performance condenser microphone with fully integrated cmos amplifier and dc-dc voltage converter", journal of microelectromechanical systems, vol. 7, no. 4, pp. 387–394, dec 1998.
[22] c. e. furst, "a low-noise/low-power preamplifier for capacitive microphones", in proceedings of the ieee international symposium on circuits and systems connecting the world, iscas 1996, 12-15 may 1996, vol. 1, pp. 477–480.
[23] s. a. jawed, d. cattin, n. massari, m. 
gottardi, and a. baschirotto, "a mems microphone interface with force-balancing and charge-control", in research in microelectronics and electronics, 2008. prime 2008. ph.d., june 2008, pp. 97–100. [24] t. ting zhang, h.-j. li, j.-q. huang, m. zhao, l.-c. hong, y.-c. zhang, w.-g. lu, and z.-j. chen, "an offset-compensated switched-capacitor interface circuit for closed-loop mems capacitive accelerometer", in proceedings on the ieee 11th international conference on solid-state and integrated circuit technology, icsict 2012, oct 2012, pp. 1–3. [25] l. dong-hyuk, l. sang-yoon, c. woo-seok, p. jun-eun, and d.-k. j., "a digital readout ic with digital offset canceller for capacitive sensors", journal of semiconductor technology and science, vol. 12, no. 3, pp. 278–285, 2012. [26] b. fowler, m. godfrey, and s. mims, "reset noise reduction in capacitive sensors", ieee transactions on circuits and systems i: regular papers, vol. 53, no. 8, pp. 1658–1669, aug 2006. [27] f. aezinia and b. bahreyni, "a readout circuit with wide dynamic range for differential capacitive sensing applications", in proceedings 26th annual ieee canadian conference on electrical and computer engineering, ccece 2013, may 2013, pp. 1–4. [28] j.-l. lu, m. inerowicz, s. joo, j.-k. kwon, and b. jung, "a low-power, wide-dynamic-range semidigital universal sensor readout circuit using pulsewidth modulation", ieee sensors journal, vol. 11, no. 5, pp. 1134–1144, may 2011. [29] m. lee, s. lee, s. jung, c. je, g. hwang, and c. choi, "design, fabrication, and characterization of a readout integrated circuit (roic) for capacitive mems sensors", sensors, 2007 ieee, pp. 260–263. oct 2007. [30] j. shiah, h. rashtian, and s. mirabbasi, "a low-noise parasitic-insensitive switched-capacitor cmos interface circuit for mems capacitive sensors", in proceedings on the ieee 9th international new circuits and systems conference, newcas 2011, june 2011, pp. 470–473. [31] j. shiah and s. 
mirabbasi, "a 5-v 555-μw 0.8-μm cmos mems capacitive sensor interface using correlated level shifting", in proceedings on the ieee international symposium on circuits and systems, iscas 2013, may 2013, pp. 1504–1507.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 429-429 doi: 10.2298/fuee1703429e

corrigendum

tomislav suligoj, marko koričić, josip žilak, hidenori mochizuki, so-ichi morita, katsumi shinomura, hisaya imai, horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications. facta universitatis, series: electronics and energetics (fu elec energ), vol. 28, no 4, december 2015, pp. 507-525. doi: 10.2298/fuee1504507s

the editor-in-chief has been informed that in the article tomislav suligoj, marko koričić, josip žilak, hidenori mochizuki, so-ichi morita, katsumi shinomura, hisaya imai, horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications, facta universitatis, series: electronics and energetics, vol. 28, no 4, 2015, pp. 507-525, doi: 10.2298/fuee1504507s, fig. 15 with its legend has been omitted in the published version of the paper. after further discussion with the corresponding author, the editor-in-chief has decided to publish a corrigendum for this article, providing the figure and legend of fig. 15.

[fig. 15, panel (a): collector and base currents versus base-emitter voltage (v) at v_ce = 2 v, for single-poly and double-emitter hcbt; panel (b): collector current versus collector-emitter voltage (v) for w_hill = 0.36 µm, 0.5 µm and 0.6 µm, and for the single-poly hcbt]

fig. 15 measured dc characteristics of double-emitter (de) hcbt: (a) comparison between the gummel characteristics of de hcbt with n-hill width whill = 0.36 µm; (b) output characteristics of de hcbts with different n-hill widths (whill). the single-polysilicon-region hcbt is added for reference.

link to the corrected article: doi:10.2298/fuee1504507s

received march 2, 2017

facta universitatis series: electronics and energetics vol. 31, no 4, december 2018, pp. 599-612 https://doi.org/10.2298/fuee1804599k

a blind decision feedback equalizer with efficient structure-criterion switching control

vladimir r. krstić, nada bogdanović
“mihajlo pupin” institute, belgrade, serbia

abstract. this paper considers and proposes an innovative method of structure-criterion switching control for the self-optimized blind decision feedback equalizer (dfe) scheme, which operates by switching between adaptation modes according to the mean square error (mse) convergence state. the new switching control shortens the blind acquisition period of the dfe and, consequently, speeds up its effective convergence rate. the switching control is based on a variable switching threshold which combines the commonly used mse estimate of the dfe's output and the a posteriori error of the all-pole whitener performing front-end amplitude equalization during the blind operation mode. the efficiency of the dfe switching control is verified by simulations of a single-carrier system transmitting qam signals over multipath channels.

key words: blind equalization, decision feedback equalizer, maximum joint entropy, operation mode switching control.

1. introduction

in this paper, we address a new method for increasing the convergence rate of the decision feedback equalizer (dfe) scheme, which is based on an improvement of the equalizer's operation mode switching control.
with respect to the earlier version [1], presented at the 5th icetran2017 conference, this paper includes a new set of case studies followed by the most recent simulation results. the convergence rate of blind equalization is, besides its complexity, an issue of the utmost importance from the perspective of its usage in today's communication systems, which continually strive for increased data throughput and frequency efficiency [2]. because of that, the frequency efficiency advantages achieved by removing a training sequence from the system [3], [4] have to be accompanied by an adequate equalization convergence rate if we want to preserve the benefits of blind equalization.

received february 12, 2018; received in revised form april 24, 2018
corresponding author: vladimir r. krstić, university of belgrade, institute „mihajlo pupin“, volgina 15, 11060 beograd, serbia (email: vladimir.krstic@pupin.rs)
*the earlier version of this paper was presented at the 5th international conference icetran2017, kladovo, serbia, june 5-8, 2017 [1].

to reconstruct an unknown source signal, blind equalizers use the higher-order statistics of channel outputs as well as some knowledge of the given signal statistic. in such an environment, the resulting symbol-by-symbol blind algorithms are typically characterized by a relatively low convergence rate and a high residual mean square error (mse) [3], [4] compared to the conventional pilot-trained equalizers employing algorithms based on second-order statistics [5]. as a way to mitigate these drawbacks, a two-step adaptation strategy is commonly used, dividing the blind equalization task between the blind and decision-directed operation modes [4], [6].
at the initial (blind) operation mode, the equalizer adjusts its adaptive parameters to open “eye diagram” enough and then, depending on convergence state, switches adaptation to the decision-directed (dd) operation mode that should guarantee both the successful proceeding of the convergence process and the maximal reduction of the output mse. in such scenario, blind equalizers must be provided by an algorithm estimating some measure of convergence state or signal quality as well as an appropriate performance threshold to decide operation mode switching. this task, as well as blind equalization alone, is not so easy because it depends on unknown system parameters, such as a source signal and channel characteristic. the operation mode switching control based on the online mse estimation of the equalizer’s output and its comparison with in advance selected threshold level is an often used method because of its simplicity [6]. on the other hand, this scheme strongly depends on both the applied mse estimation efficiency and the heuristically selected threshold level according to the given signal statistic and the assumed channel characteristics. an alternative but more complex approach is to join an equalizer’s operation mode switching control with its blind adaptation algorithm aiming at the soft switching scheme [7], [8] which eliminates the above mentioned difficulties and possibly improves equalization performance. in [7], the noise-predictive decision feedback equalizer (dfe) smoothly transforms the equalization process between its two extreme stages: blind linear and dd steady state. for that purpose, the equalizer employs the soft decision device defined by the linear convex function combining identity function (linear) and hard decision (nonlinear) device. in [8], using a similar convex mixing rule, the soft switching blind equalization is considered more generally in the context of linear blind equalization. 
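the convex mixing rule mentioned above can be sketched in a few lines. this is a generic illustration and not the algorithm of [7] or [8]: the mixing weight `lam`, the helper names and the 4-qam slicer are assumptions made for the example, and the way the weight is adapted over time is deliberately left out.

```python
def hard_decision_4qam(z: complex) -> complex:
    """Nearest point of the 4-QAM constellation {+-1 +-1j} (assumed for the example)."""
    re = 1.0 if z.real >= 0 else -1.0
    im = 1.0 if z.imag >= 0 else -1.0
    return complex(re, im)


def convex_mix(a: complex, b: complex, lam: float) -> complex:
    """Convex combination lam * a + (1 - lam) * b, with lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * a + (1.0 - lam) * b


def soft_decision(z: complex, lam: float) -> complex:
    """Soft decision device: lam = 0 gives the identity (linear) function,
    lam = 1 gives the hard (nonlinear) decision."""
    return convex_mix(hard_decision_4qam(z), z, lam)


z = 0.6 - 1.3j
for lam in (0.0, 0.5, 1.0):
    print(lam, soft_decision(z, lam))
```

the same `convex_mix` helper applies when its two inputs are the outputs of two equalizers running in parallel, one adapted blindly and one adapted in the decision-directed mode.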
the soft-switching scheme of [8] combines the outputs of two linear equalizers working in parallel: one adapted blindly and the other adapted using the dd-lms algorithm minimizing the mse. both schemes merge the equalizer's adaptation algorithm and the operation mode control function into one adaptation task that does not need a switching threshold.

in this paper we consider the blind dfe, called soft-dfe [9], using an operation mode switching control scheme based on both the on-line estimation of the mse and a variable switching threshold [1], [10]. the variable threshold is used instead of a fixed one in order to relax the issue of mse threshold level selection and to speed up the equalizer's effective convergence rate, all with minimal computational complexity. besides, these goals are pursued while keeping the error propagation phenomenon [11], a major drawback of blind dfe equalization, under control, which guarantees a high rate of successful equalization.

the paper is organized as follows. section 2 describes the soft-dfe structure-criterion optimization scheme. in section 3 the insufficiency of the existing switching control is addressed, and then the innovated control that combines the variable threshold with online mse estimation is introduced. in section 4 the efficiency of the variable-threshold switching control is verified by simulations.

2. soft-dfe: background and problem definition

2.1. structure-criterion optimization

a simplified baseband model of a single-carrier qam (quadrature amplitude modulated) system with the soft-dfe is presented in fig. 
1, where the in-phase and quadrature components of the complex-valued symbols $\{a_n\}$, generated in time intervals of $T$ seconds, are independent, identically distributed real zero-mean variables with a finite variance and sub-gaussian distribution; the time-invariant channel pulse response $\{h_n\}$ represents the combined effects of the transmitter filter, the channel impulse response and the anti-alias filter at the receiver side; and the noise is a zero-mean white gaussian process independent of the source data. the signal $x(t)$ at the input of the equalizer's feedforward part, given by the fractionally-spaced equalizer (fse), is sampled at the rate $2/T$, and its odd and even samples $x(t_0 + nT + iT/2) = x_{i,n}$, $i = 1, 2$, are alternately shifted to the delay lines of the corresponding fir filters represented by the coefficient vectors $\mathbf{c}_i$.

fig. 1 simplified model of transmission system with dfe (soft-dfe)

the operation of the soft-dfe is based on the principles of the self-optimized dfe scheme [6] which, in order to eliminate the error propagation effects, optimizes both the structure and the cost criteria according to its convergence state. specifically, the soft-dfe optimizes both the filter structure, which includes four fir filters, two in the fff (feedforward) and two in the fbf (feedback) part, and the combination of three cost criteria: joint entropy maximization (jem) [12], constant modulus [13] and minimum mse (mmse) [5]. also, besides the blind and tracking operation modes, which are commonly performed by blind equalizers, a new soft-transition mode has been introduced into the soft-dfe scheme in order to mitigate the error propagation effects caused by a rapid structure-criterion switching from the blind to the decision-directed adaptation mode.
at the beginning of the blind mode, the soft-dfe transforms its structure into a cascade of four linear signal transformers: the gain control (gc), whitener (wt), blind equalizer (te) and phase rotator (pr), operating independently of each other except for the gc-wt pair, fig. 2a. effectively, in the blind mode the soft-dfe acts as a t/2-fse linear equalizer [14], dividing the equalization task between the whitener of the received signal and the te equalizer, where the wt-jem and the te-cm, respectively, perform the channel amplitude and phase equalization. in the next, soft-transition mode the soft-dfe proceeds to adapt the filters combining the mmse and jem criteria, fig. 2b. finally, in the tracking mode, the soft-dfe continues to converge to the mmse steady state using the dd-lms algorithm.

fig. 2 soft-dfe structure-criterion transformation: (a) blind mode and (b) soft-transition mode (sfbf with jem, dotted line) and tracking mode (fbf with dd-lms, solid lines)

the phase rotator pr is realized as a modified variant of the decision-directed phase-locked loop [15] that, using a reduced signal constellation based only on the symbols with the largest energy [16], aims to avoid the catastrophic effects caused by carrier phase estimation on an insufficiently open signal; this is particularly critical for high-order signal constellations such as 64-qam and higher.

2.2. algorithms

in this subsection, the adaptation algorithms used by the soft-dfe are revisited in the order following the operation mode switching.

gain control. the gain control gc is realized as a single-coefficient equalizer [6] which has the task of recovering the power of the source signal.
the gc operation is based on the whitener's outputs $u_{i,n}$, $i = 1, 2$, and given by the recursion

$$G_{i,n+1} = G_{i,n} - \mu_g\left[\,|u_{i,n}|^2 - \sigma_a^2\,\right], \qquad g_{i,n+1} = \sqrt{|G_{i,n+1}|} \qquad (1)$$

where $\mu_g$ is the adaptation step size and $\sigma_a^2$ is the variance of the source symbols $\{a_n\}$.

jem whitening algorithm. the whitener wt of the received signal is realized as an all-pole filter (equalizer) to compensate for the channel amplitude distortion, i.e., to recover the second-order statistic of the given source signal by using the entropy-based jem cost [12]. the corresponding stochastic-gradient jem whitening algorithm (jem-vl) [16] is given by

$$u_{i,n} = x_{i,n} - \mathbf{b}_{i,n}^{T}\,\mathbf{u}_{i,n}, \qquad i = 1, 2 \qquad (2)$$

$$\mathbf{b}_{i,n+1} = \mathbf{b}_{i,n} - \gamma_n \mathbf{b}_{i,n} + \mu_{bb}\, u_{i,n}\left(1 - \beta_w |u_{i,n}|^2\right)\mathbf{u}_{i,n}^{*} \qquad (3)$$

where $\mathbf{u}_{i,n} = [u_{i,n-1}, \ldots, u_{i,n-N}]^{T}$ and $\mathbf{b}_{i,n} = [b_{i,n,1}, \ldots, b_{i,n,N}]^{T}$ are, respectively, the whitener's regression and coefficient vectors, $\gamma_n \ge 0$ is the time-variable leaky factor, $\beta_w$ is the free parameter representing the slope of the employed neuron function, $\mu_{bb}$ is a step size, $N$ is the span of the whitener delay line given in $T$ periods, and the superscripts $T$ and $*$ signify, respectively, the transpose and conjugation. the specific feature of the jem-vl algorithm, besides the slope $\beta_w$ controlling its entropic capability, is its variable leaky factor $\gamma_n$. acting in opposition to the entropy-gradient term, the leaky term $\gamma_n \mathbf{b}_n$ controls the magnitudes of the whitener coefficients, preventing superfluous coefficients from degrading the equalizer convergence process. the undesirable influence of superfluous coefficients is particularly exposed at the time when the equalizer switches from the blind to the decision-directed operation mode.
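one step of the leaky whitener recursions (2) and (3) can be sketched as follows. this is a minimal illustration under stated assumptions rather than the authors' implementation: the sign conventions follow a standard linear-prediction form (the whitener output equals the input minus the prediction from past outputs), the leakage is applied by subtracting `gamma * b`, and the function and parameter names (`whitener_step`, `mu`, `beta`, `gamma`) are hypothetical.

```python
from typing import List, Tuple


def whitener_step(
    x: complex,
    b: List[complex],
    u_past: List[complex],
    mu: float,
    beta: float,
    gamma: float,
) -> Tuple[complex, List[complex], List[complex]]:
    """One update of an all-pole whitener with coefficient leakage.

    x      -- current input sample
    b      -- coefficient vector (length N)
    u_past -- past whitener outputs [u_{n-1}, ..., u_{n-N}]
    mu     -- step size; beta -- neuron slope; gamma -- leaky factor
    """
    # All-pole whitener output: input minus prediction from past outputs.
    u = x - sum(bk * uk for bk, uk in zip(b, u_past))
    # Entropy-gradient factor with the neuron slope beta.
    grad = u * (1.0 - beta * abs(u) ** 2)
    # Leaky coefficient update: the leakage term opposes the gradient term.
    b_next = [bk - gamma * bk + mu * grad * uk.conjugate()
              for bk, uk in zip(b, u_past)]
    # Shift the new output into the regression vector.
    u_next = [u] + u_past[:-1]
    return u, b_next, u_next


u, b_next, u_next = whitener_step(
    x=1 + 0j, b=[0.5 + 0j, 0j], u_past=[1 + 0j, 0j],
    mu=0.1, beta=0.0, gamma=0.01,
)
print(u, b_next, u_next)
```

setting `gamma = 0` in this sketch removes the leakage, which corresponds to the non-leaky variant of the whitening update.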
the adaptation of the leaky factor $\gamma_n$ is based on the analysis of the whitener's a posteriori errors and the heuristic punish/award rule [17], which decides when and how much to increase or decrease the leaky factor. accordingly, the leaky adaptation rule in jem-vl comprises the following three operations: the calculation of the a posteriori errors with ($\gamma_n > 0$) and without ($\gamma_n = 0$) coefficient leakage, decisions when, and decisions how much to increase or decrease the leaky factor. the a posteriori error estimate $e_n^{vl}$ for $\gamma_n > 0$ in jem-vl is given by

$$u_n = x_n - \mathbf{b}_{n+1}^{T}\,\mathbf{u}_n \qquad (4)$$

$$e_n^{vl} = u_n\left(1 - \beta_w |u_n|^2\right) \qquad (5)$$

and the corresponding a posteriori error estimate $e_n^{w}$ for $\gamma_n = 0$ in (3) (which corresponds to the original whitening algorithm jem-w [9]) is given by

$$\mathbf{b}_{n+1} = \mathbf{b}_n + \mu_{bb}\, u_n\left(1 - \beta_w |u_n|^2\right)\mathbf{u}_n^{*} \qquad (6)$$

$$u_n = x_n - \mathbf{b}_{n+1}^{T}\,\mathbf{u}_n \qquad (7)$$

$$e_n^{w} = u_n\left(1 - \beta_w |u_n|^2\right) \qquad (8)$$

it should be noted that the a posteriori errors, given in (5) and in (8), are obtained using the same current value of the whitener input $x_n$; in the above recursions the indexing $i = 1, 2$ is omitted for simplicity. in the next step, based on the comparison of the achieved a posteriori errors, the "if-else" relation

if $|e_n^{vl}| > |e_n^{w}|$ then set $m_{n+1} = \max(m_n - l_d,\, 0)$, else set $m_{n+1} = \min(m_n + l_u,\, M)$, end if (9)

decides when to decrease or to increase the leaky factor and, finally, the quantized function

$$\gamma_n = f(m_n) = \gamma_{\max}\,(m_n / M) \qquad (10)$$

estimates how much to decrease or to increase the leaky factor, employing the parameters $(m_0, l_d, l_u, M)$ and $\gamma_{\max}$, where $m_n = 0, \ldots, M$ is an independent variable.

cma algorithm. the constant modulus algorithm (cma) is realized in its commonly used variant for the dispersion function of order $p = 2$ [13]

$$y'_{i,n} = \mathbf{c}_{i,n}^{T}\,\mathbf{u}_{i,n}, \qquad y_n = \sum_{i=1}^{2} y'_{i,n} \qquad (11)$$

$$\mathbf{c}_{i,n+1} = \mathbf{c}_{i,n} - \mu_{fb}\, \mathbf{u}_{i,n}^{*}\, y'_{i,n}\left(|y'_{i,n}|^2 - R_c\right), \qquad R_c = \frac{E\{|a_n|^4\}}{E\{|a_n|^2\}} \qquad (12)$$

where $\mathbf{c}_{i,n} = [c_{i,0}, \ldots, c_{i,M-1}]^{T}$ is the coefficient vector of the fff, $\mu_{fb}$ is an adaptation step size and the constant $R_c$ is the kurtosis of the source signal, which represents a distance measure of the source probability density function (pdf) from normality [18]. assuming the amplitude equalization is done efficiently by the gc-wt pair, the t/2-fse-cma has the task of equalizing the channel phase distortion by retrieving the kurtosis statistic of the source signal [19].

soft jem algorithm. the performance of the soft-dfe in the soft-transition mode is characterized by the behaviour of the sfbf equalizer, which operates between the original soft fbf equalizer maximizing the joint entropy of the neuron outputs [9] and a hard dd fbf equalizer suffering from incorrect decisions $\hat{a}_n$. the operation of the sfbf is described by the following relations

$$y_n = \mathbf{c}_n^{T}\,\mathbf{u}_n \exp(-j\hat{\varphi}_n) \qquad (13)$$

$$z_n = y_n - \mathbf{b}_n^{T}\,\hat{\mathbf{a}}_n \qquad (14)$$

$$\mathbf{b}_{n+1} = \mathbf{b}_n + \mu_{bs}\, z_n\left(1 - \beta_d |z_n|^2\right)\hat{\mathbf{a}}_n^{*} \qquad (15)$$

where $\hat{\varphi}_n$ is a carrier phase estimate, $\hat{\mathbf{a}}_n = [\hat{a}_{n-1}, \ldots, \hat{a}_{n-N}]^{T}$ is the vector of previously detected symbols, $\mu_{bs}$ is a step size and $\beta_d$ is the neuron slope, which is determined by the given source statistic [20].

tracking mode. in the tracking mode, the soft-dfe approaches the mmse steady state and continues to follow slow channel variations using dd-lms algorithms in both its fff and fbf parts, jointly optimizing the mmse criterion given by

$$J_{MMSE}(\mathbf{c}_n, \mathbf{b}_n, \hat{\varphi}_n) = E\left\{|z_n - \hat{a}_n|^2\right\} \qquad (16)$$

it should be noted that although the soft-dfe strives to reach the global mmse solution, local solutions cannot be avoided completely because the soft-dfe's final convergence state depends on the local $J_{JEM}(\mathbf{b})$ and $J_{CM}(\mathbf{c})$ criteria.

3. 
switching control with variable threshold

the soft-dfe controls the convergence state using the mse monitor which estimates online the output mse and, according to the a priori selected mse threshold levels (tl), switches the structure and adaptation criterion through three operation modes. to switch from the blind to soft-transition mode and from the soft-transition to tracking mode, the monitor, respectively, compares the estimated mse with tl1 and tl2 thresholds. also, to switch the pr operation between a reduced and full signal constellation, the mse is compared with threshold tl3. since the latter indicates the signal constellation opening, it is also utilized as a measure of equalization successfulness, given by the equalization success index (esi), which is defined by the ratio of the number of successful equalizations and the total number of monte carlo runs. thus, the soft-dfe controls its convergence process completely by mse thresholds satisfying the relation tl1 > tl2 > tl3.

3.1. mse switching control

the online estimation of the mse in the blind mode is given by the relation

$$\mathrm{MSE}_{b,n} = \lambda\, \mathrm{MSE}_{b,n-1} + (1 - \lambda)\left(|y_n| - R_c\right)^2 \tag{17}$$

where the forgetting factor $\lambda > 0$ regulates the quality of the estimation process, and typically takes values a little less than 1.0. the same mse estimation principle is also used during the subsequent soft-transition and tracking modes provided that the error $(z_n - \hat{a}_n)$ is substituted for the error $(|y_n| - R_c)$ in (17). the quality of the $\mathrm{MSE}_{b,n}$ estimate obtained by (17) suffers from several weaknesses. firstly, the $\mathrm{MSE}_{b,n}$ is a crude estimate of the mse for all non-constant modulus qam signals (except for 4-qam) because the term $(|y_n| - R_c)$ on the right-hand side of (17) is not a real error but a dispersion measure of the modulus of symbol estimates with respect to the constant $R_c$.
secondly, the $\mathrm{MSE}_{b,n}$ estimate aggregates the mse affected by the cascaded gc-wt-te (see fig. 2a) with the dominant influence of the te-cma algorithm, which is based on the fourth-order statistic represented by the constant $R_c$ (12). in other words, the estimate $\mathrm{MSE}_{b,n}$ relies mostly on the outlier-sensitive kurtosis statistic [18], neglecting the second-order statistic being reconstructed by the wt-jem.

to illustrate the soft-dfe convergence behaviour controlled by the $\mathrm{MSE}_{b,n}$ estimator, we have presented in fig. 3 the results of the convergence tests carried out for three different heuristically selected thresholds $\mathrm{MSE}_{TF} \equiv$ tl1 using the system in fig. 1 with the 64-qam signal and the mp-e channel; see the channel amplitude in fig. 5 in the next section. if we suppose that the optimal mean square error $\mathrm{MSE}_{b,opt}$ is achievable during the blind mode once the equalizer's coefficients reach the optimal setup, then three typical equalization scenarios are possible: 1) for $\mathrm{MSE}_{TF} \approx \mathrm{MSE}_{b,opt}$ the equalizer successfully switches operation from the blind to the dd operation mode, 2) for $\mathrm{MSE}_{TF} < \mathrm{MSE}_{b,opt}$ the equalizer stays longer in the blind mode than in case 1) or, possibly, it never reaches the soft-transition mode and the equalization ends in failure and, finally, 3) for $\mathrm{MSE}_{TF} > \mathrm{MSE}_{b,opt}$ the equalizer switches operation to the dd mode faster than in case 1) but the mmse steady-state performance is not guaranteed, and even some pathological states are possible. as can be seen from the presented convergence curves, the threshold tl1 = 8.02 db is selected as the best threshold.

fig. 3 mse convergence curves obtained for three different fixed thresholds tl1; soft-dfe single run test for 64-qam signal and mp-e channel

3.2.
variable threshold

in order to compensate for the insufficiency of the $\mathrm{MSE}_{b,n}$ estimation given by (17), we have combined a fixed threshold $\mathrm{MSE}_{TL}$, generally different from the threshold $\mathrm{MSE}_{TF}$, with the whitener's a posteriori errors $e_{i,n+1}^{vl}$, introducing in such a way the variable threshold $\mathrm{MSE}_{TLV}$ [10]

$$\mathrm{MSE}_{TLV} = \mathrm{MSE}_{TL} - s\left(|e_{1,n+1}^{vl}| + |e_{2,n+1}^{vl}|\right) \tag{18}$$

which includes two terms, the fixed threshold $\mathrm{MSE}_{TL}$ and the variable term $s\left(|e_{1,n+1}^{vl}| + |e_{2,n+1}^{vl}|\right)$, where $s$ is a small positive scale factor. it is worth noting that the scaling factor $s$ should be selected through the analysis of the ratio between the sum of a posteriori errors and the $\mathrm{MSE}_{TL}$ threshold. the first verifications of the variable threshold model have proved its efficiency for $s$ values that scale the a posteriori term down to a level comparable with the $\mathrm{MSE}_{TL}$ term. the full exploration of the variable threshold usage and its limitations needs to be a subject of further study.

the above innovation of the blind mode threshold comes from the fact that the a posteriori errors of the wt-jem carry up-to-date information on the second-order statistics missing from the $\mathrm{MSE}_{b,n}$ estimate (17). the a posteriori errors of the jem-vl algorithm are functions of the wt-jem outputs, which are almost free from the isi disturbance (outliers) coming from the channel amplitude characteristics. as mentioned in the previous section, jem-vl provides efficient compensation for frequency-selective channels. practically, by introducing the whitener's a posteriori errors as a variable threshold term we have created a switching control that directly reflects the recovery of both the second-order and fourth-order statistics of the applied source data.
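the interplay between the recursive estimate (17) and the variable threshold (18) can be sketched in a few lines. the function names, the default forgetting factor of 0.99 and the use of error magnitudes are illustrative assumptions, as is the minus sign in front of the scaled a posteriori term (chosen so that a lower a posteriori error raises the threshold); all quantities are assumed to be on the same linear scale.

```python
def update_mse(mse_prev, y, r_c, lam=0.99):
    """Recursive blind-mode mse estimate in the spirit of (17).

    lam is a hypothetical forgetting factor slightly below 1.
    """
    dispersion = abs(y) - r_c          # not a true error, only a modulus dispersion
    return lam * mse_prev + (1.0 - lam) * dispersion ** 2


def variable_threshold(mse_tl, e_vl_1, e_vl_2, s):
    """Variable threshold in the spirit of (18); the minus sign is an assumption."""
    return mse_tl - s * (abs(e_vl_1) + abs(e_vl_2))
```

for example, `variable_threshold(1.0, 0.1, 0.1, 0.5)` lies above `variable_threshold(1.0, 0.4, 0.4, 0.5)`, so smaller whitener errors make the threshold easier to cross.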
using the variable threshold, the switching control responds as follows: for a lower a posteriori error the $\mathrm{MSE}_{TLV}$ becomes higher, which shortens the blind equalization time and, hence, speeds up the equalizer convergence rate; conversely, for a higher a posteriori error the $\mathrm{MSE}_{TLV}$ becomes lower, which lengthens the blind acquisition time and slows the equalizer convergence. effectively, from the perspective of the mse estimation quality, the $\mathrm{MSE}_{TLV}$ becomes more robust against the dispersion of the magnitudes $|y_n|$ of symbol estimates.

to avoid false equalizer switching through the operation modes, which could be caused by the non-stationarity of the mse data, the soft-dfe switching control implementation is based on multiple checking of the threshold level passage. according to the switching rule presented in fig. 4, the equalizer is allowed to switch from the blind to the soft-transition mode if and only if the $\mathrm{MSE}_{b,n}$ satisfies $\mathrm{MSE}_{b,n} < \mathrm{MSE}_{TLV}$ during $K$ equalizer update iterations, where $K$ is an integer larger than 1. the same switching rule is valid for the soft-dfe switching from the soft-transition to the tracking operation mode, but it is less critical than the former from the perspective of convergence rate.

4. simulation results

the efficiency of the innovated structure-criterion switching control and its impact on the equalizer's convergence rate are verified by the software simulator of the qam system presented in fig. 1. the simulations are carried out using 16- and 64-qam signals and the multi-path channel with added white gaussian noise determined by the signal-to-noise ratio (snr). the selection of soft-dfe dimensions and parameters is done aiming at the best compromise between the convergence rate achievements and the equalization successfulness defined by esi.
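the multiple-checking rule of fig. 4 used by the switching control can be pictured as a small counter. the class name `SwitchMonitor` is hypothetical, and resetting the counter whenever the estimate rises back above the threshold (i.e. requiring successive passages) is an assumption about how the k checks are counted.

```python
class SwitchMonitor:
    """Allow a mode switch only after k threshold passages in a row (assumed)."""

    def __init__(self, k):
        self.k = k
        self.count = 0

    def step(self, mse, threshold):
        """Feed one mse estimate; return True when switching is allowed."""
        if mse < threshold:
            self.count += 1
        else:
            self.count = 0      # a single failed check restarts the count
        return self.count >= self.k
```

with the variable threshold, `threshold` would be recomputed at every update before calling `step`, so the same counter serves both the fixed and the variable control.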
the frequency selective mp-(a, c, e) channels, whose normalized amplitude characteristics are presented in fig. 5, are designed in a way to gradually increase isi severity from mp-a to mp-e.

fig. 4 soft-dfe rule switching from blind to soft-transition mode

fig. 5 normalized attenuation characteristics of mp-(a, c, e) channels

the soft-dfe parameters are given as follows. the filter tapped-delay line span, in $T$ intervals, for the fbf is 5 for both qam signals and for the fff is 23 and 24, respectively, for 16- and 64-qam signals. the fbf is initialized with all coefficients set to zero, while the initialization of the fff is realized by two strategies: 1) double-spike initialization (ds) with two central reference taps $c_{1,r} = c_{2,r} = 0.707$ and 2) single-spike initialization (ss) with a single central reference tap $c_{1,r} = 1.0$. the adaptation steps for the gc, jem, cma and lms algorithms are selected in a way to optimize their efficiency through the corresponding operation modes. this is of particular importance for the gc, jem and cma algorithms, which divide the blind equalization task into several simpler subtasks. for example, the gc uses two adaptation steps $\mu_g \in \{2^{-11}, 2^{-20}\}$ for both signals. the first step, applied at the very beginning of the blind mode, is much larger than the second one, aiming to prevent the wt and te equalizers from taking over the gain control function. the adaptation steps of the jem, cma and lms algorithms are selected in order to produce the best response of the fbf and fff filters through the three operation modes.
depending on the 16- and 64-qam signals, they are selected as follows: 1) for the fbf $\{\mu_{bb,16} = 2^{-19}, \mu_{bb,64} = 2^{-22}\}$, $\{\mu_{bs,16} = 2^{-18}, \mu_{bs,64} = 2^{-21}\}$, $\{\mu_{bt,16} = 2^{-14}, \mu_{bt,64} = 2^{-13}\}$ and 2) for the fff $\{\mu_{fb,16} = 2^{-16}, \mu_{fb,64} = 2^{-21}\}$, $\{\mu_{fs,16} = 2^{-15}, \mu_{fs,64} = 2^{-20}\}$, $\{\mu_{ft,16} = 2^{-13}, \mu_{ft,64} = 2^{-16}\}$; the second subscripts of the adaptation steps, b, s and t, respectively signify the blind, soft-transition and tracking modes. the leaky parameters are given by $\{m_0 = 40, l_d = 5, l_u = 40, M = 400, \gamma_{max} = 2^{-11}\}$ for both signals. the selection of the neuron slopes $\{\beta_w, \beta_d\}$ is done according to the considerations given in [16], [20]. the slope $\beta_d$, which depends mostly on the given signal constellation, takes values 12 and 1.95, respectively, for 16- and 64-qam constellations. on the other hand, the slope $\beta_w$, together with the threshold parameters $s$ and $K$, is used as a tool to optimize the initial convergence rate of the equalizer, see table 1.

the comparison of the soft-dfe performance achieved by the fixed (tlf) and variable (tlv) switching controls is given in terms of the pdf histograms of the blind acquisition period time, mse convergence and equalization successfulness esi. the comparison tests are carried out for the ss and ds equalizer initialization methods, (16, 64)-qam signals and switching control parameters $\{\mathrm{MSE}_{TF}/\mathrm{MSE}_{TL}, s, K\}$ as given in table 1. the motivation to test the switching control for two initialization methods comes from the fact that the success and speed of convergence of fse-cma equalization are strongly affected by the coefficient initialization [14]. the presented pdf histograms and esi tests are obtained for 10000 and the mse convergence curves for 200 independent monte carlo runs.
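with the leaky parameters listed above, the punish/award rule (9) and the quantized function (10) can be sketched as follows. the function names are hypothetical, the negative exponent of the maximal leak is a reading of the garbled value in the source, and the direction of the comparison (decrease the leak when the leaky a posteriori error is the larger one) is an assumption.

```python
# leaky parameters as quoted in the text (gamma_max exponent sign assumed)
M0, L_D, L_U, M_MAX = 40, 5, 40, 400
GAMMA_MAX = 2.0 ** -11


def update_leak_index(m, e_vl, e_w):
    """Rule (9): move the leak index down or up, clipped to [0, M_MAX]."""
    if abs(e_vl) > abs(e_w):            # leakage made the error worse (assumed test)
        return max(m - L_D, 0)
    return min(m + L_U, M_MAX)


def leaky_factor(m):
    """Function (10): map the integer index to the leaky factor."""
    return GAMMA_MAX * m / M_MAX
```

starting from `M0`, one award step gives index 80 and one punish step gives index 35, so the leak grows quickly and shrinks slowly with these settings.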
table 1 system setups: switching control parameters

qam/soft-dfe    mse_tf/mse_tl    beta_w    k      s
16qam-tlf       1.30 db          7.5       95     0
16qam-tlv       2.30 db          9         105    0.00145
64qam-tlf       8.02 db          2.4       95     0
64qam-tlv       8.61 db          2.8       105    0.00165

fig. 6 presents the pdf histograms of the blind acquisition period time obtained with tlf and tlv switching controls for the 64-qam signal. the histogram obtained by the tlf control demonstrates a positive skewness caused by fse-cma kurtosis outliers, in contrast to the histograms obtained by the tlv control, which are much more symmetrical; the latter is evidently the effect of the a posteriori variable threshold term in (18). a more quantitative measure of the switching control impact on the blind acquisition time is provided by the mean and standard deviation (sd) statistics presented in table 2 for (16, 64)-qam signals and the ss initialization method.

fig. 6 pdf histograms of blind acquisition period time for tlf and tlv thresholds and ss initialization: 64-qam, snr=30 db, a) mp-a, b) mp-c, c) mp-e

table 2 blind mode statistic, [t]: (16, 64)-qam, ss

signal    statistic     mp-a    mp-c    mp-e
16-qam    mean: tlf     3146    4264    3317
16-qam    std: tlf      759     1314    843
16-qam    mean: tlv     2778    2957    2684
16-qam    std: tlv      232     301     173
64-qam    mean: tlf     4776    8431    6103
64-qam    std: tlf      1893    2507    1795
64-qam    mean: tlv     4691    6422    4585
64-qam    std: tlv      1142    1049    793

the presented results emphasize an important decrease of the mean and sd in the case of the tlv control. for example, for the tlv control and the 64-qam signal, the mean and sd values are, respectively, 18% and 51.8% smaller (averaged over all channels) with respect to the tlf control. the impact of the operation mode switching control on the equalizer convergence rate is presented in figures 7 and 8. in fig. 7, the convergence curves obtained for tlf and tlv controls and ss and ds initializations in the case of the 16-qam signal are given.
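the percentages quoted for the 64-qam signal can be reproduced from the table 2 entries. the short check below assumes that "averaged over all channels" means comparing channel-averaged values (a reading of the text, not something the source states explicitly); the variable and function names are illustrative.

```python
# 64-qam rows of table 2, blind mode duration in symbol periods T
tlf_mean = {"mp-a": 4776, "mp-c": 8431, "mp-e": 6103}
tlv_mean = {"mp-a": 4691, "mp-c": 6422, "mp-e": 4585}
tlf_sd = {"mp-a": 1893, "mp-c": 2507, "mp-e": 1795}
tlv_sd = {"mp-a": 1142, "mp-c": 1049, "mp-e": 793}


def relative_reduction(ref, new):
    """Reduction of the channel-averaged value as a fraction of the reference."""
    ref_avg = sum(ref.values()) / len(ref)
    new_avg = sum(new.values()) / len(new)
    return (ref_avg - new_avg) / ref_avg


mean_gain = relative_reduction(tlf_mean, tlv_mean)   # about 0.187
sd_gain = relative_reduction(tlf_sd, tlv_sd)         # about 0.518
```

under this reading the mean drops by roughly 18.7% and the sd by roughly 51.8%, in line with the figures given in the text.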
as can be seen in fig. 7, the convergence rates achieved by the tlv control are significantly higher than those achieved by the tlf control for both ss and ds initialization, without sacrificing the residual mse; the best results are reached for the ss-tlv combination. similar results are achieved for the 64-qam signal, fig. 8. in this case, for the sake of figure clarity, only the convergence curves obtained by the tlv control for ss and ds initializations are presented. it is worth noting that the equalizer converges faster with the tlv control because the blind mode time has been made shorter as a result of the improved switching control. also, these results have proved the efficiency of the gc-wt amplitude equalizer, which was insufficiently visible before the tlv control was applied.

fig. 7 comparison of mse convergence curves obtained using tlf and tlv controls and (ss, ds) initializations: 16-qam, snr=25 db, a) mp-a, b) mp-c, c) mp-e

fig. 8 comparison of mse convergence curves obtained using tlv control and (ss, ds) initializations: 64-qam, snr=30 db, mp-(a,c,e)

the results of the esi tests are given in table 3. the purpose of these tests is to prove that the new tlv control does not degrade the equalization successfulness; the results for both the tlf and the tlv methods are practically the same for the parameters selected in table 1. this is of essential importance because we have used different switching control parameters ($\beta_w$, $s$, $K$) aiming to achieve the best convergence performance by both methods and, at the same time, to preserve the equalization efficiency.
table 3 equalization success index [%]

signal    esi       mp-a     mp-c     mp-e
16-qam    ss-tlf    99.99    99.80    100
16-qam    ss-tlv    99.99    99.75    100
16-qam    ds-tlf    99.97    99.91    99.16
16-qam    ds-tlv    100      99.87    98.94
64-qam    ss-tlf    100      99.68    98.83
64-qam    ss-tlv    100      99.80    98.64
64-qam    ds-tlf    100      99.69    98.65
64-qam    ds-tlv    100      99.84    98.03

conclusions

our goal in this paper was to increase the convergence rate of the blind soft-dfe equalizer by improving its operation mode switching control. the performance of the online mse estimator monitoring the equalizer's convergence state is enhanced by the innovated switching control that combines the fixed value threshold term with the a posteriori error of the all-pole amplitude equalizer coefficient updates. in this innovation the robust second-order statistic of the a posteriori errors is employed to compensate for the undesirable effects of the outlier-sensitive kurtosis statistic of the fse-cma outputs. it is verified by different simulation setups that the simple mse estimation method combined with the variable up-to-date threshold information significantly reduces the blind mode operation time and, hence, greatly improves the effective equalizer convergence rate.

acknowledgement: the paper is a part of the research done within the project tr 32037, 2011-2018, the ministry of education, science and technological development of the republic of serbia. the authors thank the anonymous reviewers for their valuable suggestions and comments.

references
[1] v. r. krstić, n. bogdanović, “on structure-criterion switching control for self-optimized decision feedback equalizer”, in proceedings of conference papers icetran 2017, srbija, june 5-8, 2017.
[2] v. savaux, f. bader, j. palicot, “ofdm/oqam blind equalization using cna approach”, ieee trans. signal processing, vol. 64, no. 9, pp. 2324-2333, 2016.
[3] j. r. treichler, m. g. larimore and j. c. harp, “practical blind demodulators for high-order qam signals,” in proceedings of the ieee, vol. 86, no. 10, pp. 1907-1926, 1998.
[4] z. ding, y. g. li, blind equalization and identification. signal processing and communication series, marcel dekker, 2001. [5] s.u.h. qureshi, "adaptive equalization," in proceedings of the ieee, vol. 73, pp.1349-1387, sept. 1985. [6] j. labat, o. macchi and c. laot, “adaptive decision feedback equalization: can you skip the training period?,” ieee trans. commun., vol. 46, no. 7, pp. 921-930, jul, 1998. [7] a. goupil and j. palicot, “an efficient blind decision feedback equalizer,” ieee commun. letters, vol. 14, no. 5, pp. 462-464, 2010. [8] m. t. m. silva, j. arenas-garcía, “a soft-switching blind equalization scheme via convex combination of adaptive filters,” ieee trans. signal processing, vol. 61, no. 5, pp. 1171-1182, march 1, 2013. [9] v. r. krstić. and m. l. dukić, “blind dfe with maximum-entropy feedback,” ieee signal processing letters, vol. 16, no 1, pp. 26-29, jan. 2009. [10] v. r. krstić, “fast start-up blind dfe equalizer,” pending patent rs, p-2017/0205, feb. 2017. [11] j. g. proakis, digital communications.3 rd ed. new york: mcgraw-hill, 1995. [12] y. h. kim, h. s. shamsunder, “adaptive algorithms for channel equalization with soft decision feedback,” ieee journal on selected areas in communications, vol. 16, no. 9, pp. 1660-1669, 1998. [13] d. n. godard, “self-recovering equalization and carrier tracking in two-dimensional data communication systems”, ieee trans. commun., 1980, vol. 18, no. 11, pp. 1867-1875, 1980. [14] c. r. johnson, jr. et al., “the core of fse-cma behavior theory”. in s. haykin (ed.), unsupervised adaptive filtering, vol. ii blind deconvolution, pp. 13-112. new york: john wiley & sons, 2000. [15] s. abrar, a. zerguine, a. k. nandi, “blind adaptive carrier phase recovery for qam signals,” digital signal processing, vol. 49, pp. 65-85, 2016. [16] v. r. krstić, a. m. stevanović and b. lj. odadžić, “a variable leaky entropy-based whitening algorithm for blind decision feedback equalization”, wireless personal communications, vol. 
95, issue 2, pp. 931-946, july 2017. [17] m. kamenetskyand, b. widrow, “a variable leaky lms adaptive algorithm”, in proceedings of the thirty-eighth asilomar conference on signal, systems and computers, nov. 2004, vol.1, pp. 125-126. [18] l. t. decarlo, “on the meaning and use of kurtosis,” psychological methods, vol. 2, no. 3, pp. 292-307, 1997. [19] o. shalvi, e. weinstein, "new criteria for blind deconvolution of nonminimum phase systems (channels)," ieee trans. inf. theory, vol. 36, pp.312-321, march 1990. [20] v. r. krstić, m. l. dukić, ”decision feedback blind equalizer with tap-leaky whitening for stable structure-criterion switching.” international journal of digital multimedia broadcasting volume 2014, article id 987039, 10 pages, 2014. facta universitatis series: electronics and energetics vol. x, no x, x 2018, pp. x energy-efficient cryptographic primitives elena dubrova∗ royal institute of technology (kth), stockholm, sweden abstract: our society greatly depends on services and applications provided by mobile communication networks. as billions of people and devices become connected, it becomes increasingly important to guarantee security of interactions of all players. in this talk we address several aspects of this important, many-folded problem. first, we show how to design cryptographic primitives which can assure integrity and confidentiality of transmitted messages while satisfying resource constrains of low-end low-cost wireless devices such as sensors or rfid tags. second, we describe countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to hardware trojans. keywords: security, lightweight cryptography, cryptographic primitive, encryption, message authentication, hardware trojan. 
1 introduction

today minimal or no security is typically provided to low-end low-cost wireless devices such as sensors or rfid tags in the conventional belief that the information they gather is of little concern to attackers. however, case studies have shown that a compromised sensor can be used as a stepping stone to mount an attack on a wireless network. for example, in the attack

manuscript received x x, x corresponding author: elena dubrova royal institute of technology (kth), stockholm, sweden (e-mail: dubrova@kth.se) ∗an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017.

facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp.
169 187 https://doi.org/10.2298/fuee1802169s suzana stojković1, darko veličković1, claudio moraga2 received october 21, 2017; received in revised form january 30, 2018 corresponding author: suzana stojkovic faculty of electronic engineering, university of niš, medevedeva 14, 18000 niš, serbia (e-mail: suzana.stojkovic@elfak.ni.ac.rs) *an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017 facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507 525 doi: 10.2298/fuee1504507s horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2 1university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia 2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and ftbvceo product of 173 ghzv that are among the highest-performance implanted-base, silicon bipolar transistors. hbct is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks achieving the high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional costs. 
double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at mixer current of 9.2 ma and conversion gain of -5 db are achieved. key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors. 1. introduction in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in the technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might received march 9, 2015 corresponding author: tomislav suligoj university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr) genetic algorithm for binary and functional decision diagrams optimization* 1faculty of electronic engineering, university of niš, niš, serbia 2department of computer science, tu dortmund, germany university, dortmund, germany abstract. decision diagrams (dd) are a widely used data structure for discrete functions representation. the major problem in dd-based applicationsis the dd size minimization (reduction of the number of nodes), because their size is dependent on the variables order. genetic algorithms are often used in different optimization problems including the dd size optimization. in this paper, we apply the genetic algorithm to minimize the size of both binary decision diagrams (bdds) and functional decision diagrams (fdds). in both cases, in the proposed algorithm, a bottom-up partially matched crossover (bupmx) is used as the crossover operator. in the case of bdds, mutation is done in the standard way by variables exchanging. in the case of fdds, the mutation by changing the polarity of variables is additionally used. 
experimental results of optimization of the bdds and fdds of the set of benchmark functions are also presented. key words: binary decision diagrams, functional decision diagrams, decision diagrams optimization, genetic algorithm.

1 introduction

decision diagrams (dds) are a compact data structure for representing discrete functions. bryant showed their canonicity in 1986 in [1] and since then they have been applied in many areas in which discrete functions are used: hardware design, hardware testing, signal processing, etc. the complexity of the designed hardware, or of the computations performed with decision diagrams, is directly proportional to the size of the diagram. a main disadvantage of decision diagrams is that their size depends on the order of the variables used in the diagram. optimization of the dd size is therefore a frequently addressed problem. algorithms for dd optimization can be classified into two categories: exact algorithms and heuristic algorithms. the basic exact algorithm is a brute-force algorithm that creates the diagrams for all possible orders of variables and chooses the best one. a slightly improved exact algorithm is presented in [2]. however, all exact algorithms are very slow and inapplicable to functions with a large number of variables. the most widely used heuristic algorithm for dd optimization is rudell's sifting algorithm, proposed in [3]. the main idea of that algorithm is to sift each variable through all levels of the diagram and to choose its optimal position. a genetic algorithm is a heuristic algorithm that can be applied to different optimization problems. the use of a genetic algorithm in dd optimization was first discussed in [4]. after that, genetic algorithms for dd optimization were improved in many papers ([5–13]). some of them optimize the dd size ([4–11]). in [12] the number of 1-paths is optimized, and in [13] a method for optimizing both is proposed.
in this paper, we present a genetic algorithm for optimization of bdds and fdds. our main goal was the minimization of the fdd size because we use fdds in reversible synthesis (see for example [14], [15]). for comparison we also include results on the minimization of the size of the bdds. in the applications for fdd-based reversible synthesis, the complexity of the generated network depends directly on the fdd size. an additional problem in fdd usage is that the size depends on the decomposition rules that are used in the nodes. in each node of an fdd, either the positive or the negative davio decomposition can be used. usually, the same decomposition is used in all nodes of the same level. it follows that for one variable order, $2^n$ different fdds can be created. an exact algorithm in that case should check $2^n \cdot n!$ cases, which is impossible for a large number of variables. another group of minimization algorithms, sifting algorithms, analyzes only the variable order. for that reason we chose to approach the problem by applying a genetic algorithm. in the presented algorithm, for both bdds and fdds, a modified pmx crossover operator (bu-pmx, bottom-up partially matched crossover) and mutation by variable exchange are used. in the case of fdd optimization, mutation by polarity change is additionally used.

the paper is organized in the following way: section 2 contains the most important definitions related to decision diagrams. section 3 presents the general idea of genetic algorithms and their specifics when applied to dd optimization. section 4 describes the algorithm for bdd and fdd optimization and the genetic operations that are used in it. section 5 discusses experimental results and in section 6 some concluding remarks are given.
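to give a feeling for why the exhaustive search mentioned in the introduction is impractical, the number of candidate diagrams can be tabulated directly from the counts given there: n! variable orders for a bdd and 2^n · n! combinations of order and polarity for a fixed-polarity fdd. the function names below are illustrative.

```python
from math import factorial


def bdd_search_space(n):
    """Number of variable orders for n variables."""
    return factorial(n)


def fdd_search_space(n):
    """Orders times per-level polarity choices: 2**n * n!."""
    return (2 ** n) * factorial(n)


# growth for a few small variable counts
table = {n: (bdd_search_space(n), fdd_search_space(n)) for n in (3, 5, 10)}
```

already for 10 variables the fdd search space holds more than 3.7 billion candidates, each of which would require building a diagram.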
2 decision diagrams

definition 1 (binary decision tree) a binary decision tree (bdt) representing a boolean function f is the binary tree created by the recursive application of the shannon decomposition rule:

$$f = \bar{x}_k \cdot f(x_k = 0) \oplus x_k \cdot f(x_k = 1) \qquad (1)$$

definition 2 (terminal and nonterminal nodes) a bdt contains two types of nodes: nonterminal and terminal. a nonterminal node represents one decomposition and has an associated decision variable. a terminal node contains a value of the function.

definition 3 (level in bdt) a level in the bdt is the set of nonterminal nodes with the same decision variable, or the set of terminal nodes.

definition 4 (functional decision tree) a functional decision tree (fdt) representing a boolean function f is the binary tree created by the recursive application of the positive (2) or negative (3) davio decomposition rule:

$$f = f(x_k = 0) \oplus x_k \cdot \bigl(f(x_k = 0) \oplus f(x_k = 1)\bigr) \qquad (2)$$

$$f = \bar{x}_k \cdot \bigl(f(x_k = 0) \oplus f(x_k = 1)\bigr) \oplus f(x_k = 1) \qquad (3)$$

definition 5 (fixed polarity functional decision tree) a functional decision tree in which the same decomposition is used in each node of the same level is called a fixed polarity functional decision tree (fpfdt).
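the three decomposition rules can be checked numerically. the sketch below assumes the usual forms of the rules, in which the variable is complemented in the first term of (1) and in (3); the complement is written as `1 ^ x[k]`. all function names and the test function `example` are hypothetical and serve only as an illustration.

```python
from itertools import product

def cofactor(f, k, value):
    """Return f with its k-th argument fixed to `value` (f takes a tuple of bits)."""
    return lambda x: f(x[:k] + (value,) + x[k + 1:])

def shannon(f, k):
    f0, f1 = cofactor(f, k, 0), cofactor(f, k, 1)
    # rule (1): complemented variable selects f0, plain variable selects f1
    return lambda x: ((1 ^ x[k]) & f0(x)) ^ (x[k] & f1(x))

def positive_davio(f, k):
    f0, f1 = cofactor(f, k, 0), cofactor(f, k, 1)
    # rule (2): f0 XOR x_k * (f0 XOR f1)
    return lambda x: f0(x) ^ (x[k] & (f0(x) ^ f1(x)))

def negative_davio(f, k):
    f0, f1 = cofactor(f, k, 0), cofactor(f, k, 1)
    # rule (3): (NOT x_k) * (f0 XOR f1) XOR f1
    return lambda x: ((1 ^ x[k]) & (f0(x) ^ f1(x))) ^ f1(x)

def same_function(f, g, n):
    """Compare two n-variable functions on every input assignment."""
    return all(f(x) == g(x) for x in product((0, 1), repeat=n))

def example(x):
    # hypothetical 3-variable test function, not taken from the paper
    return (x[0] & x[1]) ^ x[2]

ok = all(same_function(example, rule(example, k), 3)
         for rule in (shannon, positive_davio, negative_davio)
         for k in range(3))
print(ok)
```

each rule rebuilds the original function from its two cofactors, which is why any of them can be applied recursively to construct a decision tree.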
definition 6 (polarity-vector) the polarity-vector of an fpfdt is a bit vector that defines the types of the decompositions used in the levels. 0 denotes that the positive davio decomposition is used, 1 denotes the negative one.

definition 7 (positive-polarity fdt) an fdt in which the positive davio decomposition is used at all levels is a positive-polarity fdt.

definition 8 (binary decision diagram) a bdt is transformed into a binary decision diagram (bdd) by using the following reduction rules:
1. share the isomorphic sub-trees: if there are two terminal nodes with the same value, or two non-terminal nodes with isomorphic sub-trees, one of them is deleted. its incoming edges are redirected to the remaining node.
2. eliminate the redundant nodes: if both outgoing edges of a non-terminal node point to the same sub-tree, the node is redundant and it is deleted. its incoming edges are redirected to the common sub-tree.

definition 9 (functional decision diagram) an fdt is transformed into an fdd by using reduction rule 1 above and the following 0-suppress reduction rules:
2.1 if the right outgoing edge of a positive davio node points to the terminal 0, the node is deleted. the edges pointing to the deleted node are redirected to its left sub-tree.
2.2 if the left outgoing edge of a negative davio node points to the terminal 0, the node is deleted. the edges pointing to the deleted node are redirected to its right sub-tree.

example 1 figure 1 shows the bdd (a) and the positive-polarity fdd (b) of the function f(x1, x2, x3, x4) = x1 · x2 + x1 · x2 + x3 + x4.

definition 10 (dd size) the dd size is equal to the number of nonterminal nodes.

example 2 figure 2 shows the bdds of the function f(x1, ..., x6) = x1x2 + x3x4 + x5x6 for the variable orders (a) (x1, x2, x3, x4, x5, x6) and (b) (x1, x4, x2, x5, x3, x6). the size of the first bdd is 6, but the size of the second is 14.
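the dependence of the bdd size on the variable order can be explored with a small brute-force sketch. it relies on a standard property of reduced ordered bdds: the number of nodes at a level equals the number of distinct subfunctions, obtained by fixing all variables above that level, that actually depend on the variable of that level. the function `bdd_size` and its interface are assumptions of this sketch, and it enumerates complete truth tables, so it is usable only for small functions. the second order tried below, (x1, x3, x5, x2, x4, x6), is a commonly used unfavourable order chosen for this sketch; it is not the order (b) listed in example 2.

```python
from itertools import product

def bdd_size(f, order):
    """Count the nonterminal nodes of the reduced ordered BDD of f.

    f takes a dict {variable index: bit}; `order` lists the variable
    indices from the root level down to the bottom level.
    """
    n = len(order)
    size = 0
    for level in range(n):
        fixed, free = order[:level], order[level:]
        nodes = set()
        for head in product((0, 1), repeat=level):
            assignment = dict(zip(fixed, head))
            # Truth table of the subfunction over the remaining variables;
            # the variable of the current level is the slowest-changing bit.
            table = tuple(
                f({**assignment, **dict(zip(free, tail))})
                for tail in product((0, 1), repeat=n - level)
            )
            half = len(table) // 2
            if table[:half] != table[half:]:  # depends on this level's variable
                nodes.add(table)              # equal subfunctions share one node
        size += len(nodes)
    return size

def f(v):
    # the function of example 2, with "+" read as logical OR
    return (v[1] & v[2]) | (v[3] & v[4]) | (v[5] & v[6])

print(bdd_size(f, (1, 2, 3, 4, 5, 6)))
print(bdd_size(f, (1, 3, 5, 2, 4, 6)))
```

for the natural order the sketch counts 6 nodes, in agreement with example 2, and for the separated order it counts 14.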
fig. 1: bdd (a) and fdd (b) of the function from example 1.

in the general case, for the function of 2n variables $f(x_1, x_2, \ldots, x_{2n-1}, x_{2n}) = x_1 x_2 + \cdots + x_{2n-1} x_{2n}$, the size of the bdd with the variable order $(x_1, x_2, \ldots, x_{2n-1}, x_{2n})$ is $2n$, and with the variable order $(x_1, x_{n+1}, \ldots, x_n, x_{2n})$ it is $O(2^{n-1})$.

besides the variable order, the size of a fixed polarity fdd also depends on the polarities of the variables.

example 3 figure 3 shows the fdds of the function from example 1 for the polarity vectors (a) $f = [1\ 1\ 1\ 1]^T$ and (b) $f = [0\ 1\ 0\ 1]^T$. the size of the diagram is 4 in the first case and 6 in the second case.

3 genetic algorithm

a genetic algorithm is a method for solving different optimization problems, based on an analogy with the process of natural selection. in this algorithm, a solution of the problem is represented as an array called a chromosome. an element of the chromosome is a gene. in general, the initial set of chromosomes is generated randomly, and then each new generation is created by using two genetic operations: crossover and mutation.
the crossover operator defines how child chromosomes are created by combining genes from the parent chromosomes. in practice, one-point crossover (fig. 4(a)) and two-point crossover (fig. 4(b)) are usually chosen. the mutation is often realized by changing the value of the gene at a selected position.

the measure of the quality of a solution (chromosome) is called the fitness score. fitness scores are used to compute the probabilities of selecting chromosomes as parents for the next generation, and of selecting the chromosomes that will die after an iteration.

defining a genetic algorithm for a concrete optimization problem means defining the type of genes, the fitness function and the genetic operations.

fig. 2: bdds of the function from example 2 for two different variable orders.

fig. 3: fdds of the function from example 1 for two different polarity vectors.

fig. 4: one-point (a) and two-point (b) crossover operators.

3.1 genetic algorithm for bdd size optimization

one chromosome in a bdd optimization problem is one order of the input variables, i.e. one permutation of the integers from the interval [1, n]. it follows that the standard genetic operators cannot be used, since they can produce arrays that are no longer permutations. for that reason, special genetic operators are defined for bdd optimization. the crossover operators discussed in this section are: order crossover ([10], [11]), cyclic crossover ([10], [11]), partially matched crossover ([4], [10], [11]) and alternating crossover ([7]).

algorithm 1 (cyclic crossover operator cx):
step 1. create a cycle of genes defined by corresponding positions in the parent chromosomes, starting from the first unused gene in the first parent.
step 2. copy the genes of the cycle from one parent into the first child and from the other parent into the second child.
step 3. repeat steps 1 and 2, alternating the target child into which the genes of each parent are copied.

example 4 consider the following parents:
p1 = [1 2 3 4 5 6 7 8 9 10]
p2 = [5 4 6 9 2 8 3 7 1 10]
the first cycle of genes is created starting from the gene 1 of the first parent. at the corresponding position in the second parent is the gene 5. then, we find the gene 5 in the first parent, and at the corresponding position in the second parent is the gene 2. the process continues until the cycle is closed. the created cycle is 1 → 5 → 2 → 4 → 9 → 1.
the child chromosomes after placing the first cycle are (an underscore marks a position that is not yet filled):
c1 = [1 2 _ 4 5 _ _ _ 9 _]
c2 = [5 4 _ 9 2 _ _ _ 1 _]
the second cycle is created starting from the gene 3: 3 → 6 → 8 → 7 → 3. the child chromosomes after placing the second cycle are:
c1 = [1 2 6 4 5 8 3 7 9 _]
c2 = [5 4 3 9 2 6 7 8 1 _]
the last cycle contains only the gene 10, and the final child chromosomes are:
c1 = [1 2 6 4 5 8 3 7 9 10]
c2 = [5 4 3 9 2 6 7 8 1 10]

algorithm 2 (order crossover operator ox):
step 1. randomly select two crossover points.
step 2. copy the genes of the first parent that lie between the crossover points into the child chromosome.
step 3. delete from the second parent the genes that are already in the child.
step 4. place the remaining genes of the second parent into the unfilled positions of the child chromosome, from left to right.
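algorithms 1 and 2 can be sketched in a few lines. the function names, the 0-based positions and the half-open interval `lo:hi` for the crossover points are assumptions of this sketch; the calls at the end use the parents of example 4 and the parents and crossover points of example 5.

```python
def cycle_crossover(p1, p2):
    """CX: copy whole cycles, alternating the source parent per cycle."""
    n = len(p1)
    pos_in_p1 = {gene: i for i, gene in enumerate(p1)}
    c1, c2 = [None] * n, [None] * n
    swap = False                      # False: c1 takes genes from p1
    for start in range(n):
        if c1[start] is not None:     # position already covered by a cycle
            continue
        i = start
        while c1[i] is None:          # walk the cycle until it closes
            c1[i], c2[i] = (p2[i], p1[i]) if swap else (p1[i], p2[i])
            i = pos_in_p1[p2[i]]
        swap = not swap               # next cycle goes to the other child
    return c1, c2

def order_crossover(p1, p2, lo, hi):
    """OX: keep p1[lo:hi]; fill the rest left to right from p2."""
    kept = set(p1[lo:hi])
    rest = iter(g for g in p2 if g not in kept)
    return [p1[i] if lo <= i < hi else next(rest) for i in range(len(p1))]

cx_children = cycle_crossover([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [5, 4, 6, 9, 2, 8, 3, 7, 1, 10])
ox_child = order_crossover([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                           [6, 7, 4, 2, 3, 10, 9, 5, 1, 8], 3, 6)
print(cx_children)
print(ox_child)
```

both operators always return permutations, because every gene is copied exactly once.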
example 5 let the parent variable orders be given by the arrays:
p1 = [1 2 3 |4 5 6 |7 8 9 10]
p2 = [6 7 4 2 3 10 9 5 1 8]
the crossover points are marked in the first parent. after step 2, the child chromosome is:
c = [_ _ _ |4 5 6 |_ _ _ _]
after deleting the corresponding genes (6, 4 and 5), the second parent is:
p2 = [7 2 3 10 9 1 8]
finally, after placing the genes of the second parent, the generated child is:
c = [7 2 3 |4 5 6 |10 9 1 8]

algorithm 3 (partially matched crossover operator pmx):
step 1. perform a two-point crossover.
step 2. create the mapping table for the genes of the central part of one parent that do not appear in the central part of the other parent. the mapping pair of the gene at position i of the first parent (p1[i]) is the gene at the same position in the other parent (p2[i]) if p2[i] is not in the central part of the first parent; otherwise, if p2[i] = p1[j], the mapping pair of p1[i] is equal to the mapping pair of the gene p1[j].
step 3. eliminate the duplicated genes in the child chromosomes so that the central part of each chromosome remains unchanged. if a gene from the central part appears again in another part, replace it by its mapping pair.

example 6 let the parent variable orders be given by the arrays:
p1 = [9 8 4 |5 2 7 |1 3 6 10]
p2 = [8 7 1 |2 3 10 |9 5 4 6]
let the two-point crossover be performed with the crossover points 3 and 6. the resulting child chromosomes are:
c′1 = [9 8 4 |2 3 10 |1 3 6 10]
c′2 = [8 7 1 |5 2 7 |9 5 4 6]
let us create the mapping table:
p2[4] = 2 exists in the central part of p1, and it is not mapped.
p2[5] = 3 → 2 → 5. the pair (3, 5) is added to the mapping table.
p2[6] = 10 → 7. the pair (10, 7) is added to the mapping table.
the resulting child chromosomes after duplicate elimination are:
c1 = [9 8 4 |2 3 10 |1 5 6 7]
c2 = [8 10 1 |5 2 7 |9 3 4 6]

algorithm 4 (alternating crossover operator ax) create the child chromosome by taking genes alternately from the first and the second parent. before storing a gene in the child chromosome, check whether it already exists there; if it does, skip it.

example 7 let the alternating crossover be performed on the same parents as in the previous example. the resulting child chromosome is:
c = [9 8 7 4 1 5 2 3 10 6]

mutation, too, cannot be realized as described in the previous section, because changing the value of a single gene would break the permutation. in the literature, three ways of performing the mutation operation are suggested:

algorithm 5 (mutation by one variable exchange) randomly select two positions in a chromosome and exchange the variables at the selected positions.

algorithm 6 (mutation by two variable exchanges) apply the mutation defined in algorithm 5 twice.

algorithm 7 (mutation by neighbor exchange) randomly select one position i. exchange the variables at positions i and i + 1.

the fitness function in a bdd optimization problem is the size of the bdd.
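the pmx and ax operators and the exchange mutations of algorithms 5 and 7 can be sketched as follows. the function names, the 0-based half-open interval `lo:hi` for the central part and the optional `rng` argument are assumptions of this sketch. instead of building the mapping table in advance, the pmx sketch follows the chain of mappings for every duplicate, which yields the same pairs as in example 6.

```python
import random

def pmx(p1, p2, lo, hi):
    """PMX: swap the central parts, then repair duplicates outside them."""
    def child(outer, inner):
        c = outer[:lo] + inner[lo:hi] + outer[hi:]
        # gene of the new central part -> gene it displaced at that position
        mapping = {inner[i]: outer[i] for i in range(lo, hi)}
        for i in list(range(lo)) + list(range(hi, len(c))):
            while c[i] in mapping:    # follow the chain until no duplicate
                c[i] = mapping[c[i]]
        return c
    return child(p1, p2), child(p2, p1)

def alternating_crossover(p1, p2):
    """AX: take genes alternately from p1 and p2, skipping repeats."""
    child = []
    for a, b in zip(p1, p2):
        for gene in (a, b):
            if gene not in child:
                child.append(gene)
    return child

def mutate_exchange(chromosome, rng=random):
    """Algorithm 5: exchange the genes at two random positions."""
    i, j = rng.sample(range(len(chromosome)), 2)
    c = list(chromosome)
    c[i], c[j] = c[j], c[i]
    return c

def mutate_neighbor(chromosome, rng=random):
    """Algorithm 7: exchange the genes at positions i and i + 1."""
    i = rng.randrange(len(chromosome) - 1)
    c = list(chromosome)
    c[i], c[i + 1] = c[i + 1], c[i]
    return c

p1 = [9, 8, 4, 5, 2, 7, 1, 3, 6, 10]
p2 = [8, 7, 1, 2, 3, 10, 9, 5, 4, 6]
print(pmx(p1, p2, 3, 6))
print(alternating_crossover(p1, p2))
```

algorithm 6 corresponds to calling `mutate_exchange` twice in a row.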
4 genetic algorithm for bdd and fdd size optimization

in the original pmx algorithm, the central part of the chromosome is transferred into the child chromosome unchanged. however, the possibility of deleting a dd node in the reduction phase is greater if the node is at the bottom levels. it follows that good properties of the parents will be inherited if the order of the variables at the last levels is not changed. for that reason, we used a modified pmx algorithm in which the right part of the genes of the parent chromosomes is transferred directly to the child chromosomes. this operator is named the bottom-up pmx, because the genes are written into the child chromosome from right to left, i.e. from the bottom levels up.
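the bottom-up pmx operator described above can be sketched as a pmx variant with a single crossover point, in which the right part of a parent is kept and duplicates are repaired only in the left part. the function name `bu_pmx` and the 0-based `cut` argument are assumptions of this sketch, not the authors' implementation; the call at the end uses the parents and the crossover point of example 8.

```python
def bu_pmx(p1, p2, cut):
    """Bottom-up PMX sketch: keep the right part, repair the left part."""
    def child(left_parent, right_parent):
        c = left_parent[:cut] + right_parent[cut:]
        # gene kept in the right part -> gene it displaced at that position
        mapping = {right_parent[i]: left_parent[i]
                   for i in range(cut, len(c))}
        for i in range(cut):
            while c[i] in mapping:    # follow the chain until no duplicate
                c[i] = mapping[c[i]]
        return c
    # the first child keeps the right part of p1, the second that of p2
    return child(p2, p1), child(p1, p2)

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
p2 = [7, 4, 1, 2, 5, 6, 9, 3, 8, 10]
print(bu_pmx(p1, p2, 6))
```

because the right part of each child is identical to that of one parent, the bottom levels of the corresponding diagram do not have to be rebuilt when the fitness is evaluated.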
The second reason why the unchanged genes are shifted to the end of the chromosome is that, in that case, the DD corresponding to the child chromosome contains the same set of nodes in the last levels as the DD corresponding to the parent chromosome, so the calculation time of the fitness function is shortened.

Algorithm 8 (Bottom-up PMX operator BU-PMX):
Step 1. Perform a one-point crossover.
Step 2. Create the PMX mapping table for the right part of the chromosomes.
Step 3. Eliminate duplicate genes from the left part of the child chromosomes by using the PMX mapping table.

Example 8 Let the bottom-up PMX operator be performed over the parents:

p1 = [1 2 3 4 5 6 | 7 8 9 10]
p2 = [7 4 1 2 5 6 | 9 3 8 10]

After performing the one-point crossover, the generated children are:

c′1 = [7 4 1 2 5 6 | 7 8 9 10]
c′2 = [1 2 3 4 5 6 | 9 3 8 10]

The mapping table contains only the pair (7, 3): the position-wise chain 7 → 9 → 8 → 3 collapses to this pair, and 10 maps to itself. After duplicate elimination, the resulting child chromosomes are:

c1 = [3 4 1 2 5 6 | 7 8 9 10]
c2 = [1 2 7 4 5 6 | 9 3 8 10]
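The three steps of Algorithm 8 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `bu_pmx` and the 0-based `cut` argument are hypothetical, and chromosomes are plain Python lists.

```python
def bu_pmx(p1, p2, cut):
    """Bottom-up PMX: each child keeps one parent's right part unchanged
    and takes its left part from the other parent, repaired by PMX mapping."""

    def make_child(keeper, donor):
        right = keeper[cut:]                     # bottom levels, copied unchanged
        mapping = dict(zip(right, donor[cut:]))  # Step 2: mapping table for the right part
        fixed = set(right)
        left = []
        for gene in donor[:cut]:                 # Step 1: left part from the other parent
            while gene in fixed:                 # Step 3: follow the mapping chain
                gene = mapping[gene]
            left.append(gene)
        return left + right

    return make_child(p1, p2), make_child(p2, p1)


# Parents of Example 8, with the crossover point after the sixth gene.
p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
p2 = [7, 4, 1, 2, 5, 6, 9, 3, 8, 10]
c1, c2 = bu_pmx(p1, p2, cut=6)
print(c1)
print(c2)
```

The sketch follows the mapping chain gene by gene instead of first collapsing it into pairs such as (7, 3). Both routes give the same children, and the chain walk avoids building the collapsed table explicitly.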
Fig. 5: Positive (a) and negative (b) Davio nodes.

As was shown in Example 3, the FDD size depends on the polarity vector. Because of that, an additional mutation that changes the polarity is used in FDD optimization. To specify the transformation performed on the FDD by this mutation, the positive Davio and negative Davio nodes are shown in Figure 5 ((a) and (b), respectively). In this figure, $f_0 = f(x_k = 0)$ and $f_1 = f(x_k = 1)$. Let $f_l$ and $f_r$ be the left and right successors of the node. If the polarity is changed from positive to negative, the transformation is:

$$f_l^{new} = f_r^{old}, \qquad f_r^{new} = f_l^{old} \oplus f_r^{old} \qquad (4)$$

If the reverse polarity change is done, the transformation is:

$$f_r^{new} = f_l^{old}, \qquad f_l^{new} = f_l^{old} \oplus f_r^{old} \qquad (5)$$

Algorithm 9 (Mutation by polarity change) Randomly select a variable. Change the expansion rule in all nodes at the level corresponding to the selected variable.

The complete genetic algorithm used for BDD and FDD optimization is shown in Algorithm 10.

Algorithm 10 (Genetic algorithm for DD optimization):
Step 1. Create the initial population of chromosomes and compute the fitness score for each of them.
Step 2. Select pairs of parents for reproduction.
Step 3. Create child chromosomes by BU-PMX.
Step 4. Mutate child chromosomes (with the mutation probability).
Step 5. Perform the Darwinian selection: if the population is full, remove from it the worst chromosome, or several of the worst chromosomes.
Step 6. Repeat Steps 2-5 until the goal is reached or the computing time is exhausted.

5 Experimental results

5.1 Results of BDD size optimization

First, we tested how the mutation probability influences the convergence of the algorithm. Figure 6 shows the number of iterations needed to reach the minimum BDD size for the function bw for different percentages of mutated chromosomes. Each experiment was repeated 100 times, and the figure shows the average values. The number of needed iterations decreases as the percentage of mutated chromosomes increases. For percentages greater than 15 the decrease is very slow, so 0.15 is chosen as the optimal mutation probability.

Fig. 6: Number of iterations needed to reach the minimum BDD size for the bw benchmark function as a function of the percentage of mutated child chromosomes.

Then, we tested the convergence of the proposed algorithm on a set of small benchmark functions for which we know the optimal size. We measured the number of iterations needed to reach the minimum BDD size and compared these results with the results obtained by using the order crossover (OX), cyclic crossover (CX), original PMX and alternating crossover (AX) operators. The experiments were repeated 10 times, and Table 1 shows the average values.

Table 1: Number of iterations needed to reach the minimum BDD size by using different crossover operators

Function  In/Out     OX     CX    PMX     AX  BU-PMX
bw        5/28      5.5    7.4    5.9    7.9     5.2
5xp1      7/10     56.1   46.6   47.5   72.8    32.1
con1      7/2      97.9   76.8   87.2   86.7    71
misex1    8/7     152.5   93     67.7   84.2   107.8
sqrt8     8/4     135.4  159.3  116.8  371.1   113.7
clip      9/5      72.9   75     59.4  194.4    49.7
Table 1 shows that for 5 out of 6 functions the minimal BDD size was reached in the smallest number of iterations when the BU-PMX operator was used. In four of these five cases, alternating crossover was the slowest (for con1, OX was the slowest). Only for the function misex1 was the minimal BDD size reached in fewer iterations with another operator, namely the original PMX.

Finally, we optimized the BDD size for benchmark functions with a larger number of variables. Table 2 compares the sizes of the BDDs with the initial order of variables and with the best order generated by the genetic algorithm. In each experiment, the initial population contains 2n chromosomes (permutations) and the maximum population size is 10n, where n is the number of input variables. The table shows that the proposed algorithm reduced the size of the BDD by 46.375% on average.

5.2 Comparison of BDD optimization by the proposed genetic algorithm and by other heuristic algorithms

The paper [11] compares the sizes of BDDs optimized by different heuristic algorithms and by a genetic algorithm with 3 types of crossover operators (OX, CX and PMX). It shows that the results produced by genetic algorithms are better than the results of the other heuristic algorithms. Table 3 compares the sizes of DDs generated by the genetic algorithm presented in [11] and by the genetic algorithm proposed in this paper.

Table 3 shows that, for the functions with a small number of variables, all algorithms found the absolute minimum. For the functions with a large number of variables, the algorithms that used the PMX or BU-PMX operator produced better results. The algorithm proposed in this paper produced the smallest BDD for 13 out of 15 functions.
Table 2: BDD size for the initial variable order and for the best order generated by the proposed genetic algorithm

Function  In/Out  Init  Optimal  Iterations  Red. ratio [%]
alu4      14/8    1352    701       300          48
cu        14/11     65     37       300          43
misex3    14/14   1301    544       300          58
misex3c   14/14    810    443       300          45
table3    14/14    941    752       300          20
b12       15/9      91     60       300          34
table5    17/15    873    683       300          22
cc        21/20    105     49       400          53
duke2     22/29    976    373       400          62
i1        25/16     58     43       500          26
misex2    25/18    140     86       500          39
vg2       25/8    1059     84       500          92
frg1      28/3     203     89       600          56
c8        28/18    145     93       600          36
in4       32/20   1109    410       600          63
unreg     36/16    146     81       600          45
Average                                          46.375

5.3 Results of FDD size optimization

As was shown above, the FDD size depends on the variable order and the polarity. To determine the mutation that should be used in FDD optimization, the genetic algorithm was run with different mutation operators on a set of functions with a small number of variables (fewer than 10). Table 4 shows the sizes of FDDs when:
• the initial order of variables and positive polarity are used (init),
• the genetic algorithm with mutation by variable exchange is used (GA, VE),
• the genetic algorithm with mutation by polarity change is used (GA, PC), and
• the genetic algorithm with both mutation operators (each with probability 0.5) is used (GA, VE+PC).
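As a sanity check on Table 2, the reduction ratios and their average can be recomputed from the initial and optimized sizes. The short sketch below assumes that each ratio is (init − optimal) / init rounded to a whole percent and that the reported average is the mean of those rounded values; the paper does not state the rounding rule, so both points are assumptions.

```python
# (initial size, optimized size) for each function, copied from Table 2.
table2 = {
    "alu4": (1352, 701), "cu": (65, 37), "misex3": (1301, 544),
    "misex3c": (810, 443), "table3": (941, 752), "b12": (91, 60),
    "table5": (873, 683), "cc": (105, 49), "duke2": (976, 373),
    "i1": (58, 43), "misex2": (140, 86), "vg2": (1059, 84),
    "frg1": (203, 89), "c8": (145, 93), "in4": (1109, 410),
    "unreg": (146, 81),
}

# Reduction ratio in percent, rounded to an integer as in the table.
ratios = {name: round(100 * (init - opt) / init)
          for name, (init, opt) in table2.items()}
average = sum(ratios.values()) / len(ratios)

for name, ratio in ratios.items():
    print(f"{name:8s} {ratio:3d} %")
print("average:", average)
```

Under these assumptions the recomputed column agrees with the table, and the mean of the rounded ratios equals the reported 46.375%.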
Table 3: Comparison of the sizes of BDDs produced by the proposed algorithm (BU-PMX) and by the existing genetic algorithm with OX, CX and PMX crossover operators

Function  In/Out   OX   CX  PMX  BU-PMX
squar5    5/8      37   37   37    37
bw        5/28    106  106  106   106
5xp1      7/10     68   69   68    68
con1      7/2      16   15   15    15
inc       7/9      72   72   72    72
misex1    8/7      36   36   36    36
sqrt8     8/4      33   33   33    33
clip      9/5     102  109   93    93
sao2      10/4     92   90   85    85
alu4      14/8    891  939  734   701
b12       15/9     70   68   50    60
t481      16/1     85   78   30    38
duke2     22/29   506  512  390   373
misex2    25/18   100  102   87    86
vg2       25/8    339  301  148    84

It can be seen from Table 4 that the FDDs with minimal sizes are generated when both mutations are used in the genetic algorithm.
Because of that, the approach with both mutation operators is used in the experiments on FDDs of functions with a larger number of variables (greater than 10). The results of these experiments are shown in Table 5. As can be seen from this table, the proposed genetic algorithm reduces the FDDs by 48.875% on average. These experiments were done with functions of up to 25 variables.
Table 4: Initial FDD sizes and sizes of FDDs generated by genetic algorithms with different mutation operators

Function  In/Out  Init  GA,VE  GA,PC  GA,VE+PC
add2      4/3        8      7      7       7
squar5    5/8       32     30     29      29
bw        5/28     144     97     93      93
inc       7/9      121     79     78      73
f51m      8/8       40     34     27      20
sqrt8     8/4       48     25     26      24

Table 5: FDD size for the initial variable order and positive polarity, and for the order and polarity generated by the proposed genetic algorithm

Function  In/Out  Init  Optimal  Iterations  Red. ratio [%]
alu4      14/8     840    541       300          36
cu        14/11     74     37       300          50
misex3    14/14   1024    764       300          25
misex3c   14/14    759    635       300          16
b12       15/9     116     62       300          47
cc        21/20     78     40       400          49
misex2    25/18    149     37       500          75
vg2       25/8     942     68       500          93
Average                                          48.875

The algorithm is also applicable to functions with a large number of variables, because the number of cases checked by the algorithm is determined by three parameters:
• the number of crossover operations done in one iteration (cx),
• the probability of applying the mutation operator (pm), and
• the maximal number of iterations (it).

The total number of created DDs is:

$$N = cx \cdot (1 + pm) \cdot it$$

If a smaller DD is needed, cx and it should be greater. If the execution time is critical, cx and it should be smaller. In our experiments, $cx = 2 \cdot n$, $pm = 0.15$ and $it = [n/5] \cdot 100$, so that

$$N = 2 \cdot n \cdot 1.15 \cdot [n/5] \cdot 100 \approx 46 \cdot n^2.$$

This is much smaller than for the brute-force exact algorithm, for which $N = 2^n \cdot n!$.
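To give a feeling for the gap between the two counts, the formulas above can be evaluated for a few values of n. The sketch below is only illustrative: the function names are hypothetical, and the bracket in [n/5] is dropped (n/5 is used without rounding), which is the same simplification that leads to the 46 · n² approximation.

```python
from math import factorial


def ga_created_dds(n, pm=0.15):
    """N = cx * (1 + pm) * it with cx = 2n and it = (n / 5) * 100."""
    cx = 2 * n
    it = (n / 5) * 100  # assumption: bracket dropped, no rounding of n/5
    return cx * (1 + pm) * it


def brute_force_dds(n):
    """Number of (polarity, order) combinations checked exhaustively: 2^n * n!."""
    return 2 ** n * factorial(n)


for n in (5, 10, 15, 25):
    print(f"n={n:2d}  GA ~ {ga_created_dds(n):10.0f}   brute force = {brute_force_dds(n):.3e}")
```

For n = 10 the genetic algorithm creates about 4600 diagrams, while the exhaustive search would have to create 1024 · 10! = 3 715 891 200. The quadratic growth of the first count against the factorial growth of the second is what makes the approach usable for the larger benchmarks in Tables 2 and 5.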
6 Conclusion

In this paper, a genetic algorithm for BDD and FDD optimization is presented. In the algorithm, a modification of the PMX operator is proposed: in the initial phase, one-point crossover is used instead of two-point crossover. It follows that, in the DD generated from the child permutation, the part of the DD in the last levels is equal to the corresponding part of the DD generated from the parent chromosome. In this way, the child chromosome inherits good properties of the parent chromosome.

For FDD optimization, the proposed algorithm introduces a polarity mutation. Experiments show that the genetic algorithm gives the best results when this mutation is used in combination with variable exchange.

In the presented algorithm, sifting is not used as an additional method to improve the generated diagrams, since our goal was to show the performance of the genetic algorithm itself. In a real application of the algorithm, sifting can be included, too.

References

[1] R. E. Bryant, “Graph-based algorithms for Boolean function manipulation,” IEEE Transactions on Computers, vol. C-35, no. 8, pp. 677–691, 1986.
[2] S. J. Friedman and K. J. Supowit, “Finding the optimal variable ordering methods for binary decision diagrams,” IEEE Transactions on Computers, vol. 39, no. 5, pp. 710–713, 1990.
[3] R. Rudell, “Dynamic variable ordering for ordered binary decision diagrams,” in Proceedings of International Conference on CAD, 1993, pp. 42–47.
[4] R. Drechsler, B. Becker, and N. Göckel, “A genetic algorithm for variable ordering of OBDDs,” in IEE Proceedings Computers and Digital Techniques, vol. 143, no. 6, 1996, pp. 363–368.
[5] R. Drechsler and N. Göckel, “Minimization of BDDs by evolutionary algorithms,” in International Workshop on Logic Synthesis (IWLS), 1997.
[6] R. Drechsler, B. Becker, and N. Göckel, “Learning heuristics for OBDD minimization by evolutionary algorithms,” in Proceedings Parallel Problem Solving from Nature (PPSN), Lecture Notes in Computer Science, vol. 1141, 1996, pp. 730–739.
[7] W. Lenders and C. Baier, “Genetic algorithms for the variable ordering problem of binary decision diagrams,” Lecture Notes in Computer Science, vol. 3469, pp. 1–20, 2005.
[8] I. Furdu and B. Patrut, “Genetic algorithm for ordered decision diagrams optimization,” in Proceedings of ICMI 45, 2006, pp. 437–444.
[9] I. Furdu and T. Socaciu, “Genetic algorithm for variable ordering of ordered binary decision diagrams,” in Proceedings of CNMI, 2007, pp. 67–78.
[10] R. Kaur and M. Bansal, “BDD ordering and minimization using various crossover operators in genetic algorithm,” International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, vol. 2, no. 3, pp. 1247–1250, 2014.
[11] S. Jindal and M. Bansal, “A novel and efficient variable ordering and minimization algorithm based on evolutionary computation,” Indian Journal of Science and Technology, vol. 8, no. 48, pp. 1–10, 2016.
[12] M. Hilgemeier, N. Drechsler, and R. Drechsler, “Minimizing the number of one-paths in BDDs by an evolutionary algorithm,” 2003.
[13] S. Shirinzadeh, M. Soeken, and R. Drechsler, “Multi-objective BDD optimization with evolutionary algorithms,” 2015, pp. 751–758.
[14] S. Stojković, M. Stanković, and C. Moraga, “Complexity reduction of Toffoli networks based on FDD,” Facta Universitatis, Ser. Electronics and Energetics, vol. 28, no. 2, pp. 251–262, 2015.
[15] S. Stojković, M. Stanković, C. Moraga, and R. Stanković, “Procedure for FDD-based reversible synthesis by levels,” 2016, pp. 1–6.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 29, No 4, December 2016, pp. 509–541
DOI: 10.2298/FUEE1604509P

P-channel MOSFET as a sensor and dosimeter of ionizing radiation

Milić M. Pejović
University of Niš, Faculty of Electronic Engineering, Niš, Serbia

Abstract. This paper presents a study of MOSFETs as sensors and dosimeters of ionizing radiation. The electrical signal used as a dosimetric parameter is the threshold voltage. The functionality of these components is based on radiation-induced ionization in SiO2, which results in an increase of the positive charge trapped in the SiO2 and of the interface traps at the Si-SiO2 interface, leading to a change in the threshold voltage.
The first part of the paper deals with the analysis of the defect precursors created by ionizing radiation, which are responsible for the creation of fixed and switching traps, as well as with the most important techniques for their separation. Afterwards, the results for sensitive p-channel MOSFETs (RADFETs) are presented, followed by the results for applications of commercially available MOSFETs as sensors of ionizing radiation.

Key words: fixed traps, fading, MOSFET, RADFET, switching traps, threshold voltage shift

Received March 22, 2016
Corresponding author: Milić M. Pejović, University of Niš, Faculty of Electronic Engineering, Aleksandra Medvedeva 14, 18000 Niš, Serbia (e-mail: milic.pejovic@elfak.ni.ac.rs)

1. Introduction

Today's research on the impact of ionizing radiation on MOSFETs is directed in two ways. The first is the production of MOSFETs with the highest possible resistance to ionizing radiation (radiation hardness), while the other is the production of ionizing radiation dosimeters. The first report on the use of a p-channel MOSFET as an integrating radiation dosimeter was published in 1970 [1], and this idea was verified by results published in 1974 [2]. Further investigations led to the manufacture of radiation sensitive p-channel MOSFETs, also known as radiation sensitive field effect transistors (RADFETs) or pMOS dosimeters [3]. The RADFET has been shown to be suitable for dose measurements in various applications, such as diagnostic radiology and radiotherapy [4]–[8], space radiation monitoring [9]-[12], irradiation of food plants [13] and personal dosimetry [14], [15].

The RADFET radiation-sensitive region, the oxide film layer under the Al gate, is typically 1 µm × 200 µm × 200 µm, i.e., the sensing volume is much smaller than that of competing integral dose measuring devices such as the ionization chamber or the thermoluminescent dosimeter, implying that it can also be used in in vivo dosimetry [16], [17]. This property of the RADFET also makes it attractive for measurements in gradient radiation fields where the gradient mostly depends on a single space coordinate, such as resolving the dose of x-ray microbeams or depth dose distributions [18]. The advantages of RADFETs include immediate, nondestructive readout of dosimetric information, real time or delayed reading, possible integration with other sensors and/or electronics, wide dose range, accuracy and competitive price [19], [20].

The application of RADFETs for dosimetry in hadron therapy, which is one of the promising radiation modalities in radiotherapy, is another field where their advantages can be exploited. Hadron therapy includes fast neutron therapy, proton therapy, heavy ion therapy and boron-neutron capture therapy. It has been shown [21] that RADFETs are less sensitive to neutron radiation than to photons or charged particles. On the other hand, a disadvantage of RADFETs is the need for separate calibration in fields of different modalities and energies. Moreover, RADFETs have a certain range of the total accumulated dose, which depends on the dosimeter type and sensitivity. Once the upper limit of linearity is reached, the RADFETs need to be replaced. However, recent studies have shown that such RADFETs can be recovered for reuse by storing them at room or elevated temperature for a sufficient time [22], [23] or by annealing with current [24], [25].

The dosimetry of ionizing radiation using radiation sensitive MOSFETs is based on converting the threshold voltage shift $\Delta V_T$ into the radiation dose D. This shift originates from the radiation-induced electron-hole pairs in the gate oxide layer of the transistor, which lead to an increase in the density of interface traps and to the build-up or neutralization of positive trapped charge. The sensitivity of RADFETs can be adjusted, which makes them suitable for various applications.
For example, the sensitivity can be tuned by using different gate oxide layer thicknesses [26], [27], or in some cases by stacking transistors [14], [15]. The sensitivity can also be tuned by applying a positive bias on the gate during irradiation [28], [29].

2. The defect precursors created by ionizing radiation

Ionizing radiation leads to the formation of a large number of defects in SiO2 and at the SiO2-Si interface, which are responsible for the MOSFET threshold voltage shift. The defects that have a significant impact on device performance are discussed below.

2.1. Photon induced ionization

During gamma or x-ray irradiation, photons interact with the electrons in the SiO2 molecules, releasing secondary electrons and holes, i.e., photons break $\mathrm{Si_o}-\mathrm{O}$ and $\mathrm{Si_o}-\mathrm{Si_o}$ covalent bonds in the oxide [30] (the index o is used to denote a silicon atom in the oxide). The released electrons (so-called “secondary electrons”), which are highly energetic, may recombine with holes at the place of production, or may escape recombination. The secondary electrons that escape recombination with holes travel some distance until they leave the oxide, losing their kinetic energy through collisions with the bonded electrons in the $\mathrm{Si_o}-\mathrm{O}$ and $\mathrm{Si_o}-\mathrm{Si_o}$ covalent bonds in the oxide and releasing more secondary electrons (the latter bond represents an oxygen vacancy).

Each secondary electron, before it leaves the oxide or recombines with a hole, can break many covalent bonds in the oxide and produce many new highly energetic secondary electrons, since its energy is usually much higher than the energy of an impact ionization process (an energy of 18 eV is necessary for the creation of one electron-hole pair [30], i.e., for the ionization of a molecule).
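The 18 eV figure allows a rough estimate of how many electron-hole pairs a given dose creates in the oxide. The sketch below is a back-of-the-envelope calculation, not a result from the paper: it assumes that all absorbed energy goes into pair creation, ignores recombination, and uses an oxide density of about 2.2 g/cm³, which is an assumed typical value that does not appear in the text.

```python
E_PAIR_EV = 18.0               # energy per electron-hole pair, from the text [30]
EV_TO_J = 1.602176634e-19      # joules per electronvolt (exact SI value)
RHO_SIO2_KG_PER_CM3 = 2.2e-3   # assumed SiO2 density, about 2.2 g/cm^3


def pairs_per_cm3(dose_gy):
    """Electron-hole pairs created per cm^3 of oxide for a dose in gray."""
    energy_j_per_cm3 = dose_gy * RHO_SIO2_KG_PER_CM3  # 1 Gy = 1 J/kg
    return energy_j_per_cm3 / (E_PAIR_EV * EV_TO_J)


print(f"{pairs_per_cm3(1.0):.2e} pairs per cm^3 per Gy")
```

Under these assumptions one gray corresponds to several times 10¹⁴ pairs per cm³. Only the fraction of holes that escapes recombination and is later trapped contributes to the threshold voltage shift, so this number is an upper bound on the generated charge, not a sensitivity estimate.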
It is obvious that the secondary electrons play a more important role in bond breaking than the highly energetic photons, as a consequence of the difference in their effective masses, i.e., in their effective cross sections. The electrons leaving the place of production escape the oxide very fast (within several picoseconds), but the holes remain in the oxide. The holes released in the oxide bulk are usually only temporarily, not permanently, trapped at the place of production, since there are no energetically deeper centers in the oxide bulk. The holes move toward one of the interfaces (SiO2-Si or SiO2-gate), depending on the direction of the oxide electric field, where they are trapped at energetically deeper hole trap centers [31], [32]. Moreover, even in the case of zero gate voltage, the electrical potential due to the work function difference between the gate and the substrate is high enough to move the holes partially or completely towards an interface.

2.2. The defects created by secondary electrons in impact ionization

Secondary electrons passing through the oxide bulk break covalent bonds in the oxide by impact ionization and create the $\equiv\mathrm{Si_o}-\mathrm{O}^{\bullet}\;{}^{+}\mathrm{Si_o}\equiv$ complex, where $\equiv$ denotes the three $\mathrm{Si_o}-\mathrm{O}$ bonds ($\mathrm{O_3}\equiv\mathrm{Si_o}-\mathrm{O}$) and $\bullet$ denotes the unpaired electron. The formed complex is energetically very shallow and represents a temporary hole centre (the trapped holes can easily leave it [33]). The strained silicon-oxygen bonds $\equiv\mathrm{Si_o}-\mathrm{O}-\mathrm{Si_o}\equiv$, mainly distributed near the interfaces, can also be easily broken by the passing secondary electrons, usually creating a non-bridging-oxygen (NBO) centre, $\equiv\mathrm{Si_o}-\mathrm{O}^{\bullet}$, and a positively charged E′ center, $\equiv\mathrm{Si_o}^{+}$ [34], known as the E′s center [35]. An NBO centre is an amphoteric defect that can be charged negatively, by trapping an electron, more easily than positively. The NBO, as an energetically deeper centre, is the main precursor of the traps (defects) in the oxide bulk and the interface regions.
a secondary electron passing through the oxide can also collide with an electron in the strained oxygen vacancy bond $\equiv\mathrm{Si_o{-}Si_o}\equiv$, which is a precursor of an $E'$ centre ($\equiv\mathrm{Si_o^{+}}$) [36], breaking this bond and knocking out an electron. the oxygen vacancy bonds are mainly distributed in the vicinity of the interfaces. the trapped charge can be positive (oxide trapped holes) or negative (oxide trapped electrons); the former is more important, since the hole trapping centers are more numerous, including $E'_s$, $E'$ and nbo centers, compared with the single electron trap centre (nbo). the holes and electrons trapped near the si-sio2 interface have the biggest effect on mosfet characteristics, since they have the strongest influence on the channel carriers.

2.3. defects created in sio2 by hole transport

the $\equiv\mathrm{Si_o^{+}}$ centers, formed from oxygen vacancies and strained silicon-oxygen bonds, are energetically deep and stable hole traps at which the holes can remain for a longer period of time, i.e., holes trapped there are harder to neutralize by electrons than shallowly trapped holes. these centers exist near both interfaces, especially near the si-sio2 interface. the holes created and trapped at the bulk defects, which represent energetically shallow centers, are forced to move towards one of the interfaces under the electric field, where they are trapped at deeper traps, since there are many oxygen vacancies as well as many strained silicon-oxygen bonds near the interfaces; all positive trapped charge is therefore grouped there. the holes leave the energetically shallow centers in the oxide spontaneously and are transported to the interface (fig. 1(a)) by a hopping process, either via shallow centers in the oxide, where the holes “hop” from one center to another (fig. 1(b)), or via the oxide valence band (fig. 1(c)) [31], [37]. fig.
1 displays the hole transport in space for positive gate bias (a) and the energy diagrams of the possible mechanisms of this process (b) and (c).

fig. 1 space diagram: (a) hole transport through the gate oxide layer in the case of positive gate bias. “x” represents unbroken bonds and “o” broken bonds (trapped holes at shallow traps), respectively, and the remaining symbol represents the hole trap precursors near the interface (precursors of a deep trap). energetic diagram: the hole transport (b) by tunneling between two localized traps and (c) by the oxide valence band.

fig. 2 shows the possible hole (electron) tunneling between adjacent centers: a shallow centre and a deep centre. when there is no gate bias (fig. 2(a)), hole (electron) tunneling between these centers is not possible. when the transistor is positively biased (fig. 2(b)), the bonded electron can tunnel from the deep centre to the shallow centre. this is equivalent to a hole tunneling from the shallow to the deep centre, where it is permanently trapped. the electron, which is now in the shallow centre, can easily tunnel to the next adjacent shallow centre, enabling hole transport towards the interface [31].

fig. 2 the electron tunneling between adjacent centers: (a) shallow centre and (b) deep centre.

moving through the oxide, the holes react with the hydrogen defects $\equiv\mathrm{Si_o{-}H}$ and $\equiv\mathrm{Si_o{-}OH}$ and finally create $E'_s$, $E'$ and nbo centers, hydrogen ions $\mathrm{H^{+}}$ and hydrogen atoms $\mathrm{H^{0}}$. $\mathrm{H^{+}}$ ions and $\mathrm{H^{0}}$ atoms are important for defect creation at the sio2-si interface (see the next subsection). when the holes reach the interface, they can break both the strained oxygen vacancy bonds $\equiv\mathrm{Si_o{-}Si_o}\equiv$, forming $E'$ centers [34], and the strained silicon-oxygen bonds $\equiv\mathrm{Si_o{-}O{-}Si_o}\equiv$, creating $E'_s$ and nbo centers [31]. these centers represent energetically deeper hole and electron trapping centers, respectively.
it should be noted that the energy levels of the defects formed after holes have been trapped at $E'_s$ and $E'$ centers, and electrons at nbo centres, can vary. chemically identical defects behave differently depending on the whole bond structure: the angles and distances between the surrounding atoms [38]-[43].

2.4. defects created at sio2-si interface

the defects at the sio2-si interface, known as true interface traps, represent an amphoteric defect $\mathrm{Si_3}\equiv\mathrm{Si_s^{\bullet}}$ (the index s denotes a silicon atom in the substrate): a silicon atom $\mathrm{Si_s^{\bullet}}$ at the sio2-si interface back bonded to three silicon atoms from the substrate, $\equiv\mathrm{Si_s}$, usually denoted as $\equiv\mathrm{Si_s^{\bullet}}$ or $\mathrm{Si^{\bullet}}$. they can be created directly by incident photons passing to the substrate or the gate [44], [45], but this amount can be neglected. interface traps are mainly created by trapped holes ($h^{+}$ model) [46]-[49] and by hydrogen released in the oxide (hydrogen-released species model, or $\mathrm{H}$ model) [50]-[52].

the $h^{+}$ model proposes that a hole trapped near the sio2-si interface creates an interface trap, suggesting that an electron-hole recombination mechanism is responsible [47]. namely, when holes are trapped near the interface and electrons are subsequently injected from the substrate, recombination occurs. the energy released by this electron-hole recombination may create an interface state.

the $\mathrm{H}$ model proposes that $\mathrm{H^{+}}$ ions are released in the oxide when trapped holes interact with $\equiv\mathrm{Si_o{-}H}$ and $\equiv\mathrm{Si_o{-}OH}$ defects, and that these ions drift toward the sio2-si interface under the positive electric field. when an $\mathrm{H^{+}}$ ion arrives at the interface, it picks up an electron from the substrate, becoming a highly reactive hydrogen atom $\mathrm{H^{0}}$ [53]. also, according to the $\mathrm{H}$ model, the hydrogen atoms $\mathrm{H^{0}}$ released in the reaction of holes with $\equiv\mathrm{Si_o{-}H}$ and $\equiv\mathrm{Si_o{-}OH}$ defects diffuse towards the sio2-si interface under the existing concentration gradient.
these atoms react without an energy barrier at the interface, producing interface traps in interaction with the interface trap precursors $\equiv\mathrm{Si_s{-}H}$ and $\equiv\mathrm{Si_s{-}OH}$ [54]-[56]. besides creating interface traps, the interaction of $\mathrm{H^{0}}$ atoms with the $\equiv\mathrm{Si_s{-}H}$ and $\equiv\mathrm{Si_s{-}OH}$ precursors leads to the creation of h2 and h2o molecules, respectively [31], [53]. h2 molecules diffuse towards the bulk of the oxide, where they are cracked at $\mathrm{CC^{+}}$ centers [57]. this cracking process ensures a continuous source of $\mathrm{H^{+}}$ ions, which drift towards the interface and form interface traps [58].

2.5. classification of defects according to their influence on i-v characteristics

the above mentioned defects can be divided into fixed traps (ft) and switching traps (st). ft are traps in the oxide that are not able to exchange charge with the channel (substrate) within the time frame of the transfer/subthreshold characteristic measurement [59]. ft can be either negatively or positively charged, and they attract or repel the channel carriers by the coulomb force, depending on the signs of their charge and of the channel carrier charge. st are traps created near and at the sio2-si interface, and they do capture (communicate with) carriers from the channel within the time frame of the transfer/subthreshold characteristic measurement [59]. the st created in the oxide near the sio2-si interface are called slow switching traps (sst), while the st created at the interface are called fast switching traps (fst), also called true interface traps. the sst, located in the oxide close to the sio2-si interface, are also known as slow states (ss) [60], anomalous positive charge (apc) [61], [62], switching oxide traps (sot) [63] and border traps [64]. the influence of ft and st on the transistor subthreshold characteristics is manifested through a parallel shift and a change of slope, respectively.
ft are usually deeper in the oxide, and during long post-irradiation annealing they can only be permanently recovered or temporarily compensated (as in the case of switching gate bias experiments). it is emphasized that fst are amphoteric, and each of them contributes two states within the silicon band gap (an acceptor and a donor), which can be randomly distributed inside it.

3. transistor characterization

there are several techniques for ft and st separation [65]. the most commonly used are the subthreshold midgap technique and the charge pumping technique. their basic principles are presented below.

3.1. subthreshold midgap technique

fig. 3 subthreshold characteristics of radfets with 100 nm thick gate oxide manufactured by tyndall national institute, cork, ireland: (0) before gamma-ray irradiation and (1) after irradiation to 500 gy.

the midgap-subthreshold (mg) technique [59] for the determination of ft and st densities is based on the analysis of mosfet subthreshold characteristics. namely, ft and st influence the transistor subthreshold characteristics in saturation through their parallel shift and the change of their slope, respectively. the first step is a linear regression of the linear regions of the subthreshold characteristics (fig. 3). the linear regression gives a straight line $\log(I_D) = m \cdot V_G + n$. the next step in the procedure is the calculation of the midgap current before irradiation, $I_{mg0}$, and after irradiation, $I_{mg}$. the midgap current is calculated using the subthreshold-current equation for a transistor in saturation [66]:

$$I_D = \frac{\sqrt{2}\,\beta\, q N_{A,D} L_D}{2 C_{0x}} \, \frac{kT}{q} \left(\frac{n_i}{N_{A,D}}\right)^{2} \left(\frac{q\psi_s}{kT}\right)^{-1/2} \exp\!\left(\frac{q\psi_s}{kT}\right), \tag{1}$$

where $\beta = \mu W C_{0x} / L_{eff}$ and $L_D = \sqrt{\varepsilon_s kT / (q^{2} N_{A,D})}$ is the debye length.
in this equation $C_{0x}$ is the oxide capacitance per unit area, $k$ is the boltzmann constant, $q$ is the absolute value of the electron charge, $T$ is the absolute temperature, $n_i$ is the intrinsic carrier concentration, $N_{A,D}$ is the doping concentration, $\varepsilon_s$ is the silicon permittivity, $\psi_s$ is the surface potential, $\mu$ is the carrier mobility, $W$ is the channel width and $L_{eff}$ is the effective channel length.

regardless of their distribution within the substrate energy gap, interface traps are electrically neutral (total charge equals zero) when the surface potential $\psi_s$ is equal to the fermi potential $\varphi_F$, which is the case when the fermi level is in the middle of the semiconductor energy gap. in that case, the shift between two subthreshold characteristics along the $V_G$ axis is a consequence of the charge of ft only. the gate voltage that corresponds to this surface potential is denoted as $V_{mg}$ (midgap voltage), and it can be obtained as the abscissa of the ($V_{mg}$, $I_{mg}$) point on the subthreshold characteristic (fig. 3). using the equation $\log(I_D) = m \cdot V_G + n$, obtained by the linear fit of the subthreshold characteristic, $V_{mg}$, i.e., the $V_G$ that corresponds to $I_D = I_{mg}$, can be found as $V_{mg} = [\log(I_{mg}) - n]/m$. using this procedure, $V_{mg0}$ and $V_{mg}$ are found. fig. 3 shows the region used for the linear fit, and the straight lines obtained by the linear fits of the subthreshold characteristics are extended to the corresponding midgap current $I_{mg}$. the component of the threshold voltage shift due to ft, $\Delta V_{ft}$, is

$$\Delta V_{ft} = \Delta V_{mg} = V_{mg} - V_{mg0}, \tag{2}$$

where $V_{mg0}$ and $V_{mg}$ are the midgap voltages before and after irradiation, respectively. the component of the threshold voltage shift due to st, $\Delta V_{st}$, is

$$\Delta V_{st} = (V_T - V_{mg}) - (V_{T0} - V_{mg0}) = V_s - V_{s0}, \tag{3}$$

where $V_{T0}$ and $V_T$ are the transistor threshold voltages before and after irradiation, respectively, and the threshold voltage shift is $\Delta V_T = V_T - V_{T0}$.
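the steps of the mg procedure can be sketched in a few lines of code. this is only an illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the midgap currents are taken as given inputs rather than computed from expression (1), and the conversion of a voltage shift into an areal trap density assumes the usual relation $\Delta N = C_{0x}|\Delta V|/q$ with a relative sio2 permittivity of 3.9.

```python
import math

Q = 1.602176634e-19        # elementary charge [C]
EPS0 = 8.8541878128e-14    # vacuum permittivity [F/cm]
EPS_R_SIO2 = 3.9           # ASSUMPTION: relative permittivity of SiO2


def linear_fit(x, y):
    """Ordinary least-squares fit y = m*x + n; returns (m, n)."""
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def midgap_voltage(vg, i_d, i_mg):
    """Fit log10|I_D| = m*V_G + n and extrapolate to I_D = I_mg."""
    m, n = linear_fit(vg, [math.log10(abs(i)) for i in i_d])
    return (math.log10(i_mg) - n) / m


def split_shift(vmg0, vmg, vt0, vt):
    """Expressions (2) and (3): shift components due to FT and ST."""
    dv_ft = vmg - vmg0
    dv_st = (vt - vmg) - (vt0 - vmg0)
    return dv_ft, dv_st


def areal_density(dv, t_ox_nm):
    """Trap density [cm^-2] for a voltage shift dv [V] and oxide thickness [nm]."""
    c_ox = EPS_R_SIO2 * EPS0 / (t_ox_nm * 1e-7)  # oxide capacitance [F/cm^2]
    return c_ox * abs(dv) / Q
```

only the points from the linear part of the subthreshold characteristic should be passed to `midgap_voltage`; selecting that region is left to the user, as it is in the procedure described above.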
$V_{T0}$ and $V_T$ are determined from the transfer characteristics in saturation as the intersection between the $V_G$ axis and the extrapolated linear region of the $\sqrt{I_D} = f(V_G)$ curves, which are modeled by the following equation [66]:

$$I_D = \frac{\mu C_{0x} W}{2 L_{eff}} (V_G - V_T)^{2}. \tag{4}$$

the total threshold voltage shift $\Delta V_T$ can be expressed as [67]:

$$\Delta V_T = \Delta V_{ft} + \Delta V_{st}, \tag{5}$$

$$\Delta V_T = \pm \frac{q}{C_{0x}} \Delta N_{ft} + \frac{q}{C_{0x}} \Delta N_{st}, \tag{6}$$

where $q$ is the absolute value of the electron charge, $\Delta N_{ft}$ is the areal density of ft and $\Delta N_{st}$ is the areal density of st. the signs “+” and “-” are for p-channel and n-channel mosfets, respectively. as can be seen from expression (6), both ft and st contribute to the threshold voltage shift of p-channel mosfets in the same direction. also, the so-called “rebound effect” [30] is absent in p-channel mosfets: this phenomenon is due to the competing effects of the positive charge in the oxide and the negative interface traps generated in n-channel mosfets, which lead to positive or negative $\Delta V_T$ values depending on the relative values of $\Delta N_{ft}$ and $\Delta N_{st}$. this is the reason why p-channel mosfets are more commonly used as sensors or dosimeters of ionizing radiation. it is emphasized that $\Delta N_{ft}$ may contain a small amount of sst that are located deeper in the oxide, since there is not enough time for the carriers from the channel to reach them within the measurement time frame.

3.2. charge pumping technique

as opposed to the mg technique, the charge-pumping (cp) technique does not give the changes in the densities of both the positive oxide trapped charge and the interface traps. it is used solely to determine the interface trap density, while the positive oxide trapped charge can subsequently be determined from expression (6), provided that the change in threshold voltage is known [68]-[70].

fig. 4 schematic diagram of charge pumping measurement.

the charge-pumping effect can be explained on the basis of the scheme in fig. 4 [69].
the source and the drain of the transistor are short-circuited, and the p-n junctions of the source and drain with the substrate are reverse biased with the voltage $V_R$. in the absence of a signal at the gate, the reverse saturation current of the source-substrate and drain-substrate junctions flows under this reverse bias. when a train of rectangular pulses of sufficiently high amplitude is applied to the gate (by a pulse generator), the direction of the substrate current changes. the intensity of that current is proportional to the pulse frequency, i.e., the same amount of electric charge is “pumped” towards the substrate in each pulse period. as current cannot flow through the oxide, the electric charge arrives in the substrate through the p-n junctions of the source and drain. in the case of n-channel mosfets, a channel is formed under the gate during the positive pulse half-period, whereby electrons are captured at interface traps. during the negative half-period, when the channel area goes into accumulation, the mobile electrons from the channel return to the source and drain, and the captured electrons recombine with holes from the accumulation layer, thereby generating the cp current $I_{cp}$, whose maximum value $I_{cp,max}$ is expressed by [70]

$$I_{cp,max} = q^{2} A_G f D_{it} \Delta\psi_s = q A_G f D_{it} \Delta E, \tag{7}$$

where $A_G$ is the area under the gate active in charge pumping, $f$ is the pulse frequency, $D_{it}$ is the energy density of interface traps, and $q\Delta\psi_s = \Delta E$, with $\Delta\psi_s$ being the total sweep of the surface potential that corresponds to $\Delta E$. in order to avoid recombination with channel electrons, it is necessary to ensure that they return to the source and drain before holes from the substrate flood the surface. this is accomplished by reverse biasing the p-n junctions, or by using a train of trapezoidal or triangular pulses with sufficiently long rise time $t_r$ and fall time $t_f$.
however, the electrons captured at the shallowest traps are in the meantime thermally emitted into the conduction band of the substrate, which reduces the width of the interface trap energy range measured by the cp technique, so that the cp current is generated by interface traps within the range [70]

$$\Delta E = -2kT \ln\!\left( v_{th}\, n_i \sqrt{\sigma_n \sigma_p}\, \frac{|V_{FB} - V_T|}{|\Delta V_G|} \sqrt{t_r t_f} \right), \tag{8}$$

which lies within 0.5 ev of the middle of the forbidden band. in expression (8), $v_{th}$ is the thermal velocity, $\sigma_n$ and $\sigma_p$ are the capture cross sections of the carriers, $n_i$ is the intrinsic carrier concentration in the semiconductor and $\Delta V_G$ is the pulse height.

fig. 5 elliot-type cp curves of radfets with a 100 nm thick gate oxide manufactured by tyndall national institute, cork, ireland: (0) before gamma-ray irradiation and (1) after irradiation to 500 gy.

the absolute value of the interface trap density $N_{it}$ can be calculated using equation (7) and $N_{it} = D_{it}\Delta E$:

$$N_{it} = \frac{I_{cp,max}}{q A_G f}. \tag{9}$$

the change in the areal density of interface traps is $\Delta N_{it}(cp) = N_{it}(t) - N_{it0}$, where $N_{it}(t)$ is the absolute value of the interface trap density after irradiation time $t$ and $N_{it0}$ is the absolute value of the interface trap density before irradiation. $I_{cp,max}$ (fig. 5) is directly proportional to the pulse frequency, and a small-size transistor with a usual state density needs a frequency of at least several khz for the charge-pumping current to reach the order of picoamperes. for this reason, cp measurements are most often conducted at frequencies between 100 khz and 1 mhz, whereby only fst (true interface traps) are registered (at some frequencies, sst, which also capture electrons from the channel, contribute to the cp current as well [71]). as the cp technique requires a separate substrate terminal, it could be concluded that it is not applicable to power vdmosfets, in which the p-bulk is technologically connected to the source.
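expressions (8) and (9) are simple enough to evaluate directly. the sketch below is illustrative only: the function names are hypothetical, and any numerical values passed to them (capture cross sections, thermal velocity, pulse timing, gate area) are assumptions of the user, not data from this paper.

```python
import math

Q = 1.602176634e-19       # elementary charge [C]
K_B_EV = 8.617333262e-5   # Boltzmann constant [eV/K]


def nit_from_icp(icp_max_a, freq_hz, area_cm2):
    """Expression (9): N_it = I_cp,max / (q * A_G * f), in cm^-2."""
    return icp_max_a / (Q * area_cm2 * freq_hz)


def energy_window_ev(temp_k, v_th, n_i, sigma_n, sigma_p,
                     v_fb, v_t, dv_g, t_r, t_f):
    """Expression (8): energy range [eV] scanned by the charge-pumping signal.

    Units: v_th [cm/s], n_i [cm^-3], sigma_n and sigma_p [cm^2],
    voltages [V], t_r and t_f [s].
    """
    arg = (v_th * n_i * math.sqrt(sigma_n * sigma_p)
           * abs(v_fb - v_t) / abs(dv_g) * math.sqrt(t_r * t_f))
    return -2.0 * K_B_EV * temp_k * math.log(arg)


def mean_dit(nit_cm2, delta_e_ev):
    """Mean interface trap density per unit energy, D_it = N_it / dE [cm^-2 eV^-1]."""
    return nit_cm2 / delta_e_ev


def delta_nit(nit_after, nit_before):
    """Radiation-induced change of the interface trap density [cm^-2]."""
    return nit_after - nit_before
```

because $N_{it}$ in expression (9) depends only on the measured current, the frequency and the gate area, it can be extracted without knowing the capture cross sections; these are needed only if the mean $D_{it}$ over the window of expression (8) is of interest.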
for power vdmosfets, however, the cp technique is applicable in a somewhat altered form (see [72], [73] for more details). the density $\Delta N_{it}$ found by the cp technique using expression (9) is, in fact, also the switching trap density, $\Delta N_{st}(cp) \equiv \Delta N_{it}(cp)$. a very useful feature of the cp technique is that, being a much faster technique, it senses only the fst and possibly just the fastest among the sst. hence, the density of st measured by the cp technique is indicative of true interface trap (fst) behavior, i.e., $\Delta N_{st}(cp) = \Delta N_{fst}$. the simultaneous use of both techniques has a great advantage. for instance, a difference in the behavior of $\Delta N_{st}(mg)$ and $\Delta N_{st}(cp)$ is a consequence of sst [74], since $\Delta N_{st}(mg) = \Delta N_{sst} + \Delta N_{fst}$ and $\Delta N_{st}(cp) = \Delta N_{fst}$.

3.3. single point threshold voltage shift measurements

fig. 6 threshold voltage measurement configuration based on constant current.

as mentioned above, one of the methods for threshold voltage determination is based on the transfer characteristics in saturation, as the intersection between the $V_G$ axis and the extrapolated linear region of the $\sqrt{I_D} = f(V_G)$ curves modeled by equation (4). the single point threshold voltage measurement requires measuring the drain-source voltage while the transistor remains biased by a constant drain current and the gate and drain terminals are short-circuited (fig. 6) [75]. in this configuration, the source-drain voltage shift is taken as $\Delta V_T$. the drain-source voltage can be monitored continuously during irradiation. most of the commercial dosimetry systems based on mosfets measure increments of the drain-source voltage at constant drain current [76]-[78]. usually, in order to minimize the thermal drift, the drain current selected is the zero temperature coefficient current, $I_{ZTC}$, for which the thermal dependence of the drain-source voltage cancels out.
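one simple way to locate $I_{ZTC}$ from measured data is to look for the readout current at which the output voltages recorded at different temperatures differ the least. the sketch below illustrates this idea under stated assumptions: the function names are hypothetical, all curves are assumed to be sampled on the same current grid, and `synthetic_vout` is a made-up toy model used only to exercise the search, not a model of a real radfet.

```python
import math


def ztc_current(currents, vout_by_temp):
    """Return (I_ztc, spread): the grid current where V_out varies least with T.

    currents      -- list of readout currents (common grid for all curves)
    vout_by_temp  -- dict mapping temperature -> list of V_out values
    """
    best_current, best_spread = None, float("inf")
    for k, current in enumerate(currents):
        column = [curve[k] for curve in vout_by_temp.values()]
        spread = max(column) - min(column)
        if spread < best_spread:
            best_current, best_spread = current, spread
    return best_current, best_spread


def synthetic_vout(i_ua, temp_c, i_cross_ua=12.0):
    """TOY MODEL (assumption): curves that cross exactly at i_cross_ua."""
    root = math.sqrt(i_ua)
    return 2.0 + 0.3 * root + 1e-3 * (temp_c - 25.0) * (root - math.sqrt(i_cross_ua))


if __name__ == "__main__":
    grid_ua = list(range(1, 151))                       # 1 to 150 uA
    curves = {t: [synthetic_vout(i, t) for i in grid_ua]
              for t in (25.0, 50.0, 75.0, 100.0)}
    print(ztc_current(grid_ua, curves))
```

with real measurements the curves rarely cross in a single point, so the minimum spread is a practical criterion; interpolating between grid points would refine the estimate.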
when the $I - V_{out}$ characteristics are measured at different temperatures, they all intersect at the same point. fig. 7 presents the readout current, ranging from 1 to 150 µa, versus the voltage $V_{out}$ ($V_{SD}$), measured at temperatures from 25 to 100 °c for radfets with 400 nm thick gate oxide manufactured by tyndall national institute, cork, ireland. as can be seen, all of these curves intersect in the vicinity of 12 µa. it can be concluded that selecting this current would minimize the effect of temperature on the threshold voltage.

fig. 7 single-point characteristics of radfets with a 400 nm thick gate oxide layer manufactured by tyndall national institute, cork, ireland at various temperatures.

4. radfet as a sensor and dosimeter of ionizing radiation

as stated before, the first results on the application of mosfets in dosimetry were published by andrew holmes-siedle in 1974 [2], where the basic principles of the application of these devices as sensors and dosimeters of ionizing radiation were presented. several research groups dealing with similar problems appeared afterwards, including a canadian research group [79], a usa navy research group [80], [81], a french research group [82], [83], and groups from the netherlands [84], [85], the usa [86], [87] and serbia [88]-[90]. a large number of companies and institutes throughout the world are engaged in the production of radiation sensitive mosfets. among them is tyndall national institute, cork, ireland, which produces radfets with gate oxide thicknesses of 100 nm, 400 nm and 1 µm. some results related to these components are presented in this paper with regard to several important dosimetric parameters.

4.1. sensitivity of radfets irradiated by gamma rays

4.1.1. influence of gate bias

fig. 8 shows the threshold voltage shift $\Delta V_T$ of radfets with 100 nm gate oxide layer thickness for gamma-ray radiation dose $D$ in the range from 100 to 500 gy, without gate bias and with gate bias during irradiation $V_{irr}$ = 5 v [91].
it can be seen that for irradiation without gate bias in the range from 0 to 500 gy, $\Delta V_T$ increases by about 0.4 v. for the same dose interval and a gate bias of 5 v, $\Delta V_T$ increases by about 2.3 v. the sensitivity is defined as $\Delta V_T / D$, so it can be concluded that the gate bias $V_{irr}$ = 5 v significantly increases the sensitivity of the radfet.

fig. 8 threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$, for radfets with a 100 nm thick gate oxide layer, without and with gate bias during irradiation of $V_{irr}$ = 5 v.

in order to determine the contributions of ft and st to the total $\Delta V_T$ during irradiation, their densities were determined using the mg and cp techniques, and the results are presented in figs. 9 and 10, respectively. it can be seen that an increase in radiation dose $D$ leads to an increase in both $\Delta N_{ft}$ and $\Delta N_{st}$, and that these increases are smaller for radfets irradiated without gate bias. also, for the same values of $D$ and $V_{irr}$, the increase in ft density is larger than the increase in st density. the st density $\Delta N_{st}(mg)$ determined using the mg technique is larger than the st density $\Delta N_{st}(cp)$ determined using the cp technique. this is due to the fact that the mg technique determines both sst and fst, while the cp technique determines only fst (true interface traps). from figs. 9 and 10 it can be seen that the ft density is about an order of magnitude larger than the st density obtained using the mg technique. these results show that ft play a crucial role in the threshold voltage shift.

fig. 9 the change in areal density of fixed traps $\Delta N_{ft}$ as a function of radiation dose $D$, for radfets with a 100 nm thick gate oxide layer, without and with gate bias during irradiation $V_{irr}$ = 5 v.

fig. 10 the change in areal density of switching traps as a function of radiation dose $D$, for radfets with a 100 nm thick gate oxide layer, obtained by the mg and cp techniques without and with gate bias of $V_{irr}$ = 5 v.

fig.
11 shows the $\Delta V_T = f(D)$ dependence of radfets with a gate oxide layer thickness of 100 nm for gamma-ray radiation dose in the range from 0 to 50 gy [92]. during the irradiations the gate biases $V_{irr}$ were 0, 1.25, 2.5, 3.75 and 5 v. for the same dose, the threshold voltage shift increases with increasing gate bias. radiation doses up to 50 gy did not degrade the linearity of the radfets significantly, which is important for practical applications of these devices.

fig. 11 threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for radfets with a 100 nm thick gate oxide layer for various gate biases $V_{irr}$ during irradiation.

in general, one can express the dependence of $\Delta V_T$ on $D$ as:

$$\Delta V_T = A \cdot D^{n}, \tag{10}$$

where $A$ is a constant and $n$ is the degree of linearity. ideally, $n = 1$, and the dependence is linear with the sensitivity $S = \Delta V_T / D$. the correlation coefficients of the linear fits for all values of $V_{irr}$ (fig. 11) are $r^2$ = 0.999. since the $r^2$ values are very close to one, it can be assumed that there is a linear dependence between $\Delta V_T$ and $D$, and that the sensitivity of radfets for a given value of $V_{irr}$ is the same in the range from 0 to 50 gy.

fig. 12 sensitivity $S$ as a function of gate bias $V_{irr}$ during irradiation for radfets with a 100 nm thick gate oxide layer for radiation dose of 50 gy.

fig. 12 shows the sensitivity $S$ as a function of gate bias $V_{irr}$ for a gamma-ray radiation dose of 50 gy for radfets with 100 nm gate oxide thickness [92]. the symbols stand for experimental data, while the solid lines represent fits, which are exponential. fig. 13 shows $\Delta V_T = f(D)$ of radfets with a 400 nm gate oxide layer thickness for gamma-radiation dose in the range from 0 to 5 gy and $V_{irr}$ = 0 v and $V_{irr}$ = 5 v [78], [93]. expression (10) describes the experimental data very well, since the correlation coefficient is $r^2$ = 0.999. these results, as well as those presented in figs.
8 and 12, show that an increase in gate bias during irradiation leads to an increase in $\Delta V_T$, i.e., the sensitivity is increased as well.

fig. 13 threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for radfets with a 400 nm thick gate oxide layer in the case without gate bias and with gate bias of $V_{irr}$ = 5 v.

4.1.2. influence of gate oxide layer thickness

fig. 14 shows the threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for radfets with gate oxide layer thicknesses of 100 nm, 400 nm and 1 µm. the gamma-ray irradiation of these devices was performed in the dose range from 0 to 50 gy, while the gate bias was $V_{irr}$ = 5 v [92]. it can be seen that an increase in gate oxide layer thickness leads to a significant increase in $\Delta V_T$ for the same radiation dose. this is mainly due to the increase in ft density [94]. fitting the experimental data using expression (10) with $n = 1$ gives a correlation coefficient of $r^2$ = 0.999 for radfets with 100 and 400 nm gate oxide thickness, which confirms a linear dependence between $\Delta V_T$ and $D$, i.e., the sensitivity is the same over the considered dose range, and it is higher for the 400 nm gate oxide layer thickness. for radfets with 1 µm gate oxide thickness, the correlation coefficient is $r^2$ = 0.976, so there is no linear dependence between $\Delta V_T$ and $D$, and hence the sensitivity differs for different values of radiation dose.

fig. 14 threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for three values of gate oxide layer thickness. gate bias during irradiation was $V_{irr}$ = 5 v.

the $\Delta V_T = f(D)$ dependence for radfets with gate oxide layer thicknesses of 400 nm and 1 µm is shown in fig. 15 [78]. these devices were also irradiated with gamma-rays at a gate bias of $V_{irr}$ = 5 v, but the dose range was from 0 to 5 gy. it was also shown that the sensitivity increases with gate oxide thickness and that there is a linear dependence between $\Delta V_T$ and $D$ (correlation coefficient $r^2$ = 0.999).

fig.
15 threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for two values of gate oxide layer thickness. gate bias during irradiation was $V_{irr}$ = 5 v.

4.1.3. photon energy influence on radfets sensitivity

the threshold voltage shift $\Delta V_T$ for radfets with 1 µm gate oxide layer thickness, irradiated with gamma-rays originating from $^{60}\mathrm{Co}$ and with x-rays with an energy of 140 kev, for radiation dose in the range from 0 to 5 gy, is presented in figs. 16 and 17 for $V_{irr}$ = 0 v and $V_{irr}$ = 5 v, respectively [95]. it can be seen that the increase in $\Delta V_T$ is much higher when radfets are irradiated with x-rays than with gamma-rays. this is a consequence of the different photon energies that lead to ionization of the gate oxide molecules. namely, x-ray photons with an energy of 140 kev ionize molecules by both the photoeffect and the compton effect, while gamma-rays originating from $^{60}\mathrm{Co}$, with energies of 1.17 and 1.33 mev, ionize molecules only by the compton effect [31]. since the probability of molecule ionization by the photoeffect is significantly higher than by the compton effect, a larger number of ft and st are formed during x-ray irradiation than during gamma-ray irradiation, which directly causes the difference in $\Delta V_T$ values.

fig. 16 threshold voltage shift $\Delta V_T$ as a function of gamma and x-ray radiation dose $D$, for radfets with a 1 µm thick gate oxide layer irradiated without gate bias.

fig. 17 threshold voltage shift $\Delta V_T$ as a function of gamma and x-ray radiation dose $D$, for radfets with a 1 µm thick gate oxide layer irradiated with $V_{irr}$ = 5 v.

fitting of the experimental data for gamma radiation dose in the range from 0 to 5 gy (figs. 16 and 17) using expression (10) with $n = 1$ gives a correlation coefficient of $r^2$ = 0.998 for both $V_{irr}$ = 0 v and $V_{irr}$ = 5 v.
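fits of this kind can be reproduced with a few lines of code. the sketch below is an illustration under stated assumptions, not the fitting procedure used in the cited works: expression (10) is fitted by linear regression in log-log space (which requires strictly positive dose and shift values, so the zero-dose point must be excluded), and the $n = 1$ case is fitted as a proportional model through the origin. the function names are hypothetical.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit y = m*x + c; returns (m, c)."""
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def fit_power_law(dose, dvt):
    """Fit dVT = A * D**n (expression (10)) in log-log space; returns (A, n)."""
    n, ln_a = linear_fit([math.log(d) for d in dose],
                         [math.log(v) for v in dvt])
    return math.exp(ln_a), n


def fit_proportional(dose, dvt):
    """Fit dVT = S * D (n = 1); returns (S, r_squared)."""
    s = sum(d * v for d, v in zip(dose, dvt)) / sum(d * d for d in dose)
    mean_v = sum(dvt) / len(dvt)
    ss_res = sum((v - s * d) ** 2 for d, v in zip(dose, dvt))
    ss_tot = sum((v - mean_v) ** 2 for v in dvt)
    return s, 1.0 - ss_res / ss_tot
```

a fitted exponent $n$ noticeably different from one, or an $r^2$ of the proportional fit well below one, both indicate that a single sensitivity value does not describe the whole dose range.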
since the correlation coefficients for gamma irradiation are very close to one, it can be assumed that there is a linear dependence between $\Delta V_T$ and $D$, so that the sensitivity $\Delta V_T / D$ is the same over the whole interval. the correlation coefficients for the case when radfets are irradiated with x-rays (figs. 16 and 17) are 0.96 and 0.95 for $V_{irr}$ = 0 v and $V_{irr}$ = 5 v, respectively, which shows that there is no linear dependence between $\Delta V_T$ and $D$.

fig. 18 threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$, for radfets with a 1 µm thick gate oxide layer, for two values of x-ray energy. gate bias during irradiation is $V_{irr}$ = 5 v.

fig. 18 shows the $\Delta V_T = f(D)$ dependence of radfets with 1 µm gate oxide layer thickness during x-ray irradiation with photon energies of 90 and 140 kev for gate bias $V_{irr}$ = 5 v [96]. it can be seen that the lower photon energy leads to a greater change in $\Delta V_T$ for the same radiation dose. similar behavior is observed in tn-502 rdi mosfets (thomson and neilson electronic ltd, ottawa, canada) [97].

fig. 19 threshold voltage shift $\Delta V_T$ as a function of x-ray radiation dose $D$ for radfets with a 400 nm thick gate oxide layer. irradiation was performed without and with gate bias of $V_{irr}$ = 5 v.

fig. 20 threshold voltage shift $\Delta V_T$ as a function of x-ray radiation dose $D$ for radfets with a 1 µm thick gate oxide layer. irradiation was performed without and with gate bias of $V_{irr}$ = 5 v.

figs. 19 and 20 show the threshold voltage shift for x-ray radiation dose in the range from 0 to 100 cgy for radfets with gate oxide layer thicknesses of 400 nm and 1 µm, respectively. fig. 21 shows the same dependence for x-ray radiation dose in the range from 1 to 10 cgy for radfets with gate oxide layer thickness of 1 µm [98]. these dependences are given for gate bias during irradiation $V_{irr}$ = 0 v and $V_{irr}$ = 5 v.
as can be seen, the $\Delta V_T$ values are higher when the gate bias during irradiation was $V_{irr}$ = 5 v than when it was $V_{irr}$ = 0 v. furthermore, $\Delta V_T$ is higher for radfets with the larger gate oxide layer thickness (figs. 19 and 20). the results presented in fig. 21 show that $\Delta V_T$ values can be detected with good reliability even for a radiation dose of 1 cgy.

fig. 21 threshold voltage shift $\Delta V_T$ as a function of x-ray radiation dose $D$ for radfets with a 1 µm thick gate oxide layer. irradiation was performed without and with gate bias of $V_{irr}$ = 5 v.

4.2. irradiated radfets fading

as a dosimeter, a radfet must satisfy two fundamental dosimetric demands: a good compromise between sensitivity to irradiation and stability with time after irradiation. stability means an insignificant change in $\Delta V_T$ of an irradiated radfet at room temperature over a long period of time, i.e., the dosimetric information should be preserved for a long period. there are two important reasons for this: first, the dose cannot always be read out immediately after irradiation, but only after a certain period of time; second, in individual monitoring the exact moment of irradiation is often unknown, and the radiation dose measurements are performed periodically. the room temperature stability of an irradiated radfet can be followed through the fading $F$, which is calculated as [95]:

$$F = \frac{V_T(0) - V_T(t)}{V_T(0) - V_{T0}} \cdot 100\,[\%] = \frac{V_T(0) - V_T(t)}{\Delta V_T(0)} \cdot 100\,[\%], \tag{11}$$

where $V_{T0}$ is the pre-irradiation threshold voltage, $V_T(0)$ is the threshold voltage immediately after irradiation, $V_T(t)$ is the threshold voltage after annealing time $t$ and $\Delta V_T(0)$ is the threshold voltage shift immediately after irradiation.

fig. 22 fading $F$ at room temperature for 2000 h of radfets with a 400 nm thick gate oxide layer previously irradiated with gamma-ray radiation dose 5 gy.

fig.
23. Fading F at room temperature for 2000 h of RADFETs with a 400 nm thick gate oxide layer previously irradiated with a gamma-ray dose of 50 Gy.

The fading of RADFETs with a 400 nm thick gate oxide layer previously irradiated with 5 Gy of gamma rays is shown in Fig. 22. It can be seen that the fading during the first 24 h of annealing at room temperature is about 3.5%, while for annealing times from 24 h to 800 h it increases by about 6%. During further annealing the change in fading is insignificant. The fading of the same type of RADFETs previously irradiated with 50 Gy of gamma rays is shown in Fig. 23. It can be seen that during the first 24 h of annealing at room temperature the fading is about 6%, and its value increases slightly up to 200 h of annealing. For annealing times longer than 200 h there is a further slight increase, so that the fading after 2000 h is about 2% higher than after 200 h.

Fig. 24. Fading F at room temperature for 2000 h of RADFETs with a 400 nm thick gate oxide layer previously irradiated with an X-ray dose of 100 cGy.

Fig. 25. Fading F at room temperature for 28 d of RADFETs with a 1 µm thick gate oxide layer previously irradiated with an X-ray dose of 100 cGy.

Fading results for RADFETs with gate oxide thicknesses of 400 nm and 1 µm at room temperature, previously irradiated with X-rays up to 100 cGy, are presented in Figs. 24 and 25, respectively [98]. Fig. 26 also presents the fading of RADFETs with a 1 µm thick gate oxide layer previously irradiated with X-rays up to 10 cGy [98]. The fading of RADFETs with a 400 nm thick gate oxide layer irradiated with a gate bias of 5 V is 40% in the first 7 d, whereas RADFETs irradiated without gate bias show 22% fading in the same period (Fig. 24).
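The fading values quoted here follow from Eq. (11). As a minimal Python sketch of that calculation, the helper below takes the three threshold voltages that appear in the equation. The function name `fading_percent` and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def fading_percent(vt_pre, vt_post, vt_annealed):
    """Fading F in percent according to Eq. (11).

    vt_pre      -- threshold voltage before irradiation, VT0
    vt_post     -- threshold voltage immediately after irradiation, VT(0)
    vt_annealed -- threshold voltage after annealing time t, VT(t)
    """
    shift = vt_post - vt_pre  # threshold voltage shift, delta VT(0)
    if shift == 0:
        raise ValueError("no radiation-induced shift: fading is undefined")
    return 100.0 * (vt_post - vt_annealed) / shift


# Hypothetical threshold voltages in volts (illustrative values only).
f = fading_percent(vt_pre=2.000, vt_post=2.500, vt_annealed=2.470)
print(f"F = {f:.1f} %")
```

Because F is a ratio of two voltage differences, the same function works whether threshold voltages are entered with their sign or as absolute values, provided the convention is used consistently for all three arguments.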
For the time period between 7 and 28 d, the fading of RADFETs irradiated with a gate bias of 5 V increased by about 3%, whereas the fading of RADFETs irradiated without gate bias remained nearly constant. The fading of RADFETs with a 1 µm thick gate oxide layer irradiated up to 100 cGy with a gate bias of 5 V was 14% in the first 7 d (Fig. 25), and it increased by about 1% between 7 and 28 d. For RADFETs with the same gate oxide layer thickness irradiated without gate bias, the fading increases by about 1% in the first 7 d, and this value is maintained up to 28 d. Figs. 24 and 25 show that fading is lower when the gate oxide layer of the RADFETs is thicker, which is in accordance with earlier studies [10], [89] showing that fading decreases with increasing gate oxide thickness.

Fig. 26. Fading F at room temperature for 28 d of RADFETs with a 1 µm thick gate oxide layer previously irradiated with an X-ray dose of 10 cGy.

In RADFETs with a 1 µm thick gate oxide layer irradiated up to 10 cGy, the highest fading occurs in the first 3 d: it is 15% for RADFETs irradiated without gate bias and 13% for RADFETs irradiated with a 5 V gate bias (Fig. 26). Moreover, in both cases the fading from 3 to 28 d is smaller than 2%.

The fading of RADFETs is mainly a consequence of a decrease in positive oxide trapped charge. This decrease is caused by electrons tunneling from Si into SiO2; these electrons are captured at positive oxide trapped charges, which leads to their neutralization/compensation and thus to instability of the threshold voltage shift [99].

4.3. The possibility of RADFETs re-use

Many investigations have shown that RADFETs cannot be used for subsequent determination of ionizing radiation dose. Namely, these dosimeters are used only up to a maximum dose, which is determined by the type and sensitivity of the RADFET. When the maximum radiation dose is reached, the RADFETs should be replaced.
The first results dealing with the possibility of re-using these devices are given in Ref. [10] for a radiation dose of 400 Gy. Later investigations of the same components are presented in [23], [100]. Irradiation was performed with gamma rays up to 35 Gy, without gate bias and with gate biases Virr = 2.5 V and Virr = 5 V. Fig. 27 shows the threshold voltage shift ΔVT as a function of radiation dose D for both the first and second irradiation with a gate bias of Virr = 5 V. After the first irradiation, the RADFETs were annealed at room temperature for 5232 h without gate bias. The annealing was then continued at 120 °C without gate bias for 432 h. The RADFETs were then irradiated again under the same conditions. The values of ΔVT during the first and second irradiation are very close. These results are in opposition to earlier results [10], where it was shown that the ΔVT values during the first irradiation are higher than those obtained during the second irradiation.

Fig. 27. Threshold voltage shift ΔVT as a function of radiation dose D of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with a gate bias of Virr = 5 V.

The first and second irradiation of RADFETs lead to approximately the same increase in ΔNft (Fig. 28), while the increase in ΔNst(MG) is higher during the second irradiation (Fig. 29). ΔNst(CP) is also higher during the second irradiation (Fig. 30). On the basis of the results presented in Figs. 28, 29 and 30 it can be seen that the major contribution to the ΔVT increase during the first and second irradiation originates from FT, whose density is an order of magnitude higher than the ST(MG) density for a radiation dose of 35 Gy.

Fig. 28. Areal density of fixed traps ΔNft as a function of radiation dose D of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with a gate bias of Virr = 5 V.

Fig.
29. Areal density of switching traps ΔNst(MG) as a function of radiation dose D of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with a gate bias of Virr = 5 V.

Fig. 30. Areal density of switching traps ΔNst(CP) as a function of radiation dose D of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with a gate bias of Virr = 5 V.

5. Low-cost commercial p-channel MOSFET as a radiation sensor

In recent years, many investigations have been directed toward the application of low-cost commercial p-channel MOSFETs as radiation sensors in radiotherapy [101]. Paper [102] presents results for some of the most important dosimetric parameters (sensitivity, linearity, reproducibility and angular dependence) of power p-channel MOSFETs 3N163. These transistors were irradiated with gamma rays from 60Co up to 55 Gy, without gate bias. Fig. 31 shows the threshold voltage shift ΔVT versus radiation dose D for 15 devices. As expected, ΔVT increases with increasing radiation dose. The data show excellent linearity with a mean sensitivity of 29.2 mV/Gy and reasonably good reproducibility up to a total dose of 58 Gy (which is close to the total dose used in typical radiotherapy treatments). Moreover, the angular and dose-rate dependencies are similar to those of other, more specialised p-channel MOSFETs (RADFETs). The authors of that paper concluded that the power p-channel MOSFET 3N163 would be an excellent candidate as the sensor of a low-cost system capable of measuring gamma radiation dose. This radiation sensor could be placed on the patient without the need for wires, and the threshold voltage shift, which is indicative of the radiation dose, could be measured after the completion of each irradiation session with a reasonable degree of confidence.

Fig.
31. Threshold voltage shift ΔVT as a function of radiation dose D for fifteen 3N163 MOSFETs irradiated with gamma rays without gate bias.

Martinez-Garcia et al. [103] investigated the possibility of using vertical diffusion MOS transistors, also called double-diffused MOS transistors or simply DMOS, as sensors of ionizing radiation. The components were the DMOS transistors BS250F, ZVP3306 and ZVP4525, manufactured by Diodes Incorporated (Plano, USA). The irradiation was performed with a 6 MeV electron beam without gate bias. The same authors investigated the behavior of p-channel MOS transistors from the integrated circuit CD4007 (Texas Instruments, Dallas, USA, and NXP Semiconductors, Eindhoven, Netherlands) under a 6 MeV electron beam. In Fig. 32, ΔVT versus D is plotted for four samples of the ZVP3306 DMOS transistor. The results for the other DMOS models are similar. As can be seen, there is a linear dependence between ΔVT and D up to a radiation dose of 25 Gy. The sensitivities of the BS250F, ZVP4525 and ZVP3306 are 3.1, 3.4 and 3.7 mV/Gy, respectively.

Fig. 32. Threshold voltage shift ΔVT as a function of radiation dose D for four DMOS ZVP3306 transistors irradiated with 6 MeV electrons without gate bias.

It is shown in [103] that the p-channel MOS transistors from the integrated circuit CD4007, unbiased during irradiation, have a sensitivity of 4.6 mV/Gy with a very good linear behaviour of the threshold voltage shift versus radiation dose. Since thermal compensation may be applied, this transistor may be considered a promising candidate for use as a dosimeter in intra-operative radiology.

Fig. 33. Threshold voltage shift ΔVT and sensitivity S as a function of radiation dose D for p-channel MOS transistors from the integrated circuit CD4007 irradiated with 6 MeV electrons with gate bias Virr = 0.6 V.

Fig. 33 shows the ΔVT = f(D) dependence for the CD4007 manufactured by Texas Instruments irradiated with a 6 MeV electron beam. The gate bias during irradiation is 0.6 V.
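Within the linear range, a measured threshold voltage shift can be converted back into dose by dividing by the sensitivity, D = ΔVT/S. The sketch below illustrates this in Python using the DMOS sensitivities and the 25 Gy linearity limit quoted above from [103]. The function name `estimate_dose_gy`, the dictionary layout and the example reading are assumptions made for illustration, not part of the cited work.

```python
# Sensitivities in mV/Gy quoted in the text for DMOS transistors
# irradiated with 6 MeV electrons without gate bias [103].
SENSITIVITY_MV_PER_GY = {
    "BS250F": 3.1,
    "ZVP4525": 3.4,
    "ZVP3306": 3.7,
}

# Dose up to which the text reports a linear response for these devices.
MAX_LINEAR_DOSE_GY = 25.0


def estimate_dose_gy(device, delta_vt_mv):
    """Estimate absorbed dose from a threshold voltage shift.

    Assumes the linear model delta_vt = S * D and rejects readings that
    would fall outside the reported linear range.
    """
    dose = delta_vt_mv / SENSITIVITY_MV_PER_GY[device]
    if dose > MAX_LINEAR_DOSE_GY:
        raise ValueError("reading lies outside the reported linear range")
    return dose


# Hypothetical reading: a 37 mV shift measured on a ZVP3306.
dose = estimate_dose_gy("ZVP3306", 37.0)
print(f"estimated dose: {dose:.1f} Gy")
```

In practice each device type, bias condition and radiation quality needs its own calibration, so the table would be replaced by measured calibration data.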
The data show a linear behaviour, indicating that the p-channel transistors from this integrated circuit are suitable for electron beam dosimetry; the sensitivity is 7.4 mV/Gy. The sensitivity of the CD4007 manufactured by NXP Semiconductors under the same conditions is 8.9 mV/Gy.

Fig. 34. Threshold voltage shift ΔVT as a function of radiation dose D for RADFETs and VDMOSFETs IRF9520 irradiated without gate bias.

Fig. 35. Threshold voltage shift ΔVT as a function of radiation dose D for RADFETs and VDMOSFETs IRF9520 irradiated with a gate bias of Virr = 10 V.

A comparative study of the sensitivity to gamma-ray irradiation, in the dose range from 0 to 500 Gy, of RADFETs with a 100 nm thick gate oxide layer manufactured by Tyndall National Institute, Cork, Ireland, and commercial p-channel power VDMOSFETs IRF9520 manufactured by International Rectifier is given in paper [104]. Figs. 34 and 35 show the dependence between ΔVT and D for RADFETs and IRF9520 irradiated without gate bias (Virr = 0 V) and with a gate bias of Virr = 10 V, respectively. It can be seen that ΔVT is higher for the IRF9520 than for the RADFET at the same radiation dose. The difference in ΔVT is probably a consequence of different technological procedures during device fabrication. It is shown that the linear dependence between ΔVT and D is valid only for devices with Virr = 10 V during irradiation (the correlation coefficient obtained by fitting the experimental data with expression (10) is r² = 0.998). Figs. 36 and 37 present the change in the areal density of FT, ΔNft, for the RADFET and IRF9520 without gate bias and with gate bias Virr = 10 V during irradiation, respectively [104]. It can be seen that ΔNft is larger in the IRF9520 than in the RADFET, and that gate bias leads to an increase in ΔVT at the same radiation dose for both types of transistors.

Fig.
36. The change in areal density of FT, ΔNft, as a function of radiation dose D for RADFETs and VDMOSFETs IRF9520 irradiated without gate bias.

Fig. 37. The change in areal density of FT, ΔNft, as a function of radiation dose D for RADFETs and VDMOSFETs IRF9520 irradiated with a gate bias of Virr = 10 V.

Fig. 38. The change in areal density of ST, ΔNst, determined using the MG technique, as a function of radiation dose D for RADFETs and VDMOSFETs IRF9520 irradiated without gate bias.

Fig. 39. The change in areal density of ST, ΔNst, determined using the MG technique, as a function of radiation dose D for RADFETs and VDMOSFETs IRF9520 irradiated with a gate bias of Virr = 10 V.

The changes in the areal density of ST, ΔNst, determined by the MG technique for the RADFET and IRF9520 without gate bias and with a gate bias of Virr = 10 V are presented in Figs. 38 and 39, respectively [104]. It can be seen that ΔNst is smaller in the IRF9520 than in the RADFET. However, ΔNft is considerably larger than ΔNst in both types of devices. On the basis of these data it can be concluded that ΔNft contributes predominantly to the ΔVT increase during irradiation. The fading of IRF9520 and RADFET devices irradiated up to 500 Gy was calculated 24 h after irradiation using Eq. (11). During this time the devices were kept at room temperature without gate bias. It was shown that the fading is higher in the IRF9520 than in RADFETs, and that it is smaller for devices previously irradiated with gate bias Virr = 10 V.

6. Conclusion

Intensive investigations of radiation-sensitive MOSFETs (RADFETs) have been performed in order to assess their application in dosimetry. Their relatively small volume gives them an advantage over some other dosimetric systems, which is particularly important in in-vivo dosimetry as well as in the control of gradient X-ray radiation fields. RADFETs are most commonly used for the detection of photons and charged particles of ionizing radiation.
They can also be used for neutron detection, but their sensitivity to neutrons is much smaller than to photons and charged particles. Their sensitivity can be increased by applying gate bias during irradiation and by increasing the gate oxide layer thickness. The sensitivity increases as the photon energy of the ionizing radiation decreases. These components are required to show minimal variation in threshold voltage shift after irradiation at room temperature, i.e., it is necessary to preserve the dosimetric information for a long period of time. The considered RADFETs are sensitive sensors of gamma and X-rays, since they can register doses below 1 cGy. Unfortunately, their major disadvantage is large fading immediately after irradiation. Investigations in the past few years have shown that some commercially available p-channel MOSFETs can be applied very efficiently as gamma and X-ray sensors, as well as sensors of electrons with energies of several MeV. These are the low-power p-channel MOSFETs 3N163, the DMOS BS250F, ZVP3306 and ZVP4525, and the power VDMOSFETs IRF9520. Furthermore, p-channel MOS transistors from integrated circuits, for example the CD4007, can be used as sensors of ionizing radiation.

Acknowledgement: The paper is a part of the research done within the project supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia under project No. 32026.

References

[1] W. Poch and A.G. Holmes-Siedle, "The mosimeter - a new instrument for measuring radiation dose", RCA Eng., vol. 16, pp. 56-59, 1970.
[2] A. G. Holmes-Siedle, "The space charge dosimeter - general principles of a new method of radiation dosimetry", Nucl. Instrum. Methods, vol. 121, pp. 169-179, 1974.
[3] L. Adams and A. Holmes-Siedle, "The development of MOS dosimetry unit for use in space", IEEE Trans. Nucl. Sci., vol. 18, pp. 1607-1612, 1978.
[4] R. R. Price, C. Benson and K.
Rodgers, "Development of RADFET linear array for intracavitary in vivo dosimetry in external radiotherapy and brachytherapy", IEEE Trans. Nucl. Sci., vol. 51, pp. 1420-1426, 2004.
[5] R. Ramasechum, K.S. Kulli, T.J. Zhang, B. Norling, A. Hallil and M. Islam, "Performance characteristics of a micro MOSFET as an in vivo dosimeter in radiation therapy", Phys. Med. Biol., vol. 49, pp. 4031-4048, 2004.
[6] G. Tarr, K. Shortt, Y. Wang and I. Thomson, "A sensitive temperature-compensated, zero-bias floating gate MOSFET dosimeter", IEEE Trans. Nucl. Sci., vol. 51, pp. 1277-1282, 2004.
[7] M. C. Lavallee, L. Gingras and B. Luc, "Energy and integrated dose dependence of MOSFET dosimeter sensitivity for irradiation energies between 30 kV and 60Co", Med. Phys., vol. 33, pp. 3683-3689, 2006.
[8] R. Kohuo, T. Nishio, T. Miyagishi, E. Hirano, K. Hotta, M. Kowashima and T. Ogino, "Experimental evaluation of a MOSFET dosimeter for proton dose measurements", Phys. Med. Biol., vol. 51, pp. 6077-6086, 2006.
[9] A. Holmes-Siedle and L. Adams, "RADFETs: a review of the use of metal-oxide silicon devices as integrating dosimeters", Rad. Phys. Chem., vol. 28, pp. 224-235, 1986.
[10] A. Kelleher, N. McDonnell, B. O'Neill, W. Lane and L. Adams, "Investigation into the re-use of pMOS dosimeter", IEEE Trans. Nucl. Sci., vol. 41, pp. 445-451, 1994.
[11] L. Z. Scheick, P.J. McNulty and D.R. Roth, "Dosimetry based on the erasure of floating gates in natural radiation environments", IEEE Trans. Nucl. Sci., vol. 45, pp. 2681-2688, 1998.
[12] K. Kay, E. Mullen, W. Stapor, R. Circle and P. McDonald, "CRRES dosimetry results and comparison using the space radiation dosimeter and p-channel MOS dosimeter", IEEE Trans. Nucl. Sci., vol. 39, pp. 1846-1850, 1992.
[13] A. Faigon, J. Lipovetzky, E. Redin and G. Kruscenski, "Extension of measurement range of MOS dosimeters using radiation induced charge neutralization", IEEE Trans. Nucl. Sci., vol. 55, pp. 2141-2147, 2008.
[14] B. O'Connell, C. Connely, C.
McCarthy, J. Doyle, W. Lane and L. Adams, "Electrical performance and irradiation sensitivity of stacked pMOS dosimeters under bulk bias control", IEEE Trans. Nucl. Sci., vol. 45, pp. 2689-2694, 1988.
[15] G. Sarrabayrouse, Buchdahl, V. Poliscuk and S. Siskos, "Stacked-MOS ionizing radiation dosimeters: potentials and limitations", Radiat. Phys. Chem., vol. 71, pp. 737-739, 2004.
[16] R.C. Hughes, D. Huffman, J.V. Snelling, T.E. Zipperian, A.J. Ricoo and C.A. Kelsey, "Miniature radiation dosimeter for in vivo radiation measurements", Int. Rad. Oncol. Biol. Phys., vol. 14, pp. 963-967, 1988.
[17] D.J. Gladstone, X.Q. Lu, J.L. Humm, H.F. Bowman and L.M. Chin, "A miniature MOSFET radiation probe", Med. Phys., vol. 21, pp. 1721-1728, 1994.
[18] G.I. Kaplan, A.B. Rosenfeld, B.J. Allen, J.T. Booth, M.G. Carolan and A. Holmes-Siedle, "A spatial resolution by MOSFET dosimetry of an x-ray microbeam", Med. Phys., vol. 27, pp. 239-244, 2000.
[19] G. Sarabayrouse and V. Polischuk, "MOS ionizing radiation dosimeters: from low to high dose measurement", Radiat. Phys. Chem., vol. 61, pp. 511-513, 2001.
[20] A. Jaksić, G. Ristić, M. Pejović, A. Mohammadzadeh, C. Sudre and W. Lane, "Gamma-ray irradiation and post-irradiation response of high dose range RADFETs", IEEE Trans. Nucl. Sci., vol. 49, pp. 1356-1363, 2002.
[21] R. A. Price, "Towards an optimum design of a p-MOS radiation detector for use in high-energy medical photon beams and neutron facilities: analysis of activation materials", Radiation Protection Dosimetry, vol. 115, pp. 386-390, 2005.
[22] M. M. Pejović, M.M. Pejović, A.B. Jakšić, K.Dj. Stanković and A.A. Marković, "Successive gamma-ray irradiation and corresponding post-irradiation annealing of pMOS dosimeters", Nucl. Technol. and Radiat. Protection, vol. 27, pp. 341-345, 2012.
[23] M. M. Pejović, M.M. Pejović and A.B.
Jakšić, "Contribution of fixed oxide traps to sensitivity of pMOS dosimeters during gamma ray irradiation and annealing at room and elevated temperature", Sensors and Actuators A, vol. 174, pp. 85-90, 2012.
[24] S. Alshaikh, M. Carolan, M. Petasecca, M. Lerch and A.B. Metealfe, "Direct and pulsed current annealing of p-MOSFET based dosimeter, the MOSkin", Australas. Phys. Eng. Sci. Med., vol. 37, pp. 311-319, 2014.
[25] G-Wen Luo, Qi. Z.-Y. Deng, A. Rosenfeld and wx?, "Investigation of a pulsed current annealing method in reusing MOSFET dosimeters for in vivo IMRT dosimetry", Med. Phys., vol. 41, 0511710, 2014.
[26] G. Ristić, S. Golubović and M. Pejović, "pMOS dosimeter with two-layer gate oxide operated at zero and negative bias", Electr. Lett., vol. 30, pp. 295-296, 1994.
[27] G. Ristić, A. Jakšić and M. Pejović, "pMOS dosimetric transistors with two-layer gate oxide", Sensors and Actuators A, vol. 63, pp. 129-134, 1997.
[28] G. Sarrabayrouse and F. Gessinn, "Thick oxide MOS transistors for ionizing radiation dose measurement", Radioprotection, vol. 29, pp. 557-572, 1994.
[29] A. Haran, A. Jakšić, N. Rafaeli, A. Elyahu, D. David and J. Barak, IEEE Trans. Nucl. Sci., vol. 51, pp. 2917-2921, 2004.
[30] T. P. Ma and P.V. Dressendorfer, Ionizing Radiation Effects in MOS Devices and Circuits, New York: Wiley and Sons, 1989.
[31] G. S. Ristić, "Influence of ionizing radiation and hot carrier injection on metal-oxide-semiconductor transistors", J. Phys. D: Appl. Phys., vol. 41, 023001 (19 pp), 2008.
[32] M. Pejović, P. Osmokrović, M. Pejović and K. Stanković, "Influence of ionizing radiation and hot carrier injection on metal-oxide-semiconductor transistors", in M. Nenoi (Ed.), Current Topic in Radiation Research, InTech, Institute for New Technologies, Maastricht (NL), chapter 33, OCLC: 846871029, 2012.
[33] C. T. Sah, "Origin of interface states and oxide charges generated by ionizing radiation", IEEE Trans.
Nucl. Sci., vol. 23, pp. 1563-1567, 1976.
[34] D. L. Griscom, "Optical properties and structure of defects in silica glass", J. Ceram. Soc. Japan, vol. 99, pp. 923-941, 1991.
[35] R. Helms and E.H. Poindexter, "The silicon-silicon-dioxide system: its microstructure and imperfections", Rep. Prog. Phys., vol. 57, pp. 791-852, 1994.
[36] R. A. Weeks, "Paramagnetic resonance of lattice defects in irradiated quartz", J. Appl. Phys., vol. 27, pp. 1376-1381, 1959.
[37] H. E. Boesch, Jr., F.B. McLean, J.M. McGarrity and G.A. Ausman, Jr., "Hole transport and charge relaxation in irradiated SiO2 MOS capacitors", IEEE Trans. Nucl. Sci., vol. 22, pp. 2163-2167, 1975.
[38] W. L. Warren and P.M. Lenahan, "A comparison of positive charge generation in high field stressing and ionizing radiation on MOS structure", IEEE Trans. Nucl. Sci., vol. 34, pp. 1355-1358, 1987.
[39] L. P. Trombetta, F.J. Feigl and R.J. Zeto, "Positive charge generation in metal-oxide-semiconductor capacitors", J. Appl. Phys., vol. 69, pp. 2512-2521, 1991.
[40] R. K. Freitag, D.B. Brown and C.M. Dozier, "Experimental evidence of two species of radiation induced trapped positive charge", IEEE Trans. Nucl. Sci., vol. 40, pp. 1316-1322, 1993.
[41] R. K. Freitag, D.B. Brown and C.M. Dozier, "Evidence for two types of radiation-induced trapped positive charge", IEEE Trans. Nucl. Sci., vol. 41, pp. 1828-1834, 1994.
[42] J. F. Conley, P.M. Lenahan, A.J. Lelis and T.R. Oldham, "Electron spin resonance evidence for the structure of a switching oxide trap: long term structural change at silicon dangling bond sites in SiO2", Appl. Phys. Lett., vol. 67, pp. 2179-2181, 1995.
[43] J. F. Conley, P.M. Lenahan, A.J. Lelis and T.R. Oldham, "Electron spin resonance evidence that E'γ center can behave as switching oxide trap", IEEE Trans. Nucl. Sci., vol. 42, pp. 1744-1749, 1995.
[44] D. A. Buchanan, A.D. Marwick, D.J. DiMaria and L.
Dori, "Hot-electron-induced hydrogen redistribution and defect generation in metal-oxide-semiconductors", J. Appl. Phys., vol. 76, pp. 3595-3605, 1994.
[45] D. J. DiMaria, D.A. Buchanan, J.H. Stathis and R.E. Stahlbush, "Interface states induced by the presence of trapped holes near the silicon-silicon-dioxide interface", J. Appl. Phys., vol. 77, pp. 2032-2040, 1995.
[46] S. K. Lai, "Two carrier nature of interface-state generation in hole trapping and radiation damage", Appl. Phys. Lett., vol. 39, pp. 58-60, 1981.
[47] S. K. Lai, "Interface trap generation in silicon dioxide when electrons are captured by trapped holes", J. Appl. Phys., vol. 54, pp. 2540-2546, 1983.
[48] S. T. Chang, J.K. Wu and S.A. Lyon, "Amphoteric defects at Si-SiO2", Appl. Phys. Lett., vol. 52, pp. 622-624, 1986.
[49] S. J. Wang, J.M. Sung and S.A. Lyon, "Relationship between hole trapping and interface state generation in metal-oxide-silicon structures", Appl. Phys. Lett., vol. 52, pp. 1431-1433, 1986.
[50] F. B. McLean, "A framework for understanding radiation-induced interface states in SiO2 MOS structures", IEEE Trans. Nucl. Sci., vol. 27, pp. 1651-1657, 1980.
[51] N. S. Saks, C.M. Dozier and D.B. Brown, "Time dependence of interface trap formation in MOSFETs following pulsed irradiation", IEEE Trans. Nucl. Sci., vol. 35, no. 6, pp. 1168-1177, 1988.
[52] N. S. Saks and D.B. Brown, "Interface trap formation via the two-stage H+ process", IEEE Trans. Nucl. Sci., vol. 36, no. 6, pp. 1848-1857, 1989.
[53] D. L. Griscom, D.B. Brown and N.S. Saks, "Nature of radiation-induced point defects in amorphous SiO2 and their role in SiO2-on-Si structure", in The Physics and Chemistry of SiO2 and Si-SiO2 Interface, ed. C.R. Helms and B.E. Deal, New York: Plenum, 1988.
[54] K. L. Brower and S.M. Myers, "Chemical kinetics of hydrogen and (111) Si-SiO2 interface defect", Appl. Phys. Lett., vol. 57, pp. 162-164, 1990.
[55] J. H. Stathis and E.
Cartier, "Atomic hydrogen reactions with Pb centers at the (100) Si-SiO2 interface", Phys. Rev. Lett., vol. 72, pp. 2745-2748, 1994.
[56] E. H. Poindexter, "Chemical reactions of hydrogenous species in the Si-SiO2 system", J. Non-Cryst. Solids, vol. 187, pp. 257-263, 1995.
[57] R. E. Stahlbush, A.H. Edwards, D.L. Griscom and B.J. Mrstik, "Post-irradiation cracking of H2 and formation of interface states in irradiated metal-oxide-semiconductor field-effect transistors", J. Appl. Phys., vol. 73, pp. 658-667, 1993.
[58] M. M. Pejović, "Physico-chemical processes in vertical-double-diffusion metal-oxide-semiconductor field effect transistors induced by gamma-ray irradiation and post-irradiation annealing", Facta Universitatis, Series: Physics, Chemistry and Technology, vol. 13, pp. 13-27, 2015.
[59] McWhorter and P.S. Winokur, "Simple technique for separating the effects of interface traps and trapped-oxide charge in metal-oxide-semiconductor transistors", Appl. Phys. Lett., vol. 48, pp. 133-135, 1986.
[60] M. V. Fischetti, R. Gastaldi, F. Maggoni and A. Madelli, "Slow and fast states induced by hot electrons at Si-SiO2 interface", J. Appl. Phys., vol. 53, pp. 3136-3144, 1982.
[61] L. P. Trombetta, F.J. Feigl and R.J. Zeto, "Positive charge generation in metal-oxide-semiconductor capacitors", J. Appl. Phys., vol. 69, pp. 2512-2521, 1991.
[62] R. K. Freitag, D.B. Brown and C.M. Dozier, "Experimental evidence of two species of radiation induced trapped positive charge", IEEE Trans. Nucl. Sci., vol. 40, pp. 1316-1322, 1993.
[63] A.J. Lelis and T.R. Oldham, "Time dependence of switching oxide traps", IEEE Trans. Nucl. Sci., vol. 41, pp. 1835-1843, 1994.
[64] D. M. Fleetwood, "Border traps in MOS devices", IEEE Trans. Nucl. Sci., vol. 39, pp. 269-271, 1992.
[65] V. Davidovic, Ph.D., University of Niš, 2010.
[66] S. M. Sze, Physics of Semiconductor Devices, New York: Wiley, 1981.
[67] A. Holmes-Siedle and L.
Adams, Handbook of Radiation Effects, 2nd ed., New York: Oxford University Press, 2002.
[68] M.A.B. Elliot, "The use of charge pumping currents to measure surface state densities in MOS transistors", Solid-State Electron., vol. 19, pp. 241-247, 1986.
[69] J.S. Brugler and P.G. Jespers, "Charge pumping in MOS devices", IEEE Trans. Electron Dev. Lett., vol. 13, pp. 627-629, 1969.
[70] G. Groeseneken, H.E. Maes, N. Beltran and R.F. De Keersmaecker, "A reliable approach to charge-pumping measurements in MOS transistors", IEEE Trans. Electron Dev., vol. 31, pp. 42-53, 1984.
[71] R. E. Paulsen, R.R. Siergiej, M.L. French and M.H. White, "Observation of near-interface oxide traps with the charge pumping technique", IEEE Electron Dev. Lett., vol. 13, pp. 627-629, 1992.
[72] D. Habaš, Z. Prijić, D. Pantić and N. Stojadinović, "Charge-pumping characterization of SiO2/Si interface in virgin and irradiated power VDMOSFETs", IEEE Trans. Electron Dev., vol. 43, pp. 2197-2208, 1996.
[73] S. C. Witczak, K.F. Galloway, R.D. Schrimpf, J.R. Brews and G. Prevost, "The determination of Si-SiO2 interface trap density in irradiated four-terminal VDMOSFETs using charge pumping", IEEE Trans. Nucl. Sci., vol. 43, pp. 2558-2564, 1996.
[74] G. S. Ristić, M.M. Pejović and A.B. Jakšić, "Comparison between post-irradiation annealing and post-high electrical field stress annealing of n-channel power VDMOSFETs", Appl. Surf. Sci., vol. 220, pp. 181-185, 2003.
[75] A. Kelleher, M. O'Sullivan, J. Ryan, B. O'Neill and W. Lane, "Development of the radiation sensitivity of pMOS dosimeters", IEEE Trans. Nucl. Sci., vol. 39, pp. 342-346, 1992.
[76] I. Thomson, "Direct reading dosimeters", European Patent Office, EP0471957A2, 02/07/1991.
[77] S. Best, A. Ralson and N. Suchowerska, "Clinical application of the One Dose patient dosimetry system for total body irradiation", Phys. in Medic. and Biology, vol. 50, pp. 5909-5919, 2005.
[78] M. M.
Pejović, "The gamma-ray irradiation sensitivity and dosimetric information instability of RADFET dosimeter", Nucl. Technol. and Radiat. Protection, vol. 28, pp. 415-421, 2013.
[79] I. Thomson, R.E. Thomson and L. P. Brendt, "Radiation dosimetry with MOS sensors", Radiation Protec. Dosimetry, vol. 6, pp. 121-124, 1983.
[80] L. S. August, R.R. Circle and J.C. Ritter, "An MOS dosimeter for use in space", IEEE Trans. Nucl. Sci., vol. 30, pp. 508-511, 1983.
[81] L. S. August, "Estimating and reducing errors in MOS dosimeters caused by exposure to different radiations", IEEE Trans. Nucl. Sci., vol. 29, no. 6, pp. 2000-2003, 1982.
[82] G. Sarrabayrouse, A. Bellaouar and P. Rossel, "Electrical properties of MOS radiation dosimeters", Revue Phys. Appl., vol. 21, pp. 283-287, 1986.
[83] A. Bellaouar, G. Sarrabayrouse and P. Rossel, "MOS transistor for ionizing radiation dosimetry", Proc. 13th Yugoslav Conf. on Microelectronics (MIEL 85), Ljubljana, pp. 161-168, 1985.
[84] L. Adams and A. Holmes-Siedle, "The development of MOS dosimetry unit for use in space", IEEE Trans. Nucl. Sci., vol. 18, pp. 1607-1612, 1978.
[85] L. Adams, E.J. Daly, R. Harboe-Sorensen, A.G. Holmes-Siedle, A.K. Ward and A.A. Bull, "Measurements of SEU and total dose in geostationary orbit under normal and solar flare conditions", IEEE Trans. Nucl. Sci., vol. 38, pp. 1686-1692, 1991.
[86] J. S. Leffler, S.R. Lindgren and A.G. Holmes-Siedle, "The applications of RADFET dosimetry to equipment radiation qualification and monitoring", Trans. of the American Society, vol. 60, pp. 535-536, 1989.
[87] A. G. Holmes-Siedle, L. Adams, J.S. Leffler and S.R. Lindgren, "The RADFET system for real-time dosimetry in nuclear facilities", 7th Annual ASTM-Euratom Symp. on Reac. Dosimetry, Strasbourg, pp. 851-859, 1990.
[88] G. Ristić, S. Golubović and M.
Pejović, "P-channel metal-oxide-semiconductor detector fading dependencies on gate bias and oxide thickness", Appl. Phys. Lett., vol. 66, pp. 88-89, 1995.
[89] G. Ristić, S. Golubović and M. Pejović, "Sensitivity and fading of pMOS dosimeters with thick gate oxide", Sensors and Actuators A, vol. 51, pp. 153-158, 1996.
[90] Z. Savić, S. Stanković, M. Kovačević and M. Petrović, "Energy dependence of pMOS dosimeters", Radiation Protect. Dosimetry, vol. 64, pp. 205-211, 1996.
[91] M. M. Pejović and M. M. Pejović, "Radiation-sensitive field effect transistor response to gamma-ray irradiation", Nuclear Technol. and Radiat. Protection, vol. 26, pp. 25-31, 2011.
[92] M. M. Pejović, "Dose response, radiation sensitivity and signal fading of p-channel MOSFETs (RADFETs) irradiated up to 50 Gy with 60Co", Appl. Radiation and Isotopes, vol. 104, pp. 100-115, 2015.
[93] S. Pejović, P. Bošnjaković, O. Ciraj-Bjelac and M.M. Pejović, "Characteristics of a pMOSFET suitable for use in radiotherapy", Appl. Radiation and Isotopes, vol. 77, pp. 44-49, 2013.
[94] G. Ristić, A. Jakšić and M. Pejović, "pMOS dosimetric transistors with two-layer gate oxide", Sensors and Actuators A, vol. 63, pp. 129-134, 1997.
[95] M. M. Pejović, S.M. Pejović, D. Stojanov and O. Ciraj-Bjelac, "Sensitivity of RADFETs for gamma and X-ray doses used in medicine", Nuclear Technol. and Radiat. Protection, vol. 29, pp. 179-185, 2014.
[96] M. Pejović, O. Ciraj-Bjelac, M. Kovačević, Z. Rajović and G. Ilić, "Sensitivity of p-channel MOSFET to X- and gamma-ray irradiation", International Journal of Photoenergy, vol. 2013, pp. 1-6, 2013.
[97] C. Ehringfeld, S. Schmid, K. Poljanc, Ch. Kirisits, H. Aiginger and D. Georg, "Application of commercial MOSFET detectors for in vivo dosimetry in the therapeutic X-ray range from 80 kV to 250 kV", Physics in Medicine and Biology, vol. 50, pp. 289-303, 2005.
[98] S. M. Pejović, M.M. Pejović, D. Stojanov and O.
ciraj-bjelac, “sensitivity and fading of pmos dosimeters irradiated with x-ray radiation doses from 1 to 100 cgy”, radiation protect. dosimetry, vol. 168, pp. 33-39, 2016. [99] p. j. mcwhorter, s.l. miller and w.m. miller, “modeling the anneal of radiation-induced traps holes in a varying thermal environment”, ieee trans. nucl. sci., vol. 37, pp. 16821689, 1990. [100] m. m. pejović, m. m. pejović and a.b. jakšić, “response of pmos dosimeters on gamma-ray irradiation during its re-use”, radiation protection dosimetry, vol. 155, pp. 394-403, 2013. [101] j. aristu, f. calvo, r. martinez, j. dubois, m. santors, s. fisher, et al., “lung cancer, in; intraoperative irradiation techniques and results, 437-453, 1999. [102] l. j. asensio, m.a. carvajal, j.a. lopez-villaneva, m. vilches, a.m. lallena and a.j. palma, „evaluation of a low-cost commercial mosfet as radiation dosimeter“, sensors and actuators a, vol. 125, pp. 288-295, 2006. [103] m. s. martinez-garcia, f. simancos, a.j. palma, a.m. lallena, j. banqueri and m.a. carvajal, „ general purpose mosfets for the dosimetry of electron beams used in intra-operative radiotherapy“, sensors and actuators a, vol. 210, pp. 175-181, 2014. [104] m. m. pejović, „application of p-channel power vdmosfet as a high radiation dose sensor”, ieee trans. nucl. sci., vol. 62, pp. 1905-1910, 2015. instruction facta universitatis series: electronics and energetics vol. 28, n o 2, june 2015, pp. 237 249 doi: 10.2298/fuee1502237d novel, low power, nonlinear dilatation and erosion filters realized in the cmos technology  rafał długosz 1,2 , andrzej rydlewski 3 , tomasz talaśka 1 1 utp university of sciences and technology, faculty of telecommunication, computer science and electrical engineering, bydgoszcz, poland 2 delphi automotive company, kraków, poland 3 alcatel-lucent, coldra woods, chepstow rd, newport np18 2yb abstract. 
in this paper we propose novel, binary-tree, asynchronous, nonlinear filters suitable for signal processing, realized at the transistor level. two versions of the filter have been proposed, namely the dilatation (max) filter and the erosion (min) filter. in the proposed circuits an input signal (current) is sampled in a delay line controlled by a multiphase clock. in the subsequent stage particular samples are converted to 1-bit digital signals with delays inversely proportional to the values of these samples. in the last step the delays are compared in a digital binary-tree structure in order to find either the min or the max value, depending on which filter is used. both circuits have been simulated in the tsmc cmos 0.18 µm technology. to make the results reliable we applied the corner analysis procedure. the circuits were tested for temperatures ranging from -40 to 120 ºc, for different transistor models and supply voltages. the circuits offer a precision of about 99% at a typical detection time of 20 ns for the max filter and 100 ns for the min filter (the worst case scenario). the energy consumed per one input during a single calculation cycle equals 0.32 and 1.57 pj for the max and min filters, respectively.

key words: nonlinear filters, cmos realization, full-custom, binary-tree architecture

1. introduction

dilatation and erosion operations, often referred to as the max and min functions, respectively, are useful in many applications. these operations are commonly used in artificial neural networks (ann), but also in signal and image processing [1]. to perform competitive learning, which is common in some types of anns, the min function is used to determine which of the neurons is located closest to a provided learning pattern. in this case the operation is known as winner takes all (wta), which is somewhat misleading, as from the formal point of view the min function corresponds to the loser takes all (lta) operation.
however, the winning neuron is the one for which the distance is the smallest, hence the convention.

received august 19, 2014; received in revised form february 11, 2015
corresponding author: rafał długosz, utp university of sciences and technology, faculty of telecommunication, computer science and electrical engineering, ul. kaliskiego 7, 85-796 bydgoszcz, poland (e-mail: rafal.dlugosz@gmail.com)

another area in which the min and the max operations are used is nonlinear filtering. such filters are used, for example, to enhance signals or to correct the shapes of objects in pictures. in such applications the min and the max operations are denoted as erosion and dilatation, respectively. both types of filters can be joined in series in order to perform more complex tasks, such as the morphological opening and closing operations, commonly used in image processing to reconstruct the original form of a digital image from a noisy one. for such applications we can use min/max detector based (mdb) filters or min/max exclusive mean (mmem) filters. these techniques can be used to achieve the best performance [2]-[4].

there is a close similarity between the nonlinear dilatation (max) / erosion (min) filters and the wta / lta operations, as in both cases the core circuit fulfills exactly the same task: it searches for either the minimum or the maximum signal among a set of input signals. the difference lies in the input signals used in each of these cases. in anns all signals are independent, as they come from separate neurons distributed over the input data space. the erosion/dilatation filters, on the other hand, process samples of one signal stored in the delay line, as shown in fig. 1. in this diagram a classic delay line is schematically shown to illustrate the idea. in the classic approach the samples are rewritten many times between memory cells.
in software realizations this is not a problem, but in analog transistor-level implementations it is usually a source of errors that affect the quality of filtering. we faced this problem in our former projects of finite impulse response (fir) filters realized in the switched-capacitor technique [5]. to avoid the problem of reduced accuracy it is necessary to use filters in which the number of read/write operations is minimized. one of the possibilities in this regard is to use the so-called circular delay line [6], [7]. in this approach the samples are stored in particular memory cells and remain there until they are replaced with new samples after m clock cycles, where m is also the number of samples stored in the whole delay line. we adopted this solution in the nonlinear filters presented in this paper.

in filters the input signal can be sampled in the time domain (1-d signal), or particular samples can be, for example, pixels of an image (2-d signal). in this paper we focus on nonlinear filters used in the first situation. however, the proposed solution can be adapted to image filtering as well.

fig. 1 nonlinear dilatation / erosion filtering of a 1-d signal in time domain

numerous min/max circuits have been reported in the literature, but two major types of architectures can be clearly distinguished. in the first group the min/max circuits are usually based on the current conveyor (cc) architecture [8]-[10]. in this approach all input signals (either currents or voltages) are compared in a single stage. such circuits usually feature a simple structure, but suffer from limited accuracy that decreases when the number of inputs increases [9], [11]. this problem results mostly from the so-called 'corner error', which occurs when two or more input signals have similar values. in this case an average value between these signals appears at the output of the filter.
fig. 2 block diagram of the proposed min / max filter binary-tree solution

the second group of filters is based on the concept of the binary tree (bt) structure. in this case the competition between the input signals is conducted on particular layers of the tree. the number of layers equals log2(m), where m is the number of inputs, equal to the length of the delay line. signals at each layer compete in pairs, and only the winning signal of each pair takes part in the competition at the next layer of the tree [9], [12], [13]. the binary-tree circuits are usually more complex than their cc counterparts. however, if precise comparators are used, they are able to properly distinguish signals that differ by very small amounts. another advantage of bt solutions is that they are able to determine the address of the max or min signal, which is not possible in the cc circuits. this is an important feature when such circuits are applied in anns, in which the value of the winning signal is less important than the information about which of the input signals has the smallest value.

in typical bt solutions the (analog) signals at the outputs of particular layers of the tree are determined (calculated or copied) on the basis of signals provided from preceding layers [9], [12], [14]. this may be a source of errors [15] that accumulate at the top of the tree. in the proposed solution [16] this problem is less pronounced. at an early stage of the signal processing chain the analog input signals are converted to digital 1-bit signals with delays inversely proportional to the values of the input signals. then the comparison of the signals (their delays) is performed in a digital bt structure. in this way, the copying of analog signals between layers has been eliminated.

the paper is organized as follows: in the next section we propose two filters specific for the dilatation (max) and erosion (min) nonlinear operations.
in the following section we present verification of the proposed circuit by means of transistor-level simulations. to provide reliable results we performed rigorous pvt (process, voltage, temperature) variation tests. the conclusions are formulated in the last section.

fig. 3 components of the proposed nonlinear filters: (a) input multiple-output cm used in circular delay line, (b) s&h memory element used in delay line, (c) current to time converter (itc), (d) delay comparator used in dilatation (max) filter, (e) delay comparator used in erosion (min) filter, (f) address determination block (adet)

2. proposed dilatation and erosion nonlinear filters

both nonlinear filters proposed in this paper are based on the same structure, shown in fig. 2. the circuit is composed of an analog part, whose role is to prepare simplified signals, and the subsequent digital bt structure. the circuit consists of several blocks, or groups of elements, presented in detail in fig. 3.

analog part of the system

in both filters the input current, iin, is first sampled and held in the circular delay line. this delay line has been used to avoid multiple read and write operations on particular signal samples, which are the source of large errors in a classical delay line. in this approach particular samples are not rewritten between memory cells but remain in their cells until they are replaced by new samples after m clock cycles, as described earlier. the delay line in the proposed filters works as follows: the input signal is copied m times by means of the multiple-output current mirror (cm), shown in fig. 3(a). in this way each branch receives a separate copy of the input signal and thus data processing in particular branches is independent. particular samples of the input signal are stored in sample & hold (s&h) memory elements, shown in fig. 3(b).
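the difference between the classic and the circular delay line can be illustrated with a short behavioural sketch. it is only a software model under assumed names (`ClassicDelayLine`, `CircularDelayLine`, `writes`), not the transistor-level circuit: it counts how many cell writes each scheme performs, which is the quantity the circular line keeps small, and the sample values are arbitrary.

```python
class ClassicDelayLine:
    """Shift-register model: every new sample moves all stored samples by one cell."""

    def __init__(self, m):
        self.cells = [0.0] * m
        self.writes = 0  # total number of cell write operations

    def push(self, sample):
        # each stored sample is rewritten into the neighbouring cell
        for i in range(len(self.cells) - 1, 0, -1):
            self.cells[i] = self.cells[i - 1]
            self.writes += 1
        self.cells[0] = sample
        self.writes += 1


class CircularDelayLine:
    """Circular model: a sample stays in its cell until overwritten m cycles later."""

    def __init__(self, m):
        self.cells = [0.0] * m
        self.phase = 0   # index of the currently active clock phase
        self.writes = 0

    def push(self, sample):
        self.cells[self.phase] = sample  # exactly one write per clock cycle
        self.writes += 1
        self.phase = (self.phase + 1) % len(self.cells)


def window_max_min(line):
    """Dilatation (max) and erosion (min) over the samples currently stored."""
    return max(line.cells), min(line.cells)


if __name__ == "__main__":
    m = 8
    samples = [3.0, 4.2, 1.5, 4.9, 2.7, 3.3, 1.1, 4.0, 2.2, 3.8]  # arbitrary, in µA
    classic, circular = ClassicDelayLine(m), CircularDelayLine(m)
    for s in samples:
        classic.push(s)
        circular.push(s)
    print("writes:", classic.writes, circular.writes)
    print("max/min:", window_max_min(classic), window_max_min(circular))
```

both models hold the same set of samples after every clock cycle, so the max and min results agree; only the number of rewrites differs, growing with m in the classic line and staying at one per cycle in the circular one.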
to compensate the charge injection effect, typical in this case, across the storage capacitors, cst, we have used the so-called dummy switches, swd. in such switches inputs and outputs are shorted together, so they do not change the functionality of the circuit. such switches are controlled by clock signals of opposite polarity in comparison with the memory switches, swm. the circular delay line is in this case controlled by an m-phase clock. the complexity of the clock can be viewed as a disadvantage. however, since the length of nonlinear filters usually does not exceed 8-10, it is not a significant problem, taking into account the increased precision of the circuit.

output signals from particular s&h elements, denoted as i'in,i, are provided to current-to-time converters (itc), shown in fig. 3(c), that convert them into binary 1-bit flags (f). these converters are also common for both filters. the flag signals occur at the outputs of particular itcs with delays inversely proportional to the values of the signal samples. each of these blocks is composed of a pmos-type cascoded cm, an integrating capacitor with a reset function, and two not gates. the voltage across the capacitor increases at a rate proportional to the value of the i'in,i signal, so a larger sample raises its flag earlier. the not gates change their logical states when the voltage across the capacitor reaches a value of about vdd/2.

fig. 4 a theoretical influence of transistor sizes on the gain error of the current mirror (due to threshold voltage mismatch) for the weak and strong inversion regions.

to improve the precision of the circuit we have used cascoded cms, which increase the accuracy of the copying operations. an additional problem while designing the cms is how to determine the optimal sizes of transistors for particular values of the input currents (in the range up to 10 µa in this case).
we faced a similar problem in our former projects [17], [18]. the sizes of transistors have a strong influence on the mismatch effect [19], for example on the threshold voltage mismatch ∆vth. the latter parameter has, in turn, an impact on the theoretical gain error of the cm and thus on the precision of the circuit. in the weak inversion region the impact is usually larger than in the strong inversion region, as shown in fig. 4, and therefore we bias the transistors so that their operating points lie in the strong inversion region. to make this possible we do not work with currents smaller than 1 µa. increasing the sizes of transistors always reduces the mismatch effect. however, for given values of the input currents this also decreases the gate-to-source voltage, vgs, which in turn enlarges the gain error of the cm. for currents in the range up to 10 µa the optimal sizes of transistors are w/l = 3/1 and 9/1 µm for nmos and pmos transistors, respectively.

digital binary tree structure

the binary tree structure used in the proposed nonlinear filters is composed of delay comparators (dcmp), which are different for particular filters. two versions of this block have been proposed. the circuit used in the dilatation filter is shown in fig. 3(d), while the one used in the erosion filter is shown in fig. 3(e). both circuits are built on the basis of the rs flip-flop (rsff), which is able to distinguish very small (at the level of 3-5 ns) differences between delays of particular input signals. depending on the mode of the filter (min or max), either the smaller or the larger of the two input signals becomes the winner, which the dcmp signals by two digital signals, o1 and o2. in the overall bt the determination of the winning (or losing) signal is based on the competition performed at particular layers of the tree.
to make this possible, each dcmp block provides an additional signal, the flag (f) of a given pair, that takes part in the competition at the following layer of the tree. depending on the type of filter, between the f11 and f12 inputs and the output f there is only a single or gate or a single and gate. in the dilatation filter, as soon as either of the input flags becomes 1, a given dcmp immediately (with a delay below 0.5 ns) sends the flag f of the pair to the next layer of the tree. in the erosion filter, on the other hand, the and gate forces a given dcmp to wait with sending the flag of the pair until both input flags become 1. as a result, the erosion filter is slower than the dilatation one. another problem in this filter appears when the minimum signal is very small or zero. in this case the detection of this signal can take a very long time. to solve this problem we assume not only an upper range of the input signals but also a bottom range, which in this case equals 10% of the upper range. if it cannot be guaranteed that the input signals are always larger than the bottom range, we can introduce a constant bias, added at the junction to each signal charging the integrating capacitor in the itc block.

the last operation performed in the proposed filters is the determination of the address of the min or the max signal, depending on the type of the filter. the o1 and o2 signals from particular dcmp blocks are used by the adet (address determination) block, shown in fig. 3(f), to determine the address. the o1 and o2 signals always have values that enable an unambiguous indication of the winning signal. unfortunately, the problem with the rsff is that it can hang ('0.5' states at both outputs) when two input flags arrive at almost the same time, i.e. when the corresponding input currents are almost equal. in this case the values at both outputs of the rsff are equal to about vdd/2.
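the flag propagation described above can be mimicked by a small behavioural model that works on flag arrival times instead of voltages. this is a sketch under assumptions that are not taken from the paper: the function name `bt_detect`, ideal gates with zero delay, and a tie rule that simply favours the first input of a pair (the rsff hang itself is not modelled). an or gate forwards the earlier flag of a pair and an and gate the later one, while the index of the surviving input plays the role of the address produced by the adet block.

```python
def bt_detect(flag_times, mode):
    """Return (address, time, layers) of the winning flag in a binary-tree competition.

    flag_times -- arrival time of each input flag (number of inputs: a power of 2)
    mode       -- "max": OR gates, the earliest flag wins (largest current)
                  "min": AND gates, the latest flag wins (smallest current)
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    n = len(flag_times)
    if n == 0 or n & (n - 1):
        raise ValueError("number of inputs must be a power of 2")

    # each entry: (address of the surviving input, time at which its flag is valid)
    layer = list(enumerate(flag_times))
    layers = 0
    while len(layer) > 1:
        survivors = []
        for a, b in zip(layer[0::2], layer[1::2]):
            if mode == "max":
                winner = a if a[1] <= b[1] else b  # OR: the first flag passes
            else:
                winner = a if a[1] >= b[1] else b  # AND: waits for both flags
            survivors.append(winner)
        layer = survivors
        layers += 1
    address, time = layer[0]
    return address, time, layers


if __name__ == "__main__":
    # hypothetical flag arrival times in ns for 8 inputs
    times = [30.0, 45.0, 12.5, 60.0, 22.0, 90.0, 18.0, 15.0]
    print(bt_detect(times, "max"))
    print(bt_detect(times, "min"))
```

the model also makes the speed difference visible: in the max mode the result is available as soon as the earliest flag arrives, whereas in the min mode it is available only after the latest flag.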
to avoid ambiguity when the rsff hangs, a simple hierarchy mechanism has been introduced that is able to recognize the '0.5' states. in such situations the circuit arbitrarily decides that one of the input signals obtains the status of the 'winner'. the proposed arbitrary mechanism is based on asymmetrical not (notn and notp) gates. the gates have different threshold voltages, obtained through proper transistor sizing. these voltages are equal to 0.25∙vdd/2 and 0.75∙vdd/2 for the notp and notn gates, respectively. when the rsff hangs, the gates provide different output signals, which is detected by the xor gate. this gate, through the configuration switches (controlled by the 'swn' / 'swp' signals), controls the values of the o1 and o2 output signals. in this case the circuit arbitrarily connects the outputs of the rsff to the vdd ('1') and vss ('0') supplies. this function does not introduce a substantial error, as in this case both analog input signals are almost equal (difference < 0.2%). additionally, it is worth noting that the '0.5' states seldom occur in practice, so the mechanism is only an emergency solution.

fig. 5 transistor level simulations of a single dcmp equipped with the proposed arbitrary mechanism. in the b state the arbitrary mechanism eliminates the ambiguity.

fig. 6 simulations of the circular delay line with eight s&h memory elements. from top to bottom are presented: (1) an example input current with the amplitude of 2 µa, (2) controlling clock signals (8 phases), (3) signal samples stored in particular s&h cells (voltages across the storage capacitors, cst), and (4) the supply current

fig. 7 simulations of the bt block composed of the dcmp (max mode) circuits for t=20ºc and vdd=1.8v.
from top to bottom are presented: (1) vc voltages in the itc circuits, (2) resultant flag signals, and (3) addresses of the samples that in a given time period have the maximum values

fig. 8 simulations of the bt block composed of the dcmp (min mode) circuits for t=20ºc and vdd=1.8v. from top to bottom are presented: (1) vc voltages in the itc circuits, (2) resultant flag signals, and (3) addresses of the samples that in a given time period have the minimum values

3. verification of the proposed nonlinear filters

the proposed circuit has been tested in several steps. first we tested the dcmp as a separate circuit. we put a special emphasis on the arbitrary mechanism, as this block is of crucial importance. illustrative results for the circuit used in the dilatation filter are shown in fig. 5. the rs out1 and rs out2 signals can be either in a typical state (a), in which their values are '0' or '1', or in the '0.5' state (b), which is not desired. in case b (e.g. in the range from 47 to 52 µs) the outputs of the asymmetrical notn and notp gates provide different values. this state is detected by the xor gate, which signals it by inverting the values of the 'swp' and 'swn' signals. as a result, the outputs o1 and o2 are arbitrarily connected to the vdd and vss rails (logical '1' and '0'), respectively.

after verifying the dcmp block we tested the performance of the overall filter composed of a delay line with 8 memory cells and the bt block with three layers (log2(8) = 3). the results for the dilatation as well as the erosion filters are presented in figs. 6 – 8. fig. 6 illustrates the operation of the circular delay line. an example input current, a sine waveform with f = 10 khz and an amplitude of 2 µa superimposed on a 3 µa dc signal, is sampled and held in the memory cells (cst = 400 ff). each sample remains in a given cell during eight subsequent clock cycles.
the average supply current equals 70 µa, which means that the average power dissipation equals 126 µw (for vdd = 1.8 v).

the performance of the overall circuit for both types of nonlinear filters is presented in figs. 7 and 8. the top panels in both figures present voltages across the storage capacitors in particular itc blocks. this phase is preceded by resetting the capacitors. after the reset signal is released, the capacitors are charged from 0 to vdd by currents (samples of the input signal) whose values are stored in the corresponding s&h elements. the middle panels present the resultant delays of particular flags. finally, the bottom panels show the addresses of the samples with the max or min values that appear at the outputs of the filters.

the input signals in both cases have been selected so as to present different scenarios. in fig. 7, in the first cycle (between 50 and 60 µs), the i7 and i8 samples are almost equal. as a result, both corresponding flags are set to 1 within a short period of time, which activates the arbitrary mechanism. in this case the mechanism arbitrarily selects the i7 signal as the winner. the next two cycles (60-70 and 70-80 µs) present a typical situation, in which differences between particular signals are larger. the detection time varies in this case between 5 and 20 ns. in most cases we expect that this time will be no greater than 20 ns. in the worst case scenario (not presented), i.e. for bottom values of all samples of the input signal (1 µa), this time can reach even 100 ns. taking into account the average power dissipation provided above, the energy consumed during one detection cycle per one input can be determined to be 1.57 pj and 0.32 pj in the worst case scenario and in a typical situation, respectively.

in the case of the erosion filter we selected input signals for which the flags appear at almost the same time (a difference of 1 – 3 ns). this is shown in fig. 8.
in each situation the arbitrary mechanism properly selects one of these signals as the winner (the minimum signal in this case). the detection time is in this case longer than in the case of the dilatation filter, as the circuit must wait until the flag of the smallest signal appears at the output of the itc. we can assume that the detection time is in this case closer to the worst case scenario of the dilatation filter (100 ns), while the power dissipation remains the same.

fig. 9 corner analysis: simulations of the dilatation (max) filter for different temperatures: (a) -40 ºc, (b) 120 ºc. the meaning of particular diagrams seen from top to bottom is the same as in figs. 7 and 8.

after the initial verification described above we performed a detailed corner analysis of the circuits. we tested the filters in wide ranges of particular pvt (process, voltage and temperature) parameters. the environment temperature varied in the range from -40 ºc to 120 ºc, while the supply voltage varied in the range from 1.2 to 1.8 v. we tested the circuit for three transistor models, namely slow, fast and typical (ss, ff, tt). fig. 9 presents selected simulation results for the same situation as in fig. 7, but for different temperatures, to illustrate the stability of the system.

table 1 performance comparison of selected min / max circuits reported in literature.

| reference | process (cmos µm) | vdd [v] | no. of inputs | p / (1 in) [µw] | data rate f [mhz] | input range [µa] | fom (f/p1in) [mhz/µw] |
|---|---|---|---|---|---|---|---|
| [20] | 0.5 | 3.3 | 8 | 106.25 | 5 | 3.3 | 0.047 |
| [21] | 0.35 | 3.3 | 8 | 70 | 1 | 10 | 0.014 |
| [12] | 0.8 | 6 | 8 | 120 | 2.8 | 50 | 0.023 |
| [9] | 2.4 | 5 | 8 | 200 | 13.8 | 100 | 0.069 |
| [15] | 0.6 | 3 | 8 | 283.75 | 20 | 70 | 0.070 |
| this work (dilatation) | 0.18 | 1.8 | 8 | 15.75 | 50 | 1 – 10 | 3.174 |
| this work (erosion) | 0.18 | 1.8 | 8 | 15.75 | 10 | 1 – 10 | 0.635 |

4. discussion of results

in this section we compare the obtained results with the performance of other min/max circuits reported in the literature. a straightforward comparison is not easy, as particular solutions were designed for different purposes and thus, to some extent, offer different features. most of the reported circuits do not contain a delay line, as they have been designed for independent input signals directly provided to the bt block. in the case of our circuit the memory cells used to store the signal samples contain additional branches that conduct current, thus increasing the power dissipation. the schemes presented in fig. 3 (a) – (c) show that each memory cell contains two additional branches, which almost doubles the power dissipation in comparison with the situation in which the same circuit (without the delay line) would be used to process independent signals.

to facilitate the comparison of different solutions we define a figure-of-merit (fom) as the data rate over the power dissipation for a single input. such an assumption is justified, as the power dissipation increases approximately linearly with the number of inputs. note that the number of all elements used in the circuit also increases linearly with the number of inputs. as discussed earlier, the number of layers in the binary tree equals log2(m). at each following layer the number of elements that serve as comparators (dcmp) is halved. for example, for 8 inputs the circuit has 3 layers with 4, 2 and 1 comparators, respectively (7 comparators in total). for 128 inputs the number of comparators equals 127. the number of itcs and memory cells in the delay line equals the number of inputs.

we are aware that the proposed circuit has been realized in a newer technology than the other circuits of this type presented in table 1. however, as described earlier, to reduce the mismatch effect we oversized the transistors used in the analog part, which had some impact on the attainable data rate.
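the figures quoted for this work in table 1 can be re-derived from values reported in the text. the sketch below uses only numbers given in the paper (70 µa average supply current, 1.8 v supply, 8 inputs, detection times of 20 and 100 ns, data rates of 50 and 10 mhz); the helper names are illustrative and not part of the original work.

```python
import math

VDD = 1.8          # supply voltage [V]
I_SUPPLY = 70e-6   # average supply current [A]
N_INPUTS = 8       # number of inputs (length of the delay line)


def power_per_input(vdd=VDD, i_supply=I_SUPPLY, n=N_INPUTS):
    """Average power dissipation per single input [W]."""
    return vdd * i_supply / n


def energy_per_input(t_detect, p_in=None):
    """Energy consumed per input during one detection cycle [J]."""
    p_in = power_per_input() if p_in is None else p_in
    return p_in * t_detect


def fom(data_rate_mhz, p_in_uw):
    """Figure of merit: data rate over power per input [MHz/µW]."""
    return data_rate_mhz / p_in_uw


def comparator_count(m):
    """Number of layers and of DCMP blocks in a binary tree with m inputs (m a power of 2)."""
    layers = int(math.log2(m))
    return layers, sum(m >> k for k in range(1, layers + 1))


if __name__ == "__main__":
    p_uw = power_per_input() * 1e6
    print(f"power per input: {p_uw:.2f} uW")
    print(f"energy, typical (20 ns): {energy_per_input(20e-9) * 1e12:.3f} pJ")
    print(f"energy, worst case (100 ns): {energy_per_input(100e-9) * 1e12:.3f} pJ")
    print(f"fom dilatation: {fom(50, p_uw):.3f}, erosion: {fom(10, p_uw):.3f}")
    print("layers, comparators for 8 and 128 inputs:",
          comparator_count(8), comparator_count(128))
```

the same `fom` helper applied to the other rows of table 1 (for example 5 mhz and 106.25 µw for [20]) gives the values listed there, which is a quick consistency check of the table.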
the main source of the observable delay is the analog part of the system. in particular itc blocks, currents with values between 1 and 10 µa have to charge the 100 ff capacitors to about 0.9 v, which enables generating the flag. this process takes about 9 to 90 ns, for 10 and 1 µa, respectively. if the circuit were realized in an older technology, we would have to increase the supply voltage, so the process of charging the capacitors would take longer (let us assume 2-3 times).

the digital part of the system is very fast. the delay of a single layer of the bt block equals the delay of a single or gate or and gate only, as the flag of the pair is generated by these gates (fig. 3 d-e). this delay in the cmos 0.18 µm technology does not exceed 1 ns for vdd = 1.8 v. if the rsff hangs, the arbitrary mechanism requires about 3 ns to decide which of the inputs is assumed to be the winner. however, this process runs in parallel with the propagation of flags in the tree. in the bt block we use transistors with the minimal lengths available in a given technology. if we redesigned the circuit in an older technology, the propagation time of each layer of the tree would increase by a factor of (l1/l2)^2. in the cmos 0.5 µm technology, for example, this time would be about 7-10 times longer. we suppose that in this technology the delay of the overall circuit in the worst case scenario would not exceed 300 ns. this would reduce the fom of our circuit about 3 times. however, the obtained results would still be four times better than those of the example circuit reported in [20], designed in the cmos 0.5 µm technology.

the provided delay times and calculations are for an example case of 8 inputs, i.e. 3 layers in the tree. for larger structures the delay of the analog part will remain the same, while the delay of the digital part will increase only moderately. this is one of the main advantages of the proposed solution.
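the charging times quoted above follow from the constant-current charging relation t = c·v/i. the sketch below reproduces them together with the (l1/l2)^2 scaling estimate; the function names are illustrative, and the capacitor is assumed to be charged from 0 v by an ideal current source, which is a simplification of the itc block.

```python
C_INT = 100e-15   # integrating capacitor in the ITC [F]
V_FLIP = 0.9      # voltage at which the NOT gates switch, about VDD/2 [V]


def itc_delay(i_sample, c=C_INT, v_flip=V_FLIP):
    """Time needed to charge c from 0 V to v_flip with a constant current [s]."""
    if i_sample <= 0:
        # a zero current never raises the flag, hence the bottom range of the inputs
        raise ValueError("sample current must be positive")
    return c * v_flip / i_sample


def gate_delay_scaling(l_old_um, l_new_um):
    """Factor by which the delay of a tree layer grows when the channel length
    increases from l_new_um to l_old_um, using the (l1/l2)^2 rule from the text."""
    return (l_old_um / l_new_um) ** 2


if __name__ == "__main__":
    for i_ua in (10, 1):
        print(f"{i_ua} uA -> {itc_delay(i_ua * 1e-6) * 1e9:.1f} ns")
    print(f"0.5 um vs 0.18 um: x{gate_delay_scaling(0.5, 0.18):.1f}")
```

because the delay scales as 1/i, a sample at the bottom range of 1 µa takes ten times longer than a sample at 10 µa, which is why the analog part dominates the worst-case detection time.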
in other circuits of this type with an analog bt, the delay is linearly proportional to the number of layers.

during the corner analysis we also simulated the filters with smaller supply voltages. the circuit worked properly and dissipated less power, but it was also much slower. for vdd = 0.8 v the digital part was approximately 10 times slower. working with such supply voltages does not make sense, as the energy consumed during one cycle does not decrease as fast as the dissipated power, precisely because of the reduced speed. additionally, for such voltages the transistors used in the analog part work in the weak inversion region, which reduces the precision of the circuit.

5. conclusions

novel nonlinear dilatation and erosion filters have been proposed in the paper. the circuits are based on the binary tree concept. however, in contrast to typical solutions of this type, in which analog bt structures are used, in the proposed circuit we distinguish an analog part that converts the analog signals to 1-bit signals with different delays and a parallel, asynchronous digital bt block that determines which delay is the smallest or the largest, depending on the type of the filter. the proposed digital bt is much faster than its analog counterparts. it additionally eliminates the propagation of analog signals in the tree, which takes place in other circuits of this type. as a result, the circuit offers a precision exceeding 99%, which is sufficient in many signal processing tasks.

the proposed bt is very sensitive and is able to distinguish very small differences between the delays of particular input signals. this is possible through an atypical use of rs flip-flops, which serve in this case as time comparators. in a typical application of rs flip-flops the '11' input state is not allowed. in our circuit we call this situation an emergency state, which happens relatively seldom.
nevertheless, to avoid the situation in which this state would prevent the calculation of the output sample of the filter, we propose an arbitrary mechanism that is able to handle this situation.

the next step of the project will be the design and fabrication of a chip containing the filters and its laboratory tests. this phase is necessary, as noise can have some impact on the results.

references

[1] m. vemis, g. economou, s. fotopoulos, a. khodyrev, "the use of boolean functions and logical operations for edge detection in images", signal processing, 1995, vol. 45, pp. 161–172.
[2] r.a. araujo, a.l.i. oliveira, s. soares, s. meira, "designing dilation-erosion perceptrons with differential evolutionary learning for air pressure forecasting", in proceedings of the international joint conference on neural networks, 2011, san jose, california, usa, pp. 595–602.
[3] p.t. jackway, m. deriche, "scale-space properties of the multiscale morphological dilation-erosion", ieee transactions on pattern analysis and machine intelligence, 1996, vol. 18, no. 1, pp. 38–51.
[4] joseph (yossi) gil and ron kimmel, "efficient dilation, erosion, opening, and closing algorithms", ieee transactions on pattern analysis and machine intelligence, vol. 24, iss. 12, december 2002, pp. 1606–1617.
[5] a. dąbrowski, r. długosz, p. pawłowski, "integrated cmos gsm baseband channel selecting filters realized using switched capacitor finite impulse response technique", elsevier microelectronics reliability journal, vol. 46, no. 5–6, pp. 949–958, june 2006.
[6] sophocles j. orfanidis, "introduction to signal processing", previously published by pearson education, inc. 1996-2009 by prentice hall, inc. previous isbn 0-13-209172-0
[7] r. długosz, k.
iniewski, “programmable switched capacitor finite impulse response filter with circular memory implemented in cmos 0.18μm technology”, journal of signal processing systems (formerly the journal of vlsi signal processing systems for signal, image, and video technology), springer new york, vol. 56, no. 2-3, pp. 295–306, september 2009. [8] w. w. moses, e. beuville, m. h. ho, "a winner-take-all ic for determining the crystal of interaction in pet detectors", ieee transactions on nuclear science, vol. 43, no. 3, 1996, pp.1615–1618 [9] a. demosthenous, s. smedley, j. taylor, "a cmos analog winner-takes-all network for large-scale applications", ieee transactions on circuits and systems-i: fundamental theory and applications, vol. 45, no. 3, 1998, pp.300–304. [10] j. ramirez-angulo, j.e. molinar-solis, s. gupta, r. g. carvajal, a. j. lopez-martin, "a high-swing, high-speed cmos wta using differential flipped voltage followers", ieee transactions on circuits and systems ii: express briefs, vol.54, no. 8, 2007, pp.668–672. [11] t. serrano, b. linares-barranco, "a modular current-mode high-precision winner-take-all circuit", ieee transactions on circuits and systems-ii: analog and digital signal processing, vol. 42, no. 2, 1995, pp.132–134. [12] k. wawryn, b. strzeszewski, "current mode ab class wta circuit", in the proceedings of the ieee international conference on electronics, circuits and systems (icecs), 2001, pp. 293–296. [13] g. t. tuttle, s. fallahi, a. a. abidi, "an 8-b cmos vector a/d converter", ieee international solidstate circuit conference (isscc), san francisco, usa, 1993, pp. 38–39 [14] r. długosz, t. talaśka, r.wojtyna, "new binary-tree-based winner-takes-all circuit for learning on silicon kohonen's networks", in proceedings on the int. conf. on signals and electronic systems (icses), lódź, poland, 2006, pp. 441–446. [15] b. tomatsopoulos, a. 
demosthenous, "low power, low complexity cmos multiple-input replicating current comparators and wta/lta circuits", in proceedings on the european conference on circuit theory and design (ecctd), vol. 3, no. 28, cork, ireland, 2005, pp. 241–244. [16] r. dlugosz , a. rydlewski , t. talaska, "low power nonlinear min/max filters implemented in the cmos technology", in proceedings on the 29th international conference on microelectronics, beograd, serbia, 12-14 may 2014, pp. 397–400. [17] r. długosz, w. pedrycz, "łukasiewicz fuzzy logic networks and their ultra low power hardware implementation", elsevier neurocomputing, vol. 73, iss.7-9, pp.1222–1234, march 2010. [18] r. długosz, t. talaska, w. pedrycz, "current-mode analog adaptive mechanism for ultra-low power neural networks", ieee transactions on circuits and systems–ii: express briefs, vol. 58, iss. 1, pp. 31–35, january 2011. [19] m.j.m. pelgrom, h.p. tuinhout and m. vertregt, "transistor matching in analog cmos applications", in proceedings on the ieee international electron devices meeting, december 1998, pp. 915–918 [20] y.c. hung, b.d. liu, "high-reliability programmable wta/lta circuit of o(n) complexity using a single comparator", iee proceedings-circuits devices and systems, vol. 151, no. 6, 2004, pp. 579–586. [21] yu chien-cheng, tang yun-ching, liu bin-da, "design of high performance cmos current-mode winner-take-all circuit", in proceedings on the international conference on asic, beijing, china, 2003, pp. 568–572. facta universitatis series: electronics and energetics vol. 31, n o 4, december 2018, pp. 
501-518 https://doi.org/10.2298/fuee1804501j

methods of decreasing losses in optical metamaterials

zoran jakšić 1, marko obradov 1, olga jakšić 1, goran isić 2,3, slobodan vuković 1,3, dana vasiljević radović 1

1 center of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, njegoševa 12, 11000 belgrade, serbia
2 institute of physics, university of belgrade, pregrevica 118, 11080 belgrade, serbia
3 science program, texas a&m university at qatar, p.o. box 23874 doha, qatar

received august 22, 2018
corresponding author: zoran jakšić, institute of chemistry, technology and metallurgy, university of belgrade, njegoševa 12, 11000 belgrade, serbia (e-mail: jaksa@nanosys.ihtm.bg.ac.rs)

abstract. in this work we review methods to decrease the optical absorption losses in metamaterials. the practical interest in metamaterials is huge, but the possible applications are severely limited by the high inherent optical absorption in their metal parts. we consider the possibilities to fabricate metamaterials with a decreased metal volume fraction, the application of alternative lower-loss plasmonic materials instead of the customarily used noble metals, and the use of all-dielectric, high refractive index contrast subwavelength nanocomposites. finally, we dedicate our attention to various methods to optimize the frequency dispersion in metamaterials by changing their geometry and composition in order to reach lower absorption, which includes the use of hypercrystals. the final goal is to widen the range of different metamaterial-based devices and structures, including those belonging to transformation optics. perhaps the most important among them is the fabrication of a novel generation of all-optical or hybrid optical/electronic integrated circuits that would operate at optical frequencies and at the same time offer the packaging density and complexity of contemporary integrated circuits, owing to the strong localization of electromagnetic fields enabled by plasmonics.

key words: metamaterials, transformation optics, plasmonics, low-loss metamaterials, hyperbolic metamaterials

1. introduction

artificial structuring of optical materials at the subwavelength level ensures excellent control over spectral and spatial dispersion. it becomes possible to obtain very high, very low (near-zero) and negative effective values of refractive index. a path is thus opened to tailoring the optical space at will, ultimately leading to the new field of transformation optics, fig. 1 [1-3]. materials structured in this manner possess electromagnetic properties that surpass those normally met in nature and are thus denoted as metamaterials [4, 5].

502 z. jakšić, m. obradov, o. jakšić, g. isić, s. vuković, d. vasiljević radović

in a general case, metamaterials represent 1d, 2d or 3d composites of constituent parts with different values of complex refractive index, fig. 2, and are the main building blocks for tailoring the optical space. they are typically structured at a subwavelength level, which in the case of visible radiation is of the order of nanometers. in most cases the optical metamaterials owe their operation to electromagnetic localization on an interface between the two constituent materials (typically metal and dielectric). in such a situation an evanescent wave is formed at the interface, called a surface plasmon polariton. in its basic form such a surface wave is obtained by coupling the free electron plasma in the metal part with the p-polarized electromagnetic wave at the interface between a semi-infinite dielectric and a semi-infinite metal. depending on the geometry of the nanocomposite and its constituent materials, a host of different waves can appear at the surface and in the bulk [6-8].
the strong localization of the electromagnetic field at the surface lies at the root of many exotic optical phenomena connected with plasmonic metamaterials. many novel wave phenomena are met in such metastructures, for instance extreme light concentration [9, 10], near-perfect absorption [11-13], superlensing [14-16] and hyperlensing [17-19], and optical cloaking (invisibility shields) [3, 20, 21], to name just a few. in principle, metamaterials allow almost arbitrary control over the propagation of electromagnetic waves. one of the most interesting practical goals of metamaterials and transformation optics is merging the packaging density of electronic devices with the speed of photonic ones by creating ultracompact, all-optical circuits as a new step in the continuation of moore's law [22]. this holds the potential to revolutionize the electronics industry.

fig. 1 modification of the optical space by transformation optics. black lines represent the optical space (the artificially made structure of the metamaterial) while red arrows show the propagation of electromagnetic beams.

possible approaches to decreasing losses in optical metamaterials 503

fig. 2 an example of 3d structuring of optical metamaterials (spheres with a given value of complex refractive index within a host with a different refractive index).

to obtain extreme concentrations, the typical approach is to utilize electromagnetic resonances in metal-dielectric nanocomposites, thus ensuring field localization at the metal-dielectric interface (the already mentioned surface plasmon polaritons, spp) [23]. the use of metals means high absorption losses, thus short propagation paths and generally poor figures of merit. this is a major obstacle to the more widespread application of transformation optics. a host of practical applications would benefit from low-loss plasmonics and nanophotonics.
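the link between absorption in the metal and short propagation paths can be illustrated numerically. the sketch below is not taken from this work: it assumes the textbook dispersion relation of a surface plasmon polariton on a single flat interface between a semi-infinite metal and a semi-infinite dielectric, k_spp = k0·sqrt(εm·εd/(εm + εd)), and the usual 1/e intensity propagation length l = 1/(2·im k_spp). the function names, the wavelength and the permittivity values are illustrative assumptions only.

```python
import cmath
import math


def spp_wavevector(eps_metal: complex, eps_dielectric: float, wavelength_m: float) -> complex:
    """Complex SPP wave vector (1/m) on a flat metal-dielectric interface.

    Assumes the textbook single-interface dispersion relation.
    """
    k0 = 2.0 * math.pi / wavelength_m  # free-space wave number
    return k0 * cmath.sqrt(eps_metal * eps_dielectric / (eps_metal + eps_dielectric))


def spp_propagation_length(eps_metal: complex, eps_dielectric: float, wavelength_m: float) -> float:
    """Distance (m) over which the SPP intensity decays to 1/e of its initial value."""
    k_spp = spp_wavevector(eps_metal, eps_dielectric, wavelength_m)
    return 1.0 / (2.0 * k_spp.imag)


if __name__ == "__main__":
    wavelength = 633e-9  # assumed illustrative wavelength
    # Assumed illustrative permittivities: same real part, different loss (imaginary part).
    for eps_m in (-18.0 + 0.5j, -18.0 + 1.5j):
        length_um = spp_propagation_length(eps_m, 1.0, wavelength) * 1e6
        print(f"eps_metal = {eps_m}: propagation length = {length_um:.1f} um")
```

under these assumptions a larger imaginary part of the metal permittivity gives a shorter propagation length, while the real part of k_spp stays above k0, i.e. the mode remains bound to the interface.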
in this work we consider strategies for nanostructuring of artificial optical composites to decrease or eliminate absorption. some approaches include:
• low metal volume fraction: utilize structures with a smaller relative amount of metal, a generalization of the old concept of artificial dielectrics.
• alternative plasmonic materials: use plasmonic materials with lower losses compared to pure metals.
• all-dielectric meta-optics: completely avoid the use of metals and limit the design to pure dielectrics and possibly low-loss semiconductors.
• optimizing frequency dispersion: design metal-dielectric or even metal-metal nanocomposites with a frequency dispersion specifically tailored to obtain lower losses.
in the next sections we consider each of the quoted strategies.

2. low metal volume fraction

an obvious approach to decrease the absorption losses in metal-dielectric nanocomposites is to decrease their metal-to-dielectric volume fraction. basically, this idea leans heavily on the concept of artificial dielectrics, used in microwave engineering as early as the 1950s [24, 25]. here we have a simple rule of thumb, which also follows common sense: as the metal volume fraction decreases, a smaller part of the wave propagates through the absorptive medium, thus losses become lower. on the other hand, field localization also tends to become weaker and the electromagnetic field spreads over a larger volume, see fig. 3. therefore, a trade-off between the two is needed.

fig. 3 electromagnetic field distribution (light color) around metal/plasmonic material (dark color).

figure 4 shows some examples of low metal fraction metamaterials. the top left structure is a one-dimensional plasmonic crystal [26] (metal-dielectric multilayer) with metal sheets much thinner than their dielectric counterparts. the top right structure in fig. 4 is the simplest 1d plasmonic crystal, a freestanding metallic/plasmonic material membrane with nanometer thickness. the dielectric part of this plasmonic crystal is the surrounding ambient, which ensures a perfect electromagnetic symmetry of the structure: the dielectric above and below is identical.

as mentioned in the description of fig. 4, the plasmonic nanomembrane [27, 28] is a typical example of a 1d plasmonic structure with a minuscule volume fraction of metal. it can be defined as a freestanding metallic structure (or, more generally, a structure made of a material containing free electron plasma) with an extremely high aspect ratio: the lateral dimensions can be even several million times larger than the thickness, which can be of the order of tens of nanometers or even less. the thinner the nanomembranes are, the longer are the propagation paths of spp. this makes them an ideal platform for long-range surface plasmon polaritons [29, 30].

fig. 4 examples of plasmonic crystal-based metamaterials with low volume fraction of metal constituent. top left: conventional metal-dielectric multilayer (1d plasmonic crystal, pc); top right: freestanding metal nanomembrane as the simplest 1d pc; bottom right: wire medium (metal wires within a dielectric host, 2d pc) and bottom left: metal/plasmonic particles within dielectric host (3d pc).

propagation of spp on separate interfaces of a membrane is shown in fig. 5. when the interfaces are close enough to each other (sufficiently thin membrane), separate spp modes couple across the plasmonic membrane. if the membrane becomes thinner still, the two spp modes merge into one.

fig. 5 coupled spp on membranes in imi (insulator-metal-insulator) configuration.

3. alternative plasmonic materials

plasmonic effects have usually been obtained using good metals like gold and silver. at the same time, these materials have very strong absorption losses. in recent years, however, the focus of attention has shifted to alternative plasmonic materials [31, 32], which also have free electron plasma but offer various advantages such as tailorability of their electromagnetic response and lower absorption losses.

probably the most interesting group of alternative plasmonic materials are optically transparent, electrically conductive oxides (tco) [33]. heavy doping ensures an increase of the electron concentration in these materials, thus contributing to an improvement of their plasmonic properties and at the same time ensuring tailoring of their spectral characteristics, i.e. shifting them to the near-infrared part of the spectrum. examples of tco include ito (indium-tin-oxide), azo (aluminum-zinc-oxide) and gzo (gallium-zinc-oxide). recently, however, it has been noted that the quoted properties come at a price and that the tco figure of merit (the ratio of the real to the imaginary part of the refractive index) reaches rather poor values, even worse than in the noble metals they are intended to replace [7].

other alternative plasmonic media include metallic alloys. they are tunable by design, simply by adjusting the alloy composition. such tailoring could shift the peak losses to another frequency, possibly outside the operating range. this group of materials includes noble-transition alloys (e.g. au-cd), alkali-noble inter-metallic compounds (e.g. li2agin, kau) and intermetallics (e.g. ag3sn, cu3sn). in spite of the tailorability of the composition of metal alloys and thus of their frequency dispersion, the problem of their excessively high absorption at optical wavelengths still remains, only mitigated to a minuscule degree.

graphene is a weakly corrugated sub-nanometer honeycomb lattice of carbon.
its relatively low optical absorption in the visible and infrared (of the order of 3%), connected with the existence of quasi-2d free electron plasma, makes it a convenient candidate for plasmonics [34]. its properties are easily tailored by doping and gating. a combination of graphene with noble metal nanoparticles has been proposed as a platform for tunable spp [35]. graphene plasmonics represents a field of its own and vastly surpasses the scope of this article.

finally, an alternative plasmonic platform is offered by highly doped semiconductors, e.g. gaas, gap, sic, gan, which also have free electron plasma. their use in active and ultrafast plasmonics has been considered [36]. however, absorption losses in the visible are again a hurdle towards more widespread use.

4. all-dielectric meta-optics

the idea with all-dielectric metamaterials is to avoid absorption by completely removing lossy parts [37-39]. the price to pay is a lower degree of design freedom (it is far easier to localize the em field using metal-dielectric interfaces). no metals or similar materials with free electron plasma are used: the material can be a pure dielectric or possibly a semiconductor (simultaneously ensuring high refractive index and low losses). it is necessary to reach a high refractive index contrast between the scatterers and the embedding host. mie resonance theory [40] is applied (the exact solution of the classical electromagnetic diffraction problem): both scatterer size and morphology/shape are important. extreme field concentration is obtained through the creation of hotspots (nonlocalities) at a deep subwavelength level due to edge effects at sharp angles. to achieve this the shapes of the scatterers are modified. some 3d shapes of nanoparticles are presented in fig. 6; this is only a very small number of examples among a vast variety of the existing forms.
deep subwavelength hotspots cause the effective medium approximation (ema) to break down (conventional ema theory no longer remains valid).

fig. 6 illustration of nanoparticles with different shapes.

since resonances are shape-dependent, a wealth of new modes appears. morphology-dependent em behavior includes both electric and magnetic dipole resonances and higher-order multipole resonances. in addition, the relative positions of nanoparticles are important because the magnetic or electric field tends to concentrate between them (nanoparticle dimers), again causing the appearance of magnetic or electric hotspots.

as an illustration, the scattering properties of a single all-dielectric cylinder on a substrate are shown in figs. 7–8. the nanocylinder is placed on a low-index substrate (n=2), is surrounded by air and is built of a high-index material (n=8). the cylinder radius is 50 nm and its height is 40 nm. the scattering properties of all-dielectric nano-cones are shown in figs. 9–12. the cones are deposited on a low-index substrate (n=1.5), are surrounded by air and consist of a high-index material (n=4). the cone base radius is 75 nm and the height is 100 nm.

fig. 7 radiation pattern of far field scattered from a dielectric cylinder h=40 nm, r=50 nm, n=8 on a substrate n=2. the incident beam λ=560 nm arrives from above, along the cylinder axis.

fig. 8 spatial distributions of field intensity; electric field (top row) and magnetic field (bottom row) for a dielectric cylinder h=40 nm, r=50 nm, n=8 on a substrate n=2 at λ=560 nm.

fig. 9 radiation pattern of far field scattered from a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5. the incident beam λ=300 nm arrives from above, along the axis of the cone.

fig. 10 spatial distributions of field intensity; electric field (top row) and magnetic field (bottom row) for a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5 at λ=300 nm.

fig. 11 radiation pattern of far field scattered from a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5. the incident beam λ=380 nm arrives from above, along the axis of the cone.

fig. 12 spatial distributions of field intensity; electric field (top row) and magnetic field (bottom row) for a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5 at λ=380 nm.

when deposited on an interface between two materials with different refractive indices, a single dielectric particle exhibits high directivity in its radiation pattern in favor of the material with the higher refractive index, i.e. the substrate, the same as with metallic particles [41, 42]. unlike in metals, electric field localizations also occur within the particle, but they are still tied to the edges of the particle and have a much lower efficiency in comparison to metals. however, unlike metals, dielectric particles exhibit high magnetic field localizations within the particles, with spatial distributions almost complementary to those of the electric fields.

figure 13 shows the response of purely dielectric nanodimers (paired nanocylinders, top view) [38]. for the field directions as in fig. 13 the dimers exhibit field hotspots in the gap between the nanoparticles: for the electric field directed along the axis of the dimer a hotspot of the electric field appears, and for the magnetic field along the same axis a magnetic hotspot appears. in the case when the nanoparticles are metallic, an identical situation is encountered if the layout is as shown in fig. 13a. however, metallic dimers behave oppositely to the dielectric ones when the configuration shown in fig. 13b is used, and contrary to the all-dielectric case no magnetic hotspot appears at all.
fig. 13 electric and magnetic hotspots in purely dielectric dimers. + and – signs describe polarization of molecules within dimers. the vector of electric polarization within separate nanoparticles has the same direction as the electric field, while magnetic polarization follows the magnetic field.

a square array of cylindrical high refractive index dielectric resonators is shown in fig. 14. such a structure is denoted as a dielectric huygens metasurface [43]. a metasurface can be defined as a quasi-2d structure with a subwavelength thickness containing metamaterial “atoms” in its plane, which themselves have subwavelength dimensions. a huygens metasurface can behave as a reflectionless plane, i.e. an array of huygens sources which do not have a backward component of scattering. the metamaterial “atoms” in this case are high-index cylinders arranged in a plane.

fig. 14 square array of subwavelength high refractive index cylinders embedded in a low-index host.

5. optimizing frequency dispersion

metal-dielectric nanocomposites exhibit very complex photonic behavior even in the case of the simplest structures. an interplay between bragg and plasmon-polariton interface phenomena generates a plethora of various electromagnetic modes. as an illustration, fig. 15 shows the frequency dispersion of a simple one-dimensional metal-dielectric multilayer with only 3 metal-dielectric pairs. the structure is deposited on a dielectric and surrounded by air or vacuum. even such a basic plasmonic structure shows a surprising wealth of modes. besides a bandgap one can observe different plasmonic modes (to the right of the light line), including those modes that exist within the bandgap and cross into the bands, as well as the negative group velocity modes.
obviously, if we meet such a complex situation for a simple 1d plasmonic crystal, then with increasing structural complexity (making nanoplasmonic structures in 2d and 3d) it should be possible to customize the frequency dispersion of the obtained artificial materials to arrive at almost any desired group velocity in a given frequency/wave vector range. the idea is to adjust the parameters to minimize losses and maximize the figure of merit (fom) for a given frequency range.

fig. 15 frequency dispersion of a simple structure with three metal-dielectric layer pairs, with vacuum (n=1) on the top side and a dielectric substrate n=2.89.

we now give an example of plasmonic metamaterials that can have strongly decreased losses. it is an old/new paradigm: materials with hyperbolic dispersion [44] (hm, hyperbolic metamaterials). the effect was first demonstrated in 1969, but only relatively recently attracted attention within the field of metamaterials. hm metallodielectric structures became a target of intensive research, among other reasons, because their absorption losses can be strongly reduced.

hm is a metamaterial designed to exhibit extreme optical anisotropy, with opposite signs of dielectric permittivity in two orthogonal directions:

$$\varepsilon_n \cdot \varepsilon_\tau < 0 \qquad (1)$$

$$\varepsilon_\tau = \varepsilon_x = \varepsilon_y, \quad \varepsilon_n = \varepsilon_z \qquad (2)$$

the hyperbolic dispersion for extraordinary waves is

$$\frac{k_\tau^2}{\varepsilon_n} + \frac{k_n^2}{\varepsilon_\tau} = k_0^2 \qquad (3)$$

$$k_0^2 = \frac{\omega^2}{c^2}, \quad k_\tau^2 = k_x^2 + k_y^2, \quad k_n = k_z \qquad (4)$$

hm are much easier to fabricate than the well-known double-negative media (artificial composites that simultaneously have their effective permeability and permittivity below zero, i.e. negative refractive index metamaterials). a visual presentation of the topological transformations in k-space that lead to hyperbolic dispersion is shown in fig. 16.

fig. 16 topological transitions in k-space: isofrequency surfaces for extraordinary waves in hyperbolic metamaterials.

various implementations of hyperbolic metamaterials are illustrated in fig. 17.
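conditions (1)–(4) can be checked numerically for a metal-dielectric multilayer. the sketch below is an illustration under stated assumptions, not the method used in this work: it assumes the standard quasi-static effective medium formulas for layers much thinner than the wavelength, ετ = f·εm + (1 − f)·εd and 1/εn = f/εm + (1 − f)/εd, where f is the metal volume fraction. the function names and the lossless permittivity values are hypothetical and chosen only for clarity.

```python
def effective_permittivities(eps_metal, eps_dielectric, metal_fraction):
    """Quasi-static effective permittivities of a thin-layer multilayer (assumed model).

    Returns (eps_tau, eps_n): in-plane and normal components, as in Eq. (2).
    """
    f = metal_fraction
    eps_tau = f * eps_metal + (1.0 - f) * eps_dielectric
    eps_n = 1.0 / (f / eps_metal + (1.0 - f) / eps_dielectric)
    return eps_tau, eps_n


def is_hyperbolic(eps_tau, eps_n):
    """Condition (1): the real parts of the two components have opposite signs."""
    return complex(eps_tau).real * complex(eps_n).real < 0.0


def kn_squared(eps_tau, eps_n, k_tau, k0):
    """Eq. (3) solved for k_n^2 at a given in-plane wave number k_tau."""
    return eps_tau * (k0 ** 2 - k_tau ** 2 / eps_n)


if __name__ == "__main__":
    eps_m, eps_d = -18.0, 2.25  # assumed lossless illustrative values
    for f in (0.05, 0.30):
        eps_tau, eps_n = effective_permittivities(eps_m, eps_d, f)
        print(f"f = {f}: eps_tau = {eps_tau:.3f}, eps_n = {eps_n:.3f}, "
              f"hyperbolic = {is_hyperbolic(eps_tau, eps_n)}")
    # In the hyperbolic case, waves with large in-plane wave number propagate (k_n^2 > 0).
    eps_tau, eps_n = effective_permittivities(eps_m, eps_d, 0.30)
    print("k_n^2 at k_tau = 5 k0:", kn_squared(eps_tau, eps_n, 5.0, 1.0))
```

with these assumed values the multilayer is hyperbolic at f = 0.30 but not at f = 0.05, which illustrates that reducing the metal fraction to lower the losses eventually removes the hyperbolic regime at a given frequency.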
the simplest one is obviously a metal-dielectric multilayer whose isofrequency surfaces are hyperbolic. other examples include multilayer fishnet metamaterials and their complementary structures, pillars made from alternating metal and dielectric layers. finally, the most complex design presented in fig. 17 is a sculpted metal-dielectric film, the superlens design.

fig. 17 examples of hyperbolic metamaterials

another type of structure that can be used to tailor optical absorption losses is the plasmonic hypercrystal. hypercrystals can be defined as a periodic combination of a hyperbolic medium with another medium (metal, dielectric, metamaterial...) [45]. a general design of a hypercrystal is shown in fig. 18.

fig. 18 general design of a hypercrystal

the dispersion of hyperbolic materials does not impose a diffraction limit for tm waves (system unlimited by frequency). bragg reflection in a hyperbolic photonic crystal (d~λ0) leads to the appearance of optical tamm surface states [7]. in hypercrystals the formation of pbgs (photonic bandgaps) persists in the subwavelength mode (metamaterial regime, d<<λ0). optical tamm states in hypercrystals lead to high em confinement (larger wave numbers) and simultaneously to lower absorption losses compared to surface plasmon polaritons. such behavior does not occur either in conventional pbg structures or in metamaterials.

figures 19 and 20 show the frequency dispersion of the extinction coefficient im(kn)d (absorption) in a hypercrystal in dependence on the normalized in-plane momentum (k∥/k0), for varying τ=1/γ in the lossy drude model (fig. 19) and for varying metal fraction in the hyperbolic part (fig. 20).

fig. 19 absorption in a hypercrystal for varying τ=1/γ in lossy drude model.

fig. 20 absorption in a hypercrystal for varying metal fraction in hyperbolic part.

6. conclusion

reaching low-loss or lossless nanophotonics and plasmonics is a holy grail of electromagnetics, photonics and transformation optics. strategies include the use of alternative plasmonic materials, all-dielectric nanocomposites and optimization of dispersion and structure toward lower losses. each of them holds its own promises and pitfalls. a winning combination is not (yet) known, but may include a combination of two or more of the above. a host of practical applications would benefit, in particular transformation optics (including superlenses and hyperlenses, cloaking devices, superconcentrators, superabsorbers...). elimination of losses is a crucial step toward merging electronics and photonics into a super-compact, super-fast new generation of integrated circuitry.

acknowledgement: the paper is a part of the research funded by the serbian ministry of education and science within the projects tr32008, iii45016 and on171005, as well as by the qatar national research fund within the projects nprp 8-028-1-001 and nprp 7-665-1-125.

references
[1] u. leonhardt, “optical conformal mapping,” science, vol. 312, no. 5781, pp. 1777-1780, 2006.
[2] y. liu, t. zentgraf, g. bartal, and x. zhang, “transformational plasmon optics,” nano lett., vol. 10, no. 6, pp. 1991-1997, 2010.
[3] j. b. pendry, d. schurig, and d. r. smith, “controlling electromagnetic fields,” science, vol. 312, no. 5781, pp. 1780-1782, 2006.
[4] w. cai, and v. shalaev, optical metamaterials: fundamentals and applications, springer, dordrecht, the netherlands, 2009.
[5] s. a. ramakrishna, and t. m. grzegorczyk, physics and applications of negative refractive index materials, spie press bellingham, wa & crc press, taylor & francis group, boca raton fl, 2009.
[6] m. i. dyakonov, “new type of electromagnetic wave propagating at an interface,” sov. phys. jetp, vol. 67, pp. 714-716, 1988.
[7] g. isić, s. vuković, z. jakšić, and m.
belić, “tamm plasmon modes on semi-infinite metallodielectric superlattices,” scientific reports, vol. 7, no. 1, pp. 3746, 2017.
[8] j. a. polo jr, and a. lakhtakia, “surface electromagnetic waves: a review,” laser and photonics reviews, vol. 5, no. 2, pp. 234-246, 2011.
[9] j. yang, m. huang, c. yang, z. xiao, and j. peng, “metamaterial electromagnetic concentrators with arbitrary geometries,” opt. express, vol. 17, no. 22, pp. 19656-19661, 2009.
[10] d. s. wiersma, p. bartolini, a. lagendijk, and r. righini, “localization of light in a disordered medium,” nature, vol. 390, no. 6661, pp. 671-673, 1997.
[11] n. i. landy, s. sajuyigbe, j. j. mock, d. r. smith, and w. j. padilla, “perfect metamaterial absorber,” phys. rev. lett., vol. 100, no. 20, 2008.
[12] n. liu, m. mesch, t. weiss, m. hentschel, and h. giessen, “infrared perfect absorber and its application as plasmonic sensor,” nano lett., vol. 10, no. 7, pp. 2342-2348, 2010.
[13] j. ng, h. chen, and c. t. chan, “metamaterial frequency-selective superabsorber,” opt. lett., vol. 34, no. 5, pp. 644-646, 2009.
[14] n. fang, h. lee, c. sun, and x. zhang, “sub-diffraction-limited optical imaging with a silver superlens,” science, vol. 308, no. 5721, pp. 534-537, 2005.
[15] z. liu, s. durant, h. lee, y. pikus, n. fang, y. xiong, c. sun, and x. zhang, “far-field optical superlens,” nano lett., vol. 7, no. 2, pp. 403-408, 2007.
[16] j. b. pendry, and d. r. smith, “the quest for the superlens,” sci. am., vol. 295, no. 1, pp. 60-67, 2006.
[17] z. jacob, l. v. alekseyev, and e. narimanov, “optical hyperlens: far-field imaging beyond the diffraction limit,” opt. express, vol. 14, no. 18, pp. 8247-8256, 2006.
[18] z. liu, h. lee, y. xiong, c. sun, and x. zhang, “far-field optical hyperlens magnifying sub-diffraction-limited objects,” science, vol. 315, no. 5819, pp. 1686, 2007.
[19] e. e. narimanov, and v. m. shalaev, “optics: beyond diffraction,” nature, vol. 447, no. 7142, pp. 266-267, 2007.
[20] w. cai, u. k. chettiar, a. v. kildishev, and v. m. shalaev, “optical cloaking with metamaterials,” nature photonics, vol. 1, no. 4, pp. 224-227, 2007.
[21] t. ergin, n. stenger, p. brenner, j. b. pendry, and m. wegener, “three-dimensional invisibility cloak at optical wavelengths,” science, vol. 328, no. 5976, pp. 337-339, 2010.
[22] e. ozbay, “plasmonics: merging photonics and electronics at nanoscale dimensions,” science, vol. 311, no. 5758, pp. 189-193, 2006.
[23] s. a. maier, plasmonics: fundamentals and applications, springer science+business media, new york, ny, 2007.
[24] j. brown, “artificial dielectrics having refractive indices less than unity,” proc. ieee, vol. 100, no. 4, pp. 51-62, 1953.
[25] j. brown, “artificial dielectrics,” progress in dielectrics, j. b. birks, ed., pp. 193–225, hoboken, new jersey: wiley, 1960.
[26] s. m. vuković, z. jakšić, and j. matovic, “plasmon modes on laminated nanomembrane-based waveguides,” j. nanophotonics, vol. 4, pp. 041770, 2010.
[27] z. jakšić, and j. matovic, “functionalization of artificial freestanding composite nanomembranes,” materials, vol. 3, no. 1, pp. 165-200, 2010.
[28] c. jiang, s. markutsya, y. pikus, and v. v. tsukruk, “freely suspended nanocomposite membranes as highly sensitive sensors,” nature mater., vol. 3, no. 10, pp. 721-728, 2004.
[29] p. berini, “long-range surface plasmon polaritons,” adv. opt. photon., vol. 1, no. 3, pp. 484-588, 2009.
[30] p. berini, r. charbonneau, and n. lahoud, “long-range surface plasmons along membrane-supported metal stripes,” ieee j. sel. top. quant. electr., vol. 14, no. 6, pp. 1479-1495, 2008.
[31] a. boltasseva, and h. a. atwater, “low-loss plasmonic metamaterials,” science, vol. 331, no. 6015, pp. 290-291, 2011.
[32] p. r. west, s. ishii, g. v. naik, n. k. emani, v. shalaev, and a. boltasseva, “searching for better plasmonic materials,” laser & photon. rev, pp. 1-13, 2010.
[33] s. franzen, c. rhodes, m. cerruti, r. w. gerber, m. losego, j. p. maria, and d. e. aspnes, “plasmonic phenomena in indium tin oxide and ito-au hybrid films,” opt. lett., vol. 34, no. 18, pp. 2867-2869, 2009.
[34] z. fei, a. rodin, g. andreev, w. bao, a. mcleod, m. wagner, l. zhang, z. zhao, m. thiemens, and g. dominguez, “gate-tuning of graphene plasmons revealed by infrared nano-imaging,” nature, vol. 487, no. 7405, pp. 82, 2012.
[35] a. grigorenko, m. polini, and k. novoselov, “graphene plasmonics,” nature photonics, vol. 6, no. 11, pp. 749, 2012.
[36] j. m. luther, p. k. jain, t. ewers, and a. p. alivisatos, “localized surface plasmon resonances arising from free carriers in doped quantum dots,” nature mater., vol. 10, no. 5, pp. 361, 2011.
[37] s. jahani, and z. jacob, “all-dielectric metamaterials,” nature nanotech., vol. 11, no. 1, pp. 23-36, 2016.
[38] a. i. kuznetsov, a. e. miroshnichenko, m. l. brongersma, y. s. kivshar, and b. luk’yanchuk, “optically resonant dielectric nanostructures,” science, vol. 354, no. 6314, 2016.
[39] p. spinelli, m. a. verschuuren, and a. polman, “broadband omnidirectional antireflection coating based on subwavelength surface mie resonators,” nature comm., vol. 3, 2012.
[40] m. quinten, optical properties of nanoparticle systems: mie and beyond, wiley-vch, weinheim, germany, 2011.
[41] m. schmid, r. klenk, m. c. lux-steiner, m. topič, and j. krč, “modeling plasmonic scattering combined with thin-film optics,” nanotechnology, vol. 22, no. 2, pp. 025204.1-10, 2010.
[42] z. jakšić, m. obradov, s. vuković, and m. belić, “plasmonic enhancement of light trapping in photodetectors,” facta universitatis, series: electronics and energetics, vol. 27, no. 2, pp. 183-203, 2014.
[43] a. epstein, and g. v. eleftheriades, “huygens’ metasurfaces via the equivalence principle: design and applications,” josa b, vol. 33, no. 2, pp. a31-a50, 2016.
[44] a. poddubny, i. iorsh, p. belov, and y. kivshar, “hyperbolic metamaterials,” nature photonics, vol. 7, no. 12, pp.
948-957, 2013.
[45] e. e. narimanov, “photonic hypercrystals,” physical review x, vol. 4, no. 4, 2014.

facta universitatis series: electronics and energetics vol. 27, no. 4, december 2014, pp. 649 – 661

a double-differential-input / differential-output fully complementary and self-biased asynchronous cmos comparator

vladimir milovanović and horst zimmermann

institute of electrodynamics, microwave and circuit engineering, faculty of electrical engineering and information technology, vienna university of technology (tu wien), gußhausstraße 27, a-1040 wien, austria

abstract: a novel fully complementary and fully differential asynchronous cmos comparator architecture that consists of a two-stage preamplifier cascaded with a latch achieves a sub-100 ps propagation delay for input signal amplitudes of 50 mvpp and higher under 1.1 v supply and 2.1 mw power consumption. the proposed voltage comparator topology features two differential pairs of inputs (four in total), thus increasing signal-to-noise ratio (snr) and noise immunity through rejection of the coupled noise components, reduced even-order harmonic distortion, and doubled output voltage swing. in addition, the comparator is truly self-biased via a negative feedback loop, thereby eliminating the need for a voltage reference and suppressing the influence of process, supply voltage and ambient temperature variations. the described analog comparator prototype occupies 0.001 mm² in a purely digital 40 nm lp (low power) cmos process technology. all the above-mentioned merits make it highly attractive for use as a building block in the implementation of leading-edge system-on-chip (soc) data transceivers and data converters.

keywords: comparator, preamplifier, latch, cmos, fully-differential, pvt variations, noise immunity, self-biasing, data converters, adc, transceivers.
manuscript received august 9, 2014; received in revised form october 9, 2014. doi: 10.2298/fuee1404649m
∗ an earlier version of this manuscript received the best oral paper award at the 29th international conference on microelectronics (miel 2014), belgrade, 12-14 may, 2014 [1].
corresponding author: vladimir milovanović, institute of electrodynamics, microwave and circuit engineering (emce), vienna university of technology (tu wien), gußhausstraße 25-29/e354, a-1040 vienna, österreich (e-mail: vladimir.milovanovic@tuwien.ac.at)

1 introduction

after amplifiers, comparators are perhaps the second most widely used analog electronic component. analog comparators can be used to determine whether one input value is higher or lower than the other at specific time points (predefined by the clock signal), or to perform the comparison in an asynchronous manner, that is, to detect the time point at which the difference of the two input signals changes its sign.
these two comparator types are usually classified as dynamic (clocked) and asynchronous (open-loop) comparators, respectively. further, the compared signal may be any analog physical (i.e., electrical) quantity, such as current or voltage, but also charge or even time. this paper contributes to the field of asynchronous (non-clocked) analog voltage comparators.

both asynchronous/open-loop [2] and dynamic/synchronous [3] comparator types are in widespread use in switched-mode power supplies as well as in present-day data conversion [4] and/or transmission circuits [5]. after all, a comparator is nothing but a single-bit analog-to-digital converter (adc). comparators are often the critical design components since, for example, a data converter’s bandwidth and maximum (over-)sampling rate directly depend on the comparator’s propagation delay. moreover, an analog-to-digital converter’s resolution, expressed in terms of signal-to-noise and distortion ratio or effective number of bits, is largely influenced by the comparator’s noise figure and its input-referred noise. finally, on the one hand, comparators should be high speed and low noise, while on the other, for use in battery-powered applications, they should consume as little power as possible.

the basic idea behind high-speed analog voltage comparators is to combine the best aspects of a preamplifier, which has a negative exponential step response, with a latch that exhibits a positive exponential rise. the preamplifier is used to build up the input voltage difference to a certain point where the latch takes over and brings the signal to the rail. both clocked and non-clocked comparators can exploit these speed-up principles. a block-level representation of a high-speed asynchronous comparator consisting of a preamplifier-latch cascade is given in fig. 1.

fig. 1. fully differential asynchronous voltage comparator that exploits a preamplifier-latch cascade to achieve fast decision making and thereby high operating speeds.

it is advantageous for high-speed asynchronous voltage comparators to utilize fully differential signaling, as it brings increased noise immunity through rejection of the coupled noise components, reduced even-order harmonic distortion, and doubled output voltage swing. besides using a differential output as in fig. 1, further noise performance benefits can be obtained from the comparator version of fig. 2, which features a preamplifier stage with two pairs of differential inputs (four in total).

fig. 2. fully differential high-speed preamplifier-latch asynchronous voltage comparator that features two pairs of differential inputs (four in total) on the preamplifier.

this article presents a high-speed asynchronous cmos voltage comparator implementation which exploits two differential pairs of inputs and is suitable for incorporation in cutting-edge systems on chip (socs).

2 four-input asynchronous comparator topology

transistor-level and block-level schematics of the proposed complementary and fully differential self-biased asynchronous cmos voltage comparator that features two pairs of inputs are shown in fig. 3 and fig. 4, respectively. the comparator comprises three fully differential self-biased cmos voltage amplifiers that share an identical circuit topology, and a cmos latch.
inputs of two amplifiers (four in total) at the same time act as the comparator inputs, while the biasing nodes and respective outputs of these two amplifiers are connected in parallel, thus constituting the first preamplifier stage. the third amplifier is cascaded to the outputs of the first two, hence effectively forming the preamplifier’s second stage. the amplifiers constituting the first preamplifying stage are mutually identical (corresponding transistor sizes of both are matched), but differ from the one serving as the second preamplifying stage (meaning that its transistor sizes are optimized independently). finally, the preamplifier is cascaded with a simple latch whose outputs are at the same time the comparator outputs.

fig. 3. transistor-level schematic of the proposed self-biased asynchronous cmos analog voltage comparator which features two pairs of differential inputs and differential output.

fig. 4. block-level schematic of the proposed self-biased asynchronous analog voltage comparator which features two pairs of differential inputs and differential output of fig. 3.

inputs of each of the three fully differential self-biased inverter-based cmos amplifiers [5, 6] are amplified through the push-pull inverters consisting of transistors $n^{xx}_{iout}$ and $p^{xx}_{iout}$, thus rendering the outputs of that particular amplifier. the cmos inverters at the inputs bring inherent advantages such as very high input impedance and nominally doubled transconductance. the biasing of each stage is accomplished through complementary transistor pairs $n^{xx}_{bias}$ and $p^{xx}_{bias}$, which are controlled by $v_{bias}$ and operate deep within the linear region. this potential is in turn stabilized through the negative feedback loop utilizing $n^{xx}_{ibias}$ and $p^{xx}_{ibias}$. namely, any variation in processing parameters or operating conditions (change of supply voltage or ambient temperature) that shifts $v_{bias}$ from its nominal value is instantly attenuated [7] to an extent proportional to the value of the loop gain. as the biasing transistors operate in the triode region, potentials $v_{down}$ and $v_{up}$ are very close to the negative and the positive supply rail, respectively. in such a configuration, self-biasing does not compromise the output voltage swing, which is nearly equal to the difference between the values of the two supply rails.
resistors $r'$ and $r''$ serve to avoid the establishment of low-resistance paths through the $v'_{bias}$ and $v''_{bias}$ nodes, respectively, for high (by absolute value) input voltage differences. placed in the biasing part, the resistors have no impact on comparator performance except that they drastically reduce dissipation while mutually distant potentials are applied as comparator inputs. a problem of the same kind also occurs in the path through the $v^{\prime +}_{out}$ and $v^{\prime -}_{out}$ nodes, but it cannot be avoided using the resistor trick; instead, these metal lines must be made thicker in order to sustain higher current values.

as already stated, the output of the last preamplifier stage is connected to the input of the latch stage. the latch itself is implemented as a cross-coupled connection of two cmos inverters (composed of transistors $n^{x}_{latch}$ and $p^{x}_{latch}$). the coupling between the preamplifier’s output and the latch is done through inverters consisting of transistors $n^{x}_{inv}$ and $p^{x}_{inv}$. without transistors $n^{x}_{rail}$ and $p^{x}_{rail}$, the coupling inverters would have to be large/strong enough to pull the latch out of the positive feedback saturation, but still small/weak enough not to firmly dictate the output voltage (because having a latch would in that case be pointless). connecting these four field-effect transistors to the supply rails relaxes the last requirement and consequently increases the design’s reliability and robustness.

besides being fully complementary, the proposed asynchronous voltage comparator circuit with two pairs of inputs is also perfectly symmetrical with respect to the vertical and the horizontal axis in fig. 3 and fig. 4, respectively. this is the reason why the biasing transistors on each preamplifier stage are drawn separately.
symmetry has beneficial repercussions on the process of laying the circuit out, as one can naturally match paired devices and the propagation delay through separate circuit blocks.

3 circuit analysis of the comparator architecture

analysis of the proposed comparator topology can be accomplished by analyzing two of its subcomponents, namely the preamplifier and the latch.

fig. 5. combination of the preamplifier negative exponential step response (dashed line) with the positive exponential initial-condition time response of the latch (dash-dotted line). at the optimum point $(t_x, v_x)$, which is at the same time the preamplifier-latch takeover point, the first derivatives of the two curves are the same. this minimizes the preamplifier-latch cascade propagation delay $t_{total} = t_{preamplifier} + t_{latch}$ and makes the combined output signal quicker, which implies fast decision making of the proposed asynchronous comparator.

3.1 preamplifier

if the voltage drops across the biasing transistors are neglected, that is, if $v_{down}$ and $v_{up}$ are approximately at the supply rails, then the small-signal differential gain of the comparator’s preamplifier is just equal to the transfer function of the push-pull inverter and hence it can be written as

\[
\frac{v^{\prime\prime +}_{out} - v^{\prime\prime -}_{out}}{\left(v^{+}_{in1} - v^{-}_{in1}\right) - \left(v^{+}_{in2} - v^{-}_{in2}\right)}(s) = h_{preamplifier}(s) = \frac{r'_o r''_o \left(s - g'_m/c'_{gd}\right)\left(s - g''_m/c''_{gd}\right)}{r'_o r''_o \zeta s^2 + \left[ r'_o \left( c'_{gd} + c''_{gd}\left(1 + g''_m r''_o\right) + c_{i2o1} \right) + r''_o \left( c''_{gd} + c_l \right) \right] s + 1}, \tag{1}
\]
where $g'_m = g'_{mn} + g'_{mp}$ and $g''_m = g''_{mn} + g''_{mp}$ are the total transconductances of the inverters of the first and the second preamplifier stage, respectively, $r'_o$ and $r''_o$ are the total resistances seen at the output of the first and the second preamplifier stage, and $c'_{gd} = c'_{gdn} + c'_{gdp}$ and $c''_{gd} = c''_{gdn} + c''_{gdp}$ are the sums of the gate-drain capacitances of the nmos and pmos transistors of the first and the second preamplifier stage, respectively. for simplicity, $\zeta = c_l \left( c'_{gd} + c''_{gd} + c_{i2o1} \right) + c''_{gd} \left( c'_{gd} + c_{i2o1} \right)$ is introduced, while $c_{i2o1}$ is the total capacitance at the output of the first and the input of the second preamplifier stage and $c_l$ is the total load capacitance at the output of the preamplifier, i.e., at the input of the latch.

it may be observed that the transfer function (1), in which $s = \sigma + i\omega$ is the complex angular frequency, is of the second order with two real poles in the left complex half-plane. it also possesses two real high-frequency zeroes in the right complex half-plane at frequencies $z_1 = g'_m/c'_{gd}$ and $z_2 = g''_m/c''_{gd}$.

the step response of the preamplifier can be predicted based on its transfer function. if the effect of the two high-frequency zeroes, $z_1$ and $z_2$, is neglected and the dominant pole approximation is applied, the system’s step response may be written as

\[
v^{\prime\prime +}_{out}(t) - v^{\prime\prime -}_{out}(t) = \mathcal{L}^{-1}\left\{ h_{preamplifier}(s)/s \right\} \approx g_{preamplifier} \left[ \left(v^{+}_{in1} - v^{-}_{in1}\right) - \left(v^{+}_{in2} - v^{-}_{in2}\right) \right] \left[ 1 - \kappa \exp\left(-t/\tau_a\right) \right] u(t), \tag{2}
\]

where $g_{preamplifier}$ and $\tau_a$ are the preamplifier low-frequency gain and time constant (the latter being inversely proportional to the value of the dominant pole), $\kappa$ is a constant dependent on the coefficients of the polynomial in the transfer function denominator, while $u(t)$ and $\mathcal{L}^{-1}$ represent the heaviside step function and the inverse laplace transform operator, respectively.
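as a rough numerical illustration of transfer function (1), the sketch below computes the two poles, the two zeroes and the dominant-pole time constant from the denominator coefficients. all element values are hypothetical and chosen only for illustration (they are not taken from the paper), the function name is an assumption, and $\tau_a$ is simply taken as the reciprocal of the dominant pole magnitude.

```python
import math


def preamp_poles_zeros(gm1, gm2, ro1, ro2, cgd1, cgd2, ci2o1, cl):
    """Poles, zeroes (rad/s) and dominant time constant (s) of eq. (1).

    Index 1 / 2 stands for the first / second preamplifier stage
    (single / double prime in the text).
    """
    # zeta as introduced below eq. (1)
    zeta = cl * (cgd1 + cgd2 + ci2o1) + cgd2 * (cgd1 + ci2o1)
    # denominator of eq. (1): a2*s^2 + a1*s + 1
    a2 = ro1 * ro2 * zeta
    a1 = ro1 * (cgd1 + cgd2 * (1 + gm2 * ro2) + ci2o1) + ro2 * (cgd2 + cl)
    disc = a1 * a1 - 4 * a2
    if disc < 0:
        raise ValueError("poles are complex for these element values")
    root = math.sqrt(disc)
    p_dominant = (-a1 + root) / (2 * a2)  # pole closest to the origin
    p_fast = (-a1 - root) / (2 * a2)
    zeros = (gm1 / cgd1, gm2 / cgd2)      # right half-plane zeroes z1, z2
    tau_a = -1.0 / p_dominant             # dominant-pole time constant
    return (p_dominant, p_fast), zeros, tau_a


# hypothetical element values (illustrative only, not from the paper)
poles, zeros, tau_a = preamp_poles_zeros(
    gm1=2e-3, gm2=4e-3, ro1=2e3, ro2=1e3,
    cgd1=2e-15, cgd2=4e-15, ci2o1=10e-15, cl=8e-15,
)
print("poles [rad/s]:", poles)
print("zeroes [rad/s]:", zeros)
print("tau_a [ps]:", tau_a * 1e12)
```

for these example values both poles come out real and negative while both zeroes are positive and lie far above the dominant pole, which is the situation assumed by the dominant pole approximation in (2).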
3.2 latch

if the initial voltage applied to the latch output nodes (through the preamplifier-latch coupling inverters) at a specified time point $t'$ is $v^{+}_{out}(t') - v^{-}_{out}(t')$, then the time response of the linearized latch approximation to this initial condition (for $t \ge t'$ and $\Delta t = t - t'$) has the form of an exponentially increasing [8] function of time $\Delta t$ and can be written as

\[
v^{+}_{out}(t) - v^{-}_{out}(t) = \exp\left(\Delta t/\tau_l\right) \left[ v^{+}_{out}(t') - v^{-}_{out}(t') \right]. \tag{3}
\]

the time constant of the portrayed cross-coupled cmos inverter latch is approximately equal to $\tau_l \approx c/g_{ml}$, where $c$ is the total capacitance seen at the output of the latch, i.e., of the comparator, while $g_{ml} = g_{mnl} + g_{mpl}$ is the total transconductance of the latch complementary transistor pair. note that this is the typical temporal response of positive-feedback systems which have a single or a dominant real pole in the right complex half-plane.

4 operating principles of the described comparator

as already stated in the introduction, the basic idea behind the presented comparator is to combine the best aspects of the preamplifier, which is characterized by the negative exponential step response (2), with the latch and its positive exponential response (3). the preamplifier builds up the voltage to a certain point where the latch takes over and brings the signal to a rail. these concepts are illustrated in fig. 5. in this figure, the preamplifier gain times the input voltage alone is not sufficient for the output to reach the rail. nevertheless, the preamplifier achieves a high enough output value to pull the latch out of one saturation state and trigger its positive feedback loop, which drives the comparator to the saturation state on the other supply rail, thus producing a firm logical level (high or low) at the output.
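as a worked illustration of the latch response (3), the time the latch needs to regenerate an initial difference $v_x$ up to a final differential level $v_{final}$ follows directly by inverting the exponential. this assumes that the linearized model holds all the way up to $v_{final}$, which is an idealization since the real latch saturates, and the numerical values are hypothetical.

```latex
\begin{align*}
v_{final} = \exp\left(\Delta t/\tau_l\right) v_x
  \;&\Longrightarrow\;
  \Delta t = \tau_l \ln\frac{v_{final}}{v_x}, \\
\tau_l = 10~\mathrm{ps},\quad v_x = 50~\mathrm{mV},\quad v_{final} = 1.1~\mathrm{V}
  \;&\Longrightarrow\;
  \Delta t = 10~\mathrm{ps} \cdot \ln 22 \approx 30.9~\mathrm{ps}.
\end{align*}
```

the logarithmic dependence shows why a larger takeover voltage shortens the latch phase only modestly, whereas a smaller $\tau_l$ shortens it proportionally.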
fig. 6. on-chip comparator structure with output buffers and the corresponding dummy comparator structure used for exact extraction of the comparator’s propagation delay. (annotations: all inputs are terminated with 50 Ω; a chain of inverters with $r_{on}$ = 50 Ω acts as output drivers capable of driving the pad capacitance and the 50 Ω measurement equipment; the dummy comparator is actually a shortcut followed by inverters identical to the ones that come after the comparator; delay(comparator) = delay(comparator + buffers) − delay(buffers).)

with the total propagation delay through the comparator being the sum of the propagation delays of its cascaded components, namely,

\[
t_{total} = t_{preamplifier} + t_{latch}, \tag{4}
\]

it is obvious that reducing the time constants of the separate comparator subcircuits ($\tau_a$ and $\tau_l$) is essential to increase its speed of operation. additionally, it can be proven that there exists an optimum preamplifier-latch takeover point $(t_x, v_x)$, located where the first derivatives of the preamplifier and the latch functions are equal. this is to be expected, and hence for high-speed applications the comparator should be optimized such that, of the two subcomponent functions, the one with the larger first derivative is used for the corresponding part of the characteristic. apart from acceleration, another role of the latch block is to align the comparator’s complementary output fall-time and rise-time edges.
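the optimum takeover condition can be sketched numerically under simplifying assumptions: the preamplifier follows (2) with $\kappa = 1$, the latch follows (3) from the takeover point onwards, and the decision is considered complete once the latch output reaches a level v_final. under these assumptions, equating the first derivatives of the two responses gives $v_x = v_a \tau_l/(\tau_a + \tau_l)$ and $t_x = \tau_a \ln(1 + \tau_l/\tau_a)$, where $v_a$ denotes the final preamplifier output level (gain times input difference). function names and numerical values are hypothetical.

```python
import math


def preamp_output(t, v_a, tau_a):
    """Preamplifier step response, eq. (2) with kappa = 1 (assumption)."""
    return v_a * (1.0 - math.exp(-t / tau_a))


def total_delay(t_x, v_a, tau_a, tau_l, v_final):
    """Eq. (4): preamplifier time up to t_x plus latch regeneration time.

    The latch time follows from eq. (3): it grows v_x = preamp_output(t_x)
    exponentially (time constant tau_l) until v_final is reached.
    """
    v_x = preamp_output(t_x, v_a, tau_a)
    return t_x + tau_l * math.log(v_final / v_x)


def optimum_takeover(v_a, tau_a, tau_l):
    """Takeover point where both responses have equal first derivatives."""
    v_x = v_a * tau_l / (tau_a + tau_l)
    t_x = tau_a * math.log(1.0 + tau_l / tau_a)
    return t_x, v_x


# hypothetical values (illustrative only): 0.4 V preamplifier final level,
# tau_a = 70 ps, tau_l = 10 ps, 1.1 V final latch level
V_A, TAU_A, TAU_L, V_FINAL = 0.4, 70e-12, 10e-12, 1.1
t_x, v_x = optimum_takeover(V_A, TAU_A, TAU_L)
print("takeover time [ps]:", t_x * 1e12)
print("takeover voltage [mV]:", v_x * 1e3)
print("total delay [ps]:", total_delay(t_x, V_A, TAU_A, TAU_L, V_FINAL) * 1e12)
```

evaluating total_delay slightly before and slightly after t_x gives a larger value in both cases, which is consistent with the equal-derivative point being the minimum of (4) in this simplified model.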
fig. 7. measured inputs and outputs of the on-chip structure containing the asynchronous voltage comparator featuring two pairs of differential inputs with output drivers, and of the corresponding on-chip dummy comparator structure containing the output drivers alone. (panels: buffers only out+ [v], buffers only in+ [v], comparator out+ [v], comparator in2+ [v], comparator in1+ [v]; horizontal axis: time elapsed after the fixed moment in time t [ns]; stimulus: pseudorandom binary sequence 2^31 − 1 at frequency f = 3.33 ghz; the difference between the two outputs is the comparator delay.)

5 on-chip measurement setup for propagation delay

the output of the latch, which is at the same time the comparator output, has a rail-to-rail swing and is hence designed to drive digital circuitry, which regularly features a relatively low input capacitance with respect to the pad capacitance. to measure the comparator characteristics in a realistic configuration, a chain of several inverters, which drive the pad capacitance and the 50 Ω measurement equipment, follows each of the comparator outputs as shown in fig. 6. both transistors in the last inverter are designed to have an on-resistance of $r_{on}$ = 50 Ω to avoid reflection, thus halving the output signal amplitude to $v_{dd}/2$. for the same reason, all four inputs have 50 Ω on-chip termination to ground.

to enable indirect delay measurement of the comparator, output drivers are also placed on chip on their own, as explained by fig. 6. special attention is paid so that the metal lines routed to and from the comparator (with the output drivers) and the output drivers alone are identical in every aspect. this enabled the use of identical printed circuit boards, identical coaxial cables and finally identical measurement equipment to drive and characterize both on-chip structures. thus, the delay of the comparator is obtained as the difference between the delay of the structure with the comparator plus output buffers and the delay of the dummy structure containing the buffers only. this subtraction eliminates the influence of coaxial cables, printed circuit board microstrip lines, on-chip metal lines, etc., which were identical for both measurements and therefore cancel out in the process of delay subtraction.

fig. 8. oscilloscope display showing an eye pattern for the two comparator outputs that are connected to channels 1 and 2. the frequency of the input pseudorandom sequence is 3.33 ghz.

additionally, the output drivers are optimized for small propagation delay variation, the standard deviation of which is σ(delay) < 5 ps based on one thousand monte-carlo simulations and a sample of ten relative on-chip measurements. also, the comparator and the buffers have separate supply pads (i.e., analog and digital, respectively) to enable power consumption measurement of the comparator alone.

measured inputs and outputs of the on-chip characterization structures depicted in fig. 6, driven by a pseudorandom binary sequence signal with a frequency of 3.33 ghz, are shown in fig. 7 in the form of an oscilloscope screenshot. it can be observed that the structure containing the buffers only is always driven with a rail-to-rail signal resembling the comparator outputs. the difference between the two outputs yields the comparator propagation delay.
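the subtractive delay extraction described above amounts to a single difference of two measured delays. a minimal sketch follows, assuming several repeated readings are averaged before subtraction (the averaging, the function name and all numbers are assumptions for illustration, not measured data from the paper).

```python
from statistics import mean


def comparator_delay(delays_with_comparator, delays_buffers_only):
    """delay(comparator) = delay(comparator + buffers) - delay(buffers).

    Both arguments are sequences of delay readings in seconds; cables,
    board traces and on-chip lines are assumed identical for both
    structures, so their contribution cancels in the difference.
    """
    return mean(delays_with_comparator) - mean(delays_buffers_only)


# hypothetical readings in seconds (illustrative only)
with_comparator = [512e-12, 508e-12, 510e-12]
buffers_only = [415e-12, 417e-12, 416e-12]
delay = comparator_delay(with_comparator, buffers_only)
print(f"comparator delay: {delay * 1e12:.1f} ps")
```

the result is only as good as the matching between the two structures, which is why the text stresses identical routing, boards, cables and equipment for both measurements.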
fig. 9. test chip photomicrograph. abbreviations: (g) ground, (a) analog supply, (d) digital supply, (i) input, (o) output. left: output buffers; right: four-input comparator. (annotations: chip dimensions 1.05 mm × 0.77 mm; output drivers 11.96 × 25.4 µm²; comparator 39.2 × 25.5 µm².)

6 measurement results of the proposed comparator

with reasonable power consumption in mind, the described comparator is optimized for speed and fabricated in a standard 1p8m digital 40 nm low-power multi-threshold cmos process technology shrunk to 90% (minimum transistor gate length 36 nm). to optimize latency and power, the exploited technology offers transistors with three different values of threshold voltage. threshold voltages for the low-vt transistor types, which are used in the design to minimize propagation delay, are around $v_{tn}/v_{tp}$ ≈ 0.33 v/−0.28 v, while the nominal supply voltage for the given process is $v_{dd}$ = 1.1 v.

the propagation delay of the comparator with two pairs of inputs, measured in the manner described above, is lower than 100 ps for a 50 mvpp step applied at both of its differential inputs. the total power dissipation of the comparator under these circumstances equals 2.1 mw and is dominated by the preamplifier’s static consumption, i.e., the dc current consumption accounts for the major part of the comparator’s total power consumption. the measured eye diagram of the comparator at 3.33 ghz, which was the limit of the stimulus equipment, is shown in fig. 8; however, based on the propagation delay measurements, the eye opening should be present up to 10 ghz. the test chip photomicrograph is given in fig. 9. the proposed four-input comparator implementation occupies an area of 39.2 × 25.5 µm².
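a quick arithmetic check of the reported figures is given below. the first line relates the layout dimensions to the area quoted in the abstract; the second line assumes that the upper operating frequency is estimated as the reciprocal of the 100 ps propagation delay bound, which the text does not state explicitly.

```latex
\begin{align*}
39.2~\mu\mathrm{m} \times 25.5~\mu\mathrm{m} &= 999.6~\mu\mathrm{m}^2 \approx 0.001~\mathrm{mm}^2, \\
\frac{1}{100~\mathrm{ps}} &= 10~\mathrm{GHz}.
\end{align*}
```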
zimmermann fully differential self-biased asynchronous cmos comparator 661 a fully differential self-biased asynchronous cmos comparator 661 7 conclusions the article presents a prototype of a novel fully differential asynchronous comparator topology that features two-pairs of inputs and is implemented in 40 nm lp cmos technology. the comparator consists of a preamplifierlatch cascade and is completely self-biased thus overcoming the need for a reference circuit and reducing the influence of pvt variations. comparator propagation delay is extracted using subtractive method which exploits onchip dummy output driver structures. measurements indicate that, depending on the actual input signal amplitude and common-mode, the comparator can operate at frequencies beyond 10 ghz under dissipation of 2.1 mw. although both comparator delay and its power consumption greatly depend on the input signal amplitude and common-mode value, this still places it among the fastest non-clocked comparators published up to date. finally, the proposed comparator circuit is well-suitable for implementation in the cutting-edge system-on-chip (soc) data transceivers and data converters. acknowledgements the authors would like to express their gratitude to lantiq a and austrian bmvit for their financial support of the fit-it project xplc via ffg. references [1] v. milovanović and h. zimmermann, “a two-differential-input / differentialoutput fully complementary self-biased open-loop analog voltage comparator in 40 nm low power cmos,” in proceedings of the 29th international conference on microelectronics — miel 2014, may 2014, pp. 355–358. [2] t. sepke et al., “comparator-based switched-capacitor circuits for scaled cmos technologies,” in isscc dig. tech.papers, feb. 2006, pp. 812–821. [3] d. schinkel et al., “a double-tail latch-type voltage sense amplifier with 18 ps setup+hold time,” in isscc dig. tech.pap., feb. 2007, pp. 314–315. [4] v. 
srinivasan et al., “a 20 mw 61 db sndr (60 mhz bw) 1 b 3rd-order continuous-time delta-sigma modulator clocked at 6 ghz in 45 nm cmos,” in isscc dig. tech. papers, feb. 2012, pp. 812–821.
[5] c.-y. yang and s.-i. liu, “a one-wire approach for skew-compensating clock distribution based on bidirectional techniques,” ieee journal of solid-state circuits, vol. 36, no. 2, pp. 266–272, feb. 2001.
[6] m.-c. huang and s.-i. liu, “a fully differential comparator-based switched-capacitor ∆σ modulator,” ieee transactions on circuits and systems ii: express briefs, vol. 56, no. 5, pp. 369–373, may 2009.
[7] m. bazes, “two novel fully complementary self-biased cmos differential amplifiers,” ieee j. of solid-state circuits, vol. 26, no. 2, pp. 165–168, feb. 1991.
[8] b. j. mccarroll et al., “a high-speed cmos comparator for use in an adc,” ieee journal of solid-state circuits, vol. 23, no. 1, pp. 159–165, feb. 1988.
facta universitatis series: electronics and energetics vol. 
31, no 1, march 2018, pp. 1-9 https://doi.org/10.2298/fuee1801001e
effect of the distribution of states in amorphous in-ga-zn-o layers on the conduction mechanism of thin film transistors on its base
magali estrada 1, yoanlys hernandez-barrios 1, oana moldovan 2, antonio cerdeira 1, francois lime 2, marcelo pavanello 3, benjamin iñiguez 2
1 sees, depto. de ingeniería eléctrica, cinvestav-ipn, méxico city, méxico
2 departament d'enginyeria electrònica, elèctrica i automàtica (deeea), universitat rovira i virgili, tarragona, spain
3 department of electrical engineering, centro universitário da fei, são paulo, brazil
abstract. amorphous in-ga-zn-o thin film transistors (a-igzo tfts) have proven to be an excellent approach for flat panel display drivers using organic light emitting diodes, due to their high mobility and stability compared to other types of tfts. these characteristics are related to the specifics of the metal-oxygen-metal bonds, which give rise to spatially distributed s orbitals that can overlap with one another. the magnitude of the overlap between s orbitals seems to be largely insensitive to the presence of distorted bonds, allowing high values of mobility even in devices fabricated at room temperature. in this paper, we show the effect of the distribution of states in the a-igzo layer on the main conduction mechanism of a-igzo tfts by analyzing the temperature behavior of the drain current.
key words: amorphous oxide semiconductor, thin-film transistor, behavior with temperature, distribution of states
1. introduction
in 2004, nomura et al. [1] presented the first amorphous oxide semiconductor (aos) tfts using a semiconductor material that was novel at that time, amorphous in-ga-zn-o (a-igzo), which was deposited by pulsed laser deposition using a krf excimer laser and a polycrystalline in-ga-zn-o target. the chemical composition of the target was in:ga:zn 1.1:1.1:0.9 in atomic ratio. 
received july 7, 2017. corresponding author: yoanlys hernandez-barrios, sees, depto. de ingeniería eléctrica, cinvestav-ipn, av. ipn 2208, cp 07360, méxico city, méxico (e-mail: yhb961210@gmail.com)
the authors explained that conduction in amorphous oxide semiconductors (aoss) containing post-transition-metal cations is completely different from that of covalent semiconductors such as a-si:h. in a-si:h tfts, the presence of randomly distributed sp3 bonds gives rise to a high density of both deep and tail localized states, and the carrier transport is governed by hopping between localized tail states. in aoss, the conduction band has high ionicity due to spatially distributed s orbitals, which can overlap with one another. although distorted metal-oxygen-metal bonds exist in the amorphous material, the magnitude of the overlap between s orbitals seems to be insensitive to the presence of the distorted bonds, allowing high values of mobility even in devices fabricated at room temperature. the transistor presented in [1] used a 140-nm-thick y2o3 layer as gate dielectric and indium tin oxide (ito) as source (s), drain (d) and gate (g) contacts. all materials used in the tft were transparent. to explain the conduction mechanism, the authors had previously analyzed the behavior of single-crystalline ingao3(zno)5 [2]. for these devices, the carrier transport was associated with percolation conduction over potential barriers around the conduction band edge. these potential barriers are attributed to randomly distributed ga3+ and zn2+ ions in the crystal structure. since the amorphous igzo (a-igzo) tfts also showed high mobility values with a behavior similar to the crystalline ones, the authors considered that the percolation conduction mechanism also takes place in a-igzo tfts [3]. 
due to their relatively high electron mobility, high optical transparency, low processing temperature and relatively low-cost processing techniques, these devices have found an important application in active-matrix organic light-emitting diode (amoled) displays [4-7]. since then, a-igzo tfts have been the object of continuous and intensive research from all points of view, including technological and physical aspects, aiming to improve stability, increase mobility and reduce the operating voltage range, among others. the characteristics of the a-igzo band structure can be found in [6,7], with a distribution of bulk localized states (dos) in the gap [7,8]. it is generally accepted that the dos observed in a-igzo layers is characterized by a relatively low density of localized states, less than 1x10^20 cm^-3 ev^-1, and that its characteristics strongly depend on the process used for the igzo layer deposition and, in general, on the device fabrication [9-13]. regarding the conduction mechanism, the authors of [14] proposed a model, which they call the extended mobility edge model, that combines both possibilities: band percolation [15] and the mobility edge, or multiple trapping and release, mechanism [16,17]. depending on the specific characteristics of the device associated with the fabrication process, or on the operation regime for the same device, the predominant mechanism can be either percolation in the conduction band or multiple trapping. characterizing the electrical characteristics of tfts as a function of temperature allows analysis not only of the device behavior in the temperature range required for a specific application, but also of the conduction mechanisms. in most amorphous tfts, the drain current has been reported to increase with temperature, which is characteristic of the hopping conduction mechanism. 
in this paper, we show that under certain operation conditions a reduction of the drain current with temperature is observed, which is related to the characteristics of the dos present in the a-igzo layer of the device.
2. experimental part
for the analysis of the electrical characteristics, we use two bottom-gate top-contact igzo tfts, shown in figures 1a and b. device 1 consists of 90 nm of hfo2 as gate dielectric, deposited at 100 °c by atomic layer deposition, on top of which a 70 nm thick igzo layer was deposited by pulsed laser deposition (pld) at 20 mtorr oxygen pressure. a 500 nm thick layer of poly-p-xylylene-c (parylene-c), deposited by chemical vapor deposition (cvd) at room temperature at 1 mtorr, was used as etch stopper layer (esl). au/cr deposited by electron beam was used for the gate contact and al was used for the drain and source contacts. annealing was done after the deposition of the a-igzo layer and at the end of the fabrication process. device 2 consisted of 200 nm of si3o4 as gate insulator, deposited by plasma enhanced chemical vapor deposition (pecvd) at 250 °c. the a-igzo layer was 12 nm thick and was deposited by rf sputtering at room temperature. as etch stopper layer, 100 nm of sio2 deposited by pecvd was used. mo/cr was used for the g, d and s contacts. a final annealing was done at the end of the fabrication process. photolithography was used to pattern each layer. figures 1a and b show the cross sections of device 1 and device 2, respectively. the analyzed tfts corresponding to device 1 had channel width (w) and length (l) of w=80 µm, l=40 µm and those of device 2 had w=900 µm, l=30 µm.
fig. 1 cross section of the igzo tfts referred to as: a) device 1; b) device 2. 
electrical measurements were done at different temperatures and in vacuum conditions, using a k20 programmable temperature controller and measurement chamber from mmr technologies inc. and a keithley 4200 semiconductor characterization system. the output characteristics were measured every ten degrees in the temperature range between 300 k and 350 k. measurements were done after waiting 5 minutes at each fixed temperature, with no applied bias. the time with applied voltage during measurements at each temperature was less than 5 min. before the i-v-t measurements, the devices were tested for hysteresis and bias stress instability at room temperature to guarantee that the variation of the drain current was due to the temperature variation and not to instability effects.
3. analysis and discussion
figure 2a shows the measured output characteristics at 300, 320 and 330 k for device 1. fig. 2b shows the output characteristics at 300, 330 and 350 k for device 2. as can be seen in fig. 2a, the drain current (ids) increases significantly with temperature. this is the typical temperature behavior of the output current in a-igzo shown in [12,14,17,18,19]. in contrast, in the output characteristics shown in fig. 2b, ids decreases with increasing t as vds increases. as already mentioned, due to the specific characteristics of the chemical bonds in metal oxide materials, the conduction mechanism in a-igzo tfts can be due not only to hopping, typical of amorphous tfts, but also to percolation in the conduction band [8,9,14]. the temperature dependence of the drain current and mobility is determined by the predominant conduction mechanism, which can depend not only on the fabrication process, but also on the operation conditions for a given fabrication process. 
it is expected that the contribution of variable range hopping (vrh) becomes greater than that of the band-like mechanism when the fermi level ef lies within the exponential tail states. according to [14], this can occur for a characteristic energy of around kta=0.069 ev and a density of acceptor tail states at ec (nta) in the order of 3.4x10^19 cm^-3 ev^-1. to estimate the dos in the amorphous semiconductor material of our devices, we used the same procedure as in [16,17], obtaining a characteristic energy of kta=0.072 ev and nta=8.5x10^19 cm^-3 ev^-1 for device 1, and kta=40 mev and nta<6x10^18 cm^-3 ev^-1 for device 2. in order to study in more detail the effect of the dos characteristics on the conduction mechanism, we used simulations in atlas. for this purpose, we simulated the output characteristics of a-igzo tfts with a bottom gate structure. different acceptor-type dos characteristics, with nta values in the range from 1.5x10^20 cm^-3 ev^-1 to 3.5x10^17 cm^-3 ev^-1, were simulated. the characteristic energy was varied from 0.03 ev to 0.18 ev. the default low field mobility model was used, taking the default value of the temperature dependent parameter for this mobility in atlas. the reduction of mobility with temperature is the typical behavior expected for a crystalline device due to phonon scattering. in this case, it can be associated with a crystalline-like behavior of the amorphous oxide semiconductor material [3]. the temperature dependence of mobility is included in order to distinguish between the effect of the dos characteristics and the effect of a crystalline-like mobility behavior on the temperature dependence of ids. if mobility is considered constant in the simulator, the variation of the drain current with temperature is determined only by the dos characteristics, which was confirmed by simulations with a temperature-independent mobility. 
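the tail-state parameters quoted above can be compared with a short numerical sketch. it assumes that the acceptor tail has the exponential form g(e) = nta·exp(-(ec-e)/kta), which is the usual meaning of "exponential tail states" but is not written out in the text; the function names are illustrative only and are not part of atlas. under that assumption the total number of tail states is simply nta·kta, which gives one way to quantify how "small" the dos of each device is.

```python
import math

def tail_dos(depth_ev, nta, kta_ev):
    """Acceptor tail DOS (cm^-3 eV^-1) at an energy depth_ev below Ec.

    Assumed exponential form: g = nta * exp(-depth / kta).
    """
    return nta * math.exp(-depth_ev / kta_ev)

def integrated_tail_states(nta, kta_ev):
    """Total tail states (cm^-3): integral of nta*exp(-x/kta) for x in [0, inf)."""
    return nta * kta_ev

# (nta in cm^-3 eV^-1, kta in eV) as extracted in the text;
# the device 2 value of nta is an upper bound (nta < 6x10^18).
devices = {
    "device 1": (8.5e19, 0.072),
    "device 2 (upper bound)": (6e18, 0.040),
}

totals = {name: integrated_tail_states(nta, kta)
          for name, (nta, kta) in devices.items()}

for name, total in totals.items():
    print(f"{name}: {total:.2e} tail states per cm^3")
```

with these inputs the integrated tail density is about 6.1x10^18 cm^-3 for device 1 and at most 2.4x10^17 cm^-3 for device 2, i.e. more than an order of magnitude apart, which is in line with the different temperature behavior of the two devices.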
the model used for the blaze simulation included fermi statistics, as well as band parameters and bandgap narrowing specified for the igzo material. the main results of the simulations are summarized in table 1. for simulated output characteristics of devices with nta=1.5x10^20 cm^-3 ev^-1 and kta=0.034 ev, the typical increase of ids with t was observed for all curves with vgs equal to or above 4 v. it is evident that in this case defects determine the temperature dependence of the drain current. for values of nta equal to or below 1.5x10^19 cm^-3 ev^-1 and kta=0.034 ev, the drain current decreases with temperature. this result confirms that when the density of localized states is sufficiently small and trapping is less important, the temperature dependence of the drain current is determined by the temperature dependence of mobility, see table 1.
fig. 2 output characteristics at different temperatures of: a) device 1; b) device 2.
the effect of reducing the dos characteristic energy kta, and thus the effect of trapping, was also analyzed. for this purpose nta was kept constant at nta=3.5x10^19 cm^-3 ev^-1, while the characteristic energy was varied. table 1 and fig. 3 show that reducing kta allows the change of mechanism to occur at smaller values of vgs. for kta=100 mev, the change in mechanism is not observed and ids at 350 k is higher than at 300 k over the whole operation voltage range. for kta=70 mev, the change in mechanism is observed only for vgs=10 v, when ids at 350 k becomes smaller than at 300 k. for kta=0.034 ev, the conduction mechanism is again the same over the whole operation voltage range, but it corresponds to percolation in the conduction band, producing the reduction of ids with t. fig. 
4 shows the change of mechanism in a simulated transfer curve in saturation, when nta is changed while keeping kta=0.034 ev. for nta=1.5x10^20 cm^-3 ev^-1, ids is higher at 350 k than at 300 k for all values of vgs. for nta=1.5x10^18 cm^-3 ev^-1, ids is practically the same at 300 and 350 k for vds<0.5 v and, as vgs increases, the drain current at 350 k becomes smaller than at 300 k.
table 1 values of the drain current ids [a] for t=350 k and 300 k, at vds=10 v and different values of vgs, corresponding to different combinations of values of nta and kta.
nta             kta       vgs=4 v                 vgs=6 v                 vgs=8 v                 vgs=10 v
(cm^-3 ev^-1)   (mev)     t=350 k     t=300 k     t=350 k     t=300 k     t=350 k     t=300 k     t=350 k     t=300 k
1.5x10^18       34        2.05x10^-5  2.54x10^-5  4.00x10^-5  4.98x10^-5  6.58x10^-5  8.23x10^-5  9.65x10^-5  1.21x10^-4
1.5x10^19       34        1.64x10^-5  1.89x10^-5  3.30x10^-5  3.89x10^-5  5.57x10^-5  6.66x10^-5  8.34x10^-5  1.01x10^-4
1.5x10^20       34        4.06x10^-6  3.20x10^-6  9.13x10^-6  7.83x10^-6  1.71x10^-5  1.50x10^-5  2.85x10^-5  2.80x10^-4
1.5x10^19       100       2.09x10^-6  1.50x10^-6  7.36x10^-6  6.68x10^-6  1.74x10^-5  1.78x10^-5  3.29x10^-5  3.59x10^-5
3.5x10^19       34        1.22x10^-5  1.83x10^-5  2.56x10^-5  3.69x10^-5  4.44x10^-5  6.19x10^-5  6.83x10^-5  9.21x10^-5
3.5x10^19       70        2.13x10^-6  1.44x10^-6  6.51x10^-6  5.41x10^-6  1.48x10^-5  1.40x10^-5  2.77x10^-5  2.85x10^-5
3.5x10^19       100       2.10x10^-7  8.80x10^-7  1.08x10^-6  6.36x10^-7  3.69x10^-6  2.83x10^-6  9.38x10^-6  8.55x10^-6
6x10^19         30        1.07x10^-5  1.08x10^-5  2.24x10^-5  2.37x10^-5  3.92x10^-5  4.31x10^-5  6.08x10^-5  6.87x10^-5
8.5x10^19       72        2.84x10^-7  1.25x10^-7  1.03x10^-6  5.78x10^-7  2.85x10^-6  1.95x10^-6  6.53x10^-6  5.22x10^-6
3.5x10^17       180       1.83x10^-5  2.25x10^-5  3.69x10^-5  4.58x10^-5  6.19x10^-5  7.72x10^-5  9.21x10^-5  1.15x10^-4
fig. 
3 simulated output characteristics at t=300 k and t=350 k for nta=3.5x10^19 cm^-3 ev^-1 and kta=100 mev and 70 mev, showing the different behavior of ids with t corresponding to the change in conduction mechanism.
fig. 4 simulated transfer curves in saturation at t=300 k and t=350 k for nta=1.5x10^20 cm^-3 ev^-1 and 1.5x10^18 cm^-3 ev^-1, for kta=34 mev.
simulations confirm that device 1, with the dos characteristics indicated above, is expected to have hopping as the predominant conduction mechanism over the whole operation region and temperature range analyzed, which is what was observed. on the other hand, simulations also show that for values of nta below 3.5x10^19 cm^-3 and kta below 0.1 ev, the predominant conduction mechanism can change to percolation for vds and vgs above a given value. this value can be estimated by analyzing the arrhenius dependence of the drain current in the devices. this was the case observed for device 2. as already mentioned, the presence of band-like carrier transport is well accepted for igzo tfts, although the presence of vrh cannot be excluded [14]. as a result, the device can exhibit crystalline-like electrical behavior, in which mobility decreases with temperature due to the interaction of carriers with the atoms in the material. the predominant carrier transport mechanism depends on the dos characteristics of the device being analyzed. if the effect of the dos is sufficiently small, the current due to percolation conduction is expected to become predominant and the device can show a crystalline-like behavior.
4. conclusions
due to the nature of the chemical bonds in a-igzo layers, carrier conduction in the conduction band is possible in aos tfts based on this material. 
however, the presence of vrh cannot be excluded, and the predominant conduction mechanism is determined by the characteristics of the dos in the amorphous igzo layer and by the operating voltages, which define the position of the fermi level. simulations confirm that when the effect of the dos is sufficiently small, that is, when the combination of nta and kta is sufficiently small, the current due to percolation conduction becomes predominant and the device can show a crystalline-like behavior in the operation range. for example, for a density of tail acceptor states nta=3.5x10^19 cm^-3 ev^-1 and a characteristic energy of 34 mev, ids decreases with t. for kta=100 mev the current increases with t. for kta=70 mev, the drain current decreases only when vgs=10 v. for nta=1x10^20 cm^-3 ev^-1 the current increases with t even for kta=34 mev, indicating that vrh conduction is predominant. for nta<1.5x10^19 cm^-3 ev^-1, the current decreases with t for kta=34 mev, indicating that the percolation conduction mechanism is predominant over the whole operating voltage range of the tft.
acknowledgement: this work was supported by conacyt projects 237213 and 236887 in mexico, the h2020 programme of the european union under contract 645760 (domino), by contract “thin oxide tft spice model” (t12129s) with silvaco inc., by icrea academia 2013 from icrea institute and the spanish ministry of economy and competitiveness through project tec2015-67883-r (greensense). the authors acknowledge holst centre/tno and dr. i. mejia, from the university of texas at dallas, for providing the tft devices.
references
[1] k. nomura, h. ohta, a. takagi, t. kamiya, m. hirano, h. hosono, “room-temperature fabrication of transparent flexible thin-film transistors using amorphous oxide semiconductors”, nature, vol. 432, pp. 488-492, 2004.
[2] k. nomura, t. kamiya, h. ohta, k. ueda, m. hirano, h. hosono, 
“carrier transport in transparent oxide semiconductor with intrinsic structural randomness probed using single-crystalline ingao3(zno)5 films”, appl. phys. lett., vol. 85, pp. 1993-1995, 2004. [3] t. kamiya, k. nomura, h. hosono, “electronic structures above mobility edges in crystalline and amorphous in-ga-zn-o: percolation conduction examined by analytical model”, j. display technol., vol. 5, pp. 462-467, 2009. [4] e. fortunato, p. barquinha, r. martins, “oxide semiconductor thin-film transistors: a review of recent advances”, adv. mater., vol. 24, pp. 2945-2986, 2012. [5] h. kumomi, t. kamiya, h. hosono, “advances in oxide thin-film transistors in recent decade and their future”, ecs transactions, vol. 67, pp. 3-8, 2015. [6] t. kamiya, h. hosono, “material characteristics and applications of transparent amorphous oxide semiconductors”, npg asia mater., vol. 2, pp. 15-22, 2010. [7] t. kamiya, k. nomura, h. hosono, “present status of amorphous in-ga-zn-o thin-film transistors”, sci. technol. adv. mater., vol. 11, pp. 044-305. [8] t. kamiya, k. nomura, h. hosono, “electronic structure of the amorphous oxide semiconductor aingazno4–x: tauc–lorentz optical model and origins of subgap states”, phys. status solidi a, vol. 206, pp. 860–867, 2009. [9] s. sallis, k.t. butler, n.f. quackenbush, d.s. williams, m. junda, d.a. fischer, j.c. woicik, n.j. podraza, b.e. white, a. walsh, l.f. piper, “origin of deep subgap states in amorphous indium gallium zinc oxide: chemically disordered coordination of oxygen”, applied physics letters, vol. 104, pp. 232108, 2014. [10] s. c. kim, y.s. kim, j. kanicki, “density of states of short channel amorphous in–ga–zn–o thin-film transistor arrays fabricated using manufacturable processes”, jpn. j. of appl. phys., vol. 54, pp. 51-101, 2015. [11] x. ding, j. zhang, w. shi, h. zhang, c. huang, j. li, x. jiang, z. 
zhang, “extraction of density-of-states in amorphous ingazno thin-film transistors from temperature stress studies”, current applied physics, vol. 14, pp. 1713-1717, 2014.
[12] c. chen, k. abe, h. kumomi, j. kanicki, “density of states in a-ingazno from temperature dependent field studies”, ieee trans. electron devices, vol. 56, pp. 1177-1183, 2009.
[13] j. jeong, j.k. jeong, j.s. park, y.g. mo, y. hong, “meyer–neldel rule and extraction of density of states in amorphous indium–gallium–zinc-oxide thin-film transistor by considering surface band bending”, japanese journal of applied physics, vol. 49, pp. 03cb02, 2010.
[14] w. chr. germs, w.h. adriaans, a.k. tripathi, w.s.c. roelofs, b. cobb, r.a.j. janssen, g.h. gelinck, m. kemerink, “charge transport in amorphous ingazno thin-film transistors”, phys. rev. b, vol. 86, pp. 155-319, 2012.
[15] t. kamiya, k. nomura, h. hosono, “origin of definite hall voltage and positive slope in mobility-donor density relation in disordered oxide semiconductors”, appl. phys. lett., vol. 96, pp. 122103, 2010.
[16] s. lee, s. park, s. kim, y. jeon, k. jeon, j.-h. park, j. park, i. song, c. j. kim, y. park, d.m. kim, d.h. kim, “extraction of subgap density of states in amorphous ingazno thin-film transistors by using multifrequency capacitance–voltage characteristics”, ieee electron device lett., vol. 31, pp. 231-233, 2010.
[17] j.-h. park, k. jeon, s. lee, s. kim, s. kim, i. song, j. park, y. park, c. j. kim, d. m. kim, d.h. kim, “self-consistent technique for extracting density of states in amorphous ingazno thin film transistors”, j. electrochem. soc., vol. 157, pp. h272, 2010.
[18] p.y. liao, t.c. chang, t.y. hsieh, m.y. tsai, b.w. chen, y. h. tu, a. k. chu, c.h. chou, j.f. chang, “investigation of carrier transport behavior in amorphous indium–gallium–zinc oxide thin film transistors”, jpn. j. of appl. phys., vol. 54, pp. 094101, 2015.
[19] m. estrada, m. rivas, i. garduño, f. avila-herrera, a. cerdeira, m. pavanello, i. 
mejia, m.a. quevedo-lopez, “temperature dependence of the electrical characteristics up to 370 k of amorphous in-ga-zn-o thin film transistors”, microelectronics reliability, vol. 56, pp. 29–33, 2016.
facta universitatis series: electronics and energetics vol. 28, no 2, june 2015, pp. 297-307 doi: 10.2298/fuee1502297u
system design considerations of universal uhf rfid reader transceiver ics
nikolay usachev 1, vadim elesin 1, alexander nikiforov 1, george chukov 1, galina nazarova 1, denis sotskov 1, nikolay shelepin 2, vladislav dmitriev 2
1 national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation
2 «mikron» jsc, moscow, russian federation
abstract. this paper describes the architecture, system analysis and implementation of a world-wide regulation compliant uhf rfid reader transceiver for iso 18000-6 multi-class tags in the ism band 860 mhz-960 mhz. the presented considerations are based on a system analysis that evaluates the parameters of the transceiver’s building blocks in accordance with the required characteristics of a complete rfid reader system, such as read range, data transmission rate, reading speed and power consumption. the phase noise, noise figure, sensitivity, p1db and dynamic range are estimated for the design of a custom ‘system-in-package’ transceiver, implemented in an ltcc module. based on the direct-conversion architecture, the reader transceiver integrates rf blocks, a frequency synthesizer, modulation and demodulation functions, and a low frequency analog baseband. the receiver sensitivity is down to -85 dbm, and the transmitter produces an output power of +17 dbm.
key words: rfid, uhf, ltcc, ‘system-in-package’, sige bicmos
1. introduction
radio-frequency identification (rfid) systems in the uhf band supporting the epc global class 1 generation 2 and iso 18000-6a/b/c standards have become indispensable in today's distribution industries, purchasing, manufacturing, energy and healthcare services [1]. 
a uhf rfid system consists of reader(s), tags, and a host computer. a uhf reader is a system with an integrated transceiver module as its core. as shown in fig. 1, an rfid reader transceiver consists of uhf receiver and transmitter front-ends, a frequency synthesizer, a low frequency analog baseband, analog-to-digital (adc) and digital-to-analog (dac) converters, and a digital baseband for data processing and control [2], [3]. the uhf front-end of an rfid reader transceiver contains a low noise amplifier (lna) and a power amplifier (pa) (required for improving the sensitivity of the receiver path and the output power level in the forward link), and a quadrature rf modulator and demodulator. the low frequency analog baseband of an rfid reader transceiver contains active bandpass filters (bpf) with variable bandwidth, and variable gain amplifiers (vga). the bpf is required for rejection of noisy signals from the dac, adc and digital baseband parts.
received september 20, 2014; received in revised form december 11, 2014. corresponding author: nikolay usachev, national research nuclear university mephi (nrnu mephi), moscow, russian federation (e-mail: nausachev@mephi.ru)
fig. 1 rfid reader transceiver block diagram
this paper describes issues associated with the system design of uhf rfid reader ics [4]. the proposed considerations are based on the evaluation of the parameters of the rf transceiver's building blocks in conjunction with the required characteristics of a complete rfid reader system, i.e. read range, data transmission rate, reading speed, power consumption, etc. 
the phase noise, noise figure, sensitivity, p1db and dynamic range are estimated for the design of a custom 'system-in-package' direct-conversion rfid transceiver that produces output power up to +17 dbm and input linearity up to +6 dbm in the ism band between 860 mhz and 960 mhz and provides a read range of more than 1 m without using an external pa.
2. system analysis
2.1. read range
typical parameters [1] of a uhf rfid system described in this work are presented in table 1. passive tags have no independent source of electrical power and are widely used in uhf rfid systems because of their cost. the rf carrier signal transmitted by a reader is required for the passive tag to be activated.
table 1 uhf rfid system's parameters
reader: ptx = 30 dbm, gtx = 3 dbi, grx = 3 dbi
tag: ptagmin = -15 dbm, gtag = -5 dbi
air interface: f = 865 mhz, m = 0.25 (ask), m[db] = 20log(0.25)
the minimal rf power level required for tag activation (sensitivity) is -15 to -20 dbm for typical uhf rfid systems [2]. the power level at the tag input ptag and at the reader receiver input prx can be calculated using the following equations [3], [4]:
\[ p_{tag}\,[\text{dbm}] = p_{tx} + g_{tx} + g_{tag} - loss, \tag{1} \]
\[ p_{rx}\,[\text{dbm}] = p_{tx} + 2g_{tx} + 2g_{tag} - 2\,loss + m, \tag{2} \]
\[ loss\,[\text{db}] = 20\log\left(\frac{4\pi l}{\lambda}\right), \tag{3} \]
where ptx is the power level at the transmitter output in dbm; gtx and gtag are the gains of the reader and tag antennas, respectively, in dbi; loss is the loss in the air interface between reader and tag in db; m is the modulation depth of the tag backscattering signal in db; l is the distance between reader and tag (read range) in meters; λ is the wavelength of the carrier signal in meters. the dependence of ptag and prx on the distance between reader and tag for a typical uhf rfid system, calculated with eqs. (1)-(3), is shown in fig. 2.
fig. 2 power level vs. 
distance for typical uhf rfid system

according to fig. 2, for an rfid system with ptag = -15 dbm and l = 4 m, the power level at the receiver input should be more than -73 dbm. semi-active tags (with high sensitivity) or a pa with high output power should be used to improve the read range. in the latter case, the higher transmitted power may lead to blocking of the reader receiver.

2.2. receiver noise figure

in accordance with the epc global c1 g2 standard [1], an rfid reader needs to support listen-before-talk (lbt) and talk modes. this means that before a reader can transmit on a given channel, it should make sure the channel is free. only the reader receiver is active in lbt-mode. in talk-mode the receiver and transmitter operate in duplex. amplitude-shift keying (ask) is the basic type of modulation for the forward (reader-tag) and reverse (tag-reader) links.

to receive messages reliably, the bit error rate (ber) in uhf rfid systems should be less than 10^-5 [3], which corresponds to a signal-to-noise ratio (snr) of 12 db (for ask). the uhf reader receiver noise figure (nfrx) can be calculated using the equation [3]:

$$NF_{rx}\,[\mathrm{dB}] = P_s + 174 - 10\log(BW_n) - SNR, \tag{4}$$

where ps is the reader receiver sensitivity in dbm and bwn is the receiver bandwidth in hz. in talk-mode ps is equal to -73 dbm and bwn is 1.28 mhz (for the maximum bit rate of 640 kbit/s), so nfrx should be less than 28 db. in lbt-mode bwn is 200 khz and ps should be -100 dbm or less, so nfrx in accordance with eq. (4) should be less than 9 db.

2.3. receiver input linearity

an example of a multiple reader environment is illustrated in fig. 3.

fig. 3 interference in multiple reader environment

in the case when readers a and b are operating in talk-mode and reader c is operating in lbt-mode, the 1 db input compression point of the receiver p-1db can be calculated by the following equations [3]:

$$P_{-1dB\_LBT}\,[\mathrm{dBm}] = P_{tx\_A} + G_{tx\_A} + G_{rx\_C} - Loss_{A-C}, \tag{5}$$

$$Loss_{A-C}\,[\mathrm{dB}] = 20\log\left(\frac{4\pi(2R)}{\lambda}\right). \tag{6}$$

in a typical rfid system (see table 1) with the distance between readers 2r = 4 m, p-1db should be more than -13.3 dbm.

in a mono-static configuration a single antenna is used for both transmission and reception. in a bi-static configuration two different antennas are used for transmission and reception. the main disadvantage of the mono-static configuration is insufficient isolation between receiver and transmitter: the typical isolation value is less than 20 to 25 db. nevertheless, a mono-static configuration is a good choice for a mobile reader with an integrated antenna. the typical isolation value between receiver and transmitter for a bi-static configuration is 30 to 40 db. examples of mono-static and bi-static reader configurations are shown in fig. 4.

fig. 4 reader configurations: (a) mono-static and (b) bi-static

in talk-mode p-1db is determined by the self-jammer signal at the receiver input, which is a part of the transmitter power ptx, and can be calculated by the following equation:

$$P_{-1dB\_talk}\,[\mathrm{dBm}] = P_{tx} - ISO, \tag{7}$$

where iso is the isolation between the reader's transmitter and receiver. in accordance with eq. (7), p-1db for an rfid system (see table 1) in mono-static (iso less than 25 db) and bi-static (iso more than 30 db) configurations should be more than 5 dbm and 0 dbm, respectively.

2.4. phase noise

in talk-mode the main problem is the detection of the weak tag signal at flo+fblf (typical power level is less than -60 dbm), which is limited by the carrier signal from the transmitter output (flo) and by signals from adjacent readers (fac). the simplified signal spectrum diagrams at the receiver rf input and at the low-frequency output are shown in fig. 5a and fig. 5b, respectively. for these reasons, special requirements for the phase noise level should be determined.
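the numerical requirements derived in sections 2.1-2.3 can be cross-checked with a short script. the sketch below is only an illustration of eq. (1)-(4) and (7) evaluated with the table 1 values; the function and variable names are hypothetical and are not taken from the paper, and free-space propagation is assumed exactly as in eq. (3).

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def air_loss_db(distance_m, freq_hz):
    """Eq. (3): one-way loss in the air interface, in dB."""
    wavelength = C / freq_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength)


def tag_power_dbm(ptx, gtx, gtag, loss):
    """Eq. (1): power at the tag input, in dBm."""
    return ptx + gtx + gtag - loss


def reader_rx_power_dbm(ptx, gtx, gtag, loss, m_db):
    """Eq. (2): backscattered power at the reader receiver input, in dBm."""
    return ptx + 2.0 * gtx + 2.0 * gtag - 2.0 * loss + m_db


def noise_figure_db(ps_dbm, bw_hz, snr_db):
    """Eq. (4): maximum receiver noise figure, in dB."""
    return ps_dbm + 174.0 - 10.0 * math.log10(bw_hz) - snr_db


def p1db_talk_dbm(ptx, iso_db):
    """Eq. (7): self-jammer level that sets P-1dB in talk-mode, in dBm."""
    return ptx - iso_db


# Table 1 values (m = 0.25 expressed in dB)
PTX, GTX, GTAG, FREQ = 30.0, 3.0, -5.0, 865e6
M_DB = 20.0 * math.log10(0.25)

loss_4m = air_loss_db(4.0, FREQ)
ptag_4m = tag_power_dbm(PTX, GTX, GTAG, loss_4m)
prx_4m = reader_rx_power_dbm(PTX, GTX, GTAG, loss_4m, M_DB)
nf_talk = noise_figure_db(-73.0, 1.28e6, 12.0)
nf_lbt = noise_figure_db(-100.0, 200e3, 12.0)

print(f"loss at 4 m: {loss_4m:.1f} dB")
print(f"ptag: {ptag_4m:.1f} dBm, prx: {prx_4m:.1f} dBm")
print(f"nf talk: {nf_talk:.1f} dB, nf lbt: {nf_lbt:.1f} dB")
print(f"p-1db talk: {p1db_talk_dbm(PTX, 25.0):.0f} dBm (mono-static), "
      f"{p1db_talk_dbm(PTX, 30.0):.0f} dBm (bi-static)")
```

with these inputs the script gives a tag input power close to -15 dbm and a receiver input power slightly above -73 dbm at 4 m, and noise figure limits just under 28 db and 9 db, in line with the values quoted in the text.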
fig. 5 signal spectrum diagrams at (a) receiver rf input and (b) receiver low frequency output

provided that the conversion and proper processing of the weak tag information signal are available, the phase noise level can be calculated using the following equation [5, 6]:

$$PN@BW_n\,[\mathrm{dBc/Hz}] = P_s - ACRR - SNR - 10\log(BW_n), \tag{8}$$

where bwn is the bandwidth in hz and acrr is the adjacent channel rejection ratio in db. the transceiver's key parameters estimated in accordance with eq. (8) are summarized below:
• sensitivity in talk-mode, determined by eq. (1)-(3), equals -73 dbm;
• signal-to-noise ratio (snr) is 12 db, which corresponds to a ber of 10^-5 (for ask);
• typical value of acrr [1] is 40 db;
• the maximum allowable phase noise level should be less than -95 dbc/hz at 100 khz.

3. transceiver implementation

to verify the proposed system approach to transceiver design, a test uhf rfid reader transceiver, shown in fig. 6a, was implemented as 'system-in-package' in a low temperature co-fired ceramic (ltcc) module. the simplified cross-section of the ltcc-module, which consists of seven layers, is shown in fig. 6b.

fig. 6 uhf rfid reader transceiver module: (a) top-view (20×20 mm2) and (b) cross-section

the transceiver ltcc-module (20 mm × 20 mm × 4 mm), with appropriate thermal properties and rf grounding, integrates the rf receiver, transmitter and frequency synthesizer dies, which were fabricated in a 0.25 μm sige bicmos process. dupont 951 greentape (εr = 7.8 at 3 ghz, tgδ ≤ 0.006 at 3 ghz) was used as the substrate material, with 10…15 μm thick silver conductors as low-loss interconnections and microstrip lines. a minimal via diameter of 100 μm is available within the ltcc process, which makes this technology suitable for realizing packages with sufficiently low ground plane inductance [7].
the metal layers, from top to bottom, are as follows: the top metal layer (m7) is for smd-component, chip and kovar frame mounting; two layers (m6-m5) are for microstrip line interconnections; two layers (m4-m3) are for passive elements (rf capacitors, inductors, baluns, etc.) [7-10]; layers m2 and m1 are shield ground planes. the bottom shield ground plane layer is used for mounting the ltcc-module on a printed-circuit board (pcb) by a conventional soldering technique.

4. simulation and experimental results

the transceiver performance (dynamic range, noise figure, sensitivity, p1db, phase noise, etc.) was simulated based on the proposed system design considerations in conjunction with the required characteristics of a complete rfid reader system, i.e. read range, data transmission rate, reading speed, power consumption, etc. measurements were performed using a special pcb test fixture and a specialized microwave test system (mwts), shown in fig. 7, which is based on a cascade summit 12000b microwave probe station, an agilent n5230a vector network analyzer and an n9020a signal analyzer [10, 11]. the mwts is successfully used with complex radiation test facilities for experimental studies and theoretical analysis of radiation effects in a wide range of complex multifunctional very-large-scale mixed and digital ics [12-14].

fig. 7 specialized microwave test system

the simulated and measured receiver conversion loss in talk-mode are in good agreement, as shown in fig. 8. the measured frequency synthesizer carrier phase noise response is shown in fig. 9. the simulated and measured transmitter output power characteristics in talk-mode are shown in fig. 10. the measured transmitter output spectrum for single-sideband modulation at a carrier frequency of 865 mhz and an if bandwidth of 1 mhz is shown in fig. 11. the obtained rf output power is more than +17 dbm over the frequency range from 860 to 960 mhz.

fig.
8 simulated and measured receiver conversion loss in talk-mode

fig. 9 measured frequency synthesizer carrier phase noise

fig. 10 transmitter output power characteristics

fig. 11 measured transmitter output spectrum

simulated and measured parameters of the uhf rfid reader transceiver presented in this paper are summarized in table 2 and compared with other published work. the measured parameters of the uhf transceiver are in good agreement with the modeling results and fulfill the rfid system requirements. some slight differences between the measured and simulated receiver conversion loss and transmitter output power are probably caused by insertion loss in the test pcb microstrip lines.

table 2 uhf rfid system's parameters
parameter | this work: rfid system requirements | this work: measurements | [3]
frequency, mhz | 860…960 | 860…960 | 835…930
technology process | – | sige bicmos 0.25 μm | cmos 0.18 μm
package | – | ltcc, 44 leads, 20×20 mm2 | lqfp64a 10×10 mm2 (die area is 4×4 mm2)
p-1db, dbm (talk) | ≥ 0 | +6 | -3
p-1db, dbm (lbt) | ≥ -13 | -23 | –
nfrx, db (talk) | ≤ 28 | 27 | 35
nfrx, db (lbt) | ≤ 9 | 9 | –
pn, dbc/hz @100 khz | ≤ -95 | -95 | -90
ptx, dbm | – | ≥ 17 | 10
power supply, v | – | +5 | +3.3
power consumption, w | – | 1.1 | 0.4
estimated read range, m | ≥ 0.5 | 0.9 | 0.4

5. conclusion

the architecture, system analysis and implementation of a uhf rfid reader transceiver compliant with world-wide regulations for iso 18000-6 multi-class tags in the ism band from 860 mhz to 960 mhz have been presented. the described system design considerations have been verified in the design process of the reader transceiver that integrates a uhf receiver, transmitter and frequency synthesizer, and covers the entire 860 mhz to 960 mhz frequency range.
the reader transceiver parameters (input linearity, noise figure, phase noise, output power) have been optimized following the proposed approach, given the required characteristics of the complete rfid reader system (read range, reading speed, multiple-reader environment mode, etc.). fabricated in a 0.25 μm sige bicmos process, the transceiver was implemented as 'system-in-package' in an ltcc-module and measured. simulated and measured parameters of the uhf rfid reader transceiver are in good agreement and fulfill the rfid system requirements.

acknowledgments. the authors would like to thank konstantin m. amburkin and dmitry m. amburkin (specialized electronic systems, moscow, russia) for their contributions to the experimental investigations, and vitaly a. telets (nrnu mephi) for helpful discussions and interest in this work.

references
[1] epc radio frequency identity protocols c1g2 uhf rfid. protocol for communications at 860-960 mhz //www.epcgloballink.com.
[2] k. xu et al., "design, verification and measurement techniques for uhf rfid tag ic", in proc. 7th int. conf. on wicom, 2011, pp. 1-5.
[3] r. zhang et al., "several key issues in single-chip uhf rfid reader design", in proc. international conference on microwave and millimeter wave technology (icmmt2010), 2010, pp. 1453-1456.
[4] n. a. usachev, v.v. elesin, a.y. nikiforov and v. a. telets, "behavioral approach to design universal uhf rfid reader transceiver ics", in proc. of the international conference on microelectronics (miel2014), 2014, pp. 405-408.
[5] j. wang et al., "system design considerations of highly-integrated uhf rfid reader transceiver rf front-end", in proc. 9th international conference on solid-state and integrated-circuit technology (icsict 2008), 2008, pp. 1560-1563.
[6] i. mayordomo et al., "design and analysis of a complete rfid system in the uhf band focused on the backscattering communication and reader architecture", 3rd europ. workshop on rfid systech., 2007, pp. 1-6.
[7] v.v. elesin, g.n.
nazarova, n.a. usachev, "design of passive elements for monolithic silicon-germanium microwave ics tolerant to ionizing radiation", russian microelectronics, vol. 39, no. 2, pp. 134-141, 2010.
[8] i.i. mukhin, v.v. repin, v.v. elesin, g.n. nazarova, a.s. shnitnikov, "balun integral circuits design", in proc. 22nd international crimean conference microwave and telecommunication technology (crimico 2012), 2012, pp. 95-96.
[9] v.v. elesin, g.v. chukov, d.v. gromov, v.v. repin, v.a. vavilov, "the effect of ionizing radiation on the characteristics of silicon-germanium microwave integrated circuits", russian microelectronics, vol. 39, no. 2, pp. 122-133, 2010.
[10] d.v. gromov, s.a. polevich, v.v. elesin, "test and measurement system for microwave semiconductor devices and ic investigation on radiation hardness", in proc. 19th international crimean conference microwave and telecommunication technology (crimico-2009), 2009, pp. 730-731.
[11] v.v. elesin, "transient radiation effects in microwave monolithic integrated circuits based on heterostructure field-effect transistors: experiment and model", russian microelectronics, vol. 43, no. 2, pp. 139-147, 2014.
[12] o. kalashnikov, a. nikiforov, "tid behavior of complex multifunctional vlsi devices", in proc. of the international conference on microelectronics (miel2014), 2014, pp. 455-458.
[13] a. akhmetov, d. boychenko, d. bobrovskiy, et al., "system on module total ionizing dose distribution modeling", in proc. of the international conference on microelectronics (miel2014), 2014, pp. 329-331.
[14] a. chumakov, a. vasil'ev, a. yanenko, et al., "single-event-effect prediction for ics in a space environment", russian microelectronics, vol. 39, no. 2, pp. 74-78, 2010.

facta universitatis series: electronics and energetics vol. 33, no 1, march 2020, pp.
61-71 https://doi.org/10.2298/fuee2001061n © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

parallel-strip line stub resonator for permittivity characterization

dušan a. nešić 1, ivana radnović 2
1 centre of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, serbia
2 institute imtel komunikacije a.d, belgrade, serbia

abstract. a new type of microwave permittivity sensor with a short open stub as a resonator is introduced. the open stub is realized as a double-sided parallel-strip line without a substrate and can be totally immersed into the measured material. it provides high sensitivity: the resonant frequency shift is nearly proportional to the ratio of the square roots of the dielectric constants of the measured materials. the sensor is tested in two different frequency ranges and for two different dielectric constant ranges (oils and ethanol-water mixture). its fabrication requires no additional technological processes such as vias, air-bridges or defected ground structures. the presented sensor is designed, fabricated and tested, showing good agreement between simulations and measurements.

key words: microwave sensor, microstrip, double-sided parallel-strip line, permittivity measurement.

1. introduction

microwave sensors are being increasingly used as sensing components in many applications [1]. they are sensitive, able to survive overdrives, and their signal can be directly transmitted over a distance [2]. one type of microwave sensor is a resonant sensor. a great advantage of this type of sensor is that its principle of operation is based on the resonance frequency and is generally immune to environmental noise. besides, the use of planar technology enables easy, fast and inexpensive fabrication. the planar microwave fabrication process finds wide application in planar structures such as microstrip, cpw and strip line [1,3].
accordingly, a microwave microstrip resonator is a good choice for a sensor [4-9]. the location of the material under test (mut) is usually above the microstrip line [4,9], under the pattern etched in the microstrip ground plane [5,6] or above the coupling area of coupled microstrip structures [7,8]. however, there is one main problem: the sensitivity depends on the extent of the field penetration inside the mut [3]. in all three mentioned positions of the mut only a part of the field lines is inside the mut, because the field lines in microstrip are predominantly concentrated within the substrate, as presented in fig. 1.

received march 4, 2019; received in revised form august 21, 2019
corresponding author: dušan a. nešić, centre of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, njegoseva 12, 11000 belgrade, serbia (e-mail: nesicad@nanosys.ihtm.bg.ac.rs)
d. a. nešić, i. radnović

fig. 1 electric (e) and magnetic (h) field lines in microstrip are stronger within the substrate. material under test (mut) is usually above the substrate in the lower field region. gray areas represent metallization

it is obvious that locating the mut inside the substrate results in a higher sensitivity [3]. indeed, one can insert the mut (i.e. a fluid) through the substrate [10, 11]. this solution is inconvenient, especially in cases where thin substrates are used, and is suitable only for microfluids. another solution is a double-sided parallel-strip line printed on dielectric pipes for fluid testing [12]; it is appropriate for pipes but not for immersing a stub into a fluid. also, the resonance occurs at low frequencies, and the open stubs are in this case too long (around 25 cm). some analogy with a coaxial open stub is given in [4]. its resonance is also at low frequencies, so the open stub is too long (around 33 cm) and is not practical for a number of applications.
besides, it is tested only for high dielectric constants. a microstrip sensor for immersing into a fluid is presented in [5]. its disadvantages are the construction and the protection problems during measurements. one solution to the problems from [5] is the use of substrate integrated waveguide (siw) technology [13]. however, the disadvantage of the solution presented in [13] is the great number of vias in the siw technology.

in this paper a new type of modified microstrip λ/4 open stub resonant sensor is introduced. it is suitable for immersing into a fluid and has a short open stub (~20 mm). the whole structure is in the form of a double-sided parallel-strip line [14,15], i.e. a t-junction with an open stub without a substrate as a sensing part, fig. 2. the pair of two symmetrical metal strips without a substrate represents the sensing part of the stub. double-sided parallel-strip line technology is chosen in order to obtain such a sensing structure. the absence of a substrate enables each stub strip to be totally surrounded by the mut. consequently, the total field around the stub strips is inside the mut, which naturally produces higher sensitivity. the sensing stub can be simply immersed into the mut without any additional preparation or use of any auxiliary structure such as a cavity. a sharp stopband always exists and the resonant frequency can be clearly measured.

fig. 2 basic layout of a double-sided parallel-strip line t-junction with an open stub without substrate

an open stub is a well-known resonator. the first resonance of an open shunt stub occurs when the stub length is a quarter of the guided wavelength:

$$l = \frac{\lambda_{gr}}{4} = \frac{\lambda_0}{4}\,\frac{1}{\sqrt{\varepsilon_{reff}}}, \qquad f_r = \frac{c}{4l}\,\frac{1}{\sqrt{\varepsilon_{reff}}}, \qquad \varepsilon_{reff} = \left(\frac{c}{4\,l\,f_r}\right)^2, \tag{1}$$

where λgr is the guided resonant wavelength, λ0 is the free space wavelength, ɛreff is the effective dielectric constant, l is the length of the open stub, fr is the resonant frequency and c is the speed of light.
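as a quick numerical illustration of eq. (1), the sketch below evaluates the quarter-wave resonance of an open stub and inverts it to recover ɛreff from a resonant frequency. treating the 20 mm stub as one uniform line that is fully surrounded by the material is a simplifying assumption made here for illustration, and the function names are hypothetical; this is not the authors' simulation model.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def stub_resonant_frequency(length_m, eps_reff):
    """First resonance of an open shunt stub, eq. (1): f_r = c / (4 l sqrt(eps_reff))."""
    return C / (4.0 * length_m * math.sqrt(eps_reff))


def eps_reff_from_resonance(length_m, f_r_hz):
    """Inverse of eq. (1): eps_reff = (c / (4 l f_r))**2."""
    return (C / (4.0 * length_m * f_r_hz)) ** 2


STUB_LENGTH = 20e-3  # m, assumed uniform stub length

f_air = stub_resonant_frequency(STUB_LENGTH, 1.0)
f_eps4 = stub_resonant_frequency(STUB_LENGTH, 4.0)

print(f"stub in air:        {f_air / 1e9:.3f} GHz")
print(f"stub in eps_r = 4:  {f_eps4 / 1e9:.3f} GHz")
print(f"frequency ratio:    {f_air / f_eps4:.2f}")
print(f"recovered eps_reff: {eps_reff_from_resonance(STUB_LENGTH, f_eps4):.2f}")
```

for a 20 mm stub in air the first resonance falls at about 3.75 ghz, and quadrupling the dielectric constant halves the resonant frequency, which is the square-root dependence that the sensor relies on.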
in the microstrip structure ɛreff mainly depends on the dielectric constant ɛr of the microstrip dielectric substrate, because the field lines of the microstrip are predominantly concentrated within the substrate, as presented in fig. 1. the goal of the paper is to use an open stub without a substrate, in which case the material under test totally fills both the area surrounding the strips and the area commonly occupied by the substrate. in that case ɛreff ≈ ɛr-mut, which yields high sensitivity. the ideal sensitivity, expressed as the shift of the open stub resonant frequency, is equal to the ratio of the square roots of the dielectric constants of the measured materials, eq. (1). the proposed sensor is fabricated in microstrip printed planar technology without any additional technological process such as vias, air-bridges, defected ground structures (dgs) or the many vias needed for substrate integrated waveguide (siw). the sensor was realized in an easy way using a standard photolithographic procedure. besides, the sensor dimensions are well within technological tolerances.

2. design and fabrication

as mentioned previously, the structure is designed in printed planar technology as a double-sided parallel-strip line t-junction. the objective of the design was to fabricate the t-junction with an open stub without a substrate. owing to fabrication constraints, the realized structure is somewhat different from the basic ideal model shown in fig. 2. photos of both sides of the fabricated structure are displayed in fig. 3. the main part of the proposed structure is realized on cuclad 217 substrate (with relative dielectric constant ɛr = 2.17 and thickness h = 1.143 mm) as a double-sided parallel-strip line t-junction. layouts of the bottom and the top parts of the structure are presented in fig. 4 and are denoted by gray and black color, respectively.
the structure consists of a 4.5 mm wide 50 Ω double-sided parallel-strip line with a double-sided parallel-strip line open stub in the middle, which is 4.75 mm long and 4.5 mm wide, as shown in fig. 4. the part of the stub printed on the dielectric substrate serves for bonding the rigid metal strips (a in fig. 3) on both sides, while the distance between the strips is the same as the thickness of the substrate (1.143 mm). since the structure is designed as a symmetrical (balanced) microstrip line, there has to be a transition (balun) to an unsymmetrical (conventional) 50 Ω microstrip line at both of its ports [15]. in our case, for the dielectric substrate used, the width of this 50 Ω line is 3.5 mm. the width of the ground plane area at the sma connector location is 14 mm. rigid metal strips, 20 mm long, 4.5 mm wide and 0.3 mm thick, are bonded (with a conventional eutectic alloy) to the 4.75 mm long stubs (a in fig. 3) on both sides of the substrate. the free parts of the rigid metal strips form the 15.25 mm long part of the open stub without a substrate (b in fig. 3).

fig. 3 photograph of the proposed microwave sensor with sma connectors: a) bottom side, b) top side. a – part of the metal strip on the substrate; b – part of the metal strip without the substrate

fig. 4 layout of the bottom (gray) and the top (black) side metallization of the proposed double-sided parallel-strip line t-junction with a balun transition to the conventional microstrip line at both ports

3. simulation

the main problem is that the open stub consists of two segments. the shorter part of the stub (part a in fig. 5) is printed on the substrate and cannot be immersed in the mut. it is treated as a common double-sided parallel-strip line on a substrate. part b (fig. 5) is immersed into the mut so as to be totally surrounded by it.
simulations were carried out using the 3d wipl-d microwave pro program package [16].

fig. 5 sketch of the open stub resonator and its wipl-d simulation model: a) segments of the open stub, b) wipl-d pro simulation model. a – segment of the stub printed on the substrate (4.75 mm); b – segment of the stub without the substrate immersed in the mut (15.25 mm)

the wipl-d simulation model is presented in fig. 5b. simulation results are obtained for two specific ranges of the relative dielectric constant. the first is for ɛr from 1.5 to 3, specific for oils, while the second is for ɛr from 20 to 80, specific for water-ethanol mixtures. for the water-ethanol mixture the parameters are taken from [17]. the high imaginary parts of ɛr are taken from [17] in order to calculate realistic resonant frequencies for the measured frequency range (ethanol 70%: 39.5 - i7 and ethanol 96%: 22 - i11). the relative dielectric constant ɛr-mut as a function of the resonant open stub frequency is presented in the diagrams in fig. 6, fig. 7 and fig. 11. for the reference, air (ɛr = 1), the simulated resonant frequency is 3.74 ghz.

fig. 6 simulated diagrams for the first specific range of the mut relative dielectric constants (1.5-3.0) vs. the resonant frequencies

fig. 7 simulated diagrams for the second specific range of the mut relative dielectric constants (20-80) vs. the resonant frequencies

4. measurement

the measurements are performed in the steady state at a temperature of around 300 k in order to obtain stable results. the measurement setup with the sensing open stub and the container with the mut is presented in fig. 8. the container itself, shown in fig. 8, introduces a negligible frequency shift. the transmission coefficient (s21) of the proposed structure is measured using the agilent technologies network analyzer n5227a. several materials were tested: air, gasoline (medical), paraffin oil and sunflower oil, as well as water and ethanol.
the measured s21 parameters in both cases are presented in figures 9, 10 and 12, respectively.

fig. 8 measurement setup with the sensing open stub and the container. a – segment of the stub printed on the substrate; b – segment of the stub without the substrate to be immersed into the mut

fig. 9 measured s21 coefficient of various mut

fig. 10 measured s21 coefficient of water and ethanol

fig. 11 ethanol 96% simulated s21 coefficient (parameters from [17])

fig. 12 ethanol 96% measured s21 coefficient (s21 [db] vs. f [ghz])

according to the diagrams presented in fig. 6 and fig. 7, the ɛr-mut values (and measured resonant frequencies) are: gasoline-medical (2.755 ghz) ɛr = 1.90; paraffin oil (2.584 ghz) ɛr = 2.16; sunflower oil (2.4 ghz) ɛr = 2.5; water (0.449 ghz) ɛr = 73; diluted ethanol 35% (0.49 ghz) ɛr = 61; ethanol 70% v/v (0.629 ghz) ɛr = 37; and ethanol 96% v/v (0.787 ghz) ɛr = 22. for air (3.74 ghz), ɛr = 1. all results reasonably match values from the available references [17-21], as shown in table 1. agreement between simulation and measurement can be tested by comparing the s21 parameters for ethanol 96% from the simulation in fig. 11 and from the measurement in fig. 12.

the loss tangent tan(δ) is extracted (from the -3 db frequency range) according to [22], using the relation between the quality factor q and tan(δ) and the contribution of the mut part to the entire electrical length of the open stub. the authors assume that tan(δ) of the cuclad 217 substrate as well as tan(δ) of the rigid metal strips in the air are negligible compared to the tan(δ) of the mut. the proposed estimation gives a somewhat higher tan(δ) of the mut (conservative version).
the tan(δ) of the mut is estimated from the influence of the mut on the resonator and is slightly higher than the measured tan(δ), because only the longer part of the open stub is in the mut:

$$\tan(\delta)_{mut} = \tan(\delta)_{meas}\left(1 + \frac{d_{shorter}\sqrt{\varepsilon_{reff}}}{d_{mut}\sqrt{\varepsilon_{mut}}}\right) \tag{2}$$

table 1 results
mut | measured ɛr (fr) | tan(δ) | reference | reference ɛr (error %) | reference tan(δ) (error %)
gasoline-medical | 1.90 ±0.003 (2.755 ghz) | 0.015 | [18] | 2.0 (5 %) | 0.015 (1 %)
paraffin oil | 2.16 ±0.018 (2.584 ghz) | 0.013 | [19] | 2.2 (2 %) |
sunflower oil | 2.50 ±0.005 (2.4 ghz) | 0.08 | [20] | 2.56 (3 %) | 0.128 (38 %)
water # | 73.0 ±3.8 (0.449 ghz) | 0.05 | [21] | 76.0 (4 %) | 0.026 (90 %)
ethanol 35% | 61.0 ±2.6 (0.49 ghz) | 0.064 | [17] | 58.9 (4 %) | 0.07 (9 %)
ethanol 70% | 37.0 ±1.2 (0.629 ghz) | 0.186 | [17] | 39.5 (7 %) | 0.177 (5 %)
ethanol 96% | 22.0 ±1.0 (0.787 ghz) | 0.53 | [17] | 22.0 (1 %) | 0.5 (6 %)
# tap water – water from the regular water supply

5. discussion

the sensor is tested for two dielectric constant ranges and two frequency ranges (oils and ethanol-water mixture). the frequency shift between two measured materials is close to the ratio of the square roots of their relative dielectric constants ɛr for both ranges. for example, the ratio between the air and the water resonant frequencies is around 8.3, and the ratio between the square roots of the water and the air dielectric constants is around 8.5. for gasoline these ratios are 1.36 and 1.38, respectively. the sensing part of the open stub is relatively short (15.25 mm) and can be immersed into a small container. the measurement errors are calculated according to the frequency step in the measurement process, table 1. the deviations from the values in references [17-21] are given in percentages [%]. the errors are high for tan(δ) of the sunflower oil and water because the composition of the sunflower oil and of the water from the regular water supply is not well defined, which affects tan(δ) in particular.
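the ratios quoted above can be reproduced directly from the measured resonant frequencies and extracted dielectric constants. the short sketch below does this for water and gasoline against air; the dictionary layout and function names are illustrative assumptions, while the numbers are the measured values reported in the text and in table 1.

```python
import math

# (measured resonant frequency in GHz, extracted relative dielectric constant)
MEASURED = {
    "air": (3.74, 1.0),
    "gasoline": (2.755, 1.90),
    "water": (0.449, 73.0),
}


def frequency_ratio(reference, mut):
    """Ratio of the reference resonant frequency to that of the MUT."""
    return MEASURED[reference][0] / MEASURED[mut][0]


def sqrt_permittivity_ratio(reference, mut):
    """Ratio of the square roots of the MUT and reference dielectric constants."""
    return math.sqrt(MEASURED[mut][1] / MEASURED[reference][1])


for mut in ("water", "gasoline"):
    f_ratio = frequency_ratio("air", mut)
    e_ratio = sqrt_permittivity_ratio("air", mut)
    print(f"{mut}: frequency ratio {f_ratio:.2f}, sqrt(eps) ratio {e_ratio:.2f}")
```

the two ratios differ by a few percent for both materials, which is the sense in which the frequency shift is "close to" the ideal square-root relation of eq. (1).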
the relative sensitivities for both dielectric constant ranges, (1.5-3.0) and (20-80), are given in fig. 13 and fig. 14, respectively. the resolution depends on the frequency step and on the dielectric constant range.

fig. 13 relative sensitivity for the first specific range of the mut relative dielectric constants (1.5-3.0) vs. dielectric constant

fig. 14 relative sensitivity for the second specific range of the mut relative dielectric constants (20-80) vs. dielectric constant

the second group of resonant frequencies in fig. 10 comes from the second resonant bandgap of the open stub (3 times the first resonance). the second resonances are somewhat shifted and have wider bandgaps. the reason is the lower dielectric constant and higher tan(δ) at higher frequencies [17, 21].

6. conclusion

the paper introduces a new type of microwave resonant sensor realized as a t-junction with an open stub as a sensing part. the sensing part of the stub is a pair of metal strips in the form of a double-sided parallel-strip line without a substrate. the absence of the substrate enables each stub strip to be totally surrounded by the mut. the frequency shift between two measured materials is close to the ratio of the square roots of their relative dielectric constants ɛr-mut. the proposed sensor is fabricated in planar technology without dimension tolerance problems: the narrowest line width is 3.5 mm, which is much wider than typical photolithographic manufacturing tolerances (around 30 microns). the sensing open stub is short (15.25 mm), but still significantly longer than common tolerances. there are no additional technological processes such as vias, air-bridges, defected ground structures (dgs) or the great number of vias required in substrate integrated waveguide (siw) technology. the only additional process is the bonding of the rigid metal strips to the microstrip line on the substrate.
the sensing stub can be simply immersed into the mut without any additional preparation or use of auxiliary structures such as a cavity. the sensor is suitable for distinguishing between muts, especially mixture concentrations such as water and ethanol mixtures. the presented sensor is tested for two dielectric constant ranges, oils (1.5-3) and ethanol-water mixtures (20-80), and in two frequency ranges, around 2 ghz and below 1 ghz, respectively. in both cases the frequency shift between two measured materials is closely proportional to the ratio of the square roots of their relative dielectric constants. all results reasonably match values from the available references.

acknowledgment: the authors would like to thank colleagues m. pesic, n. tasic, lj. radovic, n. popovic and p. manojlovic from the institute imtel for their help in the realization, and professor m. potrebic from the university of belgrade, school of electrical engineering, for her assistance in performing the measurements. this work was funded by the serbian ministry of education and science within the project tr 32008.

references
[1] s. dey, j.k. saha, and n.c. karmakar, "smart sensing", ieee microwave magazine, pp. 26-39, november 2015.
[2] j. polivka, "an overview of microwave sensor technology", high frequency electronics, pp. 32-42, april 2007.
[3] k. saeed, m. f. shafique, m. b. byrne and i. c. hunter (2012). planar microwave sensors for complex permittivity characterization of materials and their applications, applied measurement systems, prof. zahurul haq (ed.), intech.
[4] a. hoog, m.j.j. mayer, h. miedema, w. olthuis, f.b.j. leferink and a. van den berg, "modeling and simulations of the amplitude-frequency response of transmission line type resonators filled with lossy dielectric fluids", sensors and actuators a, vol. 216, pp. 147-157, 2014.
[5] c. liu and y.
pu, "a microstrip resonator with slotted ground plane for complex permittivity measurements of liquid", ieee microwave and wireless components letters, vol. 18, no. 4, pp. 257259, 2008. [6] c.-s. lee and c.-l. yang, "complementary split-ring resonators for measuring, dielectric constants and loss tangents“, ieee microwave and wireless components letters, vol. 24, no. 8, pp. 563-565, 2014. [7] a. a. abduljabar, d. j. rowe, a. porch, and d. a. barrow, "novel microwave microfluidic sensor using a microstrip split-ring resonator", ieee transactions on microwave theory and techniques, vol. 62, no. 3, pp. 679-688, 2014. [8] m. t. jilani, w. p. wen, l. y. cheong, m. z. u. rehman, and m. t. khan, "determination of sizeindependent effective permittivity of an overlay material using microstrip ring resonator", microwave and optical technology letters, vol. 58, no. 1, pp. 4-9, 2016. [9] lescopa, f. galléeb, s. riouala, "development of a radio frequency resonator for monitoring water diffusion in organic coatings", sensors and actuators a, vol. 247, pp. 30-36, 2016. [10] l. le cloirec, a. benlarbi-delaï and b. bocque, "new concept of rf functions by microfluidic coupling", microwave and optical technology letters, vol. 48, no. 10, pp. 1912-1916, 2006. [11] d.l. diedhiou, r. sauleau, and a.v. boriskin, "microfluidically tunable microstrip filters", ieee transactions on microwave theory and techniques, vol. 63, no. 7, pp. 2245-2252, 2015. [12] m. a. karimi, m. arsalan and a. shamim, "low cost and pipe conformable microwave-based water-cut sensor", ieee sensors journal, vol. 16, no. 21, pp. 7636-7645, 2016. [13] c. liu and f. tong, an siw resonator sensor for liquid permittivity measurements at c band, ieee microwave wireless components letters, vol. 25, no. 11, pp. 751-753, 2015. [14] s.-g. kim and k. chang, "ultrawide-band transitions and new microwave components using doublesided parallel-strip lines", ieee transactions on microwave theory and techniques, vol. 52, no. 9, p. 
2148, september 2004.
[15] j.-x. chen, c.-h. k. chin and q. xue, "double-sided parallel-strip line with an inserted conductor plane and its applications", ieee transactions on microwave theory and techniques, vol. 55, no. 9, p. 1899, 2007.
[16] 3d wipl-d microwave pro program package.
[17] a. megriche, a. belhadj and a. mgaidi, "microwave dielectric properties of binary solvent water-alcohol, alcohol-alcohol mixtures at temperatures between -35°c and +35°c and dielectric relaxation studies", mediterranean journal of chemistry, vol. 1, no. 4, pp. 200-209, 2012.
[18] f. s. jafari, j. ahmadi-shokouh, "reconfigurable microwave siw sensor based on pbg structure for high accuracy permittivity characterization of industrial liquids", sensors and actuators a, vol. 283, pp. 386-395, 2018.
[19] https://www.engineeringtoolbox.com/relative-permittivity-d_1660.html.
[20] j. vrba and d. vrba, "temperature and frequency dependent empirical models of dielectric properties of sunflower and olive oil", radioengineering, vol. 22, no. 4, pp. 1281-1287, 2013.
[21] martin chaplin, water and microwaves, http://www1.lsbu.ac.uk/water/microwave_water.htm.
[22] a. r. fulford and s. m. wentworth, "conductor and dielectric property extraction using microstrip tee resonators", microwave and optical technology letters, vol. 47, no. 1, pp. 14-16, 2005.

facta universitatis series: electronics and energetics vol. 33, no 1, march 2020, pp. 27-36
https://doi.org/10.2298/fuee2001027c

hybrid neural lumped element approach in inverse modeling of rf mems switches

tomislav ćirić 1, zlatica marinković 1, rohan dhuri 2, olivera pronić-rančić 1, vera marković 1
1 university of niš, faculty of electronic engineering, niš, serbia
2 alten gmbh, munich, germany

abstract. rf mems switches have been efficiently exploited in various applications in communication systems. as the dimensions of the switch bridge influence the switch behaviour, during the design of a switch it is necessary to perform inverse modeling, i.e. to determine the bridge dimensions that ensure the desired switch characteristics, such as the resonant frequency. in this paper a novel inverse modeling approach based on a combination of artificial neural networks and a lumped element circuit model is considered. this approach allows determination of the bridge fingered part length for the given resonant frequency and the bridge solid part length, generating at the same time the values of the elements of the switch lumped element model. validity of the model is demonstrated by appropriate numerical examples.

key words: artificial neural networks, inverse modeling, lumped element model, rf mems switch.

1. introduction

radio-frequency micro-electro-mechanical systems (rf mems) components have proven to be of great importance for rf circuits and subsystems, as they possess characteristics that may surpass those of conventional, purely electrical components. rf mems devices consist of moving sub-millimeter-sized parts that provide radio frequency functionality. they offer high linearity, low insertion loss and extremely good intermodulation performance. mems devices have the ability to sense, control and actuate on the micro scale, and generate effects on the macro scale.
owing to the variety and diversity of rf mems technology functionalities, these devices have wide applicability in the new generation of communication system components, such as switches and varactors (variable capacitors), resonators, complex networks, reconfigurable filters, phase shifters, impedance matching tuners and programmable step attenuators [1-9]. recently, rf mems technology has found applications in the internet of things (iot), internet of everything (ioe), tactile internet and 5g telecommunications [10-12].

received february 20, 2019; received in revised form september 10, 2019
corresponding author: tomislav ćirić, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: cirict@live.com)

design of the circuits containing rf mems switches requires repeated simulations and/or optimizations of the switch characteristics. therefore, there is a need for reliable rf mems models. switch electrical characteristics can be accurately determined in full-wave electromagnetic simulators [13-15]. however, as the simulation models are quite complex and the simulations consume a significant amount of time, a common way to overcome these problems is the use of lumped element models in circuit simulators [16, 17]. the lumped element models based on equivalent circuits are faster than the full-wave ones. however, if differently sized bridges are analyzed, the procedures for obtaining the equivalent circuit elements have to be repeated, which is a time-consuming process. to make the lumped element model scalable with the dimensions, artificial neural networks (anns) were proposed to model the dependence of the lumped element model on the switch bridge dimensions [18]. the switch bridge dimensions determine the electromagnetic characteristics of the switch.
therefore, during the design of a switch, it is necessary to determine the bridge dimensions that ensure the desired switch characteristics, such as the resonant frequency, i.e. to perform inverse modeling of the switch. the authors of this work earlier proposed black-box inverse modeling of rf mems capacitive switches, where the bridge lateral dimensions were determined for given electrical or mechanical characteristics [19-25]. in this work, the neural based inverse modeling approach is extended so that the novel approach provides not only the bridge dimensions but also the values of the corresponding lumped model elements, resulting in a lumped element model ready to be used for further simulations of circuits containing the considered switch. the paper is organized as follows: after the introduction, a description of the considered rf mems switch is given in section 2. the proposed modeling approach is described in section 3. details of the model development and validation and the most illustrative numerical results are given in section 4. section 5 contains the conclusions.

2. rf mems capacitive switches

mems are integrated devices consisting of micromechanical and electronic components. rf mems switches are specific micromechanical switches designed to operate at rf to mm-wave frequencies. rf mems switches use mechanical movement of the bridge to achieve a closed or open circuit in the rf transmission line. rf mems classification depends on the type of actuation, deflection axis, contact type, circuit configuration, and structure configuration. the considered device is a coplanar waveguide (cpw) based rf mems capacitive shunt switch (fig. 1(a)) designed at fondazione bruno kessler (fbk) in trento in an 8-layer silicon micromachining process [26-28]. the device is fabricated on a silicon substrate with silicon dioxide (sio2) as the insulator.
the bridge is a thin gold (au) membrane connecting both sides of the ground plane with defined lateral dimensions (length of the fingered part, lf, and length of the solid part, ls). the signal line is a thin aluminum layer, placed below the bridge. on the opposite sides of the signal line, the dc actuation pads made of polysilicon are placed. when the actuation voltage is applied to the electrodes, the electrostatic force exceeds the mechanical restoring force, causing the membrane to pull down towards the ground plane and switch the circuit [26].

fig. 1 (a) top-view of the realized switch and schematic of the cross-section [26] and (b) equivalent circuit of the rf mems switch rlc lumped element model

the inductance of the bridge and the fixed capacitance between the signal line and the bridge create a resonant circuit to the ground. the resonant frequency can be adjusted by varying the bridge lateral dimensions. at the series resonance, the circuit acts as a short circuit to the ground. in a certain frequency band around the resonant frequency the transmission of the signal is suppressed. an rf mems switch can be represented by a simplified equivalent circuit model, as shown in fig. 1(b). it consists of the resistance r, the inductance l and the capacitance c. two coplanar waveguide lines, cpw1 and cpw2, are added with the aim of matching the obtained s-parameters with the s-parameters obtained by a full-wave analysis, having in mind that the reference planes for simulation and measurement are usually not defined directly at the membrane but at a distance from it [26]. the switch resonant frequency is

$$f_{res} = \frac{1}{2\pi\sqrt{LC}}. \qquad (1)$$

the switch capacitance in the membrane down-state case, considered in this work, is calculated from the layout using the following expression [3]:

$$C = \varepsilon_0 \varepsilon_r \frac{A}{t_d}, \qquad (2)$$

where $\varepsilon_0$ is the dielectric permittivity, $\varepsilon_r$ is the relative dielectric permittivity, $t_d$ is the distance between the two plates forming the capacitor and $A$ is the surface of the plates. the capacitance is constant, because it does not depend on the bridge lateral dimensions. the other two elements, r and l, depend on the bridge lateral dimensions ls and lf. they can be obtained simultaneously by optimizations in a circuit simulator aimed to achieve the desired values of the equivalent circuit s-parameters. alternatively, the inductance can be determined from the given resonant frequency as:

$$L = \frac{1}{4\pi^2 f_{res}^2 C}, \qquad (3)$$

and then only the resistance is to be obtained by optimization in a circuit simulator. it should be noted that once the capacitance is determined, the extraction of the resistance and the inductance should be repeated for each considered combination of bridge dimensions, which always requires new full-wave simulations to provide inputs for optimization. following the approach proposed in [26], the lumped element scalability with the bridge lateral dimensions can be introduced by means of artificial neural networks, as will be described in the next section.

3. proposed inverse modeling approach

to determine the switch lateral dimensions for the desired resonant frequency, and simultaneously to determine the corresponding equivalent circuit elements, a new inverse modeling approach is proposed in this work. the proposed approach is a hybrid approach combining neural modeling with a lumped element equivalent circuit. in other words, it is a combination of the black-box neural inverse modeling approach [19-21] and a modification of the scalable lumped element model proposed in [18].
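the lumped-element relations of section 2 (eqs. 1-3) can be checked numerically with a few lines of python. the sketch below is illustrative only and is not the authors' implementation: the function names are hypothetical, the vacuum permittivity is rounded to 8.85e-12 f/m (an assumption chosen because it reproduces the capacitance quoted in section 4), and the s21 expression treats the switch as an ideal series rlc branch shunting a 50 Ω line, leaving out the cpw1 and cpw2 sections of fig. 1(b).

```python
import math

EPS0 = 8.85e-12  # F/m, rounded vacuum permittivity (assumption, see lead-in)


def plate_capacitance(eps_r, area_m2, gap_m, eps0=EPS0):
    """Eq. (2): parallel-plate down-state capacitance in farads."""
    return eps0 * eps_r * area_m2 / gap_m


def inductance_from_resonance(f_res_hz, c_farad):
    """Eq. (3): inductance that resonates with c_farad at f_res_hz."""
    return 1.0 / (4.0 * math.pi ** 2 * f_res_hz ** 2 * c_farad)


def resonant_frequency(l_henry, c_farad):
    """Eq. (1): series resonant frequency in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def s21_shunt_rlc(f_hz, r_ohm, l_henry, c_farad, z0=50.0):
    """S21 of a series RLC branch to ground on a z0 line (CPW sections omitted)."""
    w = 2.0 * math.pi * f_hz
    z = complex(r_ohm, w * l_henry - 1.0 / (w * c_farad))
    return 2.0 * z / (2.0 * z + z0)


# Values quoted in section 4: eps_r = 3.9, A = 13000 um^2, td = 0.1 um,
# and a resonant frequency of 22.78 GHz for one of the test devices.
c = plate_capacitance(3.9, 13000e-12, 0.1e-6)
l = inductance_from_resonance(22.78e9, c)
print(f"C = {c * 1e12:.5f} pF, L = {l * 1e12:.3f} pH")
```

with these inputs the script gives a capacitance and inductance in line with the 4.48695 pf and 10.879 ph reported in section 4, and the magnitude of `s21_shunt_rlc` has its minimum at the resonant frequency, which is how the resonance is read from the simulated s21.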
a schematic diagram of the proposed model is shown in fig. 2. the aim of the first ann (ann 1) is to determine the length of the fingered part lf for the desired resonant frequency [19, 20, 22]. as described in the previous work, because different combinations of the bridge solid and fingered parts' lengths may lead to the same resonant frequency value, it is not possible to use this approach to determine ls and lf simultaneously. instead, the length of the solid part is considered as an inverse model input beside the resonant frequency. the second ann (ann 2) is used for modeling the relationship between the resistance and the bridge lateral dimensions ls and lf. unlike the model considered in [18], where the inductance dependence on the dimensions is also modeled by the ann, in the considered case the resonant frequency is known, so it is possible to calculate the inductance by using eq. 3, assuming that the capacitance, which is constant and does not depend on the bridge lateral dimensions, has been determined previously. therefore the value calculated by eq. 2 is directly assigned to the capacitor in the equivalent circuit.

fig. 2 proposed inverse modeling approach

the used anns are multilayered anns having one input layer, one output layer and one or more hidden layers [1]. both anns have two input neurons and one output neuron. the inputs of ann 1 correspond to the bridge solid part length ls and the resonant frequency fres, and the output corresponds to the bridge fingered part length lf. for the training and validation of ann 1 it is necessary to have a set of samples consisting of combinations of dimensions and the corresponding values of the resonant frequency.
that implies that simulations of the s-parameters in a full-wave simulator should be performed for each combination of the dimensions, with the resonant frequency determined as the frequency corresponding to the minimum value of the s21 magnitude. the ann 2 inputs correspond to the bridge lateral dimensions ls and lf, whereas the output corresponds to the equivalent circuit resistance r. the training samples consist of combinations of the two considered lateral dimensions and the corresponding resistances. values of the resistance used for training are determined by optimization in a circuit simulator for the previously calculated capacitance and inductance, as described in section 2. the flow chart describing the development of the proposed model is shown in fig. 3. the optimization goal is to match the resonant frequency (i.e. all scattering parameters) simulated by the equivalent circuit with that simulated in the full-wave simulator for the given combination of the dimensions. the implementation of the anns in the equivalent circuit is done as follows. each ann is represented by a set of mathematical expressions describing the ann transfer function. the expressions corresponding to the developed anns are implemented by means of variable and equation (var) blocks on the equivalent circuit schematic. the inputs and outputs of a var block are the same as the inputs and outputs of the corresponding ann. the output of the var block corresponding to ann 1 is led to the input of the var block corresponding to ann 2, whose output is further assigned to the resistance of the equivalent circuit.

fig. 3 model development flow chart

the developed inverse model does not require additional simulations in the full-wave simulator or additional optimizations.
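the "set of mathematical expressions describing the ann transfer function" and the chaining of ann 1 into ann 2 can be pictured with the following python sketch. it is an illustration under stated assumptions, not the authors' var-block code: the tanh hidden activation and linear output are assumed (the text does not state the activation functions), the weights are random placeholders rather than trained values, input and output scaling is omitted, and the names `mlp_forward`, `random_layers` and `inverse_model` are hypothetical. the layer sizes 2-15-15-1 and 2-4-8-1 are the structures reported in section 4.

```python
import numpy as np


def mlp_forward(x, layers):
    """Evaluate a feed-forward ANN given as [(W, b), ...].

    Hidden layers use tanh and the output layer is linear (both assumed).
    """
    a = np.asarray(x, dtype=float)
    for w, b in layers[:-1]:
        a = np.tanh(w @ a + b)
    w, b = layers[-1]
    return w @ a + b


def random_layers(sizes, rng):
    """Placeholder weights for an n-h1-...-m network (NOT trained values)."""
    return [
        (rng.standard_normal((n_out, n_in)), rng.standard_normal(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


rng = np.random.default_rng(0)
ann1 = random_layers([2, 15, 15, 1], rng)  # inputs (ls, fres) -> lf
ann2 = random_layers([2, 4, 8, 1], rng)    # inputs (ls, lf)   -> r


def inverse_model(ls, f_res):
    """Chain the two networks: ANN 1 gives lf, which feeds ANN 2 to give r."""
    lf = mlp_forward([ls, f_res], ann1)[0]
    r = mlp_forward([ls, lf], ann2)[0]
    return lf, r


# Inputs are taken as already normalised; the outputs are meaningless
# numerically because the weights are random placeholders.
print(inverse_model(0.2, -0.3))
```

the point of the sketch is the data flow: once the weights are fixed, each network is a closed-form expression, so evaluating the chain needs no further full-wave simulation or optimization.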
for the desired resonant frequency and a given value of the bridge solid part length, by running the s-parameter simulations, it is possible to simultaneously calculate the length of the fingered part, determine the corresponding elements of the equivalent circuit and simulate the s-parameters over the desired frequency range. as all the operations are performed in the circuit simulator, the whole process is done within seconds, which is significantly faster than performing optimizations in a full-wave simulator for determining the dimension and optimizations in a circuit simulator to determine the resistance.

flow chart of fig. 3 (box contents):
- simulate in em full-wave simulator the s-parameters for several combinations of the lateral dimensions
- find the resonant frequency for each combination of the lateral dimensions
- build the ann 1 training and test sets (ls, fres, lf)
- for each combination of the lateral dimensions determine in a circuit simulator the resistance r
- calculate the capacitance c (eq. 2) and the inductance l (eq. 3)
- build the ann 2 training and test sets (ls, lf, r)
- train the ann 1 (networks with different number of hidden neurons) and find the ann with the best accuracy
- train the ann 2 (networks with different number of hidden neurons) and find the ann with the best accuracy
- create mathematical expressions describing ann 1 and ann 2
- implement the expressions on the equivalent circuit schematic

4. numerical results

the proposed inverse model was developed for the following ranges of the switch geometrical parameters: ls from 50 µm to 500 µm, and lf from 0 µm to 100 µm. to prepare the data for the model development, the equivalent circuit elements r, l and c were determined for several different combinations of the lateral dimensions ls and lf. the relative permittivity of silicon dioxide is 3.9 and the dimensions contributing to the capacitance value in the down-state are a = 13000 μm² and td = 0.1 μm.
therefore, by using eq. 2, the calculated capacitance in the down-state is 4.48695 pf. for each combination of ls and lf, first the s-parameters were determined by full-wave simulations in advanced design system (ads) momentum software [29] and the resonant frequency was determined as the minimum of the s21 parameter magnitude. further, the combinations of the lateral dimensions and the resonant frequencies obtained by ads simulations were used for training ann 1. the available dataset was divided into the training set used for the development of the anns and the test set used for the model validation. anns with different numbers of hidden neurons in one or two hidden layers were trained, because a priori determination of the number of hidden neurons is not possible. the networks with the best test results were chosen as the final model. in this paper, the following notation of anns is used: an ann denoted n-h1-h2-m has n input neurons, h1 and h2 neurons in the first and second hidden layer, respectively, and m output neurons. table 1 gives the test results obtained by the best ann 1 (2-15-15-1) for input combinations whose values did not appear in the training set [19, 22].

table 1 rf mems switch inverse modeling results: lf

| ls (µm) | fres (ghz) | lf (target) (µm) | lf (from ann 1) (µm) | lf abs. error (µm) | lf relative error (%) |
|---|---|---|---|---|---|
| 75 | 22.78 | 25 | 24.9 | 0.1 | 0.4 |
| 75 | 19.17 | 65 | 65.4 | 0.4 | 0.6 |
| 75 | 17.92 | 85 | 85.3 | 0.3 | 0.3 |
| 100 | 17.5 | 75 | 73.6 | 1.4 | 1.9 |
| 200 | 13.13 | 85 | 86.8 | 1.8 | 2.1 |
| 350 | 11.67 | 25 | 23.4 | 1.6 | 6.4 |
| 350 | 10.83 | 65 | 62.2 | 2.8 | 4.3 |
| 400 | 10 | 85 | 87.4 | 2.4 | 2.9 |

the relative errors are in most cases less than 3%. however, in all cases the absolute difference between the predicted and expected values is less than 3 µm, which is already close to fabrication tolerances. more details about the development and validation of the mentioned inverse model can be found in [19, 22].
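the error columns of table 1 follow directly from the target and predicted lengths. as a small worked check (a sketch with hypothetical variable names; the numbers are the table 1 entries), the absolute and relative errors can be recomputed as follows:

```python
# (lf target, lf from ANN 1) pairs in micrometres, taken from table 1
pairs = [
    (25, 24.9), (65, 65.4), (85, 85.3), (75, 73.6),
    (85, 86.8), (25, 23.4), (65, 62.2), (85, 87.4),
]

# Absolute error in micrometres and relative error in percent of the target
abs_errors = [abs(pred - target) for target, pred in pairs]
rel_errors = [100.0 * err / target for err, (target, _) in zip(abs_errors, pairs)]

for (target, pred), a, r in zip(pairs, abs_errors, rel_errors):
    print(f"target {target:>3} um, ann {pred:>5} um, abs {a:.1f} um, rel {r:.1f} %")

max_abs_error = max(abs_errors)
n_below_3_percent = sum(r < 3.0 for r in rel_errors)
```

the largest absolute error stays below 3 µm, and six of the eight test cases have a relative error under 3%; the two larger relative errors belong to the ls = 350 µm devices.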
further, the resonant frequency and the capacitance were used to determine the inductance for each combination of ls and lf. the inductance is calculated by using eq. 3 and the results are presented in table 2. in the next step, the neural model for determining the resistance for the given dimensions (ann 2) was developed. the target resistance values were obtained by optimization of the resistance value for each considered combination of the dimensions. cpws of 50 Ω were used. among the trained anns with different numbers of hidden neurons, the best results were obtained by the ann with the structure 2-4-8-1. the resistances obtained by ann 2 for the eight test combinations not used for the network training are shown in table 2.

table 2 extracted equivalent circuit elements

| ls (µm) | lf (from ann 1) (µm) | c (pf) | fres (ghz) | l (ph) | r (from ann 2) (mΩ) |
|---|---|---|---|---|---|
| 75 | 24.9 | 4.48695 | 22.78 | 10.879 | 638.05 |
| 75 | 65.4 | 4.48695 | 19.17 | 15.363 | 739.45 |
| 75 | 85.3 | 4.48695 | 17.92 | 17.581 | 763.26 |
| 100 | 73.6 | 4.48695 | 17.5 | 18.435 | 764.92 |
| 200 | 86.8 | 4.48695 | 13.13 | 32.748 | 857.68 |
| 350 | 23.4 | 4.48695 | 11.67 | 41.455 | 908.49 |
| 350 | 62.2 | 4.48695 | 10.83 | 48.135 | 946.07 |
| 400 | 87.4 | 4.48695 | 10 | 56.457 | 977.75 |

to validate further the proposed hybrid inverse modeling approach, for the test combinations of the bridge dimensions, the calculated c, l and r were assigned to the corresponding equivalent circuit elements, and used for the s-parameter simulation. the comparison of the rf mems switch s-parameters simulated by the equivalent circuit and the s-parameters determined by the ads momentum simulations shows a very good match. as an illustration, in fig. 4 and fig. 5 the insertion loss (|s21| in db) and the return loss (|s11| in db) are shown for two devices with different lateral dimensions: the first one having ls = 100 µm and lf = 75 µm, and the second device with ls = 350 µm and lf = 25 µm.

fig. 4 s11 and s21 of rf mems switch for ls = 100 µm and lf = 75 µm (rlc model – red solid line, full-wave simulations – blue dashed line)

fig. 5 s11 and s21 of rf mems switch for ls = 350 µm and lf = 25 µm (rlc model – red solid line, full-wave simulations – blue dashed line)

as can be seen, in both cases the response of the equivalent circuit is almost identical to the reference response obtained by the full-wave simulations, confirming the accuracy of the proposed approach. the results for the bridge with lateral dimensions ls = 350 µm and lf = 25 µm are included to illustrate the case where ann 1 exhibits the largest relative deviation between modeled and reference values. even in that case, the circuit responses are almost identical and very close to the target values obtained by the full-wave simulations.

5. conclusion

in this paper, a new approach to rf mems capacitive switch inverse modeling has been proposed. it is a hybrid approach combining artificial neural networks and a lumped element equivalent circuit model. the inverse approach proposed earlier by the authors aimed only to determine switch dimensions for the given resonant frequency.
the inverse modeling approach proposed in this paper can be used to determine not only the length of the bridge fingered part necessary to achieve the given resonant frequency for the given value of the bridge solid part length, but also the elements of the switch equivalent circuit, within a circuit simulator. after the anns composing the model have been developed, determination of the bridge fingered part length and the elements of the equivalent circuit is done straightforwardly without additional optimizations, making the process of inverse modeling very time-efficient. according to the obtained results, the accuracy of the determination of the bridge fingered part is within the fabrication tolerances. moreover, the s-parameters simulated by using the equivalent circuit elements obtained by this approach match well the s-parameters obtained by full-wave simulations, confirming the accuracy of the equivalent circuit parameter extraction.

acknowledgement: the work was supported by the projects tr-32052 and iii-43102 of the serbian ministry of education, science and technological development.

references
[1] q. j. zhang, k. c. gupta, neural networks for rf and microwave design, artech house, 2000.
[2] m. gad-el-hak, the mems handbook. florida: crc press, 2002.
[3] g. m. rebeiz, rf mems theory, design, and technology. new york: wiley, 2003.
[4] g. m. rebeiz, j. b. muldavin, "rf mems switches and switch circuits", ieee microw. mag., vol. 2, no. 4, pp. 59-71, december 2001.
[5] y. mafinejad, a. z. kouzani, k. mafinezhad, "determining rf mems switch parameter by neural networks", in proceedings of the ieee region 10 conference tencon 2009, 2009, pp. 1-5.
[6] l. michalas, m. koutsoureli, e. papandreou, a. gantis, g. papaioannou, "a mim capacitor study of dielectric charging for rf mems capacitive switches", facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp.
113-122, 2015. [7] m. koutsoureli, l. michalas, g. papaioannou, “assessment of dielectric charging in micro-electromechanical system capacitive switches”, facta universitatis, series: electronics and energetics, vol. 26, no. 3, pp. 239-245, 2013. [8] a. napieralski, c. maj, m. szermer, p. zajac, w. zabierowski, m. napieralska, ł. starzak, m. zubert, r.kiełbik, p. amrozik, z. ciota, r. ritter, m. kamiński, r. kotas, p. marciniak, b. sakowicz, k. grabowski, w. sankowski, g. jabłoński, d. makowski, a. mielczarek, m. orlikowski, m. jankowski, p. perek, “recent research in vlsi, mems and power devices with practical application to the iter and dream projects”, facta universitatis, series: electronics and energetics, vol. 27, no. 4, pp. 561-588, 2014. [9] i. jokić, m. frantlović, z. đurić, m. dukić, "rf mems/nems resonators for wireless communication systems and adsorption-desorption phase noise", facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 345-381, 2015. [10] j. iannacci and c. tschoban, "rf-mems for future mobile applications: experimental verification of a reconfigurable 8-bit power attenuator up to 110 ghz, " journal of micromechanics and microengineering (iop-jmm), vol. 27, no. 4, pp. 1-11, apr. 2017. [11] j. iannacci, "rf-mems technology as an enabler of 5g: low-loss ohmic switchtested up to 110 ghz", sensors and actuators a, vol. 279, pp. 624-629, 2018. [12] m. donelli, j. iannacci, "exploitation of rf-mems switches for the design of broadband modulated scattering technique wireless sensors", ieee antennas and wireless propagation letters, vol. 18, no. 1, january 2019. [13] e. hamad and a. omar, "an improved two-dimensional coupled electrostatic-mechanical model for rf mems switches", j. micromech. microeng., vol. 16, pp. 1424, 2006. [14] l. 
vietzorreck, "em modeling of rf mems," in proceedings of the 7th international conference on thermal, mechanical and multiphysics simulation and experiments in micro-electronics and microsystems, eurosime 2006, como, italy, april 24-26, 2006, pp.1-4. [15] z. j. guo, n. e. mcgruer and g. g. adams, "modeling, simulation and measurement of the dynamic performance of an ohmic contact, electrostatically actuated rf mems switch", j. micromech. microeng, vol. 17, pp. 1899-1909, 2007. [16] j. iannacci, r. gaddi, a. gnudi, "a experimental validation of mixed electromechanical and electromagnetic modeling of rf-mems devices within a standard ic simulation environment", journal of microelectromechanical systems, vol. 19, no. 3, pp. 526-537, 2010. [17] http://www.coventor/mems-solutions/products/mems [18] t. ćirić, r. dhuri, z. marinković, o. pronić-ranĉić, v. marković, l. vietzorreck, "neural based lumped element model of capacitive rf mems switches", frequenz, vol. 72, no. 11-12, november 2018. [19] z. marinković, t. ćirić, t. kim, l. vietzorreck, o. pronić-ranĉić, m. milijić, v. marković, "ann based inverse modeling of rf mems capacitive switches", in proceedings of the 11th conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2013), serbia, october 16-19, 2013, pp. 366-369. [20] z. marinković, v. marković, t. ćirić, l. vietzorreck, o. pronić-ranĉić, "artifical neural networks in rf mems switch modelling", facta universitatis, series: electronics and energetics, vol. 29, no 2, pp. 177191, 2016. [21] t. ćirić, z. marinković, o. pronić-ranĉić, v. marković, l. vietzorreck, "ann approach for modeling of mechanical characteristics of rf mems capacitive switches an overview", microwave review, vol. 23, no. 1, pp. 25-34, june 2017. [22] z. marinković, t. kim, v. marković, m. milijić, o. pronić-ranĉić, t. ćirić, l. 
vietzorreck, "artificial neural network based design of rf mems capacitive shunt switches", applied computational electromagnetics society (aces) journal , vol. 31 no. 7, pp. 756-764, july 2016. 36 t. ćirić, z. marinković, r. dhuri, o. pronić-ranĉić, v. marković [23] t. ćirić, z. marinković, t. kim, l. vietzorreck, o. pronić-ranĉić, m. milijić, v. marković, "ann based inverse electro-mechanical modeling of rf mems capacitive switches", in proceedings of the xlix scientific conference on information, communication and energy systems and technologies (icest 2014), niš, serbia, june 25-27, 2014, vol. 2, pp. 127-130. [24] z. marinković, a. aleksić, t. ćirić, o. pronić-ranĉić, v. marković, l. vietzorreck, "inverse electromechanical ann model of rf mems capacitive switches-applicability evaluation", in proceedings of the xlx scientific conference on information, communication and energy systems and technologies (icest 2015), sofia, bulgaria, june 24-26, 2015, pp. 157-160. [25] t. ćirić, z. marinković, m. milijić, o. pronić-ranĉić, v. marković, l. vietzorreck, "modeling of actuation voltage of rf mems capacitive switches based on rbf anns", in proceedings of the 13th symposium on neural networks and applications (neurel), belgrade, serbia, november 22-24, 2016, pp. 119-122. [26] s. dinardo, p. farinelli, f. giacomozzi, g. mannocchi, r. marcelli , b. margesin, p. mezzanotte, v. mulloni, p. russer, r. sorrentino, f. vitulli, l. vietzorreck, "broadband rf-mems based spdt", in proceedings of the european microwave conference 2006, manchester, great britain, september 2006. [27] f. giacomozzi, v. mulloni, s. colpo, j. iannacci, b. margesin, a. faes, "a flexible fabrication process for rf mems devices", romanian journal of information science and technology (romjist), vol. 14, no. 3, 2011. [28] d. dubuc, k. grenier, j. 
iannacci, "rf-mems for smart communication systems and future 5g applications", in smart sensors and mems - intelligent sensing devices and microsystems for industrial applications, 2nd edition, editors: s. nihtianov, a. luque, chapter 18, elsevier ltd. amsterdam, nl, pp. 499-539, march 2018.
[29] advanced design system 2009, santa rosa, ca: electronic design automation software system produced by keysight eesof eda.

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 613-619
doi: 10.2298/fuee1404613m

resolving the bias point for wide range of temperature applications in high-k/metal gate nanoscale dg-mosfet

sushanta k. mohapatra, kumar p. pradhan, prasanna k. sahu
nanoelectronics lab., department of electrical engineering, national institute of technology, rourkela, odisha, india

abstract. this article investigates the zero-temperature-coefficient (ztc) bias point and its associated performance metrics of a high-k metal gate (hkmg) dg-mosfet in nanoscale. the ztc bias point is defined as the point at which the device parameters are independent of temperature. the discussion includes subthreshold slope (ss), drain induced barrier lowering (dibl), on-off current ratio (ion/ioff), transconductance (gm), output conductance (gd) and intrinsic gain (av). from the results, it is confirmed that there are two different ztc bias points, one for ids (ztcids) and the other for gm (ztcgm). the points are obtained as ztcids = 0.552 v and ztcgm = 0.410 v, which opens important opportunities in analog circuit design for wide range of temperature applications.

key words: dg-mosfets, hkmg, sces, analog foms, ztc point.

1. introduction

there is growing interest in, and demand for, circuits that operate at high temperatures for use in military, automotive, nuclear and other industries, and such circuits need to be analysed at the nanoscale.
silicon on insulator (soi) based cmos devices have the potential for operation at both low and high temperatures. it is desirable to bias the digital and analog circuits designed for high temperature applications at a point where the v-i characteristics show little or no variation with respect to temperature. this point is typically known as the ztc point [1-5]. previously, shoucair [1] and prijic, et al. [3] have identified the ztc point for bulk cmos in both linear and saturation regions for temperatures between 25 °c and 200 °c. researchers like groeseneken, et al. [4] and jeon, et al. [5] have shown the existence of the ztc point for soi mosfets. osman, et al. [6] presented a systematic analysis of the ztc point for partially depleted (pd) soi mosfets over a wide range of temperatures (25 °c to 300 °c), and identified two distinct ztc points, in the linear as well as in the saturation region. tan, et al. [2] identified that the ztcids exists in both linear and saturation regions, whereas the ztcgm lies only in the saturation region for a fully depleted (fd), lightly doped, enhancement-mode soi n-mosfet.

received july 3, 2014; received in revised form september 14, 2014
corresponding author: sushanta k. mohapatra, nanoelectronics lab., dept. of electrical engineering, national institute of technology, rourkela, 769008, odisha, india (e-mail: skmctc74@gmail.com)

the double gate (dg) mosfet fabricated on soi wafers is one of the most promising candidates due to its attractive features of low leakage currents, high current drivability (ion) and transconductance (gm), reduced short channel effects (sces), steeper subthreshold slopes, and suppression of the latch-up phenomenon [7-10]. it is also a very good option for analog applications [11-14]. hardly any work has been reported on the ztc point for multi-gate technology. beyond a certain vgs, the temperature dependence of id reverses.
this is due to the degradation in mobility or high electric field effect at higher gate bias [2]. as far as we know, this is a unique attempt at a detailed analysis of the ztc point over a wide range of temperatures (100 k-400 k) for analog applications of a dg mosfet with hkmg technology. various performance metrics of the device have been systematically examined, including sces such as ss, dibl and the ion/ioff ratio, and some important analog figures of merit (foms) such as gm, gd and av.

2. device description and simulation setup

in the 2-d numerical simulation, a symmetric device structure as shown in fig. 1(a) has been modelled. the silicon channel is covered above and below by oxide layers forming a gate stack (gs) with an equivalent oxide thickness (eot) of 1.1 nm. the metal gate work function is taken as 4.6 ev. the channel length is 40 nm and the width is fixed at 1 μm. source and drain extensions are 60 nm with contacts vertically placed (s and d, respectively). the doping concentrations are set to $1\times10^{16}$ cm$^{-3}$ (p-type) for the channel and $1\times10^{20}$ cm$^{-3}$ (n-type) for source and drain.

fig. 1 (a) schematic structure of nanoscale hkmg double gate n-mosfet (b) calibration between simulation and experimental data of threshold voltage as a function of temperature.

the 2-d numerical device simulator atlas is employed to simulate the planar dg-mosfet with high-k/metal gate technology. according to itrs the drain bias has been fixed at vdd = 1.0 v [15]. to study the analog performance the simulation is performed with vds = 0.5 v and vgs = 0 v to 1.0 v. in the simulation, the inversion-layer lombardi constant voltage and temperature (cvt) mobility model has been used, which takes into account the effect of transverse fields along with doping and temperature dependent parameters of the mobility.
the shockley–read–hall (srh) generation and auger recombination models are used for minority carrier recombination. the fermi-dirac model uses a rational chebyshev approximation that gives results close to the exact values. the temperature model sets the operating temperature in kelvin, which is varied from 100 k to 400 k. interface trapped charges arising during pre- and post-fabrication processes are common, and these charges cannot be neglected in nanoscale device fabrication. the presence of trapped charges creates an additional non-linear potential and a varying electric field across the gate dielectric. according to (1), the high-k gate stack reduces the electric field across the gate stack layer because of its high permittivity, so a lower electric field is required to induce the inversion layer charge [16]:

$$Q_{ch} = \varepsilon_{di} E_i \tag{1}$$

where $Q_{ch}$ is the inversion charge, $\varepsilon_{di}$ is the permittivity of the dielectric and $E_i$ is the electric field. even if the fixed oxide and interface trapped charge densities are very large, only a moderate potential is required across the high-k gate stack layer. consequently, the reduced threshold voltage and supply voltage can be maintained at reasonable values. this low electric field promotes gate stack reliability even with a large amount of unwanted charge inside. as the device uses a high-k gate stack, the interface trapped charge effects are included in the simulation. the trapped charge densities are considered at the semiconductor to insulator interface. the typical concentration of trapped charges considered in this work is $4\times10^{11}$ cm$^{-2}$ at the interface [17]. the electron and hole surface recombination velocity is taken as $1\times10^{4}$ cm/s. in the simulation all the junctions of the structure are assumed to be abrupt. furthermore, we have chosen two numerical techniques, gummel and newton, to obtain solutions [17]. fig.
1(b) shows excellent agreement between the simulated threshold voltage trend and the experimental data reported in [4] over a wide range of temperature.

3. results and discussion

in this section, the device scalability and analog performance metrics are discussed. threshold voltage (vth), sub-threshold swing (ss), dibl, on-state drive current (ion), off-state leakage current (ioff) and the ion/ioff ratio are the important figures of merit (foms) for device scalability. as far as analog circuits are concerned, the most important parameters are the transconductance (gm), output conductance (gd) and intrinsic gain (av). fig. 2(a) and (b) describe all three important sces, namely the variation of vth, ss, and dibl for different temperatures. the threshold voltage is determined from the ids–vgs characteristics. it is taken to be the value of vgs for which ids approaches $10^{-6}$ a/μm at vds = 0.5 v. the dibl is calculated as per (2):

$$DIBL = \frac{\Delta V_{th}}{\Delta V_{ds}} = \frac{V_{th1} - V_{th2}}{V_{ds2} - V_{ds1}} \tag{2}$$

the vth is observed at two different drain biases, vds1 = 0.5 v and vds2 = 1.0 v. from fig. 2(a) and (b), it should be noted that vth decreases with an increase in temperature, whereas the ss and dibl values increase as temperature increases. the typical value for the ss of a multi-gate mosfet is 60 mv/decade. according to fig. 2(b), the ss is lowest for t < 300 k; it then increases with temperature and reaches its typical value at t = 300 k (room temperature). the dibl value is quite impressive throughout the entire temperature range. as there is only a little variation in vth between the two vds values at temperatures from 200 k to 400 k, the dibl varies only from 5 mv/v to 14.38 mv/v.

fig. 2 (a) vth as a function of temperature for different vds, (b) ss and dibl as a function of temperature for vds = 0.5 v.

fig.
3(a) and (b) show the ion and ioff, respectively, over a wide range of temperatures at vgs = 0.5 v and vds = 0.5 v. the on state current (ion) is extracted as the maximum drain current (id) from the ids–vgs characteristics at vgs = 0.5 v and vds = 0.5 v. the off state current (ioff) is extracted as the drain current (id) at vgs = 0 and vds = vdd. the ioff is very low for t < 300 k and then increases as temperature increases; this is due to the low ss and high vth values at low temperatures. the temperature dependence of id is influenced by vth as:

$$I_D(T) \propto \mu(T)\,[V_{gs} - V_{th}(T)] \tag{3}$$

the temperature dependent id(t) is directly related to the µ(t) and vgs–vth(t) terms. because vth decreases with increasing temperature, the vgs–vth(t) term grows and causes id(t) to increase, as shown in fig. 3(a) and (b).

fig. 3 (a) on state current (ion), (b) off state current (ioff), as a function of temperature for different vds.

the ion/ioff is a very important parameter for switching applications; it should be very high for a good switch. according to fig. 4(a), the ion/ioff is $2.30\times10^{14}$ for t = 100 k, then falls as temperature increases and reaches $1.20\times10^{4}$ for t = 400 k. at higher temperatures, the high value of ion because of lower vth and the high value of ioff due to the high ss values compensate each other and give rise to a nearly constant ion/ioff. fig. 4(b) shows the variation of id and gm with vgs for different bias temperatures. as per (3), at high gate bias the µ(t) term dominates because at higher t lattice scattering dominates and reduces the channel mobility, which in turn reduces id. at low gate bias the vgs–vth(t) term causes id to increase with increasing temperature because a low vth is predicted at higher temperatures.
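the competition between the mobility term and the vgs–vth(t) term in (3) can be illustrated numerically. the following python sketch is not the atlas simulation used in this work: it assumes a power-law mobility µ(t) ∝ (t/t0)^(-m) and a threshold voltage that falls linearly with temperature, with purely hypothetical parameter values (`vth0`, `k_vth`, `m`, `t0`), and searches for the gate bias at which the modelled id varies least with temperature.

```python
# Illustrative sketch only. The mobility law, the linear Vth(T) and all
# parameter values below are assumptions, not data from the paper.

def drain_current(vgs, temp, vth0=0.30, k_vth=1.0e-3, m=1.5, t0=300.0):
    """Relative drain current following Eq. (3): I_D(T) ~ mu(T) * (Vgs - Vth(T))."""
    mobility = (temp / t0) ** (-m)        # assumed power-law mobility degradation
    vth = vth0 - k_vth * (temp - t0)      # assumed linear decrease of Vth with T
    return mobility * (vgs - vth)


def analytic_ztc(temp, vth0=0.30, k_vth=1.0e-3, m=1.5, t0=300.0):
    """Vgs at which dI_D/dT = 0 for the model above: Vgs = Vth(T) + k_vth * T / m."""
    vth = vth0 - k_vth * (temp - t0)
    return vth + k_vth * temp / m


def numeric_ztc(temps, vgs_grid):
    """Return the Vgs on the grid with the smallest spread of I_D over all temperatures."""
    best_vgs, best_spread = None, float("inf")
    for vgs in vgs_grid:
        currents = [drain_current(vgs, t) for t in temps]
        spread = max(currents) - min(currents)
        if spread < best_spread:
            best_vgs, best_spread = vgs, spread
    return best_vgs


temps = [250.0 + 10.0 * i for i in range(11)]        # 250 K ... 350 K
vgs_grid = [0.36 + 0.001 * i for i in range(441)]    # 0.36 V ... 0.80 V
vztc_numeric = numeric_ztc(temps, vgs_grid)
vztc_analytic = analytic_ztc(300.0)
print(f"numeric ZTC bias  ~ {vztc_numeric:.3f} V")
print(f"analytic ZTC bias = {vztc_analytic:.3f} V")
```

for this simplified model, setting the temperature derivative of id to zero gives vgs = vth(t) + k_vth·t/m, and the grid search lands close to that value. extracting the ztc point of a real device still requires the full simulated or measured ids–vgs curves at each temperature.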
these two opposite effects cancel each other out at a value of vgs where id shows minimum variation with t. this point is called the ztc bias point. the gm–vgs plot is obtained as the derivative of id with respect to vgs. at vgs < vth (channel weakly inverted) the id is due to diffusion. the diffusion current increases with t because the intrinsic carrier concentration rises. at vgs > vth, gm decreases as t increases due to mobility degradation.

fig. 4 (a) on-off current ratio (ion/ioff) as a function of temperature, (b) drain current (id) and transconductance (gm) as a function of vgs for different values of operating temperature.

the reduction in vth with increase in temperature increases gm, whereas the degradation of mobility reduces it. these two phenomena compensate each other to give rise to a ztc bias point for gm. from the figure we can conclude that the transconductance ztc point (0.410 v) is lower than the drain current ztc bias point (0.552 v). the ztcids and ztcgm bias points are two important measures in analog circuit design. in opamp (operational amplifier) design, to maintain constant dc current levels, the devices need to be biased at the ztcids point, while input devices can be biased at the ztcgm point to achieve stable circuit parameters. the simulated output current (ids) and output conductance (gd) versus drain bias (vds) at vgs = 0.5 v for different temperatures are plotted in fig. 5(a). because of the µ(t) and vth effects described above, the ids decreases as t increases below the ztc point, and the reverse happens above the ztc point, for both parameters. the ztc point for gd is lower than the output current ztc point. the intrinsic gain (av = gm/gd) is a valuable fom for the operational transconductance amplifier (ota) and is given in fig. 5(b). from fig. 5(b), high gain can be observed at high temperatures in the subthreshold regime, with the reverse effect in the above-threshold region. from this it can be concluded that the device performs better in the subthreshold regime at higher t and is a good candidate in the above-threshold regime at lower t.

fig. 5 (a) output current (id) and output conductance as a function of vds for different values of operating temperature, (b) intrinsic gain (av) as a function of vgs for different values of operating temperature.

the important performance metrics are tabulated in table 1. the table shows that the device gives very impressive values in the low temperature range. the ion/ioff and av of the device increase, and the ss falls, as temperature decreases; ion/ioff and av attain their maximum values at t = 100 k.

table 1 extracted parameters for various temperatures
temperature (k)   ion/ioff      dibl (mv/v)   ss (mv/decade)   av (db)
400               1.20×10^4     14.38         83.52            38.780
350               8.08×10^4     12.52         72.83            40.514
300               1.02×10^6     10.33         62.30            42.402
250               3.45×10^7     7.78          51.85            44.355
200               6.50×10^9     5.00          41.45            46.311
150               1.19×10^13    11.75         18.80            48.248
100               2.30×10^14    -             20.87            50.180

4. conclusion

the ztc bias points of the hkmg dg-mosfet are investigated using 2-d numerical simulation. the results presented in this work give a detailed picture of the ztc bias points for parameters like id and gm. these results can serve as a good design tool for designing circuits for a wide range of temperature applications and show promising solutions to minimize temperature degradation of analog circuits. the work identified the distinct ztc points for the device in nanoscale.

references
[1] f. s. shoucair, “analytical and experimental methods for zero-temperature-coefficient biasing of mos transistors,” electronics letters, vol. 25, pp. 1196-1198, 1989.
[2] t. h. tan, and a. k.
goel, “zero-temperature-coefficient biasing point of a fully depleted soi mosfet”, microwave and optical technology letters, vol. 37, no. 5, pp. 366-370, june, 2003.
[3] z. prijic, s. s. dimitrijev, and n. stojadinovic, “the determination of zero temperature coefficient point in cmos transistors,” microelectronics reliability, vol. 32, no. 6, pp. 769-113, 1992.
[4] g. groeseneken, j. p. colinge, h. e. maes, j. c. alderman, and s. holt, “temperature dependence of threshold voltage in thin-film soi mosfet's,” ieee electron device letters, vol. 11, no. 8, pp. 329-331, 1990.
[5] d. s. jeon and d. e. burk, “a temperature-dependent soi mosfet model for high-temperature application (27 °c-300 °c),” ieee transactions on electron devices, vol. 38, no. 9, pp. 2101-2110, 1991.
[6] ashraf a. osman, mohamed a. osman, numan s. dogan, and mohamed a. iman, “zero-temperature-coefficient biasing point of partially depleted soi mosfet’s”, ieee transactions on electron devices, vol. 42, no. 9, pp. 1709-1711, september, 1995.
[7] k. suzuki, y. tosaka, t. tanaka, h. horie, y. arimoto, “scaling theory of double-gate soi mosfet’s”, ieee transactions on electron devices, vol. 40, no. 12, pp. 2326-2329, 1993.
[8] c. wann, k. noda, t. tanaka, m. yoshida, and c. hu, “a comparative study of advanced mosfet concepts”, ieee transactions on electron devices, vol. 43, pp. 1742, oct. 1996.
[9] j. p. colinge, “multiple-gate soi mosfets”, solid-state electronics, vol. 48, no. 6, pp. 897-905, 2004.
[10] h.-s. p. wong, “beyond the conventional transistor”, ibm j. res. & dev., vol. 46, no. 2/3, march/may, 2002.
[11] a. kranti, t. m. chung, d. flandre, j. p. raskin, “laterally asymmetric channel engineering in fully depleted double gate soi mosfets for high performance analog applications”, solid-state electronics, vol. 48, pp. 947-59, 2004.
[12] n. mohankumar, b. syamal, c. k.
sarkar, “influence of channel and gate engineering on the analog and rf performance of dg mosfets”, ieee transactions on electron devices, vol. 57, no. 4, pp. 820-826, april, 2010.
[13] a. sarkar, a. k. das, s. de, c. k. sarkar, “effect of gate engineering in double gate mosfets for analog/rf applications”, microelectronics journal, vol. 43, pp. 873-882, july, 2012.
[14] r. k. sharma, m. bucher, “device design engineering for optimum analog/rf performance of nanoscale dg mosfets”, ieee transactions on nanotechnology, vol. 11, no. 5, pp. 992-998, sept., 2012.
[15] the international technology roadmap for semiconductors. (2011). [online]. available: http://public.itrs.net/.
[16] s. m. sze, “physics of semiconductor devices (3rd edition)”, wiley, 2007.
[17] atlas manual: silvaco int. santa clara, 2008.

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 589-600 doi: 10.2298/fuee1404589p

covered microstrip line with ground planes of finite width

mirjana t. perić 1, saša s. ilić 1, slavoljub r. aleksić 1, nebojša b. raičević 1, mirza i. bichurin 2, alexander s. tatarenko 2, roman v. petrov 2
1 university of niš, faculty of electronic engineering of niš, serbia
2 novgorod state university, veliky novgorod, russian federation

abstract. characteristic parameters of a covered microstrip line with ground planes of finite width are determined using the hybrid boundary element method (hbem). this method, developed at the faculty of electronic engineering of niš, is based on the combination of the equivalent electrodes method (eem) and the boundary element method (bem). results for the characteristic impedance of the observed microstrip line are compared with the corresponding ones obtained by the finite element method.

key words: characteristic impedance, finite element method (fem), hybrid boundary element method (hbem), microstrip line, perfect electric conductor (pec).

1.
introduction

over the years, many authors have analyzed microstrip lines with finite width dielectric substrate using numerical and analytical methods [1]-[14]. the variational method [5, 7], the boundary element method/method of moments (bem/mom) [1], [9]-[11], the conformal mapping and the moving perfect electric wall methods [12]-[13], etc. are some of the commonly used procedures for microstrip line analysis. on the other hand, the problem of the finite width microstrip ground plane has been researched less often, although these forms of microstrips are typical in practice. in [4] and [14]-[15] the microstrip line with finite-width dielectric and ground plane was analyzed. a moving perfect electric wall method (mpew) was applied in [12]. this method is used in combination with the conformal mapping method (cmm). the author obtained simple analytical relations for quasi-tem parameters of microstrip lines. the calculation was performed with the assumption that the conductor thickness is zero.

received january 21, 2014; received in revised form september 5, 2014
corresponding author: mirjana t. perić, faculty of electronic engineering, a. medvedeva 14, 18000 niš, republic of serbia (e-mail: mirjana.peric@elfak.ni.ac.rs)

in [6] the authors present an efficient numerical technique for determining the characteristic parameters of multiconductor transmission lines with homogeneous dielectrics. the influence of a finite width ground plane was also investigated. the system of integral equations resulting from the method is solved using galerkin’s method with a pulse approximation. the technique applied in that paper is an improvement of the procedure presented in [8], in the sense of better efficiency and accuracy of the obtained results. analysis of structures with ground planes of finite width as well as finite conductor thickness is also possible using the hybrid boundary element method (hbem) [15].
this method is applied for determining the microstrip characteristic parameters in [15] and [16]. in [17] and [18] the symmetrically coupled microstrip lines with finite and infinite width ground plane are analyzed using the hbem. both modes (even and odd) are considered. parameters of covered coupled microstrip lines are calculated in [19]. the structure that has not been analyzed using hbem until now is a covered single microstrip line with ground planes of finite width and finite conductor thickness. the analysis of such a structure is presented in this paper. results obtained for the characteristic impedance are shown in tables and graphs, and the field is illustrated by equipotential contours. the main assumption in this analysis is quasi-tem propagation in the microstrip line. in order to validate the accuracy of the hbem values obtained for the characteristic impedance, they have been compared with the corresponding ones obtained by the finite element method (fem). that method is widely used in software for solving electromagnetic problems, including microwave analysis. examples of such software are femm [20] and comsol [21]. the first one is used in this paper for comparison of results.

2. theoretical background

the hbem has been applied, until now, for electromagnetic field determination in the vicinity of cable terminations [22], calculation of magnetic force between permanent magnets, as well as for determining microstrip line parameters [23]. a generalization of the hbem, which is applied in this paper for microstrip line analysis, was described in detail in [15] and [16]. this method is a combination of the bem/mom, the equivalent electrodes method (eem) [24] and the point-matching method (pmm). the main idea of the hbem is in discretizing each arbitrarily shaped surface of the perfect electric conductor (pec) electrode as well as each arbitrarily shaped boundary surface between any two dielectric layers.
the boundary surfaces are divided into a large number of segments. each of those segments on a pec electrode is replaced by an equivalent electrode (ee) placed at its center. the potential of the equivalent electrodes obtained in this manner is the same as the potential of the pecs themselves. the segments at any boundary surface between two layers are replaced by discrete equivalent total charges. those charges are placed in the air [15, 16]. the equivalent electrodes are line charges whose radius is determined in [24]. the green’s function for the electric scalar potential of the charges is used. applying the point-matching method (pmm) for the potential of the perfect electric conductor (pec) electrodes and for the normal component of the electric field at the boundary surface between any two dielectric layers, the system of linear equations is formed. as the number of ees increases, the distances between them become smaller. in order to keep the formed system of equations stable, the distances between ees must be larger than their radius. the formed square system of linear equations is well-conditioned. the system matrix always has its greatest values on the main diagonal. after solving the system of equations, according to [15], it is possible to calculate the capacitance per unit length of the microstrip line, as well as the characteristic impedance and effective relative permittivity. this method is described in detail in the following section for determining the characteristic parameters of a covered microstrip line with ground planes of finite width and finite conductor thickness.

3. hbem application

the geometry of the covered microstrip line, with a finite width dielectric substrate placed between two ground planes of finite width, is shown in fig. 1.

fig.
1 problem geometry

the hbem, based on discretization of boundary surfaces between any two dielectric layers and replacement of those segments with total charges per unit length, is applied. it should be mentioned that free surface charges do not exist on the boundary surfaces layer 1 - layer 2, so the total surface charges placed between dielectric layers are only surface polarization charges. the equivalent hbem model is shown in fig. 2.

- indices “d”, “a” and “t” denote the charges per unit length placed in dielectric (“d”) and air (“a”) as well as total (“t”) charges per unit length, respectively.
- $M_i$ (i = 1, 2) is the number of ees on pecs, with line charges $q'_{\mathrm{d}\,im}$ (m = 1,...,$M_i$), placed in layer 2;
- $M_j$ (j = 3,...,5) is the number of ees on pecs, with line charges $q'_{\mathrm{a}\,jm}$ (m = 1,...,$M_j$), placed in layer 1;
- $N_i$ (i = 1,...,4) is the number of ees on boundary surfaces layer 1 - layer 2, with line charges $q'_{\mathrm{t}\,in}$, placed in the air (n = 1,...,$N_i$);
- $(x_{\mathrm{d}\,im}, y_{\mathrm{d}\,im})$, $(x_{\mathrm{a}\,im}, y_{\mathrm{a}\,im})$, $(x_{\mathrm{t}\,in}, y_{\mathrm{t}\,in})$ are the positions of the ees.

fig. 2 hbem model

the electric scalar potential of the system from fig. 2 is given in eq. (1).

$$\varphi = \varphi_0 - \sum_{i=1}^{2}\sum_{m=1}^{M_i}\frac{q'_{\mathrm{d}\,im}}{2\pi\varepsilon_2}\ln\sqrt{(x-x_{\mathrm{d}\,im})^2+(y-y_{\mathrm{d}\,im})^2} - \sum_{i=3}^{5}\sum_{m=1}^{M_i}\frac{q'_{\mathrm{a}\,im}}{2\pi\varepsilon_1}\ln\sqrt{(x-x_{\mathrm{a}\,im})^2+(y-y_{\mathrm{a}\,im})^2} - \sum_{i=1}^{4}\sum_{n=1}^{N_i}\frac{q'_{\mathrm{t}\,in}}{2\pi\varepsilon_0}\ln\sqrt{(x-x_{\mathrm{t}\,in})^2+(y-y_{\mathrm{t}\,in})^2}, \tag{1}$$

where $\varphi_0$ is an unknown additive constant, which depends on the chosen reference point for the electric scalar potential. the procedure for determining the number of unknowns is the following: in order to avoid placing an arbitrary number of unknowns on each boundary surface, an initial parameter np is introduced.
the number of unknowns is determined as

$$M_1 = \frac{w_1}{s+h}N_p,\quad M_2 = \frac{s}{s+h}N_p,\quad M_3 = \frac{w_1+2t_1}{s+h}N_p,\quad M_4 = \frac{w_2+2t_2+2\Delta y}{s+h}N_p,\ \text{where}\ \Delta y = \frac{w_2-s}{2},$$

$$M_5 = \frac{2w_2+2t_2}{s+h}N_p,\quad N_1 = N_4 = \frac{h}{s+h}N_p,\quad N_2 = N_3 = \frac{\Delta x}{s+h}N_p,\ \text{where}\ \Delta x = \frac{s-w_1}{2}.$$

the total number of unknowns, $N_{tot}$, is:

$$N_{tot} = \sum_{i=1}^{5} M_i + \sum_{i=1}^{4} N_i + 1.$$

the electric field is obtained using $\mathbf{E} = -\operatorname{grad}(\varphi)$. a relation between the normal component of the electric field and the total surface charges is given by eq. (2).

$$\hat{n}_i\cdot\mathbf{E}^{(0+)}_{im} = -\frac{\varepsilon_2}{\varepsilon_0(\varepsilon_1-\varepsilon_2)}\,\eta_{\mathrm{t}\,im},\quad \eta_{\mathrm{t}\,im} = \frac{q'_{\mathrm{t}\,im}}{\Delta l_{im}},\quad m = 1,\dots,N_i,\quad i = 1,2,3,4 \tag{2}$$

where $\hat{n}_i$ (directed along $\hat{x}$ for i = 1, 4 and along $\hat{y}$ for i = 2, 3) are unit normal vectors oriented from layer 2 into layer 1. applying the procedure described in the previous section, the system of linear equations is formed using the pmm for the potential of the perfect electric conductor given in (1) and the pmm for the normal component of the electric field (2). the unknown free charges per unit length on the conductors, and the total charges per unit length on the boundary surfaces between the two dielectric layers, are determined after solving the system of equations. in order to satisfy the necessary condition of electrical neutrality of the whole covered microstrip line, equation (3) is added:

$$\sum_{i=1}^{2}\sum_{m=1}^{M_i} q'_{\mathrm{d}\,im} + \sum_{i=3}^{5}\sum_{m=1}^{M_i} q'_{\mathrm{a}\,im} = 0 \tag{3}$$

in that way, a square system of linear equations is formed. the unknown values are the free charges of the pecs, the total charges per unit length at the boundary surfaces between dielectric layers, and the unknown additive constant $\varphi_0$. the capacitance per unit length of the observed microstrip line is:

$$C' = \frac{1}{U}\left(\sum_{k=1}^{M_1} q'_{\mathrm{d}\,1k} + \sum_{k=1}^{M_3} q'_{\mathrm{a}\,3k}\right) \tag{4}$$

the characteristic impedance is given in (5)

$$Z_c = \frac{Z_{c0}}{\sqrt{\varepsilon_r^{eff}}}, \tag{5}$$
where $\varepsilon_r^{eff} = C'/C'_0$ is the effective relative permittivity of the microstrip line, and $Z_{c0}$ is the characteristic impedance of the microstrip line placed in the air. $C'_0$ denotes the capacitance per unit length of the microstrip line without dielectrics (free space). in order to validate and compare the obtained results for the characteristic impedance, the software femm [20] is used.

4. numerical results

a computer code based on the procedure described in the previous section is written in mathematica [25]. all calculations were performed on a computer with a dual core intel 2.8 ghz processor and 4 gb of ram. the convergence of the results and the computation time are shown in table 1. the values of the effective relative permittivity and the characteristic impedance are determined for: $\varepsilon_{r1}$ = 1, $\varepsilon_{r2}$ = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0, t2/w2 = 0.1, h/d = 0.5 and s/d = 2.0.

table 1 convergence of the results and computation time
np    ntot   εr eff   zc [Ω]   t (s)
5     66     1.7008   44.665   0.3
10    98     1.8648   42.234   0.4
15    134    1.7107   44.228   0.7
20    166    1.7825   43.544   1.0
50    376    1.8559   43.328   4.5
75    550    1.8707   43.343   9.6
85    618    1.8744   43.346   12.1
100   722    1.8786   43.349   16.5
125   894    1.8836   43.350   25.2
135   964    1.8846   43.356   29.3
150   1068   1.8866   43.355   36.0
160   1136   1.8877   43.355   41.3
170   1242   1.8887   43.356   49.3
200   1414   1.8908   43.356   64.5
250   1760   1.8935   43.356   100.5
300   2106   1.8953   43.356   143.2
325   2278   1.8963   43.356   171.0

the “computation time” is the time spent on determining the number of unknowns, positioning them, forming the matrix elements, solving the system of equations, and calculating the characteristic parameters. most of the calculation time is spent on matrix fill. for example, when ntot = 1068, determining the number of unknowns and their positions takes 0.2 s and the matrix fill takes 32 s. solving the system of linear equations takes 3.3 s, and calculating the capacitance, characteristic impedance and effective dielectric permittivity takes 0.5 s.
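the convergence reported in table 1 can be checked with a short script. the python sketch below copies the table rows from the text and reports the smallest number of unknowns from which the characteristic impedance stays within a chosen tolerance of its final tabulated value. the helper name `converged_from` and the tolerances are assumptions made for illustration; they are not part of the mathematica code used by the authors.

```python
# Rows of Table 1: (np, ntot, eps_r_eff, zc_ohm, time_s), copied from the text.
TABLE_1 = [
    (5, 66, 1.7008, 44.665, 0.3),
    (10, 98, 1.8648, 42.234, 0.4),
    (15, 134, 1.7107, 44.228, 0.7),
    (20, 166, 1.7825, 43.544, 1.0),
    (50, 376, 1.8559, 43.328, 4.5),
    (75, 550, 1.8707, 43.343, 9.6),
    (85, 618, 1.8744, 43.346, 12.1),
    (100, 722, 1.8786, 43.349, 16.5),
    (125, 894, 1.8836, 43.350, 25.2),
    (135, 964, 1.8846, 43.356, 29.3),
    (150, 1068, 1.8866, 43.355, 36.0),
    (160, 1136, 1.8877, 43.355, 41.3),
    (170, 1242, 1.8887, 43.356, 49.3),
    (200, 1414, 1.8908, 43.356, 64.5),
    (250, 1760, 1.8935, 43.356, 100.5),
    (300, 2106, 1.8953, 43.356, 143.2),
    (325, 2278, 1.8963, 43.356, 171.0),
]

NTOT, ZC = 1, 3  # column indices within each row


def converged_from(rows, column, tol):
    """Smallest ntot from which `column` stays within `tol` of its last tabulated value."""
    final = rows[-1][column]
    first = rows[-1][NTOT]
    for row in reversed(rows):          # walk back from the finest discretization
        if abs(row[column] - final) > tol:
            break
        first = row[NTOT]
    return first


ntot_tight = converged_from(TABLE_1, ZC, tol=5e-4)   # agreement to the last printed digit
ntot_loose = converged_from(TABLE_1, ZC, tol=5e-3)   # agreement to about 0.005 ohm
print("zc stable to the last printed digit from ntot =", ntot_tight)
print("zc stable to 0.005 ohm from ntot =", ntot_loose)
```

the same helper can be applied to the effective permittivity column, which is still drifting in the fourth decimal at the largest tabulated np, so any convergence statement depends on the tolerance that is chosen.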
from table 1 it is evident that good convergence of the results is achieved in a short computation time. sufficient accuracy is obtained for 1242 unknowns, so there is no need to increase the number of ees further. equipotential contours are shown in fig. 3, for: np = 150, $\varepsilon_{r1}$ = 1, $\varepsilon_{r2}$ = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0, t2/w2 = 0.1, h/d = 0.5 and s/d = 2.0.

fig. 3 equipotential contours

in order to verify the obtained hbem values, a comparison of hbem and femm results for the effective dielectric permittivity and the characteristic impedance versus h/d is given in table 2. for the characteristic impedance, the discrepancy between these results is less than 0.6 %; for the effective permittivity it grows with h/d, up to about 2.8 % at h/d = 0.8. it should be mentioned that a classical comparison of results does not make sense here, because the two methods (hbem and fem) are applied under different conditions. the number of unknowns in the hbem application was about 1100. on the other hand, the corresponding femm model was created with a few thousand finite elements. as the number of finite elements increases, the accuracy of femm increases too, so it is possible to “compare” and verify the hbem results.

table 2 verification of results for effective dielectric permittivity and characteristic impedance of microstrip line versus h/d for parameters: np = 150, $\varepsilon_{r1}$ = 1, $\varepsilon_{r2}$ = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0, t2/w2 = 0.1 and s/d = 2.0
        hbem                fem
h/d     εr eff   zc [Ω]     εr eff   zc [Ω]
0.2     2.3745   27.897     2.3740   28.056
0.3     2.2007   35.847     2.2089   35.895
0.4     2.0424   40.946     2.0568   40.883
0.5     1.8866   43.355     1.9068   43.208
0.6     1.7286   42.904     1.7530   42.702
0.7     1.5588   39.030     1.5898   38.812
0.8     1.3685   30.462     1.4072   30.320

distributions of characteristic impedance versus different parameters are shown in the following figures. fig. 4 shows the influence of ground plane thickness on the characteristic impedance of the microstrip line.
the input data are: np = 150, $\varepsilon_{r1}$ = 1, $\varepsilon_{r2}$ = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0 and s/d = 2.0. from this figure it is evident that, for these input data, the characteristic impedance does not depend on the ground plane thickness. the characteristic impedance depends on the conductor’s distance from the planes (parameter h/d). as this parameter increases, the characteristic impedance first increases and then decreases. the maximum value occurs when the conductor is equidistant from the ground planes.

fig. 4 distribution of characteristic impedance versus t2/w2 for different values of parameter h/d

the distribution of characteristic impedance versus w1/d and s/d is shown in fig. 5. values of the characteristic impedance of a microstrip line with parallel ground planes of infinite width [7] are also given. the influence of the dielectric substrate width as well as the ground plane width is given in fig. 6. as the substrate width increases, the characteristic impedance decreases. the ground plane width does influence the characteristic impedance, but the effect can be neglected. as the substrate height increases, the characteristic impedance first increases and then decreases as the conductor approaches the upper plane, fig. 7. the dielectric permittivity of the substrate also influences the characteristic impedance: as the substrate permittivity increases, the characteristic impedance decreases.

fig. 5 distribution of characteristic impedance versus s/d for different values of parameter w1/d

fig. 6 distribution of characteristic impedance versus s/d for different values of parameter w2/d

fig. 7 distribution of characteristic impedance versus h/d for different values of parameter $\varepsilon_{r2}$

the distribution of polarization charges per unit length along the boundary surface is shown in fig. 8.

fig.
8 distribution of polarization charges per unit length along boundary surface 5. conclusion the aim of this paper is to apply a very efficient hbem, based on a combination of eem and bem, for determining the characteristic impedance of the covered microstrip line with ground planes of finite width. that configuration has not been analyzed so far using hbem. the quasi tem analysis is applied. the main advantage of this method is the possibility to solve arbitrarily shaped, multilayered configuration of microstrip lines, with finite dimension of ground planes and conductor thickness, without any numerical integration. of course, there are other methods that can analyze this structure, but the hbem is simple and accurate procedure. the convergence of the results is good and the computation time is very short. covered microstrip line with ground planes of finite width 599 the analysis of this microstrip was performed for different values of microstrip parameters. the influence of permittivity of layer 2 on the characteristic impedance is evident. also, the results show that for w2/w1 > 2.5 the influence of finite width of ground planes on the characteristic impedance values can be neglected. acknowledgement: this research was partially supported by funding from the serbian ministry of education and science in the frame of the project iii 44004. references [1] k. li, y. fujii, “indirect boundary element method applied to generalized microstrip line analysis with applications to side-proximity effect in mmics”, ieee trans. microwave theory tech., vol. 40, pp. 237–244, 1992, doi: 10.1109/22.120095. [2] chang t. and c. tan, ”analysis of a shielded microstrip line with finite metallization thickness by the boundary element method”, ieee transactions on microwave theory tech., vol. 38, no. 8, pp. 11301132, 1990, doi: 10.1109/22.57340. [3] c.e. smith, r.s. chang, “microstrip transmission line with finite width dielectric”, ieee trans. microwave theory and tech., vol. 28, pp. 
90–94, 1980, doi: 10.1109/tmtt.1980.1130015.
[4] j. svacina, “new method for analysis of microstrip with finite-width ground plane”, microwave and optical technology letters, vol. 48, no. 2, pp. 396–399, 2006, doi: 10.1002/mop.10672.
[5] t. fukuda, t. sugie, k. wakino, y.-d. lin, and t. kitazawa, “variational method of coupled strip lines with an inclined dielectric substrate”, in asia pacific microwave conference – apmc 2009, december 7–10, 2009, pp. 866–869.
[6] j. venkataraman, s. n. rao, a. r. đorđević, t. k. sarkar, and y. naiheng, “analysis of arbitrarily oriented microstrip transmission lines in arbitrarily shaped dielectric media over a finite ground plane”, ieee trans. microwave theory tech., vol. mtt-33, pp. 952–959, oct. 1985, doi: 10.1109/tmtt.1985.1133155.
[7] m. b. baždar, a. r. đorđević, r. f. harrington, t. k. sarkar, “evaluation of quasi-static matrix parameters for multiconductor transmission lines using galerkin’s method”, ieee trans. microwave theory tech., vol. 42, no. 7, pp. 1223–1228, 1994, doi: 10.1109/22.299760.
[8] a. r. đorđević, r. f. harrington, t. k. sarkar, m. b. baždar, matrix parameters for multiconductor transmission lines, software and user’s manual, artech house, boston, 1989.
[9] r. f. harrington, field computation by moment methods. new york: macmillan, 1968.
[10] t. g. bryant and j. a. weiss, “parameters of microstrip transmission lines and of coupled pairs of microstrip lines”, ieee trans. microwave theory tech., vol. mtt-16, pp. 1021–1027, dec. 1968, doi: 10.1109/tmtt.1968.1126858.
[11] a. farrar and a. t. adams, “characteristic impedance of microstrip by the method of moments”, ieee trans. microwave theory tech., vol. mtt-18, pp. 65–66, jan. 1970, doi: 10.1109/tmtt.1970.1127146.
[12] j. svacina, “new method for analysis of microstrip with finite-width ground plane”, microwave and optical technology letters, vol. 48, no. 2, pp. 396–399, feb. 2006, doi: 10.1002/mop.21361.
[13] c.e. smith and r.s. chang, “microstrip transmission line with finite width dielectric”, ieee trans. microwave theory tech., vol. 28, pp. 90–94, feb. 1980, doi: 10.1109/tmtt.1980.1130015.
[14] c.e. smith, r.s. chang, “microstrip transmission line with finite width dielectric and ground plane”, ieee trans. microwave theory tech., vol. 33, pp. 835–839, 1985, doi: 10.1109/tmtt.1985.1133142.
[15] s. ilić, m. perić, s. aleksić, n. raičević, “hybrid boundary element method and quasi tem analysis of 2d transmission lines – generalization”, electromagnetics, vol. 33, no. 4, pp. 292–310, 2013, doi: 10.1080/02726343.2013.777319.
[16] m. perić, s. ilić, s. aleksić, n. raičević, “application of hybrid boundary element method to 2d microstrip lines analysis”, int. journal of applied electromagnetics and mechanics, vol. 42, no. 2, pp. 179–190, 2013, doi: 10.3233/jae-131655.
[17] s. ilić, m. perić, s. aleksić, n. raičević, “quasi tem analysis of 2d symmetrically coupled strip lines with finite grounded plane using hbem”, in 15th international igte symposium, graz, austria, pp. 73–77, 2012.
[18] s. ilić, m. perić, s. aleksić, n. raičević, “quasi tem analysis of 2d symmetrically coupled strip lines with infinite grounded plane using hbem”, in xvii-th international symposium on electrical apparatus and technologies siela 2012, bourgas, bulgaria, pp. 147–155, 2012.
[19] s. ilić, m. perić, s. aleksić, n. raičević, “covered coupled microstrip lines with ground planes of finite width”, in 11th international conference on telecommunications in modern satellite, cable and broadcasting services – telsiks 2013, niš, serbia, pp. 37–40, 2013.
[20] d. meeker, femm 4.2, available: http://www.femm.info/wiki/download. download date: 1 oct 2011.
[21] comsol multiphysics, available: http://www.comsol.com
[22] n. b. raičević, s. s. ilić, s. r.
aleksić, “application of new hybrid boundary element method on the cable terminations”, in 14th international igte'10 symposium, graz, austria, pp. 56–61, 2010.
[23] ana n. vučković, nebojša b. raičević, mirjana t. perić and slavoljub aleksić, “magnetic force calculation of permanent magnet systems using hybrid boundary element method”, in sixteenth biennial ieee conference on electromagnetic field computation cefc 2014, annecy, france, 2014. (accepted for presentation)
[24] d. m. veličković, “equivalent electrodes method”, scientific review, vol. 21–22, pp. 207–248, 1996.
[25] mathematica 5.0, wolfram research inc., 1988–2003.

facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 223–240
https://doi.org/10.2298/fuee1802223s

compact xor-bi-decomposition for lattices of boolean functions*

bernd steinbach1, christian posthoff2
1freiberg university of mining and technology, institute of computer science, freiberg, germany
2the university of the west indies, department of computing and information technology, saint augustine, trinidad and tobago

received october 21, 2017; received in revised form january 24, 2018
corresponding author: bernd steinbach, institute of computer science, freiberg university of mining and technology, bernhard-von-cotta-str. 2, d-09596 freiberg, germany (e-mail: steinb@informatik.tu-freiberg.de)
*an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24–25, 2017

abstract. bi-decomposition is a powerful approach for the synthesis of multi-level combinational circuits because it utilizes the properties of the given functions to find small circuits, with low power consumption and low delay. compact bi-decompositions restrict the variables in the support of the decomposition functions as much as possible. methods to find compact and-, or-, or xor-bi-decompositions for a given completely specified function are well known. a lattice of boolean functions represents all possible functions which are defined by an incompletely specified function. lattices of boolean functions significantly increase the possibilities to synthesize a minimal circuit. however, so far only methods to find compact and- or or-bi-decompositions for lattices of boolean functions are known. this gap, i.e., a method to find a compact xor-bi-decomposition for a lattice of boolean functions, has been closed by the approach suggested in this paper.

key words: synthesis, combinational circuit, lattice of boolean functions, xor-bi-decomposition, boolean differential calculus, derivative operations.

1 introduction

the aim of all decomposition methods in circuit design is to find decomposition functions that are simpler than the given function. the bi-decomposition is an approach that decomposes a given boolean function into two simpler decomposition functions which are combined by an and-, an or-, or an xor-gate. for the aim of simplification, the bi-decomposition utilizes the properties of the given boolean function to design circuit structures of a small area, low power consumption, and low delay [1].

there are several types of bi-decompositions. both decomposition functions of each strong bi-decomposition are simpler than the given function because they depend on fewer variables. unfortunately, there are functions for which no strong bi-decomposition exists. le [2] bridged this gap by means of the weak bi-decomposition. he found that each function for which neither a weak or-bi-decomposition nor a weak and-bi-decomposition exists can be simplified by a strong xor-bi-decomposition. a simplified proof of the completeness of the strong and weak bi-decomposition is given in [3, 4]. an implementation that reuses already decomposed blocks outperforms other synthesis approaches [5]. a drawback of the weak bi-decomposition is that the synthesized circuits can have a large difference between the shortest and the longest path (unbalanced circuits).

recently, vectorial bi-decompositions were suggested as a supplement to strong and weak bi-decompositions [6, 7]. the decomposition functions of these bi-decompositions are simpler than the given function because they are independent of the simultaneous change of several variables. vectorial bi-decompositions can exist for functions without any strong bi-decomposition. benefits of the vectorial bi-decomposition are its contribution to balanced circuits and the increased number of decomposition possibilities in comparison to the strong bi-decomposition.

all bi-decomposition approaches mentioned above utilize the boolean differential calculus (bdc) [3, 4, 8–12] to find optimal bi-decompositions. there are several other approaches to bi-decomposition which demonstrate the interest in this useful synthesis method; however, these other approaches are not directly helpful for solving the problem explored in this paper. in [13] a method for disjoint bi-decompositions with an extension to non-disjoint bi-decompositions for a single common variable is suggested. a graph-based approach for bi-decompositions was suggested in [14]. unfortunately, the benchmarks used in [5] and [14] overlap only partially. one common benchmark is t481, where our approach from [5] outperforms the graph-based approach from [14] in the number of gates (17/25). recently, semi-tensor products of matrices were suggested for bi-decompositions of boolean and multi-valued functions [15]. however, that paper does not contain experimental results for benchmark circuits.

it is a property of the given function whether it can be decomposed into two simpler decomposition functions using a certain type of bi-decomposition. the possibility to find a bi-decomposition increases when the function to decompose can be chosen from a lattice of boolean functions. incompletely specified functions were traditionally used as a source of a lattice of boolean functions.
the on-set function $f_q(\mathbf{x})$ and the off-set function $f_r(\mathbf{x})$ are the preferred mark functions of these lattices. the introduction of derivative operations for lattices of boolean functions [4, 16, 17] facilitates the application of lattices in circuit design.

a strong bi-decomposition divides the variables of the function to decompose into three disjoint subsets. the variables $\mathbf{x}_a$ control only the decomposition function $g$, the variables $\mathbf{x}_b$ control only the decomposition function $h$, and the variables $\mathbf{x}_c$ are commonly used for both decomposition functions $g$ and $h$. the more variables there are in the dedicated sets $\mathbf{x}_a$ and $\mathbf{x}_b$, the simpler are the decomposition functions $g$ and $h$. a compact strong bi-decomposition uses the largest possible sets $\mathbf{x}_a$ and $\mathbf{x}_b$.

there are formulas [3, 4, 10–12] containing operations of the boolean differential calculus that can be used to decide whether a lattice of boolean functions contains a compact strong and- or a compact strong or-bi-decomposition. unfortunately, so far it is only possible to find a compact strong xor-bi-decomposition for a single decomposition function, or to assign only a single variable to the set $\mathbf{x}_a$ for the check whether a lattice of boolean functions contains a strong xor-bi-decomposition [18]. we suggest in this paper an approach to find also compact strong xor-bi-decompositions for a lattice of boolean functions. this new method combines the ideas of simplification used in both the strong and the vectorial bi-decomposition. hence, we are going to solve a problem that has been known as unsolved for more than 20 years.

the rest of this paper is organized as follows. section 2 briefly describes lattices of boolean functions and single derivatives of functions belonging to such a lattice. section 3 summarizes the known approach to find and determine non-compact xor-bi-decompositions.
section 4 introduces the theory of compact xor-bi-decompositions, derives the main theorem and its consequence for compact xor-bi-decompositions, and provides two consecutive algorithms that solve this task using xboole [19, 20]. section 5 demonstrates the benefits of the suggested new decomposition method by means of a simple example. section 6 concludes the paper.

2 lattices of boolean functions

lattices of boolean functions occur, e.g., in circuit design where each function of the lattice can be chosen as a function to realize the circuit structure. hence, lattices of boolean functions provide a possibility for optimization in circuit design. widely used are lattices which can be modeled as an incompletely specified function (isf). such an incompletely specified boolean function divides the $2^n$ patterns $\mathbf{x}$ of the boolean space $\mathbb{B}^n$ into three disjoint sets:

$\mathbf{x} \in$ don’t-care-set $\Leftrightarrow f_\varphi(x_1,\ldots,x_n) = 1 \Leftrightarrow$ it is allowed to choose the function value of $f(\mathbf{x})$ without any restrictions;
$\mathbf{x} \in$ on-set $\Leftrightarrow f_q(x_1,\ldots,x_n) = 1 \Leftrightarrow (f_\varphi(x_1,\ldots,x_n) = 0) \wedge (f(x_1,\ldots,x_n) = 1)$;
$\mathbf{x} \in$ off-set $\Leftrightarrow f_r(x_1,\ldots,x_n) = 1 \Leftrightarrow (f_\varphi(x_1,\ldots,x_n) = 0) \wedge (f(x_1,\ldots,x_n) = 0)$.

each pair of these mark functions can be used to specify all functions of the lattice. a function $f(\mathbf{x})$ belongs to the lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ if

$$f_q(\mathbf{x}) \le f(\mathbf{x}) \le \overline{f_r(\mathbf{x})}\,.$$

the single derivatives with regard to $x_i$ of all functions of a lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$, with $\mathbf{x} = (x_i, \mathbf{x}_1)$, result again in a lattice of boolean functions that is specified by the mark functions:

$$f_q^{\partial x_i}(\mathbf{x}_1) = \max_{x_i} f_q(x_i,\mathbf{x}_1) \wedge \max_{x_i} f_r(x_i,\mathbf{x}_1)\,, \tag{1}$$

$$f_r^{\partial x_i}(\mathbf{x}_1) = \min_{x_i} f_q(x_i,\mathbf{x}_1) \vee \min_{x_i} f_r(x_i,\mathbf{x}_1)\,. \tag{2}$$

3 known non-compact xor-bi-decompositions for lattices of boolean functions

a lattice of boolean functions $L\langle f_q(x_a,\mathbf{x}_b,\mathbf{x}_c), f_r(x_a,\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function $f(x_a,\mathbf{x}_b,\mathbf{x}_c)$ that is strongly xor-bi-decomposable with regard to the single variable $x_a$ and the set of variables $\mathbf{x}_b$ if and only if

$$\max_{\mathbf{x}_b}{}^{m}\, f_q^{\partial x_a}(\mathbf{x}_b,\mathbf{x}_c) \wedge f_r^{\partial x_a}(\mathbf{x}_b,\mathbf{x}_c) = 0\,. \tag{3}$$

the decomposition function $g(x_a,\mathbf{x}_c)$ of this xor-bi-decomposition is uniquely specified by

$$g(x_a,\mathbf{x}_c) = x_a \wedge \max_{\mathbf{x}_b}{}^{m}\, f_q^{\partial x_a}(\mathbf{x}_b,\mathbf{x}_c)\,, \tag{4}$$

and the associated decomposition function $h(\mathbf{x}_b,\mathbf{x}_c)$ can be chosen from the lattice with the mark functions

$$h_q(\mathbf{x}_b,\mathbf{x}_c) = \max_{x_a}\bigl((\overline{g(x_a,\mathbf{x}_c)} \wedge f_q(x_a,\mathbf{x}_b,\mathbf{x}_c)) \vee (g(x_a,\mathbf{x}_c) \wedge f_r(x_a,\mathbf{x}_b,\mathbf{x}_c))\bigr)\,, \tag{5}$$

$$h_r(\mathbf{x}_b,\mathbf{x}_c) = \max_{x_a}\bigl((\overline{g(x_a,\mathbf{x}_c)} \wedge f_r(x_a,\mathbf{x}_b,\mathbf{x}_c)) \vee (g(x_a,\mathbf{x}_c) \wedge f_q(x_a,\mathbf{x}_b,\mathbf{x}_c))\bigr)\,. \tag{6}$$

more details about strong and weak bi-decompositions are given in [3, 4, 10, 11].
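the check (3) and the constructions (4) to (6) can be tried out on small truth tables. the following python sketch is only an illustration under stated assumptions: it does not use xboole, all helper names (`table`, `kmax`, `kmin`, `in_lattice`, `derivative_marks`, `xor_bi_decompose`) are hypothetical, and the three-variable lattice at the end is an invented example with one don't-care pattern. functions are stored as dictionaries over the whole boolean space, so a function that does not depend on a variable is simply constant in that position.

```python
from itertools import product

N = 3                       # illustrative variable order: (x_a, x_b, x_c)
A, B = 0, 1                 # positions of x_a and x_b in an assignment
SPACE = list(product((0, 1), repeat=N))


def table(fn):
    """Truth table of fn as a dict: assignment tuple -> 0 or 1."""
    return {x: int(bool(fn(*x))) for x in SPACE}


def _variants(x, idxs):
    """Yield all assignments that agree with x outside the positions idxs."""
    for bits in product((0, 1), repeat=len(idxs)):
        y = list(x)
        for i, bit in zip(idxs, bits):
            y[i] = bit
        yield tuple(y)


def kmax(f, idxs):
    """Maximum of f with regard to all variables at positions idxs."""
    return {x: max(f[y] for y in _variants(x, idxs)) for x in SPACE}


def kmin(f, idxs):
    """Minimum of f with regard to all variables at positions idxs."""
    return {x: min(f[y] for y in _variants(x, idxs)) for x in SPACE}


def in_lattice(f, fq, fr):
    """Membership test f_q <= f <= complement of f_r."""
    return all(fq[x] <= f[x] <= 1 - fr[x] for x in SPACE)


def derivative_marks(fq, fr, i):
    """Mark functions (1) and (2) of the derivative lattice w.r.t. x_i."""
    max_q, max_r = kmax(fq, [i]), kmax(fr, [i])
    min_q, min_r = kmin(fq, [i]), kmin(fr, [i])
    dq = {x: max_q[x] & max_r[x] for x in SPACE}
    dr = {x: min_q[x] | min_r[x] for x in SPACE}
    return dq, dr


def xor_bi_decompose(fq, fr, a, b_idxs):
    """Return (g, hq, hr) following (4)-(6), or None if condition (3) fails."""
    dq, dr = derivative_marks(fq, fr, a)
    dq_max = kmax(dq, b_idxs)
    if any(dq_max[x] & dr[x] for x in SPACE):        # condition (3) violated
        return None
    g = {x: x[a] & dq_max[x] for x in SPACE}         # equation (4)
    hq = kmax({x: ((1 - g[x]) & fq[x]) | (g[x] & fr[x]) for x in SPACE}, [a])
    hr = kmax({x: ((1 - g[x]) & fr[x]) | (g[x] & fq[x]) for x in SPACE}, [a])
    return g, hq, hr


# Invented example: f0 = (a AND c) XOR (b OR c), pattern (1, 1, 1) is don't-care.
f0 = table(lambda a, b, c: (a & c) ^ (b | c))
fq = dict(f0)
fr = {x: 1 - f0[x] for x in SPACE}
fr[(1, 1, 1)] = 0

g, hq, hr = xor_bi_decompose(fq, fr, A, [B])
h = hq                                   # smallest function of the h-lattice
f = {x: g[x] ^ h[x] for x in SPACE}
print(in_lattice(f, fq, fr))
```

choosing $h = h_q$ is just one admissible choice; any function between $h_q$ and $\overline{h_r}$ could be taken instead. the exhaustive truth-table representation is exponential in the number of variables and is meant only to make the formulas concrete.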
4 compact xor-bi-decomposition for lattices of boolean functions

a compact bi-decomposition is determined by maximal numbers of variables in the dedicated sets $\mathbf{x}_a$ and $\mathbf{x}_b$.
the number of commonly used variables $\mathbf{x}_c$ is as small as possible, and consequently the decomposition functions $g(\mathbf{x}_a,\mathbf{x}_c)$ and $h(\mathbf{x}_b,\mathbf{x}_c)$ will be the simplest in case of a desirable compact bi-decomposition. condition (3) allows us to find a maximal number of variables for the dedicated set $\mathbf{x}_b$ of an xor-bi-decomposition, but it unfortunately restricts the dedicated set $\mathbf{x}_a$ to a single variable $x_a$. as an initial solution we can calculate the decomposition function $g(x_a,\mathbf{x}_c)$ using (4) and the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$, using (5) and (6), from which the decomposition function $h(\mathbf{x}_b,\mathbf{x}_c)$ can be chosen. the set of all variables $\mathbf{x}$ is distributed to the three disjoint sets $\mathbf{x}_a = x_a$, $\mathbf{x}_b$, and $\mathbf{x}_c$. we assume that $\mathbf{x}_b$ contains as many variables as possible; it can be verified by (3) that no other variable can be added to the dedicated set $\mathbf{x}_b$ without losing the property of an xor-bi-decomposition of the given lattice.

a given xor-bi-decomposition is not compact if at least one variable can be moved from $\mathbf{x}_c$ to $\mathbf{x}_a$. moving a variable $x_i$ from $\mathbf{x}_c$ to $\mathbf{x}_a$ does not change the set of variables the function $g(\mathbf{x}_a,\mathbf{x}_c)$ depends on; however, it reduces the support of the function $h(\mathbf{x}_b,\mathbf{x}_c)$. the set of variables $\mathbf{x}_c$ is split into $x_i$ and $\mathbf{x}_{c0}$. due to the evaluation of condition (3) for all variables, we know that the function $h(\mathbf{x}_b,\mathbf{x}_c)$ depends on all variables $(\mathbf{x}_b,\mathbf{x}_c)$. hence, only another function $h'(\mathbf{x}_b,\mathbf{x}_{c0})$ is able to solve the problem. in the context of the xor-bi-decomposition, the following transformation steps show the key idea to solve the problem:

\[
\begin{aligned}
& g(\mathbf{x}_{a0},\mathbf{x}_c) \oplus h(\mathbf{x}_b,\mathbf{x}_c) && (7) \\
&= g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \oplus h(\mathbf{x}_b,\mathbf{x}_{c0},x_i) && (8) \\
&= g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \oplus (x_i \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0})) && (9) \\
&= (g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \oplus x_i) \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0}) && (10) \\
&= g'(\mathbf{x}_{a0},x_i,\mathbf{x}_{c0}) \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0}) && (11) \\
&= g'(\mathbf{x}_a,\mathbf{x}_{c0}) \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0}) ; && (12)
\end{aligned}
\]

• the step from (7) to (8) emphasizes the variable $x_i$ as element of the given set of variables $\mathbf{x}_c$;
• the step from (8) to (9) requires that the function $h(\mathbf{x}_b,\mathbf{x}_{c0},x_i)$ is linear in $x_i$. this property enables or prohibits the whole transformation;
• the step from (9) to (10) moves the variable $x_i$ to the other decomposition function of the xor-bi-decomposition;
• the step from (10) to (11) includes the variable $x_i$ into the new decomposition function $g'(\mathbf{x}_{a0},x_i,\mathbf{x}_{c0})$. this transformation is possible without any restriction;
• the step from (11) to (12) emphasizes that $x_i$ does not belong to the commonly used variables $\mathbf{x}_{c0}$, because $h'(\mathbf{x}_b,\mathbf{x}_{c0})$ does not depend on $x_i$; hence, $x_i$ extends the dedicated set of variables $\mathbf{x}_{a0}$ to $\mathbf{x}_a = (\mathbf{x}_{a0},x_i)$.

the only condition for the transformation from (7) to (12) is that the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function that satisfies:

\[ \frac{\partial h(\mathbf{x}_1,x_i)}{\partial x_i} = 1 . \]

theorem 1 (linear separation of a variable for a function of a lattice) a lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function $h(\mathbf{x}_1,x_i)$ that can be represented by

\[ h(\mathbf{x}_1,x_i) = x_i \oplus h'(\mathbf{x}_1) \tag{13} \]

if and only if the condition

\[ h_r^{\partial x_i}(\mathbf{x}_1) = 0 \tag{14} \]

is satisfied.

proof 1 necessary: due to condition (14) the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function that satisfies

\[ \frac{\partial h(\mathbf{x}_1,x_i)}{\partial x_i} = 1 . \tag{15} \]

it is well known that (15) is satisfied if the function $h(\mathbf{x}_1,x_i)$ is linear with regard to the variable $x_i$ as shown in (13). hence, we have theorem 1 in the direction (14) ⇒ (13).

sufficient: the function $h(\mathbf{x}_1,x_i)$ of (13) belongs to the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ so that it holds:

\[ h_q(\mathbf{x}_b,\mathbf{x}_c) \le h(\mathbf{x}_1,x_i) \le \overline{h_r(\mathbf{x}_b,\mathbf{x}_c)} . \tag{16} \]

using (13), the inequality (16) can be split into the two inequalities

\[
\begin{aligned}
h_q(\mathbf{x}_b,\mathbf{x}_c) &\le h(\mathbf{x}_1,x_i) = x_i \oplus h'(\mathbf{x}_1) \\
h_q(\mathbf{x}_b,\mathbf{x}_c) &\le (x_i \vee h'(\mathbf{x}_1)) (\overline{x}_i \vee \overline{h'(\mathbf{x}_1)}) ,
\end{aligned} \tag{17}
\]

and

\[
\begin{aligned}
\overline{h_r(\mathbf{x}_b,\mathbf{x}_c)} &\ge h(\mathbf{x}_1,x_i) = x_i \oplus h'(\mathbf{x}_1) \\
h_r(\mathbf{x}_b,\mathbf{x}_c) &\le \overline{h(\mathbf{x}_1,x_i)} = \overline{x_i \oplus h'(\mathbf{x}_1)} \\
h_r(\mathbf{x}_b,\mathbf{x}_c) &\le (\overline{x}_i \vee h'(\mathbf{x}_1)) (x_i \vee \overline{h'(\mathbf{x}_1)}) .
\end{aligned} \tag{18}
\]

the right-hand-side functions of (17) and (18) are equal to or larger than the mark functions $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and $h_r(\mathbf{x}_b,\mathbf{x}_c)$. the mark function $h_r^{\partial x_i}(\mathbf{x}_1)$ of condition (14) is defined by:

\[ h_r^{\partial x_i}(\mathbf{x}_1) = \min_{x_i} h_q(x_i,\mathbf{x}_1) \vee \min_{x_i} h_r(x_i,\mathbf{x}_1) . \tag{19} \]

now, we substitute the right-hand-side function of (17) for $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and the right-hand-side function of (18) for $h_r(\mathbf{x}_b,\mathbf{x}_c)$ into (19). these functions are equal to or larger than the replaced functions.
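so we get:

\[
\begin{aligned}
h_r^{\partial x_i}(\mathbf{x}_1)
&= \min_{x_i} \big( (x_i \vee h'(\mathbf{x}_1)) \wedge (\overline{x}_i \vee \overline{h'(\mathbf{x}_1)}) \big) \vee \min_{x_i} \big( (\overline{x}_i \vee h'(\mathbf{x}_1)) \wedge (x_i \vee \overline{h'(\mathbf{x}_1)}) \big) \\
&= \min_{x_i} (x_i \vee h'(\mathbf{x}_1)) \wedge \min_{x_i} (\overline{x}_i \vee \overline{h'(\mathbf{x}_1)}) \vee \min_{x_i} (\overline{x}_i \vee h'(\mathbf{x}_1)) \wedge \min_{x_i} (x_i \vee \overline{h'(\mathbf{x}_1)}) \\
&= \big( h'(\mathbf{x}_1) \vee \min_{x_i} x_i \big) \wedge \big( \overline{h'(\mathbf{x}_1)} \vee \min_{x_i} \overline{x}_i \big) \vee \big( h'(\mathbf{x}_1) \vee \min_{x_i} \overline{x}_i \big) \wedge \big( \overline{h'(\mathbf{x}_1)} \vee \min_{x_i} x_i \big) \\
&= \big( h'(\mathbf{x}_1) \vee 0 \big) \wedge \big( \overline{h'(\mathbf{x}_1)} \vee 0 \big) \vee \big( h'(\mathbf{x}_1) \vee 0 \big) \wedge \big( \overline{h'(\mathbf{x}_1)} \vee 0 \big) \\
&= h'(\mathbf{x}_1) \wedge \overline{h'(\mathbf{x}_1)} \vee h'(\mathbf{x}_1) \wedge \overline{h'(\mathbf{x}_1)} \\
&= 0 \vee 0 = 0 .
\end{aligned}
\]

hence, condition (14) is not only satisfied for the two mark functions $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and $h_r(\mathbf{x}_b,\mathbf{x}_c)$, but also for the function (13), which is linear with regard to $x_i$ and belongs to the evaluated lattice. that shows the implication (13) ⇒ (14) and completes the proof of theorem 1.

consequence 1 an xor-bi-decomposition is compact if the set of variables $\mathbf{x}_b$ is as large as possible (verified by condition (3)) and, within an iterative procedure, all variables $x_i$ of the initial set $\mathbf{x}_c$ that satisfy condition (14) are used to transform $g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i)$ to $g'(\mathbf{x}_a,\mathbf{x}_{c0})$ by

\[ g'(\mathbf{x}_a,\mathbf{x}_{c0}) = x_i \oplus g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \tag{20} \]

and the associated new lattice $L\langle h'_q(\mathbf{x}_b,\mathbf{x}_{c0}), h'_r(\mathbf{x}_b,\mathbf{x}_{c0})\rangle$ is adjusted by

\[ h'_q(\mathbf{x}_b,\mathbf{x}_{c0}) = \max_{x_i} \big( \overline{x}_i\, h_q(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \vee x_i\, h_r(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \big) , \tag{21} \]
\[ h'_r(\mathbf{x}_b,\mathbf{x}_{c0}) = \max_{x_i} \big( \overline{x}_i\, h_r(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \vee x_i\, h_q(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \big) . \tag{22} \]

a precondition for a compact xor-bi-decomposition is the existence of two variables $x_a$ and $x_b$ for which the given lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ contains at least one function that has a strong xor-bi-decomposition with regard to these variables. algorithm 1 analyzes whether there is an xor-bi-decomposition for the given lattice with regard to one pair of variables $x_a$ and $x_b$, and determines these variables if they exist.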
algorithm 1 initial xor-bi-decomposition of the lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ with regard to the variables $x_a$ and $x_b$
require: tvls of $f_q(\mathbf{x})$ and $f_r(\mathbf{x})$ in oda-form
ensure: boolean variable hasxorbd: it is true if the given lattice contains at least one xor-bi-decomposable function and false otherwise
ensure: sets of variables (sv) of $x_a$ and $x_b$: variables for which the lattice contains at least one function with a strong xor-bi-decomposition
 1: all_var ← sv_uni($f_q$, $f_r$)
 2: hasxorbd ← false
 3: $x_a$ ← ∅
 4: while ¬hasxorbd ∧ sv_next(all_var, $x_a$, $x_a$) do
 5:     $x_b$ ← $x_a$
 6:     $f_q^{\partial x_a}$ ← isc(maxk($f_q$, $x_a$), maxk($f_r$, $x_a$))
 7:     $f_r^{\partial x_a}$ ← uni(mink($f_q$, $x_a$), mink($f_r$, $x_a$))
 8:     while ¬hasxorbd ∧ sv_next(all_var, $x_b$, $x_b$) do
 9:         if te_isc(maxk($f_q^{\partial x_a}$, $x_b$), $f_r^{\partial x_a}$) then
10:             hasxorbd ← true
11:         end if
12:     end while
13: end while
14: return (hasxorbd, $x_a$, $x_b$)
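the search of algorithm 1 can be sketched in plain python. this is not the xboole implementation: truth tables are stored as integers (bit p holds the function value of the assignment encoded by p), the two nested loops over sv_next are replaced by loops over variable indices, and all helper names as well as the two test functions are assumptions made for illustration.

```python
def table(fn, n):
    """Truth table of fn(x1, ..., xn) as an int; variable i (0-based) is bit i of p."""
    t = 0
    for p in range(1 << n):
        if fn(*[(p >> i) & 1 for i in range(n)]):
            t |= 1 << p
    return t

def flip(t, i, n):
    """Table of the function obtained by complementing variable i."""
    out = 0
    for p in range(1 << n):
        if (t >> (p ^ (1 << i))) & 1:
            out |= 1 << p
    return out

def vmax(t, i, n):
    """Single maximum with regard to variable i."""
    return t | flip(t, i, n)

def vmin(t, i, n):
    """Single minimum with regard to variable i."""
    return t & flip(t, i, n)

def initial_xor_bd(fq, fr, n):
    """Return (True, a, b) for the first pair of variables that satisfies
    condition (3), or (False, None, None) if no such pair exists."""
    for a in range(n):
        dq = vmax(fq, a, n) & vmax(fr, a, n)      # line 6, formula (1)
        dr = vmin(fq, a, n) | vmin(fr, a, n)      # line 7, formula (2)
        for b in range(a + 1, n):                 # x_b runs over the variables after x_a
            if (vmax(dq, b, n) & dr) == 0:        # line 9, condition (3)
                return True, a, b
    return False, None, None

# assumed completely specified test function (f_r is the complement of f_q)
n = 5
fq = table(lambda x1, x2, x3, x4, x5:
           (x1 and not (x2 and x4)) ^ ((not x3) and (x4 ^ x5)) ^ x2 ^ x3, n)
fr = ~fq & ((1 << (1 << n)) - 1)
found = initial_xor_bd(fq, fr, n)

# the conjunction of three variables has no strong xor-bi-decomposition
and3 = table(lambda a, b, c: a and b and c, 3)
not_found = initial_xor_bd(and3, ~and3 & 0xFF, 3)
```

returning from inside the loops plays the role of the control variable hasxorbd that terminates both while-loops in the listing.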
algorithm 1 uses two nested while-loops for the selection of the variables $x_a$ and $x_b$. the basic set of all variables is prepared in line 1 using the xboole-function sv_uni (set of variables union). the sequential selection of the variables $x_a$ and $x_b$ is realized by two xboole-functions sv_next (set of variables next variable) that control these while-loops. the variable hasxorbd indicates the boolean result whether the lattice contains at least one function with a strong xor-bi-decomposition. this variable is also used to terminate both while-loops if a strong xor-bi-decomposition is detected. the xboole-functions isc (intersection), uni (union), maxk (k-fold maximum), and mink (k-fold minimum) calculate in lines 6 and 7 the mark functions of the derivative of the given lattice with regard to the single variable $x_a$. the xboole-function te_isc (test empty intersection) checks in line 9 condition (3) for the strong xor-bi-decomposition with regard to the currently selected variables $x_a$ and $x_b$.
in the case that this condition is satisfied, the control variable hasxorbd is changed to the value true in line 10.

algorithm 2 extends a found initial xor-bi-decomposition to a compact one. initial steps determine all variables (line 1), the basic sets of commonly used variables $\mathbf{x}_c$ (line 2) and dedicated variables $\mathbf{x}_b$ (line 3), and the precondition (line 4) for the selection of variables $x_b$ by means of the xboole-function sv_next in line 8. the mark functions of the derivative of the given lattice with regard to the single variable $x_a$ are needed in condition (3) to decide about the possibility to extend the set $\mathbf{x}_b$; they are calculated in lines 5 and 6 based on (1) and (2). the help-function $h_0$ stores the intermediate result of the k-fold maximum with regard to the already known variables of the set $\mathbf{x}_b$ (line 7).

the while-loop in lines 8 to 13 extends the set $\mathbf{x}_b$ to the maximal possible number of variables of a strong xor-bi-decomposition for the given lattice. condition (3) is verified in line 9 for the temporarily extended set $\mathbf{x}_b$. if this condition is satisfied for the set of variables $\mathbf{x}_b \cup x_b$, the set of variables $\mathbf{x}_b$ is permanently extended in line 10 and the help-function $h_0$ is adjusted in line 11.

knowing the maximal set of variables $\mathbf{x}_b$, basic versions of the wanted functions can be calculated:
• $g(x_a,\mathbf{x}_c)$ based on (4) in line 14;
• $h_q(\mathbf{x}_b,\mathbf{x}_c)$ based on (5) in line 15; and
• $h_r(\mathbf{x}_b,\mathbf{x}_c)$ based on (6) in line 16.

in a second while-loop (lines 20 to 28) the set of variables $\mathbf{x}_a$ is extended. initial steps determine the new set of commonly used variables $\mathbf{x}_c$ (line 17), the basic set of variables $\mathbf{x}_a$ (line 18), and the selection variable $x_i$ (line 19) needed to evaluate the possibility of the extension of $\mathbf{x}_a$.

algorithm 2 compact strong xor-bi-decomposition of the lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$
require: tvls of $f_q(\mathbf{x})$ and $f_r(\mathbf{x})$; initial svs of $x_a$ and $x_b$
ensure: tvl of $g(\mathbf{x}_a,\mathbf{x}_c)$: decomposition function
ensure: tvls of $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and $h_r(\mathbf{x}_b,\mathbf{x}_c)$: decomposition lattice
ensure: svs of $\mathbf{x}_a$, $\mathbf{x}_b$, and $\mathbf{x}_c$: disjoint sets of variables
 1: all_var ← sv_uni($f_q$, $f_r$)
 2: $\mathbf{x}_c$ ← sv_dif(sv_dif(all_var, $x_a$), $x_b$)
 3: $\mathbf{x}_b$ ← $x_b$
 4: $x_b$ ← ∅
 5: $f_q^{\partial x_a}$ ← isc(maxk($f_q$, $x_a$), maxk($f_r$, $x_a$))
 6: $f_r^{\partial x_a}$ ← uni(mink($f_q$, $x_a$), mink($f_r$, $x_a$))
 7: $h_0$ ← maxk($f_q^{\partial x_a}$, $\mathbf{x}_b$)
 8: while sv_next($\mathbf{x}_c$, $x_b$, $x_b$) do
 9:     if te_isc(maxk($h_0$, $x_b$), $f_r^{\partial x_a}$) then
10:         $\mathbf{x}_b$ ← sv_uni($\mathbf{x}_b$, $x_b$)
11:         $h_0$ ← maxk($f_q^{\partial x_a}$, $\mathbf{x}_b$)
12:     end if
13: end while
14: $g$ ← isc(sv_get($x_a$), $h_0$)
15: $h_q$ ← maxk(uni(isc($\overline{g}$, $f_q$), isc($g$, $f_r$)), $x_a$)
16: $h_r$ ← maxk(uni(isc($\overline{g}$, $f_r$), isc($g$, $f_q$)), $x_a$)
17: $\mathbf{x}_c$ ← sv_dif(sv_dif(all_var, $x_a$), $\mathbf{x}_b$)
18: $\mathbf{x}_a$ ← $x_a$
19: $x_i$ ← ∅
20: while sv_next($\mathbf{x}_c$, $x_i$, $x_i$) do
21:     if te_uni(mink($h_q$, $x_i$), mink($h_r$, $x_i$)) then
22:         $\mathbf{x}_a$ ← sv_uni($\mathbf{x}_a$, $x_i$)
23:         $x_i$ ← sv_get($x_i$)
24:         $g$ ← syd($x_i$, $g$)
25:         $h_q$ ← maxk(uni(isc($\overline{x}_i$, $h_q$), isc($x_i$, $h_r$)), $x_i$)
26:         $h_r$ ← maxk(uni(isc($\overline{x}_i$, $h_r$), isc($x_i$, $h_q$)), $x_i$)
27:     end if
28: end while
29: $\mathbf{x}_c$ ← sv_dif($\mathbf{x}_c$, $\mathbf{x}_a$)
30: return ($g$, $h_q$, $h_r$, $\mathbf{x}_a$, $\mathbf{x}_b$, $\mathbf{x}_c$)
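the second while-loop of algorithm 2 (lines 20 to 28) can be sketched in plain python as follows. the sketch is an assumption-laden illustration, not the xboole code: truth tables are integers, the helper names (`table`, `flip`, `var`, `extend_xa`) are hypothetical, and the start values are an assumed function $g$ and an assumed completely specified lattice of $h$. both new mark functions are computed from the old $h_q$ and $h_r$ before either of them is overwritten.

```python
def table(fn, n):
    """Truth table of fn(x1, ..., xn) as an int; variable i (0-based) is bit i of p."""
    t = 0
    for p in range(1 << n):
        if fn(*[(p >> i) & 1 for i in range(n)]):
            t |= 1 << p
    return t

def flip(t, i, n):
    """Table of the function obtained by complementing variable i."""
    out = 0
    for p in range(1 << n):
        if (t >> (p ^ (1 << i))) & 1:
            out |= 1 << p
    return out

def var(i, n):
    """Truth table of the single variable at position i."""
    return table(lambda *x: x[i], n)

def extend_xa(g, hq, hr, xc, n):
    """Move every variable of xc that satisfies condition (14) into x_a.
    Returns the adjusted g, hq, hr and the list of moved variable positions."""
    moved = []
    for i in xc:
        off_set = (hq & flip(hq, i, n)) | (hr & flip(hr, i, n))   # formula (2) for h
        if off_set:                      # condition (14) fails for this variable
            continue
        xi = var(i, n)
        g ^= xi                                      # formula (20), line 24
        new_hq = (~xi & hq) | (xi & hr)              # argument of the maximum in (21)
        new_hr = (~xi & hr) | (xi & hq)              # argument of the maximum in (22)
        hq = new_hq | flip(new_hq, i, n)             # maximum with regard to x_i
        hr = new_hr | flip(new_hr, i, n)
        moved.append(i)
    return g, hq, hr, moved

# assumed start values: g depends on (x1, x2, x4), h on (x2, x3, x4, x5)
n = 5
FULL = (1 << (1 << n)) - 1
g0 = table(lambda x1, x2, x3, x4, x5: x1 and not (x2 and x4), n)
h0 = table(lambda x1, x2, x3, x4, x5: ((not x3) and (x4 ^ x5)) ^ x2 ^ x3, n)

# candidates are the commonly used variables x2 and x4 (positions 1 and 3)
g, hq, hr, moved = extend_xa(g0, h0, ~h0 & FULL, [1, 3], n)
```

for these assumed start values only $x_2$ passes the test of condition (14); $x_4$ stays in the set of commonly used variables.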
condition (14) of theorem 1 is verified in line 21 using the formula (2) to determine the off-set of the derivative of a lattice with regard to the single variable $x_i$.
if this condition is satisfied:
• the set of variables $\mathbf{x}_a$ is extended by $x_i$ in line 22 using the xboole-function sv_uni (set of variables union);
• $x_i$ is transformed in line 23 from a set of variables into the tvl representing $x_i = 1$ using the xboole-function sv_get (set of variables get);
• the new function $g$ is calculated in line 24 based on (20) using the xboole-function syd (symmetric difference);
• the new on-set function $h_q(\mathbf{x})$ is calculated in line 25 based on (21); and
• the new off-set function $h_r(\mathbf{x})$ is calculated in line 26 based on (22).

the restriction of the set of commonly used variables $\mathbf{x}_c$ is realized in line 29 outside of the loop, because the unchanged basic set $\mathbf{x}_c$ is needed in the xboole-function sv_next in line 20 to select the next variable $x_i$. the complement operations in lines 15, 16, 25, and 26 are realized using the xboole-function cpl.

5 example

5.1 the chosen lattice and conditions for the synthesis

[fig. 1. karnaugh-maps of (a) the given lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ with three don't-cares φ, and (b) the chosen function $y = f(\mathbf{x})$ of both bi-decompositions; rows: $x_4, x_5$; columns: $x_1, x_2, x_3$.]

figure 1 (a) shows the karnaugh-map of a lattice of eight boolean functions. the simplest multi-level circuit structure for one of these functions must be found using and-, or-, and xor-gates of two inputs, where these inputs can arbitrarily be negated. the gates can be reused to simplify the circuit. a minimal disjunctive form, calculated by means of the well-known quine-mccluskey algorithm, serves as the basis for comparison.
the synthesis of the given lattice of functions by bi-decompositions has been realized using both the known non-compact xor-bi-decomposition and the new compact xor-bi-decomposition. using conditions given in [3, 4, 11] it can be verified that this lattice does not contain any function which has a strong bi-decomposition with regard to any dedicated sets of variables $\mathbf{x}_a$ and $\mathbf{x}_b$ for an and- or an or-gate. figure 1 (b) shows the function chosen by both the known and the new bi-decomposition approach. two don't-cares are assigned to 0 and the other to 1. the simplest minimal disjunctive form realizes the function $f_q(\mathbf{x})$ of the lattice where all don't-cares are assigned to 0.

5.2 synthesis by covering using a minimal disjunctive form

the execution of the quine-mccluskey algorithm results in two minimal disjunctive forms of the same complexity. both of them realize the on-set function $f_q(\mathbf{x})$ and require the same number of gates and levels in a circuit. the chosen minimal disjunctive form is:

\[
\begin{aligned}
f_q(\mathbf{x}) ={}& (\overline{x}_1 \overline{x}_2) x_3 \vee (\overline{x}_1 \overline{x}_2)(\overline{x}_4 x_5) \vee (x_1 x_2)(\overline{x}_4 x_5) \vee (x_2 \overline{x}_3)(x_4 x_5) \vee {} \\
& (x_1 x_2)(x_3 \overline{x}_4) \vee (x_1 \overline{x}_3)(x_4 x_5) \vee (\overline{x}_1 (x_2 \overline{x}_3))(\overline{x}_4 \overline{x}_5) \vee (\overline{x}_2 (x_1 \overline{x}_3))(\overline{x}_4 \overline{x}_5) .
\end{aligned} \tag{23}
\]

the parentheses in the conjunctions in (23) emphasize the chosen two-input and-gates. figure 2 shows the associated circuit structure in which as many and-gates as possible are reused. the disjunction of eight conjunctions is realized by a tree of seven or-gates. seven and-gates could be reused to build another conjunction. in total there are 18 and-gates. the complete circuit consists of 25 two-input gates on six levels.

5.3 synthesis using the known non-compact xor-bi-decomposition

using condition (3) it was found that the lattice of figure 1 (a) contains at least one function that is xor-bi-decomposable with regard to the single variable $x_a = x_1$ and the dedicated set $\mathbf{x}_b = (x_3,x_5)$. hence, the set of commonly used variables is $\mathbf{x}_c = (x_2,x_4)$.
the decomposition function $g(x_a,\mathbf{x}_c)$ of an xor-bi-decomposition is uniquely specified by (4), and we get

\[ g_1(x_1,x_2,x_4) = x_1 \wedge \overline{(x_2 \wedge x_4)} . \tag{24} \]

it can directly be seen that there is a strong and-bi-decomposition of $g_1(x_1,x_2,x_4)$ into $g_2 = x_1$ and $h_2 = \overline{(x_2 \wedge x_4)}$. no further decomposition is needed for these functions.
[fig. 2. circuit structure synthesized by quine-mccluskey and reused two-input gates; inputs $x_1,\dots,x_5$, output $y = f_q(\mathbf{x})$.]

[fig. 3. circuit structure synthesized using the old non-compact xor-bi-decomposition; inputs $x_1,\dots,x_5$, decomposition functions $g_1, g_2, h_2, h_1, g_3, h_3, g_4, h_4$, output $y = f(\mathbf{x})$.]

the lattice of the decomposition function $h_1$ can be calculated by (5) and (6) and contains in this example the single function

\[ h_1(x_2,x_3,x_4,x_5) = (\overline{x}_3 \wedge (x_4 \oplus x_5)) \oplus (x_2 \oplus x_3) . \tag{25} \]

by means of condition (3) it can be verified that an xor-bi-decomposition of $L\langle h_{1q}, h_{1r}\rangle$ with regard to $x_a = x_2$ and $\mathbf{x}_b = (x_4,x_5)$ exists. hence, only the variable $x_3$ belongs to the set of commonly used variables $\mathbf{x}_c$. the decomposition function $g(x_a,\mathbf{x}_c)$ of this second xor-bi-decomposition was again calculated by (4): $g_3(x_2,x_3) = x_2 \oplus x_3$.
the lattice $L\langle h_{3q}, h_{3r}\rangle$ contains only the single function $h_3(x_3,x_4,x_5) = \overline{x}_3 \wedge (x_4 \oplus x_5)$, for which a strong and-bi-decomposition into $g_4 = \overline{x}_3$ and $h_4 = (x_4 \oplus x_5)$ exists. no further decomposition is needed for these functions. figure 3 shows the synthesized circuit consisting of seven gates on four levels.

5.4 optimized synthesis using the new compact xor-bi-decomposition

for direct comparison we demonstrate the utilization of a linearly separable variable to get a compact xor-bi-decomposition using the same lattice (shown in figure 1 (a)) as before.

[fig. 4. circuit structure synthesized using the new compact xor-bi-decomposition; inputs $x_1,\dots,x_5$, decomposition functions $g_1, g_2, h_2, h_1, g_3, h_3$, output $y = f(\mathbf{x})$.]

using condition (3) in algorithm 1, the initial xor-bi-decomposition with regard to the single variables $x_a = x_1$ and $x_b = x_3$ is found. in the first part of algorithm 2 (lines 1 to 13) the dedicated set $\mathbf{x}_b$ could be extended to $(x_3,x_5)$ due to the check of the variables $x_2$, $x_4$, and $x_5$ in line 9, embedded in the loop of lines 8 to 13. hence, the basic set of commonly used variables is $\mathbf{x}_c = (x_2,x_4)$, which is determined in line 17. algorithm 2 finds by condition (14) in line 21, in the while-loop in lines 20 to 28, that the so far detected lattice of $h_1$ contains a function that is linear with regard to $x_2$. hence, $x_2$ is included into the set $\mathbf{x}_a$ in line 22, and the new function $g_1$ is calculated in lines 23 and 24 using the basic function $g'_1$ of (24):

\[ g_1(x_1,x_2,x_4) = x_2 \oplus g'_1(x_1,x_2,x_4) = x_2 \oplus \big( x_1 \wedge \overline{(x_2 \wedge x_4)} \big) = (x_1 \oplus x_2) \vee (x_2 \wedge x_4) . \tag{26} \]
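the last equality in (26) can be checked by a shannon expansion with regard to $x_2$. the short derivation below is a sketch that assumes the complement over $(x_2 \wedge x_4)$ in the basic function $g'_1$ of (24); both cofactors of the left-hand side agree with the cofactors of the right-hand side.

```latex
\begin{aligned}
x_2 = 0:\quad & 0 \oplus \big( x_1 \wedge \overline{(0 \wedge x_4)} \big) = x_1
              = (x_1 \oplus 0) \vee (0 \wedge x_4) , \\
x_2 = 1:\quad & 1 \oplus \big( x_1 \wedge \overline{(1 \wedge x_4)} \big)
              = \overline{x_1 \wedge \overline{x}_4}
              = \overline{x}_1 \vee x_4
              = (x_1 \oplus 1) \vee (1 \wedge x_4) .
\end{aligned}
```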
using (21) and (22) the new lattice $L\langle h_{1q}, h_{1r}\rangle$ is calculated in lines 25 and 26 of algorithm 2. due to the special case of the completely specified function $h'_1$ of (25), the result is also a completely specified function that is calculated by (21):

\[
\begin{aligned}
h_1(x_3,x_4,x_5) &= \max_{x_2} \big( x_2 \oplus h'_1(x_2,x_3,x_4,x_5) \big) \\
&= \max_{x_2} \big( x_2 \oplus (\overline{x}_3 \wedge (x_4 \oplus x_5)) \oplus (x_2 \oplus x_3) \big) \\
&= (\overline{x}_3 \wedge (x_4 \oplus x_5)) \oplus x_3 \\
&= (x_4 \oplus x_5) \vee x_3 .
\end{aligned} \tag{27}
\]

hence, $h_1(x_3,x_4,x_5)$ depends only on three variables, and the comparison with $g_1(x_1,x_2,x_4)$ confirms that only the variable $x_4$ is shared. in this way the single variable of the dedicated set $\mathbf{x}_a$ is implicitly extended to $\mathbf{x}_a = (x_1,x_2)$.
algorithm 2 explicitly realizes this extension in line 22 using the xboole operation sv_uni (set of variables union). the expressions (26) and (27) show that there are or-bi-decompositions for both decomposition functions $g_1$ and $h_1$. figure 4 shows that the circuit structure, realized by means of the new method, only needs six gates on three levels.

5.5 comparison of the synthesis results

table 1 summarizes the results of the synthesis of the given lattice of boolean functions realized by:
• the covering method using the quine-mccluskey approach to get a minimal disjunctive form which has been split into two-input gates that are reused as much as possible;
• the bi-decomposition method where the known xor-bi-decomposition of a lattice is restricted to the assignment of a single variable to the dedicated set $\mathbf{x}_a$;
• the bi-decomposition method using the new xor-bi-decomposition for a lattice that is able to realize a compact xor-bi-decomposition.

both the needed area and the power consumption are estimated by the number of gates. the benefit of the bi-decomposition in comparison to the covering method is evident; the number of gates could be reduced, despite the seven reused gates in the covering approach, from 25 to seven in case of the known bi-decomposition and even to six when the new compact xor-bi-decomposition is used. for the new compact xor-bi-decomposition this is a reduction of the needed area as well as the power consumption to 24% in comparison to the covering method, or to 85.7% with regard to the so far used non-compact xor-bi-decomposition.

table 1. comparison of needed area, power consumption, and maximal delay

effect to | used count                          | covering method | known xor-bi-decomposition | new compact xor-bi-decomposition | ratio new compact / covering | ratio new compact / known
area      | number of gates                     | 25              | 7                          | 6                                | 24.0 %                       | 85.7 %
power     | number of gates                     | 25              | 7                          | 6                                | 24.0 %                       | 85.7 %
delay     | number of gates in the longest path | 6               | 4                          | 3                                | 50.0 %                       | 75.0 %

the maximal delay of the synthesized circuit can be estimated by the number of gates in the longest path, which is equal to the number of gate levels. the bi-decomposition outperforms the covering method also regarding the maximal delay. the new compact xor-bi-decomposition was able to reduce the maximal delay to one half in comparison to the covering method, or to 75% with regard to the known non-compact xor-bi-decomposition.

6 conclusions

lattices of boolean functions provide the possibility to choose the function for which the circuit needs a small area, has a low power consumption, and has a short delay time. the bi-decomposition is a very powerful method to synthesize circuits that improve these parameters in comparison to covering methods. the theory to find compact strong bi-decompositions was so far only known for and- and or-gates, whereas strong xor-bi-decompositions were restricted to a single variable in the dedicated set $\mathbf{x}_a$. the results of this paper close this gap of a missing compact xor-bi-decomposition for lattices of boolean functions. the paper provides both the needed new theory and its application in algorithms using xboole [19, 20] for the calculation of compact xor-bi-decompositions for lattices of boolean functions. in a very simple example the gate count (needed area, power consumption) could be reduced to 24 percent in comparison to an exact covering method and to 85.7 percent with regard to the known bi-decomposition.
for the same example the length of the longest path (maximal delay) could be reduced to one half in comparison to an exact covering method and to 75 percent in comparison to the known bi-decomposition.

references

[1] d. bochmann, f. dresig, and b. steinbach. “a new decomposition method for multilevel circuit design”. in: proceedings of the conference on european design automation. edac ’91. amsterdam, the netherlands: ieee computer society, 1991, pp. 374–377.
[2] t. le. “testbarkeit kombinatorischer schaltungen theorie und entwurf”. written in german, english title: testability of combinational circuits theory and design. phd thesis. tu karl-marx-stadt, germany, 1989.
[3] c. posthoff and b. steinbach. logic functions and equations – binary models for computer science. dordrecht, the netherlands: springer, 2004.
[4] b. steinbach and c. posthoff. boolean differential calculus. synthesis lectures on digital circuits and systems 52. san rafael, ca, usa: morgan & claypool, 2017.
[5] a. mishchenko, b. steinbach, and m. perkowski. “an algorithm for bi-decomposition of logic functions”. in: proceedings of the 38th annual design automation conference. dac ’01. las vegas, nevada, usa: acm, 2001, pp. 103–108.
[6] b. steinbach. “vectorial bi-decompositions of logic functions”. in: proceedings of the reed-muller workshop 2015. rm 4. waterloo, canada, 2015.
[7] b. steinbach and c. posthoff. “vectorial bi-decompositions for lattices of boolean functions”. in: proceedings of the 12th international workshops on boolean problems. iwsbp. freiberg, germany: freiberg university of mining and technology, 2016, pp. 93–104.
[8] a. thayse. “boolean differential calculus”. in: philips research reports 26 (1971). r 764, pp. 229–246.
[9] m. davio and a. thayse. “boolean differential calculus and its application to switching theory”. in: ieee transactions on computers 22.4 (1973), pp. 409–420.
[10] b. steinbach and c. posthoff. logic functions and equations examples and exercises. springer science + business media b.v., 2009.
[11] b. steinbach and c. posthoff. “boolean differential calculus theory and applications”. in: journal of computational and theoretical nanoscience 7.6 (2010), pp. 933–981.
[12] b. steinbach and c. posthoff. “boolean differential calculus”. in: progress in applications of boolean functions. synthesis lectures on digital circuits and systems 26. san rafael, ca, usa: morgan & claypool, 2010, pp. 55–78.
[13] t. sasao and j. butler. “on bi-decompositions of logic functions”. in: 6th international workshop on logic & synthesis. iwls. granlibakken resort tahoe city, ca, usa, 1997, pp. 1–6.
[14] m. choudhury and k. mohanram. “bi-decomposition of large boolean functions using blocking edge graphs”. in: 2010 ieee/acm international conference on computer-aided design. iccad. 2010, pp. 586–591.
[15] d. cheng and x. xu. “bi-decomposition of logical mappings via semi-tensor product of matrices”. in: automatica 49.7 (2013), pp. 51–76.
[16] b. steinbach. “generalized lattices of boolean functions utilized for derivative operations”. in: materiały konferencyjne knws’13. knws ’13. łagów, poland, 2013, pp. 1–17.
[17] b. steinbach. “derivative operations for lattices of boolean functions”. in: proceedings of the reed-muller workshop 2013. rm ’13. toyama, japan, 2013, pp. 110–119.
[18] b. steinbach and a. wereszczynski. “synthesis of multi-level circuits using exor-gates”.
in: ifip wg 10.5 workshop on applications of the reed-muller expansion in circuit design. chiba makuhari, japan, 1995, pp. 161–168.
[19] b. steinbach. “xboole a toolbox for modelling, simulation, and analysis of large digital systems”. in: systems analysis and modelling simulation 9.4 (1992), pp. 297–312.
[20] b. steinbach and m. werner. “xboole-cuda fast calculations of large boolean problems on the gpu”. in: problems and new solutions in the boolean domain. ed. by b. steinbach. newcastle upon tyne, uk: cambridge scholars publishing, 2016, pp. 117–149.

facta universitatis series: electronics and energetics vol. 32, no 3, september 2019, pp. 449-461 https://doi.org/10.2298/fuee1903449b

frequency scanning antenna arrays with metamaterial based phased shifters

nikola bošković1, branka jokanović1, vera marković2
1institute of physics, university of belgrade, serbia
2faculty of electronic engineering, university of niš, serbia

abstract. this paper presents a simple design of linear series-fed frequency scanning antenna arrays with: (a) identical rectangular dipoles and (b) pentagonal dipoles having different impedances to provide enhanced side lobe suppression. phase shifters are designed as a metamaterial unit cell consisting of split-ring resonators coupled with the parallel microstrip line. variations of the shifter model are described and control of the phase is demonstrated. two antenna arrays are manufactured and measured.

key words: scanning antenna array, linear array, series feeding, pentagonal dipole, phase shifter, split-ring resonator.

1. introduction

antenna elements come in various forms in terms of technology, size, cost and radiation properties. nevertheless, a single antenna typically has an omnidirectional radiation pattern and low gain.
in many applications, there is a need for directional and high gain radiation, which can be generated by combining multiple antenna elements in different arrangements. one of the most notable problems in antenna arrays is side lobe emergence. side lobes can be observed as radiation in unwanted directions, a direct consequence of the configuration of the array elements. high side lobe levels can make it hard to isolate desired signals and to resolve uncertainties in determining the position of a specific object, which is especially important in radar applications. two main factors determine the side lobe levels (slls): the power distribution between the elements and the distance between them. typical demands are slls of -20 db [1], relative to the main lobe. slls from -30 to -20 db can typically be achieved solely by the power distribution, but for higher side lobe suppression the distance between the elements must be considered as well.

printed antennas are by far the most popular antennas for applications due to size and shape diversity, ease of fabrication and integration, low cost and high flexibility in resonant frequency, polarization, radiation pattern and impedance. they come in the form of patches, dipoles, slots etc. their main drawback is a typically low power handling capability due to the low thermal conductivity of regularly used dielectrics. however, developing various dielectrics with high thermal conductivity, similar to aluminum nitride (aln) ceramic, can overcome even this obstacle. other problems like surface waves, spurious radiation and losses can be controlled in different ways [2-10].

received november 2, 2018; received in revised form february 6, 2019
corresponding author: nikola bošković, institute of physics, university of belgrade, pregrevica 118, 11080 belgrade, serbia (e-mail: nikolab@ipb.ac.rs)
in order to achieve low slls in a frequency scanning antenna with a linear element arrangement, the appropriate power distribution can be extremely hard to implement because it can require a very high ratio of the impedances of the radiating elements. in addition, there is a need to maintain the desired distribution in a wide frequency band while avoiding main beam deformation, which can be caused by beam squinting due to the frequency change [11].

in this paper, experimental results from the array with regular dipoles are compared with experimental results from the array with the enhanced pentagonal dipoles. both models use the same shifter based on four srrs, the same dielectric and the same distance between elements, and both are designed to work at 10 ghz, which makes the comparison fair.

fig. 1 antenna array feeds: (a) series feed and (b) corporate feed.

2. printed frequency scanning antenna

with technological development, there is a great need for scanning antennas, which enable tracking of a specific position in space with great accuracy and resolution [12-15]. in the past, such solutions were predominantly based on a combination of the antenna array and a customized mechanical system for pointing the array in a specific direction. such systems have limited agility, require periodic maintenance of the mechanical parts and, due to their high price, have limited usage. with the development of modern electronics, frequency scanning was introduced at a much lower price and with better performance and reliability. a basic frequency scanning antenna array consists of radiating elements in a specific spatial distribution and frequency dependent elements between them, which introduce a different phase shift depending on the applied frequency; hence these elements can be called phase shifters.
a typical configuration is a linear series-fed array, which enables scanning in one plane. combining multiple linear arrays in a planar array, we can obtain scanning in the second plane. scanning in the second plane is typically achieved in a different manner than in a linear array. a frequency scanning antenna enables continuous coverage of the spatial range of scanning. the angular resolution depends on the 3 db beamwidth of the main beam, and the speed of scanning is determined by the frequency dependence of the phase shifter. for linear arrangements, an array of n radiating elements requires n-1 phase shifters, since for basic operation the phase shift ∆φ between two successive elements is constant, but there is also a need for a constant phase increment from the first to the last element in the plane of scanning. depending on their structure, phase shifters can have significant losses and non-linearity, which can seriously degrade the array characteristics. in addition, in order to provide low slls a suitable power distribution needs to be implemented, which can be challenging in a series-fed array (fig. 1a) because the phase shifters and radiating elements change their performance over the frequency range. because of that, full corporate-fed arrays (fig. 1b) are occasionally used, but they require a much greater number of phase shifters, are significantly larger and are less efficient.

another commonly used type of electronic scanning is via switches. the principle is that every antenna element is connected to the power source via one of several available phase shifters. the shifter is selected by on/off switching, and the main beam is positioned in a certain direction. the whole system can work at a single frequency, but the number of available directions depends on the number of available phase shifts. similar basic principles of operation are used with the rotman lens [16-17], the butler matrix [18-19] and similar structures.
they typically have n inputs and n outputs, which are connected to the antenna elements. connecting the power source to different inputs generates different main beam positions. combining one of these types of electronic scanning in one plane with frequency scanning in the other enables scanning in two planes at the same time.

a frequency scanning antenna should be cheaper, easier to manufacture, more stable in its characteristics over the working range, more efficient and easier to integrate with other components than an electronic scanning antenna. a natural choice would be a printed antenna structure with antenna elements having stable radiation and impedance characteristics in the working range. the printed pentagonal dipole is an excellent choice as a radiating element: it satisfies the demand for a higher impedance bandwidth because it works at the second resonance, and it has stable radiation characteristics in a wide frequency band [20-21].

2.1. antenna array technology

the printed dipole can be naturally implemented in two different technologies. one is the coplanar stripline (cps) and the other is the symmetrical (balanced) microstrip line. cps is a balanced uniplanar transmission line consisting of two metallic conductor strips separated by a certain gap width on a substrate. the cps line has no bottom metallization of the substrate for the ground; instead, the virtual ground is placed at the symmetry plane between the two conductors. the balanced microstrip line is equivalent to the classical microstrip line and is represented by two identical parallel transmission lines, one on each side of the dielectric. for a given substrate height and line width, the impedance of the balanced microstrip line equals double the impedance of a microstrip line having identical width and half the substrate height.
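the doubling rule for the balanced microstrip line can be illustrated numerically. the sketch below is only an illustration: it assumes the widely used hammerstad closed-form approximation for the microstrip impedance with zero strip thickness, which is not taken from this paper, and the function names and example widths are hypothetical. the substrate values (dielectric constant 2.17, height 0.508 mm) are the rogers 5880 values quoted for [24].

```python
import math


def microstrip_z0(w, h, eps_r):
    """Approximate microstrip impedance in ohms (Hammerstad closed form).

    w: strip width, h: substrate height (same units), eps_r: dielectric constant.
    Assumption: zero strip thickness, quasi-static approximation.
    """
    u = w / h
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 / u)
    if u <= 1.0:
        return 60 / math.sqrt(eps_eff) * math.log(8 / u + u / 4)
    return 120 * math.pi / (
        math.sqrt(eps_eff) * (u + 1.393 + 0.667 * math.log(u + 1.444))
    )


def balanced_microstrip_z0(w, h, eps_r):
    """Balanced line: twice the microstrip impedance at half the substrate height."""
    return 2 * microstrip_z0(w, h / 2, eps_r)


if __name__ == "__main__":
    # Hypothetical strip widths in mm on a 0.508 mm substrate with eps_r = 2.17.
    for width_mm in (0.5, 1.0, 2.0):
        z = balanced_microstrip_z0(width_mm, 0.508, 2.17)
        print(f"w = {width_mm} mm -> balanced line impedance ~ {z:.1f} ohm")
```

because the balanced line is evaluated at half the substrate height, a wider strip lowers the impedance quickly, which is the property that makes it attractive for the series-fed array.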
the cps line offers flexibility in the design of planar microwave and millimeter-wave circuits, especially in mounting solid-state devices in series or shunt without via holes. it exhibits low loss, small dispersion, small discontinuity parasitics, considerable insensitivity to substrate thickness and simple implementation of open- and short-circuits. the cps line has a typical impedance value around 200 ohms, which is much higher than the typical microstrip line impedance of 50 ohms. in the series-fed array, it is very important to have available a high impedance ratio between the feeding transmission line and the radiating elements in order to achieve the proper power distribution. since the balanced microstrip line can achieve a much lower impedance value, it is a better choice for this type of array.

2.2. frequency scanning performance

the frequency bandwidth is a valuable and limited resource, and certain bands are restricted for specific use [22]. two important parameters for frequency scanning systems are range resolution and angular resolution. the angular resolution of beam scanning systems is defined by the 3 db beamwidth of the antenna main lobe. it means that two identical targets at the same distance are resolved in angle if they are separated by more than the antenna 3 db beamwidth. an antenna system with a fixed beam provides only range resolution. range resolution is the ability of an antenna system to distinguish between two or more targets on the same bearing but at different ranges. the pulse width is the primary factor in range resolution, and it is generally the inverse of the pulse bandwidth. the higher the available bandwidth, the greater the range precision that can be obtained. frequency scanning provides the angular resolution. a narrower 3 db beamwidth provides greater precision in determining the angular position of the target. the performance of frequency scanning systems is a typical trade-off between angular and range resolution. the simplest phase shifter is a basic transmission line.
its length is directly proportional to its phase shift contribution. any phase shifter can be approximated by a transmission line of a certain length. two parameters determine the overall position of the main beam during scanning for a simple antenna array: the distance between the radiating elements (d) and the length of the transmission line (l), as shown in fig. 2. the dependence between the scanning angle θ and the relative frequency change ∆f / f0 is given as:

\[ \sin\theta = \frac{l}{d}\,\frac{\Delta f}{2 f_0} = \frac{\Delta\varphi}{360^\circ}\,\frac{\lambda_0}{d}, \qquad \Delta f = f - f_0 \tag{1} \]

where the beam is steered over the limits ±θ, f0 represents the central frequency at which the main beam is positioned broadside, λ0 is the free space wavelength at the central frequency and ∆φ is the phase shift between two succeeding radiating elements (phase increment). if the distance between the elements is fixed at the typical value of 0.5λ0, the length l and the available frequency bandwidth determine the scanning properties.

fig. 2 frequency scanning antenna array with different positions of the main beam.

as can be seen, the same scanning angle (sector) can be obtained for different values of l and ∆f, as long as one of them is fixed. in practical applications, the frequency bandwidth is specified and l is used for obtaining a specific scanning angle. as a practical example, let us say that the available relative bandwidth is 20% and the required scanning is ±25°; then from (1) l would have to be around 4.2d, that is 2.1λ0 for the previously stated typical value of d. for 10% relative bandwidth, that value would be 4.2λ0. in both cases, the resulting phase shift would be around ±76 degrees. the relative bandwidth in (1) is equal to ∆f / f0, since the total scanning sector is 2θ.
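the worked numbers above can be checked with a few lines of code. this is a minimal sketch, assuming that the relative bandwidth b = ∆f / f0 covers the whole sector ±θ, so that sin θ = (l/d)(b/2), and that the phase increment follows from sin θ = (∆φ/360°)(λ0/d); the function names are hypothetical.

```python
import math


def line_length_over_spacing(theta_deg, rel_bandwidth):
    """Ratio l/d needed to steer the beam over +/-theta within the relative bandwidth."""
    return 2.0 * math.sin(math.radians(theta_deg)) / rel_bandwidth


def phase_increment_deg(theta_deg, d_over_lambda0=0.5):
    """Phase shift between neighbouring elements at the edge of the sector (degrees)."""
    return 360.0 * d_over_lambda0 * math.sin(math.radians(theta_deg))


if __name__ == "__main__":
    d_over_lambda0 = 0.5  # typical element spacing stated in the text
    for bandwidth in (0.20, 0.10):
        ratio = line_length_over_spacing(25.0, bandwidth)
        print(f"relative bandwidth {bandwidth:.0%}: "
              f"l = {ratio:.2f} d = {ratio * d_over_lambda0:.2f} lambda0")
    print(f"phase increment: +/-{phase_increment_deg(25.0):.1f} deg")
```

under the same assumptions, the call `line_length_over_spacing(16.0, 0.05)` gives the line length for a 32° sector in 5% relative bandwidth, which is the case discussed for the shifter in [23].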
from this, we can see that the frequency sensitivity of the phased array is directly proportional to the equivalent length of the phase shifter.

2.3. phase shifter performances

a transmission line, although simple, typically has a very slow phase contribution with frequency change. for a narrow bandwidth and a large phase shift, it has to have substantial length. long transmission lines can have significant losses and give rise to spurious radiation. if placed in the same plane as the radiating elements, interaction might occur through coupling, and severe degradation of the radiation pattern could result. for these reasons, other structures that are better suited for the specific purpose are often employed as the phase shifter. fig. 3a shows a phase shifter based on the metamaterial left-handed cell consisting of a pair of srrs (split-ring resonators) in balanced microstrip technology, where one metal layer is on the top and the second, identical one is on the bottom side of the dielectric. in microstrip, it would be a single srr cell coupled with a transmission line with a via in the center in order to provide pass-band characteristics. such a shifter is used in [23], where it enabled a scanning sector of 32° in 5% of the relative bandwidth. if we apply that as an angle θ = ±16° in (1), we can see that the phase shift would be around ±50 degrees and the required l would be around 5.5λ0. such a long line would take significant space and would require special care in order to minimize its impact on the radiating elements. the substrate used in [23] is rogers 4003c with a dielectric constant of 3.55, a height of 1 mm and a loss tangent of 0.0027. the surface roughness of rogers 4003c is 2.8 microns.

fig. 3 phase shifters based on the left-handed unit cell: (a) srrs coupled with meander line, (b) two pairs of srrs, (c) s-parameters for (a), (d) s-parameters for (b), (e) equivalent circuit of the microstrip line loaded with srrs and grounded with via.

fig. 3b shows a shifter based on the four-srr left-handed cell [24]. two pairs of srrs in balanced microstrip technology are coupled with a transmission line in a similar manner to the previous one. the obtained characteristics are a scanning sector of 30° for 2.5% of the relative bandwidth, which requires a phase shift of ±47°; the required l would be 10.35λ0. in [24] rogers 5880 is used, with a dielectric constant of 2.17, a height of 0.508 mm and a loss tangent of 0.001. the surface roughness of rogers 5880 is 0.3 microns, significantly smaller than that of rogers 4003c, which means that at the same frequency the losses in metal would be considerably larger for rogers 4003c. in both cases the impedance of the transmission line is 100 Ω, and the losses in the transmission line are 0.058 db/cm for rogers 4003c at 6 ghz [23] and 0.035 db/cm for rogers 5880 at 10 ghz [24]. from these two examples, we can see a significant advantage in the application of different phase shifter structures for enhancing the frequency scanning characteristics of the antenna array. figs. 3c and 3d give the s-parameters of the corresponding shifters. fig. 3e shows the equivalent circuit of the shifters, which can be derived from [25]. the characteristics show that these shifters behave as band-pass filters; hence, by controlling their zeros and poles, the desired characteristics can be obtained.

2.4. linear arrays with identical rectangular dipoles

a. scanning antenna array at 6 ghz

the previously discussed shifters are used in the antenna array design. the radiating elements are simple identical rectangular dipoles. one half of the dipole is on the top layer (brown) and the other is on the bottom layer (yellow), fig. 4c. the structure is designed in balanced microstrip technology, so in order to connect it to a standard sma connector, a transition from the balanced to the unbalanced line (balun) is necessary. this is achieved via the triangular balun. the shifter from fig. 3a is used in the antenna design in [23]. from fig. 4a we can see that the antenna array achieves a scanning sector from 45° to 77°, a frequency sensitivity of 10.67°/100 mhz and a gain from 12.4 to 13.73 dbi. the dimensions of the rectangular dipoles are calculated to be resonant at a specific frequency with a specific resistance value (z = 400 Ω + 0j). the position of the resonance is determined by the length of the dipole, and the value of the resistance is regulated by the dipole width. since there are two variables and two goals, it is more tuning than optimization. for a true optimization it is necessary to have a certain degree of freedom, that is, to have more variables than goals, which is the case for the pentagonal dipoles.

b. scanning antenna array at 10 ghz

the shifter shown in fig. 3b is implemented at a higher frequency of 10 ghz. the produced prototype is shown in fig. 4e. the array is placed above the reflector plane at the distance d = 7.5 mm. the dipoles are designed to have an impedance around 400 Ω at 10 ghz, with the distance between the radiating dipoles of 0.5λ0, that is 15 mm at 10 ghz. in fig. 4g we can see an offset between the measured and simulated s11 parameter, because the sma connector is not precisely modeled and the interconnection between the balun structure and the connector can produce a discrepancy. nevertheless, the measured s11 parameter exhibits good matching in the working bandwidth from 10 to 10.3 ghz. from fig. 4b we can again see a slight offset between the measured and simulated radiation characteristics due to manufacturing imperfections. the measured characteristics show scanning from 105° to 130°, gain variation from 12.1 to 12.9 dbi and a frequency sensitivity of 8.33°/100 mhz. as can be seen, these two shifters are designed to produce frequency scanning at different angles and scan rates, but both antenna arrays in these configurations display very high slls since identical radiating elements are used in the array. in the first case, the slls are from -10 db to -7.5 db relative to the main beam, while in the second case their measured values are from -11.5 to -9 db relative to the main beam. high slls are usually the biggest issue with scanning antennas.

fig. 4 comparison of the antenna arrays with different phase shifters operating at 6 ghz and 10 ghz: (a) simulated radiation pattern for the antenna array with phase shifter shown in fig. 3a, (b) simulated and measured radiation pattern for the antenna array with phase shifter shown in fig. 3b at the central and edge frequencies, (c) model of the antenna array with shifter shown in fig. 3a, (d) model of the antenna array with shifter shown in fig. 3b, (e) antenna prototype with dimensions 146.2 mm x 35.75 mm, (f) measured radiation pattern, (g) measured and simulated s-parameters of the array from fig. 4e.

2.5. linear array with pentagonal dipoles

in order to obtain a higher side lobe suppression in the antenna array, the appropriate power distribution has to be implemented. this problem is particularly challenging in the case of the linear scanning array with series feeding.
the typical configuration of the traveling wave antenna array employs radiating elements of different impedances, so when the wave travels through the array each radiating element takes a portion of the available power, which depends on its impedance value. at the end of the array, there is a termination that prevents the remaining power from returning to the array and causing an additional scanning beam in the opposite direction relative to broadside. shifters can have significant losses, which can considerably degrade the radiation characteristics. their influence on the power distribution must be seriously considered. another important issue is the fact that frequency scanning means that the antenna operates in a certain frequency band. the elements of the antenna array are frequency dependent and behave differently depending on the observed frequency. the power distribution is mostly implemented based on the ratio of the impedances of the transmission line and the radiating elements. a more appropriate approach would be to observe the s-parameters of the multi-port network and thus directly observe the power distribution in the frequency range. in order to preserve the power distribution in the frequency range, all components should have a slow impedance change, which would result in stable s-parameters. this can be accomplished using pentagonal printed dipoles as radiating elements and the shifter from fig. 3.
an approximation of the impedance values for a specific distribution can be calculated by:

\[ z_j = \frac{z_{norm}}{\left(w_j(n,k)\right)^{2}\, 10^{a_j/10}} \tag{2} \]

where zj represents the impedance in ohms of the jth element of the array, with j = 1..n and n the number of elements of the array; aj represents the accumulated losses in the array at the jth element, which mostly originate from the phase shifters and radiation losses; znorm is a constant impedance whose value depends on the range between the minimum and maximum impedance available for the radiating elements; wj(n,k) is the weighting coefficient of the specific distribution for the case of n elements and for k as the level of side lobe suppression in db. the implementation of this approach in an array with right-handed shifters is shown in [26]. for the dolph-chebyshev distribution with n = 8, k = 21 and aj = 1.5(j-1), the impedance values are given in table 1.

table 1 impedance values (in ohms) for the array with the pentagonal dipoles.

z1 | z2 | z3 | z4 | z5 | z6 | z7 | z8
1570.8 | 750.2 | 292.3 | 156.1 | 110.5 | 103.7 | 133.4 | 140

fig. 5 (a) measured and simulated s-parameters of the linear array with pentagonal dipoles, (b) simulated radiation pattern, (c) measured radiation pattern, (d) model of the array, (e) manufactured prototype with dimensions: 140 mm x 27 mm.

measured and simulated s-parameters are shown in fig. 5a. the measured s11 characteristic is better than the simulated one due to the additional losses. simulated and measured radiation characteristics are shown in figs. 5b and 5c, respectively. the power distribution used in the array is dolph-chebyshev, with the goal of achieving an sll suppression of 20 db with respect to the level of the main beam. in fig. 5b we can see that the goal is achieved and that in the whole range the slls are below the desired level.
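the relation between the weighting coefficients, the accumulated losses and the impedances in table 1 can be explored with a short script. this is a sketch under stated assumptions: it reads (2) as zj = znorm / (wj² · 10^(aj/10)) with aj = 1.5(j-1) db, and it inverts that relation to recover the weights implied by the tabulated impedances, normalized to the largest weight. the function names are hypothetical, and the weights are derived here from table 1 rather than taken from the paper.

```python
import math

# Impedance values from table 1 (ohms), elements j = 1..8.
Z_TABLE = [1570.8, 750.2, 292.3, 156.1, 110.5, 103.7, 133.4, 140.0]


def accumulated_loss_db(j, step_db=1.5):
    """Accumulated loss a_j = 1.5 (j - 1) dB at the jth element."""
    return step_db * (j - 1)


def implied_weights(impedances, step_db=1.5):
    """Invert z_j = z_norm / (w_j^2 * 10^(a_j/10)); normalize the largest weight to 1."""
    raw = []
    for j, z in enumerate(impedances, start=1):
        loss = 10 ** (accumulated_loss_db(j, step_db) / 10)
        raw.append(1.0 / math.sqrt(z * loss))
    peak = max(raw)
    return [r / peak for r in raw]


def impedances_from_weights(weights, z_norm, step_db=1.5):
    """Forward relation: impedance of each element from its weight and accumulated loss."""
    return [
        z_norm / (w ** 2 * 10 ** (accumulated_loss_db(j, step_db) / 10))
        for j, w in enumerate(weights, start=1)
    ]


if __name__ == "__main__":
    for j, w in enumerate(implied_weights(Z_TABLE), start=1):
        print(f"w{j} = {w:.3f}")
```

the recovered weights come out symmetric about the center of the array, as expected for a dolph-chebyshev taper, which supports this reading of (2): the loss term lowers the impedance of the later elements so that they still draw their share of the remaining power.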
the measured results show some degradation due to manufacturing errors and a slightly lower gain due to losses. the model and the manufactured prototype are shown in figs. 5d and 5e, respectively. a detailed comparison of the characteristics of the manufactured antennas is shown in table 2. from it, we can see that by using pentagonal dipoles with different impedances, the main problem with printed scanning arrays can be resolved. a great improvement in slls is achieved. the trade-off for the sll improvement is a wider 3 db beamwidth and a somewhat lower antenna gain.

table 2 comparison of the measured characteristics of the antenna arrays with identical rectangular and different pentagonal dipoles.

dipoles | rectangular | pentagonal
bandwidth | 10.00 ghz – 10.30 ghz | 9.98 ghz – 10.22 ghz
scanning angle | 100° – 125° (25°) | 100° – 122° (22°)
frequency sensitivity | 83.333°/ghz | 91.666°/ghz
3 db beamwidth | 14.26° – 22.6° | 21.2° – 29.2°
gain | 12.1 db – 12.9 db | 10.4 db – 11.7 db
sll | better than 7.5 db | better than 17 db

measurements were performed using an anritsu me7838a vector network analyzer [27] in a setup which consists of the calibration kit, two identical standard horn antennas, the device under test (dut), cables, a positioner with a stepper motor and pc control via an arduino mega 2560 board [28]. software communication with the arduino is done with matlab through the matlab support package for arduino hardware [29]. at the same time, software communication with the anritsu me7838a is done with the instrument control toolbox through lan using tcp/ip [30]. one horn antenna is used as the transmitting antenna during the whole measurement procedure, and the second one is used only at the beginning to determine the relative gain levels at the position of the dut. after the dut is placed on the positioner with the stepper motor, the whole process is done automatically. the accuracy should be better than 0.5 db.

3. conclusion

in this paper, we have shown the use of phase shifters based on metamaterials.
The shifters are analyzed and their performance is discussed. Their use in frequency scanning arrays is shown. Two prototypes are produced and presented. It is demonstrated that a combination of pentagonal dipoles with different impedances and metamaterial-based shifters can provide frequency scanning and SLL control, which makes it a good choice for a cheap and highly accurate frequency scanning solution.

Acknowledgement: This work was financed by the Serbian Ministry for Education, Science and Technological Development through the projects TR-32024 and III-45016. The authors would like to thank the Institute IMTEL, Belgrade, for the prototype manufacturing and WIPL-D, Belgrade, for the use of software licenses.

References

[1] M. Miljić, A. Nešić, B. Milovanović, “An investigation of side lobe suppression in integrated printed antenna structures with 3D reflectors,” Facta Universitatis, Series: Electronics and Energetics, vol. 30, no. 3, pp. 391–402, September 2017.
[2] C. Balanis, Antenna Theory: Analysis and Design, 3rd ed. Hoboken, New Jersey, United States: John Wiley, 2005.
[3] T. A. Milligan, Modern Antenna Design, 2nd ed. Hoboken, New Jersey, United States: John Wiley, 2005.
[4] W. S. T. Rowe and R. B. Waterhouse, “Edge-fed patch antennas with reduced spurious radiation,” IEEE Trans. Antennas Propag., vol. 53, no. 5, pp. 1785–1790, May 2005.
[5] J. L. Volakis, Antenna Engineering Handbook, 4th ed. New York, United States: McGraw-Hill Education, 2007.
[6] D. F. Sievenpiper, L. Zhang, R. F. Jimenez Broas, N. G. Alexópolous, and E. Yablonovitch, “High impedance electromagnetic surfaces with a forbidden frequency band,” IEEE Trans. Microwave Theory Tech., vol. 47, no. 11, pp. 2059–2074, Nov. 1999.
[7] C. S. Lee, V. Nalbandian, and F. Schwering, “Surface-mode suppression in a thick microstrip antenna by parasitic elements,” Microwave and Opt. Technol. Lett., vol. 8, pp. 145–147, Feb. 1995.
[8] N. G. Alexopoulos and D. R. Jackson, “Fundamental superstrate (cover) effects on printed circuit antennas,” IEEE Trans. Antennas Propag., vol. 32, no. 8, pp. 807–816, Aug. 1984.
[9] D. R. Jackson, J. T. Williams, A. K. Bhattacharyya, R. L. Smith, S. J. Buchheit, and S. A. Long, “Microstrip patch designs that do not excite surface waves,” IEEE Trans. Antennas Propag., vol. 41, no. 8, pp. 1026–1037, Aug. 1993.
[10] V. R. Komanduri, D. R. Jackson, J. T. Williams, and A. R. Mehrotra, “A general method for designing reduced surface wave microstrip antennas,” IEEE Trans. Antennas Propag., vol. 61, no. 6, pp. 2887–2894, March 2013.
[11] R. J. Mailloux, Phased Array Antenna Handbook, 2nd ed. London: Artech House Antennas and Propagation Library, 2005.
[12] M. Winfried, M. Wetzel and M. Menzel, “A novel direct imaging radar sensor with frequency scanned antenna,” in Proceedings of the IEEE MTT-S Int. Microwave Symp. Dig., 2003, vol. 3, pp. 1941–1944.
[13] Y. Alvarez Lopez, C. Garcia, C. Vazquez, S. Ver-Hoeye, and F. Las-Heras, “Frequency scanning based radar system,” Prog. Electromagn. Res., vol. 132, pp. 275–296, 2012.
[14] Y. Alvarez, R. Camblor, C. Garcia, et al., “Submillimeter-wave frequency scanning system for imaging applications,” IEEE Trans. Antennas Propag., vol. 61, no. 11, pp. 5689–5696, Nov. 2013.
[15] M. A. Tehrani, J. J. Laurin, and Y. Savaria, “Multiple targets direction-of-arrival estimation in frequency scanning array antennas,” IET Radar, Sonar and Navigation, vol. 10, no. 3, pp. 624–631, March 2016.
[16] S. Vashist, M. K. Soni and P. K. Singhal, “A review on the development of Rotman lens antenna,” Chinese Journal of Engineering, vol. 2014, article ID 385385, 9 pages, 2014.
[17] W. Zongxin, X. Bo and Y. Fei, “A multibeam antenna array based on printed Rotman lens,” International Journal of Antennas and Propagation, vol. 2013, article ID 179327, 6 pages, 2013.
[18] J. Remez and R. Carmon, “Compact designs of waveguide Butler matrices,” IEEE Antennas Wireless Propag. Lett., vol. 5, pp. 27–31, March 2006.
[19] M. Koubeissi, L. Freytag, C. Decroze, and T. Monediere, “Design of a cosecant-squared pattern antenna fed by a new Butler matrix topology for base station at 42 GHz,” IEEE Antennas Wireless Propag. Lett., vol. 7, pp. 354–357, 2008.
[20] M. Ilić and N. Bošković, “Poređenje karakteristika štampanih bow-tie dipola sa dipolima petougaonog oblika,” in Proceedings of the ETRAN Conference 2012, Zlatibor, 11-14 June 2012.
[21] A. Nešić, Z. Mičić, S. Jovanović, I. Radnović, D. Nešić, “Millimeter-wave printed antenna arrays for covering various sector widths,” IEEE Antennas and Propagation Magazine, vol. 49, no. 1, pp. 113–118, February 2007.
[22] https://www.itu.int/
[23] N. Bošković, B. Jokanović and A. Nesić, “Frekvencijski skeniran antenski niz sa SRR faznim šifterima,” in Proceedings of the ETRAN Conference 2013, Zlatibor, 3-6 June 2013.
[24] N. Boskovic, B. Jokanovic and A. Nesic, “Frequency scanning antenna array with enhanced side lobe suppression,” Metamaterials 2014, Copenhagen, Denmark, 25-30 August 2014.
[25] R. Bojanic, V. Milosevic, B. Jokanovic, F. Medina-Mena and F. Mesa, “Enhanced modelling of split-ring resonators couplings in printed circuits,” IEEE Trans. Microw. Theory Tech., vol. 62, no. 8, pp. 1605–1615, 2014.
[26] N. Boskovic, B. Jokanovic, and M. Radovanovic, “Printed frequency scanning antenna arrays with enhanced frequency sensitivity and sidelobe suppression,” IEEE Trans. Antennas Propag., vol. 65, no. 4, pp. 1757–1764, Apr. 2017.
[27] “Installation Guide VectorStar ME7838 Series.” https://dl.cdn-anritsu.com/en-us/test-measurement/files/manuals/installation-guide/10410-00293f.pdf
[28] “Arduino Mega 2560 Rev3.” https://store.arduino.cc/arduino-mega-2560-rev3
[29] “MATLAB Support Package for Arduino Hardware Documentation.” https://www.mathworks.com/help/supportpkg/arduinoio/index.html
[30] “Instrument Control Toolbox.” https://www.mathworks.com/products/instrument.html

Facta Universitatis, Series: Electronics and Energetics, Vol. 30, No 1, March 2017, pp. 49-66
DOI: 10.2298/fuee1701049a

Anas N. Al-Rabadi1,2

Received November 28, 2015; received in revised form April 13, 2016
Corresponding author: Anas N. Al-Rabadi, Electrical Engineering Department, Philadelphia University, Jordan & Computer Engineering Department, The University of Jordan, Amman-Jordan (e-mail: alrabadi@yahoo.com)

Facta Universitatis, Series: Electronics and Energetics, Vol. 28, No 4, December 2015, pp. 507-525
DOI: 10.2298/fuee1504507s

Horizontal Current Bipolar Transistor (HCBT) – A Low-Cost, High-Performance Flexible BiCMOS Technology for RF Communication Applications

Tomislav Suligoj1, Marko Koričić1, Josip Žilak1, Hidenori Mochizuki2, So-ichi Morita2, Katsumi Shinomura2, Hisaya Imai2

1University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro- and Nano-Electronics Laboratory, Croatia
2Asahi Kasei Microdevices Co. 5-4960, Nobeoka, Miyazaki, 882-0031, Japan

Abstract. In an overview of Horizontal Current Bipolar Transistor (HCBT) technology, the state-of-the-art integrated silicon bipolar transistors are described, which exhibit fT and fmax of 51 GHz and 61 GHz and an fT·BVCEO product of 173 GHzV, placing them among the highest-performance implanted-base silicon bipolar transistors. HCBT is integrated with CMOS in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps.
Due to its specific structure, the charge sharing effect can be employed to increase BVCEO without sacrificing fT and fmax. Moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage HCBTs with breakdown voltages up to 36 V integrated in the same process flow with high-speed devices, i.e. at zero additional cost. A double-balanced active mixer circuit is designed and fabricated in HCBT technology. A maximum IIP3 of 17.7 dBm at a mixer current of 9.2 mA and a conversion gain of -5 dB are achieved.

Key words: BiCMOS technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. Introduction

In the highly competitive wireless communication markets, RF circuits and systems are fabricated in technologies that are very cost-sensitive. In order to minimize the fabrication costs, sub-10 GHz applications can be processed by using high-volume silicon technologies. It has been identified that the optimum solution might

Received March 9, 2015
Corresponding author: Tomislav Suligoj, University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro- and Nano-Electronics Laboratory, Croatia (e-mail: tom@zemris.fer.hr)

An Extended Green-Sasao Hierarchy of Canonical Ternary Galois Forms and Universal Logic Modules

1Electrical Engineering Department, Philadelphia University, Jordan & 2Computer Engineering Department, The University of Jordan, Amman-Jordan

Abstract. A new extended Green-Sasao hierarchy of families and forms with a new subfamily for multiple-valued Reed-Muller logic is introduced. Recently, two families of binary canonical Reed-Muller forms, called Inclusive Forms (IFs) and Generalized Inclusive Forms (GIFs), have been proposed, where the second family was the first to include all minimum Exclusive Sum-Of-Products (ESOPs).
In this paper, we propose, analogously to the binary case, two general families of canonical ternary Reed-Muller forms, called Ternary Inclusive Forms (TIFs) and their generalization, Ternary Generalized Inclusive Forms (TGIFs), where the second family includes minimum Galois Field Sum-Of-Products (GFSOPs) over the ternary Galois field GF(3). One of the basic motivations of this work is the application of these TIFs and TGIFs to finding the minimum GFSOP for multiple-valued input-output functions within logic synthesis, where a GFSOP minimizer based on IF polarity can be used to minimize the multiple-valued GFSOP expression for any given function. The realization of the presented Shannon-Davio (S/D) trees using Universal Logic Modules (ULMs) is also introduced, where ULMs are complete systems that can implement all possible logic functions utilizing the corresponding S/D expansions of multiple-valued Shannon and Davio spectral transforms.

Key words: canonical forms, Galois field forms, Green-Sasao hierarchy, inclusive forms, multiple-valued logic, Shannon-Davio trees, ternary logic, universal logic modules.
Received: July 15, 2015

1 Normal Galois Forms

Reed-Muller-like spectral transforms [1-18] have found a variety of useful applications in minimizing Exclusive Sum-Of-Products (ESOP) and Galois Field SOP (GFSOP) expressions, the creation of new forms, binary and spectral decision diagrams, and regular structures, besides the well-known uses in digital communications, digital signal and image processing, and fault detection and testing [1-9, 12-14, 16, 17, 19, 20, 21].

Table 1: Number of product terms needed to realize some arithmetic functions using various expressions.

Function  PPRM  FPRM  GRM  ESOP  SOP
adr4      34    34    34   31    75
log8      253   193   105  96    123
nrm4      216   185   96   69    120
rdm8      56    56    31   31    76
rot8      225   118   51   35    57
sym9      210   173   126  51    84
wgt8      107   107   107  58    255
The method of generating the new families of multiple-valued Shannon and Davio spectral transforms is based on the fundamental multiple-valued Shannon and Davio expansions. Dyadic families of discrete transforms (Reed-Muller and Green-Sasao hierarchy, Walsh, arithmetic, adding and Haar transforms) and their generalizations to multiple-valued transforms have also found important applications in digital system design and optimization [1, 6, 19, 7-18, 20-31]. Normal canonical forms play an important role in the synthesis of logic circuits, which includes synthesis, testing and optimization [1, 9, 12-15, 17, 21, 22, 26, 27, 31, 32]. One can observe that by going, for example, from the Positive Polarity Reed-Muller (PPRM) form to the Generalized Reed-Muller (GRM) form, fewer constraints are imposed on the canonical forms due to the enlarged set of polarities that one can choose from. This greater freedom in the polarity of the canonical expansions makes it possible to obtain Exclusive-Sum-Of-Product (ESOP) expressions with fewer terms and literals; consequently, expressing Boolean functions in ESOP form produces, on average, smaller expressions than, for example, Sum-Of-Product (SOP) expressions. Table 1 illustrates these observations [1].

The main algebraic structure used in this work for developing the canonical normal forms is the Galois field (GF), which is a fundamental algebraic structure in the theory of algebra [1, 12, 13, 17, 22, 29, 33, 34]. The importance of GF for logic synthesis results from the fact that every finite field is isomorphic to a Galois field. In general, the attractive properties of GF-based circuits, such as their high testability, are mainly due to the fact that the GF operators exhibit the cyclic group property, also known as the Latin square property. In binary, for example, the GF(2) addition gate, the EXOR, has the cyclic group property.
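The Latin square property mentioned above can be checked mechanically. The following is a minimal sketch, assuming the field is a prime field GF(p) so that addition and multiplication reduce to arithmetic modulo p; the helper names `gf_tables` and `is_latin_square` are illustrative and do not come from the paper.

```python
def gf_tables(p):
    """Addition and multiplication tables of the prime field GF(p) (arithmetic mod p)."""
    elems = range(p)
    add = [[(a + b) % p for b in elems] for a in elems]
    mul = [[(a * b) % p for b in elems] for a in elems]
    return add, mul


def is_latin_square(table):
    """True if every row and every column contains each symbol exactly once."""
    n = len(table)
    symbols = set(table[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in table)
    cols_ok = all(set(col) == symbols for col in zip(*table))
    return len(symbols) == n and rows_ok and cols_ok


add2, _ = gf_tables(2)        # GF(2) addition is the EXOR gate
add3, mul3 = gf_tables(3)     # ternary case

# Multiplication table with the zero row and zero column removed.
mul3_nonzero = [row[1:] for row in mul3[1:]]

print(add3)
print(mul3)
print(is_latin_square(add2), is_latin_square(add3), is_latin_square(mul3_nonzero))
```

The full GF(3) multiplication table is not a Latin square because of its zero row and column, which is why the zero elements are dropped before the check.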
The cyclic group property can be explained, for example, using the three-valued (ternary) GF operators shown in Figures 1(a) and 1(b). Note that in any row and column of the addition table in Figure 1(a) the elements are all different (the cyclic property), and that the elements have a different order in each row and column. Another cyclic group can be observed in the multiplication table: if the zero elements are removed from the multiplication table in Figure 1(b), then the remaining elements form a cyclic group.

Reed-Muller normal forms have been classified using the Green-Sasao hierarchy [1, 10, 12, 13, 17], where the Green-Sasao hierarchy of families of canonical forms and corresponding decision diagrams is based on three generic expansions: the Shannon, positive Davio and negative Davio expansions.

(a)  + | 0 1 2        (b)  * | 0 1 2
     0 | 0 1 2             0 | 0 0 0
     1 | 1 2 0             1 | 0 1 2
     2 | 2 0 1             2 | 0 2 1

Figure 1: Third radix Galois field addition and multiplication tables: (a) addition and (b) multiplication.

The two-valued Shannon, positive Davio and negative Davio expansions are given as follows, respectively:

$$f(x_1,x_2,\dots,x_n) = \bar{x}_1 \cdot f_0(x_1,x_2,\dots,x_n) \oplus x_1 \cdot f_1(x_1,x_2,\dots,x_n) = \begin{bmatrix} \bar{x}_1 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \qquad (1)$$

$$f(x_1,x_2,\dots,x_n) = 1 \cdot f_0(x_1,x_2,\dots,x_n) \oplus x_1 \cdot f_2(x_1,x_2,\dots,x_n) = \begin{bmatrix} 1 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \qquad (2)$$

$$f(x_1,x_2,\dots,x_n) = 1 \cdot f_1(x_1,x_2,\dots,x_n) \oplus \bar{x}_1 \cdot f_2(x_1,x_2,\dots,x_n) = \begin{bmatrix} 1 & \bar{x}_1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \qquad (3)$$

where $f_0(x_1,x_2,\dots,x_n) = f(0,x_2,\dots,x_n) = f_0$ is the negative cofactor of variable $x_1$, $f_1(x_1,x_2,\dots,x_n) = f(1,x_2,\dots,x_n) = f_1$ is the positive cofactor of variable $x_1$, and $f_2(x_1,x_2,\dots,x_n) = f(0,x_2,\dots,x_n) \oplus f(1,x_2,\dots,x_n) = f_0 \oplus f_1$. In addition, an arbitrary $n$-variable function $f(x_1,\dots,x_n)$ can be represented using the PPRM expansion as [2, 31]:

$$f(x_1,x_2,\dots,x_n) = a_0 \oplus a_1 x_1 \oplus \dots \oplus a_n x_n \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus \dots \oplus a_{n-1,n} x_{n-1} x_n \oplus \dots \oplus a_{12 \dots n} x_1 x_2 \dots x_n. \qquad (4)$$

For each function $f$, the coefficients $a_i$ in Equation (4) are determined uniquely, so PPRM is a canonical form. If we instead use, for each variable in Equation (4), either only the positive literal or only the negative literal, we obtain the Fixed Polarity Reed-Muller (FPRM) form. A good selection of different permutations of Shannon and Davio expansions (as with other expansions, such as the Walsh and arithmetic expansions) as internal nodes in decision trees (DTs) and decision diagrams (DDs) results in DTs and DDs that represent the corresponding logic functions with smaller sizes, in terms of both the total number of hierarchical levels used and the total number of internal nodes needed.
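The three two-valued expansions in Equations (1) to (3) can be verified exhaustively for small functions. The sketch below is an illustration only: it represents Boolean functions as Python callables over 0/1 values, uses `^` for the GF(2) sum and `&` for the product, and the helper names are hypothetical.

```python
from itertools import product


def cofactors(f, rest):
    """Return (f0, f1, f2) of f with respect to its first variable x1."""
    f0 = f(0, *rest)           # negative cofactor
    f1 = f(1, *rest)           # positive cofactor
    return f0, f1, f0 ^ f1     # f2 = f0 XOR f1


def shannon(f, x1, rest):
    f0, f1, _ = cofactors(f, rest)
    return ((1 ^ x1) & f0) ^ (x1 & f1)      # Equation (1)


def positive_davio(f, x1, rest):
    f0, _, f2 = cofactors(f, rest)
    return f0 ^ (x1 & f2)                   # Equation (2)


def negative_davio(f, x1, rest):
    _, f1, f2 = cofactors(f, rest)
    return f1 ^ ((1 ^ x1) & f2)             # Equation (3)


def all_functions(n):
    """Yield every n-variable Boolean function as a callable built from its truth table."""
    points = list(product((0, 1), repeat=n))
    for values in product((0, 1), repeat=len(points)):
        table = dict(zip(points, values))
        yield lambda *xs, t=table: t[xs]


def expansions_hold(n):
    """Check that all three expansions reproduce every n-variable function."""
    for f in all_functions(n):
        for x in product((0, 1), repeat=n):
            x1, rest = x[0], x[1:]
            expected = f(*x)
            if not (shannon(f, x1, rest) == expected
                    and positive_davio(f, x1, rest) == expected
                    and negative_davio(f, x1, rest) == expected):
                return False
    return True


print(expansions_hold(3))
```

Enumerating all truth tables grows doubly exponentially with the number of variables, so this brute-force check is only practical for very small n.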
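The uniqueness of the PPRM coefficients in Equation (4) can be illustrated by computing them from a truth table. The sketch below uses the standard butterfly form of the Reed-Muller (Moebius) transform over GF(2), which is one common way to obtain these coefficients and is not necessarily the procedure used in the paper. The bit-index convention and the function names are assumptions made for this example.

```python
def pprm_coefficients(truth_table):
    """Reed-Muller (Moebius) transform over GF(2).

    truth_table[m] is the value of f at the input whose bit i (counting from
    the least significant bit) is the value of variable x_{i+1}. In the result
    c, c[m] = 1 means that the product of the variables selected by the set
    bits of m appears in the PPRM; c[0] is the constant term a0.
    """
    c = list(truth_table)
    size = len(c)
    step = 1
    while step < size:
        for m in range(size):
            if m & step:
                c[m] ^= c[m ^ step]
        step <<= 1
    return c


def evaluate_pprm(coeffs, m_input):
    """Evaluate the PPRM at input m_input by XOR-ing every product term equal to 1."""
    value = 0
    for m, a in enumerate(coeffs):
        if a and (m & m_input) == m:    # all variables of this term are 1
            value ^= 1
    return value


# Example: f(x1, x2) = x1 OR x2, truth table indexed by m = x1 + 2*x2.
table_or = [0, 1, 1, 1]
coeffs = pprm_coefficients(table_or)
print(coeffs)   # coefficients of 1, x1, x2, x1*x2
```

For the OR function this gives x1 ⊕ x2 ⊕ x1x2. Since the transform is invertible, distinct functions always receive distinct coefficient vectors, which is the canonicity property stated in the text.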
for example, if we use either only the positive literal or only the negative literal for each variable in equation (4) we obtain the fixed polarity reed-muller (fprm) form. the good selection of different permutations using shannon and davio expansions like other expansions such as walsh and arithmetic expansions as internal nodes in decision trees (dts) and diagrams (dds) will result in dts and dds that represent the corresponding logic functions with smaller sizes in terms of the total number of hierarchical levels used, and the total number of internal nodes needed. in general, a literal can be defined as any function of a single variable. basis functions in the general case of multiple-valued expansions are constructed using these 103 an extended green sasao hierarchy of canonical ternary galois forms ... + 0 1 2 0 0 1 2 1 1 2 0 2 2 0 1 ∗ 0 1 2 0 0 0 0 1 0 1 2 2 0 2 1 (a) (b) figure 1: third radix galois field addition and multiplication tables: (a) addition and (b) multiplication. itive davio and negative davio expansions. the two-valued shannon, positive davio and negative davio expansions are given as follows, respectively: f (x1,x2,...,xn) = x̄1 · f0(x1,x2,...,xn)⊕ x1 · f1(x1,x2,...,xn) = [ x̄1 x1 ] [ 1 0 0 1 ][ f0 f1 ] , (1) f (x1,x2,...,xn) = 1 · f0(x1,x2,...,xn)⊕ x1 · f2(x1,x2,...,xn) = [ 1 x1 ] [ 1 0 1 1 ][ f0 f1 ] , (2) f (x1,x2,...,xn) = 1 · f1(x1,x2,...,xn)⊕ x̄1 · f2(x1,x2,...,xn) = [ 1 x̄1 ] [ 0 1 1 1 ][ f0 f1 ] , (3) where f0(x1,x2,...,xn) = f (0,x2,...,xn) = f0 is the negative cofactor of variable x1, f1(x1,x2,...,xn) = f (1,x2, ...,xn) = f1 is the positive cofactor of variable x1, and f2(x1,x2,...,xn) = f (0,x2,...,xn) ⊕ f (1,x2,...,xn) = f0 ⊕ f1. in addition, an arbitrary n-variable function f (x1,...,xn) can be represented using pprm expansion as [2, 31]: f (x1,x2,...,xn) =a0 ⊕ a1x1 ⊕ ...⊕ anxn ⊕ a12x1x2 ⊕ a13x1x3 ⊕ an−1,nxn−1xn⊕ ...⊕ a12...nx1x2 ... xn. 
Galois field sum-of-products (GFSOP) expansions can be performed using a variety of literals. For example, one can use the k-reduced Post literal (k-RPL) to produce a k-RPL GFSOP, the generalized (Post) literal (GL) to produce a GL GFSOP, and the universal literal (UL) to produce a UL GFSOP.

Example 1. Figure 2 demonstrates several literal types, where one proceeds from the simplest literal in Figure 2(a) (i.e., RPL) to the most complex universal literal in Figure 2(c). For the RPL in Figure 2(a), a value k is produced by the literal when the value of the variable is equal to a specific state; in this particular example, the 1-RPL generates the value k = 1 when the variable x is in the state one. The GL in Figure 2(b) produces a value of radix for a set of distinct states. One notes that, in contrast to the other literals, the UL in Figure 2(c) can take any value of the logic system at distinct states, and thus the UL has the highest complexity among the different types of literals.

Figure 2: An illustrative example of the different types of literals over an arbitrary five-radix logic: (a) 1-reduced Post literal (1-RPL) $^{1}x$, (b) generalized (Post) literal (GL) $^{\{1,3\}}x$, and (c) universal literal (UL) $^{\langle 2,0,4,3,1\rangle}x$.

Since the k-RPL GFSOP is simpler to implement than the GL or UL GFSOP, we will perform all GFSOP expansions using the 1-RPL GFSOP. Let us define the 1-RPL [1, 17] as:

\[
{}^{i}x = \begin{cases} 1 & \text{if } x = i, \\ 0 & \text{otherwise.} \end{cases} \tag{5}
\]

For example, $\{{}^{0}x, {}^{1}x, {}^{2}x\}$ are the zero, first, and second polarities of the 1-RPL, respectively. Also, let us define the ternary shifts of the variable x, $\{x, x', x''\}$, as the zero, first and second shifts of x, respectively, i.e., $x = x + 0$, $x' = x + 1$ and $x'' = x + 2$, where x can take any value in the set {0,1,2}. We chose to represent the 1-reduced Post literals in terms of shifts and powers, among others, because powers of shifted variables are easy to implement in hardware, as will be seen in Section 3 within universal logic modules (ULMs) for the production of RPL.
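To make the 1-RPL of equation (5) and the ternary shifts concrete, the sketch below tabulates both for every value of x. It assumes that GF(3) arithmetic is implemented as integer arithmetic modulo 3, which agrees with the tables in Figure 1; the function names `rpl` and `shift` are hypothetical and chosen here only for illustration.

```python
P = 3  # radix of GF(3)

def rpl(i, x):
    """1-reduced Post literal of polarity i, equation (5)."""
    return 1 if x == i else 0

def shift(x, k):
    """k-th ternary shift of x, i.e. x + k over GF(3)."""
    return (x + k) % P

for x in range(P):
    literals = [rpl(i, x) for i in range(P)]   # values of 0x, 1x, 2x
    shifts = [shift(x, k) for k in range(P)]   # values of x, x', x''
    print(x, literals, shifts)
```

For each x, exactly one of the three polarities evaluates to 1, which is the property the Shannon expansion relies on.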
The fundamental Shannon expansion over GF(3) for a ternary function with a single variable is shown in Theorem 1.

Theorem 1. The Shannon expansion over GF(3) for a function with a single variable is:

\[
f = {}^{0}x\, f_0 + {}^{1}x\, f_1 + {}^{2}x\, f_2, \tag{6}
\]

where $f_0$ is the cofactor of f with respect to variable x of value 0, $f_1$ is the cofactor of f with respect to variable x of value 1, and $f_2$ is the cofactor of f with respect to variable x of value 2.

Proof. From equation (5), if we substitute the values of the 1-RPL in equation (6), we obtain $\{x = 0 \Rightarrow f_{x=0} = f_0,\ x = 1 \Rightarrow f_{x=1} = f_1,\ x = 2 \Rightarrow f_{x=2} = f_2\}$, which are the cofactors of variable x of values {0,1,2}, respectively.

Example 2. Let $f(x_1,x_2) = x_1' x_2 + x_2'' x_1$. Then, using Figure 1, the ternary truth vector with the variable order $\{x_1,x_2\}$ is $\vec{F} = [0,2,1,1,2,0,2,2,2]^T$, where the entries are listed as $f(0,0), f(1,0), f(2,0), f(0,1), \ldots, f(2,2)$, i.e., with $x_1$ varying fastest. Using equation (6), we obtain the following GF(3) Shannon expansion:

\[
f(x_1,x_2) = {}^{0}x_1\,{}^{1}x_2 + 2\cdot{}^{0}x_1\,{}^{2}x_2 + 2\cdot{}^{1}x_1\,{}^{0}x_2 + 2\cdot{}^{1}x_1\,{}^{1}x_2 + 2\cdot{}^{1}x_1\,{}^{2}x_2 + {}^{2}x_1\,{}^{0}x_2 + 2\cdot{}^{2}x_1\,{}^{2}x_2.
\]
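The numbers in Example 2 can be reproduced with a few lines of Python. This is a sketch under the assumption that GF(3) operations are integer operations modulo 3 (as in Figure 1); the names `truth_vector`, `shannon_terms` and `evaluate` are introduced here for illustration only.

```python
P = 3

def f(x1, x2):
    """Example 2: f = x1' * x2 + x2'' * x1 over GF(3)."""
    return (((x1 + 1) % P) * x2 + ((x2 + 2) % P) * x1) % P

def rpl(i, x):
    """1-reduced Post literal, equation (5)."""
    return 1 if x == i else 0

def truth_vector():
    # x1 varies fastest, matching the vector quoted in Example 2.
    return [f(x1, x2) for x2 in range(P) for x1 in range(P)]

def shannon_terms():
    """Non-zero terms (c, i, j), each standing for c * (i-polarity of x1) * (j-polarity of x2)."""
    return [(f(i, j), i, j) for i in range(P) for j in range(P) if f(i, j)]

def evaluate(terms, x1, x2):
    """Evaluate a list of Shannon terms at the point (x1, x2)."""
    return sum(c * rpl(i, x1) * rpl(j, x2) for c, i, j in terms) % P

print(truth_vector())
print(shannon_terms())
```

Each tuple `(c, i, j)` returned by `shannon_terms` corresponds to one product term of the expansion, so the seven tuples can be compared one by one with the seven terms of the GF(3) Shannon expansion in Example 2.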
Using the addition and multiplication over GF(3) in Figure 1, the 1-RPL defined in equation (5) is related to the shifts of variables over GF(3) in terms of powers as follows:

\begin{align*}
{}^{0}x &= 2(x)^2 + 1, \tag{7} \\
{}^{0}x &= 2(x')^2 + 2(x'), \tag{8} \\
{}^{0}x &= 2(x'')^2 + x'', \tag{9} \\
{}^{1}x &= 2(x)^2 + 2(x), \tag{10} \\
{}^{1}x &= 2(x')^2 + x', \tag{11} \\
{}^{1}x &= 2(x'')^2 + 1, \tag{12} \\
{}^{2}x &= 2(x)^2 + x, \tag{13} \\
{}^{2}x &= 2(x')^2 + 1, \tag{14} \\
{}^{2}x &= 2(x'')^2 + 2(x''). \tag{15}
\end{align*}

After substituting equations (7)-(15) into equation (6) and minimizing the terms according to the GF operations in Figure 1, one obtains the following equations:

\begin{align*}
f &= 1 \cdot f_0 + x \cdot (2 f_1 + f_2) + 2(x)^2 (f_0 + f_1 + f_2), \tag{16} \\
f &= 1 \cdot f_2 + x' \cdot (2 f_0 + f_1) + 2(x')^2 (f_0 + f_1 + f_2), \tag{17} \\
f &= 1 \cdot f_1 + x'' \cdot (2 f_2 + f_0) + 2(x'')^2 (f_0 + f_1 + f_2). \tag{18}
\end{align*}

Equations (6) and (16)-(18) are the ternary fundamental Shannon and Davio expansions for a single variable, respectively.
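Identities (7)-(15) and the Davio expansions (16)-(18) can be checked exhaustively, since a single ternary variable has only three values and there are only 27 choices of cofactors. The following is a minimal sketch, assuming GF(3) arithmetic is arithmetic modulo 3; the table `RPL_AS_POWERS` and the function names are hypothetical helpers written for this check.

```python
from itertools import product

P = 3

def rpl(i, x):
    """1-reduced Post literal, equation (5)."""
    return 1 if x == i else 0

# (a, b, c) encodes a*s^2 + b*s + c, where s = x + k is the k-th shift of x.
# Keys are (polarity i, shift k); the values follow equations (7)-(15).
RPL_AS_POWERS = {
    (0, 0): (2, 0, 1), (0, 1): (2, 2, 0), (0, 2): (2, 1, 0),
    (1, 0): (2, 2, 0), (1, 1): (2, 1, 0), (1, 2): (2, 0, 1),
    (2, 0): (2, 1, 0), (2, 1): (2, 0, 1), (2, 2): (2, 2, 0),
}

def identities_hold():
    """Check equations (7)-(15) for every value of x."""
    for (i, k), (a, b, c) in RPL_AS_POWERS.items():
        for x in range(P):
            s = (x + k) % P
            if (a * s * s + b * s + c) % P != rpl(i, x):
                return False
    return True

def davio(k, x, f0, f1, f2):
    """Davio expansion in the k-th shift of x, equations (16)-(18)."""
    s = (x + k) % P
    const, lin = [(f0, 2 * f1 + f2), (f2, 2 * f0 + f1), (f1, 2 * f2 + f0)][k]
    return (const + s * lin + 2 * s * s * (f0 + f1 + f2)) % P

def davio_expansions_hold():
    """Check that each Davio expansion returns the cofactor selected by x."""
    for cof in product(range(P), repeat=3):
        for k in range(P):
            for x in range(P):
                if davio(k, x, *cof) != cof[x]:
                    return False
    return True

print(identities_hold(), davio_expansions_hold())
```

The second check treats the cofactors as constants, which is exactly the single-variable setting of equations (16)-(18).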
to define the ternary s/d trees, we define the generalized davio node over ternary galois radix as shown in figure 4(e). our notation here is that (x) corresponds to the three possible shifts of the variable x as follows: x ∈ {x,x′,x′′}, over gf(3). (28) definition 1. the ternary tree with ternary shannon and ternary generalized davio nodes, that generates other ternary trees, is called ternary shannon-davio (s/d) tree. utilizing the definition of ternary shannon in figure 4(a) and ternary generalized davio in figure 4(e), we obtain the ternary shannon-davio trees (ternary s/d trees) for two variables as shown in figure 5. from the ternary s/d dts shown in figure 5, if we take any s/d tree and multiply the second-level cofactors (which are in the tdt leaves) each by the corresponding path in that tdt, and sum all the resulting cubes (terms) over gf(3), we obtain the flattened form of the function f , as a certain gfsop expression. for each tdt in figure 5, there are as many forms obtained for the function f as the number of possible permutations of the polarities of the variables in the second-level branches of each tdt. definition 2. the family of all possible forms obtained per ternary s/d tree is called ternary inclusive forms (tifs). the numbers of these tifs per tdt for two variables are shown on top of each s/d tdt in figure 5. by observing figure 5, we can generate the corresponding flattened forms by two methods. a classical method, per analogy with well-known binary forms, would be to create every transform matrix for every tif s/d tree and then expand using that transform matrix. a better method is to create one flattened 108 an extended green sasao hierarchy of canonical ternary galois forms ... s a’ a ( )a pd 1 a ( )b nd 1 a’ d 1 a ( )d( )c figure 3: two-valued nodes: (a) shannon, (b) positive davio, (c) negative davio, and (d) generalized davio where a ∈ {a,a′}. 
variable, that is one tree level, figure 4 represents the expansion nodes for ternary shannon, davio and generalized davio (d), respectively. s 0 x 1 x 2 x (a) d0 1 x 2 (b) d2 1 (d) d 1 (e) d1 1 ( ’)x 2 ( ’’)x 2 ( )x 2 (c) x xx’ x’’ figure 4: ternary nodes for ternary decision trees: (a) shannon, (b) davio0, (c) davio1, (d) davio2, and (e) generalized davio (d) where x ∈ {x,x′,x′′}. in correspondence to the binary s/d trees, one can produce ternary s/d trees. to define the ternary s/d trees, we define the generalized davio node over ternary galois radix as shown in figure 4(e). our notation here is that (x) corresponds to the three possible shifts of the variable x as follows: x ∈ {x,x′,x′′}, over gf(3). (28) definition 1. the ternary tree with ternary shannon and ternary generalized davio nodes, that generates other ternary trees, is called ternary shannon-davio (s/d) tree. utilizing the definition of ternary shannon in figure 4(a) and ternary generalized davio in figure 4(e), we obtain the ternary shannon-davio trees (ternary s/d trees) for two variables as shown in figure 5. from the ternary s/d dts shown in figure 5, if we take any s/d tree and multiply the second-level cofactors (which are in the tdt leaves) each by the corresponding path in that tdt, and sum all the resulting cubes (terms) over gf(3), we obtain the flattened form of the function f , as a certain gfsop expression. for each tdt in figure 5, there are as many forms obtained for the function f as the number of possible permutations of the polarities of the variables in the second-level branches of each tdt. definition 2. the family of all possible forms obtained per ternary s/d tree is called ternary inclusive forms (tifs). the numbers of these tifs per tdt for two variables are shown on top of each s/d tdt in figure 5. by observing figure 5, we can generate the corresponding flattened forms by two methods. 
a classical method, per analogy with well-known binary forms, would be to create every transform matrix for every tif s/d tree and then expand using that transform matrix. a better method is to create one flattened 108 an extended green sasao hierarchy of canonical ternary galois forms ... s a’ a ( )a pd 1 a ( )b nd 1 a’ d 1 a ( )d( )c figure 3: two-valued nodes: (a) shannon, (b) positive davio, (c) negative davio, and (d) generalized davio where a ∈ {a,a′}. variable, that is one tree level, figure 4 represents the expansion nodes for ternary shannon, davio and generalized davio (d), respectively. s 0 x 1 x 2 x (a) d0 1 x 2 (b) d2 1 (d) d 1 (e) d1 1 ( ’)x 2 ( ’’)x 2 ( )x 2 (c) x xx’ x’’ figure 4: ternary nodes for ternary decision trees: (a) shannon, (b) davio0, (c) davio1, (d) davio2, and (e) generalized davio (d) where x ∈ {x,x′,x′′}. in correspondence to the binary s/d trees, one can produce ternary s/d trees. to define the ternary s/d trees, we define the generalized davio node over ternary galois radix as shown in figure 4(e). our notation here is that (x) corresponds to the three possible shifts of the variable x as follows: x ∈ {x,x′,x′′}, over gf(3). (28) definition 1. the ternary tree with ternary shannon and ternary generalized davio nodes, that generates other ternary trees, is called ternary shannon-davio (s/d) tree. utilizing the definition of ternary shannon in figure 4(a) and ternary generalized davio in figure 4(e), we obtain the ternary shannon-davio trees (ternary s/d trees) for two variables as shown in figure 5. from the ternary s/d dts shown in figure 5, if we take any s/d tree and multiply the second-level cofactors (which are in the tdt leaves) each by the corresponding path in that tdt, and sum all the resulting cubes (terms) over gf(3), we obtain the flattened form of the function f , as a certain gfsop expression. 
for each tdt in figure 5, there are as many forms obtained for the function f as the number of possible permutations of the polarities of the variables in the second-level branches of each tdt. definition 2. the family of all possible forms obtained per ternary s/d tree is called ternary inclusive forms (tifs). the numbers of these tifs per tdt for two variables are shown on top of each s/d tdt in figure 5. by observing figure 5, we can generate the corresponding flattened forms by two methods. a classical method, per analogy with well-known binary forms, would be to create every transform matrix for every tif s/d tree and then expand using that transform matrix. a better method is to create one flattened 108 52 a. n. al-rabadi an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 53 an extended green sasao hierarchy of canonical ternary galois forms ... proof. from equation (5), if we substitute the values of the 1-rpl in equation (6), we obtain the following {x = 0 ⇒ fx=0 = f0,x = 1 ⇒ fx=1 = f1,x = 2 ⇒ fx=2 = f2} which are the cofactors of variable x of values {0,1,2}, respectively. example 2. let f (x1,x2) = x ′ 1x2 + x ′′ 2 x1. then, using figure 1, the ternary truth vector with the variable order {x1,x2} is f = [0,2,1,1,2,0,2,2,2] t . using equation (6), we obtain the following gf(3) shannon expansion f (x1,x2) = 0x1 1x2 + 2 · 0x1 2x2 + 2 · 1x1 0x2 + 2 · 1x1 1x2 + 2 · 1x1 2x2 + 2x1 0x2 + 2 · 2x1 2x2. by using the addition and multiplication over gf(3) utilizing figure 1, the 1-rpl which is defined in equation (5) is related to the shifts of variables over gf(3) in terms of powers as follows: 0 x =2(x)2 + 1, (7) 0 x =2(x′)2 + 2(x′), (8) 0 x =2(x′′)2 + x′′, (9) 1 x =2(x)2 + 2(x), (10) 1 x =2(x′)2 + x′, (11) 1 x =2(x′′)2 + 1, (12) 2 x =2(x)2 + x, (13) 2 x =2(x′)2 + 1, (14) 2 x =2(x′′)2 + 2(x′′). 
(15) after the substitution of equations (7) (15) in equation (6), and after the minimization of the terms according to the gf operations in figure 1, one obtains the following equations: f =1 · f0 + x ·(2 f1 + f2)+ 2(x) 2( f0 + f1 + f2), (16) f =1 · f2 + x ′ ·(2 f0 + f1)+ 2(x ′)2( f0 + f1 + f2), (17) f =1 · f1 + x” ·(2 f2 + f0)+ 2(x ′′)2( f0 + f1 + f2). (18) equations (6) and (16) (18) are the ternary fundamental shannon and davio expansions for a single variable, respectively. these equations can be re-written in the following matrix-based forms as shown in equations (19) (22). we observe that equations (19) (22) are expansions for a single variable, but these expansions can be recursively generated for arbitrary number of variables n using the kronecker product also called the tensor product analogously to the binary case [1, 17]. 105 54 a. n. al-rabadi an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 55 an extended green sasao hierarchy of canonical ternary galois forms ... f = � 0x 1x 2x �   1 0 0 0 1 0 0 0 1     f0 f1 f2  , (19) f = � 1 x x2 �   1 0 0 0 2 1 2 2 2     f0 f1 f2  , (20) f = � 1 x′ (x′)2 �   0 0 1 2 1 0 2 2 2     f0 f1 f2  , (21) f = � 1 x′′ (x′′)2 �   0 1 0 1 0 2 2 2 2     f0 f1 f2  . (22) the recursive generation using the kronecker product for arbitrary number of variables can be expressed formally as in the following forms for ternary shannon (s) and davio (d0,d1,d2) expansions, respectively: f = n � i=1 � 0xi 1xi 2xi � n � i=1 [s][�f], (23) f = n � i=1 � 1 xi x 2 i � n � i=1 [d0][�f], (24) f = n � i=1 � 1 x′i (x ′ i) 2 � n � i=1 [d1][�f], (25) f = n � i=1 � 1 x′′i (x ′′ i ) 2 � n � i=1 [d2][�f]. (26) analogously to the binary case, we can have expansions that are mixed of shannon (s) for certain variables and davio (d0,d1,d2) for the other variables. this will lead, analogously to the binary case, to the kronecker ternary decision trees (tdts). 
Moreover, mixed expansions can be extended to include the case of pseudo-Kronecker TDTs [17].

2 New multiple-valued S/D trees and their canonical Galois SOP forms

Economical and highly testable implementations of Boolean functions, based on Reed-Muller (AND-EXOR) logic, play an important role in logic synthesis and circuit design. The AND-EXOR circuits include canonical forms, which are expansions that are unique representations of a Boolean function. Several large families of canonical forms, namely fixed polarity Reed-Muller (FPRM) forms, generalized Reed-Muller (GRM) forms, Kronecker (KRO) forms and pseudo-Kronecker (PSDKRO) forms, referred to as the Green-Sasao hierarchy, have been described [1, 10, 17]. Because canonical families have higher testability and other properties desirable for efficient synthesis, especially for some classes of functions, they are widely investigated. A similar ternary version of the binary Green-Sasao hierarchy can be developed. This hierarchy can find applications in minimizing Galois field sum-of-product (GFSOP) expressions, which are sum-of-product expressions that use the additions and multiplications of an arbitrary-radix Galois field, and it can be used for the creation of new forms, decision diagrams and regular structures.

The current state-of-the-art minimizers of exclusive sum-of-product (ESOP) expressions are based on heuristics and give the exact solution only for functions with a small number of variables. For example, a formulation for finding exact ESOPs was given in [11], and an algorithm to derive minimum ESOPs for 6-variable functions was provided in [25]. Because GFSOP minimization is even more difficult, it is important to investigate the structural properties and the counts of their canonical sub-families. Two families of binary canonical Reed-Muller forms, called inclusive forms (IFs) and generalized inclusive forms (GIFs), were presented in [1], where the second family was the first to include all minimum ESOPs (binary GFSOPs). In these forms, an IF is a form generated by the corresponding S/D tree, and the GIF family is the union of the IFs over the various variable orderings (cf. the definitions and theorems of these forms in the next subsection).
In this paper, we propose, by analogy with the binary case, two general families of canonical ternary Reed-Muller forms, called ternary inclusive forms (TIFs) and their generalization, ternary generalized inclusive forms (TGIFs). A GFSOP minimizer based on these new forms can be used to minimize functional GFSOP expressions, and the second family (TGIFs) includes minimum GFSOPs over the ternary Galois field. One of the motivations for this work is the application of these TIFs and TGIFs to find minimum GFSOPs for multiple-valued-input, multiple-valued-output functions within logic synthesis, where the corresponding S/D trees provide a more general polarity that contains GRM forms as a special case.

2.1 S/D trees and their inclusive forms and generalized inclusive forms

Two general families based on the Shannon expansion and the generalized Davio expansion, produced using the corresponding S/D trees, are presented in this subsection. These families are called the inclusive forms (IFs) and the generalized inclusive forms (GIFs). The corresponding expansions over GF(2) are shown in Figure 3, where Figure 3(d) shows the new node, which is based on the binary Davio expansions and is called the generalized Davio (D) expansion (cf. Equation (28) for the more general ternary case); it generates the negative and positive Davio expansions as special cases. The S/D trees for IFs of two variables can be generated for variable order {a,b} and for variable order {b,a} as well. The set of GIFs for two variables is the union of these two order-based IFs, where the total number of the resulting GIFs is equal to:

$$\#GIF = 2 \cdot (\#IF_{a,b}) - \#(IF_{a,b} \cap IF_{b,a}). \tag{27}$$

Figure 3: Two-valued nodes: (a) Shannon, (b) positive Davio, (c) negative Davio, and (d) generalized Davio, where a ∈ {a, a′}.

The Galois-based Shannon and Davio ternary expansions (i.e., flattened forms) can be represented in ternary DTs (TDTs) and the corresponding varieties of ternary DDs (TDDs) according to the corresponding reduction rules that are used [1, 17]. For one variable, that is, one tree level, Figure 4 represents the expansion nodes for ternary Shannon, Davio and generalized Davio (D), respectively.

Figure 4: Ternary nodes for ternary decision trees: (a) Shannon, (b) Davio0, (c) Davio1, (d) Davio2, and (e) generalized Davio (D), where x ∈ {x, x′, x′′}.

In correspondence to the binary S/D trees, one can produce ternary S/D trees. To define the ternary S/D trees, we define the generalized Davio node over the ternary Galois radix as shown in Figure 4(e). Our notation here is that (x) corresponds to the three possible shifts of the variable x as follows:

$$x \in \{x, x', x''\} \quad \text{over GF(3)}. \tag{28}$$

Definition 1. The ternary tree with ternary Shannon and ternary generalized Davio nodes, which generates other ternary trees, is called a ternary Shannon-Davio (S/D) tree.

Utilizing the definition of ternary Shannon in Figure 4(a) and ternary generalized Davio in Figure 4(e), we obtain the ternary Shannon-Davio trees (ternary S/D trees) for two variables as shown in Figure 5. If we take any S/D tree in Figure 5, multiply each of the second-level cofactors (which are in the TDT leaves) by the corresponding path in that TDT, and sum all the resulting cubes (terms) over GF(3), we obtain the flattened form of the function f as a certain GFSOP expression. For each TDT in Figure 5, there are as many forms obtained for the function f as the number of possible permutations of the polarities of the variables in the second-level branches of that TDT.

Definition 2. The family of all possible forms obtained per ternary S/D tree is called ternary inclusive forms (TIFs).

The numbers of these TIFs per TDT for two variables are shown on top of each S/D TDT in Figure 5. By observing Figure 5, we can generate the corresponding flattened forms by two methods.
A classical method, by analogy with well-known binary forms, would be to create every transform matrix for every TIF S/D tree and then expand using that transform matrix. A better method is to create one flattened form, which is an expansion over a certain transform matrix (i.e., a certain TIF), and then transform systematically from one form to another without the need to create all transform matrices from the corresponding S/D trees. This general approach can lead to several algorithms of various complexities that generalize the existing binary algorithms to obtain the corresponding forms such as FPRM, GRM and IF forms.

Example 3.
using the result of example 2 for the expansion of $f(x_1, x_2)$ in terms of the ternary shannon expansion (which corresponds to the s/d tree with shannon nodes in both levels):

\[
f = {}^{0}x_1\,{}^{1}x_2 + 2\cdot{}^{0}x_1\,{}^{2}x_2 + 2\cdot{}^{1}x_1\,{}^{0}x_2 + 2\cdot{}^{1}x_1\,{}^{1}x_2 + 2\cdot{}^{1}x_1\,{}^{2}x_2 + {}^{2}x_1\,{}^{0}x_2 + 2\cdot{}^{2}x_1\,{}^{2}x_2,
\tag{29}
\]

we can substitute any of equations (7) through (15), or a mix of these equations, to transform one flattened form to another. for example, if we substitute equations (7) and (11), we obtain:

\[
\begin{aligned}
f ={}& (2(x_1)^2 + 1)(2(x_2')^2 + x_2') + 2(2(x_1)^2 + 1)\,{}^{2}x_2 + 2(2(x_1')^2 + x_1')(2(x_2)^2 + 1) \\
&+ 2(2(x_1')^2 + x_1')(2(x_2')^2 + x_2') + 2(2(x_1')^2 + x_1')\cdot{}^{2}x_2 + {}^{2}x_1(2(x_2)^2 + 1) + 2\cdot{}^{2}x_1\,{}^{2}x_2.
\end{aligned}
\tag{30}
\]

by utilizing the gf addition and multiplication operators from figure 1, equation (30) can be transformed to:

\[
\begin{aligned}
f ={}& (x_1)^2(x_2')^2 + 2(x_1)^2(x_2') + 2(x_2')^2 + x_2' + (x_1)^2({}^{2}x_2) + 2({}^{2}x_2) \\
&+ 2(x_1')^2(x_2)^2 + (x_1')^2 + (x_1')(x_2)^2 + 2x_1' \\
&+ 2(x_1')^2(x_2')^2 + (x_1')^2(x_2') + (x_1')(x_2')^2 + 2x_1'x_2' \\
&+ (x_1')^2\cdot{}^{2}x_2 + 2(x_1')\,{}^{2}x_2 + 2({}^{2}x_1)(x_2)^2 + {}^{2}x_1 + 2({}^{2}x_1)({}^{2}x_2).
\end{aligned}
\tag{31}
\]

let us define, as one of the possible definitions, the cost of the flattened form (expression) to be:

\[ \text{cost} = \#\,\text{cubes}. \tag{32} \]

then, we observe that equation (29) has a cost of seven, while equation (31) has a cost of 19. thus, the inverse transformations applied to equation (31) would lead to equation (29) and a reduction of cost from 19 to seven. using the same approach, we can generate a subset of the possible gfsop expressions (flattened forms). note that all of these gfsop expressions are equivalent, since they express the same function in different forms. yet, as the comparison of equations (29) and (31) shows, some transformations from one form to another produce flattened forms with a smaller number of cubes than others. from this observation arises the idea of a possible application of evolutionary computing, using the s/d trees and the related transformations, to produce the corresponding minimum gfsops.
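the equivalence of equations (29) and (31), and the two costs under equation (32), can be checked by brute force over the nine input points. the following is a minimal sketch under stated assumptions: the shift is taken as x′ = x + 1 over gf(3), the literal ^k x is 1 exactly when x = k, and the encoding of cubes as tuples and callables (as well as all names) is purely illustrative.

```python
P = 3  # gf(3): integer arithmetic modulo 3


def lit(k, x):
    """shannon literal ^k x: 1 if x == k, else 0."""
    return 1 if x == k else 0


def sh(x):
    """single shift x' = x + 1 over gf(3)."""
    return (x + 1) % P


# equation (29): a cube (c, i, j) stands for c * ^i x1 * ^j x2
CUBES_29 = [(1, 0, 1), (2, 0, 2), (2, 1, 0), (2, 1, 1), (2, 1, 2), (1, 2, 0), (2, 2, 2)]

# equation (31): one callable per cube, in the order the cubes appear (a = x1, b = x2)
CUBES_31 = [
    lambda a, b: a * a * sh(b) ** 2,
    lambda a, b: 2 * a * a * sh(b),
    lambda a, b: 2 * sh(b) ** 2,
    lambda a, b: sh(b),
    lambda a, b: a * a * lit(2, b),
    lambda a, b: 2 * lit(2, b),
    lambda a, b: 2 * sh(a) ** 2 * b * b,
    lambda a, b: sh(a) ** 2,
    lambda a, b: sh(a) * b * b,
    lambda a, b: 2 * sh(a),
    lambda a, b: 2 * sh(a) ** 2 * sh(b) ** 2,
    lambda a, b: sh(a) ** 2 * sh(b),
    lambda a, b: sh(a) * sh(b) ** 2,
    lambda a, b: 2 * sh(a) * sh(b),
    lambda a, b: sh(a) ** 2 * lit(2, b),
    lambda a, b: 2 * sh(a) * lit(2, b),
    lambda a, b: 2 * lit(2, a) * b * b,
    lambda a, b: lit(2, a),
    lambda a, b: 2 * lit(2, a) * lit(2, b),
]


def eval_29(a, b):
    """value of equation (29) at (x1, x2) = (a, b)."""
    return sum(c * lit(i, a) * lit(j, b) for c, i, j in CUBES_29) % P


def eval_31(a, b):
    """value of equation (31) at (x1, x2) = (a, b)."""
    return sum(cube(a, b) for cube in CUBES_31) % P


def cost(cubes):
    """cost as in equation (32): the number of cubes."""
    return len(cubes)
```

a search procedure of the kind suggested above would use `cost` as its fitness measure and an equivalence check such as `eval_29(a, b) == eval_31(a, b)` over all points to confirm that a transformation has not changed the function.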
analogous to the binary case in equation (27), the ternary gifs can be defined as the union of ternary ifs. definition 3. the family of forms, which is created as a union of sets of tifs for all variable orders, is called ternary generalized inclusive forms (tgifs).
[figure 5: ternary if (tif) s/d trees and their numbers: (a) 2-variable order {a,b} and (b) 2-variable order {b,a}. variables {a,b} are defined as in eq. (28) for generalized davio (d), where in this figure (a, ≡ a) and (b, ≡ b). panel (a) shows sixteen trees; the numbers of tifs printed on top of the trees are n = 1, 81, 729, 59049, 9, 81, 6561, 59049, 9, 81, 6561, 59049, 9, 729, 6561, 531441.]

theorem 2. the total number of the ternary ifs, for two variables and for orders {a,b} and {b,a}, and the total number of ternary generalized ifs, for two variables, are respectively:

\[
\#\mathrm{tif}_{a,b} = 1\cdot 3^{0} + 3\cdot 3^{2} + 3\cdot 3^{4} + 2\cdot 3^{6} + 3\cdot 3^{8} + 3\cdot 3^{10} + 1\cdot 3^{12} = 730{,}000,
\tag{33}
\]

\[
\#\mathrm{tif}_{b,a} = 1\cdot 3^{0} + 3\cdot 3^{2} + 3\cdot 3^{4} + 2\cdot 3^{6} + 3\cdot 3^{8} + 3\cdot 3^{10} + 1\cdot 3^{12} = 730{,}000,
\tag{34}
\]

\[
\begin{aligned}
\#\mathrm{tgif} &= \#\mathrm{tif}_{a,b} + \#\mathrm{tif}_{b,a} - \#(\mathrm{tif}_{a,b} \cap \mathrm{tif}_{b,a}) \\
&= 2\cdot\#\mathrm{tif} - \#(\mathrm{tif}_{a,b} \cap \mathrm{tif}_{b,a}) \\
&= 2\cdot(730{,}000) - (1\cdot 3^{0} + 2\cdot 3^{6} + 1\cdot 3^{12}) = 927{,}100.
\end{aligned}
\tag{35}
\]

proof.
by observing figure 5, we note that the total number of tifs for orders {a,b} and {b,a} is the sum of the numbers on top of the s/d trees, which leads to equations (33) through (35), respectively.

2.2 properties of tifs and tgifs

the following are basic properties of the presented tifs and tgifs.

theorem 3. each ternary inclusive form (tif) is canonical.
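the arithmetic of theorem 2 can be re-derived directly from the numbers in figure 5. the following is a minimal sketch: it only sums the per-tree counts and applies the inclusion-exclusion step of equation (35); the constant and function names are hypothetical, and the size of the intersection is taken from equation (35) rather than derived.

```python
# tif counts printed on top of the sixteen trees of figure 5 (one variable order)
TREE_COUNTS = [1, 81, 729, 59049, 9, 81, 6561, 59049,
               9, 81, 6561, 59049, 9, 729, 6561, 531441]


def tif_total(counts):
    """total number of tifs for one variable order: the sum over all trees."""
    return sum(counts)


def tif_total_from_powers():
    """the same total written as in equations (33) and (34): sum of m * 3**e."""
    multiplicity = {0: 1, 2: 3, 4: 3, 6: 2, 8: 3, 10: 3, 12: 1}
    return sum(m * 3 ** e for e, m in multiplicity.items())


def shared_forms():
    """number of forms common to both orders, as stated in equation (35)."""
    return 1 * 3 ** 0 + 2 * 3 ** 6 + 1 * 3 ** 12


def tgif_total():
    """equation (35): union of the two orders by inclusion-exclusion."""
    return 2 * tif_total_from_powers() - shared_forms()
```

both ways of summing give 730,000 per variable order, and subtracting the 532,900 shared forms from the doubled total gives the 927,100 tgifs of equation (35).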
figure 5 (continued): (b) 2-variable order {b,a}. numbers shown on top of the sixteen s/d trees: n = 1, 81, 729, 59049, 9, 81, 6561, 59049, 9, 81, 6561, 59049, 9, 729, 6561, 531441.

2.2 properties of tifs and tgifs

the following are basic properties of the presented tifs and tgifs.

theorem 3. each ternary inclusive form (tif) is canonical.

proof. an expansion is canonical iff its terms are linearly independent, i.e., none of the terms is equal to a linear combination of the other terms. using this fact, it was proven that all gf(2) ifs are canonical. analogously, over gf(3), for any function f of the same number of variables there exists one and only one set of coefficients $a_i$ such that f is tif-expandable using $a_i$, and thus the resulting expression is canonical. by induction on the number of variables, the terms in tifs are linearly independent and thus the tifs are canonical.

theorem 4. ternary generalized inclusive forms (tgifs) are canonical with respect to a given variable order.
proof. since a tgif is the union of the corresponding tifs, and since each tif is canonical, the resulting union of the corresponding canonical tifs is also canonical.

for different variable orderings, some forms are repeated while others are not. therefore the union of the sets of tifs over all variable orders contains more forms than any of the tif sets taken separately and fewer forms than the total sum of all tifs. generalized inclusive forms include other forms, such as grms over gf(3), as can be shown by considering all possible combinations of literals for all possible orders of variables. if we relax the requirement of a fixed variable ordering and allow any ordering of variables in the branches of the tree, but do not allow repetitions of variables in the branches, we generate a more general gf(3) family of forms.
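the canonicity criterion used in the proofs of theorems 3 and 4 (linearly independent terms, hence exactly one coefficient set per function) can be illustrated by brute force for a single ternary variable. the sketch below is an illustration under stated assumptions, not the paper's proof: it assumes that the shannon terms are the indicator literals of x = 0, 1, 2 and that the zero-polarity davio terms are {1, x, x^2}, since equations (23) and (24) are not reproduced in this section. all function and variable names are hypothetical.

```python
from itertools import product

P = 3              # field size: GF(3)
POINTS = range(P)  # the three values a ternary variable can take


def truth_vector(fn):
    """Value vector (f(0), f(1), f(2)) of a one-variable function over GF(3)."""
    return tuple(fn(x) % P for x in POINTS)


# Candidate term sets, each term stored as its value vector.
SHANNON = [truth_vector(lambda x, i=i: 1 if x == i else 0) for i in POINTS]  # assumed literals
DAVIO_0 = [truth_vector(lambda x, k=k: x**k) for k in range(P)]              # assumed {1, x, x^2}
DEPENDENT = [truth_vector(lambda x: 1),
             truth_vector(lambda x: x),
             truth_vector(lambda x: x + 1)]  # third term = first + second


def expand(coeffs, terms):
    """Function sum_i coeffs[i] * terms[i], evaluated over GF(3)."""
    return tuple(sum(c * t[x] for c, t in zip(coeffs, terms)) % P for x in POINTS)


def is_canonical(terms):
    """True iff the coefficient vectors map one-to-one onto all 27 functions."""
    images = {expand(c, terms) for c in product(range(P), repeat=len(terms))}
    return len(images) == P ** len(terms) == P ** P


def coefficients(f, terms):
    """All coefficient vectors that reproduce the value vector f."""
    return [c for c in product(range(P), repeat=len(terms)) if expand(c, terms) == f]


print(is_canonical(SHANNON), is_canonical(DAVIO_0), is_canonical(DEPENDENT))
# True True False
print(coefficients((2, 0, 1), DAVIO_0))  # [(2, 1, 0)], i.e. f(x) = 2 + x
```

exhaustive enumeration is only practical for such tiny cases; it is used here because it mirrors the "one and only one set of coefficients" wording directly, without relying on a rank computation.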
definition 4. the family of forms generated by an s/d tree with no fixed ordering of variables, and with variables not repeated along the same branches, is called ternary free generalized inclusive forms (tfgifs).

2.3 an extended green-sasao hierarchy with a new sub-family for ternary reed-muller logic

the green-sasao hierarchy of families of canonical forms [1, 10-13, 17, 24, 26] and the corresponding decision trees and diagrams is based on three generic expansions: the shannon, positive davio and negative davio expansions.
for example, this includes shannon decision trees and diagrams, positive and negative davio decision trees and diagrams, fixed polarity reed-muller decision trees and diagrams, kronecker decision trees and diagrams, pseudo reed-muller decision trees and diagrams, pseudo kronecker decision trees and diagrams, and linearly-independent decision trees and diagrams [1, 17, 24]. a set-theoretic relationship between families of canonical forms over gf(2) was proposed and extended by introducing the binary if, gif and free gif (fgif) forms; figure 6(a) illustrates this set-theoretic relationship between the various families of canonical forms over gf(2).

analogously to the green-sasao hierarchy of binary reed-muller families of spectral transforms over gf(2) shown in figure 6(a), we introduce the extended green-sasao hierarchy of spectral transforms, with a new sub-family for ternary reed-muller logic over gf(3). definitions 2-4 defined the ternary inclusive forms, ternary generalized inclusive forms and ternary free generalized inclusive forms, respectively; analogously to the binary reed-muller case, the following definitions are introduced for the corresponding canonical expressions over gf(3).

definition 5. the decision tree (dt) that results from applying the ternary shannon expansion in equation (23) recursively to a ternary-input ternary-output logic function (i.e., at all levels of the dt) is called the ternary shannon decision tree (tsdt). the resulting expression (flattened form) from the tsdt is called the ternary shannon expression.

definition 6. the decision trees that result from applying the ternary davio expansions in equations (24)-(26) recursively to a ternary-input ternary-output logic function (i.e., at all levels of the dt) are called the ternary zero-polarity davio decision tree (td0dt), the ternary first-polarity davio decision tree (td1dt) and the ternary second-polarity davio decision tree (td2dt), respectively.
the resulting expressions (flattened forms) from the td0dt, td1dt and td2dt are called td0, td1 and td2 expressions, respectively.

definition 7. the decision tree that results from applying any of the ternary davio expansions (nodes) to all nodes in each level (variable) of the dt is called the ternary reed-muller decision tree (trmdt). the corresponding expression is called the ternary fixed polarity reed-muller (tfprm) expression.

definition 8. the decision tree that results from using any of the ternary shannon (s) or davio (d0, d1 or d2) expansions (nodes) for all nodes in each level (variable) of the dt (which has a fixed order of variables) is called the ternary kronecker decision tree (tkrodt). the resulting expression is called the ternary kronecker expression.

definition 9. the decision tree that results from using any of the ternary davio expansions (nodes) for each node (per level) of the dt is called the ternary pseudo-reed-muller decision tree (tprmdt). the resulting expression is called the ternary pseudo-reed-muller expression.

definition 10. the decision tree that results from using any of the ternary shannon or ternary davio expansions (nodes) for each node (per level) of the dt is called the ternary pseudo-kronecker decision tree (tpkrodt). the resulting expression is called the ternary pseudo-kronecker expression.

definition 11. the decision tree that results from using any of the ternary shannon or ternary davio expansions (nodes) for each node (per level) of the dt, disregarding the order of variables, provided that variables are not repeated along the same branches, is called the ternary free kronecker decision tree (tfkrodt). the result is called the ternary free-kronecker expression.

definition 12. a ternary kronecker dt that has at least one ternary generalized reed-muller expansion node is called a ternary generalized kronecker decision tree (tgkdt). the result is called a ternary generalized kronecker expression.

definition 13. a ternary kronecker dt that has at least one tgif node is called a ternary generalized inclusive forms kronecker (tgifk) decision tree. the result is called a ternary generalized inclusive form kronecker expression.

figure 6(b) illustrates the extended gf(3) green-sasao hierarchy with the new sub-family (tgifk). the presented tgif nodes can be realized with universal logic modules (ulms) for pairs of variables, as will be shown in section 3, analogously to what was done for the binary case. although the s/d trees developed so far are for the ternary radix, similar s/d trees can also be developed for higher galois radices gf(p^k), where p is a prime number and k is a natural number ≥ 1.
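definitions 5-10 can be read as constraints on which expansion types may label the nodes of a decision tree with a fixed variable order. the sketch below is one possible reading of these definitions, not an algorithm from the paper: a tree is encoded (hypothetically) as a list of levels, each level listing the expansion type "S", "D0", "D1" or "D2" of its nodes, and `classify` returns the most restrictive family the labelling satisfies. free trees (definition 11) and generalized nodes (definitions 12 and 13) are deliberately not modeled.

```python
SHANNON = "S"
DAVIO = {"D0", "D1", "D2"}


def classify(levels):
    """Most specific family among definitions 5-10 for a fixed-order ternary DT.

    levels[i] holds the expansion type of every node at level i
    (hypothetical encoding: one variable per level).
    """
    nodes = [node for level in levels for node in level]
    if not nodes or any(n != SHANNON and n not in DAVIO for n in nodes):
        raise ValueError("expected a non-empty tree labelled with S, D0, D1 or D2")

    same_type_per_level = all(len(set(level)) == 1 for level in levels)
    all_davio = all(n in DAVIO for n in nodes)

    if all(n == SHANNON for n in nodes):
        return "tsdt"                          # definition 5
    if all_davio and len(set(nodes)) == 1:
        return "t" + nodes[0].lower() + "dt"   # definition 6: td0dt, td1dt or td2dt
    if all_davio and same_type_per_level:
        return "trmdt"                         # definition 7
    if same_type_per_level:
        return "tkrodt"                        # definition 8
    if all_davio:
        return "tprmdt"                        # definition 9
    return "tpkrodt"                           # definition 10


# Two-variable examples: one root node and three nodes on the second level.
print(classify([["S"], ["S", "S", "S"]]))      # tsdt
print(classify([["D0"], ["D2", "D2", "D2"]]))  # trmdt
print(classify([["S"], ["D1", "D1", "D1"]]))   # tkrodt
print(classify([["D0"], ["D0", "D1", "D2"]]))  # tprmdt
print(classify([["S"], ["S", "D0", "D2"]]))    # tpkrodt
```

the order of the tests reflects the nesting of the families: every tree accepted by an earlier test also satisfies the later, weaker conditions, so the first matching test gives the most specific family.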
figure 6: green-sasao hierarchy: (a) set-theoretic relationship between gf(2) families of canonical forms (labels: esop, canonical forms, fgif, gif, if, pgk, gk, grm, pkro, kro, fprm), and (b) the extended green-sasao hierarchy with a new sub-family (tgifk) for gf(3) reed-muller logic (labels: ternary gfsop, ternary canonical forms, tfgif, tgif, tif, tpgk, tgk, tgrm, tpkro, tkro, tfprm, tgifk).
although the s/d trees that have been developed so far are for the ternary radix, similar s/d trees can be developed as well for higher galois radices of gf( pk) where p is a prime number and k is a natural number ≥ 1. tgrm ternary gfsop ternary canonical forms tfgif tgif tif tpgk tgk tpkro tkrotfprm tgifk esop canonical forms fgif gif if pgk gk grm pkro kro fprm (a) (b) figure 6: green-sasao hierarchy: (a) set-theoretic relationship between gf(2) families of canonical forms, and (b) the extended green-sasao hierarchy with a new sub-family (tgifk) for gf(3) reed-muller logic. 113 an extended green sasao hierarchy of canonical ternary galois forms ... muller decision tree (tprmdt). the resulting expression is called ternary pseudoreed-muller expression. definition 10. the decision tree that results from using any of the ternary shannon expansion or ternary davio expansions (nodes) for each node (per level) of the dt is called ternary pseudo-kronecker decision tree (tpkrodt). the resulting expression is called ternary pseudo-kronecker expression. definition 11. the decision tree that results from using any of the ternary shannon expansion or ternary davio expansions (nodes) for each node (per level) of the dt, disregarding order of variables, provided that variables are not repeated along the same branches, is called ternary free kronecker decision tree (tfkrodt). the result is called ternary free-kronecker expression. definition 12. the ternary kronecker dt that has at least one ternary generalized reed-muller expansion node is called ternary generalized kronecker decision tree (tgkdt). the result is called ternary generalized kronecker expression. definition 13. the ternary kronecker dt that has at least one tgif node is called ternary generalized inclusive forms kronecker (tgifk) decision tree. the result is called ternary generalized inclusive form kronecker expression. 
figure 6(b) illustrates the extended gf(3) green-sasao hierarchy with the new subfamily (tgifk). the presented tgif nodes can be realized with universal logic modules (ulms) for pairs of variables, as will be shown in section 3, which is analogous to what was done for the binary case. although the s/d trees that have been developed so far are for the ternary radix, similar s/d trees can be developed as well for higher galois radices of gf( pk) where p is a prime number and k is a natural number ≥ 1. tgrm ternary gfsop ternary canonical forms tfgif tgif tif tpgk tgk tpkro tkrotfprm tgifk esop canonical forms fgif gif if pgk gk grm pkro kro fprm (a) (b) figure 6: green-sasao hierarchy: (a) set-theoretic relationship between gf(2) families of canonical forms, and (b) the extended green-sasao hierarchy with a new sub-family (tgifk) for gf(3) reed-muller logic. 113 an extended green sasao hierarchy of canonical ternary galois forms ... definition 4. the family of forms, generated by s/d tree with no fixed ordering of variables, with variables not repeated along same branches, is called ternary free generalized inclusive forms (tfgifs). 2.3 an extended green-sasao hierarchy with a new sub-family for ternary reed-muller logic the green-sasao hierarchy of families of canonical forms [1, 10-13, 17, 24, 26] and the corresponding decision trees and diagrams is based on three generic expansions, shannon, positive and negative davio expansions. for example, this includes shannon decision trees and diagrams, positive and negative davio decision trees and diagrams, fixed polarity reed-muller decision trees and diagrams, kronecker decision trees and diagrams, pseudo reed-muller decision trees and diagrams, pseudo kronecker decision trees and diagrams, and linearly-independent decision trees and diagrams [1, 17, 24]. 
A set-theoretic relationship between families of canonical forms over GF(2) was proposed and later extended by introducing the binary IF, GIF and free GIF (FGIF) forms; Figure 6(a) illustrates this relationship. Analogously to the Green-Sasao hierarchy of binary Reed-Muller families of spectral transforms over GF(2) shown in Figure 6(a), we introduce the extended Green-Sasao hierarchy of spectral transforms, with a new sub-family for ternary Reed-Muller logic over GF(3). Definitions 2-4 defined the ternary inclusive forms, ternary generalized inclusive forms and ternary free generalized inclusive forms, respectively. Analogously to the binary Reed-Muller case, the following definitions are introduced for the corresponding canonical expressions over GF(3).

Definition 5. The decision tree that results from applying the ternary Shannon expansion in Equation (23) recursively to a ternary-input ternary-output logic function (i.e., at all levels of a DT) is called the Ternary Shannon Decision Tree (TSDT). The resulting expression (flattened form) from the TSDT is called the ternary Shannon expression.

Definition 6. The decision trees that result from applying the ternary Davio expansions in Equations (24)-(26) recursively to a ternary-input ternary-output logic function (i.e., at all levels of a DT) are called the Ternary Zero-Polarity Davio Decision Tree (TD0DT), the Ternary First-Polarity Davio Decision Tree (TD1DT) and the Ternary Second-Polarity Davio Decision Tree (TD2DT), respectively. The resulting expressions (flattened forms) from TD0DT, TD1DT and TD2DT are called TD0, TD1 and TD2 expressions, respectively.

Definition 7. The decision tree that results from applying any of the ternary Davio expansions (nodes) for all nodes in each level (variable) of the DT is called the Ternary Reed-Muller Decision Tree (TRMDT). The corresponding expression is called the Ternary Fixed Polarity Reed-Muller (TFPRM) expression.
Definition 8. The decision tree that results from using any of the ternary Shannon (S) or Davio (D0, D1 or D2) expansions (nodes) for all nodes in each level (variable) of the DT (which has a fixed order of variables) is called the Ternary Kronecker Decision Tree (TKRODT). The resulting expression is called the ternary Kronecker expression.

Definition 9. The decision tree that results from using any of the ternary Davio expansions (nodes) for each node (per level) of the DT is called the Ternary Pseudo-Reed-Muller Decision Tree (TPRMDT). The resulting expression is called the ternary pseudo-Reed-Muller expression.

Definition 10. The decision tree that results from using any of the ternary Shannon or ternary Davio expansions (nodes) for each node (per level) of the DT is called the Ternary Pseudo-Kronecker Decision Tree (TPKRODT). The resulting expression is called the ternary pseudo-Kronecker expression.

Definition 11. The decision tree that results from using any of the ternary Shannon or ternary Davio expansions (nodes) for each node (per level) of the DT, disregarding the order of variables, provided that variables are not repeated along the same branches, is called the Ternary Free Kronecker Decision Tree (TFKRODT). The result is called the ternary free-Kronecker expression.

Definition 12. A ternary Kronecker DT that has at least one ternary generalized Reed-Muller expansion node is called the Ternary Generalized Kronecker Decision Tree (TGKDT). The result is called the ternary generalized Kronecker expression.

Definition 13. A ternary Kronecker DT that has at least one TGIF node is called the Ternary Generalized Inclusive Forms Kronecker (TGIFK) decision tree. The result is called the ternary generalized inclusive form Kronecker expression.

Figure 6(b) illustrates the extended GF(3) Green-Sasao hierarchy with the new sub-family (TGIFK).
The presented TGIF nodes can be realized with universal logic modules (ULMs) for pairs of variables, as will be shown in Section 3, analogously to what was done for the binary case. Although the S/D trees developed so far are for the ternary radix, similar S/D trees can also be developed for higher Galois radices GF(p^k), where p is a prime number and k is a natural number ≥ 1.

[Figure 6: Green-Sasao hierarchy: (a) set-theoretic relationship between GF(2) families of canonical forms (ESOP, canonical forms, FGIF, GIF, IF, PGK, GK, GRM, PKRO, KRO, FPRM), and (b) the extended Green-Sasao hierarchy with a new sub-family (TGIFK) for GF(3) Reed-Muller logic (ternary GFSOP, ternary canonical forms, TFGIF, TGIF, TIF, TPGK, TGK, TGRM, TPKRO, TKRO, TFPRM, TGIFK).]
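The decision trees of Definitions 5-8 can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the truth vector is assumed to be ordered with the first variable most significant, and the Davio data inputs are the ones labelled in Figure 7(b) (constant term, 2·g1 + g2, and g0 + g1 + g2 for the shifted variable), since Equations (23)-(26) are not reproduced in this section. One expansion type is chosen per level, as in a Kronecker tree.

```python
# Sketch of a ternary Kronecker decision tree over GF(3) (hypothetical helper names).
SHIFT = {"D0": 0, "D1": 1, "D2": 2}

def node_coeffs(kind, f0, f1, f2):
    """Data inputs of one S/D node, computed from its three cofactors."""
    if kind == "S":
        return (f0, f1, f2)
    s = SHIFT[kind]
    # g[i] is the cofactor at the point where the shifted variable y = x + s equals i.
    g = [(f0, f1, f2)[(i - s) % 3] for i in range(3)]
    return (g[0], (2 * g[1] + g[2]) % 3, (g[0] + g[1] + g[2]) % 3)

def node_eval(kind, x, c0, c1, c2):
    """Value of one S/D node for control value x and data inputs c0, c1, c2."""
    if kind == "S":
        return (c0, c1, c2)[x]
    y = (x + SHIFT[kind]) % 3
    return (c0 + y * c1 + 2 * y * y * c2) % 3

def kronecker_coeffs(tv, kinds):
    """Expand truth vector tv level by level; kinds[i] is the node type of level i."""
    if not kinds:
        return tv[0]
    n = len(tv) // 3
    cof = [tv[i * n:(i + 1) * n] for i in range(3)]          # cofactors of the first variable
    cols = [node_coeffs(kinds[0], a, b, c) for a, b, c in zip(*cof)]
    return [kronecker_coeffs([col[i] for col in cols], kinds[1:]) for i in range(3)]

def kronecker_eval(tree, kinds, xs):
    """Evaluate the expanded tree at the input assignment xs."""
    if not kinds:
        return tree
    vals = [kronecker_eval(sub, kinds[1:], xs[1:]) for sub in tree]
    return node_eval(kinds[0], xs[0], *vals)

tv = [0, 1, 2, 2, 2, 0, 1, 0, 1]      # arbitrary f(x1, x2), stored at index 3*x1 + x2
kinds = ("S", "D1")                   # Shannon on x1, first-polarity Davio on x2
tree = kronecker_coeffs(tv, kinds)
ok = all(kronecker_eval(tree, kinds, (a, b)) == tv[3 * a + b]
         for a in range(3) for b in range(3))
print(ok)  # prints True
```

Choosing ("D0", "D0") for kinds gives a fixed-polarity tree in the sense of Definition 7, and ("S", "S") gives the Shannon tree of Definition 5. Pseudo and free trees would need a node type per node rather than per level, which this sketch does not model.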
3 Universal Logic Modules for the Circuit Realization of S/D Trees

The nonsingular expansions of ternary Shannon (S) and ternary Davio (D0, D1 and D2) can be realized using a universal logic module (ULM) whose control variables correspond to the variables of the basis functions, i.e., the variables being expanded upon. We call it a universal logic module because, similarly to a multiplexer, all functions of two variables can be realized with two-level trees of such modules using constants on the second-level data inputs. The presented ULMs are complete systems because they can implement all possible functions of a given number of variables.
The concept of the universal logic module was used for binary RM logic over GF(2), as well as for the general case of linearly independent (LI) logic, which includes RM logic as a special case. Binary LI logic extended the universal logic module from being just a multiplexer (Shannon expansion), an AND/EXOR gate (positive Davio expansion) or an AND/EXOR/NOT gate with inverted control variable (negative Davio expansion) to universal logic modules for any expansion over any linearly independent basis functions. Analogously to the binary case, Figure 7 presents universal logic modules for the ternary Shannon and ternary Davio expansions. Note that any function f can be produced by applying the independent variable x and the cofactors {fi, fj, fk} as inputs to a ULM. The form of the resulting function depends on the shift and power operations chosen inside the ULM for the input independent variable, and on the chosen weighted combinations of the input cofactors. Using this observation, all Davio ULMs can be combined into a single all-Davio ULM, illustrated in Figure 7(c). The more general ULM shown in Figure 7(d) can likewise be generated to implement all ternary GF(3) Shannon and Davio expansions. In general, the gates in the ULMs can be implemented, among other circuit technologies, using binary logic over GF(2) or using multiple-valued circuit gates. Each ternary ULM corresponds to a single node of the ternary DTs illustrated previously. The main advantage of such powerful ULMs is the high layout regularity required by future nanotechnologies: the trees can be realized in an efficient layout because they do not grow exponentially for practical functions. For instance, assuming the ULMs from Figure 7, although every two-variable function can be realized with four such modules, it is highly probable that most two-variable functions will require fewer than four modules.
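As an illustration of the modules described above, the following sketch models a Shannon ULM and an all-Davio ULM as small Python functions and builds the four-module, two-level tree for a two-variable function. The function names are hypothetical, the gate-level structure is only loosely modelled on Figure 7(a)-(c), and the Davio basis functions (1, y, 2y^2 with y = x + shift) are read off the labels of Figure 7(b) rather than taken from the paper's equations.

```python
# Behavioural sketch of ternary ULMs over GF(3) (hypothetical names, not the paper's netlist).

def gf3_add(a, b):
    return (a + b) % 3

def gf3_mul(a, b):
    return (a * b) % 3

def shannon_ulm(x, d0, d1, d2):
    """Shannon module: each data input is multiplied by a literal, then all are added."""
    literals = [1 if x == i else 0 for i in range(3)]   # literals 0x, 1x, 2x
    out = 0
    for lit, d in zip(literals, (d0, d1, d2)):
        out = gf3_add(out, gf3_mul(lit, d))
    return out

def davio_ulm(x, d0, d1, d2, shift=0):
    """All-Davio module: shift = 0, 1, 2 selects the D0, D1 or D2 expansion."""
    y = gf3_add(x, shift)                               # shifted control variable
    out = d0                                            # basis function 1
    out = gf3_add(out, gf3_mul(y, d1))                  # basis function y
    out = gf3_add(out, gf3_mul(gf3_mul(2, gf3_mul(y, y)), d2))  # basis function 2*y^2
    return out

def two_variable_tree(f, x1, x2):
    """Four Shannon modules: three second-level ones fed with constants, one on top."""
    second = [shannon_ulm(x2, f(a, 0), f(a, 1), f(a, 2)) for a in range(3)]
    return shannon_ulm(x1, *second)

def f(a, b):
    """Arbitrary example function of two ternary variables."""
    return (a * b + 2 * a + 1) % 3

tree_ok = all(two_variable_tree(f, a, b) == f(a, b)
              for a in range(3) for b in range(3))

# D0 data inputs for a one-variable function with values (f0, f1, f2) = (1, 0, 2).
f0, f1, f2 = 1, 0, 2
d = (f0, (2 * f1 + f2) % 3, (f0 + f1 + f2) % 3)
davio_ok = [davio_ulm(x, *d) for x in range(3)] == [f0, f1, f2]
print(tree_ok, davio_ok)  # prints True True
```

When a second-level module would receive three equal constants, its output is that constant and the module can be dropped, which is one way to see why many two-variable functions need fewer than four modules.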
Because of these properties, this approach is further expected to give good results when applied to the corresponding incompletely specified functions and multiple-valued relations.

[Figure 7: Various ternary ULMs: (a) ULM of GF(3) Shannon using three GF(3) multiplication gates and one GF(3) addition gate, (b) ULMs of GF(3) Davio {D0, D1, D2} using three GF(3) multiplication gates and one GF(3) addition gate, (c) ULM for all GF(3) Davio expansions using four GF(3) multiplication gates, one GF(3) addition gate, two shift operators, and one multiplexer, and (d) general ULM of GF(3) S/D node using four GF(3) multiplication gates, one GF(3) addition gate and four multiplexers. Data inputs in panel (a): f0, f1, f2 with the literals 0x, 1x, 2x. Data inputs in panel (b): D0: f0, 2f1 + f2, f0 + f1 + f2 with 1, x, 2x^2; D1: f2, 2f0 + f1, f0 + f1 + f2 with 1, x', 2(x')^2; D2: f1, 2f2 + f0, f0 + f1 + f2 with 1, x'', 2(x'')^2.]

4 Conclusions and Future Work

In this paper, an extended Green-Sasao hierarchy of families and forms is introduced. Analogously to the binary case, two general families of canonical ternary Reed-Muller forms are presented: Ternary Inclusive Forms (TIFs) and their generalization, Ternary Generalized Inclusive Forms (TGIFs). The second family includes minimum GF(3) Galois field sum-of-products (GFSOPs). The multiple-valued Shannon-Davio (S/D) trees developed in this paper provide a more general polarity than the corresponding IF polarity, which contains the GRM as a special case. Since universal logic modules (ULMs) are complete systems that can implement all possible logic functions using the S/D expansions of the multiple-valued Shannon and Davio decompositions, the realization of the presented S/D trees with the corresponding ULMs is also introduced. The presented TIFs and TGIFs can be used to find minimum GFSOPs for multiple-valued inputs and outputs within logic synthesis, where a GFSOP minimizer based on IF polarity can be used to minimize the corresponding multiple-valued GFSOP expressions.
the application of the presented tifs and tgifs can be used to find minimum gfsop for multiple-valued inputs-outputs within logic synthesis, where a gfsop minimizer based on if polarity can be used to minimize the corresponding multiple-valued gfsop expressions. 115 an extended green sasao hierarchy of canonical ternary galois forms ... • • x f0 f2 f1 0 x 2 x 1 x x f1 f0 +f 1 +f 2 2f2 +f 0 1 2( ”)x 2 x” x f0 f0 +f 1 +f 2 2f1 +f 2 1 2( )x 2 x 2( ’)x 2 x f2 f0 +f 1 +f 2 2f0 +f 1 1 x’ + + + + + 0 x 1 1 x 2 x 2 x x’ x” fi fk fj f x 1 2 � � � � � � � + +1 fi fk fj f +1 (a) (b) (c) (d) figure 7: various ternary ulms: (a) ulm of gf(3) shannon using three gf(3) multiplication gates and one gf(3) addition gate, (b) ulms of gf(3) davio {d0,d1,d2} using three gf(3) multiplication gates and one gf(3) addition gate, (c) ulm for all gf(3) davio expansions using four gf(3) multiplication gates, one gf(3) addition gate, two shift operators, and one multiplexer, and (d) general ulm of gf(3) s/d node using four gf(3) multiplication gates, one gf(3) addition gate and four multiplexers. corresponding if polarity, which contains the grm as a special case. since universal logic modules (ulms) are complete systems that can implement all possible logic functions utilizing s/d expansions of multiple-valued shannon and davio decompositions, the realization of the presented s/d trees utilizing the corresponding ulms is also introduced. the application of the presented tifs and tgifs can be used to find minimum gfsop for multiple-valued inputs-outputs within logic synthesis, where a gfsop minimizer based on if polarity can be used to minimize the corresponding multiple-valued gfsop expressions. 115 64 a. n. al-rabadi an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 65 an extended green sasao hierarchy of canonical ternary galois forms ... future work will include the following items: 1. 
the investigation of the various multiple-valued and two-valued techniques for the ulm implementation for the important case of quaternary logic. since the ulm realization over gf(3) extends and generalizes from the binary case of the gf(2) ulm realization where most digital system designs are performed the special case of the extension into gf(4) becomes important as this gf(4) ulm realization can be achieved by utilizing the currently existing and widely-used implementations over gf(2). this two-valued realization of quaternary ulm can be done for gf(4) addition utilizing gf(2) addition using vector of exors, (4/2) and (2/4) decoders, and for gf(4) multiplication utilizing gf(2) operations using vectors of xors, ands, (4/2) and (2/4) decoders, and therefore the gf(4) ulm producing quaternary shannon and all davio expansions can be achieved using the corresponding multiplexers and gf(4) additions and multiplications which can be accordingly realized using gf(2) methods. 2. the implementation of the introduced ulms using nanotechnology methods such as quantum computing, quantum dots and carbon nanotubes. 3. the utilization of evolutionary algorithms for function minimization using the introduced s/d trees. 4. the investigation using other more complex types of literals such as the presented generalized literal (gl) and universal literal (ul) to expand upon and consequently construct the corresponding new ulms. acknowledgment this research was performed during sabbatical leave in 2015-2016 granted from the university jordan and spent at philadelphia university references [1] a. n. al-rabadi, reversible logic synthesis: from fundamentals to quantum computing, springer-verlag, 2004. [2] a. n. al-rabadi, "three-dimensional lattice logic circuits, part iii: solving 3d volume congestion problem," facta universitatis electronics and energetics, vol. 18, no. 1, pp. 29 43, 2005. [3] a. n. 
al-rabadi, "three-dimensional lattice logic circuits, part ii: formal methods," facta universitatis electronics and energetics, vol. 18, no. 1, pp. 15-28, 2005.
[4] a. n. al-rabadi, "three-dimensional lattice logic circuits, part i: fundamentals," facta universitatis electronics and energetics, vol. 18, no. 1, pp. 1-13, 2005.
[5] a. n.
al-rabadi, "quantum logic circuit design of many-valued galois reversible expansions and fast transforms," journal of circuits, systems, and computers, world scientific, vol. 16, no. 5, pp. 641-671, 2007.
[6] a. n. al-rabadi, "representations, operations, and applications of switching circuits in the reversible and quantum spaces," facta universitatis electronics and energetics, vol. 20, no. 3, pp. 507-539, 2007.
[7] a. n. al-rabadi, "multi-valued galois shannon-davio trees and their complexity," facta universitatis electronics and energetics, vol. 29, no. 4, pp. 701-720, 2016.
[8] b. falkowski and l.-s. lim, "gray scale image compression based on multiple-valued input binary functions, walsh and reed-muller spectra," proc. ismvl, 2000, pp. 279-284.
[9] h. fujiwara, logic testing and design for testability, mit press, 1985.
[10] d. h. green, "families of reed-muller canonical forms," int. j. electronics, no. 2, pp. 259-280, 1991.
[11] m. helliwell and m. a. perkowski, "a fast algorithm to minimize multi-output mixed-polarity generalized reed-muller forms," proc. design automation conference, 1988, pp. 427-432.
[12] s. l. hurst, d. m. miller, and j. c. muzio, spectral techniques in digital logic, academic press inc., 1985.
[13] m. g. karpovsky, finite orthogonal series in the design of digital devices, wiley, 1976.
[14] s. m. reddy, "easily testable realizations of logic functions," ieee trans. comp., c-21, pp. 1183-1188, 1972.
[15] t. sasao and m. fujita (editors), representations of discrete functions, kluwer academic publishers, 1996.
[16] t. sasao, "easily testable realizations for generalized reed-muller expressions," ieee trans. comp., vol. 46, pp. 709-716, 1997.
[17] r. s. stanković, spectral transform decision diagrams in simple questions and simple answers, nauka, 1998.
[18] r. s. stanković, c. moraga, and j. t. astola, "reed-muller expressions in the previous decade," proc. reed-muller, starkville, 2001, pp. 7-26.
[19] s. b.
akers, "binary decision diagrams," ieee trans. comp., vol. c-27, no. 6, pp. 509-516, june 1978.
[20] r. e. bryant, "graph-based algorithms for boolean functions manipulation," ieee trans. on comp., vol. c-35, no. 8, pp. 667-691, 1986.
[21] d. k. pradhan, "universal test sets for multiple fault detection in and-exor arrays," ieee trans. comp., vol. 27, pp. 181-187, 1978.
[22] m. cohn, switching function canonical form over integer fields, ph.d. dissertation, harvard university, 1960.
[23] r. drechsler, a. sarabi, m. theobald, b. becker, and m. a.
perkowski, "efficient representation and manipulation of switching functions based on ordered kronecker functional decision diagrams," proc. dac, 1994, pp. 415-419.
[24] b. j. falkowski and s. rahardja, "classification and properties of fast linearly independent logic transformations," ieee trans. on circuits and systems-ii, vol. 44, no. 8, pp. 646-655, august 1997.
[25] a. gaidukov, "algorithm to derive minimum esop for 6-variable function," proc. int. workshop on boolean problems, freiberg, 2002, pp. 141-148.
[26] s. hassoun, t. sasao, and r. brayton (editors), logic synthesis and verification, kluwer acad. publishers, 2001.
[27] c. y. lee, "representation of switching circuits by binary decision diagrams," bell syst. tech. j., vol. 38, pp. 985-999, 1959.
[28] c. moraga, "ternary spectral logic," proc. ismvl, pp. 7-12, 1977.
[29] j. c. muzio and t. wesselkamper, multiple-valued switching theory, adam hilger, 1985.
[30] r. s. stanković, "functional decision diagrams for multiple-valued functions," proc. ismvl, 1995, pp. 284-289.
[31] b. steinbach and a. mishchenko, "a new approach to exact esop minimization," proc. reed-muller, starkville, 2001, pp. 66-81.
[32] i. zhegalkin, "arithmetic representations for symbolic logic," math. sb., vol. 35, pp. 311-377, 1928.
[33] a. n. al-rabadi, "carbon nano tube (cnt) multiplexers for multiple-valued computing," facta universitatis electronics and energetics, vol. 20, no. 2, pp. 175-186, 2007.
[34] a. n. al-rabadi, "closed-system quantum logic network implementation of the viterbi algorithm," facta universitatis electronics and energetics, vol. 22, no. 1, pp. 1-33, 2009.

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp.
571-584 doi: 10.2298/fuee1504571j

qrs complex detection based ecg signal artefact discrimination

borisav jovanović 1, vančo litovski 1, milan pavlović 2
1 university of niš, the faculty of electronic engineering, niš, serbia
2 university of niš, the faculty of medicine, niš, serbia

abstract. a new algorithm dedicated to electrocardiograph telemetry devices is proposed. it evaluates the quality of electrocardiogram signals acquired in unsupervised environments, raises the certainty of the produced diagnoses, and accelerates protective actions when necessary. the algorithm is intended for conditions in which electrocardiogram signals are highly susceptible to artefacts. it is based on a novel qrs detection method and is used in microprocessor-based telemetry devices with reduced computing power. the algorithm has been tuned on publicly available databases. the results of its use are also presented.

key words: ecg telemetry systems, ecg recordings quality

1. introduction

the quality of electrocardiogram (ecg) data influences the diagnosis results [1, 2]. the same holds for data acquired by ecg telemetry devices [3]. a potential limitation of telemetry devices is their immunity to measurement artefacts and their ability to measure discernible qrs and p waveforms in the presence of noise [4]. the very fact that the patient performs the measurement without supervision often has a detrimental impact on the quality of the acquired data, even though the patient receives training in how to use the device [4]. the ecg telemetry device itself should be able to distinguish ecg signals from artefacts. moreover, the device has to work autonomously and transmit data to doctors when a cardiac disorder happens. one approach fulfilling these requirements is proposed in this paper. we present a novel method for ecg signal quality assessment which is applied in the design of an ecg telemetry device.
it prevents the transmission of ecg data with insufficient signal quality and also enhances the detection of various cardiac disorders. the algorithm is tuned on a publicly available ecg recording database [5] and validated on clinical ecg data.

received september 15, 2014; received in revised form january 13, 2015
corresponding author: borisav jovanović, university of niš, the faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: borisav.jovanovic@elfak.ni.ac.rs)

2. related work

the ecg signals acquired by ecg devices are not immune to noise contamination. techniques used for noise elimination are numerous [6], [7], [8]. the technique proposed in [6] is considered inappropriate for implementation in telemetry devices because it requires multiple ecg leads for noise discrimination, whereas the structure of commercial telemetry devices is simplified: in order to be wearable, many devices measure only a single ecg lead. the technique proposed in [7] finds the locations of the qrs complex and other ecg signal waves to perform adaptive signal filtering. the technique proposed in [8] uses a light-emitting-diode-based sensor to measure the amount of skin stretching and performs the filtering process based on the sensor outputs.

when ecg signals are contaminated by large artefacts, filtering methods are not sufficiently successful in recovering the underlying ecg signals. for example, artefacts related to body movement lie in the frequency range of the electrocardiogram and often have shapes similar to qrs complexes. therefore, the signal and noise components cannot be easily discerned [9]. the signal-to-noise ratio cannot be calculated for ecg signals either, because the signal components interpreted as ecg signal in one application may be interpreted as noise in other applications [9].
additionally, as a consequence of aging, the amplitude of the r wave can decrease down to noise levels.

instead of suppressing artefacts, it is preferable to quantify the quality of an ecg signal. the assessment of signal quality is not completely solved, especially for telemetry devices with limited processing power [9]. some approaches use additional devices to identify the sections of the ecg signal that contain artefacts. for example, the work in [10] proposes an accelerometer sensor for movement detection. other methods rely exclusively on ecg recordings. the paper [11] presents a method which reduces the number of false alarms in coronary care units. the method for noise assessment presented in [12] requires two ecg leads. the algorithm first calculates the locations of the qrs complexes in the signal and then, using neural networks, determines whether each candidate is a qrs complex or an artefact. in the algorithm described in [13], the researchers estimate how much the suspected qrs complex deviates from the averaged qrs complex. recently, several algorithms dedicated to smart phone platforms have been proposed. one such method [14] is based on [13] and focuses on the assessment of ecg signal quality when the signal is measured in unsupervised environments. after a filter is applied to remove gross movement artefacts, the signal quality is estimated in the remaining ecg signal [14].

qrs complex detection is a basic step in almost every ecg analysis procedure. the performance of subsequent ecg analyses strongly depends on the robustness of the qrs detection [15]. therefore, ecg quality assessment based on qrs detection seems to be a practical approach that is suitable for most subsequent ecg analysis algorithms [9], and it has been adopted here as well. qrs detectors have been thoroughly studied and many methods have been proposed.
the qrs detection method was first described by nygards and sornmo [16] and was subsequently updated by the pan and tompkins method [17]. an open-source code [18] has been taken as the starting point for the new method for identification of qrs waves proposed in this paper. this algorithm is a modification of the pan and tompkins method [17]; it uses adaptive thresholds and has a scan-back procedure which looks back in time if no beats have been detected during a certain period.

3. the operation of telemetry devices

telemetry devices have emerged as a technology with great promise for identifying cardiac disorders that are not easily discernible by other ecg diagnostic devices. compared to other ecg recorders, ecg telemetry devices have the following advantages:
• they use wireless communications for instantaneous reporting of cardiac disorders,
• they have longer recording periods, long enough to capture arrhythmic episodes and pauses in heart rhythm,
• they have a very small size, which reduces the obtrusiveness of the recording process [4].

a novel algorithm for ecg signal quality assessment has been implemented in an ecg telemetry device. the device uses five conductive electrodes (fig. 1). the arm electrodes (r and l) are placed near the right and left shoulders (fig. 1). the leg electrodes (f and n) are placed on the lower abdomen, towards the legs. one additional precordial electrode [19] is placed at one of the anatomically referenced landmarks on the anterior chest (v1–v6) given in fig. 1. the precordial landmark v5 is mostly utilized since it provides the best sensitivity for the detection of myocardial ischemia [1]. the following standard leads are acquired: i, ii, iii, avr, avl, avf and one precordial lead, v5.

fig. 1 the data processing chain in an ecg telemetry device

the processing blocks of an ecg telemetry device are depicted in fig. 1.
the device receives analog ecg signals from electrodes through conductive patient cables. a brief description of similar amplifier circuits operating in an electrocardiograph dedicated to stress testing is given in [20], [21]. the analog signals are converted into digital form at a data rate of 500 samples per second and then processed by digital filters. after the filtering operation has been completed, the rr (r wave to r wave) intervals, representing the time periods between two consecutive r waves, are calculated for the detection of the following cardiac disorders:
• tachycardia (disorders having a high heart-rate rhythm) [22],
• bradycardia (a heart condition with a low heart-rate rhythm),
• pauses (abnormal delays between qrs waves),
• arrhythmia (irregular changes in heart-rate rhythm) [23].

the algorithm for ecg signal noise level estimation, which is implemented within the telemetry device, determines whether an ecg recording has acceptable quality for data transmission. the noise level estimation algorithm, in conjunction with the qrs complex detection method, is described in detail in the following sections.

4. the algorithm for peak detection and noise level estimation

4.1. qrs detection algorithm

the digital samples of the ecg signal are processed at a sampling frequency of 250 samples per second. the operations dedicated to heart beat detection and noise level estimation can be divided into the following groups:
• ecg signal pre-processing,
• peak detection,
• noise level estimation,
• qrs waves detection.

a brief description of the operations of the ecg pre-processing block is given in figure 2. the pre-processing block starts with band-pass filtering. two finite impulse response (fir) filters are used:
• a low-pass filter with a cut-off frequency of 15 hz,
• a high-pass filter with a cut-off frequency of 1 hz.

fig. 2 qrs wave detection operations

the absolute value of the signal's first derivative is calculated and the resulting signal is processed by a moving average filter.
the average value of the input signal is found over a 96 ms timing window. the duration of 96 ms was chosen after a thorough analysis performed using ecg data from the database [5]. we initially used the longer interval of 150 ms, which is used in [17]. we then changed the value to 96 ms because we found that the 96 ms timing window enables better detection of qrs complexes when ventricular tachycardia is present in an ecg signal. as a result of the pre-processing operations, each qrs complex produces a knoll at the output of the moving average block. the output is denoted by x[n] in figure 2.

the peak detection and noise level estimation blocks (figure 2) are novel and are described in detail in the following sections. the block denoted as qrs detection rules in fig. 2 is taken from [18]. the block classifies the peaks found by the peak detection block (fig. 2) as qrs complexes or noise, using the peak height, the peak location (relative to the last found qrs complex) and adaptive thresholds. if a peak is greater than the threshold, it is considered an r wave; otherwise it is classified as noise. in addition, all peaks that precede or follow larger peaks by less than 200 ms are discarded. the outcome of the detection rules block can be used further for beat classification, such as the detection of arrhythmias.

4.2. peak detection algorithm

the output signal of the pre-processing block, the digital signal x[n], enters the peak detection block at a data rate of 250 samples per second. examples of the ecg signal ecg[n] and the corresponding x[n] are depicted on the top and the middle panels of figure 3 respectively. as one can see from fig. 3, the qrs complexes in ecg[n] produce knolls in the signal x[n]. also, the knolls produced by p and t waves are considerably smaller than those created by r waves, and can therefore be successfully distinguished.
the peak detection algorithm generates two different signals at its output:
• peak[n], which is used by the qrs detection rules block for the identification of r waves. peak[n] is presented on the top panel of fig. 3.
• pulse[n], which is used by the noise level estimation block for artefact detection and is presented on the middle panel of fig. 3.

fig. 3 set of signals illustrating the peak detection and noise estimation algorithm. the top panel presents signals ecg[n] and peak[n], the middle panel x[n], oldmin[n], oldmax[n], ave[n] and pulse[n], the bottom panel signal state[n].

the algorithm for peak detection uses adaptive thresholds which are adjusted depending on the local minimum and maximum values of the signal x[n]. the local minimum and maximum values are named oldmin and oldmax respectively. the variables oldmin and oldmax are presented on the middle panel of fig. 3. oldmax is the largest value of the signal x[n] found during the last rr interval (fig. 3). oldmin represents the minimum value of x[n] found in the same interval. the algorithm calculates an additional variable ave (given on the middle panel of fig. 3), which approximates the mean value of the signal x[n].

the operation of the peak detection algorithm can be described by a finite state machine (fsm) which comprises three states denoted as s0, s1 and s2. the state transition diagram of the fsm is illustrated in figure 4. in the example of an ecg signal depicted in figure 3, the fsm states are shown on the bottom panel by the signal state[n].

fig. 4 the finite state machine states for peak detection

a qrs complex is recognized (a pulse is generated on the signal peak[n]) when the state transition s1->s2 occurs (fig. 4). during the isoelectric time interval of an ecg signal [1], which precedes the qrs complex, the fsm resides in state s0. the state is changed from s0 to s1 (fig.
4) when the rising edge of the signal x[n] is detected. this transition is possible when the condition given by eq. (1) is met.

$$\big(((x[n] - x[n-4]) > \delta) \wedge ((x[n-4] - x[n-8]) > \delta)\big) \wedge (x[n] > d) \tag{1}$$

the condition given by eq. (1) consists of two parts. the first part calculates the slope of the rising edge of the signal x[n] and checks whether it is greater than the constant value δ. the value δ is determined as the slope of the signal x[n] produced by an input ecg signal ecg[n] having the lowest slew rate. the lowest slew rate for an ecg is estimated by dividing the minimum r peak amplitude (taken from the range of 0.5 mv to 5 mv) by the maximum rise time of the qr interval (taken from the range of 17.5 ms to 52.5 ms) [24]. this gives a minimum slew rate of 0.5 mv / 52.5 ms = 0.0095 v/s. the second part of eq. (1) checks whether the amplitude of x[n] exceeds the threshold d. note that r peaks should be greater than 0.15 mv for a qrs detection.

the threshold value d is defined as a function of the variables ave, oldmin and oldmax and of the time interval measured after the qrs complex has been detected, as described in table 1. motivated by physiology, the time interval after the qrs complex has been detected is referred to as the refractory period. the refractory period is greater than 280 ms [1] and consists of the absolute and the relative refractory periods. the interval measured from the beginning of the qrs complex to the apex of the t wave is referred to as the absolute refractory period. the last half of the t wave is referred to as the relative refractory period. the absolute refractory interval starts when the fsm state changes from s2 to s0 and lasts for 200 ms. during this period, d is set to its maximum value. the absolute refractory period is followed by a relative refractory period lasting at least 80 ms, during which the next qrs complex becomes more likely to occur.
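the slew-rate arithmetic and the s0 to s1 test of eq. (1) can be sketched as follows. this is an illustration only: the function names are hypothetical, the three comparisons are assumed to be joined by logical and, and the conversion of the slew rate in v/s into the constant δ (which is expressed in units of x[n]) is not specified here and is left out.

```python
MIN_R_AMPLITUDE_V = 0.5e-3    # smallest R peak amplitude quoted in the text (0.5 mV)
MAX_QR_RISE_TIME_S = 52.5e-3  # longest QR rise time quoted in the text (52.5 ms)


def min_slew_rate():
    """Lowest expected ECG slew rate in V/s (about 0.0095 V/s)."""
    return MIN_R_AMPLITUDE_V / MAX_QR_RISE_TIME_S


def rising_edge(x, n, delta, d):
    """Condition (1) for the s0 -> s1 transition.

    Assumption: both slope comparisons and the amplitude comparison
    must hold at the same time (logical AND).
    """
    if n < 8:
        return False  # not enough history for x[n-8]
    slope_recent = x[n] - x[n - 4]
    slope_previous = x[n - 4] - x[n - 8]
    return slope_recent > delta and slope_previous > delta and x[n] > d


ramp = list(range(20))  # x[n] = n, i.e. a rise of 4 units per 4 samples
```

with the ramp above, `rising_edge(ramp, 10, delta=3, d=5)` holds, while raising either `delta` to 4 or `d` to 10 makes the strict comparisons fail.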
the d value used during the relative refractory period was determined empirically after a thorough analysis of ecg data from the database [5], and it depends on the value of variable ave. as the relative refractory interval elapses, the d value is decreased according to the equations given in table 1. because the algorithm is computationally optimized for execution on low-power microcontrollers, the multiplicative constants in table 1 are chosen so that the multiplications can be replaced by combinations of more time-efficient shift, add and subtract operations.

table 1 the threshold d value, used in relation (1) for the transition from state s0 to s1

timer value [ms]    threshold d
[0, 200)            oldmax*0.75
[200, 240)          ave*1.25+oldmin
[240, 280)          ave+oldmin
[280, +∞)           ave*0.875+oldmin

the state s1 of the fsm covers the rising edge and the peak of the knoll produced by the qrs complex (fig. 3). when the amplitude of x[n] becomes smaller than the threshold given by relation (2), the fsm changes its state from s1 to s2 (fig. 4).

$$ x[n] < 0.75\,(\mathit{max} - \mathit{oldmin}) + \mathit{oldmin} \tag{2} $$

the variable max found in eq. (2) represents the local maximum of x[n], calculated during the state s1. also, at the moment of transition from s1 to s2, the variable oldmax is updated with max (fig. 3). the pulse on signal peak[n] is generated when the fsm state is changed from s1 into s2 (fig. 3). peak[n] is later taken as an input by the detection rules block, which classifies detected peaks as either qrs waves or noise.

during the state s2 the falling edge of x[n] occurs. the state is changed from s2 to s0 if relation (3) is met:

$$ \big( x[n] < \mathit{oldmin} + 0.25\,(\mathit{oldmax} - \mathit{oldmin}) \big) \wedge \big( x[n] < 2 \cdot \mathit{oldmin} \big) \tag{3} $$

the transition from s2 to s1 (fig. 4) is also possible, and it is caused by artefacts in the ecg signal.
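the threshold schedule of table 1 can be sketched with shift, add and subtract operations only. this is a minimal illustration, not the authors' code: the function name threshold_d, the integer sample representation and the use of a flooring right shift are assumptions, and timer_ms is assumed to count milliseconds since the s2 -> s0 transition.

```python
def threshold_d(timer_ms: int, ave: int, oldmin: int, oldmax: int) -> int:
    """Threshold d from table 1, using shifts instead of multiplications.

    Assumes non-negative integer signal values; `>>` floors the result,
    so values are exact only when divisible by 4 (or 8).
    """
    if timer_ms < 200:
        # absolute refractory period: 0.75 * oldmax = oldmax - oldmax / 4
        return oldmax - (oldmax >> 2)
    if timer_ms < 240:
        # 1.25 * ave + oldmin = ave + ave / 4 + oldmin
        return ave + (ave >> 2) + oldmin
    if timer_ms < 280:
        # ave + oldmin
        return ave + oldmin
    # 0.875 * ave + oldmin = ave - ave / 8 + oldmin
    return ave - (ave >> 3) + oldmin


if __name__ == "__main__":
    # illustrative values only (not taken from the paper)
    for t in (0, 200, 240, 280):
        print(t, threshold_d(t, ave=80, oldmin=16, oldmax=400))
```

each constant differs from 1 by a power of two, which is what lets a single shift plus one add or subtract stand in for the multiplication.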
the state s2 changes into s1 when relation (4) is fulfilled:

$$ \big( x[n] > \mathit{min} + 0.25\,(\mathit{oldmax} - \mathit{oldmin}) \big) \wedge \big( x[n] > \mathit{min} + 10 \big) \tag{4} $$

the variable min found in eq. (4) stores the local minimum of signal x[n], calculated during the state s2. the variable ave is updated when the state changes from s2 to s0, so that it tracks the maximum of signal x[n] held in oldmax:

$$ \mathit{ave} = 0.875 \cdot \mathit{ave} + 0.125 \cdot \mathit{oldmax} \tag{5} $$

when the fsm is in s0, the value of ave is approximately halved after every time period equal to 2 seconds. this is achieved by reducing ave after every interval of 120 ms (30 ecg signal samples):

$$ \mathit{ave} = \mathit{ave} \cdot (1 - 2^{-5}) = 0.96875 \cdot \mathit{ave} \tag{6} $$

the multiplicative constant in equation (6) is chosen so that the multiplication can be replaced by shift, add and subtract operations. when the fsm changes from s2 to s0, oldmin is initialized with x[n]; at the transition from s2 to s1 the variables oldmin and max are initialized with the contents of the variables min and x[n], respectively (fig. 3).

4.3. noise level estimation algorithm

the part of the algorithm dedicated to noise estimation is described as follows. the signal pulse[n], which is used for ecg signal artefact detection, is updated at the fsm's state transitions and is depicted on the middle panel of figure 3. the state transitions at which non-zero values of pulse[n] are generated, together with their amplitudes, are defined in table 2. pulse[n] is set to the value of one of the variables min, oldmin or ave.

table 2 the fsm state transitions at which the pulses are produced on pulse[n] and corresponding values

state transition    amplitude
s0 -> s1            min
s1 -> s2            ave
s2 -> s1            oldmin, if (ave≤20)
s2 -> s1            min, if (ave>20)

the signal pulse[n] has to be normalized.
the algorithm is executed by a microcontroller with limited processing capabilities, which does not calculate with fixed-point numbers, so the value pulse[n] is first multiplied by the constant c=32, which is arbitrarily selected. the product is then divided by signal ave. the normalized signal pulse2[n] is described by eq. (7).

$$ \mathit{pulse2}[n] = 32 \cdot \mathit{pulse}[n] / \mathit{ave} \tag{7} $$

the result of the noise level estimation block is the signal noisy_interval[n], which indicates whether the ecg signal quality is 'acceptable' or 'unacceptable'. the block produces noisy_interval[n] equal to 1 when an interval is identified as noisy. the borders of noisy intervals are estimated from the amplitudes of the non-zero values produced on pulse2[n]. a noisy interval begins when two consecutive non-zero values greater than the threshold value th are detected. the noisy interval ends when a sequence of five adjacent non-zero values of pulse2[n] contains no two consecutive values greater than th. the th value is related to the value of constant c, and it was determined after an extensive analysis of ecg data from the database [5].

the threshold th=12 is used for the identification of noisy intervals in which the qrs complexes are often missed or falsely detected by the qrs detector. the artefacts within the detected intervals include motion artefacts and artefacts related to poor electrode contact. with the lower value th=6, the sensitivity of the noise level estimation algorithm is increased, and intervals are classified as noisy even when the artefacts do not influence qrs detection but may distort the detection of p or t waves. the noise detected using th=6 originates mostly from 50 hz-related artefacts and electromyography signals.

an example of ecg signals containing noisy segments is given in figure 5. the ecg signal ecg[n] is depicted on the top panel of figure 5. the middle panel presents the signal pulse2[n].
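the normalization of eq. (7) and the start/stop rule for noisy intervals can be sketched as follows. this is one possible reading of the rule, not the authors' implementation: the function names, the event-based representation (one entry per non-zero pulse2 value) and the guard against ave = 0 are assumptions made for illustration.

```python
from collections import deque


def normalize_pulse(pulse: int, ave: int, c: int = 32) -> int:
    """eq. (7) in integer arithmetic: pulse2 = c * pulse / ave."""
    # guard against division by zero (assumption, not stated in the paper)
    return (c * pulse) // ave if ave > 0 else 0


def has_adjacent_pair_above(values, th):
    """True if two consecutive values in `values` both exceed th."""
    vals = list(values)
    return any(a > th and b > th for a, b in zip(vals, vals[1:]))


def noisy_flags(pulse2_events, th=12):
    """Return one flag per non-zero pulse2 value (True = inside a noisy interval)."""
    window = deque(maxlen=5)  # the last five non-zero pulse2 values
    noisy = False
    flags = []
    for value in pulse2_events:
        window.append(value)
        if not noisy:
            # start: two consecutive non-zero values above th
            noisy = len(window) >= 2 and window[-1] > th and window[-2] > th
        elif len(window) == 5 and not has_adjacent_pair_above(window, th):
            # stop: five adjacent values with no consecutive pair above th
            noisy = False
        flags.append(noisy)
    return flags


if __name__ == "__main__":
    events = [3, 15, 20, 4, 5, 3, 2, 6, 1]  # made-up pulse2 values
    print(noisy_flags(events, th=12))
```

calling noisy_flags with th=6 instead of th=12 gives the more sensitive setting described in the text.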
the signals noisy_interval[n] and state[n] are depicted on the bottom panel of figure 5.

fig. 5 the identification of noisy segments in ecg signals. the top panel presents ecg[n], the middle panel pulse2[n] and the bottom panel state[n] and noisy_interval[n].

fig. 6 noisy segments of ecg signals having ectopic beats. the top panel presents ecg[n], the middle panel pulse2[n] and the bottom panel signals state[n] and noisy_interval[n].

another example of ecg signals, corrupted by the patient's movement, is depicted on the top panel of figure 6. the example shows the strength of the peak processing algorithm. in the example, regular beats are mixed with ectopic beats having significantly larger and wider qrs waves than regular qrs complexes. the middle panel presents the signal pulse2[n]. the signal noisy_interval[n] is shown on the bottom panel. despite the changes in signal morphology between normal and ectopic ecg beats, all qrs complexes are correctly identified. this can be observed from the signal state[n], which is shown on the bottom panel. in addition, the baseline drift does not influence the detection of qrs complexes.

4. the algorithm validation on public ecg record database

the algorithm's performance was validated on the physionet challenge database [5]. the database comprises one thousand 12-lead ecg recordings. the ecg signals are sampled at a rate of 500 hz, with a resolution of 16 bits per sample and 5 μv per bit. the duration of each ecg record in the database is 10 seconds. the ecgs were recorded by nurses, technicians and other volunteers with different amounts of training. after the ecg recordings had been made, they were interpreted by a group consisting of three cardiologists, five ecg analysts, ten people without prior ecg reading experience and five people with some previous ecg reading experience [25].
in order to estimate the quality of all ecg recordings from the set, each ecg recording was presented to the annotators and all the recordings were scored. 775 of the records were considered acceptable, 223 records unacceptable and 2 records undefined. it is worth mentioning that records with artefacts were labelled as acceptable if the annotators assumed that the record quality was satisfactory for medical personnel to make an accurate diagnosis [25].

the described algorithm for qrs complex detection and noise level estimation is implemented by program code written in matlab [26]. one thousand ecg recordings from the database [5] were analysed. the noisy signal segments, obtained using different threshold values th, are detected for all twelve standard leads of an ecg recording. the following seven measures are extracted for each ecg lead:
1. the portion of 'weak' signals in an ecg recording, which have r wave amplitudes smaller than 0.15 mv. r waves having such low amplitudes are assumed to be undetectable.
2. the portion of flat line segments in an ecg recording.
3. the portion of segments in a recording corrupted by high vertical spikes, which disable the qrs detection in the intervals following the spikes.
4. the value obtained by dividing the duration of noisy segments by the duration of the ecg recording. the noisy segments, which include motion and poor electrode contact related artefacts, are determined by signal noisy_interval[n]=1, calculated using the threshold value th=12.
5. the value obtained by dividing the duration of noisy segments by the duration of the ecg recording. the noisy segments are determined by noisy_interval[n]=1, calculated using th=6. the detected noise intervals include artefacts related to 50 hz interference and electromyography signals. the ecg signal segments considered by the previous measure are excluded.
6. the portion of noisy segments of the ecg signal caused by the noise sources covered by the previous two measures.
7. the value of rr interval variability, calculated as the square root of the mean of the squared differences between adjacent rr intervals, divided by the mean value of the rr intervals, using the following formula:

$$ \mathrm{variability}\,(\%) = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(rr_{i+1} - rr_i\right)^2}}{\frac{1}{n}\sum_{i=1}^{n} rr_i} \tag{8} $$

an artificial neural network is trained to learn the relationship between the calculated measures and the 'acceptable' or 'unacceptable' label of an ecg record. the total number of measures used in the training process for an ecg record is 84, consisting of seven distinct measures for each of the twelve standard ecg leads. the calculated measures were fed into a multilayer perceptron (mlp) artificial neural network. a back-propagation neural network (bpnn) with a three-layer feed-forward structure was used. the first layer is the input layer, which has 84 neurons as inputs (seven inputs for each of the twelve standard ecg leads). the second layer, called the hidden layer, has 10 neurons. one hidden layer has been proven to be sufficient [27]. the number of hidden neurons was obtained by a procedure based on methods given in the literature [27], [28]. the output layer has only one neuron, providing a quality estimate of the ecg record ('acceptable' or 'unacceptable'). in this study, the logistic function is used as the activation function for the hidden neurons. the weight and bias values in the bpnn are optimized using the levenberg-marquardt algorithm [29].

after the training process of the neural network had been completed, we used the same training set to test the neural network's effectiveness. out of 1000 recordings, the neural network correctly identified 715 recordings as being 'acceptable' and 208 recordings as being 'unacceptable'.
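the rr interval variability of measure 7, eq. (8), can be sketched as below. this is an illustrative reading rather than the authors' matlab code: the function name, the averaging over the available adjacent differences and the factor of 100 used to express the ratio in percent are assumptions.

```python
import math


def rr_variability_percent(rr):
    """Root mean square of successive rr differences divided by the mean rr, in %.

    `rr` is a sequence of at least two rr intervals (e.g. in milliseconds).
    """
    if len(rr) < 2:
        raise ValueError("at least two rr intervals are required")
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    rms_diff = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mean_rr = sum(rr) / len(rr)
    return 100.0 * rms_diff / mean_rr


if __name__ == "__main__":
    # made-up rr intervals in milliseconds
    print(rr_variability_percent([800, 900, 800, 900]))
```

a perfectly regular rhythm gives 0, while alternating intervals such as the made-up series above give a clearly non-zero value, which is why this measure helps separate clean records from records with missed or false qrs detections.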
let tp, tn, fp and fn denote true positives, true negatives, false positives and false negatives, respectively. tp counts correctly classified 'unacceptable' records, tn correctly classified 'acceptable' records, fp records incorrectly classified as 'unacceptable' and fn records incorrectly classified as 'acceptable'. the following results are obtained: tp=208, tn=715, fp=17 and fn=58. the results are characterized by the sensitivity, specificity and accuracy measures, described by the following equations:

$$ \mathrm{sensitivity}\,(\%) = \frac{tp}{tp + fn} \tag{9} $$

$$ \mathrm{specificity}\,(\%) = \frac{tn}{tn + fp} \tag{10} $$

$$ \mathrm{accuracy}\,(\%) = \frac{tp + tn}{tp + tn + fp + fn} \tag{11} $$

sensitivity is equal to 78.19%, specificity is 96.31% and classification accuracy is 92.48%.

5. discussion

the performances of several qrs detection algorithms, including the method from [18], are evaluated in [30] and [31]. the method has been developed and improved over a period of roughly 15 years, and the performance of the classification software is reported to be as good as or better than the performance reported for other qrs detection algorithms [30]. the same algorithm is also used as the starting variant of the qrs detector in [31], where it has been adapted to operate in environments with high noise and frequent signal losses.

the proposed algorithm significantly improves on the algorithm described in [18], especially when it processes noisy ecg signals. the novel algorithm identifies qrs waves more efficiently than the algorithm described in [18], regardless of qrs complex amplitude level, width and morphology. for example, the algorithm described in [18] suffered from false qrs detections when qrs waves are wide, which was confirmed by simulations and also evaluated on real clinical data. the novel algorithm does not produce false detections for the wide qrs complexes present in ectopic beats. furthermore, it discriminates well the p and t waves of the ecg signal, which may have amplitudes as high as r waves.
according to [31], the qrs detector [18] should be more robust when detecting signals with low amplitudes and frequent artefacts; for a highly noisy signal it fails to detect the peak location accurately [30]. one of the contributions of the proposed method is that it recognizes qrs complexes better than the algorithm presented in [18], particularly when qrs complexes have small amplitudes in the range from 0.15 mv to 0.5 mv. the algorithm can operate at rates of up to 250 heart beats per minute. moreover, the ecg signal segments with a large amount of artefacts are clearly identified by the noise level estimation block, which allows possible false arrhythmia and tachycardia detections to be rejected.

one of the results of the evaluation of the proposed algorithm on clinical ecg data is presented in figure 7. the figure shows the detection of qrs complexes in the presence of ventricular tachycardia. the ecg signal, presented on the top panel, was acquired by an ecg telemetry device implementing the algorithm we propose. the qrs complexes are indicated by signal state, shown on the bottom panel of figure 7. the signal noisy_interval[n] is equal to zero.

fig. 7 the detection of ventricular tachycardia (vt). the top panel presents signal ecg[n], the bottom panel state[n] indicating qrs detections and noisy_interval[n]

several studies have been reported on the topic of ecg signal quality assessment [32], [33], [25]. the studies return a simple binary score for an ecg record, which does not quantify the amount of artefacts, but just decides whether an ecg record is of acceptable quality. these ecg signal quality algorithms were validated on the physionet challenge database [5], comprising one thousand 12-lead ecg recordings. we have obtained an accuracy of 92.48%, which is comparable to the values found in [32], [33] and [25]. these accuracy values were obtained for the same input ecg dataset. for example, the method from [32] has an accuracy of 93.2%.
the method in [33] is a variant of [22] and has an accuracy equal to 92.6%. the other studies reported accuracy measures from 83.3 to 92.5% [25].

we want to emphasize the potential role of the proposed quality assessment algorithm in future ecg diagnostic devices. in particular, we assume that the algorithms we propose could improve the performance of ecg telemetry devices. additionally, the quality assessment method could be used in every other kind of ecg recorder, for example to help inexperienced nurses and technicians to record high quality ecg records. besides, in contrast to the algorithms [32], [33], which are computationally demanding and were originally conceived for smart phone applications, the method that we propose is computationally optimized for embedding in low-power microprocessor-based ecg diagnostic devices.

6. conclusion

the work presented in this paper demonstrates a framework for combining qrs detection and ecg signal quality assessment. such an approach exploits the covariant information of the noise and the relevant ecg data measured in unsupervised ecg signal acquisition environments. a more accurate false alarm reduction system has been developed, which is a must in novel wearable ecg telemetry devices. the results of this work indicate that the ecg record signal quality can be estimated using a qrs complex detection method which is based on the detection of heart refractory time intervals and additionally supported by artificial neural network techniques. it is shown that the algorithm has almost the same performance as the known state-of-the-art algorithms, with a considerable reduction in computational complexity. ecg telemetry devices can therefore be greatly improved with the use of the proposed ecg signal analysis routines.

references
[1] r. m. rangayyan, biomedical signal analysis – a case-study approach, wiley-ieee press, new york, 2002. [2] g. clifford, f. azuaje and p.
mcsharry, advanced methods and tools for ecg data analysis, artech house, inc. norwood, ma, 2006. [3] a. müller, w. scharner, t. borchardt, w. och and h. korb, "reliability of an external loop recorder for automatic recognition and transtelephonic ecg transmission of atrial fibrillation", journal of telemedicine and telecare, vol. 15, no. 8, pp. 391-391, 2009. [4] s. lobodzinski and m. laks, "new devices for very long-term ecg monitoring", cardiology journal, vol. 19, no. 2, pp. 210–214, 2012. [5] physionet 2011 physionet/computing in cardiology challenge 2011 http://www.physionet.org/challenge/2011 [6] f. la foresta, n. mammone and f. morabito, "neural nets", lecture notes in computer science, vol. 3931, pp. 78–82, berlin: springer, 2006. [7] n. v. thakor and y. s. zhu, "applications of adaptive filtering to ecg analysis: noise cancellation and arrhythmia detection", ieee trans. biomed. eng., vol. 38, issue 8, pp. 785–794, 1991. [8] y. liu and m. g. pecht, "reduction of skin stretch induced motion artefacts in electrocardiogram monitoring using adaptive filtering", in proceedings of 28th annu. int. conf. of the ieee engineering in medicine and biology conf., pp. 6045–6048, new york city, usa, 2006. [9] d. hayn, b. jammerbund and g. schreier, "qrs detection based ecg quality assessment", physiol. meas., vol. 33, no. 9, pp. 1449–1461, iop publishing, september 2012. [10] y. kishimoto, y. kutsuna and k. oguri, "detecting motion artefact ecg noise during sleeping by means of a tri-axis accelerometer", in proceedings of 29th annu. int. conf. of the ieee engineering in medicine and biology society, pp. 2669–2672, 2007. [11] j. allen and a. murray, "assessing ecg signal quality on a coronary care unit", physiol. meas., vol. 17, issue 4, pp. 249–258, 1996. [12] y. kigawa and k. oguri, "support vector machine based error filtering for holter electrocardiogram analysis", in proceedings of 27th annu. int. conf.
of the ieee engineering in medicine and biology society, pp. 3872–3875, 2005. [13] q. li, r. g. mark and g. d. clifford 2008 "robust heart rate estimation from multiple asynchronous noisy sources", physiol. meas., vol. 29, no.1, 15–32, 2008. [14] s. j. redmond, y. xie, d. chang, j. basilakis and n. h. lovell, "electrocardiogram signal quality measures for unsupervised telehealth environments", physiol. meas., vol. 33, no. 9 pp. 1517-1533, iop publishing, september 2012. [15] l. sornmo and p. laguna, bioelectrical signal processing in cardiac and neurological applications, academic press series in biomedical engineering, new york: academic, 2005. [16] m. e. nygårds, l. sörnmo, "delineation of the qrs complex using the envelope of the ecg", journal of medical and biological engineering and computing, vol. 21, issue 5 , pp 538-547, 1983. [17] j. pan and w. j. tompkins 1985 "a real-time qrs detection algorithm", ieee trans. biomed. eng. vol. 32, pp. 30–36, 1985 [18] p. s. hamilton, "open source ecg analysis", in proceedings of computers in cardiology conf., sept. 2002, pp. 101 – 104. [19] d. dubin, "rapid interpretation of ekg‟s" hong kong: cover, 2000. [20] b. jovanović, m. damnjanović and m. pavlović,"12-channel pc-based electrocardiograph", electronics, vol. 10, no.2, university of banja luka, december, 2006, pp.44-48. [21] b. jovanović, m. damnjanović, "novel pc-based cardiac stress test system", in proceedings of lv conf. etran, banjavrućica, bosnia and herzegovina, 06.06.-09.06., 2011, el 2.4. [22] q. li, c. rajagopalan c and g. d. clifford, "ventricular fibrillation and tachycardia classification using a machine learning approach". ieee trans on biomed eng., 2013. [23] v. fuster, l. e. ryden, d. s. cannom, h. j. crijns, a. b. curtis, k. a. ellenbogen, j. l. halperin, j. y. le heuze, g. n. kay, j. e. lowe, s. bertil olsson, e. n. prystowsky, j. l.tamargo, s. 
wann "acc/aha/esc 2006 guidelines for the management of patients with atrial fibrillation", circulation, vol.8, no. 9, pp. 651-745, 2006. [24] d. prutchi and m. norris, design and development of medical electronic instrumentation: a practical perspective of the design, construction, and test of medical devices, john wiley & sons, inc., hoboken, new jersey, 2004. [25] i. silva, g. moody and l. celi, "improving the quality of ecgs collected using mobile phones: the physionet/computing in cardiology challenge 2011" int. conf. on computing in cardiology, pp. 273– 276, 2011. [26] matlab version 7.6.0.324 natick, massachusetts: the mathworks inc., 2008. [27] t. masters, practical neural network recipes in c++, academic press, san diego, 1993. [28] g.-b. huang and h. a. babri, “upper bound on the number of hidden neurons in feed-forward networks with arbitrary bounded nonlinear activation function”, ieee trans. on neural networks, vol. 9, pp. 224228, 1998. [29] litovski, v., zwolinski, m., "vlsi circuit simulation and optimization", chapman and hall, london, 1997. [30] s. mohan, g.v. kadambi, v.k. reddy, m.d. deshpande, "development of an industry standard qrs detection algorithm for automated ecg analysis", sastech journal, vol. 7, no. 1, april 2008. [31] j. oster, j. behar, r. calloca, q. li, q, li and g. clifford, "open source java-based ecg analysis software and android app for atrial fibrillation screening", in proc. of int. conf. on computing in cardiology, vol. 40, pp. 731-734, 2013. [32] h. xia, g. garcia, j. mcbride, a. sullivan, t. de bock, j. bains, d.wortham and x. zhao 2011 "computer algorithms for evaluating the quality of ecgs in real time", in proc. of int. conf. on computing in cardiology, pp. 369–72, 2011. [33] g. clifford, d. lopez d, q. li and i. rezek, "signal quality indices and data fusion for determining acceptability of electrocardiograms collected in noisy ambulatory environments", in proc. of int. conf. on computing in cardiology, pp. 
285–288, 2011.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 317-328
doi: 10.2298/fuee1403317s

rapid exploration of cost-performance tradeoffs using dominance effect during design of hardware accelerators

reza sedaghat (1), anirban sengupta (2)
(1) electrical and computer engineering, ryerson university, toronto, canada
(2) computer science and engineering, indian institute of technology, indore, india

abstract. modern very large scale integration (vlsi) designs require a tradeoff between cost efficiency and performance (circuit speed). furthermore, the design space exploration (dse) of the cost-performance tradeoffs for multi-objective vlsi designs should also be fast and efficient in nature. this paper presents a novel accelerated dse approach for the exploration of cost-performance tradeoffs of modular multi-objective (tri-parametric, viz. cost, execution time and power consumption) vlsi hardware accelerators using hierarchical criterion analysis. the selection of the final design point is made after the tradeoffs are explored using the proposed approach. when applied to various benchmarks, the proposed approach yielded significant acceleration of the exploration process compared to current existing approaches with multi-parametric objectives.

key words: hardware accelerator, rapid, exploration, performance, cost

1. introduction

the design space exploration process generally takes into account two conflicting requirements: a) accurately searching for the optimal design point in the huge design space, and b) the time taken (or the number of architectures analyzed) to evaluate the architecture design space in order to select the optimal design point. the second requirement is more significant for modern multi-objective heterogeneous vlsi systems, because exhaustive exploration of the architecture space is prohibitive due to the massive size of the design space.
the architecture exploration process is therefore a battle between the optimal architecture determination and the speed of the exploration process. furthermore, since present generation vlsi systems are multi-objective in nature, they demand efficient exploration approaches that can satisfy the multi-objective requisite by concurrently reducing the time spent in the architecture evaluation as well as maximizing the opportunity of automating the exploration methodology [1]-[7].

received january 27, 2014
corresponding author: reza sedaghat, electrical and computer engineering, ryerson university, toronto, canada (e-mail: rsedagha@ee.ryerson.ca)

2. related works

exploration has been a subject of research for almost two decades. many approaches have been proposed in the recent past for fast and efficient evaluation of the design architecture space. the evaluation of the architecture design space has been performed by implementing an architecture configuration graph (acg) based on the hierarchical criterion factor [8], [9]. after the creation of the acg, the pareto optimal analysis is performed to find the optimal architecture. although the approach seems promising, its major drawback is the excessive time taken by the framework to build the architecture design space in order to analyze the variants. on the other hand, authors in [10] use an evolutionary algorithm, such as the genetic algorithm (ga), for efficiently searching for the optimal solution. they propose a new encoding scheme to improve the efficiency of the ga search for design space exploration. using chromosome representation, the precedence relationships among the tasks in the input behavioral specification are encoded with a topological order-based representation to specify schedule priorities. authors in [11] also use a ga based on binary encoding of the chromosome for efficient design space exploration.
additionally, authors in [12], [13] have developed a model that can assist designers at the system-level dse stage to explore the utilization of the reconfigurable resources and evaluate the relative impact of certain design choices. all the above mentioned approaches mostly consider dual objective dse (such as area and delay), whereas the proposed approach considers multi-objective problems (such as cost, delay and power consumption). in addition to the above, a problem space genetic algorithm for design space exploration of data paths has been proposed in [14]. the authors have used the concept of a heuristic/problem pair to convert a data flow graph into a valid schedule. the chromosome is encoded based on the 'work remaining' value of each node. one of the problems with approach [14] is that the second special parent chromosome, built in correspondence with the minimum functional units (i.e. serial implementation), does not differ from the first special chromosome in the work remaining field. this may not always lead to the optimal solution. furthermore, the cost function considers only latency and not total execution time. authors in [15] describe an approach to solve the dse problem which is based on ga and weighted sum particle swarm optimization (wspso). the authors use crossover between the parent and the local-best-solution, and then between the parent and the global-best-solution, to implement particle swarm optimization (pso) behavior. the authors do not consider velocity to update the position. moreover, in wspso the authors also do not consider user constraints for area and execution time in the cost function. in [16], authors describe another approach for dse in high level systems based on binary encoding of the chromosomes. however, they consider only traditional latency and not the execution time constraint for data pipelining. authors in [17] suggest that identification of a few superior design points from the pareto set is enough for an excellent design process.
the work shown in [18] discusses the optimization of area, delay and power in behavioral synthesis but does not consider execution time during data pipelining. the problem of design space exploration is also addressed in [19] by suggesting an order of efficiency, which assists in deciding preferences amongst the different pareto optimal points. authors in [20] introduce a tool called systemcodesigner that offers rapid design space exploration with rapid prototyping of behavioral systemc models. in [21] evolutionary algorithms such as the genetic algorithm (ga) have been suggested to yield better results for the design space exploration process. an automated tool was developed by integrating behavioral synthesis into their design flow, while authors in [22] describe current state-of-the-art high-level synthesis techniques for dynamically reconfigurable systems. additionally, authors in [23]-[25] also use a genetic algorithm for scheduling and resource allocation for data path synthesis. another class of scheduling methods employed previously was probabilistic in nature. for example, the simulated annealing (sa) and simulated evolution (se) based scheduling techniques have been used for the high level synthesis problem. authors in [26], [27] have proposed a simulated annealing scheduling method called 'salsa', which uses many probabilistic search operators to enhance the performance of the sa-based technique for high level synthesis problems. in addition, the authors have also proposed an extended binding model for handling the scheduling problem in high level synthesis. simulated evolution has been proposed by authors in [28] to solve the combined problem of scheduling and resource allocation in high level synthesis. unfortunately, approaches [23]-[28] do not consider execution time, chaining and data pipelining.
authors in [20], [29] proposed alternate approaches based on integer linear programming (ilp). although they are capable of providing good results, the computational complexity is massive and they therefore require an extensive amount of time. furthermore, the concept of data pipelining based on execution time was not shown during system trade-off. the work shown in [30] for dse suggests an evolutionary algorithm for successful evaluation of the design for an application specific soc. other well known tools for hls exist, such as gaut [31]. gaut takes a c/c++ behaviour description as input and automatically generates an rtl structure based on compulsory constraints of throughput (or initiation interval) and clock period. in addition, authors in [32] propose an open-source hls tool called legup for fpga-based processor/accelerator systems. legup is able to synthesize c language to hardware, thereby providing a convenient platform for hls. different fpga architectures are supported by this tool, which allows new scheduling algorithms and parallel accelerators. moreover, roccc, proposed in [33], is an open-source hls tool for generating an rtl structure from c. it was designed for kernels that perform computation intensive tasks, such as most dsp applications. therefore, roccc applies to a specific class of applications (streaming-oriented applications) and is not a general c-to-hardware compiler, unlike legup [32].

3. the proposed framework behind design space exploration

3.1 the proposed framework for cost model

the model for the cost of the resources is proposed in this section and is an extension of the authors' previous work [3]-[5] on the area model. let the area of the resources be given as 'a'. ri denotes the resources available for system designing, where 1<=i<=n. 'rclk' refers to the clock oscillator used as a resource providing the necessary clock frequency to the system.
the total area can be represented as the sum of the areas of all the resources used for designing the system, such as adder, multiplier, divider etc., and the clock frequency oscillator. the total area is shown in equations (1) and (2).
$$A = \sum A(R_i)$$ (1)
$$A = (N_{R1} K_{R1} + N_{R2} K_{R2} + \dots + N_{Rn} K_{Rn}) + A(R_{clk})$$ (2)
where $N_{Ri}$ represents the number of resource $R_i$ and $K_{Ri}$ represents the area occupied per unit resource $R_i$. let the total cost of all resources in the system be $C_R$. further, the cost per area unit of the resource (such as adders, multipliers etc.) is given as $C_{Ri}$ and the cost per area unit of the clock oscillator is $C_{Rclk}$. therefore the total cost of the resources is given as:
$$C_R = (N_{R1} K_{R1} + N_{R2} K_{R2} + \dots + N_{Rn} K_{Rn}) \cdot C_{Ri} + A(R_{clk}) \cdot C_{Rclk}$$ (3)
applying partial derivatives to equation (3) with respect to $N_{R1}, \dots, N_{Rn}$ and $A_{Rclk}$ yields equations (4) to (6) respectively, as shown below (here $C_{R1}, \dots, C_{Rn}$ denote the cost per area unit $C_{Ri}$ of the respective resource type):
$$\frac{\partial C_R}{\partial N_{R1}} = \frac{\partial [(N_{R1} K_{R1} C_{R1} + \dots + N_{Rn} K_{Rn} C_{Rn}) + A(R_{clk}) C_{Rclk}]}{\partial N_{R1}} = K_{R1} C_{R1}$$ (4)
$$\frac{\partial C_R}{\partial N_{Rn}} = \frac{\partial [(N_{R1} K_{R1} C_{R1} + \dots + N_{Rn} K_{Rn} C_{Rn}) + A(R_{clk}) C_{Rclk}]}{\partial N_{Rn}} = K_{Rn} C_{Rn}$$ (5)
$$\frac{\partial C_R}{\partial A_{Rclk}} = C_{Rclk}$$ (6)
according to the theory of approximation by differentials, the change in the total cost can be approximated by the following equation:
$$dC_R = \frac{\partial C_R}{\partial N_{R1}} \Delta N_{R1} + \dots + \frac{\partial C_R}{\partial N_{Rn}} \Delta N_{Rn} + \frac{\partial C_R}{\partial A_{Rclk}} \Delta A_{Rclk}$$ (7)
substituting equations (4) to (6) into equation (7) yields equation (8):
$$dC_R = \Delta N_{R1} K_{R1} C_{R1} + \dots + \Delta N_{Rn} K_{Rn} C_{Rn} + \Delta A(R_{clk}) \cdot C_{Rclk}$$ (8)
in equation (8), each term $\Delta N_{Rn} K_{Rn} C_{Rn}$ is the change of cost contributed by resource $R_n$ and the last term is the change of cost contributed by the clock resource. equation (8) thus represents the change in total cost of resources with a change in the number of all resources and the clock period (clock frequency). the pf for cost of resources is defined as follows:
$$PF(R1) = \frac{\Delta N_{R1} \cdot K_{R1} \cdot C_{Ri}}{N_{R1}}$$ (9)
$$PF(Rn) = \frac{\Delta N_{Rn} \cdot K_{Rn} \cdot C_{Ri}}{N_{Rn}}$$ (10)
$$PF(R_{clk}) = \frac{\Delta A(R_{clk}) \cdot C_{Rclk}}{N_{Rclk}}$$ (11)
equations (9) and (10) indicate the average deviation of cost with respect to change in resources $R_1, \dots, R_n$. note: this average deviation of cost helps in finding the dominance effect of the corresponding resource types on cost. further, equation (11) indicates the change of cost of the system with respect to change in resource $R_{clk}$ (i.e. the dominance effect of $R_{clk}$).
3.2 the framework used for execution time
this section introduces a new mathematical pf model for the clock oscillator resource, thus extending the authors' previous work [3]-[5] on the pf model of functional resources. the priority factor of the resources $R_1, \dots, R_n$ (such as adders, multipliers etc.) for the execution time is taken from [3]-[5], where it is defined as:
$$PF(Rn) = \frac{\Delta N_{Rn} \cdot T_{Rn}}{N_{Rn}} \cdot (T_p^{max})$$ (12)
the pf model for the clock oscillator is defined as:
$$PF(R_{clk}) = \frac{T_{Rclk}^{max} - T_{Rclk}^{min}}{N_{Rclk}}$$ (13)
in equation (13), $T_{Rclk}^{max}$ and $T_{Rclk}^{min}$ are the maximum and minimum values of execution time when all the available resources have their maximum value. the pf defined in equations (12) and (13) indicates the average change in execution time with a change in the number of a particular resource. this average deviation of execution time is used to find the dominance effect of the corresponding resource types on execution time.
3.3 the framework used for power consumption
pf for power consumption is defined as:
$$PF(Rn) = \frac{\Delta N_{Rn} \cdot K_{Rn}}{N_{Rn}} \cdot (p_c)^{max}$$ (14)
$$PF(Rm) = \frac{\Delta N_{Rm} \cdot K_{Rm}}{N_{Rm}} \cdot (p_c)^{max}$$ (15)
$$PF(R_{clk}) = \frac{N_{R1} T_{R1} + N_{R2} T_{R2} + \dots + N_{Rn} T_{Rn}}{N_{Rclk}} \cdot (\Delta p_c)$$ (16)
as explained above for cost and execution time, the priority factors for power consumption defined in equations (14), (15) and (16) indicate the average change in the total power consumption of the system with the change in number of resources at maximum clock frequency. therefore, equations (14), (15) and (16) indicate the dominance effect of resource types $R_n$, $R_m$ and $R_{clk}$ on the power metric.
4. proposed demonstration
4.1 system specifications
the case study of a selected benchmark has been provided for demonstration of the proposed method based on multiple real system specifications (as shown in table 1). the function of the selected second order digital iir chebyshev filter benchmark is given in (17).
$$y(n) = 0.041\,x(n) + 0.082\,x(n-1) + 0.041\,x(n-2) - 0.6743\,y(n-2) + 1.4418\,y(n-1)$$ (17)
x(n), x(n-1) and x(n-2) are the input vector variables for the function. the previous outputs are given by y(n-1) and y(n-2), while the present output is y(n).
table 1 system specifications and constraints
1) maximum cost of resources: 1588 area units
2) maximum time of execution: 200 µs (for d = 1000 sets of data)
3) power consumption: minimum
4) maximum resources available for the system design:
a) 3 adder/subtractor units
b) 3 multiplier units
c) 3 clock frequency oscillators: 24 mhz, 100 mhz and 400 mhz
5) no. of clock cycles needed for multiplier and adder/subtractor to finish each operation: 4 cc and 2 cc
6) area occupied by each adder/subtractor and multiplier: 12 area units (a.u.) and 65 a.u. on the chip (e.g. 12 clbs on fpga for adder/subtractor)
7) area occupied by the 24 mhz, 100 mhz and 400 mhz clock oscillator: 6 a.u., 10 a.u. and 14 a.u.
8) power consumed at 24 mhz, 100 mhz and 400 mhz: 10 mw/a.u., 32 mw/a.u. and 100 mw/a.u. respectively.
9) cost per area unit resource ($C_{Ri}$) = 10 units and cost per area unit clock oscillator = 8 units
4.2 arrangement of the design space (consisting of resources) in increasing orders of magnitude in the form of architecture tree for cost model
this paper proposes the use of a hierarchical tree topology for arrangement of design points in sorted order and exploration of the optimal design point. unlike the authors' previous works [3]-[5], which use a vector design space, this approach uses a more convenient topology for exploration. the tree structure is easy to construct and does not require a special algorithm to order the design space in increasing/decreasing order. the pf of the different resources for the cost model is given in the equations below:
$$PF(R1) = \frac{\Delta N_{R1} \cdot K_{R1} \cdot C_{Ri}}{N_{R1}} = \frac{(3-1) \cdot 12 \cdot 10}{3} = 80$$ (18)
$$PF(R2) = \frac{\Delta N_{R2} \cdot K_{R2} \cdot C_{Ri}}{N_{R2}} = \frac{(3-1) \cdot 65 \cdot 10}{3} = 433.33$$ (19)
$$PF(R_{clk}) = \frac{\Delta A(R_{clk}) \cdot C_{Rclk}}{N_{Rclk}} = \frac{(14-6) \cdot 8}{3} = 21.33$$ (20)
based on the pf calculated for the cost model, the architecture tree for cost can be constructed. the tree is constructed in such a way that the resource with the highest pf is assigned level (l1) in the tree, followed by level (l2) being assigned to the resource with the next highest pf, and finally the last level being assigned to the resource with the lowest pf. the resource with the highest pf influences the cost of the system the most, and the resource with the lowest pf influences it the least. after assigning the levels, the architecture tree comprising the design space is automatically arranged in increasing orders of magnitude for the cost model. the architecture tree for the cost model is shown in fig. 1. after the design space is sorted in increasing order of magnitude, searching is applied on the design space.
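the cost model and the priority factors above can be checked numerically. the following is a minimal sketch, assuming the table 1 values and assuming that the change in the number of a resource is its maximum count minus one, as in equations (18) and (19); all function and variable names are hypothetical and are not taken from the paper.

```python
# Illustrative sketch of equations (3) and (9)-(11) using the Table 1 values.
# Every name below is hypothetical; none of them comes from the paper.

K = {"adder": 12, "multiplier": 65}      # area per unit resource (a.u.)
N_MAX = {"adder": 3, "multiplier": 3}    # maximum number of units available
CLK_AREA = [6, 10, 14]                   # oscillator areas for 24, 100, 400 MHz (a.u.)
C_RI = 10                                # cost per area unit of a functional resource
C_RCLK = 8                               # cost per area unit of the clock oscillator


def total_cost(n_adder, n_mult, clk_area):
    """Equation (3): cost of the functional resources plus cost of the oscillator."""
    area = n_adder * K["adder"] + n_mult * K["multiplier"]
    return area * C_RI + clk_area * C_RCLK


def pf_cost_resource(name):
    """Equations (9)-(10): (delta N * K * C_Ri) / N, with delta N = N_max - 1."""
    n = N_MAX[name]
    return (n - 1) * K[name] * C_RI / n


def pf_cost_clock():
    """Equation (11): (delta A(Rclk) * C_Rclk) / N_Rclk."""
    return (max(CLK_AREA) - min(CLK_AREA)) * C_RCLK / len(CLK_AREA)


pf = {
    "adder": pf_cost_resource("adder"),
    "multiplier": pf_cost_resource("multiplier"),
    "clock": pf_cost_clock(),
}

# The resource with the highest PF is assigned level L1, the next one L2, and so on.
levels = sorted(pf, key=pf.get, reverse=True)

print(pf)
print(levels)
print(total_cost(3, 3, 14))
```

under these assumptions the multiplier has the largest priority factor, so it would occupy level l1 of the tree, followed by the adder/subtractor and the clock oscillator.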
a mixed searching approach is proposed in this work by combining the advantages of two well known searching algorithms, viz. interpolation search and binary search. previous works [3]-[5] employed binary search only. however, as highlighted in fig. 1, a mixed searching approach is proposed to further enhance the speed of the exploration process. interpolation search is used with the cost model in order to search for the border variant for cost, while for the execution time model binary search is used to find the border variant. interpolation search performs faster than binary search on uniformly sorted models, such as the design space for cost (cost is an increasing linear function of the number of resources, i.e. the cost of the system increases with an increase in the number of resources). on the other hand, binary search exploits the 'divide and conquer' approach. hence, it works faster on nonuniformly sorted models, such as execution time (execution time being a nonuniformly decreasing function of the number of resources, i.e. an increase in the number of resources does not always decrease the execution time, but may leave it unchanged). applying interpolation search on the sorted design space for cost, shown in fig. 1, yields the border variant in just 2 comparisons (cost is calculated according to eqn. (3)). the border variant for cost is the last variant in the design space (in fig. 1) which satisfies the specified constraint for cost. the border variant obtained for cost is 'v11'.
fig. 1 architecture tree representing the design space for cost arranged in increasing order
4.3 arrangement of the design space in decreasing orders of magnitude in the form of architecture tree for execution time model
the pf of the different resources used in system design for the execution time model is given in the equations below:
$$PF(R1) = \frac{\Delta N_{R1} \cdot T_{R1}}{N_{R1}} \cdot (T_p^{max}) = \frac{(3-1) \cdot 2}{3} \cdot (0.0416) = 0.055$$ (21)
$$PF(R2) = \frac{\Delta N_{R2} \cdot T_{R2}}{N_{R2}} \cdot (T_p^{max}) = \frac{(3-1) \cdot 4}{3} \cdot (0.0416) = 0.111$$ (22)
$$PF(R_{clk}) = \frac{T_{Rclk}^{max} - T_{Rclk}^{min}}{N_{Rclk}} = \frac{333.5 - 20.01}{3} = 104.50$$ (23)
as described in section 4.2 for cost, the architecture tree for execution time is constructed based on the pf calculated for execution time. the architecture tree obtained after construction is thus automatically arranged (sorted) in decreasing orders of magnitude. after arrangement, binary search is applied in order to find the border variant for execution time (execution time is calculated according to the model of execution time shown in [4]). the border variant for execution time is the first variant in the design space which satisfies the specified constraint for execution time. the border variant obtained is variant 'v5'. after the border variants for both cost and execution time are found, the pareto optimal set is derived as explained in [3]-[5]. the architecture tree for power consumption is constructed similarly, in increasing orders of magnitude for power consumption. among the variants of the pareto set, the one which appears first in the ascending sorted design space (in the tree) is the one with the minimum power consumption. it concurrently satisfies the constraints for cost, execution time and power consumption (specified in table 1) for the design problem. therefore the optimal variant obtained, which satisfies all the specified constraints, is variant 'v5' (marked bold red in fig. 1).
5. analysis and results
the results of the proposed approach using pf and the mixed searching scheme for rapid exploration of cost-performance tradeoffs are verified for a number of benchmarks. compared to the authors' previous works [3]-[5], the proposed approach is capable of further speeding up the exploration process. the search of the border architecture in the case of execution time (using binary search) requires only $\log_2 \prod_{i=1}^{n} v_{Ri}$ evaluations, where $n$ is the number of types of resources and $v_{Ri}$ is the number of variants of resource $R_i$. the search of the border architecture for the cost parameter (using interpolation search) requires $\log_2 \log_2 \prod_{i=1}^{n} v_{Ri}$ evaluations. in the design space exploration approach presented here, three objective parameters have been used; execution time and cost are the parametric constraints and power consumption is the optimization parameter. the total number of architecture evaluations performed during searching using the proposed method is given as:
$$\log_2 \log_2 \prod_{i=1}^{n} v_{Ri} + \log_2 \prod_{i=1}^{n} v_{Ri}$$
when applied on various benchmarks, the proposed approach showed a large speedup compared to the exhaustive approach. the proposed method was also compared with a current approach in [8], [9]. the speedup obtained compared to [8], [9] for small and large size benchmarks is shown in tables 2 and 3 respectively. moreover, the proposed approach has also been compared with a heuristic approach (wspso) [15]. as evident from tables 4 and 5, the proposed approach performs fewer architecture evaluations than [15] for small and large benchmarks respectively. for example, in the case of mpeg mmv (shown in table 5) the proposed approach performs only 14 evaluations, while [15] performs 53 evaluations to find a final solution.
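the border search on a sorted design space can be sketched as follows. this is an illustration only: it uses a plain binary search over a list of hypothetical execution times sorted in decreasing order (the values are invented for the example, not taken from the benchmarks), counts how many variants are evaluated, and evaluates the expression for the total number of evaluations stated above. the function names are assumptions.

```python
import math


def first_satisfying(values, satisfies):
    """Binary search for the first index whose value satisfies the constraint.

    Assumes the predicate is False on a (possibly empty) prefix and True
    afterwards, as for execution times sorted in decreasing order.
    Returns (index or None, number of evaluated variants).
    """
    lo, hi = 0, len(values)
    evaluations = 0
    while lo < hi:
        mid = (lo + hi) // 2
        evaluations += 1
        if satisfies(values[mid]):
            hi = mid          # border is at mid or to its left
        else:
            lo = mid + 1      # border is strictly to the right of mid
    return (lo if lo < len(values) else None), evaluations


def estimated_evaluations(variants_per_resource):
    """log2(log2(prod v_Ri)) + log2(prod v_Ri), the expression given in the text."""
    total = math.prod(variants_per_resource)
    return math.log2(math.log2(total)) + math.log2(total)


# Hypothetical execution times (microseconds) of 27 variants, in decreasing order.
times = [400 - 10 * i for i in range(27)]
border, used = first_satisfying(times, lambda t: t <= 200)

print(border, used)
print(estimated_evaluations([3, 3, 3]))
```

the same routine could be applied to the cost tree by reversing the predicate; the interpolation search used in the paper for cost is not reproduced here.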
table 2 experimental results of comparison between proposed dse approach and the current approach [8], [9] for small benchmarks
| benchmarks [2],[34],[35] | total possible architectures in the design space for one parameter | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [8],[9] | percentage speedup compared to [8],[9] |
|---|---|---|---|---|---|---|
| iir chebyshev filter | 27 | 4 | 6 | 10 | 18 | 44.44 % |
| mesa horner | 36 | 5 | 6 | 11 | 19 | 42.10 % |
| elliptic wave filter | 78 | 5 | 7 | 12 | 19 | 36.84 % |
| differential equation solver (hal) | 90 | 5 | 7 | 12 | 19 | 47.82 % |
| bpf | 100 | 5 | 8 | 13 | 21 | 38.09 % |
(architecture evaluation columns give the number of variants analyzed.) average speedup using proposed approach compared to [8],[9]: 41.85 %
table 3 experimental results of comparison between proposed dse approach and the current approach [8], [9] for large benchmarks
| benchmarks [2],[34],[35] | total possible architectures in the design space for two parameters | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [8],[9] | percentage speedup compared to [8],[9] |
|---|---|---|---|---|---|---|
| auto regressive filter | 144 | 5 | 8 | 13 | 21 | 38.09 % |
| mpeg mmv | 200 | 5 | 9 | 14 | 23 | 39.13 % |
| matrix multiplication | 400 | 6 | 10 | 16 | 25 | 36 % |
| jpeg_idct | 900 | 6 | 11 | 17 | 27 | 37.03 % |
(architecture evaluation columns give the number of variants analyzed.) average speedup using proposed approach compared to [8],[9]: 37.56 %
table 4 experimental results of comparison between proposed dse approach and the current approach [15] for small benchmarks
| benchmarks [2],[34],[35] | total possible architectures in the design space for one parameter | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [15] | percentage speedup compared to [15] |
|---|---|---|---|---|---|---|
| iir chebyshev filter | 27 | 4 | 6 | 10 | 17 | 41 % |
| mesa horner | 36 | 5 | 6 | 11 | 21 | 47 % |
| elliptic wave filter | 78 | 5 | 7 | 12 | 31 | 61 % |
| differential equation solver (hal) | 90 | 5 | 7 | 12 | 32 | 62.5 % |
| bpf | 100 | 5 | 8 | 13 | 35 | 62 % |
(architecture evaluation columns give the number of variants analyzed.) average speedup using proposed approach compared to [15]: 48.7 %
table 5 experimental results of comparison between proposed dse approach and the current approach [15] for large benchmarks
| benchmarks [2],[34],[35] | total possible architectures in the design space for two parameters | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [15] | percentage speedup compared to [15] |
|---|---|---|---|---|---|---|
| auto regressive filter | 144 | 5 | 8 | 13 | 52 | 75 % |
| mpeg mmv | 200 | 5 | 9 | 14 | 53 | 73.5 % |
| matrix multiplication | 400 | 6 | 10 | 16 | 65 | 75.3 % |
| jpeg_idct | 900 | 6 | 11 | 17 | 72 | 76.3 % |
(architecture evaluation columns give the number of variants analyzed.) average speedup using proposed approach compared to [15]: 75 %
6. conclusions
this paper presented a novel framework for rapid exploration of the cost-performance tradeoffs for modular multi-objective hardware accelerators. once the design space for cost and performance is explored, the final design point with minimum power consumption is searched from the small pareto optimal set obtained. for the different benchmarks, the proposed dse approach required fewer architecture evaluations than the existing approaches [8], [9] and [15].
acknowledgement: this work is supported by the optimization and algorithm research lab (opral), ryerson university, canadian microelectronics corporation (cmc), motorola, nserc crsng, ontario innovation trust and sun microsystems. additionally, this work acknowledges the assistance provided by science and engineering research board (serb), department of science and technology, govt. of india.
references
[1] g. de micheli, “synthesis and optimization of digital circuits”. mcgraw-hill: new york, 1994.
[2] saraju p. mohanty, nagarajan ranganathan, elias kougianos and priyadarsan patra, “low-power high-level synthesis for nanoscale cmos circuits”, chapter: high-level synthesis fundamentals, springer us, 2008.
[3] anirban sengupta, reza sedaghat, zhipeng zeng, “a high level synthesis design flow with a novel approach for efficient design space exploration in case of multi parametric optimization objective”, microelectronics reliability, science direct, elsevier, volume 50, issue 3, march 2010, pp. 424-437.
[4] zhipeng zeng, reza sedaghat, anirban sengupta, “a framework for fast design space exploration using fuzzy search for vlsi computing architectures”, accepted to appear in the proceedings of 2010 ieee international symposium on circuits and systems (iscas), june 2, 2010.
[5] anirban sengupta, reza sedaghat, zhipeng zeng, “rapid design space exploration for multi parametric optimization of vlsi designs”, proceedings of 2010 ieee international symposium on circuits and systems (iscas), june 2, 2010, paris, france, article # 2016 (session: logic & high-level synthesis, c2l-f).
[6] anirban sengupta, reza sedaghat, zhipeng zeng, “hardware efficient design of speed optimized power stringent application specific processor”, proceedings of ieee 21st international conference on microelectronics (icm), morocco, december 22, 2009, pp. 167-170.
[7] d. gajski, n. dutt, a. wu, and s. lin, “high level synthesis: introduction to chip and system design”. kluwer: norwell, ma, 1992.
[8] kirischian, l., geurkov, v., kirischian, v. and terterian, i., “multi-parametric optimisation of the modular computer architecture”, int. j. technology, policy and management, vol. 6, no. 3, 2006, pp. 327–346.
[9] kirischian, l., “optimization of parallel task execution on the adaptive reconfigurable group organized computing system”, proc. of international conference parelec 2000, canada, pp. 150–154.
[10] vyas krishnan and srinivas katkoori, “a genetic algorithm for the design space exploration of datapaths during high-level synthesis”, ieee transactions on evolutionary computation, vol. 10, no. 3, june 2006, pp. 229-313.
[11] e. torbey and j. knight, “performing scheduling and storage optimization simultaneously using genetic algorithms,” in proc. ieee midwest symp. circuits systems, 1998, pp. 284–287.
[12] giuseppe ascia, vincenzo catania, alessandro g. di nuovo, maurizio palesi, davide patti, “efficient design space exploration for application specific systems-on-a-chip”, journal of systems architecture 53 (2007), pp. 733–750.
[13] c. h. gebotys and m. i. elmasry, “global optimization approach for architectural synthesis,” ieee trans. comput.-aided des., vol. 12, 1993, pp. 1266–1278.
[14] m. k. dhodhi, f. h. hielscher, r. h. storer, and j. bhasker, “datapath synthesis using a problem-space genetic algorithm,” in ieee trans. comput.-aided des., vol. 14, 1995, pp. 934–944.
[15] harish ram d. s., m. c. bhuvaneswari, and shanthi s. prabhu, “a novel framework for applying multiobjective ga and pso based approaches for simultaneous area, delay, and power optimization in high level synthesis of datapaths”, vlsi design, hindawi, 2012, article id 273276, 12 pages.
[16] e. torbey and j. knight, “high-level synthesis of digital circuits using genetic algorithms,” in proc. int. conf. evol. comput., may 1998, pp. 224–229.
[17] alessandro g. di nuovo, maurizio palesi, davide patti, “fuzzy decision making in embedded system design,” proceedings of the 4th international conference on hardware/software codesign and system synthesis, october 2006, pp. 223-228.
[18] a. c. williams, a. d. brown and m. zwolinski, “simultaneous optimisation of dynamic power, area and delay in behavioural synthesis”, iee proc.-comput. digit. tech., vol. 147, no. 6, 2000, pp. 383-390.
[19] i. das, “a preference ordering among various pareto optimal alternatives”, structural and multidisciplinary optimization, 18(1), aug. 1999, pp. 30–35.
[20] christian haubelt, thomas schlichter, joachim keinert, mike meredith, “systemcodesigner: automatic design space exploration and rapid prototyping from behavioral models”, proceedings of the 45th annual acm ieee design automation conference, 2008, pp. 580-585.
[21] j. c. gallagher, s. vigraham, and g. kramer, “a family of compact genetic algorithms for intrinsic evolvable hardware,” ieee trans. evolutionary computation, vol. 8, no. 2, apr. 2004, pp. 111–126.
[22] xuejie zhang and kam w. ng, “a review of high-level synthesis for dynamically reconfigurable fpgas”, microprocessors and microsystems, elsevier, volume 24, issue 4, 2000, pp. 199-211.
[23] r. m. san and j. p. knight, “genetic algorithms for optimization of integrated circuit synthesis,” in proc. 5th int. conf. genetic algorithms, san mateo, ca, 1993, pp. 432–438.
[24] r. j. cloutier and d. e. thomas, “the combination of scheduling, allocation and mapping in a single algorithm,” in proc. 27th design automation conf., jun. 1990, pp. 71–76.
[25] n. wehn et al., “a novel scheduling and allocation approach to datapath synthesis based on genetic paradigms,” in proc. ifip working conf. logic architecture synthesis, 1991, pp. 47–56.
[26] g. krishnamoorthy and j. a. nestor, “data path allocation using extended binding model,” in proc. 32nd acm/ieee design automation conf., 1992, pp. 279–284.
[27] j. a. nestor and g. krishnamoorthy, “salsa: a new approach to scheduling with timing constraints,” ieee trans. comput.-aided des., vol. 12, 1993, pp. 1107–1122.
[28] t. a. ly and j. t. mowchenko, “applying simulated evolution to high level synthesis,” ieee trans. comput.-aided des., vol. 12, no. 2, feb. 1993, pp. 389–409.
[29] c. t. hwang, j. h. lee, y. c. hsu, and y. l. lin, “a formal approach to the scheduling problem in high-level synthesis,” ieee trans. comput.-aided des., vol. 10, no. 2, feb. 1991, pp. 464–475.
[30] giuseppe ascia, vincenzo catania, alessandro g. di nuovo, maurizio palesi, davide patti, “efficient design space exploration for application specific systems-on-a-chip”, journal of systems architecture 53, 2007, pp. 733–750.
[31] p. coussy, c. chavet, p. bomel et al., “gaut: a high-level synthesis tool for dsp applications”, in high-level synthesis: from algorithm to digital circuits, springer, 2008, pp. 147-169.
[32] canis, a., choi, j., aldham, m., zhang, v., kammoona, a., czajkowski, t., brown, s. d., and anderson, j. h., “legup: an open-source high-level synthesis tool for fpga-based processor/accelerator systems”, acm trans. embedd. comput. syst. 13, 2, article 24 (september 2013), 27 pages.
[33] villarreal, j., park, a., najjar, w., and halstead, r., “designing modular hardware accelerators in c with roccc 2.0”, in proceedings of the ieee international symposium on field-programmable custom computing machines, 2010, pp. 127–134.
[34] http://www.cbl.ncsu.edu/benchmarks/
[35] http://express.ece.ucsb.edu/benchmark/

facta universitatis series: electronics and energetics vol. 32, no 4, december 2019, pp.
539-554 https://doi.org/10.2298/fuee1904539p © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd
on the node ordering of progressive polynomial approximation for the sensor linearization
aneta prijić, aleksandar ilić, zoran prijić, emilija živanović, branislav randjelović
university of niš, faculty of electronic engineering, niš, serbia
received february 5, 2019; received in revised form july 1, 2019
corresponding author: aneta prijić, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: aneta.prijic@elfak.ni.ac.rs)
abstract. many sensors exhibit nonlinear dependence between their input and output variables, and specific techniques are often applied for the linearization of their transfer characteristics. some of them include additional analog circuits, while others are based on different numerical procedures. one commonly used software solution is progressive polynomial approximation. this method for sensor transfer function linearization shows strong dependence on the order of selected nodes in the linearization vector. there are several modifications of this method which enhance its effectiveness but require extensive computational time. this paper proposes a methodology that improves on progressive polynomial approximation without an additional increase in complexity. it concerns the order of linearization nodes in the linearization vector. the optimal order of nodes is determined on the basis of the sensor transfer function concavity. the proposed methodology is compared to the previously reported methods on a set of analytical functions. it is then implemented in a temperature measurement system using a set of thermistors with negative temperature coefficients. it is shown that its implementation in the low-cost microcontrollers integrated into the nodes of reconfigurable sensor networks is justified.
key words: sensor linearization, progressive polynomial approximation, reconfigurable sensor networks, ntc thermistor
1. introduction
transfer functions of sensors used in measurement systems usually do not have linear dependence between input and output variables. in addition, transfer functions often change with time. for these reasons, measurement systems based on such sensors exhibit various errors such as offset, gain, hysteresis, cross-sensitivity, drift, and non-linearity [1], [2]. in order to achieve reliable measurement, these errors should be compensated. one approach is to use additional analog circuits to condition the sensor output signal [3], [4], [5]. however, analog compensation is not always appropriate for sensors integrated into reconfigurable sensor networks [1], [6], [7]. a more flexible solution is to convert the sensor output into the digital domain, where various numerical linearization methods may be applied in the form of compensation algorithms. these methods rely on a set of correction functions applied on a so-called linearization vector composed of linearization nodes. the effectiveness of a linearization method is evaluated on the basis of the number of nodes required to reduce non-linearity below a specific value, computation time, and implementation complexity. the simplest linearization method is based on a look-up table (lut), which is also the fastest one. however, to obtain a high accuracy of the estimated input value, a high number of linearization nodes should be implemented in the lut, making it memory consuming. to reduce the memory requirement, a sparse lut can be combined with an interpolation method [8]. a simple method is piece-wise linear interpolation, which connects each two adjacent lut values with an appropriate linear function. this type of interpolation can also be used for linearization of the whole transfer function.
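as an illustration of the sparse lut combined with piece-wise linear interpolation, the following minimal sketch estimates the sensor input from a measured output. the quadratic sensor characteristic, the five nodes and the function name are assumptions made for the example only; the sketch also assumes that the stored outputs are strictly increasing.

```python
from bisect import bisect_right


def linearize_lut(y, lut_y, lut_x):
    """Estimate the sensor input x from the output y by piece-wise linear
    interpolation between adjacent look-up table entries.

    lut_y must be strictly increasing; lut_x holds the matching inputs.
    Outputs outside the table are clamped to the first or last node.
    """
    if y <= lut_y[0]:
        return lut_x[0]
    if y >= lut_y[-1]:
        return lut_x[-1]
    i = bisect_right(lut_y, y) - 1                 # y lies in segment [i, i + 1]
    slope = (lut_x[i + 1] - lut_x[i]) / (lut_y[i + 1] - lut_y[i])
    return lut_x[i] + slope * (y - lut_y[i])


# Hypothetical sensor with y = x**2 on [0, 1], sampled at five equidistant nodes.
lut_x = [0.0, 0.25, 0.5, 0.75, 1.0]
lut_y = [x * x for x in lut_x]

print(linearize_lut(0.25, lut_y, lut_x))      # a stored node
print(linearize_lut(0.15625, lut_y, lut_x))   # a point between two nodes
```

with five nodes the table needs four linear segments; between nodes the estimate deviates from the true input, which is why strongly nonlinear characteristics need many nodes with this method.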
for n linearization nodes, the sensor's inverse transfer function will be represented by n−1 first order polynomials [9]. the main disadvantage of this method is the high number of nodes required for the linearization of highly nonlinear functions. this implies either a large memory requirement or a slow response time of the system [8]. more advanced methods are lagrange, newton [10] and spline [11] interpolations. however, lagrange interpolation suffers from overfitting for polynomials of higher degree and it is generally not applicable to highly nonlinear sensors [10], [12]. the newton method is more flexible and efficient when additional linearization nodes are introduced, but it is primarily applied for equidistant nodes [13]. on the other hand, spline interpolation is effective, but it comes at high implementation costs [14]. more effective and more commonly used methods are progressive polynomial approximation (ppa) [15], [16] and linearization methods based on artificial neural networks (ann) [17], [18]. the effectiveness of the ppa method depends not only on the number of linearization nodes but also on the node ordering in the linearization vector. results obtained using the same compensation algorithm, but with a different order of nodes (permutation) [19], may vary from almost perfect in some cases to even increased non-linearity in others. in the case of the ann method, effectiveness depends on the neural network topology and the time needed for its training. this paper proposes a methodology that improves the accuracy of the ppa while keeping its simplicity. theoretical background, a summary of ppa, and an overview of its modifications are presented in section 2. section 3 contains an analysis of the ppa method effectiveness considering different permutations of the linearization vector for four different functions. two of these functions are convex and two concave.
this is done in order to elaborate on the idea that the optimal order of nodes in the linearization vector depends on the transfer function shape. in this way, the extensive computation time needed to accomplish the desired linearity by analysis of all permutations may be avoided. experimental support of the presented numerical results is given in section 4, using negative temperature coefficient (ntc) thermistors as sensors.
2. linearization methods
2.1. underlying theory
the transfer function of sensors is usually expressed as $y = f(x)$, where x is sensor input and y is sensor output [20]. the linearization method calculates the desired output value $t$, using the workflow depicted in fig. 1. sensor output is first digitized, using an analog-to-digital converter (adc), and then the linearization algorithm is applied. the obtained output value should vary linearly with the sensor input, i.e. $t = k \cdot x + n$, where k is the gain and n (usually 0) is the offset of the desired transfer function [1]. all operations are performed by a microcontroller, which is a part of the reconfigurable sensor node.
fig. 1 workflow of a sensor linearization process
there are two types of linearization methods, as illustrated in fig. 2. the first involves an estimation of the sensor transfer function and subsequent numerical determination of the sensor inverse transfer function $f^{-1}$, which is used to obtain the estimated input value $\hat{x} = f^{-1}(y)$. the linearized output value is $t = k \cdot \hat{x} + n$. methods of the second type modify the sensor output y using a correction function $h(y)$, so the linearized output is calculated as $t = h(y)$. the estimated input value, in this case, is calculated as $\hat{x} = (t - n)/k$.
fig. 2 linearization of a sensor output for two distinct methods.
the linearization node is a pair of two values: input x and corresponding output y. values are determined experimentally by applying a known stimulus at the sensor input and measuring the value of its output.
Input values are usually chosen equidistantly, starting at the minimal input that the sensor can detect and ending at full scale. The nodes are then ordered to form the linearization vector. The linearization coefficients, implemented into the correction function $h(y)$, are determined using the linearization vector and then stored in the memory of a microcontroller. On each measurement, these coefficients are used to calculate the linear output value of the sensor.

The effectiveness of linearization is evaluated using the relative non-linearity [21]:

$\delta = \dfrac{\Delta x_{max}}{x_{FS}}$, (1)

where $x_{FS}$ is the full-scale input and $\Delta x_{max}$ is the maximum deviation of the real input from the ideal transfer characteristic. Due to the nature of the linearization algorithm, the non-linearity is equal to zero at every linearization node if the quantization error introduced by A/D conversion is neglected. Therefore, additional measurements need to be performed to form a set of $N_e$ evaluation nodes and calculate the maximum deviation as:

$\Delta x_{max} = \max\left(|x_i - \hat{x}_i|\right)$ for $i = 1, \ldots, N_e$, (2)

where $x_i$ is the applied input value and $\hat{x}_i$ is the input value estimated from the corresponding output.

2.2. Progressive polynomial approximation (PPA)

Calculation of the correction function in PPA is a successive process [15]. For each linearization node, a new correction function, denoted as $h_i(y)$, is defined. The $i$-th function corrects the non-linearity around the $i$-th node while keeping the corrections introduced by the previously defined functions. The final correction function is the one defined for the last node.

To calculate the correction functions, the linear output value $t_i$ at each node is defined as:

$t_i = k x_i + n$ for $i = 1, \ldots, N$, (3)

where $N$ is the number of linearization nodes. These values are then used to calculate the linearization coefficients of the correction functions. The first correction function is defined as:

$h_1(y) = y + a_1$, (4)

where $a_1$ is the linearization coefficient calculated for the first linearization node:

$a_1 = t_1 - y_1$. (5)

Thus, $h_1(y)$ adds the value $a_1$ to the sensor output, eliminating the offset (if present). The correction function at the $i$-th node is defined as:

$h_i(y) = h_{i-1}(y) + a_i \prod_{j=1}^{i-1} \left( h_j(y) - t_j \right)$ for $i = 2, \ldots, N$, (6)

and the linearization coefficient $a_i$ is calculated as:

$a_i = \dfrac{t_i - h_{i-1}(y_i)}{\prod_{j=1}^{i-1} \left( h_j(y_i) - t_j \right)}$ for $i = 2, \ldots, N$. (7)

These correction functions eliminate the gain error and successively minimize the non-linearity of the transfer function. Since the final correction function includes all the linearization nodes, it outputs the desired value at each of these nodes, while between them the linearized output deviates from the ideal one to some extent.

2.3. Modifications of the PPA method

In PPA, the first two nodes in the linearization vector are chosen from the ends of the sensor range, thus eliminating the offset and gain errors [15]. Then, each new node is added halfway between the previous two. When a linearization vector ordered in that way is used, the achieved results are not optimal, but large non-linearity is avoided. Note that the nodes are not necessarily equidistant. The main advantage of PPA lies in its simplicity: it does not require intensive, time-consuming operations. This makes it particularly suitable for implementation in reconfigurable sensor networks.

Improved progressive polynomial approximation (IMPPA) is a method based on permutations of the nodes in the initial linearization vector. Each permutation is tested in order to find the best one [19]. To reduce the number of arithmetic operations, IMPPA does not test all possible permutations of the initial vector. Rather, it fixes the node from the beginning of the sensor range as the first and the node from the end of the range as the second, and then permutes the remaining ones. The effectiveness of each permutation is determined by the non-linearity obtained at nodes which are inserted between the nodes of the linearization vector.
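The PPA recursion and the IMPPA-style permutation search described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the usual PPA form, in which the i-th correction function adds a coefficient a_i times the product of the residuals (h_j(y) - t_j) of all earlier functions. The sensor model `sensor`, the helper names `ppa_eval`, `ppa_fit` and `nonlinearity`, the five equidistant nodes, the 201 evaluation points and the ordering `bisect_order` (ends first, then interior points, standing in for the PPA ordering) are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def sensor(x):
    """Hypothetical normalized convex sensor characteristic y = f(x) on [0, 1]."""
    return np.expm1(2.0 * x) / np.expm1(2.0)

def ppa_eval(y, coeffs, ts):
    """Evaluate the correction function h_m(y), m = len(coeffs)."""
    h = y + coeffs[0]                      # h_1(y) = y + a_1
    prod = 1.0
    for a_i, t_prev in zip(coeffs[1:], ts):
        prod = prod * (h - t_prev)         # running product of (h_j(y) - t_j)
        h = h + a_i * prod                 # h_i(y) = h_{i-1}(y) + a_i * product
    return h

def ppa_fit(ys, ts):
    """Compute a_1..a_N for nodes (y_i, t_i) taken in the given order."""
    coeffs = [ts[0] - ys[0]]               # a_1 removes the offset
    for i in range(1, len(ys)):
        h_prev = ppa_eval(ys[i], coeffs, ts)            # h_{i-1}(y_i)
        prod = np.prod([ppa_eval(ys[i], coeffs[:j + 1], ts) - ts[j]
                        for j in range(i)])
        coeffs.append((ts[i] - h_prev) / prod)
    return coeffs

def nonlinearity(order, xs, k=1.0, n=0.0, n_eval=201):
    """Relative non-linearity for the nodes xs taken in the given order."""
    x_nodes = xs[list(order)]
    ts = k * x_nodes + n                   # desired linear outputs t_i
    coeffs = ppa_fit(sensor(x_nodes), ts)
    x_eval = np.linspace(xs.min(), xs.max(), n_eval)    # evaluation nodes
    x_hat = (ppa_eval(sensor(x_eval), coeffs, ts) - n) / k
    return np.max(np.abs(x_eval - x_hat)) / xs.max()    # max deviation / full scale

xs = np.linspace(0.0, 1.0, 5)              # five equidistant nodes
bisect_order = (0, 4, 2, 1, 3)             # ends first, then interior points
bisect_nl = nonlinearity(bisect_order, xs)

# IMPPA-style search: fix both end nodes, permute the remaining ones.
candidates = [(0, 4) + p for p in itertools.permutations((1, 2, 3))]
best_order = min(candidates, key=lambda order: nonlinearity(order, xs))
best_nl = nonlinearity(best_order, xs)

print(bisect_order, bisect_nl)
print(best_order, best_nl)
```

Because `bisect_order` is itself one of the tested candidates, the search cannot return a larger non-linearity than that ordering. Its cost, however, grows as (N - 2)! fits for N nodes, which is the computational burden that motivates looking for a good ordering directly.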
A major drawback of IMPPA is its increased implementation cost in terms of complexity and computation time. It should be noted that this method finds the optimal permutation of the given linearization vector for equidistant nodes. The method can be further improved if a linearization vector with non-equidistant nodes is used.

A probability density function is used to improve PPA, as presented in [22], [23]. This approach proposes an accumulation of linearization nodes in the part of the transfer function that will be used most commonly during the sensor lifetime. This can significantly improve the accuracy of the measurement system in some cases, but the problem of ordering the selected nodes remains unsolved.

A different linearization method inspired by PPA, called modified progressive polynomial approximation (MPPA), is addressed in [24]. It introduces a methodology for selecting the nodes which does not consider their order in the linearization vector. A larger set of equidistant nodes is formed first. The linearization vector is not predefined; instead, it is populated at each step by the node from the original set at which the current linearized function deviates most from the linear one. Consequently, the selected nodes in the linearization vector are not equidistant.

3. Proposed methodology

The proposed modification of the PPA method concerns the order of nodes in the linearization vector, with the aim of obtaining the desired transfer function linearity without increasing the algorithm complexity. Several analytic expressions commonly used to model sensor transfer functions are analyzed. To make the results comparable, the transfer functions are normalized before the linearization methodology is applied. Both the input (argument) and the output (function value) are normalized to the range [0, 1]. If $x_m$ is the sensor input and $y_m$ is the sensor output, the normalized input and output values are calculated using the following equations:

$x = \dfrac{x_m - x_{min}}{x_{FS} - x_{min}}$, (8)

$y = \dfrac{y_m - y_{min}}{y_{FS} - y_{min}}$, (9)

where $x_{min}$ and $x_{FS}$ are the minimum and full-range input values, while $y_{min}$ and $y_{FS}$ are the corresponding output values.

The initial linearization vector $T_i$ is formed as a set of $N$ equidistant nodes starting from the beginning of the sensor range. It is expressed as:

$T_i = [(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)]$, (10)

or, using shorthand notation: $T_i = [1\ 2\ \ldots\ N]$. Permutations of the linearization vector are denoted as , $i = 1, 2, \ldots, (N - 1)!$.

3.1. Convex functions

The PPA linearization methodology is applied to an exponential function: ⁄ (11) where p is a parameter used to adjust its non-linearity (0